Type erasure in C++

What is it?

Type erasure is a technique for hiding the type information of some data. Once data is stored and managed this way, the compiler can no longer deduce the type of that blob of data.

For example, when we cast a pointer to some data type into __void *__, we are doing type erasure. Once the data has passed through a __void *__, the compiler can no longer deduce the type of the data it points to.
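As a minimal sketch of this round trip (the names ReadAsInt and RoundTrip are made up for illustration), the following shows the static type disappearing at the conversion to __void *__ and coming back only because we happen to cast to the right type:

```cpp
#include <cassert>

// Hypothetical helper: the parameter type says nothing about the pointee.
int ReadAsInt(void * Data)
{
    assert(Data != nullptr);
    // The compiler accepts this cast whatever Data really points to;
    // it is correct only if the caller passed the address of an int.
    return *static_cast<int *>(Data);
}

// Round trip: int -> void * -> int.
int RoundTrip(int Value)
{
    void * Erased = &Value;   // implicit conversion; the static type is gone
    return ReadAsInt(Erased);
}
```

Calling RoundTrip(42) gives back 42, but nothing in the signature of ReadAsInt would stop a caller from passing the address of a double instead.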

Why and how?

First, let’s get to the why.

  • Have you ever used a function that has to accept any possible type?
  • Have you ever used a container that has to hold any possible type?

The answer is most likely yes, even if you don’t remember it! If you ever used __pthread__, __std::thread__, __std::function__, __std::variant__ or ultimately __std::any__ then the answer is YES!

So what do these have in common that leads us to the same pattern?

Why?

Polymorphism is the answer. In polymorphism, we treat a range of different data in the same way regardless of their detailed implementation. One form of type erasure is when we use inheritance to implement polymorphism.

The moment we bind the address of a derived-class object to a pointer (or reference) to the base class, we discard the type information about properties specific to the derived class, and from there on the compiler cannot deduce them. It can only rely on the type data it kept for run time (vtables, etc.), and it kept that data only because, by declaring the inheritance, we guaranteed that every derived class has it.
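Here is a small sketch of that idea. The Shape, Circle and Square classes are hypothetical and exist only for this example; the point is that Describe sees nothing but the base class, yet the right override is still selected at run time:

```cpp
#include <cassert>
#include <string>

// Hypothetical base class: the only interface that survives the erasure.
struct Shape
{
    virtual ~Shape() = default;
    virtual std::string Name() const = 0;
};

struct Circle : Shape
{
    double Radius = 1.0; // specific to Circle, invisible through Shape
    std::string Name() const override { return "circle"; }
};

struct Square : Shape
{
    double Side = 2.0; // specific to Square, invisible through Shape
    std::string Name() const override { return "square"; }
};

// Only knows about Shape; the concrete type is resolved through the virtual call.
std::string Describe(Shape const & Erased)
{
    return Erased.Name();
}

void CheckShapes()
{
    Circle Round;
    Square Boxy;
    assert(Describe(Round) == "circle");
    assert(Describe(Boxy) == "square");
}
```

Inside Describe, members such as Radius or Side are out of reach: that part of the type has been erased.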

How?  

Let’s consider the most basic form of type erasure: Passing as a void pointer.

First, let’s see what it is supposed to do. Easy! It helps us contain and pass around any type we desire. Now let’s address its shortcomings and solve them.

  1. Type safety is broken
  2. Memory safety is threatened
  3. We actually have no way of checking if the type we are casting is the type we have received

 1. Type safety

				
					class foo
{
private:
    int Id;
    
    ...
};
				
			
Then we have a function that is supposed to receive any type with a void pointer:
				
					void bar(void * Argument)
{
    /* Here we are breaking the type safety */

    static_cast<foo *>(Argument) ...
}
				
			

What happens if, instead of an instance of foo, we pass some other type? The compiler cannot detect the mistake or stop us, and the result is undefined behavior, most likely some form of runtime error. In other words, we broke the guarantee of using types safely: inside the __bar__ function we can cast any type to any other, and if we get it wrong, we will not know until the program misbehaves at run time.

So what we want is a way to check that we are casting the pointer to the right type and, if we are not, to be stopped by an exception.

The mechanism for runtime type information in C++ is the __typeid__ operator.

				
#include <cstddef>
#include <typeinfo>

std::type_info const &Info = typeid(int);

char const * Name = Info.name();
std::size_t Hash = Info.hash_code();
				
			

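The type_info object for each type has static storage duration, and typeid only returns a const reference to it. The only function we care about right now is __hash_code__. It returns a value that is identical for all type_info objects referring to the same type. The standard does not guarantee that different types produce different values, although collisions are unlikely in practice. (The string returned by __name__ is implementation-defined and may differ from the name we write in source code.)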

We can use it to store type info alongside the data. Later on, we can use it to verify whether we are doing the right cast or not. Notice that we cannot use this type info to find out what type to cast to. However, we can use it to find out whether a certain cast is allowed.
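To make the idea concrete before building the container, here is a minimal sketch. The Tagged struct and the Erase and TryCast helpers are assumptions made for this example, not part of any library; they pair an erased pointer with the hash recorded at the moment of erasure:

```cpp
#include <cassert>
#include <cstddef>
#include <typeinfo>

// Hypothetical pair: an erased pointer plus the hash recorded when it was erased.
struct Tagged
{
    void * Object;
    std::size_t Hash;
};

template <typename T>
Tagged Erase(T & Value)
{
    return Tagged{ &Value, typeid(T).hash_code() };
}

// Returns the typed pointer when the recorded hash matches T, nullptr otherwise.
template <typename T>
T * TryCast(Tagged const & Item)
{
    if (typeid(T).hash_code() != Item.Hash)
        return nullptr;
    return static_cast<T *>(Item.Object);
}

void CheckTagged()
{
    int Number = 7;
    Tagged Item = Erase(Number);
    assert(TryCast<int>(Item) == &Number);    // the matching type is accepted
    assert(TryCast<double>(Item) == nullptr); // a different type is rejected
}
```

The check can only answer "is this cast allowed?". It relies on int and double having different hash codes, which holds on common implementations but is not strictly guaranteed by the standard.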

To do so properly, we need to create a container first, let’s call it __Any__ .

				
					class Any
{
private:
    void * Object = nullptr;
    size_t Hash = 0;
};
				
			

Now let's add a checked cast to the container and solve the first problem:

				
					class Any
{
private:
    void * Object = nullptr;
    size_t Hash = 0;

public:
    template <typename T>
    inline T * Cast()
    {
        /* Ensure type safety */
        if(typeid(T).hash_code() != Hash)
            throw std::bad_cast();

        return static_cast<T *>(Object);
    }
};
				
			

Now we actually need to store the object itself in the container. To do so we need to use dynamic memory (heap) since the size of the object is unknown to us. Here again, we take advantage of templates and we use perfect forwarding to do it with the least amount of copying.

				
					...
public:
   /* std::decay_t strips the reference that T carries when t is an lvalue */
   template<typename T>
   Any(T&&t): Object(new std::decay_t<T>(std::forward<T>(t))), Hash(typeid(T).hash_code()){}

...
				
			

If we take a close look we see that the third issue is automatically fixed as well.
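One detail of forwarding references is worth a closer look: when the argument is an lvalue, T is deduced as a reference type, and a reference cannot be allocated with new. The sketch below uses a hypothetical MakeCopy helper that mirrors the allocation step of the constructor and applies std::decay_t so that new always sees an object type:

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// What a forwarding reference deduces for lvalues, and what decay turns it into.
static_assert(std::is_same<std::decay_t<std::string &>, std::string>::value, "");
static_assert(std::is_same<std::decay_t<std::string const &>, std::string>::value, "");

// Hypothetical helper mirroring the allocation step of the constructor.
template <typename T>
std::decay_t<T> * MakeCopy(T && Value)
{
    return new std::decay_t<T>(std::forward<T>(Value));
}

void CheckMakeCopy()
{
    std::string Text = "hello";

    // Lvalue argument: T is std::string &, so the object is copied.
    std::string * Copied = MakeCopy(Text);
    assert(*Copied == "hello" && Text == "hello");

    // Rvalue argument: T is std::string, so the object is moved.
    std::string * Moved = MakeCopy(std::move(Text));
    assert(*Moved == "hello");

    delete Copied;
    delete Moved;
}
```

The same code path therefore copies lvalues and moves rvalues, which is where the "least amount of copying" comes from.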

Here we encounter the second problem:

 2. Memory safety

Now that we contain the object and can access it in a type-safe manner, we have to deal with the memory problem. Because we only store a void pointer, when our container goes out of scope the object it owns is left behind without its destructor being called. We leak the object itself and, if it owns heap memory of its own, that memory too. The obvious idea that comes to mind is: can’t we just store the address of its destructor in a function pointer and call it later on, like below?

				
					void (*Destructor)(void*) = &T::~T;
				
			

The problem is that constructors and destructors are special member functions, and the language does not allow taking their address, so the line above does not compile. We cannot store a destructor in a function pointer directly.

The solution lies in C++ lambda expressions, also known as the Swiss Army knife of C++.

C++ lambdas have many properties that help solve a lot of hard problems in a dynamic way. The one we are interested in right now is the fact that a lambda expression with an empty capture list can be converted to a function pointer, or in other words, behaves like a normal function:

				
					void (*FunctionPointer)() = [](){ std::cout << "Hello world\n"; };
				
			

We might not be able to take the address of the destructor, but we can make the compiler generate a function that does the destructor's job, like this:

				
					void (*Destructor)(void const *) = [](void const * Object){ delete static_cast<T const *>(Object); };

/* And later on in the destructor */

...
    ~Any()
    {
        if (Destructor)
        {
            Destructor(Object);
            Destructor = nullptr;
        }
    }
...
				
			

Now we can call this function pointer on destruction to call the object’s destructor and free up the memory it occupied at the same time.
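The following sketch isolates that mechanism from the container. The Tracked type and the MakeDeleter helper are hypothetical names chosen for this example; Tracked counts destructor calls so we can observe that the erased object really is destroyed:

```cpp
#include <cassert>

// Hypothetical type that counts how many times its destructor runs.
struct Tracked
{
    static int Destroyed;
    ~Tracked() { ++Destroyed; }
};
int Tracked::Destroyed = 0;

using Deleter = void (*)(void const *);

// Each instantiation yields a plain function that still remembers T.
template <typename T>
Deleter MakeDeleter()
{
    // The captureless lambda converts to a function pointer.
    return [](void const * Object) { delete static_cast<T const *>(Object); };
}

int DestroyErased()
{
    void * Erased = new Tracked;              // the static type is forgotten here
    Deleter Destroy = MakeDeleter<Tracked>(); // but it is kept inside the function
    assert(Destroy != nullptr);
    Destroy(Erased);                          // runs ~Tracked and frees the memory
    return Tracked::Destroyed;
}
```

The function pointer has the same type for every T, so it can be stored next to the void pointer without the container itself becoming a template.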

One thing you have probably noticed is the cast to a const void pointer, and you might ask: since we are destroying the object, won’t we get an error? Well, no! A delete expression accepts a pointer to a const object; otherwise, how would const objects in your normal code clean up their resources when they go out of scope? As said before, constructors and destructors are special functions: they can be invoked for const objects, and const semantics do not apply to an object while it is being constructed or destroyed.
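A short sketch shows both cases, a const automatic object and a heap object deleted through a pointer to const. The Counter type is hypothetical and simply records each destructor call in an int owned by the caller:

```cpp
#include <cassert>

// Hypothetical type whose destructor reports into a counter owned by the caller.
struct Counter
{
    int * Count;
    explicit Counter(int * Target) : Count(Target) { assert(Target != nullptr); }
    ~Counter() { ++*Count; }
};

int DestroyConstObjects()
{
    int Calls = 0;
    {
        Counter const Local(&Calls);            // const automatic object
    }                                           // ~Counter runs here
    Counter const * Heap = new Counter(&Calls);
    delete Heap;                                // delete accepts a pointer to const
    return Calls;
}
```

Both destructions compile and run, so DestroyConstObjects returns 2.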

So let's put all of what we know together and solve the second problem:

				
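#include <cstddef>
#include <type_traits>
#include <typeinfo>
#include <utility>

class Any
{
private:
    void * Object = nullptr;
    void (*Destructor)(void const *) = nullptr;
    std::size_t Hash = 0;

public:
    /* U is T without references or cv-qualifiers; Any itself is excluded
       so that this template never acts as a copy constructor */
    template<typename T, typename U = std::decay_t<T>,
             typename = std::enable_if_t<!std::is_same<U, Any>::value>>
    Any(T&& t) : Object(new U(std::forward<T>(t))),
                 Destructor([](void const * Object){ delete static_cast<U const *>(Object); }),
                 Hash(typeid(U).hash_code()) {}

    /* Copying would make two containers delete the same object */
    Any(Any const &) = delete;
    Any & operator=(Any const &) = delete;

    ~Any()
    {
        if (Destructor)
        {
            Destructor(Object);
            Destructor = nullptr;
        }
    }

    template <typename T>
    inline T * Cast()
    {
        /* Ensure type safety */
        if(typeid(T).hash_code() != Hash)
            throw std::bad_cast();

        return static_cast<T *>(Object);
    }
};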
				
			

Now, with the help of type erasure, we have a very basic container capable of containing any type we pass it. We can, later on, expand on it and add functionality.
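To see the container at work, here is a self-contained usage sketch. It repeats a compact version of the class so it compiles on its own; the use of std::decay_t, the exclusion of Any from the constructor template and the deleted copy operations are assumptions of this sketch, added so that lvalue arguments work and no object is deleted twice. RejectsWrongType and CheckAny are hypothetical helpers.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>
#include <typeinfo>
#include <utility>

class Any
{
private:
    void * Object = nullptr;
    void (*Destructor)(void const *) = nullptr;
    std::size_t Hash = 0;

public:
    // Assumption: Any itself is excluded so the template never competes with copying.
    template <typename T, typename U = std::decay_t<T>,
              typename = std::enable_if_t<!std::is_same<U, Any>::value>>
    Any(T && Value)
        : Object(new U(std::forward<T>(Value))),
          Destructor([](void const * Ptr) { delete static_cast<U const *>(Ptr); }),
          Hash(typeid(U).hash_code()) {}

    Any(Any const &) = delete;
    Any & operator=(Any const &) = delete;

    ~Any() { if (Destructor) Destructor(Object); }

    template <typename T>
    T * Cast()
    {
        if (typeid(T).hash_code() != Hash)
            throw std::bad_cast();
        return static_cast<T *>(Object);
    }
};

// Hypothetical helper: true when casting to the wrong type throws.
bool RejectsWrongType(Any & Box)
{
    try { Box.Cast<double>(); }
    catch (std::bad_cast const &) { return true; }
    return false;
}

void CheckAny()
{
    Any Number(42);
    Any Text(std::string("erased"));
    assert(*Number.Cast<int>() == 42);
    assert(*Text.Cast<std::string>() == "erased");
    assert(RejectsWrongType(Number)); // int is stored, double is requested
}
```

Each Any releases its object when it goes out of scope at the end of CheckAny, whatever type it was given.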

 3. Standard containers

There are a lot of standard C++ containers which use type erasure in different ways to achieve different results. Some of them are listed below:

  • std::function –  Will contain any type having a call operator with the requested signature
  • std::variant – Will contain one of the types specified in the template
  • std::any – Will contain any type
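A brief sketch of the three in use, assuming a C++17 compiler. Twice, AddOne and SumOfExamples are hypothetical names for this example; the running total is only there so the result of each step can be followed in the comments:

```cpp
#include <any>
#include <cassert>
#include <functional>
#include <string>
#include <variant>

int Twice(int Value) { return 2 * Value; }

struct AddOne
{
    int operator()(int Value) const { return Value + 1; }
};

int SumOfExamples()
{
    // std::function: any callable matching int(int).
    std::function<int(int)> Callable = Twice;
    int Total = Callable(10);                           // 20
    Callable = AddOne{};
    Total += Callable(10);                              // 20 + 11 = 31
    Callable = [](int Value) { return Value * Value; };
    Total += Callable(3);                               // 31 + 9 = 40

    // std::variant: exactly one of the listed alternatives.
    std::variant<int, std::string> Either = std::string("text");
    assert(std::holds_alternative<std::string>(Either));
    Total += static_cast<int>(std::get<std::string>(Either).size()); // 40 + 4 = 44

    // std::any: the stored type is checked on access.
    std::any Anything = 6;
    Total += std::any_cast<int>(Anything);              // 44 + 6 = 50
    if (std::any_cast<std::string>(&Anything) == nullptr)
        Total += 1;                                     // wrong type requested: 51

    return Total;
}
```

The pointer form of std::any_cast returns nullptr on a type mismatch, while the reference form throws std::bad_any_cast, much like our Cast function throws std::bad_cast.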

Note that all these containers put some limitations on the types they accept. For example, our container works only if this line compiles:

				
template<typename T>
    Any(T&& t) : Object(new T(std::forward<T>(t))), ...
				
			

That is, the object must be move constructible or copy constructible, depending on how it is passed to our container's constructor. The limitations of the standard containers are listed in their documentation.
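Such requirements can be checked at compile time with the standard type traits. The sketch below, assuming C++17, compares a copyable type with a move-only one and shows that std::any, which requires a copy-constructible value type, rejects the latter. StdAnyAccepts and CheckConstraints are hypothetical helpers for this example:

```cpp
#include <any>
#include <cassert>
#include <memory>
#include <string>
#include <type_traits>

// std::string can be copied and moved.
static_assert(std::is_copy_constructible<std::string>::value, "copyable");
static_assert(std::is_move_constructible<std::string>::value, "movable");

// std::unique_ptr is move-only.
static_assert(!std::is_copy_constructible<std::unique_ptr<int>>::value, "not copyable");
static_assert(std::is_move_constructible<std::unique_ptr<int>>::value, "movable");

// Hypothetical helper: can a std::any be constructed from an rvalue of type T?
template <typename T>
constexpr bool StdAnyAccepts()
{
    return std::is_constructible<std::any, T>::value;
}

void CheckConstraints()
{
    assert(StdAnyAccepts<std::string>());
    assert(!StdAnyAccepts<std::unique_ptr<int>>()); // move-only types are rejected
}
```

A container that only ever moves its argument into place, like the constructor line above when it is given an rvalue, can hold a move-only type that std::any cannot.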

 4. Conclusion

Type erasure is a very useful technique that can add flexibility to our code. A lot of standard containers use this technique. We started from the simplest form, briefly explored it with inheritance, and implemented our own container. We can also impose different limitations on such containers so that they only hold some pre-specified types of objects.
