In the first part of this post, a “classic” example of Polymorphism-like code in C was discussed. Now let’s look at some production code I had a chance to refactor. After the refactoring, the code looks like another step towards the Polymorphic behavior of real OOP languages. For obvious reasons, I have modified the example so that it is more focused.
The original code used an array of object types. For each one, an array of instances was allocated. The “Type structure” looked like this:
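A minimal sketch of such a descriptor could look like the following. Only the name TYPE_DESCRIPTOR is taken from the code discussed here; the field names are my assumptions for illustration.

```c
#include <stddef.h>

/* Hypothetical layout: only the name TYPE_DESCRIPTOR comes from the post,
 * the field names are assumptions made for illustration. */
typedef struct {
    size_t num_of_elements; /* number of instances allocated for this type */
    void  *elements;        /* array of instances of one concrete type */
} TYPE_DESCRIPTOR;
```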
And, there was an array of those “descriptor” objects plus an enum to enumerate the different types:
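Something along these lines, as a sketch. OBJECT_TYPE_A, OBJECT_TYPE_B, NUM_OF_TYPES and object_descriptor_array are the names used in this post; OBJECT_TYPE_C, OBJECT_TYPE_D, the enum name and the descriptor fields are assumed here just to get to four types.

```c
#include <stddef.h>

/* Assumed descriptor layout (field names are hypothetical). */
typedef struct {
    size_t num_of_elements;
    void  *elements;
} TYPE_DESCRIPTOR;

typedef enum {
    OBJECT_TYPE_A,
    OBJECT_TYPE_B,
    OBJECT_TYPE_C,  /* assumed: not named in the post */
    OBJECT_TYPE_D,  /* assumed: not named in the post */
    NUM_OF_TYPES    /* kept last, so it equals the number of types */
} OBJECT_TYPE;

/* One descriptor per object type, indexed by OBJECT_TYPE. */
static TYPE_DESCRIPTOR object_descriptor_array[NUM_OF_TYPES];
```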
So, this is a visualization of what the entire thing looked like when NUM_OF_TYPES equals 4:
At times, that entire data structure had to be destroyed and freed. Each object had to be gracefully destroyed by a function that specializes in destroying that specific type. Below are the prototypes of the functions for destroying types A and B. You can see that each function receives a pointer to its own specific type as an argument, so every destroyer function has a different signature.
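As an illustration only, here is how two such type-specific destroyers might look. The struct contents and the names destroy_type_a and destroy_type_b are hypothetical; I added trivial bodies so the sketch compiles on its own.

```c
#include <stdlib.h>

/* Hypothetical object types: the real ones are not shown in this post. */
typedef struct { int  *buffer; } TYPE_A;
typedef struct { char *name;   } TYPE_B;

/* Each destroyer takes a pointer to its own concrete type,
 * so the two signatures are not interchangeable. */
void destroy_type_a(TYPE_A *obj)
{
    free(obj->buffer);
    obj->buffer = NULL;
}

void destroy_type_b(TYPE_B *obj)
{
    free(obj->name);
    obj->name = NULL;
}
```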
You can figure out the rest (about a dozen of them).
In order to destroy the entire data structure, a for loop is used that iterates over the type descriptors array, then descends into each descriptor’s element array, iterating over it and destroying each element:
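A reduced sketch of that kind of loop, with only two types and with assumed struct fields and function names (destroy_all, destroy_type_a, destroy_type_b are hypothetical), might be:

```c
#include <stddef.h>
#include <stdlib.h>

typedef struct { int  *buffer; } TYPE_A;   /* hypothetical */
typedef struct { char *name;   } TYPE_B;   /* hypothetical */
typedef struct { size_t num_of_elements; void *elements; } TYPE_DESCRIPTOR;
typedef enum { OBJECT_TYPE_A, OBJECT_TYPE_B, NUM_OF_TYPES } OBJECT_TYPE;

static void destroy_type_a(TYPE_A *obj) { free(obj->buffer); obj->buffer = NULL; }
static void destroy_type_b(TYPE_B *obj) { free(obj->name);   obj->name = NULL; }

static TYPE_DESCRIPTOR object_descriptor_array[NUM_OF_TYPES];

void destroy_all(void)
{
    for (int type = 0; type < NUM_OF_TYPES; type++) {
        TYPE_DESCRIPTOR *desc = &object_descriptor_array[type];

        for (size_t i = 0; i < desc->num_of_elements; i++) {
            /* The "manual" dispatch: one branch per type, each with its own cast. */
            if (type == OBJECT_TYPE_A) {
                destroy_type_a(&((TYPE_A *)desc->elements)[i]);
            } else if (type == OBJECT_TYPE_B) {
                destroy_type_b(&((TYPE_B *)desc->elements)[i]);
            }
        }
        free(desc->elements);
        desc->elements = NULL;
        desc->num_of_elements = 0;
    }
}
```

With about a dozen types, the if-else chain grows by one branch per type, and each branch must agree with the enum value used as the array index.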
A Polymorphic version of the same code
Using the following transformation, we can create general code that frees the objects without “manual” if-else clauses:
TYPE_DESCRIPTOR will contain a function pointer with a common prototype. Each TYPE_DESCRIPTOR instance’s pointer will be initialized with the proper destructor function. The destruction loop will call the function pointed to by the descriptor of that type.
First, we need to define the prototype for a generic destructor function:
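One plausible form is a function pointer type that takes the object as a void pointer. The name DESTRUCTOR_FUNC is my assumption.

```c
/* Common prototype shared by all destructors (type name is hypothetical).
 * The object is passed as void *, so one pointer type fits every object type. */
typedef void (*DESTRUCTOR_FUNC)(void *obj);
```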
Using this general prototype, we can create the specific destructor functions, each dealing with a different data structure and logic:
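For example, under the assumption that the common prototype takes a void pointer, two destructors could be sketched like this. The struct contents and all names other than the idea itself are hypothetical.

```c
#include <stdlib.h>

typedef void (*DESTRUCTOR_FUNC)(void *obj);   /* assumed common prototype */

typedef struct { int  *buffer; } TYPE_A;      /* hypothetical */
typedef struct { char *name;   } TYPE_B;      /* hypothetical */

/* Same signature for both; each one converts the generic pointer
 * back to the concrete type it knows how to destroy. */
void destroy_type_a(void *obj)
{
    TYPE_A *a = obj;
    free(a->buffer);
    a->buffer = NULL;
}

void destroy_type_b(void *obj)
{
    TYPE_B *b = obj;
    free(b->name);
    b->name = NULL;
}
```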
And so on.
Now we can re-define TYPE_DESCRIPTOR:
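A sketch of the extended descriptor follows. The field names are assumptions, and so is element_size: I added it because a generic loop cannot index a void pointer array without knowing how big each element is.

```c
#include <stddef.h>

typedef void (*DESTRUCTOR_FUNC)(void *obj);   /* assumed common prototype */

/* Hypothetical field names; only TYPE_DESCRIPTOR is from the post. */
typedef struct {
    size_t          num_of_elements; /* number of instances of this type */
    size_t          element_size;    /* assumed: size of one instance, in bytes */
    void           *elements;        /* array of instances */
    DESTRUCTOR_FUNC destroy;         /* destructor matching this type */
} TYPE_DESCRIPTOR;
```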
So now, our data structure looks more like this:
And finally, our loop collapses to just this:
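Here is a self-contained sketch of such a loop. The descriptor fields, destroy_all and the sample TYPE_A destructor are assumptions made so the example compiles on its own.

```c
#include <stddef.h>
#include <stdlib.h>

typedef void (*DESTRUCTOR_FUNC)(void *obj);

typedef struct {
    size_t          num_of_elements;
    size_t          element_size;   /* assumed field, needed for generic indexing */
    void           *elements;
    DESTRUCTOR_FUNC destroy;
} TYPE_DESCRIPTOR;

enum { NUM_OF_TYPES = 4 };
static TYPE_DESCRIPTOR object_descriptor_array[NUM_OF_TYPES];

typedef struct { int *buffer; } TYPE_A;   /* hypothetical sample type */

void destroy_type_a(void *obj)
{
    TYPE_A *a = obj;
    free(a->buffer);
    a->buffer = NULL;
}

void destroy_all(void)
{
    for (int type = 0; type < NUM_OF_TYPES; type++) {
        TYPE_DESCRIPTOR *desc = &object_descriptor_array[type];
        char *base = desc->elements;   /* byte pointer for element arithmetic */

        for (size_t i = 0; i < desc->num_of_elements; i++) {
            /* No if-else: the descriptor knows which destructor to call. */
            desc->destroy(base + i * desc->element_size);
        }
        free(desc->elements);
        desc->elements = NULL;
        desc->num_of_elements = 0;
    }
}
```

Adding a new type now means filling in one more descriptor; the loop itself does not change.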
Pros and Cons
The Polymorphic version of the code is obviously more elegant and removes a lot of code repetition. The original code also contains a hidden dependency: the order of the values in the enum and the initialization order of object_descriptor_array must be the same! Imagine someone does this one day:
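To make the hazard concrete, here is a small sketch in which the two enum values were swapped while the array initializer still lists the descriptors in the old order. The type_name field is hypothetical and exists only to make the mismatch visible.

```c
/* Someone swapped A and B in the enum... */
typedef enum {
    OBJECT_TYPE_B,
    OBJECT_TYPE_A,
    NUM_OF_TYPES
} OBJECT_TYPE;

/* Simplified descriptor for this demonstration (type_name is hypothetical). */
typedef struct {
    const char *type_name;
} TYPE_DESCRIPTOR;

/* ...but the positional initializer is still written in the old order. */
static const TYPE_DESCRIPTOR object_descriptor_array[NUM_OF_TYPES] = {
    { "A" },   /* index 0, which is now OBJECT_TYPE_B */
    { "B" },   /* index 1, which is now OBJECT_TYPE_A */
};
```

Here object_descriptor_array[OBJECT_TYPE_A] silently refers to the descriptor that holds type B instances, and the compiler has nothing to complain about.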
Without changing the initialization of object_descriptor_array appropriately, the code is broken. It will match the OBJECT_TYPE_A destructor with OBJECT_TYPE_B instances, and vice versa.
This cannot happen in the new version of the code, because each descriptor carries its own destructor pointer and the loop never has to map a type index to a function.
Generally speaking, code that involves pointers to functions may suffer from side effects such as:
- Harder to follow and predict, since functionality can change on the go.
- Harder to debug, as you can do less by just reading the code.
- Some static analysis may not be possible (e.g. stack usage analysis).
So I think it boils down to good measure, as in many cases. In the example above, I think the gain is worth the cost: the usage of function pointers is contained, and it brings several benefits.
Potentially, this can be taken to the next level, in which each instance has its own function pointer. It is not possible to have a C array with different element sizes, but if, for example, a linked list is used, a further level of Polymorphism can be achieved. In general, it starts to feel like closing in on what an OOP “under the hood” implementation may look like. Is it recommended? Not necessarily. It gets more and more complex, and you have to weigh that complexity against what you gain. You may end up with something harder to maintain and debug than the code you started with. If you find yourself in that position, you may want to consider C++…
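For the curious, a rough sketch of the per-instance idea with a linked list is shown below. This is not code from the project I refactored; every name here (OBJECT_HEADER, create_type_a, destroy_list and so on) is hypothetical, and it assumes each object embeds a common header as its first member.

```c
#include <stddef.h>
#include <stdlib.h>

/* Common header embedded as the FIRST member of every object,
 * so a pointer to the header is also a pointer to the object. */
typedef struct OBJECT_HEADER {
    struct OBJECT_HEADER *next;
    void (*destroy)(struct OBJECT_HEADER *self);
} OBJECT_HEADER;

typedef struct { OBJECT_HEADER header; int *buffer;   } TYPE_A;
typedef struct { OBJECT_HEADER header; char name[32]; } TYPE_B;

static void destroy_type_a(OBJECT_HEADER *self)
{
    TYPE_A *a = (TYPE_A *)self;   /* valid because header is the first member */
    free(a->buffer);
    free(a);
}

static void destroy_type_b(OBJECT_HEADER *self)
{
    free(self);                   /* TYPE_B owns no extra memory */
}

/* Each constructor installs the matching destructor in the instance. */
OBJECT_HEADER *create_type_a(OBJECT_HEADER *next)
{
    TYPE_A *a = calloc(1, sizeof *a);
    if (a == NULL) return NULL;
    a->buffer = malloc(sizeof *a->buffer);
    a->header.next = next;
    a->header.destroy = destroy_type_a;
    return &a->header;
}

OBJECT_HEADER *create_type_b(OBJECT_HEADER *next)
{
    TYPE_B *b = calloc(1, sizeof *b);
    if (b == NULL) return NULL;
    b->header.next = next;
    b->header.destroy = destroy_type_b;
    return &b->header;
}

/* Destroys a mixed list and returns how many objects were destroyed. */
size_t destroy_list(OBJECT_HEADER *head)
{
    size_t count = 0;
    while (head != NULL) {
        OBJECT_HEADER *next = head->next;  /* save before the node is freed */
        head->destroy(head);
        head = next;
        count++;
    }
    return count;
}
```

The header with its function pointer plays roughly the role a vtable pointer plays in C++, except that here every cast and every initialization is your responsibility.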