Achieving Clean Code can get harder when you go down to the nuts and bolts. Bit-manipulating code is very common when you go “low level” with your coding, usually with C. It may be needed when you manipulate HW registers with a certain bit structure, when you want to compress data elements to occupy bits instead of double words or bytes, or for a number of other reasons.
Imagine a register that is used to set the speed for an imaginary communication device. The register can be set with RX speed, TX speed, a “go” bit to propagate the changes and a “ready” bit to indicate when our last “set” operation is done propagating and the HW is ready to work with the new settings. More formally, it can be put like this:
The C representation of this register can potentially look like this:
This Can later be used to perform read/write operations on the register, like this:
Write operations can be done similarly (although note that the entire register will be written on each write operation, you cannot write just specific bits).
As far as I know, all modern C compilers support the Bitfield Notation. The register structure above can be expressed in that notation as follows:
Generally speaking, Bit Field notation is superior
When discussing static masks (as opposed to masks that are dynimacally calculated at runtime), in my opinion the Bit Field notation should almost always be used for a Clean Code. Using Bit Field notation has the following advantages over static Mask notation:
- It is by far more readable and clear, and encompasses both the field “isolation” (e.g. instead of masking with COMM_SPEED_REG_TX_SPEED_MASK above) and the field’s shift (e.g. COMM_SPEED_REG_TX_SPEED_SHIFT_R above)
- You get more help from the compiler in case of assignment errors – you cannot assig a 8-bit “char” into a 7-bit field.
- With masks, in case you want to change something (e.g. swap the locations of the RX and TX speed fields above), you may find yourself changing several elements in an inter-dependent way (e.g. if you have two “shift” #define-s similar to COMM_SPEED_REG_TX_SPEED_SHIFT_R above). In general, the mask #define definitions are generally inter-dependent.It is much safer to make changes to struct Bit Field notation.
The register definition struct can be use for typed function argument passing. Again, more complier-assisted protection.
I have encountered a case in which using masks notation may be considered preferable to Bit Fields, but even that not with a noticeable margin, if at all. I wanted to initialize an array of elements, with each element describing a masked write to a register (i.e. only bits set in the mask should be modified in the register). The element structure was roughly that:
Assume an example register structure like COMM_SPEED_REG above, an example one-element array may be initialized like so:
RX_SPEED_FIELD_MASK has to be defined here to 0xFF, and TX_SPEED_FIELD_MASK to 0xFF00 (compare to the struct definition above)
Note that a dangerous inter-dependency exists between the struct definition and the mask definitions here. One way to solve it may be to auto-generate all those definitions from a single source (refer to other posts where I show some auto-generation techniques)
So, in this example, statically defined masks may be harder to avoid.
One constraint in the struct definition above was that in order to have a general data structure that supports many types of registers (and their different structure) we have to stick to the general uint32_t type to describe their value and mask. In a future post I will present a method that can alleviate this pain as well. Given that, the use of statically defined masks gets even more questionable.
Note that across that post I keep saying “statically defined” masks. Of course, if you need to compute a mask dynamically at run-time, Bit Field notation will not help you, and you are forced to use mask notations to perform operations with your computed masks. One example to this may be a function that extracts an indexed bit from a Byte (e.g. check if bit 3 ise set or cleared). You can use shit operations in order to construct a mask, and then “and” it with the Byte at hand. You cannot do that with Bit Field notation.
I have seen code that uses masks to describe a byte-granularity or even dword-granularity data structures, like below. This is to work on fields like in the struct ELEMENT above:
I simply see no reason to do that. In my view this is a second order of going against Clean Code. C has given us the proper construct to describe memory layouts, it should be used.