Achieving Clean Code can get harder when you go down to the nuts and bolts. Bit-manipulating code is very common when you go “low level” with your coding, usually with C. It may be needed when you manipulate HW registers with a certain bit structure, when you want to compress data elements to occupy bits instead of double words or bytes, or for a number of other reasons.
Imagine a register that is used to set the speed for an imaginary communication device. The register can be set with RX speed, TX speed, a “go” bit to propagate the changes and a “ready” bit to indicate when our last “set” operation is done propagating and the HW is ready to work with the new settings. More formally, it can be put like this:
The C representation of this register can potentially look like this:
#define COMM_SPEED_REG_RX_SPEED_MASK 0x0000000F
#define COMM_SPEED_REG_TX_SPEED_MASK 0x000000F0
#define COMM_SPEED_REG_TX_SPEED_SHIFT_R 0x00000008
#define COMM_SPEED_REG_GO_BIT_MASK 0x00000100
#define COMM_SPEED_REG_READY_BIT_MASK 0x80000000
This Can later be used to perform read/write operations on the register, like this:
#define COMM_SPEED_REG_ADDRESS 0x00010000
volatile unsigned int * p_comm_speed_reg =
(uint32_t*)COMM_SPEED_REG_ADDRESS;
unsigned int comm_speed_reg_value = *p_comm_speed_reg;
unsigned int rx_speed = comm_speed_reg_value
& COMM_SPEED_REG_RX_SPEED_MASK;
unsigned int tx_speed = (comm_speed_reg_value & COMM_SPEED_REG_TX_SPEED_MASK) >> COMM_SPEED_REG_TX_SPEED_SHIFT_R;
//Double logical “not” will make it either a
//zero or a one (as opposed to zero VS. non-zero)
unsigned int is_ready = !!(comm_speed_reg_value & COMM_SPEED_REG_READY_BIT_MASK);
Write operations can be done similarly (although note that the entire register will be written on each write operation, you cannot write just specific bits).
Bitfield notation
As far as I know, all modern C compilers support the Bitfield Notation. The register structure above can be expressed in that notation as follows:
typedef struct {
uint32_t rx_speed : 8; // Field size is 8 bits
uint32_t tx_speed : 8;
uint32_t go : 1;
uint32_t reserved : 14;
uint32_t ready : 1;
}COMM_SPEED_REG;
COMM_SPEED_REG comm_speed_reg =
*((COMM_SPEED_REG*)COMM_SPEED_REG_ADDRESS);
uint32_t rx_speed = comm_speed_reg.rx_speed;
uint32_t tx_speed = comm_speed_reg.tx_speed;
uint32_t is_ready = comm_speed_reg.ready;
Generally speaking, Bit Field notation is superior
When discussing static masks (as opposed to masks that are dynimacally calculated at runtime), in my opinion the Bit Field notation should almost always be used for a Clean Code. Using Bit Field notation has the following advantages over static Mask notation:
- It is by far more readable and clear, and encompasses both the field “isolation” (e.g. instead of masking with COMM_SPEED_REG_TX_SPEED_MASK above) and the field’s shift (e.g. COMM_SPEED_REG_TX_SPEED_SHIFT_R above)
- You get more help from the compiler in case of assignment errors – you cannot assig a 8-bit “char” into a 7-bit field.
- With masks, in case you want to change something (e.g. swap the locations of the RX and TX speed fields above), you may find yourself changing several elements in an inter-dependent way (e.g. if you have two “shift” #define-s similar to COMM_SPEED_REG_TX_SPEED_SHIFT_R above). In general, the mask #define definitions are generally inter-dependent.It is much safer to make changes to struct Bit Field notation.
The register definition struct can be use for typed function argument passing. Again, more complier-assisted protection.
Exceptions
I have encountered a case in which using masks notation may be considered preferable to Bit Fields, but even that not with a noticeable margin, if at all. I wanted to initialize an array of elements, with each element describing a masked write to a register (i.e. only bits set in the mask should be modified in the register). The element structure was roughly that:
typedef struct
{
uint32_t reg_address;
uint32_t reg_value;
uint32_t mask;
}ELEMENT;
Assume an example register structure like COMM_SPEED_REG above, an example one-element array may be initialized like so:
ELEMENT elements[] = {
{.reg_address = 0x1000,
//Set both RX and TX speeds to “0xF”, whatever that may mean
.reg_value = 0x00000F0F,
//Only set the speed fields, but not the “go” field
.mask = RX_SPEED_FIELD_MASK | TX_SPEED_FIELD_MASK
}
};
RX_SPEED_FIELD_MASK has to be defined here to 0xFF, and TX_SPEED_FIELD_MASK to 0xFF00 (compare to the struct definition above)
Note that a dangerous inter-dependency exists between the struct definition and the mask definitions here. One way to solve it may be to auto-generate all those definitions from a single source (refer to other posts where I show some auto-generation techniques)
So, in this example, statically defined masks may be harder to avoid.
Future directions
One constraint in the struct definition above was that in order to have a general data structure that supports many types of registers (and their different structure) we have to stick to the general uint32_t type to describe their value and mask. In a future post I will present a method that can alleviate this pain as well. Given that, the use of statically defined masks gets even more questionable.
Note
Note that across that post I keep saying “statically defined” masks. Of course, if you need to compute a mask dynamically at run-time, Bit Field notation will not help you, and you are forced to use mask notations to perform operations with your computed masks. One example to this may be a function that extracts an indexed bit from a Byte (e.g. check if bit 3 ise set or cleared). You can use shit operations in order to construct a mask, and then “and” it with the Byte at hand. You cannot do that with Bit Field notation.
Final Thoughts
I have seen code that uses masks to describe a byte-granularity or even dword-granularity data structures, like below. This is to work on fields like in the struct ELEMENT above:
#define FIELD_A_OFFSET 0X0
#define FIELD_B_OFFSET 0X4
#define FIELD_C_OFFSTE 0X8
I simply see no reason to do that. In my view this is a second order of going against Clean Code. C has given us the proper construct to describe memory layouts, it should be used.
6 Comments
JryKMfUIzF
GATKtqNgr
xGQMZPiFmtKn
gpPMyxsKOAatuJrZ
hSVFvsPKm
UJvdgSZhzQHMiwXx