Extracting Individual bits in C


Noah
08-01-2004, 06:49
Since I didn't know how to do this, I thought some people might want to know.

First off, the code:

int some_var=5; /* the variable we will be extracting a bit from. */
int n=3; /* the position of the bit we want */

int the_bit = (( some_var & (1 << (n-1) ) ) ? 1 : 0 );

That's it!
Yes folks, a left shift, some binary logic, and the ternary operator, all in one line! Looks impressive, huh? (Does anyone know if the PIC supports the ternary operator? If not, it's simple to convert to an if-else, but this is so clean!)

Now, some explanation:

We'll start by talking about the << operator.
This shifts whatever is on its left to the left by the number of places given by the expression on its right. In the code above, it shifts 1 left by 2 places (n-1 = 3-1 = 2), filling the vacated places on the right with zeroes. This gives us 00000001 => 00000100 (assuming an 8-bit int; I didn't check how large an int is on the PIC, but it doesn't matter for this demonstration). A value like this is known as a mask, and can be used with the & operator to test a single bit.

Now you see that we combine the mask (1 << (n-1)) with the variable to be tested using the bitwise AND operator (&). It checks each bit position of the two values against each other. Any position where both digits are 1 evaluates to 1. All other positions evaluate to zero.

So, what we are really looking at in the example is
(00000101 & 00000100)
Which will evaluate to
(00000100)
Now, we plug this value (4) back into our ternary expression.

Time for a quick lesson in ternary! The ternary operator ? : looks at the expression that precedes the ?. If that expression is true (non-zero), the expression between the ? and the : is evaluated and becomes the result. If it is false (zero), that part is skipped and the expression following the : is evaluated instead.

In our example, we get an expression that looks like
4 ? 1 : 0;

Four is non-zero, so it evaluates to true. This means that the statement evaluates to 1. So, in our example,

int some_var=5; /* the variable we will be extracting a bit from. */
int n=3; /* the position of the bit we want */

int the_bit = (( some_var & (1 << (n-1) ) ) ? 1 : 0 );
printf("the_bit: %d", the_bit);

OUTPUT:
the_bit: 1


A slightly more generalized look at the whole ternary statement now:

the_bit = (( some_var & (1 << (n-1) ) ) ? 1 : 0 );

First we create a mask with a 1 in the place of the bit we want to find and zeroes elsewhere. The zeroes elsewhere mean that the & on those bits will always evaluate to zero. The real comparison, therefore, only takes place on one bit: bit n of some_var. If this bit is one, then the expression yields a non-zero value, which counts as true, so the ternary returns one. If the bit is zero, the whole AND works out to zero, or false, so the ternary evaluates to 0.

And there you have it folks, how to fetch bits in C.

seanwitte
08-01-2004, 09:52
You could round out the set by providing macros to set a bit value and clear a bit value. This would be useful if you have a byte with individual flag bits. var is a value and bit is the bit position you're working with. Bit 0 is the least significant bit (on the far right-hand side).


//get a bit from a variable
#define GETBIT(var, bit) (((var) >> (bit)) & 1)

//set a bit to 1
#define SETBIT(var, bit) var |= (1 << (bit))

//set a bit to 0
#define CLRBIT(var, bit) var &= (~(1 << (bit)))


<edit>
Dave had a good point, added parens around arguments. Left the parens out of SETBIT and CLRBIT since var needs to be a left hand argument. Will leave the conversion to functions to you all, but logic is the same.

We have a union to match the PBASIC BYTE variable type that is bit-addressable. It's not portable but works with the PIC controllers. You can pull out the 8-bit value using the byte field or get a single bit using b0 - b7. Makes a handy way to track a set of flags.


//byte, addressable bits
//size: 8 bits (1 byte)
//range: 0 to 255
typedef union {
struct {
unsigned b0:1;
unsigned b1:1;
unsigned b2:1;
unsigned b3:1;
unsigned b4:1;
unsigned b5:1;
unsigned b6:1;
unsigned b7:1;
};
uchar byte;
} byte;

</edit>

Dave Scheck
08-01-2004, 11:01
Sean

You have a good suggestion to make these macros. They seem to work perfectly; however, you made a common error.

When writing macros, it is a good idea to surround your arguments with parentheses in the definition.

Let's walk through this example:
#define PI 3.14
#define CIRCLE_AREA(r) (PI * (r) * (r))
area = CIRCLE_AREA(4);
expands to
area = (3.14 * (4) * (4));
Often, though, you will pass an expression to the macro:
area = CIRCLE_AREA(i+2);
which would expand to
area = (3.14 * (i+2) * (i+2));
Now let's see what happens when we remove the parentheses around r in the macro:
#define CIRCLE_AREA(r) (PI * r * r)
area = CIRCLE_AREA(i+2);
expands to
area = (3.14 * i+2 * i+2);
The order of operations that C follows makes the result clearly different from what you would have expected. It evaluates as if it were grouped like this:
area = (3.14 * i) + (2 * i) + 2;

The moral of the story is to put parentheses around the parameters in the macro definition to make it more robust. If I see a macro that calculates the area of a circle, I don't want to have to double-check it to see how the order of operations will play out. The same goes for the parentheses around the entire expression. For the most part, you will have no problems, but it never hurts to be safe.

On a side note, I try to use functions instead of macros when efficiency is less critical or when the routine to be run is complex.

Now, I don't want to get into a whole debate about macros and functions, because each has its place. I find that it is easier to debug functions than macros because you can step through them in a debugger.

Noah
08-01-2004, 12:46
//get a bit from a variable
#define GETBIT(var, bit) (((var) >> (bit)) & 1)

//set a bit to 1
#define SETBIT(var, bit) var |= (1 << (bit))

//set a bit to 0
#define CLRBIT(var, bit) var &= (~(1 << (bit)))


I think that you've got a slight problem here: I suspect that the expression where (bit) is needs to be replaced with (bit - 1)

Example: Try to access the 3rd bit of 20. 20 = 00010100
00010100 >> 3 == 00000010, but
00010100 >> 2 == 00000101, yielding the correct digit in the correct place.
However, without the ternary operation here, you are likely to return a number of non-binary answers: as you just saw, running GETBIT (20, 3) would yield 2 as you have it written, or five after the (bit-1) correction.

One other correction: the comments you used are C++ style. They only became valid C with C99, so older C compilers may reject them. Watch out!

I believe that corrected macros follow:

/* get a bit from a variable */
#define GETBIT(var, bit) (((var) & (1 << ((bit)-1))) ? 1 : 0)

/* set a bit to 1 */
#define SETBIT(var, bit) ((var) |= (1 << ((bit)-1)))

/* set a bit to 0 */
#define CLRBIT(var, bit) ((var) &= (~(1 << ((bit)-1))))

Mike Soukup
08-01-2004, 12:57
I think that you've got a slight problem here: I suspect that the expression where (bit) is needs to be replaced with (bit - 1)

Example: Try to access the 3rd bit of 20. 20 = 00010100
00010100 >> 3 == 00000010, but
00010100 >> 2 == 00000101, yielding the correct digit in the correct place.
However, without the ternary operation here, you are likely to return a number of non-binary answers: as you just saw, running GETBIT (20, 3) would yield 2 as you have it written, or five after the (bit-1) correction.
The original macro is correct, you're just not used to thinking like a computer person; when counting, always start with 0. Bit 0 is the least significant and bit 7 is the most.

Also, GETBIT cannot return 2 or 5, it can only return 0 or 1 since 'AND'ing anything with 1 will get rid of all but the least significant bit.

Joe Johnson
08-01-2004, 13:12
Hey, can anyone give me an idea about the relative efficiency of the underlying assembly language code for the shifting macros vs. the union/structure method?

I suppose that most times the code ends up more or less identical, but I wonder whether, on this particular CPU with this particular compiler, there is a major advantage one way or the other.

My main reason for asking is that I am dreaming about 100 million things I want to do in the interrupt service routine (ISR), and 88 million of them involve such bit piddling. Not being one who likes to stack up ISR upon ISR upon ISR, I am concerned about the efficiency of the code that does these things.

Your comments welcome...

Joe J.

Noah
08-01-2004, 13:57
The original macro is correct, you're just not used to thinking like a computer person; when counting, always start with 0. Bit 0 is the least significant and bit 7 is the most.

Also, GETBIT cannot return 2 or 5, it can only return 0 or 1 since 'AND'ing anything with 1 will get rid of all but the least significant bit.
Oops, I missed the significance of the parentheses around the shift operation... You're right! It should work as you wrote it.

Random Dude
08-01-2004, 15:53
Hey, can anyone give me an idea about the relative efficiency of the underlying assembly language code for the shifting macros vs. the union/structure method?

I suppose that most times the code ends up more or less identical but I wonder on this particular CPU with this particular compiler, if there is a major advantage one way or the other.

Your comments welcome...

Joe J.

Well, let's throw a sample together and see what we get:

First, the union method:


typedef struct bits {
unsigned char b0:1;
unsigned char b1:1;
unsigned char b2:1;
unsigned char b3:1;
unsigned char b4:1;
unsigned char b5:1;
unsigned char b6:1;
unsigned char b7:1;
} BITS;

typedef union bit_char {
unsigned char byte;
BITS b;
} BIT_CHAR;

main ()
{
BIT_CHAR test1;
test1.byte = 5;
test1.b.b2 = 1;
}


And the corresponding assembly:


000802 cfd9 MOVFF 0xfd9,0xfe6
000806 cfe1 MOVFF 0xfe1,0xfd9
00080a 52e6 MOVF 0xe6,0x1,0x0
00080c 0e05 MOVLW 0x5
00080e 6edf MOVWF 0xdf,0x0
000810 84df BSF 0xdf,0x2,0x0
000812 52e5 MOVF 0xe5,0x1,0x0
000814 52e5 MOVF 0xe5,0x1,0x0
000816 cfe7 MOVFF 0xfe7,0xfd9
00081a 0012 RETURN 0x0


And the macro version (using bit-shifting):


//get a bit from a variable
#define GETBIT(var, bit) (((var) >> (bit)) & 1)

//set a bit to 1
#define SETBIT(var, bit) var |= (1 << (bit))

//set a bit to 0
#define CLRBIT(var, bit) var &= (~(1 << (bit)))

main ()
{
char test1;
test1 = 5;
SETBIT(test1,2);
}


ASM:

000802 cfd9 MOVFF 0xfd9,0xfe6
000806 cfe1 MOVFF 0xfe1,0xfd9
00080a 52e6 MOVF 0xe6,0x1,0x0
00080c 0e05 MOVLW 0x5
00080e 6edf MOVWF 0xdf,0x0
000810 84df BSF 0xdf,0x2,0x0
000812 52e5 MOVF 0xe5,0x1,0x0
000814 52e5 MOVF 0xe5,0x1,0x0
000816 cfe7 MOVFF 0xfe7,0xfd9
00081a 0012 RETURN 0x0



So it appears that the compiler optimizes both methods to the same ASM code, at least when the bit position is a constant as it is in this sample. Therefore I'd say that which to use is a matter of personal preference.