bit manipulation - Binary coded decimal addition using integer -
if have 2 numbers in packed bcd format , want add them, approach add them this: convert both numbers integers, perform normal integer addition, convert result bcd?
the c99 code below adds packed bcd operands 8 bcd digits stored in uint32_t
. code can extended wider bcd operands choosing uint64_t
process 16 bcd digits. since approach relies on bit-parallel processing may not efficient narrow packed bcd operands.
in packed bcd format, each bcd digit occupies 1 nibble (4-bit group) of unsigned integer operand. if nibble-wise addition results in sum > 9, want carry next higher nibble. if use regular integer addition add 2 packed bcd operands, desired nibble carries not occur when nibble sum > 9, < 16. remedy this, can add additional 6 each nibble sum.
we can find nibble carries follows: bit-wise sum of 2 integers x
, y
x ^ y
. @ bit position has carry-in next lower bit position during regular integer addition, bits in x ^ y
, x + y
differ. can find bits carry-in (x ^ y) ^ (x + y)
. interested in bits 4, 8, ..., 32 carry-in, carry-outs bits 3, 7, ..., 31.
there slight problem if there carry-out bit 31 bit 32 since uint32_t
operands hold 32 bits. can detect if find sum of 2 unsigned integers smaller either of addends. 3 operations handling carry-out bit 31 can omitted when operating on seven-digit operands instead of eight-digit operands.
/* add 2 packed bcd operands, each uint32_t holds 8 bcd digits */ uint32_t bcd_add (uint32_t x, uint32_t y) { uint32_t t0, t1; t0 = x + 0x66666666; // force nibble carry when bcd digit > 9 t1 = x ^ y; // bit-wise sum t0 = t0 + y; // addition nibble carry t1 = t1 ^ t0; // (x ^ y) ^ (x + y) t0 = t0 < y; // capture carry-out bit 31 t1 = (t1 >> 1) | (t0 << 31); // nibble carry-outs in bits 3, 7, ..., 31 t0 = t1 & 0x88888888; // extract nibble carry-outs t1 = t0 >> 2; // 8 - (8 >> 2) = 6 return x + y + (t0 - t1); // add 6 digit nibble carry-out }
knuth, taocp vol.4a part 1, offers superior solution (requiring fewer operations) in answer exercise 100 section 7.1.3. variant particularly suited processor architectures instruction can evaluate logical function of 3 arguments, such lop3
instruction of modern nvidia gpus.
uint32_t median (uint32_t x, uint32_t y, uint32_t z) { return (x & (y | z)) | (y & z); } uint32_t bcd_add_knuth (uint32_t x, uint32_t y) { uint32_t z, u, t; z = y + 0x66666666; u = x + z; t = median (~x, ~z, u) & 0x88888888; return u - t + (t >> 2); }
Comments
Post a Comment