How does a compiler implement division?

in programming •  6 years ago  (edited)

Many programmers realize they can replace division or multiplication by a power of two with a bit-shift operation that is (usually) more efficient. But what if we want to divide by a constant other than a power of two? What optimizations does a compiler make in these cases?

I'll show some examples using the godbolt.org Compiler Explorer.

Division by 2

Let's look at both signed and unsigned division:

int divideByTwo(int num) {
    return num / 2;
}

unsigned int divideByTwo(unsigned int num) {
    return num / 2;
}

This output is from gcc 8.1, with the -O2 optimization level, on an x86-64 architecture:

The unsigned version is pretty simple: move the argument (in edi) to where it needs to be for the return value (eax) and then shift it right. The shr instruction moves the bits right by one position, and zero-fills from the left.

The signed version is a bit more complicated; why is it doing two shifts and an add? The sar instruction ("Shift Arithmetic Right") moves bits right and "sign-fills" from the left: if the most-significant bit was a 1 it stays a 1, and if it was a 0 it stays a 0.

With positive numbers in twos-complement notation this is just fine. For example, 11 is

00000000 00000000 00000000 00001011 in binary, and if we shift it we get

00000000 00000000 00000000 00000101 which is 5. But what about -11? In twos-complement notation that is

11111111 11111111 11111111 11110101 but shifting it with sar gives

11111111 11111111 11111111 11111010 which is -6, not -5. Integer math is supposed to round towards zero.

So the compiler rounds negative quotients up toward zero by taking the sign bit (the value shifted right by 31 places) and adding it before the shift. (Positive numbers are unchanged.) When we do that with -11 we get the correct answer:

11111111 11111111 11111111 11110101 = -11, adding one gives

11111111 11111111 11111111 11110110 = -10, shifting right by one bit gives:

11111111 11111111 11111111 11111011 = -5 as desired.

Division by 3

The compiler divides by doing multiplication! This is not too surprising: to divide by a real number X, we can multiply by the reciprocal of X, that is, 1/X. But when X is an integer with magnitude greater than 1, 1/X isn't an integer.

The trick the compiler is playing here is that multiplying a 32-bit value by another 32-bit value gives a 64-bit result. C and C++ don't normally make those extra bits available to you. But if you read the documentation for mul you'll see that the upper 32 bits of the result go in edx and the lower 32 bits in eax.

That funny value in the disassembly, in unsigned form, is 1431655765, which in binary is 01010101 01010101 01010101 01010101. It happens that 1431655765 / 2^32 = 0.33333333325572311878204345703125, very close to 1/3. In fact, it's close enough that the compiler can use it in place of 1/3 for any 32-bit value. (I don't know why the disassembly showed it as a negative number; mul is an unsigned multiplication.)

The way the compiler does this is by treating the special constant as a fractional part. If we wanted to multiply ABCDEF by 0.333333, we could multiply the integer ABCDEF by the integer 333333 and then put the decimal point in the right place. (This is how you're taught to do it in grade school, more or less.) So, when we multiply the input by 1431655765, we interpret the result as if we had actually multiplied by 0.010101010101... in binary, which is close to 1/3.

An example: With input 100 decimal, the multiplication gives

         00000000 00000000 00000000 01100100
 x     0.01010101 01010101 01010101 01010101
= 100001.01010101 01010101 01010101 00110100

The lower 32 bits can be thrown away, and the answer we want is in the upper 32 bits of the answer: 100001 in binary is 33 decimal. So the compiler turned integer division into "fixed-point" math and multiplied by a reciprocal. (Or, we can think of it as multiplying by 1/3 * 2^32 --- and then dividing by 2^32 again to get the desired answer, when the compiler moves the upper 32 bits back into eax.)

The signed version of this function is a little more complicated, again, because of the need to make rounding come out the right direction.

Division of a floating-point number by 2

What does the compiler do with the following two examples?

double divideByTwo(double num) {
    return num / 2.0;
}

double divideByTwoPointFive(double num) {
    return num / 2.5;
}

Surprisingly, the first one is implemented as multiplication, again!

The compiler can calculate an exact reciprocal of 2.0 as a floating-point number: 1/2.0 is just 2^-1, and that's the literal the compiler uses (although the floating-point representation is not as easy to read). Therefore the division the programmer asked for and multiplication by the reciprocal are exactly identical.

But the value 1/2.5 doesn't have an exact representation in binary floating-point, even though 2.5 itself does. The compiler uses the "divide" instruction and lets the hardware deal with how to properly round the result.
