Make Branching and Condition Evaluation Faster

Some time back, I read how to make branching faster? I don't remember where I read it. May be on Reddit. This blog & in the future also, I will try to write similar posts whenever I find new tricks. You can find all code in the GitHub repository.

In this post, we will see the simplest example of finding odd & even numbers. below code will be used for all possible solutions & only conditions will change.

#include<iostream>
using namespace std;

int main() {
    int n = 999999999;

    int even = 0;
    int odd = 0;

    for(int i=0;i<n;i++) {
    // chaning part
        if(i % 2) {
            odd ++;
        } else {
            even ++;
        }
    // changing part ends
    }

    cout << "Even: " << even << endl;
    cout << "Odd: " << odd << endl;

    return 0;
}

Using hyperfine

I will be using hyperfine for finding the performance of code. You can check it https://github.com/sharkdp/hyperfine.

A. Simplest solution

Using if & else with % (remainder operator) is the common solution.

if(i % 2) {
    odd ++;
} else {
    even ++;
}

The simplest solution took an average time of 2.9 seconds.

B. Using bitwise instead of the remainder

Let's try the bitwise operator instead of the remainder operator. Using i & 1 instead of i % 2 will also work.

if(i & 1) {
    odd ++;
} else {
    even ++;
}

It looks like the bitwise condition took less average time but max time is larger than the previous approach.

C. Remove branching completely

It is possible to remove branching completely and use bitwise operations only. In the first condition, we are checking if the number is odd then increment odd counter. We can achieve it using && operator like (i & 1) && odd++ . odd will be incremented when (i & 1) is true. Similarly, we can achieve else condition also.

(i & 1) && odd++;
(i & 1) || even++;

D. Remove duplicate (i & 1)

We can replace 2 statements using a complex one-liner like below.

((i & 1) && odd++) || even++;

And this is worse than the previous one. It took on average 2.681 seconds.


References

[1] Youtube: Branchless Programming: Why "If" is Sloowww... and what we can do about it!