On my current project, we recently ran into the problem of calculating a cumulative binomial distribution probability. The formula for this is relatively straightforward.
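In standard textbook notation (the symbols are chosen here for illustration: n independent trials, success probability p, and at most k successes), the cumulative probability is:

```latex
F(k; n, p) = \sum_{i=0}^{k} \binom{n}{i} \, p^{i} (1-p)^{n-i},
\qquad
\binom{n}{i} = \frac{n!}{i! \, (n-i)!}
```

Each term is a perfectly reasonable probability between 0 and 1, but the binomial coefficient is built from factorials.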

Any decent programmer could code this up in short order. In fact we used the function provided by Math.Net.

But there's a problem. Although the final results are modest numbers, some of the intermediate results are... well, 'astronomical' is too small a word. Our failing case involved calculating 12798! (12798 factorial), which in case you didn't know is 12798 * 12797 * 12796 ... * 2. Yeah, that's a big number. According to Wolfram Alpha it works out at a number more than 47,000 digits long.

Compare that to the estimated number of protons in the universe, which is a mere 10^80.
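To get a feel for the scale, here is a small Python sketch (not the F# or Math.Net code from the project) that checks whether 12798! fits in a double-precision float, and estimates its size via the log-gamma function. The variable names are just for illustration.

```python
import math

n = 12798

# Python integers are arbitrary precision, so the factorial itself is computable...
big = math.factorial(n)

# ...but converting it to a 64-bit float fails, because the largest
# finite double is roughly 1.8e308.
try:
    float(big)
    overflowed = False
except OverflowError:
    overflowed = True

# lgamma(n + 1) equals ln(n!), so dividing by ln(10) gives log10(n!)
# without ever forming the huge number as a float.
log10_factorial = math.lgamma(n + 1) / math.log(10)

print("float conversion overflowed:", overflowed)
print("approximate decimal digits in n!:", int(log10_factorial) + 1)
```

Any implementation that forms the factorial as a floating-point intermediate will hit this wall, even though the final probability is an ordinary number between 0 and 1.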

I had a think about this, and I wondered: how is Excel doing this? Excel has a BINOM.DIST function which copes with large inputs just fine, and fairly quickly. As luck would have it, Microsoft have actually published the logic they use, here: http://support.microsoft.com/kb/827459

Kudos to them for both developing this algorithm, and publishing it.
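One general way to avoid factorials altogether is to use the ratio between neighbouring terms, P(i) / P(i-1) = ((n - i + 1) / i) * (p / (1 - p)). Start at a term near the peak, call its value 1, walk outwards in both directions, and divide by the running total at the end. The Python sketch below illustrates that idea only: it is not a transcription of Microsoft's published logic or of the F# function, and the name `binomial_cdf` and the `tiny` cutoff are assumptions made for the example.

```python
def binomial_cdf(k: int, n: int, p: float, tiny: float = 1e-12) -> float:
    """P(X <= k) for X ~ Binomial(n, p), computed without factorials."""
    if k < 0:
        return 0.0
    if k >= n or p <= 0.0:
        return 1.0
    if p >= 1.0:
        return 0.0  # all mass sits at n, and k < n here

    m = int(n * p)          # a starting index at or next to the peak
    odds = p / (1.0 - p)

    total = 1.0                       # unscaled term at i = m is defined as 1
    below = 1.0 if m <= k else 0.0    # unscaled mass with i <= k

    # Walk upwards from m + 1 using P(i) / P(i-1).
    term, i = 1.0, m + 1
    while i <= n:
        term *= (n - i + 1) / i * odds
        total += term
        if i <= k:
            below += term
        if term <= tiny:    # remaining terms are negligible (assumed cutoff)
            break
        i += 1

    # Walk downwards from m - 1 using P(i) / P(i+1).
    term, i = 1.0, m - 1
    while i >= 0:
        term *= (i + 1) / (n - i) / odds
        total += term
        if i <= k:
            below += term
        if term <= tiny:
            break
        i -= 1

    # Normalise: the unscaled terms share a common unknown factor.
    return below / total


print(binomial_cdf(6399, 12798, 0.5))
```

Because every unscaled term is measured relative to the peak, nothing grows beyond roughly 1, so there is no overflow however large n gets. The trade-off in this sketch is that probabilities far out in the tails, smaller than the cutoff relative to the peak, are effectively reported as zero.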

I was able to re-implement this in F# relatively easily, and our binomial statistical tests now work just fine. Excel switches to the special algorithm only for large inputs, but I didn’t bother: for our purposes the results don’t appear to be affected by which approach is used.

Since then I’ve refined the code a bit to make it less repetitive (and not proprietary!). Here is the entire function.

(Full project with unit tests here: https://github.com/misterspeedy/Binomial)

If anyone fancies adding a more comprehensive set of tests, and incorporating the code into Math.Net, please feel free.