Download the product brief here:

Box-Muller Gaussian Noise Generator for FPGA v1.0

The Verilog source code and the bit-match verification script are now published on GitHub. Please go to


  • Based on the Box-Muller algorithm with no central limit theorem required
  • Function approximation units based on piecewise Chebyshev polynomials with range reduction
  • The period of the generated noise sequence is 2^88 (about 3 × 10^26)
  • 18-bit noise samples with 5 integer bits and 13 fractional bits, accurate to one unit in the last place (ulp) up to 8.25*sigma, which models the true Gaussian PDF accurately for a simulation size of over 10^15 samples
  • The core can be reset to its initial state and supports a clock-enable signal
  • Bit-true MATLAB and C mex programs included
  • Maximum clock rate and output sample rate of 320MHz on Xilinx VCU108 Board
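The fixed-point and tail figures in the list above can be checked with a few lines of arithmetic. The sketch below assumes a two's-complement format in which the sign bit is one of the 5 integer bits (the source does not say so), and the function names are illustrative only; they are not part of the core's deliverables.

```python
import math

INT_BITS = 5      # integer bits, assumed here to include the sign bit
FRAC_BITS = 13    # fractional bits
MAX_SIGMA = 8.25  # stated accuracy limit, in units of sigma


def ulp(frac_bits):
    """Weight of the least significant bit of the fixed-point format."""
    return 2.0 ** -frac_bits


def representable_range(int_bits, frac_bits):
    """(min, max) of a two's-complement value with the given bit split."""
    half = 2.0 ** (int_bits - 1)
    return -half, half - ulp(frac_bits)


def two_sided_tail(k):
    """P(|x| > k) for a standard normal variable x."""
    return math.erfc(k / math.sqrt(2.0))


if __name__ == "__main__":
    lo, hi = representable_range(INT_BITS, FRAC_BITS)
    p = two_sided_tail(MAX_SIGMA)
    print(f"ulp = {ulp(FRAC_BITS):.3e}, range = [{lo}, {hi}]")
    print(f"P(|x| > {MAX_SIGMA}) = {p:.2e}, about one sample in {1.0 / p:.1e}")
```

Under these assumptions the probability of a sample falling beyond 8.25 sigma is on the order of 10^-16, so a run needs more than 10^15 samples before such a value is expected, which is consistent with the simulation size quoted in the list.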

Box-Muller Algorithm

The Box–Muller transform, by George Edward Pelham Box and Mervin Edgar Muller [1], is a pseudo-random number sampling method for generating pairs of independent, standard, normally distributed (zero expectation, unit variance) random numbers, given a source of uniformly distributed random numbers.

The Box-Muller method starts with two independent uniform random variables over the interval [0, 1). The following mathematical operations are performed to generate two samples, x0 and x1, of a Gaussian distribution N(0, 1).
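
For reference, a sketch of the transform in its usual textbook form, written with the intermediate names e, f, g0 and g1 that appear in the accuracy table further down. The names u_0 and u_1 for the two uniform inputs are chosen here for illustration, and the assignment of sine to g0 and cosine to g1 is an assumption:

```latex
\[
\begin{aligned}
e   &= -2 \ln u_0,      & f   &= \sqrt{e},          \\
g_0 &= \sin(2\pi u_1),  & g_1 &= \cos(2\pi u_1),    \\
x_0 &= f \, g_0,        & x_1 &= f \, g_1 .
\end{aligned}
\]
```

Because ln 0 is undefined, the value u_0 = 0 has to be excluded or remapped in any practical implementation.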


Normal Distribution Performance Test

Test Environment

Taus_URNG, seeds are 2846420573 2846420573 2846420573

Taus_URNG, seeds are 912462866 912462866 912462866
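
Three seeds per generator and a period near 2^88 are what a three-component combined Tausworthe generator would give. The sketch below assumes Taus_URNG behaves like L'Ecuyer's published taus88 recurrence; the source does not state the exact recurrence, so the shipped Verilog may differ. The class and method names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


class Taus88:
    """Software model of the taus88 combined Tausworthe generator (assumed)."""

    def __init__(self, s1, s2, s3):
        self.s1, self.s2, self.s3 = s1 & MASK32, s2 & MASK32, s3 & MASK32
        # taus88 requires seeds larger than 1, 7 and 15 respectively.
        if self.s1 < 2 or self.s2 < 8 or self.s3 < 16:
            raise ValueError("seeds must satisfy s1 > 1, s2 > 7, s3 > 15")

    def next_u32(self):
        """Advance all three components and return a 32-bit word."""
        b = (((self.s1 << 13) & MASK32) ^ self.s1) >> 19
        self.s1 = (((self.s1 & 0xFFFFFFFE) << 12) & MASK32) ^ b
        b = (((self.s2 << 2) & MASK32) ^ self.s2) >> 25
        self.s2 = (((self.s2 & 0xFFFFFFF8) << 4) & MASK32) ^ b
        b = (((self.s3 << 3) & MASK32) ^ self.s3) >> 11
        self.s3 = (((self.s3 & 0xFFFFFFF0) << 17) & MASK32) ^ b
        return self.s1 ^ self.s2 ^ self.s3

    def next_uniform(self):
        """Uniform value in [0, 1) built from one 32-bit word."""
        return self.next_u32() / 2.0 ** 32


if __name__ == "__main__":
    # Seed triple taken from the test environment listed above.
    urng = Taus88(2846420573, 2846420573, 2846420573)
    print([urng.next_u32() for _ in range(4)])
```

Two instances seeded identically produce the same sequence, which is the property a bit-match check against a hardware model relies on.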

Number of generated samples used for analysis: 1e7

Fixed point Accuracy Evaluation

The fixed-point output is compared with the floating-point model (using the same URNG input):

  Floor truncation Round
e max error value (abs) 6.3224e-08 3.4341e-08
f max error value (abs) 3.1588e-05 2.3977e-05
g0/g1 max error value (abs) 1.0174e-04 1.0174e-04 1.1467e-04
x0/x1 max error value (abs) 4.8891e-04
x0/x1 mean value -5.6479e-04
x0/x1 variance value 0.99978
The results show that the accuracy loss caused by truncation is negligible, so truncation is used in the hardware.
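
The kind of comparison described above can be sketched in software. This minimal example quantizes only the final output of a floating-point Box-Muller model to 13 fractional bits, once by floor truncation and once by rounding, and reports the worst absolute error. It does not model the fixed-point datapath inside the core (the e, f and g stages), it uses Python's built-in generator instead of Taus_URNG, and all names are illustrative, so it is not expected to reproduce the figures in the table.

```python
import math
import random

FRAC_BITS = 13  # fractional bits of the output format


def box_muller(u0, u1):
    """Floating-point reference: two N(0, 1) samples from two uniforms."""
    f = math.sqrt(-2.0 * math.log(u0))
    angle = 2.0 * math.pi * u1
    return f * math.sin(angle), f * math.cos(angle)


def quantize(value, frac_bits, mode):
    """Quantize to a fixed-point grid by floor truncation or rounding."""
    scaled = value * (1 << frac_bits)
    if mode == "floor":
        q = math.floor(scaled)
    elif mode == "round":
        q = math.floor(scaled + 0.5)
    else:
        raise ValueError("mode must be 'floor' or 'round'")
    return q / (1 << frac_bits)


def max_abs_error(n, mode, seed=1):
    """Worst |quantized - reference| over n input pairs (2n samples)."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(n):
        u0 = 1.0 - rng.random()  # shift to (0, 1] so that log(u0) is defined
        u1 = rng.random()
        for x in box_muller(u0, u1):
            worst = max(worst, abs(quantize(x, FRAC_BITS, mode) - x))
    return worst


if __name__ == "__main__":
    for mode in ("floor", "round"):
        print(mode, f"{max_abs_error(20000, mode):.4e}")
```

In this simplified model, truncation is bounded by one ulp (2^-13) and rounding by half an ulp. Both are small compared with the unit variance of the output.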

Anderson-Darling test result

adtest result (vs. MATLAB randn() function):

  H P adstat cv
This design 0 0.7810 0.2398 0.7519
MATLAB randn() 0 0.3270 0.4209 0.7519

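H=0 means the test does not reject the hypothesis that the samples come from a normal distribution at the 5% significance level (the adtest default).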
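
For readers without MATLAB, the Anderson-Darling statistic for normality can be sketched with the Python standard library alone. This is not the adtest implementation: it uses the textbook statistic with Stephens' small-sample correction for the case where mean and variance are estimated from the data, and compares it with the commonly tabulated 5% critical value of about 0.752, which is close to the cv column in the table above. The function name is hypothetical.

```python
import math
import random
from statistics import NormalDist, fmean, stdev

CRITICAL_5_PERCENT = 0.752  # tabulated value for estimated mean and variance


def anderson_darling_normal(samples):
    """Corrected A^2 statistic for the hypothesis 'samples are normal'."""
    n = len(samples)
    mu = fmean(samples)
    sigma = stdev(samples)
    std = NormalDist()
    z = sorted((x - mu) / sigma for x in samples)
    total = 0.0
    for i in range(n):
        log_cdf = math.log(std.cdf(z[i]))
        # 1 - cdf(z) is computed as cdf(-z) to avoid cancellation in the tail.
        log_sf = math.log(std.cdf(-z[n - 1 - i]))
        total += (2 * i + 1) * (log_cdf + log_sf)
    a2 = -n - total / n
    return a2 * (1.0 + 0.75 / n + 2.25 / n ** 2)


if __name__ == "__main__":
    rng = random.Random(7)
    data = [rng.gauss(0.0, 1.0) for _ in range(5000)]
    stat = anderson_darling_normal(data)
    verdict = "reject" if stat > CRITICAL_5_PERCENT else "do not reject"
    print(f"A^2 = {stat:.4f}: {verdict} normality at the 5% level")
```

A statistic below the critical value corresponds to H=0 in the table; clearly non-normal data, such as uniform samples, gives a statistic far above it.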

Distribution Figure

Figure 1. PDF of the 1e7 generated samples

Resource Utilization and Throughput

Xilinx VCU108 FPGA Evaluation Board

Fmax  :  330 MHz (can be improved further)

Latency :  16 cycles

Figure 2. Resource Utilization