




eTRNG Statistical Test Results 

The output of eTRNG was extensively analyzed using a number of common statistical test suites used for assessing the randomness of random number generators.
The three most significant test suites used were Diehard, NIST 800-22 and FIPS 140-2. The results of these tests are summarized below.

Diehard
The Diehard test suite was developed in 1995 by Dr. George Marsaglia, then Professor Emeritus of Statistics at Florida State University. It was
considered the gold standard for testing random number generator quality for many years and is still one of the top three random number test suites.

The Diehard test suite is run on a 10 Mbyte data set and produces over 210 scores referred to as p-values. Individual test failures are indicated by a p-value of 0 or 1, but there
are no hard pass/fail criteria for the suite as a whole; it is up to the user to decide whether the test scores indicate an acceptable level of randomness for their application. The Diehard results file contains
almost 900 lines of text (including the 210+ p-values), so the best way to interpret the results is graphically. Without real pass/fail criteria, this is particularly useful when comparing the
results for two sets of data. In the graph below on the left are the Diehard results for a data set produced by a well-known hardware true random number
generator (HW TRNG) based on the decay of a radioactive isotope. The graph in the center shows the results for a data set generated using eTRNG. The graph on the right
shows the results for Mersenne Twister, a software PRNG that is generally considered to be one of the better PRNGs (except for security-related applications).

The red line in each graph is plotted using the p-values sorted from smallest to largest. The blue line represents a perfect progression from 0 to 1. The more closely the red
line tracks the blue line, the better that data set performed under the Diehard analysis, because the p-value distribution
should be uniform for a sequence of numbers with a high degree of randomness. A common misconception regarding the Diehard results is that a p-value of 0.5 indicates random data. Truly random data with no discernible patterns actually
produces a uniform distribution of p-values between 0 and 1, which gives an arithmetic mean very close to 0.5. Deviations from a uniform distribution (appearing as divergences
from the blue line) indicate that the Diehard tests detected signs of low randomness in the data. Slight divergences between the blue and red lines aren't necessarily bad,
particularly if the diverging segment is fairly short and runs parallel to the blue line. Patterns to look for that indicate a lack of randomness are:
- Nearly vertical segments indicate a number of tests produced very similar p-values. The longer the vertical segment, the higher the number of tests that produced a
  similar result. Pay particular attention to vertical segments just above the '0' line or just below the '1' line; groups of similar p-values in these areas are strong
  indicators of a lack of randomness.
- Nearly horizontal segments indicate gaps in the p-values. The longer the horizontal segment, the larger the gap in p-values.
- Line segments following the '0' or '1' line may indicate test failures or p-values that are very close to 0 or 1. This appears at the beginning of the Mersenne Twister
  line, where the first 5 p-values are below 0.01. If one of these segments is very noticeable and very flat, it is likely due to test failures.
- A smooth red line indicates a more uniform distribution. A jagged red line with vertical segments followed by horizontal segments indicates a poor distribution with
  groups of similar p-values followed by gaps in p-values. The closer these jagged segments are to right angles, the tighter the groups of similar values and the larger the gaps
  between segments of uniform distribution. This is noticeable in several places on the HW TRNG line.
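The sorted p-value plot described above can be reproduced with a few lines of code. The following is a minimal sketch, not the tooling used to produce the graphs on this page: the function name `sorted_pvalue_deviation` and the sample inputs are hypothetical, and the "largest gap" figure is simply one convenient way to put a number on how far the red line strays from the blue line.

```python
import random


def sorted_pvalue_deviation(pvalues):
    """Return (sorted p-values, ideal line, largest vertical gap between them)."""
    if len(pvalues) < 2:
        raise ValueError("need at least two p-values")
    ordered = sorted(pvalues)  # the "red line"
    n = len(ordered)
    # The "blue line": a perfect, evenly spaced progression from 0 to 1.
    ideal = [i / (n - 1) for i in range(n)]
    max_gap = max(abs(p - q) for p, q in zip(ordered, ideal))
    return ordered, ideal, max_gap


# Illustrative inputs only; these are NOT eTRNG or HW TRNG results.
rng = random.Random(1)
uniform_like = [rng.random() for _ in range(210)]
clustered = [0.5 + 0.01 * rng.random() for _ in range(210)]  # one tight group

gap_uniform = sorted_pvalue_deviation(uniform_like)[2]
gap_clustered = sorted_pvalue_deviation(clustered)[2]
print(f"uniform-like gap: {gap_uniform:.3f}")
print(f"clustered gap:    {gap_clustered:.3f}")
```

A tight cluster of similar p-values shows up as a large gap, which corresponds to the nearly vertical segment (and the flat stretches of missing values) described in the list above.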



Another way to look at the Diehard results is to look at the p-value distribution across the ranges of 0 to 0.1, 0.1 to 0.2, 0.2 to 0.3 and so on up to 0.9 to 1. For a perfectly
uniform distribution, 10% of the p-values would fall within each of those ranges. The graph below shows the distribution for the same data sets as used in the graphs above. The thick
black line indicates the 10% level. The green lines show the range of percentages for the eTRNG data and the red lines show the range for the hardware TRNG data.
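The range-by-range view is a ten-bin histogram of the p-values. A minimal sketch follows; the function name `decile_percentages` and the sample values are hypothetical, and placing a p-value of exactly 1 in the last range is an assumption of this sketch.

```python
def decile_percentages(pvalues):
    """Percentage of p-values in each range 0-0.1, 0.1-0.2, ..., 0.9-1."""
    if not pvalues:
        raise ValueError("need at least one p-value")
    counts = [0] * 10
    for p in pvalues:
        # Assumption: a p-value of exactly 1.0 is counted in the last range.
        index = min(int(p * 10), 9)
        counts[index] += 1
    return [100.0 * c / len(pvalues) for c in counts]


# Illustrative values only; these are NOT measured results.
sample = [0.05, 0.15, 0.95, 1.0]
percentages = decile_percentages(sample)
spread = max(percentages) - min(percentages)
print(percentages)
print(f"spread between fullest and emptiest range: {spread:.1f} points")
```

For a uniform set of p-values every entry would sit near 10, so the spread between the largest and smallest percentage gives a quick measure of how tight the distribution is.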



This comparison of Diehard results shows that the eTRNG-generated data compares very favorably to the data from a high-quality hardware true random number generator based
on radioactive isotope decay (considered by many to be the perfect source of entropy). With the cleaner line in the p-value plot and the tighter range on the p-value distribution graph,
it could be said that eTRNG outperformed this particular HW TRNG (the irregularities in the HW TRNG p-values are likely due to Geiger counter measurement anomalies or inaccuracies, or to how the data is
manipulated in producing the random number output). What these results don't show is that the HW TRNG produces about 100 bytes/sec while eTRNG can generate over 60 Mbytes/sec, and eTRNG
is a much less expensive solution.

Diehard test results files (57KB text files):


NIST 800-22
The 800-22 suite of tests, developed by the US National Institute of Standards and Technology, has replaced Diehard as the gold standard for judging the quality of random
number generators. While many of the 800-22 tests are similar in nature to those in Diehard, the results are presented in a way that is more easily judged as pass/fail and is less subjective
than the Diehard test results. Each test in 800-22 has specified failure criteria, and an overall pass rate greater than 95% is considered to indicate a high degree of randomness.
Each run of 800-22 produces 190 test scores.

The table below shows the 800-22 (version 2.1.1) results for three 100 Mbyte data sets generated by eTRNG Advanced ("Pass 2" is the data set used for the Diehard results above) and published
results for the same HW TRNG data set used for the Diehard results above. For the tests that produce multiple scores, the average of those scores is presented. The best scores for each
of the tests are highlighted in green. Each of the eTRNG data sets was analyzed as 100 strings of 1 Mbyte using the 800-22 default settings. Across the
three passes (a total of 570 tests) there was a single failure, on one of the Nonperiodic-template tests in Pass 3 (a score of 95 against a minimum passing score of 96). The average of the
pass rates for the three eTRNG data sets is 98.9%. As with the Diehard results, the eTRNG results on the 800-22 test suite are again comparable to the HW TRNG results.

                                           Pass rate percentage
Test                          # of tests   Pass 1   Pass 2   Pass 3   HW TRNG
Frequency                     1            100      100      100      97.66
Block-frequency               1            100      98       99       100
Cumulative sums               2            100      100      100      97.66
Runs                          1            99       98       100      96.09
Longest run                   1            100      97       100      100
Rank                          1            100      99       98       98.44
FFT                           1            99       99       98       99.22
Nonperiodic-templates         148          99.15    99.07    98.97    98.97
Overlapping templates         1            98       97       98       97.66
Universal                     1            99       98       99       96.09
Apen                          1            97       97       99       100
Random excursions             8            98.82    99.03    98.87    99.69
Random excursions - variant   18           99.26    99.20    98.75    98.69
Serial                        2            98.5     98.5     99.5     99.22
Linear-complexity             1            100      98       99       99.2
Overall average                            99.15    98.54    99.06    98.51
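The minimum passing score of 96 quoted above is consistent with the proportion-of-passing-sequences check described in the NIST 800-22 documentation, where the acceptable range is p ± 3·sqrt(p(1 − p)/m) with p = 1 − α and m the number of sequences tested. The sketch below assumes the default significance level α = 0.01 and the 100 sequences per data set stated above; the function name `min_pass_proportion` is hypothetical. It also reproduces the 98.9% average from the "Overall average" row.

```python
import math
from statistics import mean


def min_pass_proportion(num_sequences, alpha=0.01):
    """Lower bound of the acceptable pass proportion for one test."""
    p_hat = 1.0 - alpha
    return p_hat - 3.0 * math.sqrt(p_hat * alpha / num_sequences)


# Assumption: alpha = 0.01 (the suite default) and 100 sequences per data set.
threshold = min_pass_proportion(100)
print(f"minimum pass proportion: {threshold:.4f}")
print(f"approximate minimum score out of 100: {threshold * 100:.0f}")

# A score of 95 out of 100 falls below the bound, matching the single failure.
print("95/100 below bound:", 95 / 100 < threshold)

# Average of the three eTRNG overall pass rates from the table.
etrng_overall = [99.15, 98.54, 99.06]
print(f"average eTRNG pass rate: {mean(etrng_overall):.1f}%")
```

Because the bound depends on the number of sequences, analyzing a data set as more (or fewer) strings would change the minimum score; the figure of 96 applies only to the 100-string setup used here.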


NIST 800-22 test results files (18KB text file):


FIPS 140-2
The US Federal Information Processing Standard (FIPS) 140-2 test suite is primarily intended for analyzing random numbers for use in secure applications such as password
generation and encryption key creation. These tests primarily look for issues that could compromise security in these applications, such as predictability and long sequences
of '0' or '1' bits. The FIPS 140-2 suite performed 1,000 tests on data strings in the first 20,000,032 bits of each data set. Lacking features such as whitening and balancing, eTRNG is not intended
for use in these security-related applications, but it does perform very well on the FIPS 140-2 tests, as shown below.

                        Test failures
Test             Pass 1   Pass 2   Pass 3
Monobit          0        0        0
Poker            0        0        0
Runs             1        1        0
Long run         1        0        0
Continuous run   0        0        0
Passing %        99.8     99.9     100
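To give a feel for what these tests check, the sketch below implements three of them (monobit, poker and long run) on a single 20,000-bit block, plus the passing-percentage arithmetic from the table. It is an illustration only, not the tool used to produce the results above. The function names are hypothetical, and the thresholds are the ones published in the original FIPS 140-2 statistical test description; treat them as assumptions and check them against the standard before relying on them.

```python
import random

BLOCK_BITS = 20_000  # each test examines one 20,000-bit block


def _check_length(bits):
    if len(bits) != BLOCK_BITS:
        raise ValueError("expected a block of exactly 20,000 bits")


def monobit_ok(bits):
    """Count of '1' bits must fall strictly between 9,725 and 10,275."""
    _check_length(bits)
    return 9_725 < sum(bits) < 10_275


def poker_ok(bits):
    """Chi-square style check on the 5,000 consecutive 4-bit segments."""
    _check_length(bits)
    counts = [0] * 16
    for i in range(0, BLOCK_BITS, 4):
        nibble = (bits[i] << 3) | (bits[i + 1] << 2) | (bits[i + 2] << 1) | bits[i + 3]
        counts[nibble] += 1
    x = (16 / 5000) * sum(c * c for c in counts) - 5000
    return 2.16 < x < 46.17


def long_run_ok(bits):
    """No run of identical bits may reach a length of 26 or more."""
    _check_length(bits)
    longest = current = 1
    for prev, cur in zip(bits, bits[1:]):
        current = current + 1 if cur == prev else 1
        longest = max(longest, current)
    return longest < 26


def passing_percent(failures, tests=1000):
    """Passing percentage as reported in the table (1,000 tests per pass)."""
    return 100.0 * (tests - failures) / tests


# Illustrative input only: Python's built-in PRNG, NOT eTRNG output.
rng = random.Random(2024)
block = [rng.getrandbits(1) for _ in range(BLOCK_BITS)]
print("monobit:", monobit_ok(block))
print("poker:  ", poker_ok(block))
print("longrun:", long_run_ok(block))
print("Pass 1 passing %:", passing_percent(2))
```

A strictly alternating 0101... block is a useful sanity check: it passes the monobit and long run tests but fails the poker test, which shows why the suite applies several different tests to each block.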




