Score:28

Why is FIPS 140-2 compliance controversial?

I was reading the comments on an article about a proposed new implementation of /dev/random in Linux today, and someone remarked that it must be bothersome to go through 43 revisions and still not have your patch land. A few comments down the line, someone seemingly implied that this new implementation would be FIPS 140-2 compliant, and that this is controversial: "a developer of one famous VPN kernel module" who "purposely utilize[s] only non-NIST approved algorithms" has a "strong opinion against FIPS compliance needed for governmental use cases".

Why is this? What is controversial about FIPS 140-2 compliance?

Another thing that makes practitioners in the field frown on FIPS (albeit maybe not in the `/dev/random` context) is that code that has been tested to be standards-compliant consistently lags behind the "latest" code -- it will be missing bug fixes, including fixes for security bugs, and it is very expensive to retest. Moreover, the set of cipher suites specified by FIPS and the set actually considered best practice in the real world drift out of alignment over time. An audited-to-be-FIPS-compliant implementation -- of anything -- is an _old_ implementation.
Part of it is alluded to farther down. For a system to be compliant, it must not be possible to use non-compliant algorithms and methods. This isn't difficult at the platform OS level: for example, in Windows, when FIPS mode is enabled, it isn't possible to create or use plain-text recovery-agent volume encryption keys. For application developers it may not be as easy. And you can only use older algorithms, which may rule out some of the crypto algorithms in managed frameworks (Java/.NET).
Score:33

I'll add to the other answer: the FIPS 140-2 certification rules for RNGs were flawed, and FIPS 140-2 change notice 2 (Dec. 2002) removed the part on self-tests. They are literally struck out from the standard, leaving a vacuum. Thus FIPS 140-2 prescribes no technically satisfactory test of the entropy source, never did, and that's an issue. It only prescribes approved cryptographic conditioning (on which I have no technical reservations now that Dual_EC_DRBG is out).

It originally prescribed four tests (monobit, poker, runs, and long runs) to be performed on operator demand (at level 3) or at each power-up (level 4), with manual intervention required if a test fails. That's wrong for several reasons:

  • The acceptance levels are very stringent (more so than they were in FIPS 140-1). Even with a perfect generator, the tests will fail in the field with sizable probability, with human intervention mandated. This is plainly unacceptable in some applications, including OSes, TPMs, and Smart Cards. Anything operated unattended can't be level 4, and contortions are required at level 3 to justify the "operator demand" thing.
  • It's not specified that these tests should be run on the unconditioned entropy source. Thus it is tempting to run the tests on the random numbers as output by some conditioning block, which makes the tests largely pointless: if the unconditioned entropy source fails or becomes low-entropy, a test of the conditioned output won't catch that, unless the conditioning has a huge theoretical flaw.
  • The acceptance level of some tests (the monobit test in particular, and to a lesser degree the poker test) is such that a slight bias, which is perfectly normal and harmless for an unconditioned entropy source followed by proper cryptographic conditioning, will cause a disastrously high rejection rate. Thus it's essentially impossible to apply the tests to the material that requires testing (the sketch after this list puts numbers on the first and third points).
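
To put rough numbers on the first and third bullets, here is a minimal sketch in Python. The pass band (a 20,000-bit block passes the monobit test iff the count of ones lies strictly between 9,725 and 10,275) is from FIPS 140-2 as published; the bias values are purely illustrative:

    # Per-block failure probability of the FIPS 140-2 monobit test,
    # via the normal approximation to the binomial distribution.
    from math import erfc, sqrt

    N, LO, HI = 20_000, 9_725, 10_275   # block size and pass band

    def fail_prob(p: float) -> float:
        """P(ones <= LO or ones >= HI) when each bit is 1 with probability p."""
        mu, sigma = N * p, sqrt(N * p * (1 - p))
        upper = erfc((HI - 0.5 - mu) / (sigma * sqrt(2))) / 2
        lower = erfc((mu - (LO + 0.5)) / (sigma * sqrt(2))) / 2
        return upper + lower

    print(f"perfect source, p=0.50: {fail_prob(0.50):.1e}")  # ~1e-4 per block
    print(f"1% bias,        p=0.51: {fail_prob(0.51):.1e}")  # ~15% of blocks fail
    print(f"2% bias,        p=0.52: {fail_prob(0.52):.1e}")  # ~96% of blocks fail

So a perfect source is mandated to halt for human intervention about once per ten thousand blocks, while a harmless 1-2% bias, which proper conditioning erases entirely, is rejected a large fraction of the time.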

The last two issues remain with the later SP 800-22 Rev. 1a NIST statistical tests for RNGs for cryptographic applications (which, according to my limited understanding, are now used during certification). The math of the tests is fine. But as above, the tests are very sensitive, thus unusable on an unconditioned true entropy source (the tests would often fail), thus usable only at the output of a conditioned source, thus unable to detect the source's defects if the conditioning is good. And it's impossible to detect a competently backdoored generator from its output alone.
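
To illustrate why testing only after conditioning is pointless, here is a toy demonstration in Python (SHA-256 stands in for the conditioning block, and the 80% bias is arbitrary): a grossly defective raw source fails the monobit test, yet after conditioning the very same data sails through.

    # A heavily biased raw source vs. the same data after hash conditioning.
    import hashlib, random

    def biased_bits(n: int, p: float = 0.8) -> list[int]:
        return [1 if random.random() < p else 0 for _ in range(n)]

    def condition(bits: list[int]) -> list[int]:
        # Hash 256-bit chunks; emit the 256 digest bits per chunk.
        out = []
        for i in range(0, len(bits), 256):
            digest = hashlib.sha256(bytes(bits[i:i + 256])).digest()
            out.extend((byte >> k) & 1 for byte in digest for k in range(8))
        return out[:len(bits)]

    def monobit_pass(bits: list[int]) -> bool:
        return 9_725 < sum(bits[:20_000]) < 10_275   # FIPS 140-2 band

    raw = biased_bits(20_000)
    print("raw source passes monobit:        ", monobit_pass(raw))            # False
    print("conditioned source passes monobit:", monobit_pass(condition(raw)))  # True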

So these tests either fail, giving good assurance that the material tested is distinguishable from random, or pass, giving an apparent assurance of security even if the source's entropy is very limited or the conditioning has a backdoor, either of which can allow an attack.

The people at NIST are competent, and it's reasonable to wonder if the true purpose of these tests is to give that illusion of security. That would be in line with a long history of the US actively sneaking in weakened crypto:

  • The elaborate, decades-long compromise of the Swiss firm Crypto AG, which sold deliberately weakened cipher machines.
  • DES: an NSA publication, now declassified, acknowledges that its 56-bit key is the result of a bargain between the designers, who wanted at least 64 bits (the Lucifer design had an option for 128-bit keys), and the NSA, which tried to impose 48 bits to ease cracking; see this.
  • Dual_EC_DRBG, mentioned in the other answer: a deliberate attempt (and for a time a success) to widely field an RNG, with a public design and parameters, that was secure except against US authorities (or whoever managed to change the public key, which happened).
Score:31

Because there was previously a NIST-approved random number generator (Dual_EC_DRBG) that was championed by the NSA and had a flaw that is generally assumed to be an intentional backdoor created by the NSA. This has made some people distrust any crypto algorithms that come out of NIST. There are lots of articles on the net about this; here's one by Schneier that explains the issue in a fair amount of detail.

D.Kovács
I personally always found this a bit of an overreaction. Yes, there was a debacle with Dual_EC_DRBG. But we still use AES and SHA-2, and will use whatever wins the post-quantum algorithm competition. My takeaway point has always been: don't trust standards *blindly*. I mean, the Brainpool EC curves also have some quirky constants, yet they generated a lot less noise than anything coming out of NIST.
Swashbuckler
@D.Kovács I understand your point, but the vast, vast majority of people do not have the technical expertise to evaluate crypto algorithms for flaws (intentionally introduced or not) and thus they must trust blindly.
slebetman
@D.Kovács A deliberate attempt to weaken crypto is not a debacle, it is an attack. Also, as mentioned in fgrieu's answer, the FIPS 140-1 and 140-2 tests are flawed. As such they can also be viewed as an attempt to weaken your cryptosystem by giving users a false sense of security. The controversy is a result of developers not trusting standards blindly -- as you proposed yourself.
user
@slebetman Do you have any facts to back up the claim that it was deliberate?
slebetman
@user NSA memos released by Edward Snowden and reported by the New York Times indicate that the Dual_EC_DRBG case was a deliberate attempt by the NSA to insert a mathematical backdoor (by making it easier for them to guess random numbers) into encryption standards. We don't have the original documents of course because, unlike Wikileaks, Snowden did not release the secret documents to the public - only to the press. It's up to you whether you want to believe the NYT or the NSA, but my money is on the NSA being the liars - because that's their job.
Mark
@user, we don't *know* that the original Dual_EC_DRBG was backdoored by the NSA, but we do know that as-yet-unidentified attackers [modified the Q value of dual EC as implemented by Juniper Networks](https://www.schneier.com/blog/archives/2016/04/details_about_j.html) to get their own backdoor.
fgrieu
@user: Dual_EC_DRBG's backdoor was indisputably deliberate: the design makes sense only as such. And there's [a patent](https://worldwide.espacenet.com/textdoc?DB=EPODOC&IDX=US2007189527) clearly stating the intent. My observation that the RNG tests in FIPS 140 and SP 800-22 could be _deliberately_ trying to give a false sense of security is speculative, based on the fact that these tests are not a valid proof of security (e.g. they give Dual_EC_DRBG a clean bill of health). Incidentally, they are used as a bogus argument for the security of RNGs in dozens, perhaps hundreds, of substandard publications.
Joshua
@fgrieu: My turn to play devil's advocate. Let us say you wanted to reliably construct an RNG that doesn't leak state. Suppose you did so by constructing one from an asymmetric encryption algorithm that leaked its state with maximum efficiency, so that there is no free entropy left to leak anything else, and then destroyed the decryption key. Would that not prove the RNG doesn't leak state?
Score:7

You can make the best possible choice in all cases if you don't have to comply with FIPS 140-2. If you do have to comply with FIPS 140-2, you can only make the best approved choice. Thus FIPS 140-2 compliance never enables you to make better choices and sometimes forces you to make worse choices.

Say you have to choose between two options, one of which is FIPS 140-2 approved and the other is generally regarded as the much more secure choice by the cryptographic community. Which should you choose?

The answer is that you should definitely choose the one that is regarded as much more secure unless you must have FIPS 140-2 compliance. In that case, you must use the compliant one.

It is perfectly fine to use FIPS 140-2 approved methods when they make sense. The only difference FIPS 140-2 compliance makes is that it forces you to make worse choices in some cases.

fgrieu
Yes, FIPS 140-2 restricts the deterministic conditioning that can be used. But unlike Dual_EC_DRBG, those (still) approved are simple, and my (and many others') opinion is that they are no more backdoored than SHA-2 or AES are. Thus it's only on principle, not technical grounds, that we may want to reject them. IMHO the most real technical problem with FIPS 140-2 is the lack of an online test (an in-the-field test that the entropy source is good). There were some such tests; they were technically defective and are literally [struck out](https://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.140-2.pdf#page=65).
@fgrieu I've had cases where there was a bug in a FIPS 140-2 certified product that clearly compromised security, and my choices were to either fix the bug and lose FIPS 140-2 certification, or keep both the bug and the certification.
Score:1

My thoughts are too long for commenting, so I'll wrap them up as an answer...

There are some serious issues with the other two answers, to the extent that some of us have misinterpreted what a randomness test is. As bullet points:-

1. "distrust any crypto algorithms that come out of NIST". There are no NIST generated algorithms in FIPS. Certainly none of the complexity of Dual_EC_DRBG. Runs and Poker tests are not US Department of Commerce (NIST) proprietary algorithms. They are mathematical characteristics of a uniformly random distribution. If I posit that the expected number of ones should be ~50%, does that make me a subversive? Neither does expanding the mean of 0.5 with $n$ standard deviations. $\mathcal{N}(\mu, \sigma^2)$ is the standardised form for that distribution and I wouldn't expect anything less incomplete. Checking for repeat output blocks (Continuous random number generator test) is not subversion, it's common sense.

2. Can I offer this FIPS test as evidence:-

$ cat /dev/urandom | rngtest
rngtest 5
Copyright (c) 2004 by Henrique de Moraes Holschuh
This is free software; see the source for copying conditions.  There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

rngtest: starting FIPS tests...
rngtest: bits received from input: 8310580032
rngtest: FIPS 140-2 successes: 415198
rngtest: FIPS 140-2 failures: 331
rngtest: FIPS 140-2(2001-10-10) Monobit: 41
rngtest: FIPS 140-2(2001-10-10) Poker: 53
rngtest: FIPS 140-2(2001-10-10) Runs: 123
rngtest: FIPS 140-2(2001-10-10) Long run: 115
rngtest: FIPS 140-2(2001-10-10) Continuous run: 0
rngtest: input channel speed: (min=10.703; avg=1976.720; max=19073.486)Mibits/s
rngtest: FIPS tests speed: (min=75.092; avg=199.723; max=209.599)Mibits/s
rngtest: Program run time: 43724402 microseconds

The failure rate is p=0.0008. That's very comparable to the p=0.001 threshold within the SP 800-22 STS test suite, and to dieharder's:-

NOTE WELL:  The assessment(s) for the rngs may, in fact, be completely
  incorrect or misleading.  In particular, 'Weak' p values should occur
  one test in a hundred, and 'Failed' p values should occur one test in
  a thousand -- that's what p MEANS.  Use them at your Own Risk!  Be Warned!
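
These figures are easy to sanity-check (a few lines of Python; the ~1e-4 monobit expectation is my own estimate from that test's 9,725-10,275 pass band, not anything rngtest reports):

    # Observed per-block failure rates from the rngtest run above.
    blocks = 415_198 + 331                    # successes + failures = blocks tested
    for name, fails in [("Monobit", 41), ("Poker", 53), ("Runs", 123),
                        ("Long run", 115), ("Continuous run", 0)]:
        print(f"{name:15s} {fails / blocks:.1e} per block")
    print(f"{'Any test':15s} {331 / blocks:.1e}")   # the p=0.0008 quoted above

The monobit rate, 41/415,529 ≈ 1.0e-4, is about what an unbiased source should produce given that pass band, i.e. `/dev/urandom` is behaving as uniform randomness should.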

So not apparently controversial.

3. "It's not specified that these tests should be run on the unconditioned entropy source". Of course not. That's correct. No one has statistical characteristics for unconditioned entropy source distributions. They come in all shapes and locations. Some of them do not even have mathematical names (double sample of log normal, bathtub MOD $x$ e.t.c.) We can only run standardised statistical tests on conditioned final output.

4. "it's impossible to detect a competently backdoored generator from it's output alone". Again, of course. That's not the intention of e.g. FIPS startup testing. You need programmers and cryptographers for that. FIPS simply automates the randomness testing and sets out guidelines for basic security programming like no string literals for control, and relocatable code. All very normal.

Therefore FIPS 140 isn't all that contentious. Saying so is equivalent to saying NIST has backdoored the Normal distribution, or that dieharder is useless. FIPS is just not great at a few things. And testing 20,000-bit blocks fits neatly at the bottom end of the scale for randomness testing, just below ent (500,000 bits).

poncho
"There are no NIST generated algorithms in FIPS"; actually, CTR_DRBG, HASH_DRBG and HMAC_DRBG were designed by John Kelsey of NIST...
fgrieu
Some comments have been [moved to chat](https://chat.stackexchange.com/rooms/131745/discussion-on-answer-by-paul-uszak-why-is-fips-140-2-compliance-controversial) to keep this interesting exchange going.
Paul Uszak
@fgrieu This happens a lot to me, doesn't it?