AES-GCM-SIV is misuse resistant authenticated encryption (MRAE), but like all MRAE schemes it still leaks information (equality of plaintexts) in a nonce reuse situation and is not IND-CPA-secure in that case.
This is the crux of it.
In the event of nonce reuse, AES-GCM-SIV fails to provide IND-CPA.
From RFC 8452 (p. 12):
We stress that nonce misuse-resistant schemes guarantee that if a
nonce repeats, then the only security loss is that identical
plaintexts will produce identical ciphertexts.
I'm not really sure how to interpret these bounds.
These are not bounds on the adversary's IND-CPA advantage.
They are bounds on the adversary's advantage in the mrAE game.
The mrAE game is defined in Appendix A of the original GCM-SIV paper.
The adversary is given two oracles:
- encrypt(nonce, ad, message)
- decrypt(nonce, ad, ciphertext)
The adversary's goal is to distinguish:
- oracles computing AES-GCM-SIV under a uniform random key, from
- oracles returning
- uniform random ciphertexts of the correct length on encryption queries, and
- unconditional failure on all decryption queries.
We assume the adversary never repeats an encryption query, and never submits the answer to an encryption query as a decryption query.
The bounds reported in the RFC put additional assumptions on the adversary, for example requiring each nonce submitted to the encryption oracle to be chosen independently uniformly at random.
(In practice, this assumption reflects a system where the adversary doesn't really choose the nonce; the legitimate user chooses it.)
What the bounds you quoted tell you about is roughly the probability the adversary can do more than identify repeated plaintexts—for instance, they can't forge messages and can't learn anything about messages except equality.
These bounds are 'beyond birthday bound' because the notion of security takes as a premise that revealing message equality is not a problem, and they work by deriving a fresh key per message (which is, incidentally, very costly in software AES) so you're not working with a fixed 128-bit permutation per key.
In contrast, even assuming messages don't repeat, bounds on the adversary's advantage against AES-SIV (RFC 5297) are limited by the 128-bit birthday bound.
What is the relevance of limiting any particular nonce being reused to at most 256 times if the total number of colliding nonces is enormous?
If a single nonce is reused across many messages, then the bounds on the adversary's advantage are essentially no better than for AES-SIV, which is limited by the 128-bit birthday bound—for $q$ queries with the same nonce the advantage bound has a $q^2\!/2^{128}$ term (actually $L q^2\!/2^{128}$ for AES-GCM-SIV, where $L$ is the maximum message length in 16-byte blocks).
After a large number of reuses you might hit a tag collision in POLYVAL (GHASH with the bits reversed), which would lead to a repeated AES-CTR key stream and all sorts of bad things happen.
If the number of reuses of any particular nonce is limited, though, the adversary's advantage arising from the possibility of a tag collision is small enough not to matter.
Again, this assumes messages don't repeat or you don't care about leaking message equality.
And am I correct to assume that if you actually want IND-CPA security (with misuse-resistance), then AES-GCM-SIV is limited to a maximum of $2^{32}$ messages?
Yes.
Suppose the adversary is given oracles encrypt(ad, message) and decrypt(ad, ciphertext) for a randomized cipher, and is allowed to repeat queries.
Here's an adversary to distinguish an AES-GCM-SIV oracle (with nonce chosen internally, uniformly at random, on encryption queries) from a uniform random oracle: just submit the same (ad, message) pair to the encryption oracle over and over again and check for a repeated ciphertext—a nonce collision inside AES-GCM-SIV will likely happen long before a ciphertext collision in the uniform random oracle, so this adversary achieves advantage ${\approx}q^2\!/2^{96}$ after $q$ queries, or ${\approx}2^{-32}$ after $2^{32}$ queries.
If you try to set bounds on traditional IND-CPA advantage instead of mrAE advantage, you will run into problems with nonce collisions as you found.
There's no way around the birthday bound for IND-CPA: for the same message (and AD), a nonce collision means all inputs to the encryption function are identical and so the ciphertext must be the same, revealing the repeated message.
If that's a problem, of course the birthday bound still applies; AES-GCM-SIV has no magic to circumvent it.
There is a way around the 128-bit birthday bound for higher security: have a larger space of birthdays!
Daence (implementations) uses a 24-byte tag instead of a 16-byte tag so that it can:
- provide a simpler interface (no separate nonce parameter),
- let you use public or secret nonces—deterministically or randomly and of any size according to your needs!—by just putting them in the associated data or message, and
- set much tighter bounds on advantage for much larger numbers of messages and volumes of data.
For instance, you could use a compact 16-bit sequence number if you only send $2^{16}$ messages per key, or a 128-bit random UUID in long-term asynchronous applications, and just put it in the associated data or message.
Still no magic to prevent message equality from leaking if your nonces collide, of course.
But you can make the probability of nonce collisions negligible by using a large enough uniform random nonce for your application, or by using sequential nonces (which RFC 8452 explicitly advises against for AES-GCM-SIV!) in online interactive protocols where there's no danger of state rollback, or a combination of the two.
Bonus: much smaller code footprint on top of the primitives (Poly1305 and either Salsa20 or ChaCha), and no severe penalty for software implementations like AES-GCM-SIV has from deriving a fresh AES key per message.
Comparison of Daence, AES-SIV, and AES-GCM-SIV advantage bounds, for many message sizes, in the deterministic ('misused'), randomized, and sequential settings (source code).
(Disclosure: I designed Daence.)