Score:2

Is exposing hash of private key provably secure?

US flag

Let's say we have an IND-CPA secure public key encryption scheme $\Pi = (\text{Gen}, \text{Enc}, \text{Dec})$. Construct a new PKE $\Pi' = (\text{Gen}', \text{Enc}', \text{Dec}')$ that behaves exactly as $\Pi$ except it additionally leaks the hash of the private key. For example, $\text{Gen}' = (sk, pk \vert \vert H(sk))$ where $(pk,sk) \xleftarrow{} \text{Gen}()$ and $H$ is a hash function (for concreteness let's say SHA256).

Is this new scheme "provably secure" in either the standard model (first preimage resistance, etc.) or random oracle model? That is, does $\Pi \text{ is IND-CPA secure } \land \text{ some assumptions on } H \implies \Pi' \text{ is IND-CPA secure}$?

Intuitively, it feels like a proof in the ROM should be possible. Naive intuition: if the hash of the private key looks like a random value to the adversary, it shouldn't be providing any additional information to breaking the PKE scheme. But I struggle to create a proof (possibly because I don't fully understand ROM proofs).

Here's a first pass.

  • Let $\mathcal{A}$ be a PPT against $\Pi'$ with advantage $\epsilon$
  • Construct adversary $\mathcal{B}$ against $\Pi$ using $\mathcal{A}$:
    • $\Pi$ challenger generates $(pk,sk)$ and sends $pk$ to $\mathcal{B}$
    • $\mathcal{B}$ generates a random value $r$, and starts $\mathcal{A}$ by sending it $pk \vert\vert r$
    • $\mathcal{A}$ makes some random oracle queries $x_i$, gets back random $r_i$
    • $\mathcal{A}$ submits messages $m_0, m_1$ for encryption
    • $\mathcal{B}$ forwards these messages to $\Pi$ challenger, which randomly chooses $b \xleftarrow{} (0,1)$, encrypts and returns $\text{Enc}(pk, m_b)$; $\mathcal{B}$ forwards this challenge encryption to $\mathcal{A}$
    • $\mathcal{A}$ outputs $\hat{b}$, and $\mathcal{B}$ outputs the same

I'm not sure how to reason about the success probability of $\mathcal{B}$ here. If $\mathcal{A}$ never actually queries the random oracle on $sk$, then $\mathcal{B}$ faithfully simulates a $\Pi'$ challenger to $\mathcal{A}$. However, if $\mathcal{A}$ does query the random oracle on $sk$, then it is going to get a random value instead of $H(sk)$ which it knows from the public key. This is an observable difference between a real $\Pi'$ challenger and the simulation of a $\Pi'$ challenger by $\mathcal{B}$. So I think I can't just conclude that the success probability of $\mathcal{B}$ is the same as the success probability of $\mathcal{A}$.

Any thoughts on if this proof can be made to work? :)


Note that I have seen the related question: Publicly exposed hash of private key. I believe the current question is asking for something more general and rigorous then that question. In particular, that question is specific to RSA and neither answer contains a proof.

Score:3
ng flag

With the question's statement, there is no insurance that $\Pi'$ is secure.

We build a counterexample $\Pi$ from ECIES on curve secp256r1. In this ECIES encryption system, the private key is any integer $d\in[1,n)$ where $n$ is some fixed 256-bit prime, and the public key is obtained from $d$. We modify ECIES into $\Pi$ by making the private key of $\Pi$ any 256-bit bitstring $\text{Priv}$, and computing ECIES's $d$ as $d\gets(H(\text{Priv})\bmod(n-1))+1$. This is done

  • at key generation, to build $d$ and from that the public key of both ECIES and $\Pi$
  • at decryption, to build $d$ from $\Pi$'s private key $\text{Priv}$ and then apply ECIES's decryption procedure.

Our $\Pi$ is secure, but $\Pi'$ is not, because given $h=H(\text{Priv})$ we can compute $d\gets(h\bmod(n-1))+1$ and decipher.


In order to make $\Pi'$ secure, we need to make $H$ a random oracle or PRF that's independent of anything used in $\Pi$.

Informal argument that $\Pi'$ is secure when $H$ is modeled by a "fresh" random oracle that is not invoked in the definition/use of $\Pi'$: compared to attacking $\Pi$, an adversary attacking $\Pi'$ additionally gets the value of $H(\text{sk})$, which is a random number; and oracle access to $H$. The only useful thing to do with such extra information "obviously" is to verify an exact guess of $\text{sk}$ obtained by other means. And if such guess could be made, $\Pi$ would be insecure.

Closer to formal: we could turn a PPT algorithm $\mathcal A'$ that breaks IND-CPA for $\Pi'$ into a PPTA $\mathcal A$ that breaks IND-CPA for $\Pi$, or breaks (first) premiage resistance for $H$, as follows:

  • when $\mathcal A$ receives $\text{Pub}$ from the IND-CPA experiment and $h$ which first preimage is thought, start $\mathcal A'$ with $h$ instead of $H(\text{sk})$
  • when $\mathcal A'$ passes some message $m$ to query the random oracle $H$
    • pass this $m$ to the oracle; if the oracle outputs $h$, stop with output: "a preimage of $h$ is $m$".
    • otherwise, draw a random message $m_0$, encipher it under $\text{Pub}$ yielding $c_0$ (with $m_0$ and $c_0$ optionally cached for subsequent reuse), and try to decipher $c_0$ with $m$ as the private key; if decryption succeeds and yields $c_0$, presumably $m$ is $\text{sk}$, or some $\text{sk}'$ as good as $\text{sk}$ when it comes to decryption (if the nature of the cryptosystem leaves any doubt about this, we can and should confirm it with a number of other random message $m_i$). From then on, no longer let $\mathcal A'$ run; rather, use knowledge of that $m$ usable as $\text{sk}$ to win the IND-CPA experiment with certainty.
    • otherwise (that is, $m$ passed by $\mathcal A'$ to the random oracle is neither a preimage of $h$ nor usable to decipher), let $\mathcal A'$ continue it's run giving it whatever the oracle returned.

The outcome "finding a preimage of $h$" has vanishingly low probability. Excluding that, the probability of success of $\mathcal A$ is at least that of $\mathcal A'$.

More precisely: Probability of finding a preimage of a random $b$-bit string by a fresh¹ random oracle with $b$-bit output is $\le q/2^b$ where $q$ is the number of queries to the oracle (regardless of other work). Thus any non-negligible probability of success of $\mathcal A$ would not come from stopping with a preimage, thus must come from winning the IND-CPA game (per one of the last two bullets). Whenever $\mathcal A'$ would wins the IND-CPA game, $\mathcal A$ also does. Thus if $\mathcal A'$ wins the IND-CPA game with non-negligible probability, so does $\mathcal A$.


¹ I think this is where the independence is necessary: if $H$ is part of the definition of $\Pi$, then the oracle may have been part of the key generation process and is not fresh.

puppet puppet avatar
md
To clarify, the private key for the modified $\Pi$ is actually still $\text{Priv}$, just that we modify the decryption procedure to take $\text{Priv}$, compute $H(\text{Priv})$ and decrypt with that? Otherwise, if the modified $\Pi$'s private key is $H(\text{Priv})$, exposing the "hash of the private key" would actually be $H(H(\text{Priv}))$. As for $\Pi'$ is secure if $H$ is an RO/PRF independent of anything in $\Pi$, can we prove this formally (ignoring the independence property since that is hard to formalize)? The best reduction in this question so far does not show this AFAICT.
fgrieu avatar
ng flag
@puppetpuppet: yes, the private key for the modified $Π$ is $\text{Priv}$. I tried to clarify that. We can't make a valid proof without some degree of formalization of the independence. I made an attempt at that. The formalism is far from perfect.
puppet puppet avatar
md
Can you explain the final probability? Let $\text{H},\text{Q}$ be the events that $\mathcal{A}'$ halts on a preimage and that $m_i$ decrypts $c_0$ successfully. Assuming pair-wise independence of $\text{H},\text{Q}$ and $\text{H},\Pr[\mathcal{A}'\text{ wins}]$ $\Pr[\mathcal{A}\text{ wins}] = \Pr[\neg\text{H}]\Pr[\text{Q}\vert\neg\text{H}] + \Pr[\neg\text{H}\cap\neg\text{Q}]\Pr[\mathcal{A}'\text{ wins}\vert\neg\text{H}\cap\neg\text{Q}]=\Pr[\neg\text{H}]\Pr[\text{Q}]+\Pr[\neg\text{H}]\Pr[\neg\text{Q}]\Pr[\mathcal{A}'\text{ wins}\vert\neg\text{Q}]$.
puppet puppet avatar
md
^In other words, why is $\text{Adv}(\mathcal{A}) = \vert\Pr[\mathcal{A}\text{ wins}] - \frac{1}{2}\vert$ non-negligible, provided $\text{Adv}(\mathcal{A}')$ is non-negl? If $q_H$ is the number of hash queries, then $\Pr[\neg\text{H}] = (1-\frac{1}{\vert X\vert})^{q_H}$. I can't figure out how to express/bound $\Pr[\mathcal{A}\text{ wins}]$ s.t. $\text{Adv}(\mathcal{A})$ is non-negl.
fgrieu avatar
ng flag
@puppetpuppet: Is it clearer? In the circumstance, I don't see that expressing things in term of advantage brings much; and I admit I'm not comfortable enough with that to make this effort.
puppet puppet avatar
md
The game and proof is clear; I am not sure that the final implication is sufficient. For asymptotic security, we must show that if $\Pr[\mathcal{A}'\text{ wins}]$ is non-negl better than chance, then $\Pr[\mathcal{A}\text{ wins}]$ is non-negl better than chance. Following the proof, even if we assumed complete independence between the events of finding a preimage, finding a key, and $\mathcal{A}'$ winning, the best bound I can get is $\Pr[\mathcal{A}\text{ wins}] \geq \Pr[\text{no preimage found}]\Pr[\mathcal{A}'\text{ wins}] = (1-\frac{1}{2^b})^q \Pr[\mathcal{A}'\text{ wins}]$.
puppet puppet avatar
md
Even ignoring asymptotic analysis.. $\Pr[\mathcal{A}\text{ wins}]\geq(1-\frac{1}{2^b})^q\Pr[\mathcal{A}'\text{ wins}]$ means that if an adversary can beat $\Pi'$ w.p. $p'$, we can construct an adversary that beats $\Pi$ w.p. at least $(1-\frac{1}{2^b})^qp'$. In other words, if no adversary can beat $\Pi$ w.p. $p$, no adversary can beat $\Pi'$ w.p. better than $(\frac{2^b}{2^b-1})^qp$, contingent upon all caveats about independence, $q$ oracle queries, etc. Unfortunately, this means that some adversary could still beat $\Pi'$ w.p. $p$ because $p < (\frac{2^b}{2^b-1})^qp$.
puppet puppet avatar
md
Upon further reflection, maybe that’s sufficient. We haven’t shown asymptotic security, but the concrete security loss due to exposing the hash of the private key (under RO and independence assumptions), is quantified and clearly small for reasonable $q$, say $q \ll b$; that is the adversary can do at most a factor of $(\frac{2^b}{2^b-1})^q$ better to break a PKE scheme when the hash of that scheme’s private key is exposed. Thanks, accepting this answer for now.
fgrieu avatar
ng flag
@puppetpuppet: I see a gap in my own proof sketch (starting at closer to formal). Problem is, it's conceivable that giving a random number rather than the real $H(sk)$ to $\mathcal A'$ changes it's behavior to the worse. To plug that hole, we need to add that this can't occur because with $H$ not part of $Π$, $H(sk)$ is indistinguishable from a random number other than by exhibiting a preimage. Formalizing that is giving me a headache.
Score:2
cn flag

The generic statement is false. You would need an additional assumption that $\Pi$ and $H$ are “sufficiently” independent.

Let's build a “silly” counterexample. From a public key encryption scheme $\Pi$, define a public encryption scheme $\Pi_H$ which is $\Pi$ except that what $\Pi_H$ calls its private key $k$ is hashed with $H$ before doing anything else. (If the output of $H$ is not long enough, use a PRF seeded with $H(k)$.) Revealing $H(k)$ (where $k$ is the private key of $\Pi_H$) is equivalent to revealing the private key of $\Pi$. So revealing a hash of the private key is definitely not secure for $\Pi_H$.

This may look silly at first glance, but it's actually pretty close to how EdDSA is defined.


If you need a “key hash” value, base it on the public key, not on the private key.

$H(S || k)$ where $S$ is some string that “doesn't appear” in the construction of $\Pi$ (so $H(S||\cdot)$ is a hash function that's “independent” from $\Pi$) is probably safe in practice, but I don't know how to formalize “doesn't appear”.

puppet puppet avatar
md
1. I agree that for a proof in the RO, independence is necessary so that $\Pi$ (the original scheme) does not depend specifically on $H$, per Bellare and Rogaway's [initial paper](https://cseweb.ucsd.edu/~mihir/papers/ro.pdf) on ROM. 2. That said, can we prove security in the ROM? Or build a "counterexample" in the ROM which is sufficiently independent? 3. I am not sure the counterexample is a counterexample. Wouldn't the "hash of the private key" for $\Pi_H$ be $H(H(sk))$, and crucially not $H(sk)$? 4. Agreed it would be easier to just not expose the hash of the private key.
puppet puppet avatar
md
On further reflection, point 3 can easily be addressed by saying the private key is $sk$, and the decryption procedure takes $sk$ and computes $H(sk)$ to do the remainder of the decryption. So it is a counterexample indeed (for insufficient independence).
Score:1
br flag

Your solution is almost there. Right now, the advantage of $\mathcal{B}$ is at least $$ \mathsf{Adv}^\Pi(\mathcal{B}) = \epsilon - \Pr[\mathsf{E}] $$ where $\epsilon = \mathsf{Adv}^{\Pi'}(\mathcal{A})$ and $\mathsf{E}$ is the event that $\mathcal{A}$ queries $H(sk)$. Of course, this lower bound isn't good enough, because $\mathsf{E}$ might somehow be very likely. Fortunately, $\mathcal{B}$ can actually benefit from $\mathsf{E}$ happening, and turn it into a win.

It does this with some polynomial loss. Let $\mathcal{B}$ log all of $\mathcal{A}$'s oracle queries. For each query $q$, let $\mathcal{B}$ attempt to decrypt the ciphertexts $c_0,c_1$ using $q$ as the secret key. If it gets back $m_b,m_{1-b}$, then $q$ might be the secret key. So $\mathcal{B}$ puts $b$ in its set of potential answers $S$. Finally, $\mathcal{A}$ returns its own guess $b'$, which $\mathcal{B}$ will also put in $S$. In the final step, $\mathcal{B}$ returns a uniformly selected element from $S$.

With this modification, if $\mathsf{E}$ happens, then $\mathcal{B}$ will win with probability at least $1/|S|$ (that is, the likelihood that $\mathcal{B}$ selects a $b$, times the likelihood that that $b$ came from the $\mathsf{E}$ event). Similarly, the likelihood that $\mathcal{B}$ chooses $b'$ and it's correct is at least $\epsilon/|S|$, independent of $\mathsf{E}$.

Thus, with the modification, by the law of total probability, \begin{align*} \Pr[\mathcal{B} \text{ wins}] &= \Pr[\mathsf{E}] \cdot \Pr[\mathcal{B} \text{ wins} \mid \mathsf{E}] + \Pr[\lnot\mathsf{E}] \cdot \Pr[\mathcal{B} \text{ wins} \mid \lnot\mathsf{E}] \\&\geq \Pr[\mathsf{E}] \cdot (1/|S|) + \Pr[\lnot\mathsf{E}] \cdot (\epsilon / |S|). \end{align*} So if $\Pr[\mathsf{E}]$ is noticeable, then $\mathcal{B}$ wins with noticeable probability by picking $sk$ (since $|S|$ is poly-sized). And if $\Pr[\lnot\mathsf{E}]$ is noticeable, then $\mathcal{B}$ wins with noticeable probability by outputting $b'$.


Update: This analysis is incomplete. It's not clear why the advantage of $\mathcal{B}$ is noticeable. Here's a failing example.

Consider the following $\mathcal{A}$. Say $\mathcal{A}$ makes two queries $q_0,q_1$ to $H$. In the case $b=1$ $\mathcal{A}$ will make $q_0$ decrypt the ciphertexts to $m_0,m_1$ and $q_1$ decrypt the ciphertexts to $m_1,m_0$. In the case $b=0$, then $\mathcal{A}$ will make both $q_0$ and $q_1$ decrypt the ciphertexts to $m_1,m_0$. Then \begin{align*} \Pr[\mathcal{B}(1) = 1] &= \frac{1 + 0 + \Pr[\mathcal{A}(1) = 1]}{3} \\\\ \Pr[\mathcal{B}(0) = 1] &= \frac{1 + 1 + \Pr[\mathcal{A}(0) = 1]}{3} \\\\ \mathsf{Adv}(\mathcal{B}) &= \left|\frac{\Pr[\mathcal{A}(1) = 1] - \Pr[\mathcal{A}(0) = 1] - 1}{3}\right| \\\\ &\geq |\mathsf{Adv}(\mathcal{A}) - 1/3| \end{align*} (the last ineq is reverse triangular inequality). This is not a lower bound we want, since it vanishes if $\mathcal{A}$'s advantage is $1/3$. The issue might have to do with the supposition that $\mathcal{A}$ can construct that many fake decryption keys, especially ones that decrypt to precisely the opposite thing. It's probably the case that the phrase "the private key" in the question is doing some heavy lifting.

puppet puppet avatar
md
Is $\epsilon = \text{Adv}^{\Pi'}(\mathcal{A})$ equal to $\Pr[\mathcal{A} \text{ wins}]$? I've seen definitions which account for guessing better than chance (e.g. $\text{Adv}(\mathcal{X}) = \vert \Pr[\mathcal{X} \text{ wins}] - \frac{1}{2}\vert$), but for the proof we need $\epsilon$ to be the raw success probability of $\mathcal{A}$. Namely, for $\Pr[\mathcal{B} \text{ wins} \vert \neg E] \geq \Pr[\mathcal{B} \text{ chooses } \mathcal{A} \text{'s answer}] \times \Pr[\mathcal{A} \text{ wins}] = (1 / \vert S \vert) \times \epsilon$, we assume that $\Pr[\mathcal{A} \text{ wins}] = \epsilon$.
rozbb avatar
br flag
Yeah, I think I was inaccurate here. I assumed $\Pr[\mathcal{A} \text{ wins}]\geq \epsilon$. Which I think is true but you could probably do this analysis more tightly.
puppet puppet avatar
md
Treating $\epsilon = \text{Adv}^{\Pi'}(\mathcal{A}) = \Pr[\mathcal{A} \text{ wins}] - \frac{1}{2}$ (drop absolute value w.l.o.g.), the final lower bound becomes $\Pr[\mathcal{B} \text{ wins}] \geq \Pr[E] (1/\vert S\vert) + \Pr[\neg E] (\epsilon + \frac{1}{2}) / \vert S\vert$. To finish off the proof, we'd need to show $\epsilon \geq \text{negl}(n) \land \vert S\vert = \text{poly}(n) \implies \Pr[\mathcal{B} \text{ wins}] - \frac{1}{2} \geq \text{negl}(n)$. I'm not sure how to get there; alternatively, as you said, maybe a tighter bound is necessary to show this.
rozbb avatar
br flag
Great point. Updated the post to show a counterexample to my answer. Not sure how to fix it yet. You can consider this question unanswered for now. Sorry for the false answer.
Gilles 'SO- stop being evil' avatar
cn flag
”It's probably the case that the phrase "the private key" in the question is doing some heavy lifting” — exactly. If the “real” private key happens to be the hash of the “formal” private key, revealing the hash of the formal private key is trivially insecure.
I sit in a Tesla and translated this thread with Ai:

mangohost

Post an answer

Most people don’t grasp that asking a lot of questions unlocks learning and improves interpersonal bonding. In Alison’s studies, for example, though people could accurately recall how many questions had been asked in their conversations, they didn’t intuit the link between questions and liking. Across four studies, in which participants were engaged in conversations themselves or read transcripts of others’ conversations, people tended not to realize that question asking would influence—or had influenced—the level of amity between the conversationalists.