Score:3

Is it possible to sign in to a website using two different passwords using an MD5 hash collision?

sb flag

I wanna do an experiment. I wanna see if it's possible to sign in to an outdated website that still uses MD5 to store passwords (there are surprisingly still a lot) with two different passwords.

For example "Password123" must have another string that produces the same hash value. I've found a few examples online of two strings that produce the same MD5 hash value, but all of them would exceed the password size limit of any website. They were also almost identical, with just 2 or 3 characters that are different in a string of like a hundred characters. And let's be honest, showing that to a friend to explain the principles of hashing isn't nearly as impressive as signing in with 2 completely different passwords. Anyone knows where or how to find two strings that are not too long and different enough that would produce the same hash value? Or do you think this experiment won't work?

This is one MD5 collision I found:

d131dd02c5e6eec4693d9a0698aff95c2fcab58712467eab4004583eb8fb7f89 55ad340609f4b30283e488832571415a085125e8f7cdc99fd91dbdf280373c5b d8823e3156348f5bae6dacd436c919c6dd53e2b487da03fd02396306d248cda0 e99f33420f577ee8ce54b67080a80d1ec69821bcb6a8839396f9652b6ff72a70

and

d131dd02c5e6eec4693d9a0698aff95c2fcab50712467eab4004583eb8fb7f89 55ad340609f4b30283e4888325f1415a085125e8f7cdc99fd91dbd7280373c5b d8823e3156348f5bae6dacd436c919c6dd53e23487da03fd02396306d248cda0 e99f33420f577ee8ce54b67080280d1ec69821bcb6a8839396f965ab6ff72a70

Both produce digest:

79054025255fb1a26e4bc422aef54eb4

But as you can see these are not suitable passwords.

DannyNiu avatar
vu flag
Would any website that still uses MD5 for password hashing protect themselves against password length DoS attack?
Score:5
ng flag

Let's assume that the password hash consists of an 128-bit MD5 hash over the password. Due to the birthday problem there is a >98%† chance that, among 11-characters passwords consisting of ASCII letters and digits (62 possible values, giving $2^{\approx65.5}$ passwords), two have the same hash. Thus what's asked most likely mathematically exists.

However, the fast techniques we know to obtain MD5 collisions are for 64-byte (or more) inputs. They are not applicable in the context.We thus need brute force, that is in the order of $2^{65}$ MD5 computations, when a GeForce RTX 3090 does in the order of $2^{36}$ MD5/s (source), thus we'd need running like 200 such GPUs for a month. High-end FPGAs would also do, I imagine with less power.

In order to avoid tremendous RAM requirements, algorithms to find collision of a hash function by brute force use some cycle finding technique. The simplest re-hashes the previous hash result (or a surjective remapping of the hash to the message space, e.g. encoding the hash to Base64 before hashing again if we want collision between ASCII messages), until finding a cycle. With near certainty there is a collision where the cycle closes, which can be identified e.g. by Floyd's technique. Variations allow to distribute the work with feasibly little RAM, and identify where the cycle closes with less work. The standard references is Paul C. van Oorschot and Michael J. Wiener, Parallel Collision Search with Cryptanalytic Applications, in Journal of Cryptology, 1999.

One issue with the technique that I summarized is that the message domain must be as large as the hash domain, requiring $2^{128}$ passwords, like 20 characters among 85. I'm not sure existing password systems accept that.

I asked for techniques allowing shorter passwords, and now I'm given references showing that's been studied and is possible to some degree. I'll update the present answer when I'll have that ironed out.

Unfortunately I have no sponsor for the GPUs or FPGAs.


† Computed as $1-e^{-62^{11}/2^{129}}$

mangohost

Post an answer

Most people don’t grasp that asking a lot of questions unlocks learning and improves interpersonal bonding. In Alison’s studies, for example, though people could accurately recall how many questions had been asked in their conversations, they didn’t intuit the link between questions and liking. Across four studies, in which participants were engaged in conversations themselves or read transcripts of others’ conversations, people tended not to realize that question asking would influence—or had influenced—the level of amity between the conversationalists.