Score:3

Would this be considered a secure password hash?

ng flag

I think I've understood properly, but I want to make sure as this will involve money.

  1. Password requirement is a minimum of 16 characters and must contain one of each [Upper, Lower, Digit, Other]
  2. My salt is Crypto RNG generated 64 bytes
  3. I convert salt to Base64 (without the trailing ==)
  4. I then get the UTF8 bytes of concatenating (SaltString + Password)
  5. I then SHA512 those bytes in a loop 100k times
private static byte[] SaltPassphrase(byte[] salt, string password)
{
    var sha512 = SHA512.Create();
    // Salt as Base64 string (without the == at the end) followed by password
    string input = $"{Convert.ToBase64String(salt)[0..^2]}{password}";
    byte[] bytes = sha512.ComputeHash(Encoding.UTF8.GetBytes(input));
    for (int i = 0; i < 100_000; i++)
        bytes = sha512.ComputeHash(bytes);
    return bytes;
}

EDIT Thanks everyone for your detailed explanations and useful comments!

jp flag
You're putting way too much emphasis on the salt -- all it really needs is to be unique, so an attacker can't run parallel attacks against multiple accounts that have the same salt. Beyond that is overkill.
hm flag
Agreed that the salts are larger than needed - but salts are for not just resistance to parallel computation, but also resistance to *pre*-computation. So they need to be not just unique within the scope of the target hashlist / platform, but also practically unique *globally* - enough to make it infeasible to precompute and then store them. For example, bcrypt and scrypt use a 128-bit (16 byte) salt - still quite a bit smaller than 64 bytes, but enough to make building a list of all values largely infeasible.
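Generating such a 128-bit salt is a one-liner with any CSPRNG; for instance, in Python (illustrative, not the asker's C#):

```python
import secrets

# 16 bytes (128 bits) of CSPRNG output, the salt size bcrypt and scrypt use.
salt = secrets.token_bytes(16)
```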
marcelm avatar
cn flag
**[Do not create your own password hashing function](https://security.stackexchange.com/a/31846/99775), full stop.**
Persistence avatar
br flag
@marcelm - If everyone followed that advice we'd have no password hashing functions... Maybe couch that with "unless you're an expert in cryptography"
marcelm avatar
cn flag
@Persistence No, because people tend to think they're an expert in something they're not. This comment is made in the context of countless SE questions about home-brew crypto, and the answer is always: _don't_. Actual cryptographers who do this professionally won't be stopped by a comment on crypto.SE anyway.
Persistence avatar
br flag
You have a point... The Dunning-Kruger effect is strong
Seth R avatar
in flag
@Persistence experts in cryptography are not looking for advice on Stack Exchange. If you are reading this, you should not be creating hash functions.
Persistence avatar
br flag
@SethR - That's not necessarily true... I'm both an expert software and electronics engineer, to a postgraduate level. I still use Stack Overflow and Electronics SE because other experts use it too and nobody, not even experts, can know everything.
Peter Morris avatar
ng flag
I didn't create a crypto hashing function, I used existing ones.
marcelm avatar
cn flag
@PeterMorris No, you created your own password hashing function. The fact that you used SHA512 as a primitive in your function is irrelevant. It's pretty easy to create an insecure algorithm using secure building blocks. [AES-ECB](https://crypto.stackexchange.com/a/20946/32547) would be a well-known example of that. Cryptography is very hard to get right, that's why people who know what they're doing only use well-vetted algorithms that have stood up to public scrutiny. Rolling your own crypto opens you up to much larger risks than using well-known algorithms.
Peter Morris avatar
ng flag
@marcelm Ah I see, thanks for that information, I appreciate it!
in flag
"and must contain one of each [Upper, Lower, Digit, Other]" -> I normally let my password manager generate a random password for each website but for websites that force me to use certain special characters I have a single password that basically just consists of special characters. Those are usually the websites that don't matter.
marcelm avatar
cn flag
@PeterMorris No problem! I also spotted an interesting defect in your algorithm. All this is becoming way too much for the comments so I wrote an answer.
Wernfried Domscheit avatar
za flag
Why do you trim the `==` from your salt?
Wernfried Domscheit avatar
za flag
Apart from all the theory given in the various answers: at least some people think that developing a hash algorithm is easy. But look at how many, or rather how few, [password hash algorithms](https://en.wikipedia.org/wiki/Cryptographic_hash_function#Cryptographic_hash_algorithms) are actually available; it seems to be a tough job that only a few experts around the world are able to do. It seems to be much easier to develop a [programming language](https://en.wikipedia.org/wiki/List_of_programming_languages).
Score:32
in flag

Except for the iteration count (we can argue that it is too small ‡, and that there is some entropy loss *), the answer is no!

  • There is no memory hardness, which is what mainly prevents massive ASIC/GPU searches.
  • There is no multi-threading, which is what mainly cripples massive CPU searches.

There are dedicated password hashing algorithms, as opposed to general cryptographic hashes like the SHA-x series. Important ones are PBKDF2 (2000), scrypt (2009), Argon2 (2015), and Balloon Hashing (2016). Argon2 was the winner of the Password Hashing Competition held in 2015, and Balloon Hashing can now compete with Argon2. Use Balloon Hashing or Argon2 for secure password hashing.
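As a concrete illustration, scrypt (one of the memory-hard options) is available in Python's standard library through `hashlib` (it needs OpenSSL 1.1+ underneath). This is a sketch with illustrative parameters: memory use is roughly 128 · r · n bytes, so 16 MiB here.

```python
import hashlib
import secrets

# Memory-hard key derivation with scrypt from the standard library.
# n=2**14, r=8, p=1 costs about 128 * 8 * 2**14 bytes = 16 MiB of memory.
salt = secrets.token_bytes(16)
key = hashlib.scrypt(b"correct horse battery staple", salt=salt,
                     n=2**14, r=8, p=1,
                     maxmem=64 * 1024 * 1024,  # allow OpenSSL enough memory
                     dklen=32)
```

The parameter names follow the `hashlib.scrypt` API; tune `n` (cost), `r` (block size), and `p` (parallelism) to your own hardware budget.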

There are already libraries for each of these that you only need to parametrize, like for Argon2, instead of writing the details yourself.

A good Argon2 library

Here are the parameters:

Usage:  ./argon2 [-h] salt [-i|-d|-id] [-t iterations] [-m memory] [-p parallelism] [-l hash length] [-e|-r] [-v (10|13)]
        Password is read from stdin
Parameters:
        salt            The salt to use, at least 8 characters
        -i              Use Argon2i (this is the default)
        -d              Use Argon2d instead of Argon2i
        -id             Use Argon2id instead of Argon2i
        -t N            Sets the number of iterations to N (default = 3)
        -m N            Sets the memory usage of 2^N KiB (default 12)
        -p N            Sets parallelism to N threads (default 1)
        -l N            Sets hash output length to N bytes (default 32)
        -e              Output only encoded hash
        -r              Output only the raw bytes of the hash
        -v (10|13)      Argon2 version (defaults to the most recent version, currently 13)

        -h              Print argon2 usage

One might ask what $i$, $d$, and $id$ are in Argon2. The answer is from draft-irtf-cfrg-argon2-03:

If you do not know the difference between them or you consider side-channel attacks as viable threats, choose Argon2id.

  • Argon2d is faster and uses data-dependent memory access. The data dependency immediately enables side-channel attacks. It is suitable for cryptocurrencies and applications with no threat of side-channel attacks.

  • Argon2i uses data-independent memory access and this is preferred for password hashing and password-based key derivations.

  • Argon2id works as Argon2i in the first half of the first iteration and as Argon2d for the rest. This provides both side-channel protection and time-memory trade-off resistance.

Balloon Hashing

It is now in NIST Special Publication 800-63B, and it needs some time for adoption. There are some libraries ready to use.

The main properties are:

  • The memory-hardness has been proven
  • It is designed on the standard primitives: it can use any of the non-memory hard cryptographic hash functions like SHA2, SHA3, and BLAKE2.
  • It has built-in resistance against side-channel attacks: the memory access pattern is required to be independent of the bits of the data to be hashed.

Little about passwords

For users' passwords, you should educate them to use diceware-based password generation. If they are not using it, then restrictions can be helpful. Note that diceware-based passwords don't need upper, lower, digit, etc. restrictions, so be careful about forcing such rules on users.
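The diceware idea can be sketched in a few lines of Python. The tiny word list here is hypothetical, purely for illustration; a real diceware list (e.g. the EFF long list) has 7776 words, giving about 12.9 bits of entropy per word.

```python
import secrets

# Hypothetical 8-word list for illustration only; use a real 7776-word
# diceware list in practice.
WORDS = ["correct", "horse", "battery", "staple",
         "orbit", "velvet", "pickle", "quartz"]

def diceware_passphrase(n_words: int = 6) -> str:
    # secrets.choice uses a CSPRNG, unlike random.choice.
    return " ".join(secrets.choice(WORDS) for _ in range(n_words))
```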

Little about password managers

Besides, almost everybody has a smartphone. One can use a password manager to generate and store almost-random passwords. Let users know about this, too. Some of them are free, like LastPass, 1Password, and KeePass.


Answer to comment:

I can't really use an algorithm that uses lots of memory (if I understood Memory Hardness correctly) because it's a server. Is the approach correct, is SHA512 suitable, and are the iterations suitable considering password length?

SHA-512 needs very little memory per password: only about 1024 bits of state to produce a hash. Argon2's default memory parameter, in contrast, is $2^{12}$ KiB = 4 MiB. If we consider a system that can run an arbitrary number of threads, then requiring 4 MiB instead of 128 bytes per instance cuts the parallel brute-force power against an SHA-512 based algorithm by 32768 times. So even a little memory usage strengthens the password hashing by a factor of $2^{15}$. 4 MiB is totally fine for servers, and one can adjust the memory according to login requests per second, or simply buy more memory. Remember, password hashing algorithms leave the memory choice to the system. The bottleneck to consider is the number of login requests per second.
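To tune the cost parameters against your login rate, you can simply measure how long one derivation takes on your own server. A sketch using standard-library PBKDF2 as the stand-in KDF (the same approach works for any of the algorithms above):

```python
import hashlib
import os
import time

def kdf_seconds(rounds: int) -> float:
    """Time one PBKDF2-HMAC-SHA512 derivation with the given round count."""
    salt = os.urandom(16)
    start = time.perf_counter()
    hashlib.pbkdf2_hmac("sha512", b"benchmark-password", salt, rounds)
    return time.perf_counter() - start
```

Raise the round count until one derivation fits your latency budget (commonly well under a second per login).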


‡) OpenSSL speed result to show that SHA-512 100K is small.

Doing sha512 for 3s on 64 size blocks: 10896142 sha512's in 3.00s
Doing sha512 for 3s on 256 size blocks: 4673961 sha512's in 2.99s

*) How much entropy do we lose per iteration of a cryptographic hash function?

Squeamish Ossifrage wrote a great answer about this. The core of the result is:

Rather, there's a good chance that for any fixed function F, there are many cycles on which F is a permutation, and restricted to which F thus preserves entropy

Peter Morris avatar
ng flag
I can't really use an algorithm that uses lots of memory (if I understood Memory Hardness correctly) because it's a server. Is the approach correct, is SHA512 suitable, and are the iterations suitable considering password length?
kelalaka avatar
in flag
@PeterMorris Password hashing functions have adjustable parameters. You need to set your security target, then adjust the parameters. If you cannot use high memory then you need to increase the iteration count. You can find performance figures on the hashcat forums, especially for GPUs.
stefreak avatar
jp flag
@PeterMorris password hashing algorithms are used on servers almost 100% of the time. A quick google reveals that Argon2 uses 1MiB of memory by default. This is totally fine and usually not a problem. Please don't roll your own password hashing, this stuff is hard to get right. Especially if stakes are high.
Cort Ammon avatar
gb flag
To put stefreak's comment in perspective, at current market prices you are talking about something on the order of four-tenths of a penny worth of DDR4 memory. If that is unacceptable, then I think it's worth thinking long and hard about whether you have the infrastructure needed for a system that involves money.
il flag
@kelalaka is there a reason why you suggest `argon2` over something like https://en.wikipedia.org/wiki/Bcrypt?
kelalaka avatar
in flag
@N3buchadnezzar most of the time bcrypt is fine, as stated in [Wikipedia](https://en.wikipedia.org/wiki/Bcrypt#Comparison_to_other_password_hashing_algorithms). However, Argon2 has a parallelization factor, side-channel resistance, and independent parametrization (see the library's parameters and compare them to bcrypt's), and Argon2 is also a KDF, which bcrypt is not.
Peter Morris avatar
ng flag
@kelalaka Thanks for your detailed reply. I've updated my question with regards to argon2. Would you mind letting me know what you think, please? Thanks!
kelalaka avatar
in flag
@PeterMorris once answered, do not change the question, especially after all of this. You can ask a new question instead. I've rolled it back.
kelalaka avatar
in flag
@PeterMorris And, your edit may well be fitted for [security.se], too. Look for answers there, too.
Peter Morris avatar
ng flag
@kelalaka Okay, I thought it would be okay to append. What kind of parameters am I going to need for memory size + iterations? I don't want my server crippled, but I don't want it too easy to break either.
kelalaka avatar
in flag
@PeterMorris default parameters? You can measure the time on your server. In general, we require that all processes must end in 1sec since we don't want to have a lag for the users. You can measure your server and ask about this on [security.se] see [this](https://security.stackexchange.com/questions/210982/what-are-the-minimum-parameters-for-argon2) or [search](https://security.stackexchange.com/search?q=argon2+parameter)
polfosol avatar
in flag
A nitpick: Lastpass is **NOT** free. It pretends to be so
Score:10
in flag

Beyond @kelalaka's excellent answer, it is worth pointing out key differences between your implementation and more standard iterative hashing approaches like PBKDF2. Though PBKDF2 isn't memory-hard like some newer password hashing algorithms, I do consider it (or bcrypt) sufficient for most purposes. It is perhaps the most similar to your proposed solution, so it will be easier to understand. https://en.wikipedia.org/wiki/PBKDF2

A key difference is that it repeatedly mixes the password back in with the previous result. This prevents convergence of the hash. The iteration count is presumably small compared to the output space of the hash function, so this may not be significant; but if the effective space ends up smaller than we think, it could be. When you hash values repeatedly, each iteration reduces the potential output space by a tiny amount due to collisions, so iterating over and over leaves you with fewer and fewer distinct values.
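The standard-library version of this is a one-liner; note that PBKDF2 feeds the password into HMAC on every iteration, which is exactly the re-mixing described above (salt and iteration count here are illustrative):

```python
import hashlib
import secrets

# PBKDF2-HMAC-SHA512: the password re-enters the computation each round,
# so the chain cannot collapse the way plain hash iteration can.
salt = secrets.token_bytes(16)
dk = hashlib.pbkdf2_hmac("sha512", b"hunter2", salt, 600_000, dklen=64)
```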

Also password complexity requirements are considered harmful. Consider a modern password policy: https://auth0.com/blog/dont-pass-on-the-new-nist-password-guidelines/

Maarten Bodewes avatar
in flag
What do you mean with convergence of a hash? A cycle (it seems so)? And if so, how does inclusion of a constant (within the iteration, of course) save you from that?
ar flag
@MaartenBodewes: I assume what Meir means by convergence is the fact that, because cryptographic hash functions (restricted to their output range) aren't bijections, iterating a hash will gradually decrease the number of possible outputs, and thus increase the risk of such iterated hashes colliding: if $H^{(m)}(a)=H^{(m)}(b)$ for some $m$, then $H^{(n)}(a)=H^{(n)}(b)$ for all $n≥m$. Including the original input in the hash function (i.e. replacing $H$ with something like $H_a(x) =H(x\ ||\ a)$ prevents this: $H_a^{(m)}(a)=H_b^{(m)}(b)$ does _not_ imply $H_a^{(n)}(a)=H_b^{(n)}(b)$ for $n>m$.
ar flag
… That said, as long as the hash function $H$ has a decent output length, collisions aren't a really major issue even for hash functions iterated without such tweaks. I think I have an old answer somewhere here where I do the math, but basically if the size of the _actual_ input space of the iterated hash (including both actual passwords and anything tried by brute force attackers) is significantly below the birthday bound, collisions are mostly irrelevant. For modern hashes with 256-bit or longer outputs, that's almost certainly the case.
Maarten Bodewes avatar
in flag
Ah, alright, that makes sense, although I think that expecting a 256 bit collision to happen is still a rather far fetched notion. Now I understand why salt or password are generally repeated in the input though. .... uh yeah, what you say :P
Meir Maor avatar
in flag
If we had a true PRF the issue indeed seems minor. But we worry that we don't actually have a true PRF, and naive iterative hashing may amplify minor imperfections of our PRF. I also see no reason to leave trivially fixable minor issues in.
ilkkachu avatar
ws flag
A somewhat related matter is that since their thingy is already quite close to PBKDF2, they might as well _use PBKDF2_, so they'd at least get something known.
Meir Maor avatar
in flag
Definitely; I said it's similar to PBKDF2, and obviously people should use standard stuff. PBKDF2 isn't state of the art, but it's still sufficient when tuned properly, IMHO. Though personally it's my least favorite of the reasonable options: I used to prefer bcrypt, which has the advantage of being around forever unblemished, and now we have the Argon2 and scrypt families. So though I deem PBKDF2 sufficient, it's IMHO the worst reasonable option.
kelalaka avatar
in flag
@IlmariKaronen there is one [here](https://crypto.stackexchange.com/q/15068/18298).
Score:9
cn flag

It's a bad idea to design your own password hash.

Making cryptographic functions is still very much a process of trial and error. Sure, over time we have stumbled on some things that are a good idea or even mandatory. And we certainly have learned a lot of things that don't work. But we don't know how to make things that are proven secure all the way. A very subtle small mistake can cause a cryptographic construct to fail catastrophically. And there are a lot of opportunities for such mistakes. I'll name some later.

The best we can do right now is that people with experience in this make a crypto thing, and then everyone tries and break that crypto thing. If everyone tries enough and doesn't succeed at that for a long time then we have something that's probably good enough. We still can't prove the absence of problems, but if thousands of cryptanalysists couldn't, then hopefully the next thousand also can't.

But even this fails from time to time. Sometimes after years, decades even, a critical problem is found in something that was scrutinized by the smartest people for ages. That is why MD5 is no longer considered OK, and why RC4 is considered problematic. Or the landscape changes, leading e.g. to the need for salts in password hashes, and intentionally slow hash functions.

By using the best known current cryptographic algorithms, we can leverage this collective experience, and probably avoid repeating past mistakes. New mistakes may be discovered, but that can't be helped. Someone who builds their own crypto functions though, is likely to repeat some of those mistakes, or create entirely new ones.

I know, I know, it's fun to design that stuff yourself. And it feels good to build something that feels secure. Stack Exchange is littered with questions about custom encryption and (particularly) password hashes. Problem is, it's easy to build something that you yourself cannot break. But that doesn't mean others can't. This adage is ancient, and comes up often enough that it's now known as Schneier's law.

What can go wrong?

So, there can be problems with the very core of a crypto algorithm. This is what broke MD5 and RC4.

Or the algorithm can be completely fine, but an implementation can be broken. This can be a simple bug, or it can be something more subtle like a side-channel attack (which are surprisingly tricky to avoid sometimes).

But it's also possible (easy, even) for secure cryptographic primitives with bug-free implementations to be used incorrectly, rendering the system insecure. For example, correctly using encryption and digital signatures, but making naive assumptions during a communication handshake (SSL had its share of issues with this). Another well-known example is AES-ECB, which uses the perfectly fine AES in a way that can leak significant information.

In your hash function?

Your own password hashing function also falls prey to something like this here:

repeat 100_000:
    hash = sha512(hash)

SHA-512 is perfectly fine (as far as we know). Using it like this is frowned upon.

SHA-512 generates an "indistinguishable from random" 512-bit output for every input. It may generate the same output for two distinct inputs; this is called a collision. The problem is that N-bit hash functions on N-bit inputs are generally not bijective. That means that when feeding a SHA-512 output to itself, some of the possible 512-bit values will collide with others, producing the same output. By necessity, this also means that some other hash values cannot be generated at all.

Let's look at how this works for SHA-512, truncated to 4 bits. Specifically, we take the first byte from the output, and zero out the high 4 bits in this byte. The resulting single byte is used as input to the next round. Different ways of selecting the bits give different results, but the principle is the same.

Trying all 16 possible inputs for several rounds gives us these chains:

in      1.      2.      3.      4.      5.      6.      7.
00  ->  08  ->  06  ->  08  ->  06  ->  08  ->  06  ->  08
06  ->  08  /
07  ->  06  ->  08  ->  06  ->  08  ->  06  ->  08  ->  06
08  ->  06  /   
03  ->  04  ->  05  \
13  ->  15  ->  05  ->  09  ->  02  ->  10  ->  14  ->  14
04  ->  05  ->  09  ->  02  ->  10  ->  14  ->  14  /
15  ->  05  ->  09  ->  02  ->  10  ->  14  /
05  ->  09  ->  02  ->  10  ->  14  ->  14  
01  ->  11  ->  02  ->  10  ->  14  /
09  ->  02  ->  10  ->  14  ->  14  
11  ->  02  ->  10  ->  14  /
02  ->  10  ->  14  ->  14  
12  ->  10  /       /
10  ->  14  ->  14   
14  ->  14  /

After just 6 rounds, the 16 possible outputs have been reduced to just 3. The same effect can be seen for larger truncations. I ran some simulations for SHA-512 truncated to the first 1 byte (2⁸), 2 bytes (2¹⁶), and 3 bytes (2²⁴):

                bytes:  1    2      3
          input space:  256  65536  16777216
converged to M values:  13   330    2765
       after N rounds:  31   518    7114

You can see that the convergence effect is present in all three scenarios. As for uncut SHA-512, my computer isn't fast enough to compute this, I couldn't find any information on the strength and certainty of the effect, and I'm not smart enough to construct a proof (if a proof is even possible). But odds are you'll see the effect in full SHA-512 too. My instinct suspects that it reduces the hash space by its square root (halving the bit length), but I could be very wrong [update: see links provided in the comments: one, two, three]. 100k rounds probably limits the damage, too.
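The convergence experiment above is easy to reproduce. Here is a sketch for the 1-byte case, using the first output byte as the next input (one of several possible bit selections, so exact counts may differ from the table above):

```python
import hashlib

def step(x: int) -> int:
    # One round: hash a single byte, keep only the first output byte.
    return hashlib.sha512(bytes([x])).digest()[0]

values = set(range(256))   # all possible 1-byte inputs
sizes = [len(values)]
for _ in range(1000):
    values = {step(x) for x in values}
    sizes.append(len(values))
# The reachable set shrinks monotonically until only the values
# lying on cycles of the map remain, then stays constant.
```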

Now, does this really harm your hash? Probably not. But it is a weakness, one that real-world password hash functions take care to avoid (e.g. by ensuring the password and salt are mixed back in regularly).
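For illustration only, here is the questioner's loop with that mitigation applied: the salt and password are hashed back in on every round, so iterates cannot collapse onto a small cycle set. This is a sketch of the idea, not a vetted construction; a real system should still use PBKDF2/scrypt/Argon2.

```python
import hashlib

def salt_passphrase(salt: bytes, password: str, rounds: int = 100_000) -> bytes:
    """Iterated SHA-512 with the salt and password mixed back in each
    round (sketch only; prefer a standard password hash in practice)."""
    pw = password.encode("utf-8")
    digest = hashlib.sha512(salt + pw).digest()
    for _ in range(rounds):
        # Re-injecting salt + password prevents convergence of the chain.
        digest = hashlib.sha512(digest + salt + pw).digest()
    return digest
```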

I think this is a great example of the subtle traps in crypto design. It's so easy to miss something like this. That's why I recommend using a battle-tested algorithm. They're better than anything you or I can come up with.

The other answers already make some recommendations for what hashes to use. Okay: PBKDF2, bcrypt. Better: Argon2, scrypt. I hear Balloon is a thing, but I don't know anything about it so I have no opinion on it.

kelalaka avatar
in flag
FYI. There is one calculation for the general case: [Entropy when iterating cryptographic hash functions](https://crypto.stackexchange.com/q/15068/18298). I remember another, but couldn't find it.
chux - Reinstate Monica avatar
cn flag
Nice answer that directly addresses OP's code.
kelalaka avatar
in flag
Ok, [I've found what I was looking for](https://crypto.stackexchange.com/a/51029/18298) and the answer should be surprising for you. We don't expect iterated cryptographic hash functions to lose too much entropy, thanks to the [expected large cycles](https://crypto.stackexchange.com/a/68442/18298) shown in Harris's work.
marcelm avatar
cn flag
@kelalaka Very interesting, thanks for the links! But if I understand [the third link](https://crypto.stackexchange.com/a/68442) correctly, it's saying that relatively _few_ elements are expected to be on cycles. Roughly √ℓ of them, in fact, so that iterating would ultimately reduce its effective space to the square root of elements, halving the effective bit length. Or am I misinterpreting something?
kelalaka avatar
in flag
The point you might miss is that 1M is too tiny to see this effect. The first link talks about 256 iterations, and the third link talks about what happens when the iteration count approaches the square root. Most of the elements are still on the tails...
marcelm avatar
cn flag
Fair enough. That type of math is always interesting, but it's hard to get a good grasp of the scales involved. I did note the limited effect due to the number of rounds in my original answer though, but that entire part was just a hunch anyway :)
Score:1
cn flag

I think the answer is: "Secure" is relative. The 512 bits of salt (8*64 bits) alone give roughly 1.3*10^154 possibilities. Transforming the salt (via Base64) does not change its strength.

Then you have 992 bits (62*16 bits) that make up the password (26*2 letters plus 10 digits), or roughly 4*10^28 possibilities. Assuming uniform probability (rather unlikely if chosen by a human) you'll have about 1504 input bits to produce 512 output bits, up to (as collisions are likely) 5*10^452 possibilities. That's a huge number.

The numbers assume that the salt cannot be "factored out" from the hash, so the salt extends the search range.

Repeating the hash will only slow down building the table, not looking up the result (the resulting table will be at most 2^512 entries large).

kelalaka avatar
in flag
We consider the salt to be public. For a brute-force attacker it only prevents rainbow tables; they will search only the password, not the password and the salt. An attacker who gets into the system will obtain the salt and the hash (they are usually stored in a single string). Therefore only the strength of the password and of the password hashing mechanism matter.
kelalaka avatar
in flag
Salt kills the rainbow tables. That's obvious. However, the security of the password hashing is not based on the secrecy of the salt.