Score:7

Do I need to sanitize user input to scrypt, or to PBKDFs in general?

tr flag

I'd like to allow the user to supply a password as input to some PBKDF, which I will use to construct a key for file encryption (currently using aes-256-ctr. It may change as I learn more).

I am considering using scrypt. Do I need to do any escaping, sanitization, or other checks on the user input I will pass to scrypt?

More generally, do PBKDFs require any safety checks on user-supplied input?

jp flag
In general you do *not* need to sanitize or safety check *anything* unless there is a specific reason... but *a lot* of those specific reasons do exist, so it's good to assume you do until shown otherwise.
Score:19
fr flag

No, you do not need to escape or sanitize data that you pass as the user input to these functions. They accept arbitrary byte sequences, so any byte sequence you pass is acceptable and carries no security risk in itself. In general, cryptographic algorithms operate on arbitrary byte sequences (possibly of specific sizes) and don't require standard escaping or sanitization for security, although they may require padding, range, or other kinds of checks, and systems that later consume the data may have escaping or sanitization requirements of their own.

However, if you are accepting passwords that contain non-ASCII characters, you probably want to do some sort of Unicode normalization on the string (probably NFC), since there are often multiple ways to express the same logical character. For example, you could express "é" as a single code point (U+00E9) or as two code points (U+0065 U+0301), and normalization will rewrite these to the same string. Again, there are no security issues with this, but because users will think of these two passwords as the same when they have different byte sequences, performing normalization allows your system to think of them as the same password as well.
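
As a minimal sketch of the normalization step in Node.js, assuming the built-in `String.prototype.normalize` and `crypto.scryptSync` with their default cost parameters: the helper name `deriveKey`, the 16-byte random salt, and the 32-byte key length are illustrative choices, not anything prescribed above.

```javascript
const crypto = require('node:crypto');

// Hypothetical helper: normalize the password to NFC, encode it as UTF-8,
// and derive a 32-byte key with scrypt (default N, r, p).
function deriveKey(password, salt) {
  const normalized = password.normalize('NFC');
  return crypto.scryptSync(Buffer.from(normalized, 'utf8'), salt, 32);
}

const salt = crypto.randomBytes(16); // store this alongside the ciphertext

const composed = '\u00e9';    // "é" as a single code point
const decomposed = 'e\u0301'; // "e" followed by a combining acute accent

// The raw strings differ, but after NFC both derive the same key.
console.log(composed === decomposed);
console.log(deriveKey(composed, salt).equals(deriveKey(decomposed, salt)));
```

Without the `normalize('NFC')` call, the two spellings would be encoded to different bytes and so would derive different keys.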

phoenixdown avatar
tr flag
oh very cool, i didn't think of the normalization, thanks! is there a simple way to do NFC Unicode normalization in javascript (nodejs) that you might prefer? i'll have a google about it as well, thanks
Aman Grewal avatar
gb flag
@phoenixdown JavaScript has a built-in way to normalize: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/normalize
jp flag
@phoenixdown: Note that there are other kinds of "normalization" that may or may not make sense. It's more a question of usability than security. For example, I believe Facebook downcases passwords to prevent problems due to stuck shift keys / inadvertent caps lock. Then there's the question of homoglyphs, i.e. different characters that look the same. That may even depend on the font, e.g. some fonts use different glyphs for the Greek letter mu and the micro sign, some use the same glyph. So, someone may think they typed a mu but actually typed a micro sign, or vice versa.
bk2204 avatar
fr flag
Case folding passwords significantly worsens security and you should not do it. Additionally, it is impossible to correctly fold the case of Unicode text in a locale-insensitive way. The IETF has other kinds of Unicode normalization that can be applied to passwords if NFC doesn't meet your needs that address some of these issues.
in flag
Be careful regarding normalization. On the one hand, the user may not be aware of which version of é they entered. On the other hand, they may be *very* aware and are assuming that the password handler will not make changes to their entry. At least document which normalization will be applied.
phoenixdown avatar
tr flag
@bk2204 - which RFC are you thinking of for Unicode normalization applied to passwords? Is it https://datatracker.ietf.org/doc/html/rfc8265 ?
bk2204 avatar
fr flag
Yes, that's the RFC.
Score:5
gh flag

This will depend on the specific implementation of the KDF that you're using. I'm not aware of any known issues with scrypt (although that doesn't mean there aren't any), but there have certainly been issues with the PHP implementation of bcrypt where the presence of null bytes in the input would cause problems.

ar flag
I'd like to blame that mostly on PHP, but a quick glance at Wikipedia [suggests](https://en.wikipedia.org/wiki/Bcrypt#Versioning_history) that the handling of null bytes in bcrypt was indeed poorly specified, and I wouldn't be surprised if the original C reference implementation also had the same bug. (In any case, bcrypt isn't really a KDF but a password hashing function, and a rather old and outdated one at that. You can't use it for key derivation — at least not easily — and given that there are better alternatives, you really shouldn't be using it for password hashing nowadays either.)
phoenixdown avatar
tr flag
Thanks @Gh0stFish, i notice in the blog post the author says `scrypt` and `PBKDF2` are not affected by the issue. What would you recommend? Checking the input for any null bytes and removing them before passing to `scrypt`? @bk2204 is this something you would also recommend?
Gh0stFish avatar
gh flag
@phoenixdown I don't think that you necessarily need to strip out null bytes, you just need to check that the specific *implementation* you're using handles them properly. For example, if you run two inputs such as `foo` and `foo\0bar` through the KDF (with the same fixed salt) and you end up with the same result, that suggests the library you're using doesn't handle null bytes properly - in which case you'd need to remove them.
phoenixdown avatar
tr flag
Thanks that's a great idea - i can add a test to ensure `foo` and `foo\0bar` do not evaluate to the same result, and fail if they do.
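
A rough sketch of such a test in Node.js, assuming the built-in `crypto.scryptSync` is the implementation in use. The fixed salt and the 32-byte output length are placeholders for the test only, and a real deployment should use a random salt per password.

```javascript
const crypto = require('node:crypto');

// Fixed salt so both derivations differ only in the password bytes.
// Test-only value; never reuse a fixed salt for real keys.
const testSalt = Buffer.from('fixed-salt-for-test-only', 'utf8');

const plain = crypto.scryptSync(Buffer.from('foo', 'utf8'), testSalt, 32);
const withNull = crypto.scryptSync(Buffer.from('foo\0bar', 'utf8'), testSalt, 32);

// If the implementation truncated the input at the null byte,
// both inputs would collapse to "foo" and yield the same key.
if (plain.equals(withNull)) {
  throw new Error('scrypt implementation appears to truncate input at null bytes');
}
console.log('null bytes are preserved by this scrypt implementation');
```

Running this once in a test suite is enough to catch a library that treats passwords as C strings. It does not need to run on every key derivation.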