This exposes information about the password: namely, its estimated entropy. The bet you're making is that the increased work factor outweighs any advantage an attacker gains from knowing that information.
How could someone use this? Well, I could decide to only target the rows with a low work factor. Since your entropy estimate is high for these rows, I can know that it's more likely they'll be 8 characters or longer and use a wider range of characters. I can likely ignore all candidate passwords that are shorter or that do not include non-alpha characters.
How useful is this? Let's assume 2 choices of work factor. Also let's assume strong passwords of length 8 have 96^8 ~= 53 bits of entropy and weak passwords of length 8 or less have 27^8 ~= 38 bits of entropy.
You just let me cut the search space for strong passwords of length 8 to ~15 bits, and in a double whammy, I get to use bcrypt with a low work factor when brute forcing against these rows.
I'm not a cryptographer, and so it's entirely likely I've made some mistake here. But as a general rule, I think it is an _extremely_ bad idea to use cryptography in any way that exposes additional information about individual rows in a database.
Cryptography is not a place for innovative thinking. Even cryptographers need their cleverness to undergo exhaustive review.
If I’m remembering my discrete math correctly, your claim isn’t correct:
> … Let's assume 2 choices of work factor. Also let's assume strong passwords of length 8 have 96^8 ~= 53 bits of entropy and weak passwords of length 8 or less have 27^8 ~= 38 bits of entropy.
> You just let me cut the search space for strong passwords of length 8 to ~15 bits…
You can’t subtract bits of entropy like that.
Here’s something I hope will convince you this reasoning is faulty.
Imagine a universe of 3-digit passwords, and there are two kinds of passwords: Strong ones use a mix of digits 0–7, and weak ones only use the digits '0' or '1'.
A strong password could be any of the 8^3 = 512 possible combinations (9 bits of entropy), except the 2^3 = 8 combinations (3 bits of entropy) that contain only ones and/or zeros. So while the worst case for brute-forcing a known-weak password is trying 8 strings, the worst case for brute-forcing a known-strong password is trying 504 strings. That is still the same order of magnitude, and still approximately 9 bits of entropy! You removed such an incredibly small sliver of passwords that an attacker really isn’t any better off than before.
Another way to think of this: just because the user didn’t use only lowercase letters doesn’t mean that none of the characters are lowercase!
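To check the toy numbers, here is a minimal Python sketch that enumerates the 3-digit universe directly. The variable names are mine, and "weak" is taken to mean "uses only the digits 0 and 1", as described above.

```python
import math
from itertools import product

# Every 3-character password over the digits 0-7.
all_passwords = ["".join(p) for p in product("01234567", repeat=3)]

# "Weak" here means the password uses only the digits 0 and 1.
weak = [p for p in all_passwords if set(p) <= {"0", "1"}]
strong = [p for p in all_passwords if not set(p) <= {"0", "1"}]

print(len(all_passwords), len(weak), len(strong))  # 512 8 504
print(round(math.log2(len(strong)), 2))            # 8.98
```

Knowing a password is strong only rules out 8 of the 512 candidates, so the search space stays at roughly 9 bits, not 9 - 3 = 6.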
Back to your example, with a strong password search space of 96^8. Now if you know a password is strong, that means it isn’t one of the 27^8 possible weak passwords. By how much does this reduce our search space?
      7,213,895,789,838,336 possible strings of length 8
    - 0,000,282,429,536,481 possible 'weak' 8-char passwords
    = 7,213,613,360,301,855 possible 'strong' 8-char passwords
We’ve reduced our search space by only 0.0039%.
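The same arithmetic as a short Python sketch, using the 96-character and 27-character alphabets assumed in the parent comment (variable names are illustrative):

```python
import math

total = 96 ** 8          # all 8-character strings over a 96-symbol alphabet
weak = 27 ** 8           # 8-character strings over the 27-symbol "weak" alphabet
strong = total - weak    # what remains once the weak strings are excluded

bits_total = math.log2(total)
bits_strong = math.log2(strong)
reduction_pct = 100 * weak / total

print(f"{strong:,}")                                   # 7,213,613,360,301,855
print(f"search space reduced by {reduction_pct:.4f}%")
print(f"entropy lost: {bits_total - bits_strong:.6f} bits")
```

The entropy lost is a tiny fraction of one bit, nowhere near the 38 bits that subtracting the two estimates would suggest.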
That said, rolling your own crypto — which the grandparent post isn’t really quite doing — is something you should run away from, fast, unless you really are a cryptographer!
If I understand the grandparent correctly, there would be no way to determine which rows have a higher work factor. The work factor would be determined when the password is given, based on the password entropy - the correct password will always have the same entropy, therefore the same work factor. If the work factor is calculated on the wrong password (typo, etc.) it will not generate the correct hash anyway. Therefore an attacker has no way of determining which password hashes have a high work factor, and which ones have a low work factor.
Ah, yes, I'd missed that. Since you have the password at login time, you don't need to store the work factor. There are some practical things about making sure you can version the entropy estimate code without breaking existing logins, but it's certainly doable.