Summary: Jungledisk doesn't protect the integrity of encrypted data, and it doesn't securely derive keys, so it is vulnerable to fast offline attacks. The thing Jungledisk gets right is to use the same block cipher mode as Tarsnap (and, incidentally, virtually every mainstream encrypted storage system).
The impact of using unauthenticated encryption to store data is that your backup provider could end up owning your machine. Attackers can carefully choose which data to corrupt. They can exploit the randomization of corrupted decryption to set up conditions for memory corruption exploits, and, in more sophisticated but totally realistic attacks, exploit guesses about known plaintext to produce attacker-controlled nonrandom plaintexts. A backup provider with client-authenticated crypto can't do that, because the keys that encrypt the data also ensure its integrity.
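To make the known-plaintext point concrete, here is a minimal sketch of the attack and of an encrypt-then-MAC fix. It assumes a counter-mode style cipher; the `keystream` function is a toy built from SHA-256 that stands in for AES-CTR, and every name in it (`tamper`, `seal`, `open_sealed`) is hypothetical, not anything from Jungledisk or Tarsnap.

```python
import hashlib
import hmac

def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy counter-mode keystream from SHA-256 (illustrative stand-in for AES-CTR)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def encrypt(key: bytes, nonce: bytes, plaintext: bytes) -> bytes:
    return xor(plaintext, keystream(key, nonce, len(plaintext)))

# In a stream construction, decryption is the same XOR.
decrypt = encrypt

def tamper(ciphertext: bytes, guessed: bytes, wanted: bytes) -> bytes:
    """Attacker without the key: if the plaintext guess is right,
    the result decrypts to `wanted` instead."""
    return xor(ciphertext, xor(guessed, wanted))

def seal(enc_key: bytes, mac_key: bytes, nonce: bytes, plaintext: bytes):
    """Encrypt, then authenticate nonce and ciphertext with HMAC-SHA256."""
    ciphertext = encrypt(enc_key, nonce, plaintext)
    tag = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    return ciphertext, tag

def open_sealed(enc_key: bytes, mac_key: bytes, nonce: bytes,
                ciphertext: bytes, tag: bytes) -> bytes:
    """Verify the tag before touching the ciphertext; reject on mismatch."""
    expected = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("ciphertext failed authentication")
    return decrypt(enc_key, nonce, ciphertext)
```

Without the tag, the forged ciphertext decrypts cleanly to whatever the attacker chose. With it, `open_sealed` refuses the modified data before any of it is parsed.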
The password storage issue is no different than any other password storage problem; again, direct your attention to http://codahale.com/how-to-safely-store-a-password/, mentally substituting "derivation of AES key" for "storage of password hash".
To my mind, the key derivation is the real problem here. A surprisingly large number of secure encryption storage products don't ensure data integrity. Realistic attacks against that vulnerability are feasible but difficult: you'd have to be targeted.
If you're going to write an article about how a competitor's encryption is inferior to yours and cast it as a vulnerability report, I'd suggest not recommending your own encryption scheme as the replacement. The scrypt recommendation in this article sticks out like a sore thumb. Virtually nothing uses scrypt.
We can nerd out on CTR mode vs. CBC mode; I'm starting to come around to Colin's take on CTR because of ciphertext indistinguishability as I see more practical vulnerabilities that take advantage of it. I think the padding issue is a red herring. CBC padding is easier to get right than absolute rock solid reliable generation of CTR nonces and absolute rock solid management of CTR counters, which are things I see people get wrong regularly. Distinguishability is the real problem with CBC.
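As an illustration of why CTR nonce handling has to be rock solid, the sketch below reuses one nonce for two messages. The keystream is again a toy SHA-256 construction standing in for AES-CTR, and `recover_second` is a hypothetical helper, so treat this as a demonstration of the failure mode rather than of any real product's code.

```python
import hashlib

def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy counter-mode keystream from SHA-256 (illustrative stand-in for AES-CTR)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def ctr_encrypt(key: bytes, nonce: bytes, plaintext: bytes) -> bytes:
    return xor(plaintext, keystream(key, nonce, len(plaintext)))

def recover_second(c1: bytes, c2: bytes, known_p1: bytes) -> bytes:
    """If c1 and c2 share a key and nonce, the keystream cancels out:
    c1 XOR c2 == p1 XOR p2, so knowing p1 reveals p2 without the key."""
    return xor(xor(c1, c2), known_p1)
```

With distinct nonces the same computation yields noise, which is the property a CTR implementation has to preserve for every message it ever encrypts under a given key.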
To my mind, the key derivation is the real problem here. A surprisingly large number of secure encryption storage products don't ensure data integrity. Realistic attacks against that vulnerability are feasible but difficult: you'd have to be targeted.
I think the lack of integrity is more important than you're making it sound. There are a lot of situations where a lack of integrity can be exploited to create a lack of privacy too.
But the main reason I mentioned the lack of integrity first is that I needed to mention the lack of HMAC to explain why they had the ridiculous "salted key hash" construct.
If you're going to write an article about how a competitor's encryption is inferior to yours and cast it as a vulnerability report, I'd suggest not recommending your own encryption scheme as the replacement. The scrypt recommendation in this article sticks out like a sore thumb. Virtually nothing uses scrypt.
I think you're misstating what I wrote a bit. I said that scrypt is the state of the art in the field -- which it is -- and that given that Jungle Disk was around before I developed scrypt, they should have used PBKDF2 or bcrypt.
I'd rather geek out about CTR v CBC than harp on the scrypt recommendation. Consider the scrypt thing a friendly style note. You wrote an article about a competitor's insecurities. When you do that, don't recommend they adopt your own cryptosystem unless (like CRI had to do with DPA countermeasures) they have to. Here, it just made you look unnecessarily petty.
What privacy attacks were you thinking of? Call some of them out.
I think the author's point about privacy is valid, and a little silly. If I understand correctly (the article is very confusingly worded in some places), he is saying that weak passwords are weak. Anyone who cares about privacy should already be choosing long, complex, strong passwords for this kind of application.
Also, I'm confused about one feature of JD. When I signed up years ago, they allowed me to hold my key privately and it never left the client. I had the option to upload that key to the server, if I wanted to, or not. I understand from the article that the client might misbehave and, for example, share my key in ways I don't want it to. Am I getting this right?
When I looked into secure cloud-based storage two years ago, I found that JD was the best mix of privacy and convenience, if for no other reason than it could be deployed on a mix of Windows, Mac and Linux boxes. It was clear even then that data integrity was the weak link/trade off.
I'm interested in hearing about the latest, best solutions for easy, cross-platform, secure backups to cloud services that offer better data integrity.
This article points out two flaws. Neither of them is silly.
First, there's no integrity protection on data stored on Jungledisk. Jungledisk can own up your machine. That's not a good property for a secure backup system to have.
Second, the key derivation scheme it uses makes every passphrase, no matter how carefully chosen, drastically weaker.
I'm glad you like Jungledisk and I don't think you need to read stories like this as an indictment of your choice or a demand to change services. But it doesn't help to downplay them.
Second, the key derivation scheme it uses makes every passphrase, no matter how carefully chosen, drastically weaker.
I'd just like to repeat this point because it's so important. The password verification method in JungleDisk is fundamentally broken and needs to be rearchitected immediately.
For non-cryptography people, this is similar to the vulnerability that allowed passwords to be retrieved from the Gawker database hack a couple months ago (just not quite as vulnerable).
OK, I freely admit that I'm not an expert in this area, so I'll rescind my "silly" comment.
But, "drastically weaker" than what? If the password is strong, JD doesn't make it weak. JD just doesn't make it as strong as it should/could? Is this correct?
But, "drastically weaker" than what? If the password is strong, JD doesn't make it weak. JD just doesn't make it as strong as it should/could? Is this correct?
Correct. The vast majority of people can't remember strong passwords, so it's necessary to "strengthen" them using a good key derivation function. Jungle Disk doesn't do this.
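For readers wondering what "strengthening" looks like in practice, here is a minimal sketch using PBKDF2 from Python's standard library. The iteration count, salt size and function names are assumptions chosen for illustration; they are not Jungle Disk's or Tarsnap's parameters.

```python
import hashlib
import os

ITERATIONS = 200_000  # assumed work factor; raise it until one derivation is as slow as you can tolerate
SALT_BYTES = 16       # assumed salt size
KEY_BYTES = 32        # 256-bit output, e.g. for an AES-256 key

def new_salt() -> bytes:
    """Fresh random salt, so identical passphrases give different keys."""
    return os.urandom(SALT_BYTES)

def derive_key(passphrase: str, salt: bytes, iterations: int = ITERATIONS) -> bytes:
    """Stretch a passphrase into a key with PBKDF2-HMAC-SHA256.
    Every guess an attacker makes costs the same `iterations` of work."""
    return hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, iterations, dklen=KEY_BYTES
    )
```

The salt and the iteration count are not secret and would be stored next to the encrypted data. The point is only that each offline guess becomes expensive, which a single pass of a fast hash does not achieve.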
OK, well, I guess I don't see how that's fairly described as a "flaw" in JungleDisk.
I can understand why a responsible developer should assume their users are simple-minded mouth-breathers who can't be trusted to choose a proper password (and I'm sure there's plenty of evidence to support that assumption), but it just isn't right to characterize JungleDisk as compromised from a security perspective because it relies on the user to choose a strong password.
Saying that Jungle Disk is secure as long as users pick strong passwords is like saying that the Ford Pinto is safe as long as drivers don't get into rear-end collisions. In both cases you're asking for behaviour which we know perfectly well that users don't exhibit; and in both cases there is a simple fix for the problem.
The Ford Pinto is an unsafe car, and Jungle Disk is an insecure backup service.
I'm trying to understand this. Again, I'm no expert.
I can see why the data integrity issues may allow external factors to compromise the security of my buckets and/or local device. That's me in a Pinto, at the mercy of the bad driver behind me.
I don't see how password strength is open to any external factor; it would seem to be purely a matter of user error. That part doesn't seem to fit the Pinto analogy. That's where I'm struggling to follow your article.
The issue is how fast the password can be broken. MD5 is a very fast hash, so even a relatively slow computer can do a lot of attempts very quickly.
Bcrypt, on the other hand, can be tuned to go as slow as you want. You can force it to take 250 milliseconds, regardless of how good or bad the password is.
And that is the fundamental flaw. Jungle Disk's key derivation makes it possible to crack your password in a reasonable time; bcrypt does not. Because of that decision, everybody's data is much less safe (I'm referring to everybody's data in a statistical sense: the average password sucks and is easily broken in this scheme, so the average file is at risk).
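A rough back-of-the-envelope version of that argument, as a sketch: the 250 ms figure comes from the comment above, while the fast-hash guess rate and the example password space are assumptions picked purely for illustration.

```python
def keyspace(alphabet_size: int, length: int) -> int:
    """Number of candidate passwords of exactly `length` symbols."""
    return alphabet_size ** length

def seconds_to_exhaust(alphabet_size: int, length: int,
                       guesses_per_second: float) -> float:
    """Worst-case time to try every candidate at the given guess rate."""
    return keyspace(alphabet_size, length) / guesses_per_second

FAST_HASH_RATE = 1e9       # assumed guesses/second against a single fast hash
SLOW_KDF_RATE = 1 / 0.250  # 4 guesses/second if each derivation takes 250 ms

if __name__ == "__main__":
    # Example space: 8 characters drawn from lowercase letters and digits.
    fast = seconds_to_exhaust(36, 8, FAST_HASH_RATE)
    slow = seconds_to_exhaust(36, 8, SLOW_KDF_RATE)
    print(f"fast hash: {fast / 3600:.1f} hours")
    print(f"slow KDF:  {slow / (3600 * 24 * 365.25):.0f} years")
```

Under these assumed rates the same password goes from under an hour to tens of thousands of years for one machine. The exact numbers depend entirely on the attacker's hardware, but the ratio between the two rates is what matters.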
As a provider of security software (which my company is too), Jungle Disk should be doing everything it can to help users keep their data secure. Jungle Disk isn't doing that.
OK. I think I understand now. I still don't think it's fair to call it defective design (and I'm not really certain that you ever did call JD's password privacy defective, BTW). More like a design that is unsafe in the hands of the typical driver, perhaps.
Why do I care? I just want to understand the risks for someone like me, who has taken care to choose very strong passwords.
My conclusions from all of this:
(1) The data integrity issue is serious because it presents an opportunity for introduction of malicious code, creates a risk of data loss, and may lead to security breaches.
(2) The local binary is opaque, and therefore presents a theoretical risk of compromising even the most "close to the vest" key management strategy.
(3) The password protection issue is a serious shortcoming that can, and should, be mitigated by choosing strong passwords.
One way requires the user to have a drastically stronger password to be safe, and the other significantly strengthens passwords, protecting a subclass of users that will always exist (those that are unable to remember strong passwords or don't know enough about the dangers of password cracking to know how to effectively choose passwords).
It is madness to defend the use of MD5 for password hashing these days. It is clearly not designed for that at all.