Developer, tinkerer, thinker

Why Bruce Schneier is wrong about passwords

March 11th, 2014

Bruce Schneier had an interesting article about passwords and password hash cracking last week.

His characterisation of how password cracking can vacuum up relevant data and add it to a dictionary is excellent.

I hold Bruce Schneier in high regard, both for his technical work (I renewed Applied Cryptography about 6 times from our University library) and his sensible campaigning on security-related issues. However, there were a few points where I don’t think his analysis holds up.

Firstly, he criticises the xkcd password scheme, on the basis that crackers know about it and will search for multiple joined dictionary words. With the analysis below, we can see that this is not actually a problem.

Secondly, there are some issues with the scheme he describes for phrase-based passwords (the Schneier scheme). His description has a terrible mistake from a security usability point of view, which I hope he will fix. For the same entropy, it is harder to remember and type than the xkcd scheme. And the passwords it generates are probably of lower entropy than we might estimate, although attacking that is probably still infeasible.

Why the xkcd scheme works

Bruce writes: “This is why the oft-cited XKCD scheme for generating passwords — string together individual words like ‘correct horse battery staple’ — is no longer good advice. The password crackers are on to this trick.”

The issue here is that joining together several random words is not a “trick” for generating good passwords; it is a straightforward way to ensure there is a measurably safe amount of entropy in the password. By entropy, we mean the information content, as in the number of bits needed to identify this password if we had a complete list of possible passwords. And this is the same argument as for any security system that doesn’t use obscurity: the strength of the security comes from the amount of entropy in the keys (i.e. the output of the random number generators).

Because each word is chosen independently and randomly, the entropy of the password is equal to the entropy of the word list (log2 of the length of the word list) multiplied by the number of words.

If I run grep '^[a-z]\{4,7\}[a-rt-z]$' /usr/share/dict/words | wc -l I get 22350 words that are between 5 and 8 letters long and aren’t plurals. If I use grep '^[a-z]\{4,7\}[a-rt-z]$' /usr/share/dict/words | shuf --random-source /opt/google/chrome/product_logo_128.png -n10 to quickly check these words are sensible (and I have nothing up my sleeve), the first ten I got were: nonzero sprouted vamoosed select jacket shaped chowing sprayed unwisest abashed. It’s not Finnegans Wake; these are reasonable words.

Okay, so we got 22350 words, how much entropy is that? Rounding down in case we need to trim the word list, we’ll call it log2(16384), or 14 bits per word. Four words is then 56 bits.
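As a quick sanity check, here is that arithmetic in a few lines of Ruby (using the word count from the grep above):

# Entropy of one randomly chosen word is log2(size of the word list);
# independent random choices simply add, so four words give four times that.
words_in_list = 22_350
puts Math.log2(words_in_list)      # => ~14.45 bits per word
puts 4 * Math.log2(16_384).to_i    # => 56 bits, after rounding down to 14 bits per word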

Is 56 bits enough? Well, Bruce quotes one cracker as being able to check 8 million passwords per second, and notes that this could be parallelized across multiple machines and run for months. 8 million is about 23 bits and a thousand machines is about 10 bits, so a thousand machines running for a month (2592000 seconds, ~21 bits) can enumerate a space of about 54 bits.
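The same back-of-the-envelope figure for the attacker’s side, again in Ruby:

# How much password space can 1000 machines, each checking 8 million
# passwords per second, cover in a month?
guesses_per_second = 8_000_000            # ~23 bits
machines           = 1_000                # ~10 bits
month_in_seconds   = 30 * 24 * 60 * 60    # 2592000, ~21 bits
puts Math.log2(guesses_per_second * machines * month_in_seconds).round(1)  # => ~54.2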

That’s uncomfortably close; however, this assumes a very well-motivated and well-resourced attacker. If the passwords have even been salted, then this is almost certainly too much effort for one user’s website password: at roughly $0.10 per machine-hour, it would cost about 0.1*24*30*1000 = $72,000 to rent the required machines from Amazon. If the site owner has used bcrypt or similar, you are safe here.

If you are worried or need more security (e.g. for your primary e-mail account), adding one or two more words is enough to be safe against any feasible cracking attempt (of course, even at the four-word mark, an attacker willing to invest tens of thousands of dollars is more likely to put it into social engineering and/or breaking laws).

Issues with the Schneier scheme

Choose your own sentence — something personal.

This little piggy went to market

Firstly, in Bruce’s written description, he says you should use a personally memorable sentence. He isn’t explicit that it should be made up by you, and two of his example sentences are quotations (and famous ones at that). This is a terrible idea, because crackers can and will index famous quotations to use them in scanning programs (a complete Wikiquote dump is only 70MB compressed, and is going to have most of the quotes you think up).

This is mostly a problem with the presentation though; the scheme as actually described (a sentence you make up yourself) is much better.

Memory vs entropy

The problem the xkcd cartoon highlighted is that these kinds of password-generation tricks with substitutions and special characters are harder for humans to remember than a plain string of words, for a given information content.

In comparing the schemes, I would therefore ignore case and symbol scattering — they are hard to remember, most of the examples given do not vary case or use much punctuation, and if you are asking the user to remember special punctuation and substitutions on top of the base password, well, that can be used on the xkcd scheme just as easily.

So the overall scheme reduces to an English sentence (lower entropy than random words, both because of grammar, and because a person made it up and people are bad at random), and you throw away most of the characters, keeping just the starting ones. I don’t believe a sentence of about 8-10 words, keeping only the first letter of most words, can be more secure. Let’s test.

zxcvbn is a great tool made by Dropbox to test password strength. It splits the password into sections and makes an entropy estimate for each section using a variety of entropy estimators (dictionaries of words and names, dates, repeated characters, brute force, and even leetspeak and patterns on the keyboard). Let’s try it on the examples Bruce gives:

Password             Entropy (bits)
WIw7mstmsritt        69
uTVMTPw55            59
utvmtpwstillsecure   51
Ltimeagoinagfaaa     46
Wowdoestcst          43
tlpWENT2m            40

That’s quite a range. The longer sentences, and the ones with numbers, do better, and arguably have more entropy than the xkcd scheme (although I have some theoretical reservations below). However, many of them do worse than the xkcd scheme. They are also harder to remember exactly, and more awkward to type, than a few complete words. This is not a good scheme, and it only achieves good security through hard-to-remember punctuation.
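If you want to reproduce these numbers, zxcvbn is easy to script. A minimal sketch, assuming the zxcvbn-ruby port of the library and that its Zxcvbn.test call returns a result with an entropy field (check the gem’s README for the exact interface):

# Hedged sketch: Zxcvbn.test and result.entropy are assumed from the
# zxcvbn-ruby gem; the canonical implementation is the JavaScript library.
require 'zxcvbn'

%w[WIw7mstmsritt uTVMTPw55 tlpWENT2m].each do |password|
  result = Zxcvbn.test(password)
  puts format('%-20s %s', password, result.entropy)
end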

Low entropy content

Although the zxcvbn meter is using brute force to estimate most of these passwords, the actual entropy content is likely lower for two reasons: English grammar, and poor human randomness.

Because we are taking a real sentence, rather than a set of random words, we have the redundancy of grammar, which could easily consume half of our entropy (consider, could you fill in the blanks in this sentence: “I _ walking _ dog _ it _ raining”?)

Also, and Bruce makes this point, a cracker will gather any relevant information like pets’ names, important dates, events attended, affiliations, and so on. All of these will give clues to the sentence the user might select.

However, even given the lower theoretical entropy content, a cracker also needs to generate candidate passwords very fast, so it is unlikely either of these is too much of a problem.

Passwords through obscurity

The overall problem with recommendations like Bruce’s (and the GRC Haystack method) is that they basically rely on the cracker not knowing the scheme you use — and this reliance is dangerous. Even if you use a different password for every site, are you going to use a different punctuation/padding scheme? Security through obscurity is, at best, only temporary.

In contrast, the xkcd scheme is simple to remember, and is still secure assuming the attacker knows how you generate your passwords. A quick:

if [ $(grep '^[a-z]\{4,7\}[a-rt-z]$' /usr/share/dict/words | wc -w) -lt 16384 ]; then echo "Failed to find enough words"; else grep '^[a-z]\{4,7\}[a-rt-z]$' /usr/share/dict/words | shuf --random-source=/dev/urandom -n4; fi

is generally enough.
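Or, if you prefer, the same generator as a short Ruby script (assuming the usual /usr/share/dict/words file is present):

require 'securerandom'

# Build the word list with the same filter as the grep above, then pick
# four words using a cryptographically secure random source.
words = File.readlines('/usr/share/dict/words', chomp: true)
            .grep(/\A[a-z]{4,7}[a-rt-z]\z/)

abort 'Failed to find enough words' if words.length < 16_384

puts Array.new(4) { words[SecureRandom.random_number(words.length)] }.join(' ')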

Debugging is… science!

March 7th, 2014

Doing a good job of debugging code is pretty much the same as science: you must try and prove yourself wrong.

How to do science in four easy steps

  1. Formulate a hypothesis.
  2. Think of an experiment that would disprove the hypothesis.
  3. Perform the experiment.
  4. Return to step 1 with a new piece of knowledge, or stop when you have enough knowledge.

In debugging, like science, the important thing is to try and prove yourself wrong.

Sure that function is being called? Well, then a print statement or a breakpoint is called for.

Sure that variable never goes below zero? An assertion in the setter or a watch on modifications will save you.
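For example (a made-up Account class, purely for illustration), an assertion in the setter turns that belief into something the program checks on every write:

# Hypothetical example: fail loudly the moment the "never below zero"
# assumption stops being true, instead of discovering the bad value later.
class Account
  attr_reader :balance

  def balance=(value)
    raise ArgumentError, "balance went negative: #{value}" if value < 0

    @balance = value
  end
end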

Because if your code did what you thought it did, you wouldn’t be debugging it.

Be correct, by proving yourself wrong.

Laziness, Ruby, reduce and the map that wasn’t

March 6th, 2014

The other day, I was writing some Ruby code to go through a list, apply a function to each element, and return the first non-nil result. A good example would be matching a RegExp, and returning the MatchData for the first element of the array that matched.

There is an Enumerable#grep function, which almost does this, but not quite. Other than that, there didn’t seem to be a suitable function.

Now, you could do this with a map, followed by a find, but that seems inefficient since you only want the first non-nil result. Probably a first-world programming problem, but it got me wondering whether there was a way to do this efficiently, simply and idiomatically.

Hoogle

Hoogle is a great tool for looking up Haskell functions by their type signature. I figured that if there was a Haskell function for this, I wasn’t going crazy, and it might give me an idea of what the function should be called. In this case, we want a function that takes a list of something and a function that converts that something into either Nothing or Just something else (equivalent to returning nil or a value in Ruby, but in a statically-typed way). The Haskell signature is therefore [a] -> (a -> Maybe b) -> Maybe b.

Search for that on Hoogle, and you don’t get any useful results. So am I crazy for wanting it? Well, no.

Laziness

Because Haskell is lazy, map followed by find is efficient: the map will only be run as many times as needed to get output from find.

This reminded me of a more general problem with Ruby – its reduce function is strict (not lazy), so it always consumes the whole list, even if you don’t need it to. So, since this list function can be written as a reduce, if we had a lazy reduce, we would have our solution.
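To make the strictness concrete, here is a small made-up example: reduce visits every element, even though the answer is settled after the first one.

calls = 0
result = [3, nil, 7, nil].reduce(nil) do |memo, x|
  calls += 1
  memo || x          # keep the first non-nil value we see
end
puts result  # => 3
puts calls   # => 4 -- the whole array was walked anyway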

Lazy reduction

Let’s take a simple definition of reduce, and rewrite it so that it still works through the list from the left, but lazily.

def reduce(initial, &block)
  each do |x|
    initial = block.call(initial, x)
  end
  initial
end

Okay, so how do we make this lazy? The block normally gets the initial value (or memo value) and the next value, and returns a new memo value. Instead, the block will get the next value in the list, and a thunk it can call to reduce the rest of the list (which returns initial once the list is empty).

def reduce_left(initial, &block)
  again = lambda do |list|
    if list.empty?
      initial
    else
      # Defer reducing the rest of the list until the block asks for it.
      thunk = lambda do
        again.call(list.drop(1))
      end
      block.call(list.first, thunk)
    end
  end
  again.call(self)
end

Calling list.drop(1) like this copies the tail of the list at every step, which is inefficient, so we’ll rewrite it to use an Enumerator in a minute. First, let’s see how we could use it.

Implementing map_first

Without further ado, the first non-nil result (or nil if there is none) is:

def map_first(&block)
  reduce_left(nil) do |x, rest|
    block.call(x) || rest.call
  end
end

Note this has the short-circuiting OR structure we would expect.
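As a made-up usage example (assuming reduce_left and map_first are defined on Array, e.g. by reopening the class), scanning some lines for the first order number:

lines = ['starting up', 'order #1234 accepted', 'order #5678 accepted']

match = lines.map_first { |line| /order #(\d+)/.match(line) }
puts match[1]  # => "1234"; the regexp never runs against lines after the first match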

An efficient implementation of reduce_left

We should use an Enumerator, so it would look like this:

def reduce_left(initial, &block)
  e = self.each
  again = lambda do
    begin
      x = e.next
    rescue StopIteration
      # End of the list: the thunk evaluates to initial, so any pending
      # computations in the block still get a value to work with.
      initial
    else
      block.call(x, again)
    end
  end
  again.call
end
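A couple of quick made-up checks: it behaves like an ordinary fold when the block always forces the rest of the list, and it short-circuits when the block doesn’t:

puts [1, 2, 3].reduce_left(0) { |x, rest| x + rest.call }     # => 6
puts [1, 2, 3].map_first { |x| "found #{x}" if x.even? }      # => "found 2"; 3 is never looked at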

And it has only one lambda, and no copying of the list’s tail at each step. Thus concludes the blog post.
