A practical solution to the BREACH vulnerability

James RoperAugust 19th, 2013Last Updated: August 19th, 2013

0 38 5 minutes read

Two weeks ago CERT released an advisory for a new vulnerability called BREACH. In the advisory they say there is no practical solution to this vulnerability. I believe that I’ve come up with a practical solution that we’ll probably implement in Play Frameworks CSRF protection.

Some background

First of all, what is the BREACH vulnerability? I recommend you read the advisory, there’s no point in me repeating it here, but for those that are lazy, here are is a summary. The prerequisites for exploiting this vulnerability are:

The target page must be using HTTPS, preferably with a stream cipher (eg RC4) though it is possible to exploit when block ciphers with padding are used (eg AES)
The target page must be using HTTP level compression, eg gzip or deflate
The target page must produce responses with a static secret in them. A typical example would be a CSRF token in a form.
The target page must also reflect a request parameter in the response. It may also be possible to exploit if it reflected POSTED form body values in the response.
Responses must be otherwise reasonably static. Dynamic responses, particularly ones that vary the length of the response, are much more expensive to exploit.
The attacker must be able to eavesdrop on the connection, and specifically, measure the length of the encrypted responses.
The attacker must be able to coerce the victims browser to request the target web page many times.

To exploit, the attacker gets the victims browser to submit specially crafted requests. These requests will contain repeat patterns that the compression algorithm will compress. If the pattern matches the first part of the secret, then the response will be shorter than if it doesn’t, since that part of the secret will also be compressed along with the repeat patterns. Then character by character, the attacker can determine the secret.

Some work arounds

The advisory mentions some work arounds. Whether these work arounds are effective depend greatly on the specific application, none of them can be effectively done by a framework, without potentially breaking the application.

Probably the most effective of the work arounds is randomising the secret on each request. In the case of CSRF protection tokens, which is often provided by many frameworks, this would prevent a user from using the application from multiple tabs at the same time. It would also cause issues when a user uses the back button.

I would like to propose a variant of using randomised tokens, that should work for most framework provided CSRF protection mechanisms, and that, pending feedback from the internet on whether my approach will be effective, we will probably implement in Play Framework.

Signed nonces

The idea is to use a static secret, but combine it with a nonce, sign the secret and the nonce, and do this for every response that the secret is sent in. The signature will effectively create a token that is random in each response, thus violating the third prerequisite above, that the secret be static.

The nonce does not need to be generated in a cryptographically secure way, it may be a predictable value such as a timestamp. The important thing is that the nonce should change sufficiently frequently, and should repeat old values sufficiently infrequently, that it should not be possible to get many responses back that use the same nonce. The signature is the unpredictable part of the token.

Application servers will need to have a mechanism for signing the nonce and the secret using a shared secret. For applications served from many nodes, the secret will need to be shared between all nodes.

The application will represent secrets using two types of tokens, one being “raw tokens”, which is just the raw secret, the other being “signed tokens”. Signed tokens are tokens for which a nonce has been generated on each use. This nonce is concatenated with the raw token, and then signed. An algorithm to do this in Scala might look like this:

def createSignedToken(rawToken: String) = {
  val nonce = System.currentTimeMillis
  val joined = rawToken + "-" + nonce
  joined + "-" + hmacSign(joined)
}

where hmacSign is a function that signs the input String using the applications shared secret using the HMAC algorithm. HMAC is not the only signing algorithm that could be used, but it is a very common choice for these types of use cases.

Each time a token is sent in a response, it must be a newly generated signed token. While it is ok to publish the raw token in HTTP response headers, to avoid confusion on which incoming tokens must be signed and which can be raw, I recommend to always publish and only accept signed tokens. When comparing tokens, the signature should be verified on each token, and if that passes then only the raw part of the tokens need to be compared. An algorithm to extract the raw token from the signed token created using the above algorithm might look like this:

def extractRawToken(signedToken: String): Option[String] = {
  val splitted = signedToken.split("-", 3)
  val (rawToken, nonce, signature) = (splitted(0), splitted(1), splitted(2))
  if (thetaNTimeEquals(signature, hmacSign(rawToken + "-" + nonce))) {
    Some(rawToken)
  } else {
    None
  }
}

where thetaNTimeEquals does a String comparison with Θ(n) time when the lengths of the Strings are equal, to prevent timing attacks. Verifying that two tokens match might look like this:

def compareSignedTokens(tokenA: String, tokenB: String) = {
  val maybeEqual = for {
    rawTokenA <- extractRawToken(tokenA)
    rawTokenB <- extractRawToken(tokenB)
  } yield thetaNTimeEquals(rawTokenA, rawTokenB)
  maybeTrue.getOrElse(false)
}

Why this works

When using a signed token, the attacker can still work out what the raw token is using the BREACH vulnerability, however since the application doesn’t accept raw tokens, this is not useful to the attacker. Because the attacker doesn’t have the secret used to sign the signed token, they cannot generate a signed token themselves from the raw token. Hence, they need to determine not just the raw token, but an entire signed token. But since signed tokens are random for each response, this breaks the 3rd prerequisite above, that secrets in the response must be static, hence they cannot do a character by character evaluation using the BREACH vulnerability.

Encrypted tokens

Another option is to encrypt the concatenated nonce and raw token. This may result in shorter tokens, and I am not aware of any major performance differences between HMAC and AES for this purpose. APIs for HMAC signing do tend to be a little easier to use safely than APIs for AES encryption, this is why I’ve used HMAC signing as my primary example.

Framework considerations

The main issue that might prevent a framework from implementing this is that they might not readily have a secret available to them to use to do the signing or encrypting. When an application runs on a single node, it may be acceptable to generate a new secret at startup, though this would mean the secret changes on every restart.

Some frameworks, like Play Framework, do have an application wide secret available to them, and so this solution is practical to implement in application provided token based protection mechanisms such as CSRF protection.

Reference: A practical solution to the BREACH vulnerability from our JCG partner James Roper at the James and Beth Roper’s blogs blog.