# Attacking Hardware AES with DFA

For the past couple of months, I have been trying to extract the hardware keys from the PlayStation Vita. I wrote a paper describing the whole process with all the technical details, but I thought I would also write a more casual blog post about it. Consider this a companion piece to the paper, where I expand on the process and the dead ends rather than just presenting the results. In place of technical accuracy, I will attempt to provide more intuitive explanations and give background information omitted in the paper.

## DFA

For a nice practical introduction to differential fault analysis, check out this article on using DFA to attack white-box software AES. The authors give a good explanation that is not overly academic and actually present code at the end (which we use for our attack). The main idea of DFA is this: we can use glitch attacks on AES hardware just as we can on processors, but instead of using them to control code execution, we use them to produce faulty AES encryptions under the right key. Since AES is a brittle algorithm, slight modifications will cause it to leak information about the key in unintended ways, and we abuse this fact.
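To make the "faulty encryptions leak the key" idea concrete, here is a toy sketch of DFA on a single byte of the last AES round. It is not the attack implemented in phoenixAES or dfa-aes. It assumes a much simpler fault model (one bit flipped right before the final SubBytes), and all function names are made up for illustration. The last round computes `c = SBOX[x] ^ k` for each byte, so a correct/faulty ciphertext pair only stays consistent with key guesses `k` for which the two undone S-box inputs differ by a single bit.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo the AES polynomial 0x11B."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result

def build_sbox():
    """Generate the AES S-box: field inverse followed by the affine map."""
    sbox = []
    for x in range(256):
        inv = next((y for y in range(1, 256) if gf_mul(x, y) == 1), 0)
        s = inv
        for shift in range(1, 5):
            s ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox

SBOX = build_sbox()
INV_SBOX = [0] * 256
for i, s in enumerate(SBOX):
    INV_SBOX[s] = i

BIT_FLIPS = [1 << i for i in range(8)]  # assumed fault model: one flipped bit

def last_round_byte(state_byte, key_byte, fault=0):
    """One byte of the final round: SubBytes then AddRoundKey (toy model)."""
    return SBOX[state_byte ^ fault] ^ key_byte

def key_candidates(good, bad):
    """Key guesses for which the undone S-box inputs differ by a single bit."""
    return {k for k in range(256)
            if INV_SBOX[good ^ k] ^ INV_SBOX[bad ^ k] in BIT_FLIPS}

def recover_key_byte(pairs):
    """Intersect the candidate sets of several (correct, faulty) pairs."""
    candidates = set(range(256))
    for good, bad in pairs:
        candidates &= key_candidates(good, bad)
    return candidates

# Simulated "device": the attacker only ever sees the ciphertext bytes.
secret_key = 0x2B
pairs = [(last_round_byte(x, secret_key), last_round_byte(x, secret_key, e))
         for x in (0x00, 0x3C, 0xA7, 0xF1) for e in BIT_FLIPS]
print(sorted(recover_key_byte(pairs)))
```

A single pair usually leaves a handful of candidates, and each extra faulty ciphertext shrinks the set. The real attacks do the same kind of filtering, but push the fault through MixColumns so that one fault constrains several key bytes at once.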

Unfortunately, there is not much interest in AES DFA outside of academia. A search on GitHub shows a handful of results, and overall we only found two serious implementations of AES DFA attacks. dfa-aes is an implementation of a 2009 paper where a single precise fault in round 8 and a $2^{32}$ brute force can yield the AES-128 key. phoenixAES (from the authors of the article linked above) is an implementation of a 2003 paper which requires two separate precise faults in round 8 and no brute force (although later on, we will describe some modifications that relax the “precise fault” requirement and increase the required brute force to about $2^8$). There have been many other papers published from 2002 to 2016 describing attacks that assume faults in earlier rounds, faults affecting more bytes, or faults in other parts of the algorithm. However, we were not able to find any source code attached to these papers. In the end, we derived our work from phoenixAES even though it was not state-of-the-art, because writing code is boring and most of the improvements in the literature do not mean much in practice (one hour vs five minutes is a lot of time, but if you only have to do it once, the time it takes to write all that code and debug it would negate the gain).

With that rant aside, the main bulk of work is in perfecting our glitching setup in order to inject precise (as in corrupting no more than a single byte) faults on the AES engine during round 8. Once we have that in place, we can feed the collected samples into phoenixAES (or dfa-aes) and it should Just Work.

## DPA

Before getting into how we designed the setup for DFA glitching, it is worth sidetracking into our (failed) attempt at a DPA attack on the Vita as context for some of the design decisions made later on. Differential power analysis is a type of side channel attack in which the attacker observes the power consumption of the AES engine while it is operating with a secret key and uses those observations to leak the key. First she hypothesizes the value of a part of the key. Next, she defines a power usage model of the AES engine to predict how much power is consumed if a random input is encrypted and the hypothesis is correct. Finally, she actually runs the engine with that input and measures the actual power consumption to see how close the prediction was. By repeating this many times and for different parts of the key, it is possible to find the entire key. The ChipWhisperer wiki has a great introduction to how differential power analysis works that goes into much more detail but is still approachable.
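The hypothesize, predict, and compare loop can be sketched in a few lines against simulated measurements. Everything here is an assumption for illustration: the "power" is the Hamming weight of the first-round S-box output plus Gaussian noise, the comparison uses a Pearson correlation (the variant usually called CPA), and none of the function names come from ChipWhisperer or from our setup.

```python
import random

def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo the AES polynomial 0x11B."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result

def build_sbox():
    """Generate the AES S-box: field inverse followed by the affine map."""
    sbox = []
    for x in range(256):
        inv = next((y for y in range(1, 256) if gf_mul(x, y) == 1), 0)
        s = inv
        for shift in range(1, 5):
            s ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox

SBOX = build_sbox()

def hamming_weight(x):
    return bin(x).count("1")

def simulate_traces(key_byte, count, noise, rng):
    """Fake measurements: leakage = HW(SBOX[p ^ k]) + Gaussian noise."""
    plaintexts = [rng.randrange(256) for _ in range(count)]
    power = [hamming_weight(SBOX[p ^ key_byte]) + rng.gauss(0, noise)
             for p in plaintexts]
    return plaintexts, power

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5 if vx and vy else 0.0

def recover_key_byte(plaintexts, power):
    """Return the guess whose predicted leakage best matches the traces."""
    def score(guess):
        predicted = [hamming_weight(SBOX[p ^ guess]) for p in plaintexts]
        return pearson(predicted, power)
    return max(range(256), key=score)

rng = random.Random(1)
plaintexts, power = simulate_traces(0x4F, 500, 1.0, rng)
print(hex(recover_key_byte(plaintexts, power)))
```

On real hardware each measurement is a whole trace of samples rather than a single number, and most of the effort goes into capturing clean, well-aligned traces, which is where the probing described next comes in.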

In order to do DPA on a target, you need to be able to precisely measure the current in the chip. One way is an application of Faraday’s law: a changing magnetic field induces a voltage. You can measure current with a “magnetic probe.” Colin O’Flynn described at Black Hat how to build your own magnetic probe, and I managed to build one and get it to work with the ChipWhisperer example target.

## Master keys

So far, everything described works for non-master keys. Recall from earlier that master keys cannot be used to directly encrypt content. Instead, the process involves using Bigmac to encrypt some plaintext into another keyslot, where the slave keyslot cannot be read out either. Of course, one way to get around this is to perform two levels of DFA attack: one fault to fill the slave keyslot and then $m$ faults using the slave keyslot in order to recover the faulty ciphertext for the master keyslot. However, we did not go down this route because we already know of a hardware vulnerability in Bigmac that exposes the slave keys.

Davee wrote a great post about how this vulnerability works. In short, because Bigmac does not clear the internal state after a successful encryption, if you perform a second encryption with size < 16 bytes (block size of AES), then it “borrows” the remaining bytes from that internal state (which happens to be the same as the slave key because it was the last encryption operation). Using this fact, we can brute force the remaining bytes with four $2^{32}$ tries to recover a single slave key. (You might notice a theme occurring here: if something doesn’t work, just brute force it.)
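A scaled-down sketch shows why the leftover state turns one impossible search into four small ones. This is a toy model, not Davee's tool: the block is 4 bytes instead of 16, each step guesses one byte instead of four, a truncated SHA-256 stands in for AES under a key the attacker knows, and I assume the overwritten bytes are zeros at the front of the block with a length of 0 leaving the whole state in place. All names are hypothetical.

```python
import hashlib

BLOCK = 4  # toy block size in bytes (a real AES block is 16 bytes)

def toy_encrypt(known_key, block):
    """Stand-in for AES under an attacker-known key (NOT real AES)."""
    return hashlib.sha256(known_key + block).digest()[:BLOCK]

def make_partial(known_key, slave_key, length):
    """Model of the bug: only the first `length` bytes are overwritten
    (with zeros here); the rest is left over from the slave key."""
    block = bytes(length) + slave_key[length:]
    return toy_encrypt(known_key, block)

def bust(known_key, partials):
    """Recover the slave key from the back, one unknown byte per partial."""
    recovered = b""
    for length in range(BLOCK - 1, -1, -1):
        for guess in range(256):
            block = bytes(length) + bytes([guess]) + recovered
            if toy_encrypt(known_key, block) == partials[length]:
                recovered = bytes([guess]) + recovered
                break
        else:
            raise ValueError("no guess matched the partial")
    return recovered

known_key = b"attacker-key"
slave_key = bytes.fromhex("deadbeef")  # secret on the real device
partials = {n: make_partial(known_key, slave_key, n) for n in range(BLOCK)}
print(bust(known_key, partials).hex())
```

Each step only has to search the bytes that are still unknown, because everything behind them was recovered in the previous step. In the toy that is $4 \cdot 2^8$ guesses instead of $2^{32}$; at full size it is $4 \cdot 2^{32}$ instead of $2^{128}$.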

For each master keyslot, we collect around $m=100$ samples (to be safe) of these slave key “partials.” Then we run Davee’s tool to “bust” the partials and recover the slave key. This slave key is the corrupted ciphertext. Then we do the same DFA attack described above and we can recover the master key as well.

For the partial busting, we spun up an AWS c5.18xlarge spot instance (which has 72 AES-NI enabled cores), which can bust one partial in around 15 seconds (the longest we’ve seen was still under a minute).

AWS EC2 core utilization over a couple of hours.

## Conclusion

We recovered all 30 master keys including the slot 0x208 key.

We also recovered 238 of the 240 non-master keys. The last two are AES XEX keys for full-disk-encryption and are locked out before we can execute the RPC payload (which is loaded from the eMMC). Getting them would require additional work that we did not find to be useful because the keys are device unique.

### Costs

Such an attack is not as expensive as one might think. We are hobbyists working on this only during our free time for a span of half a year. We received no funding or access to any professional labs. The total cost of the whole experiment, from the equipment to the boards to AWS EC2, was easily less than \$1000. The majority of that cost was in the Rigol oscilloscope for debugging (\$400) and the ChipWhisperer Lite (\$300). In a world where software attacks are getting harder and harder to pull off and companies are protecting more and more of their software with hardware security, it seems like a huge oversight that the hardware is not protected as well.

The remaining cost was dominated by the death of 9 Vita motherboards. Here are their obituaries: one gave the pinout for eMMC, two led to the realization that 3.3V eMMC damages the SoC, one taught the importance of not keeping the soldering iron too hot, two brought caution in probing since shorting the adjacent 1.1V core to 1.8V IO is not allowed, one had internal metal on a cut trace warp and short due to heat expansion from a reflow, and two died from mysterious causes. (Thanks to everyone who donated spare Vita boards for this experiment.)

### Code

As always, all the tools referenced in this post are public and open source. Please check out the paper for more details on the setup and implementation.

• Our fork of ChipWhisperer contains all the modifications needed to glitch the Vita target.
• f00dsimpleserial includes the RPC payload, the ChipWhisperer scripts to run it, the ChipWhisperer scripts to glitch Bigmac and collect ciphertexts, the analysis scripts, and the DFA tools based off of phoenixAES.
• f00d-partial-buster brute forces the slave keys from the partials.
• psvemmc and psvcw are the target boards for interfacing with ChipWhisperer.