The 3DS Cryptosystem

I think one of the biggest challenge for system engineers is designing security. Recently, at the 32c3 conference, plutoo, derrek, and smea presented a series of hacks that completely defeated the security of the 3DS. As a result, people have implemented boot-time unsigned code execution (called “arm9loaderhax” in the 3DS community; other communities might relate this with “untethered jailbreak” or “bootloader unlock”). What I want to do today is not to reveal anything novel, but look at the security of the 3DS as a whole and see what went wrong. In this deep dive, I will hypothesize the design decisions that led to the cryptosystem found on the 3DS. Then I will present the flaws that led to “arm9loaderhax.” Finally, I will summarize the findings and provide a few tips to fellow engineers in hopes that these kinds of mistakes will not be made again. (Extra details are provided in parenthesis, they are for people with deeper knowledge of the 3DS and are not required to understand the rest of the article.)

Preliminary

Let’s begin with the first question that ought to be asked when designing a secure system: what is the threat model? I believe that for Nintendo, it boils down to two things

  • System integrity: all code running on the system should be checked and signed by Nintendo. This ensures that users are safe from malicious code and that Nintendo gets licensing fees from games and applications.
  • Content protection (DRM): code and resources should not be extractable by the user. This protects license holders (creating trust in the system) and ensures intellectual property cannot be stolen by competitors.

Note that the threat model takes account of factors that directly impact business. I’m sure there’s also other points considered (prevent cheating, protect user privacy, etc), but once you know what the most important assets are, you have a better understanding of how to protect them. These take priority. Of the two points listed, which is the most important? You might argue DRM because that’s where most of the money lies (and I think Nintendo did too). However, we will soon see that without system integrity, you cannot even toy with the idea of DRM.

System Integrity

In order to ensure all code running on a system is authorized, there needs to be a trust hierarchy. An example of this is: userland code always trusts kernel code, but kernel code has to cryptographically verify user code before running it. Now who verifies kernel code? The kernel loader. And so on… This process ensures that every component of the system has been audited by some other component.

SecureBootOn the 3DS (as with most systems), the boot ROM is the root of the trust hierarchy. The code there cannot be changed, and contains root cryptographic certificates (public keys).

There are two processors on the 3DS. The ARM9 “security” processor facilitates crypto (access to key generator, AES/RSA engines, etc), file system access to the NAND, and other low-level stuff. The ARM9 processor talks to the ARM11 processor through Process9 (user mode on ARM9) which ideally does security checks and then talks to Kernel9. (However, if you watched the 32c3 talk linked to above, this isn’t necessarily the case as Kernel9 has a system call that is literally “run whatever code you pass in.” Therefore, the diagram on the left is actually wrong as Process9 is not separated from Kernel9 at all.) We will look at the ARM9 Loader later.

The ARM11 “application” processor is the CPU that does everything you see. I go into more details about how modules are loaded in my last article, but the gist of it is that there are kernel calls that only certain system module (running in user mode) can access. Applications must communicate with these modules to get system resources. The ARM11 kernel sometimes has to communicate with Process9 (on ARM9) to get access to more sensitive system resources.

So aside from a couple of concerns, the design of the secure boot chain is mostly solid. (Of course implementation is another story; there has been many flaws in the system software and almost no exploit mitigation.) One good design choice here is that Kernel11 has low exposure. Because the system modules does most of the work, the kernel does not have to expose all syscalls to every application. For example, a game does not have access to the syscalls to map executable memory. So, in order for an exploit found in a game to run arbitrary code, it would have to first compromise the “ro” module (which has the right syscalls). This also means that code running in supervisor mode can be more limited and smaller in size, which means that bugs would be easier to spot. Again, this is all good in theory; but in practice, some very stupid implementation flaws (allowing the GPU to write to executable memory for example) makes this all moot–a story I don’t have time to get into. Another good design choice is that complexity only grows as you move down the trust hierarchy. When there’s more code, there’s more bugs, so keeping the bulk of the complexity in less trusted code is good for security. However, this is only a rule of thumb and should not serve as a type of defense!

Content Protection

Let’s assume for now that we have system integrity (we don’t). How do we implement DRM? The truth is that DRM is impossible in theory. However, as engineers, we do not always have to follow theory. The secret of DRM is that, unlike other cryptosystems, you are not designing it to be secure forever. (Note for the pedantic: I know that no cryptosystem currently known would last forever, but if you can point out this fact, you also know what I mean.) Specifically, if your DRM can last 100 years, most people (on the engineering side) would be very happy. In fact, if you can provably do that, you would “solve” the problem of DRM. Most DRM schemes are designed with decades in mind (something that you might not admit to business people). That means we can commit some security faux pas that the textbooks would forbid. For example, security by obscurity is a tool here. If it takes the hacker 5 years to figure out your scheme, then by all means do it, because you just bought another 5 years. (But be warned that if you think it takes 5 years to crack the scheme, it likely will take 5 months.)

Original 3DS Implementation

Pre70EncryptionFirst, let’s go over the crypto primitives we have on the 3DS. There is an on-chip hardware AES engine with 64 keyslots. When a key is written into a key-slot, it stays there until it’s either cleared or rewritten with another key. You can write a “normal key” into the key-slot and the AES operation works as usual (with the normal key being the key). The more interesting case is if you instead write two keys–KeyX and KeyY–into the slot. In that case, an on-chip key generator derives the “normal key.” However, the normal key is never revealed outside the AES engine! That means, even if we extract KeyX and KeyY (by dumping the code that sets them), we cannot find the normal key (… in theory, I’ll come back to this point later). There is also a hardware RSA engine that has its own key-slots and operates similarly (except there is no key generator).

Games are stored in a container format named NCCH (unnecessary details: game carts are in NCCH with an additional block level encryption on the cart. eShop games are in NCCH with an additional layer of encryption by a “title key” that is decrypted by the “common key” (key-slot 0x3D, KeyX set by boot ROM, KeyY set by Kernel9). However, these details are unnecessary because it does not change anything in the trust.) It is decrypted with a key generated from a KeyX set by the boot ROM and a KeyY set by Process9. Process9 is found in NATIVE_FIRM which is stored encrypted on the NAND with a console-unique IV. The key to decrypt NATIVE_FIRM is set by the boot ROM. The console-unique IV is derived from a (maybe not so) unique card ID found on the eMMC (NAND). This is to prevent a downgrade attack on a new console using the firmware extracted from an older, vulnerable, console.

7.x Implementation

Speaking of vulnerabilities–in 2013, the first 3DS hack came out (a simple buffer overflow in Process9). This set Nintendo into something of a panic. Hackers broke the trust hierarchy at the Kernel9 level, which means that they control everything underneath that. Even though the NCCH KeyX (0x2C) was in the boot ROM and therefore still safe, hackers can use the key generator as a black box to decrypt whatever games they like. In order to protect future games from being decrypted in this way, they came up with a rather ingenious plan

70Encryption

In the boot ROM, the RSA engine (slot 0x0) is initialized with a key so it can be used later in the boot process. In Kernel9, slot 0x0 is used with an RSA operation and the key-slot is overwritten with another key. Just like the AES engine, once a key-slot is overwritten, it is impossible to extract the previous key. Note that it is unlikely Nintendo purposely erased this key-slot for security measures. The real reason is likely that there’s only 4 RSA key-slots (versus 64 AES key-slots), so they needed to re-use slot 0x0 for other RSA operations. The ingenious method Nintendo devised was to use some unrelated data outputted from the RSA engine at slot 0x0 to derive the 7.x NCCH KeyY. Because all versions of the firmware (including the vulnerable ones) have wiped this slot by the time the exploit could be triggered and since the boot ROM is still secured, even if you run the algorithm to derive the 7.x NCCH KeyY, it won’t work since RSA slot 0x0 has already been wiped. (In the diagram, the old NCCH key is used in conjunction. This is for compatibility–if you put a 7.x game on an older console, the game icon and banner can still show up, but you must update the system before the game can run).

Ultimately, to get these keys you need to get code execution to work before Kernel9 is first initialized (and wipes the key-slot). arm9loaderhax (described later) found in 2015 does this, but from what I understand, a undisclosed method was used to retrieve these keys originally (in 2014). However, since a 9.2 exploit was found in early 2015 (I wrote a series on it back then), the black-box method to decrypt games worked again.

New 3DS Implementation

Near the end of 2014, Nintendo released the New 3DS–the first major hardware revision. This was also their chance to try to salvage what’s left of their cryptosystem. Sorry mobile viewers, the diagram only gets more complicated…

First, I want to introduce an element I’ve left off before because it did not affect the security then. The OTP (one time programmable) section is unique data found physically in the CPU (okay pedants, the “SoC”). Its primary use is to store a ECDSA private key for identifying the console to Nintendo’s servers (for eShop and services). It exists on the old 3DS as well but on the New 3DS, it has been integrated into the chain of trust.

The New 3DS also comes with a new key-store found in a sector of the eMMC (NAND). The data in the key-store is the same on each console, but the key-store itself is AES encrypted with a SHA256 hash of the OTP section as the key. That means that the encrypted key-store is console unique! The OTP region is disabled early in the boot process: Kernel9 on the old 3DS (but not really, more on this later…), and the new ARM9 Loader on the New 3DS. That means, if implemented correctly, it should not be possible to decrypt the key-store after boot–even if you exploit the system at a later point. (Key word: correctly. Even though the keys were wiped from the AES engine, they forgot to wipe it from the SHA engine used in deriving the keys!)

Using the new NAND key-store, the New 3DS derives keys for NCCH decryption of New 3DS-only titles. Remember, the old 3DS does not have this NAND key-store, so it would not be able to derive the new keys. It is also unlikely that there is any hope in the form of another lucky slot that’s cleared early in the boot process. That means, black-box decryption of any old 3DS supported games will always be possible. However, maybe they can still protect New 3DS only games…

The other addition is the ARM9 Loader. Previously, once NATIVE_FIRM is decrypted, the code for ARM9 and ARM11 runs on their respective processors. Nintendo now added an additional layer of encryption for the Kernel9/Process9 code and they added ARM9 Loader to derive the new keys and decrypt the new layer. I think the idea here is this: there are potentially 32 keys we can use in the NAND key-store. If someone hacks the latest ARM9 firmware and obtain the secrets there, we can always release an update that encrypts it with a new key. Since the ARM9 Loader is added to the hierarchy of trust, as long as A9L is safe (and it is a much smaller code-base than Kernel9/Process9), even if they hack Kernel9, they cannot find the new keys released in an update and must hack the firmware again to get the new secrets. This was perhaps a solution to the panic of 7.x when it was hard to re-secure secrets after the system was hacked once.

We first saw this new system put to test with firmware 9.6. Before then, Nintendo made the mistake of forgetting to clear a key-slot initialized by the OTP hash. Although the OTP hash was still safe, it was possible to derive many other New 3DS keys from access to that key-slot. This includes the KeyX/KeyY for the new encryption layer on the ARM9 code. But no fear! 9.6 encrypted the ARM9 code derived from a new key in the key-store and for almost a year people outside of a few inner circles cannot decrypt anything for the New 3DS.

Things Fall Apart

What I like about the arm9loaderhax (here-forth A9LH) is that it’s only possible because of all the added complexity in the cryptosystem. The original system I showed would not have been vulnerable. However, A9LH is inevitable because Nintendo must respond to each attack. It is just unfortunate that each hole they patch only reveals more leaks. I’m not going to go into details of how A9LH works, delebile did a wonderful writeup on it. Instead, I’ll try to answer why it works.

To do that, let’s work backwards. The finale is that because the decryption of the ARM9 code is unauthenticated, if you corrupt the key-store, a junk key will be derived and the encrypted ARM9 code will decrypt into junk data. Then the console jumps into junk data which, if interpreted as code, can branch into a controlled region of memory with our payload. Thanks to ARM’s RISC-y instruction set, a random branch instruction is easy to find.

But if we were to have performed this attack on firmware 9.5, it would have failed. In 9.5, the first 16 bytes of the decrypted key-store is used to derive key-slot 0x15. That key is first used to decrypt a control block at offset 0x40 of the encrypted ARM9 code. If the control block is all zeros, then slot 0x15 is used again to decrypt the actual binary at offset 0x800. Now on 9.6, the first 16 bytes of the decrypted key-store is still used to derive key-slot 0x15 and the next 16 bytes are used to derive key-slot 0x16. 0x15 is still used to check the control block but 0x16 is used to decrypt the binary! They forgot to check the validity of the second 16 bytes of the key-store! They either forgot to change to slot 0x16 for checking the control block, or for technical reasons, they had to use slot 0x15 there and just neglected to use a test vector with slot 0x16. Either way, the only reason this is possible at all is because Nintendo had to revoke 9.5 and use a new key.

But we would still be stuck if we didn’t have access to the decrypted keystore. We had to modify the second 16 bytes but keep the first 16 bytes intact. There were two things people exploited here to get the KeyX/KeyY for decrypting the key-store. First, as I touched upon already, is that Nintendo forgot to clear the SHA256 registers which was used to derive the keys. However, to exploit this, for various reasons beyond the scope of this article, you needed to perform a hardware attack. The other method was more insidious. Before firmware 3.0, Nintendo  forgot to lock the OTP after boot! That means if you flash any firmware < 3.0 on your 3DS (which involved some trickery for the New 3DS since it was not designed to run < 3.0), it would boot normally (as the code, of course, is signed). Then you can use any of the exploits found since then, take over the system, and dump the OTP. This means the New 3DS key-store can be decrypted and all new keys (even ones not currently used) can be derived. It’s not surprising that such a hole was overlooked because back then (three years ago), Nintendo did not expect the OTP to be used in the chain of trust. The irony is that the feature designed to bring more security was the one that completely broke it.

Key Generator

Since A9L is broken, any 3DS past, present, or future can be hacked on boot. That’s the result of A9LH. (Although for the future, a new exploit is needed to trigger a downgrade to obtain the OTP. However, I also think something that modifies the eMMC CID via hardware might also be possible. EDIT: It is not possible. NAND encryption uses a unique console key along with an unique IV derived from the CID.) So system integrity is gone. What about content protection? It’s hanging by its last thread. We can always use the 3DS as a blackbox for decrypting content. However, the goal for attackers is to get the “normal” keys and be able to decrypt content offline. The key generator is the defense for this. Remember that security by obscurity can only buy you so much time? It took about four years since the original release of the 3DS for hackers to break it.

If you haven’t watched the 32c3 presentation linked at the top of this post, I highly recommend you do so, as I’m not going to give the full details here. The gist of plutoo and yellows8’s ingenious crack was that they discovered through cryptoanalysis the algorithm of the key-generator was just some XORs and rotates. They did this because there were a couple of normal keys that were “leaked.” A couple of keys that we only know KeyX and KeyY for are found as normal keys on the WiiU (needed for communication with 3DS). Another key was accidentally included as a normal key in one firmware release, and then changed to KeyX KeyY in the next release!

That means that if we get KeyX and KeyY, we now have the normal keys. Unfortunately, there are still a bunch of shiny keys hidden in the boot ROM (which is also disabled like the OTP after use; no mistakes found yet).

Postmortem

So what went wrong here? I want to summarize by listing some of the big mistakes Nintendo made that hopefully won’t be made by anyone again

  • Focusing on content protection instead of system integrity: you cannot have one without the other. Always implement exploit mitigation when you can!
  • Not having a contingency plan for when the system is hacked: no matter how secure you think your system is, you need to have a plan for when it’s broken. That way you don’t end up scrambling around and introducing more bugs.
  • Too much complexity: having lots of blocks that say “AES” and “RSA” in your plan might impress the boss, but it just adds to the attack surface. Always go with the simplest plan that secures against your threat model.
  • Do not change the trust hierarchy after production! Everything is built on that hierarchy. Adding/removing from it will break assumptions that you might not even be aware of.

Sources

Some of the information here is from my own reverse engineering, but the bulk of it is from information found on 3dsbrew.org. Please let me know if there’s any mistakes or anything that doesn’t make sense.

Cosmo3DS: The CFW nobody wanted

In the last article, I talked about my plan for creating 3DS mods. Now, I will put that plan to the test with a CFW (modified firmware) that nobody wants except me.

The idea for this CFW is that I want

  • Keep my 3DS on the hackable 9.2 firmware but still use the latest system software (emuNAND)
  • Play games region free right from the home menu
  • Change the system region without possibly bricking the device
  • Use the eShop with region changed systems

and only those things. Specifically, I don’t want to patch signatures that allow for installing pirated content (personal choice, I don’t care what you do). I don’t want threads running in the background to support fancy features such as replacing version strings or taking screenshots or in-game menus. So I created my own CFW in two components: a stripped down version of ReiNAND where everything except emuNAND is removed and an implementation of my custom loader which does all the patches. No threads. No glut.

I understand that I’m the only one who needs such a CFW and most people can make do with one of the ones out there. This is mostly written just for my own use and for me to test out my idea of patching the system. However, I’m happy that my N3DS is now a cosmopolitan.

If you want to use it, you need the CFW as well as the custom loader. Then, the important part is injecting the custom loader into the right FIRM file. (If you search for “reinand 3.1 firmware.bin pastebin” you should be able to find it)

3DS Code Injection through “Loader”

I’ve seen many CFWs (custom firmware; actually they’re just modified firmwares) for the 3DS but there seems to be a lack of organization and design in most of them. I believe that without a proper framework for patching the system, writing mods for the 3DS is extremely difficult and usually requires an in depth knowledge of the system even to make simple modifications. So here I present a plan that I hope developers will pick up on and contribute to.

3DS System Firmware: An Introduction 

The 3DS has a microkernel architecture. This means most functionalities are implemented as modules (running as separate processes) and IPC is used by modules to talk to one another. The kernel has limited functionality (basic memory management, threading and synchronization, and access to IO memory) and these basic functions are exposed to modules (where the main work is done). All modules run in userland, but the security is not equal. There are three categories of modules: system, applets, and applications. (This is not an official categorization, as different modules have permission settings based on service-call access, access to other modules, memory space, and so on. However, you can basically group modules by these security properties into these three groups.) System modules do the backend work: installing content, TLS stack, communication with Nintendo’s servers, etc. Applets are system software the user “sees”: home menu, notifications, Nintendo ID login, keyboard, etc. Applications are the rest: settings app, camera app, games, etc.

The reason why it’s hard to write patches or mods for the Nintendo 3DS is because of this modularization of the system. When we take control of the kernel, patching system features is not as simple as finding the right address in memory and overwriting it. Because everything is in different processes (and each have their own virtual memory space), you either have to recognize the kernel layout and parse kernel objects and map the right memory addresses to patch (NTR does this) or you have to stupidly scan all of physical memory (slow and high risk of false positives) for the right thing to patch. There’s additional complications if you’re using a hack that reboots the firmware or requires patching a process that’s currently not running: you have to know when the process launches and modify the code when it does. It is possible to insert hooks when you first gain control in order to run code when processes are launched. However, most of the CFWs out there does something insanely non-practical: it busy-loops in a separate thread and scans all of physical memory for the right thing to patch… ad infinitum.

So we want to do the following

  • Patch code for any module (system, applet, or application) we want easily
  • Ideally we can insert the hook early in the boot process so we can patch most modules BEFORE they start (an example would be to change the region of the system: once the CFG module is loaded, every other module gets the region information from CFG; so we can either patch CFG or patch every other module that uses it)
  • Everything should be fast and use minimal resources (linear scan of physical memory is NOT okay)
  • Ideally the code should not be overly dependent on offsets–you should not have to compile the code separately for each firmware version

3ds-bootLuckily there is a place in the boot-chain we can patch that does just that. The “loader” module is in charge of parsing the CXI executable format and loading the code into memory. That means that if we change “loader” to patch the code after it is read from the CXI file (and decompressed) but before it signals to the kernel to run (schedule) it, we can patch most modules in the system. The kernel bootstraps the five main modules: fs (file IO), loader (CXI loader), pm (process manager), sm (IPC), and pxi (talks to security processor) and then pm uses loader to start the rest of the modules. Other modules can start processes using pm, which in turn uses loader.

So loader is a good place to start if we want flexibility in patching the system and it fits the bill for what we require in loading patches, but there are some complications. First, the loader module does not have permission to access the SD card (where we want to use to store patches). Second, and more importantly, the code to read and parse patches from the SD card and run them when a process is being loaded seems like a good amount of code. If we wish to insert it into the loader module, we might have to move stuff around–a big pain with object code. So we need a better plan than just hijacking “loader.”

[2016/01/19 04:15:01] <shinyquag> You could probably code your own “loader”, I’d wager it’s probably doable

Shinyquag would have won the wager because, luckily, loader is also the smallest module on the system. It has a couple of commands and no complicated logic. It’s small enough (20KB) to completely reverse in a day. Once we rewrite the module, adding functionality is a trivial matter. In fact, since Nintendo uses a lot of boilerplate code so our loader implementation would actually be even more lean. Finally, since this module has not changed much since 2.x, we can likely use it on future firmwares without much changes. However, just because it is doable, does not mean it will be easy…

Debugging without a Debugger

First I reversed the module and rewrote it in C. Then it got difficult. Since this was the first time anyone tried to build a sysmodule with the homebrew development environment, I had to make changes to many parts of the toolchain. Then there’s the problem of testing: we’re replacing a key component of the system very early in the boot process. There’s no iterative development since we cannot isolate the module without ending up rewriting the entire operating system. So, I just wrote the entire module in one sitting and hoped for the best. Unfortunately the best is a black screen and we don’t have access to anything like gdb or even “printf”. The only output we have is that the console boots or it does not boot. After re-reading my code over and over again and fixing bugs each time until I can’t spot anything else obvious, I’ve discovered, thanks to #3dsdev, that prayer was not the only means of debugging this.

Enter XDS, the 3DS emulator we all forgot about. ichfly managed to get it to boot the 3DS home menu a while ago, but development has since ceased (likely because of other more promising emulators in the works). There’s one thing XDS does that the other 3DS emulators does not do is that XDS fully emulates the 5 sysmodules bootstrapped by the kernel (the other guys use HLE). This is perfect because we only need to run “loader” in the boot process and let it interact with the other bootstrapped modules until enough modules are loaded by “loader” that we can consider it working. After porting XDS to OSX, I was able to make use of the generous debug logs (comparing the IPC requests/responses from my loader with stock) to find the final bugs in my loader and get it to boot on the 3DS.

Our loader can now correctly duplicate the functionality of the original but ultimately, we wish to add more features. The most important one is SD card reading. The way that access permissions work is that each module specifies in its CXI header what filesystem devices it has access to. However, in the case of bootstrapped modules, the access fields are ignored. So in order to get “loader” the right permissions to read the SD card, it is not enough to modify the CXI headers. We need to talk with the “fs” module and request the permissions directly (this is what “pm” does with other modules after loader puts the code into memory). This involves reversing enough of the “fs” module to figure out how the permissions are stored and then making the right requests to “fs” to give “loader” those permissions. Since XDS does not emulate the SD card, I had to find a different avenue to test my code. What I decided to do was modify the URL to the Themes Store in the Home Menu to point to my server + the return value of the file IO calls. Then I would load up the Themes Store to find the return value and look through my reversed object code to see what caused the error. This worked well because I did not want to add more code (e.g: framebuffer or networking) just for debugging. (Who debugs the debugger?) With SD card access, future debugging is much easier because we can just write logs to the SD card.

Loader Based CFWs

I believe that using a custom “loader” will make it much easier to write mods for the 3DS. We could see hacks such as a “Homebrew” button in the Home Menu or custom keyboards or custom themes outside of what Nintendo officially supports. We might also see hacks for games similar to HANS but without requiring access to a dump of the game. I hope 3DS developers will pick up on this and make cool mods and hacks for the system.

The code can be found here. Developers should modify patcher.c and replace it with their own implementation.

CGEN for IDA Pro

It all started when I wanted to analyze some MeP code. Usually, I do all my disassembly in IDA Pro, but this is one of the few processors that isn’t supported by IDA. Luckily, there is objdump for this obscure architecture. After fumbling around for a bit, I was convinced that porting the disassembler to IDA would be a much better use of my time than manually drawing arrows and annotating the objdump output.

Process

Turns out, there isn’t too many resources online on writing IDA processor modules. The readme of the SDK was minimal (it tells you to read the sample code and the header files) and refers to two documents: an online guide that is now gone and The IDA Pro Book by Chris Eagle. Opening the book to the chapter on writing processor modules, you’re greeted with a dire warning to turn back (along with a note about the lack of documentation) as many have tried and failed.

One of the reasons why writing a processor module is so challenging is that the processor_t struct contains 56 fields that must be initialized, and 26 of those fields are function pointers, while 1 of the fields is a pointer to an array of one or more struct pointers that each point to a different type of struct (asm_t) that contains 59 fields requiring initialization. Easy enough, right?

Chris Eagle, The IDA Book 2nd Edition

Well, I’m not easily discouraged, so I read on and familiarized myself with the process of creating a processor module. I’m not going to describe the process here in detail because Chris does a great job in the book, but I’ll give a brief outline

IDA Processor Module

There are four components of a processor module. The “analyzer” parses the raw bits of the machine code and generates the information about an instruction. The “emulator” uses that information to help IDA with further analysis. For example, if an instruction references data, your module can tell IDA to look for data at that address. If the instruction performs a function call, your module can tell IDA to create a function there. Contrary to its name, it does not actually “emulate” the instruction set. The “outputter” does just that: given the data generated from the analyzer, it prints out the disassembly to the user. Finally, there’s the architectural information, which is not a component mentioned elsewhere but I consider it one. This is not code, but static structures that tell IDA information such as the names of the registers, the mnemonic of the instructions, alignments and so on.

CGEN

The binutils (objdump) for MeP are machine generated by CGEN. CGEN attempts to abstract away the task of writing CPU tools (assemblers, disassemblers, simulators, etc) into writing CPU definitions. The definitions describe the CPU (including the hardware elements, the instruction sets, operands, etc) with the Scheme language. CGEN takes the definition and outputs C/C++ code for all the needed CPU tools. Originally I wanted to avoid CGEN and just wrap the (generated) binutils code to an IDA module (à la Hexagon). In theory, your module does not have to follow the convention laid out above. You can make the analyzer record the raw bits, the emulator do nothing, and the outputter use binutils to generate a complete line and print it out. However, in doing so, you essentially lose most of the power of IDA (finding xrefs, stack variables, etc). It would also be a shame to not use all the information given to us by the CGEN CPU definitions. These definitions (in theory) are strong enough to generate RTL code to implement the processor, so we would like to give as much of this information to IDA as possible.

CGEN Generators

The generators themselves (CGEN docs refer to them as “applications”) are also written in Scheme, in the Guile dialect. I have never written a line of functional code before, so it took me a day to understand the relatively small codebase. CGEN has its own object system that they call COS. Everything defined in the CPU descriptions becomes objects, and each generator gives these objects methods to print themselves out. For example: the simulator might give the operand object a “generate code to get value” method. Then a call to generate the semantics of an instruction into C code would use these objects’ methods. Like a true software engineer, I cut out functions from the generators for the simulator, disassembler, and architecture description and stitched them together with my own code to generate various components of an IDA module.

The analyzer used, as its base, the simulator instruction decode generator. I had to modify CGEN to record the order of operands as specified by the instruction syntax (the only modification to CGEN itself, everything else are additions). Then, I overwrote the simulator’s method definitions for extracting the operands from the instruction with code to populate the “cmd” structure in IDA (which requires the operands to be ordered).

The emulator used the simulator model generator as its base and was the most difficult to write (in terms of code complexity). The one major issue here is that while the generated simulator expects code to run in order and have state information stored, the IDA emulator does not store state information and IDA does not guarantee that your emulator will be called in the order that the instructions appear. That means we cannot make any assumptions about the state and our emulator can only make decisions based on the instruction alone. Because we only care about finding data and code references with the emulator, we can make the following simplifications:

  • Any conditionals will have the condition stripped out and all paths will be taken
  • Using any values from registers will stop the emulator and return immediately
  • Setting any values to registers will evaluate the value but discard the result

The first point allows for xrefs to be found regardless of the condition. A conditional branch, for example, will allow for a code xref to be generated. The second point is there because we do not know the state, so any dependencies on a register value that’s not already stripped out will make finding a xref difficult. In theory, we can still find offset xrefs this way, but we would have to know that only additions/subtractions are used and only a single register is used and that adds a lot of complexity. The third point allows memory reads to be captured. With those simplifications in place, we know that any memory reads/writes and any PC reads/writes that are found without knowing the state can be turned into xrefs.

The outputter used the syntax parser (binutils’ opcode builder) as its base. It reads the instruction definition in order to output the right orderings of parenthesis, commas, and so on. I just replaced the generate print methods of the hardware objects to generate calls to IDA output functions.

Results

MeP executable loaded and recognized by IDA Pro

MeP executable loaded and recognized by IDA Pro. All the blue is a result of the auto-analysis.

At the most basic level, the generated modules outputs what you would expect from objdump. The analyzer will find the right type for the operands (if possible). The emulator tries to find all constant addresses and adds xrefs to them (code and data). The outputter will print all instructions correctly, and the operands with the right type/size/name if needed.

The main thing it doesn’t do right now is keeping track of the stack pointer. It also does not identify if branches are jumps or calls (requiring CF_CALL flag). It does not identify if an instruction does not continue flow (requiring the CF_STOP flag) (it’s actually trivial to add this, but harder to add it without introducing emulator code to other generators. Since it’s easy to identify the instructions by hand, I decided to leave it out).

Usage

Once you generate the IDA module components, you still need to manually write the processor_t structure, the notify() function (optional), and implement and special print functions (as defined by the CPU definition). Then you can copy the CGEN headers from binutils and compile it with the IDA SDK. Take a look at the MeP module as an example. You can reuse most of the non-generated code as-is (just change some strings and constants). If you run into any issues, feel free to contact me. I haven’t tested this on anything other than MeP because of laziness but I hope the code works more generally.

Downloads

The CGEN code is here and the Toshiba MeP module is here. The MeP module has basic stack tracking and call recognition added manually. When I have the time, I’ll port the rest of the CGEN supported modules that IDA does not support over.

On the future of Rejuvenate

Since, the announcement ten days ago, Rejuvenate received tons of positive reception and thousands of downloads. Progress on both SDK projects is moving at fast speeds. There are already Vita homebrew projects in the works. No doubt, there are more to come. However, Sony’s response has not been positive. Yesterday, Sony released firmware 3.52 which revokes access to PSM DevAssistant and PSM Unity DevAssistant along with a friendly request for PSM developers to delete the DevAssistant from their devices. This means that if you ever want to run homebrew on your Vita, regardless of your opinions on the current limitations and regardless of your ability to use PSM, do NOT update to 3.52.

CHIP-8 emulator by @xerpi, picture by @Chihab_rm

CHIP-8 emulator by @xerpi, picture by @Chihab_rm

Continue reading

Hacking the PS Vita

The following was taken from a series of unpublished posts I wrote back in September 2012 (almost three years ago). The posts not only detail the exploit I found but also the thought process that led me to it. I intended to publish it as soon as the exploit was patched by Sony or after someone found another exploit on the system by examining the memory dumps. However, as of today, the PSM privilege escalation is still the only known way to execute native ARM code on the PS Vita. Apologizes for the outdated references.

Pathfinding

To start, lets brainstorm the different ways we can attack this black box of a device. Typically, a new device is unlocked in a process that usually involves: 1) dumping the device’s RAM/ROM/NAND, 2) analyzing the dumps for information and vulnerabilities, 3) using the vulnerability to create a tool that allows others to easily gain root access.
Continue reading

Rejuvenate Public Beta Release

Rejuvenate, announced last week allows users to install unofficial applications and games (homebrew) onto their PS Vita device. Please read that announcement post for more information. Today, the public beta is ready for testing.

The beta is only for those who were able to obtain a publisher’s license (whose application was approved by Sony before the deadline on May 31). For the rest of you who do not have the publisher license (and no friends with a publisher license) but only the DevAssist app on your Vita, please wait for further instructions to come.

Continue reading

How To Register and Download for PSM (Shutdown Bypass)

Update: It seems that Sony has closed this loophole. However, if you own a PS3, there is another way.

Did you miss the call to register for PSM? People found out today that if you did not register for PSM (not the publisher’s license that requires manual approval, but the general developer registration), they cannot download the Developer Assistant app on their Vita in order to run Rejuvenate. Not to worry, I have discovered a workaround. But act quickly because Sony will patch this loophole soon and possibly also remove the Developer Assistant from PSN, so you should download it now.
Continue reading

Credits to egarrote from elotrolado.net for the logo!

Rejuvenate: Native homebrew for PSVita

(Sadly, they did not give me a spot at the Sony E3 conference, so I have to make do with this blog post.) I am excited to announce Rejuvenate, a native homebrew platform for PS Vita. The tools that will be released through the next couple of weeks will allow developers (not in contract with Sony) to develop and test games, apps, and more on the PS Vita. These unofficial software can run on any PS Vita handheld device without approval by Sony. These tools cannot enable pirated or backup games to run (I’m not just saying this… the exploits used does not have enough privilege to enable such tasks). Rejuvenate requires PlayStation Mobile Development Assistant to be installed on your Vita! Sony will remove this from PSN soon, so if you wish to ever run homebrew apps on your PS Vita, you must download this app now!

Continue reading