J-Link Detective Squad: Dead KORG microKEY-37
08 November 2015
I love the KORG microKEY 37. It's an excellent entry-level MIDI controller that fits unobtrusively on your desk, great for impromptu jamming when you should be focused on something boring like "tax" or "finding a new house before eviction day". The keys feel pretty soft, as they use rubber domes instead of metal springs, but the velocity response is excellent. I highly recommend it, as it's one of those products which basically lives forever and delivers year after year of good service.
I bring this up because back in about February, my KORG microKEY 37 stopped turning on. The unit would no longer connect via USB; each time it would give up at different points during the initial handshake, with dmesg spewing a number of unhappy messages. Here's one attempt at plugging the device in.
[Sun Sep 27 06:00:56 2015] usb 3-1: new full-speed USB device number 17 using xhci_hcd
[Sun Sep 27 06:00:56 2015] usb 3-1: device descriptor read/64, error -71
[Sun Sep 27 06:00:57 2015] xhci_hcd 0000:00:14.0: Setup ERROR: setup context command for slot 16.
[Sun Sep 27 06:00:57 2015] usb 3-1: hub failed to enable device, error -22
[Sun Sep 27 06:00:57 2015] usb 3-1: new full-speed USB device number 18 using xhci_hcd
[Sun Sep 27 06:00:57 2015] usb 3-1: device descriptor read/64, error -71
[Sun Sep 27 06:00:57 2015] xhci_hcd 0000:00:14.0: Setup ERROR: setup context command for slot 17.
[Sun Sep 27 06:00:57 2015] usb 3-1: hub failed to enable device, error -22
[Sun Sep 27 06:00:57 2015] usb 3-1: new high-speed USB device number 19 using xhci_hcd
[Sun Sep 27 06:00:57 2015] usb 3-1: Device not responding to setup address.
[Sun Sep 27 06:00:57 2015] usb 3-1: Device not responding to setup address.
[Sun Sep 27 06:00:58 2015] usb 3-1: device not accepting address 19, error -71
[Sun Sep 27 06:00:58 2015] usb 3-1: new full-speed USB device number 20 using xhci_hcd
[Sun Sep 27 06:00:58 2015] usb 3-1: Device not responding to setup address.
[Sun Sep 27 06:00:58 2015] usb 3-1: Device not responding to setup address.
[Sun Sep 27 06:00:58 2015] usb 3-1: device not accepting address 20, error -71
[Sun Sep 27 06:00:58 2015] usb usb3-port1: unable to enumerate USB device
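The negative numbers in those messages are Linux kernel errno values: on Linux, 71 is EPROTO (protocol error) and 22 is EINVAL (invalid argument). As a rough aid for reading logs like this, here's a small Python sketch that pulls the codes out of dmesg-style lines and tallies them. The lookup table is hand-written and only covers the two codes seen above, and the function name and regex are my own inventions rather than part of any tool.

```python
import re
from collections import Counter

# Linux errno values for the two codes seen in the log above.
# Hand-written table (an assumption of this sketch), not a complete list.
LINUX_ERRNO = {
    22: "EINVAL (invalid argument)",
    71: "EPROTO (protocol error)",
}

ERROR_RE = re.compile(r"error (-\d+)")

def summarise_errors(lines):
    """Count each 'error -N' code found in dmesg-style lines."""
    counts = Counter()
    for line in lines:
        match = ERROR_RE.search(line)
        if match:
            code = -int(match.group(1))  # "-71" -> 71
            counts[LINUX_ERRNO.get(code, f"errno {code}")] += 1
    return counts

# Three lines lifted from the log above, timestamps trimmed.
sample = [
    "usb 3-1: device descriptor read/64, error -71",
    "usb 3-1: hub failed to enable device, error -22",
    "usb 3-1: device not accepting address 19, error -71",
]
for name, count in summarise_errors(sample).items():
    print(f"{count} x {name}")
```

A pile of EPROTO errors this early in enumeration generally points at the electrical or low-level signalling side of USB rather than anything the host driver is doing wrong.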
Oh snap! I'm torn on what to do... buying a replacement is doable, but this one was limited edition and a replacement wouldn't be the same cool colour scheme! Maybe there's a clue to what went wrong inside the unit?
Here's the view of the main KLM-3130-B board. As you can see, there's nothing obviously charred or melted. I tried following the golden rule of electronics repair (check for busted capacitors), but all of the capacitors in this are tiny ceramic surface-mount ones. Which hopefully aren't going to fail because they don't have liquid in them.
The big square chip I identified as a Fujitsu (now Cypress) MB9A310A series ARM Cortex M3 processor (MB9AF312L), and the smaller rectangular chip is a GL850G USB hub controller (as this keyboard has a two port expansion USB hub). Checking the datasheet for the CPU, and employing my patented multimeter stabbing technique on the teeny tiny chip legs, I'm sure the CPU is getting +5V at the Vcc (a.k.a. supply voltage) pins and ground at the GND pins. The two clock crystals I believe are 12MHz (X1) and 4MHz (X2). There. That's my electronics knowledge exhausted.
But wait! What are these intriguing pads there on the board marked "For Debug"? SWO? SWDIO? You mean, there's a chance that this board could be fixed in the software realm?
Ah well, in for a penny... after checking in with a Subject Matter Expert I purchased a Segger J-Link debug probe for $89. I then went ahead and hacked open a sturdy quadruple-shielded USB type-B cable (TIP: if you need to strip a USB cable go for the ultra-cheapass thin ones; the only "shielding" they have is a few microns of aluminium foil. Also cutting it ensures it will never be used again), soldered some hookup wire to the pads, grafted some prototype wires on the ends, and plugged everything into a breadboard.
For Serial Wire Debug there are four pins on the J-Link which correspond to our pads (SWCLK, SWDIO, SWO, RESET), plus VTref (spliced off our supply voltage, a.k.a. the +5V pin of the keyboard's USB in) and GND (spliced off the ground pin of USB in). D- and D+ of the USB cable are passed through without any splicing. To keep things slightly less failure prone I'm running the J-Link off my desktop's USB controller, and the keyboard's spliced cable will be attached to my laptop.
So... it was at this point that I thought about reading the J-Link manual. According to the Getting Started section, I should switch on the J-Link, then the keyboard, and it should identify the hardware. Hah. As if it'd be that easy.
~/Tools/JLink_Linux_V502_x86_64$ ./JLinkExe
SEGGER J-Link Commander V5.02 ('?' for help)
Compiled Aug 28 2015 19:48:33
DLL version V5.02, compiled Aug 28 2015 19:48:30
Firmware: J-Link V9 compiled Aug 28 2015 17:49:48
Hardware: V9.30
S/N: 269304498
OEM: SEGGER-EDU
Feature(s): FlashBP, GDB
Emulator has Trace capability
VTarget = 4.950V
Info: TotalIRLen = ?, IRPrint = 0x..FFFFFFFFFFFFFFFFFFFFFFF1
Info: TotalIRLen = ?, IRPrint = 0x..FFFFFFFFFFFFFFFFFFFFFFF1
No devices found on JTAG chain. Trying to find device on SWD.
Info: Found SWD-DP with ID 0x2BA01477
Info: Found Cortex-M3 r2p1, Little endian.
Info: FPUnit: 6 code (BP) slots and 2 literal slots
Info: CoreSight components:
Info: ROMTbl 0 @ E00FF000
Info: ROMTbl 0 : FFF0F000, CID: B105E00D, PID: 000BB000 SCS
Info: ROMTbl 0 : FFF02000, CID: B105E00D, PID: 003BB002 DWT
Info: ROMTbl 0 : FFF03000, CID: B105E00D, PID: 002BB003 FPB
Info: ROMTbl 0 : FFF01000, CID: B105E00D, PID: 003BB001 ITM
Info: ROMTbl 0 : FFF41000, CID: B105900D, PID: 003BB923 TPIU-Lite
Info: ROMTbl 0 : FFF42000, CID: B105900D, PID: 003BB924 ETM-M3
Cortex-M3 identified.
Target interface speed: 100 kHz
J-Link>
AWW YISS. The chip is alive! Let's read through the command list and see what things we can do. Oh look! CPU registers! They contain things that are important. Can we see those?
J-Link>regs
CPU is not halted !
J-Link>halt
PC = 00009C56, CycleCnt = E3B49CD7
R0 = 00000000, R1 = 00000001, R2 = 1FFFE444, R3 = 1FFFEF04
R4 = 1FFFEF04, R5 = 1FFFE306, R6 = 00000000, R7 = 00000000
R8 = 00000000, R9 = 00000000, R10= 00000000, R11= 00000000
R12= 4266609C
SP(R13)= 20001FD0, MSP= 20001FD0, PSP= 00000000, R14(LR) = 00009E29
XPSR = 81000000: APSR = Nzcvq, EPSR = 01000000, IPSR = 000 (NoException)
CFBP = 00000000, CONTROL = 00, FAULTMASK = 00, BASEPRI = 00, PRIMASK = 00
Yep. Those are certainly ARM registers. Piece of piss, this electronics lark.
Okay. What next? We don't have the source code of what's running on this M3, so we want to whip out a debugger. But first, what about the stuff stored on the flash memory? It'd be a good idea to take a copy of that to safeguard against us screwing up. More importantly, where IS the flash memory? The Cortex M3 is a 32-bit microcontroller, so there's 4GB of address space!
Back to the CPU datasheet, there's a memory map for the chip on page 55; of course, only a small amount of the 4GB address space is mapped to anything useful. We can see that the chip's princely 128KB of flash (a.k.a. stored code) is mapped to the range 0x00000000-0x00020000, and the generous 16KB of SRAM can be found at 0x1FFFE000-0x20002000. Notice how the values stored in several of those registers above are within the SRAM range? I think we can say with confidence that these are pointers to stuff in memory!
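To save squinting at hex, here's a minimal Python sketch that sorts the register values from the halt output into the two ranges quoted from the datasheet. The `classify` helper and the `REGIONS` table are made up for illustration, and the end addresses are treated as exclusive.

```python
# Address ranges as quoted from the datasheet's memory map in the text above.
# End addresses are treated as exclusive.
REGIONS = [
    ("flash", 0x00000000, 0x00020000),
    ("SRAM", 0x1FFFE000, 0x20002000),
]

def classify(address):
    """Return the name of the region containing address, or 'other'."""
    for name, start, end in REGIONS:
        if start <= address < end:
            return name
    return "other"

# Register values copied from the first 'halt' output.
registers = {
    "PC": 0x00009C56,
    "R2": 0x1FFFE444,
    "R3": 0x1FFFEF04,
    "R5": 0x1FFFE306,
    "R12": 0x4266609C,
    "SP": 0x20001FD0,
    "LR": 0x00009E29,
}

for reg, value in registers.items():
    print(f"{reg:>3} = 0x{value:08X} -> {classify(value)}")
```

PC and LR land in flash (code), R2, R3, R5 and SP land in SRAM (data and stack), and R12 falls outside both ranges.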
But what's the deal with R12? According to our memory map that address sits in somewhere called the "Bit band alias". The ARM manual for the M3 describes this feature pretty well; this is basically another way of reading and writing to memory, except every bit is given a full processor word. The Cortex M3 has two of these aliases: one at 0x22000000 covering the first 1MB of the SRAM region, and one at 0x42000000 covering the first 1MB of the peripheral region, which is where R12 is pointing. Meaning each 1MB range gets stretched out to 1MB*(8 bits in a byte)*(4 bytes in a word) = 32MB! This may sound like the definition of wastefulness, but it means you can do bit-level operations with one instruction. As in, normally if you wanted to set bit 3 in a byte of memory to 1 you'd have to do something like [load byte into register] + [OR register with 0x08] + [write register to byte] = 3 instructions; with a bit band alias you can just write 1 to the address that corresponds to the bit. Snazzy!
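The alias arithmetic is simple enough to sketch in Python. This assumes the standard Cortex-M3 layout: a 1MB bit-band region at 0x20000000 (SRAM) aliased at 0x22000000, and another at 0x40000000 (peripherals) aliased at 0x42000000. The function and table names are mine.

```python
# Cortex-M3 bit-band regions: (name, region base, alias base).
# Each region is 1MB and its alias is 32MB.
BIT_BAND = [
    ("SRAM", 0x20000000, 0x22000000),
    ("peripheral", 0x40000000, 0x42000000),
]
ALIAS_SIZE = 0x02000000  # 32MB

def decode_alias(alias_address):
    """Map a bit-band alias address to (region, byte address, bit number)."""
    for name, region_base, alias_base in BIT_BAND:
        offset = alias_address - alias_base
        if 0 <= offset < ALIAS_SIZE:
            byte_address = region_base + offset // 32  # 32 alias bytes per real byte
            bit = (offset % 32) // 4                   # one 4-byte word per bit
            return name, byte_address, bit
    raise ValueError(f"0x{alias_address:08X} is not a bit-band alias address")

def encode_alias(byte_address, bit):
    """Inverse of decode_alias: find the alias word for one bit of one byte."""
    for _name, region_base, alias_base in BIT_BAND:
        offset = byte_address - region_base
        if 0 <= offset < ALIAS_SIZE // 32:
            return alias_base + offset * 32 + bit * 4
    raise ValueError(f"0x{byte_address:08X} is not in a bit-band region")

region, address, bit = decode_alias(0x4266609C)  # R12 from the halt output
print(f"{region}: bit {bit} of byte 0x{address:08X}")
```

Feeding it the R12 value from the halt output gives bit 7 of the byte at 0x40033304, which sits in the peripheral range at 0x40000000 rather than in SRAM.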
Okay, enough fun registers talk, let's get ourselves a dump of the flash memory range from our datasheet.
J-Link>savebin microkey.bin, 0x00000000, 0x20000
Opening binary file for writing... [microkey.bin]
Reading 131072 bytes from addr 0x00000000 into file...O.K.
Poking around inside the file with a hex editor, there's a definite rhythmic quality to the bytes, interspersed with large blocks of nulls and the occasional plaintext referencing KORG. Code!
At this point we hand it over to the Reverse Engineer's Friend™, IDA Pro. In this case, I had to make sure that the architecture was set to ARMv7-M with Thumb-2 (the variant used by the Cortex M3). The code doesn't start immediately from address 0x00000000: instead there's the interrupt vector table, which tells the Cortex M3 which code address to jump to after an interrupt event, such as Reset or a Hard Fault. We're interested in the Reset vector (address 0x00000004), because it points to where code starts executing when the CPU is switched on; in this case the value stored was 0x0000011b (a.k.a. address 0x0000011a + 1, the extra 1 being the flag bit that marks the target as Thumb code). IDA needed a hint for where to start disassembling; visiting address 0x0000011a and converting to code was enough to start the auto disassembly process.
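Finding that entry point by hand gets old quickly, so here's a small Python sketch of the same lookup. On a Cortex-M the first word of the vector table is the initial stack pointer and the second is the Reset vector, both little-endian. The `read_vectors` name is made up, and the stack pointer value in the stand-in data is a placeholder rather than something read from the real dump.

```python
import struct

def read_vectors(image):
    """Return (initial SP, reset handler address, Thumb bit set?) from a Cortex-M image."""
    initial_sp, reset_vector = struct.unpack_from("<II", image, 0)
    is_thumb = bool(reset_vector & 1)       # bit 0 flags Thumb code
    return initial_sp, reset_vector & ~1, is_thumb

# Stand-in for the first 8 bytes of the dump. The reset vector 0x0000011B is
# the value quoted above; the stack pointer value is a made-up placeholder.
image = struct.pack("<II", 0x20002000, 0x0000011B)

sp, entry, thumb = read_vectors(image)
print(f"initial SP = 0x{sp:08X}, reset handler = 0x{entry:08X}, thumb = {thumb}")
```

To run it against the real thing, swap the stand-in for `open("microkey.bin", "rb").read()`.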
The first few subroutines after our start point are probing around the peripheral map at 0x40000000, but what does any of it map to? The memory map in that datasheet was a start, but pretty thin on detail. I found a page containing a copy of the C header file for the chip (mb9a310l.h, for those playing at home), which has a more detailed listing of all of the peripheral memory addresses. (Lesson learned: more detail doesn't help) This crap shouldn't be so hard, all we want is a rough idea what addresses match output from the keys and wheels. Let's take advantage of some street-grade debugging!
Right, step one: halt the CPU and get a dump of the full peripheral range named in the datasheet.
J-Link>halt
PC = 000090FA, CycleCnt = 5314F092
R0 = 00000002, R1 = 00000008, R2 = 00000008, R3 = E000E100
R4 = 000000FF, R5 = 1FFFE306, R6 = 00000000, R7 = 00000000
R8 = 00000000, R9 = 00000000, R10= 00000000, R11= 00000000
R12= 4266609C
SP(R13)= 20001FC0, MSP= 20001FC0, PSP= 00000000, R14(LR) = 00009E0F
XPSR = 21000000: APSR = nzCvq, EPSR = 01000000, IPSR = 000 (NoException)
CFBP = 00000000, CONTROL = 00, FAULTMASK = 00, BASEPRI = 00, PRIMASK = 00
J-Link>savebin per_clean.bin 0x40000000 0x61000
Opening binary file for writing... [per_clean.bin]
Reading 397312 bytes from addr 0x40000000 into file...Could not read memory.
Whuuuuuuuut. You can't talk to me like that, I'm the debugger! I pay your salary!
Fine. Okay, clearly "reserved" is EE speak for "get off my land", so let's try just the GPIO block. Lots of buttons sounds like General-Purpose Input/Output to me.
J-Link>savebin per_gpio.bin 0x40033000 0x1000
Opening binary file for writing... [per_gpio.bin]
Reading 4096 bytes from addr 0x40033000 into file...O.K.
$ hexdump per_gpio.bin
0000000 001f 0000 0300 0000 0000 0000 0000 0000
0000010 0000 0000 0000 0000 0000 0000 0000 0000
0000020 0003 0000 0000 0000 0000 0000 0000 0000
0000030 0000 0000 0000 0000 000c 0000 0000 0000
0000040 0000 0000 0000 0000 0000 0000 0000 0000
*
0000100 001f 0000 0000 0000 0000 0000 020f 0000
0000110 00c0 0000 0007 0000 0000 0000 0000 0000
0000120 0000 0000 0000 0000 0000 0000 0000 0000
*
0000200 9000 0000 00ff 0000 000a 0000 fc00 0000
0000210 7e00 0000 0000 0000 0006 0000 0000 0000
0000220 0000 0000 0000 0000 0000 0000 0000 0000
*
0000300 0407 0000 0000 0000 0000 0000 320f 0000
0000310 00c0 0000 0007 0000 0001 0000 0000 0000
0000320 0000 0000 0000 0000 0000 0000 0000 0000
*
0000400 0000 0000 0000 0000 0000 0000 3000 0000
0000410 0000 0000 0000 0000 0000 0000 0000 0000
*
0000500 ff00 ffff 0000 0000 0000 0000 0000 0000
0000510 0000 0000 0000 0000 0000 0000 0000 0000
*
0000580 0014 0000 0000 0000 0000 0000 0000 0000
0000590 0000 0000 0000 0000 0000 0000 0000 0000
*
0000600 0000 0003 0000 0000 0000 0000 0000 0000
0000610 0000 0000 0000 0000 0000 0000 0000 0000
*
0000800 0000 0000 000d 0000 0000 0000 0000 0000
0000810 0000 0000 0000 0000 0000 0000 0000 0000
*
0001000
Cracking it open in a hex editor, it's mostly 0s but a few 1s! Pulling a second dump shows it to be identical to the first.
Now! The next step is to pull another dump of the GPIO range, but this time whilst holding down C3 on the keyboard. Then, compare it with our first clean dump.
$ hexdump per_gpio.bin > per_gpio.hex
$ hexdump per_gpioC3.bin > per_gpioC3.hex
$ diff per_gpio.hex per_gpioC3.hex
16c16
< 0000310 00c0 0000 0007 0000 0001 0000 0000 0000
---
> 0000310 00c0 0000 0004 0000 0001 0000 0000 0000
Well whoda thunk it. hexdump prints 16-bit little-endian words by default, so that 0007 is the byte at offset 0x314, and we can see that bits 0 and 1 of it are zeroed when we hold down C3. How about C#3?
$ diff per_gpio.hex per_gpioCsh3.hex
15,16c15,16
< 0000300 0407 0000 0000 0000 0000 0000 320f 0000
< 0000310 00c0 0000 0007 0000 0001 0000 0000 0000
---
> 0000300 0407 0000 0000 0000 0000 0000 320e 0000
> 0000310 00c0 0000 0003 0000 0001 0000 0000 0000
Interesting. Bit 2 of the byte at 0x314 and bit 0 of the byte at 0x30c got zeroed that time. Strangely, this only worked for keys C3 to D#3, and the two octave buttons. Where's the rest of the keys and the two wheels? Or the key velocity? Given that the wheels are just potentiometers wired to the main board, I have a good feeling that they'd use the M3's built in analog-to-digital converter, but at a guess I'd say the block of code responsible for activating the ADC sampling wasn't being hit, so no dice.
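Diffing hexdump text works, but hexdump's default output groups bytes into 16-bit words in host byte order, which makes it easy to land one byte off when reading offsets. A byte-wise comparison gives the offsets and bits directly. Here's a rough Python sketch; the `changed_bits` helper is my own, and the sample buffers are rebuilt by hand from the two hexdump lines in the C#3 diff rather than loaded from the real dump files.

```python
def changed_bits(before, after):
    """Yield (offset, old byte, new byte, cleared bits, set bits) for each differing byte."""
    for offset, (old, new) in enumerate(zip(before, after)):
        if old != new:
            cleared = [bit for bit in range(8) if ((old & ~new) >> bit) & 1]
            newly_set = [bit for bit in range(8) if ((new & ~old) >> bit) & 1]
            yield offset, old, new, cleared, newly_set

# Rows 0x300 and 0x310 of the clean GPIO dump, rebuilt from the hexdump text.
# Each 16-bit word is stored low byte first, so "0407" is the bytes 07 04.
clean = bytearray(0x320)
clean[0x300:0x302] = b"\x07\x04"
clean[0x30C:0x30E] = b"\x0f\x32"
clean[0x310] = 0xC0
clean[0x314] = 0x07
clean[0x318] = 0x01

# Same rows with C#3 held down: 320f became 320e, and 0007 became 0003.
c_sharp = bytearray(clean)
c_sharp[0x30C] = 0x0E
c_sharp[0x314] = 0x03

for offset, old, new, cleared, newly_set in changed_bits(clean, c_sharp):
    print(f"0x{offset:03X}: {old:02X} -> {new:02X}, cleared {cleared}, set {newly_set}")
```

With real dumps you'd pass in `open("per_gpio.bin", "rb").read()` and friends instead of the hand-built buffers. Adding 0x40033000 to each offset gives the actual peripheral address.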
Okay, so far this seems like a distraction. The Cortex M3 is alive and well; at a guess from stepping through the code with the J-Link I'd say it was stuck in a holding pattern waiting for the USB interface to come up. That leaves the GL850G USB hub-on-a-chip, which has the M3 wired up to one of the downstream ports, along with the two female USB connectors. Everything we care about is getting 5V: the M3, the hub, and the two USB ports. What's not happening is a proper USB handshake!
And sadly, this is where the current line of investigation stops. The GL850G doesn't have a JTAG/SWD interface; the datasheet does mention a TEST mode and an I2C interface, but buggered if I know how those work (the datasheet certainly doesn't say), and I'm not keen on soldering more stuff to the microscopic chip legs. Dumping the USB traffic using the Linux usbmon driver doesn't reveal anything new; just like the dmesg output says, the device fails to identify itself at seemingly random points during the initial handshake. I guess the next level down from there would be to try a USB logic analyser like the Beagle 12, or maybe one of those fancy digital oscilloscopes that support USB TX, but those cost more than 500 dollars! I think the new plan will be to buy a totally-not-bootleg replacement GL850G on eBay (hello China!), then have a stab at desoldering the current one without ruining everything. Plan C is buy a used microKEY 37 on eBay and steal the main board.
Still, using the J-Link was a fun experience. For anyone looking to brush up their reverse engineering know-how, I highly recommend finding some hardware with one of these Cortex M3 SOCs inside and having a go debugging it with a J-Link. As a beginner to assembly language, I was quite impressed with how easy to read the Cortex M3 instructions were. Then again my last experience was reading 16-bit x86, which even the fans will tell you is a horrific architecture with no redeeming qualities, except maybe the power to induce vomiting. If the GL850G transplant works, we'll be sure to pick up where we left off and continue exploring the many nooks and crannies of the microKEY hardware.