J-Link Detective Squad: Dead KORG microKEY-37

08 November 2015

Korg microKEY-37

I love the KORG microKEY 37. It's an excellent entry-level MIDI controller that fits unobtrusively on your desk, great for impromptu jamming when you should be focused on something boring like "tax" or "finding a new house before eviction day". The keys feel pretty soft, as they use rubber domes instead of metal springs, but the velocity response is excellent. I highly recommend it, as it's one of those products which basically lives forever and delivers year after year of good service.

I bring this up because back in about February, my KORG microKEY 37 stopped turning on. The unit would no longer connect via USB; each time it would give up at different points during the initial handshake, with dmesg spewing a number of unhappy messages. Here's one attempt at plugging the device in.

[Sun Sep 27 06:00:56 2015] usb 3-1: new full-speed USB device number 17 using xhci_hcd
[Sun Sep 27 06:00:56 2015] usb 3-1: device descriptor read/64, error -71
[Sun Sep 27 06:00:57 2015] xhci_hcd 0000:00:14.0: Setup ERROR: setup context command for slot 16.
[Sun Sep 27 06:00:57 2015] usb 3-1: hub failed to enable device, error -22
[Sun Sep 27 06:00:57 2015] usb 3-1: new full-speed USB device number 18 using xhci_hcd
[Sun Sep 27 06:00:57 2015] usb 3-1: device descriptor read/64, error -71
[Sun Sep 27 06:00:57 2015] xhci_hcd 0000:00:14.0: Setup ERROR: setup context command for slot 17.
[Sun Sep 27 06:00:57 2015] usb 3-1: hub failed to enable device, error -22
[Sun Sep 27 06:00:57 2015] usb 3-1: new high-speed USB device number 19 using xhci_hcd
[Sun Sep 27 06:00:57 2015] usb 3-1: Device not responding to setup address.
[Sun Sep 27 06:00:57 2015] usb 3-1: Device not responding to setup address.
[Sun Sep 27 06:00:58 2015] usb 3-1: device not accepting address 19, error -71
[Sun Sep 27 06:00:58 2015] usb 3-1: new full-speed USB device number 20 using xhci_hcd
[Sun Sep 27 06:00:58 2015] usb 3-1: Device not responding to setup address.
[Sun Sep 27 06:00:58 2015] usb 3-1: Device not responding to setup address.
[Sun Sep 27 06:00:58 2015] usb 3-1: device not accepting address 20, error -71
[Sun Sep 27 06:00:58 2015] usb usb3-port1: unable to enumerate USB device

Oh snap! I'm torn on what to do... buying a replacement is doable, but this one was limited edition and a replacement wouldn't be the same cool colour scheme! Maybe there's a clue to what went wrong inside the unit?

Main circuit board inside microKEY

Here's the view of the main KLM-3130-B board. As you can see, there's nothing obviously charred or melted. I tried following the golden rule of electronics repair (check for busted capacitors), but all of the capacitors in this are tiny ceramic surface-mount ones. Which hopefully aren't going to fail because they don't have liquid in them.

The big square chip I identified as a Fujitsu (now Cypress) MB9A310A series ARM Cortex M3 processor (MB9AF312L), and the smaller rectangular chip is a GL850G USB hub controller (as this keyboard has a two port expansion USB hub). Checking the datasheet for the CPU, and employing my patented multimeter stabbing technique on the teeny tiny chip legs, I'm sure the CPU is getting +5V at the Vcc (a.k.a. supply voltage) pins and ground at the GND pins. The two clock crystals I believe are 12MHz (X1) and 4MHz (X2). There. That's my electronics knowledge exhausted.

But wait! What are these intriguing pads there on the board marked "For Debug"? SWO? SWDIO? You mean, there's a chance that this board could be fixed in the software realm?

Ah well, in for a penny... after checking in with a Subject Matter Expert I purchased a Segger J-Link debug probe for $89. I then went ahead and hacked open a sturdy quadruple-shielded USB type-B cable (TIP: if you need to strip a USB cable go for the ultra-cheapass thin ones; the only "shielding" they have is a few microns of aluminium foil. Also cutting it ensures it will never be used again), soldered some hookup wire to the pads, grafted some prototype wires on the ends, and plugged everything into a breadboard.

J-Link and microKEY plugged into breadboard

For Serial Wire Debug there are four pins on the J-Link which correspond to our pads (SWCLK, SWDIO, SWO, RESET), plus VTref (spliced off our supply voltage, aka. the +5V pin of the keyboard's USB in) and GND (spliced off ground pin of USB in). D- and D+ of the USB cable are passed through without any splicing. To keep things slightly less failure prone I'm running the J-Link off my desktop's USB controller, and the keyboard's spliced cable will be attached to my laptop.

So... it was at this point that I thought about reading the J-Link manual. According to the Getting Started section, I should switch on the J-Link, then the keyboard, and it should identify the hardware. Hah. As if it'd be that easy.

~/Tools/JLink_Linux_V502_x86_64$ ./JLinkExe
SEGGER J-Link Commander V5.02 ('?' for help)
Compiled Aug 28 2015 19:48:33
DLL version V5.02, compiled Aug 28 2015 19:48:30
Firmware: J-Link V9 compiled Aug 28 2015 17:49:48
Hardware: V9.30
S/N: 269304498
OEM: SEGGER-EDU
Feature(s): FlashBP, GDB
Emulator has Trace capability
VTarget = 4.950V
Info: TotalIRLen = ?, IRPrint = 0x..FFFFFFFFFFFFFFFFFFFFFFF1
Info: TotalIRLen = ?, IRPrint = 0x..FFFFFFFFFFFFFFFFFFFFFFF1
No devices found on JTAG chain. Trying to find device on SWD.
Info: Found SWD-DP with ID 0x2BA01477
Info: Found Cortex-M3 r2p1, Little endian.
Info: FPUnit: 6 code (BP) slots and 2 literal slots
Info: CoreSight components:
Info: ROMTbl 0 @ E00FF000
Info: ROMTbl 0 [0]: FFF0F000, CID: B105E00D, PID: 000BB000 SCS
Info: ROMTbl 0 [1]: FFF02000, CID: B105E00D, PID: 003BB002 DWT
Info: ROMTbl 0 [2]: FFF03000, CID: B105E00D, PID: 002BB003 FPB
Info: ROMTbl 0 [3]: FFF01000, CID: B105E00D, PID: 003BB001 ITM
Info: ROMTbl 0 [4]: FFF41000, CID: B105900D, PID: 003BB923 TPIU-Lite
Info: ROMTbl 0 [5]: FFF42000, CID: B105900D, PID: 003BB924 ETM-M3
Cortex-M3 identified.
Target interface speed: 100 kHz
J-Link>

AWW YISS. The chip is alive! Let's read through the command list and see what things we can do. Oh look! CPU registers! They contain things that are important. Can we see those?

J-Link>regs
CPU is not halted !

J-Link>halt
PC = 00009C56, CycleCnt = E3B49CD7
R0 = 00000000, R1 = 00000001, R2 = 1FFFE444, R3 = 1FFFEF04
R4 = 1FFFEF04, R5 = 1FFFE306, R6 = 00000000, R7 = 00000000
R8 = 00000000, R9 = 00000000, R10= 00000000, R11= 00000000
R12= 4266609C
SP(R13)= 20001FD0, MSP= 20001FD0, PSP= 00000000, R14(LR) = 00009E29
XPSR = 81000000: APSR = Nzcvq, EPSR = 01000000, IPSR = 000 (NoException)
CFBP = 00000000, CONTROL = 00, FAULTMASK = 00, BASEPRI = 00, PRIMASK = 00

Yep. Those are certainly ARM registers. Piece of piss, this electronics lark.

Okay. What next? We don't have the source code of what's running on this M3, so we want to whip out a debugger. But first, what about the stuff stored on the flash memory? It'd be a good idea to take a copy of that to safeguard against us screwing up. More importantly, where IS the flash memory? The Cortex M3 is a 32-bit microcontroller, so there's 4GB of address space!

Back to the CPU datasheet, there's a memory map for the chip on page 55; of course, only a small amount of the 4GB address space is mapped to anything useful. We can see that the chip's princely 128KB of flash (a.k.a. stored code) is mapped to the range 0x00000000-0x00020000, and the generous 16KB of SRAM can be found at 0x1FFFE000-0x20002000. Notice how the values stored in several of those registers above are within the SRAM range? I think we can say with confidence that these are pointers to stuff in memory!
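To save squinting at hex, here's a rough Python sketch that sorts the interesting register values from the dump above into regions. The flash and SRAM ranges are the ones quoted from the datasheet; the other three ranges are the generic Cortex-M3 architectural ones rather than anything chip-specific, and the `classify` helper is just a name made up for illustration.

```python
# (name, start, end) with end exclusive.
# Flash and SRAM come from the datasheet ranges quoted in the text.
# The remaining entries are generic Cortex-M3 architectural ranges (assumption:
# the chip follows the standard ARM layout for these).
REGIONS = [
    ("flash", 0x00000000, 0x00020000),
    ("SRAM", 0x1FFFE000, 0x20002000),
    ("peripherals", 0x40000000, 0x40100000),
    ("peripheral bit-band alias", 0x42000000, 0x44000000),
    ("private peripheral bus", 0xE0000000, 0xE0100000),
]


def classify(addr):
    """Return the name of the region containing addr, or a fallback label."""
    for name, start, end in REGIONS:
        if start <= addr < end:
            return name
    return "unmapped/unknown"


# Non-zero register values copied from the first `halt` output.
registers = {
    "PC": 0x00009C56,
    "R2": 0x1FFFE444,
    "R3": 0x1FFFEF04,
    "R5": 0x1FFFE306,
    "R12": 0x4266609C,
    "SP": 0x20001FD0,
    "LR": 0x00009E29,
}

for reg, value in registers.items():
    print(f"{reg:>3} = 0x{value:08X} -> {classify(value)}")
```

PC and LR land in flash (code), R2, R3, R5 and SP land in SRAM, and R12 is the odd one out.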

But what's the deal with R12? According to our memory map that address sits in somewhere called the "Bit band alias". The ARM manual for the M3 describes this feature pretty well; this is basically another way of reading and writing to the first 1MB of the SRAM region and the first 1MB of the peripheral region, except every bit is given a full processor word. Meaning each 1MB range gets stretched out to 1MB*(8 bits in a byte)*(4 bytes in a word) = 32MB! (R12's value lands in the peripheral alias, which starts at 0x42000000; the SRAM alias starts at 0x22000000.) This may sound like the definition of wastefulness, but it means you can do bit-level operations with one instruction. As in, normally if you wanted to set bit 3 in a byte of memory to 1 you'd have to do something like [load byte into register] + [OR register with 0x08] + [write register to byte] = 3 instructions; with a bit band alias you can just write 1 to the address that corresponds to the bit. Snazzy!
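The alias arithmetic is easier to trust when a script does it. This is a minimal sketch using the standard Cortex-M3 bit-band bases (0x20000000/0x22000000 for SRAM, 0x40000000/0x42000000 for peripherals); `decode_alias` and `encode_alias` are illustrative names, not anything from an ARM or SEGGER library.

```python
# (alias_base, region_base) pairs from the generic Cortex-M3 memory map.
BITBAND = [
    (0x22000000, 0x20000000),  # SRAM bit-band
    (0x42000000, 0x40000000),  # peripheral bit-band
]
REGION_SIZE = 0x00100000  # 1 MB of real memory per bit-band region
ALIAS_SIZE = REGION_SIZE * 8 * 4  # one 4-byte word per bit = 32 MB


def decode_alias(alias_addr):
    """Map a bit-band alias address to (byte_address, bit_number)."""
    for alias_base, region_base in BITBAND:
        offset = alias_addr - alias_base
        if 0 <= offset < ALIAS_SIZE:
            byte_addr = region_base + offset // 32  # 32 alias bytes per real byte
            bit = (offset % 32) // 4                # 4 alias bytes per bit
            return byte_addr, bit
    raise ValueError(f"0x{alias_addr:08X} is not in a bit-band alias")


def encode_alias(byte_addr, bit):
    """Map (byte_address, bit_number) back to its alias word address."""
    for alias_base, region_base in BITBAND:
        offset = byte_addr - region_base
        if 0 <= offset < REGION_SIZE and 0 <= bit < 8:
            return alias_base + offset * 32 + bit * 4
    raise ValueError(f"0x{byte_addr:08X} bit {bit} has no bit-band alias")


byte_addr, bit = decode_alias(0x4266609C)  # the value sitting in R12
print(f"0x{byte_addr:08X} bit {bit}")
```

For R12's value this works out to bit 7 of the byte at 0x40033304, which sits in peripheral space rather than SRAM.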

Okay, enough fun registers talk, let's get ourselves a dump of the flash memory range from our datasheet.

J-Link>savebin microkey.bin, 0x00000000, 0x20000
Opening binary file for writing... [microkey.bin]
Reading 131072 bytes from addr 0x00000000 into file...O.K.

Poking around inside the file with a hex editor, there's a definite rhythmic quality to the bytes, interspersed with large blocks of nulls and the occasional plaintext referencing KORG. Code!

IDA Pro disassembler settings for ARM Cortex M3 code

At this point we hand it over to the Reverse Engineer's Friend™, IDA Pro. In this case, I had to make sure that the architecture was set to ARMv7-M with Thumb-2 (the variant used by the Cortex M3). The code doesn't start immediately from address 0x00000000: instead there's the interrupt vector table, which tells the Cortex M3 which code address to jump to after an interrupt event, such as Reset or a Hard Fault. We're interested in the Reset handler (address 0x00000004), because that's where code starts executing when the CPU is switched on; in this case the value stored was 0x0000011b (a.k.a. address 0x0000011a + 1, the extra 1 being the flag that marks the target as Thumb code). IDA needed a hint for where to start disassembling; visiting address 0x0000011a and converting to code was enough to start the auto disassembly process.
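If you'd rather not eyeball the vector table in a hex editor, something like the following Python sketch pulls out the first two entries. On a Cortex-M3 the word at offset 0 is the initial stack pointer and the word at offset 4 is the reset vector. The reset value 0x0000011B is the one from the dump; the stack pointer value in the demo is a placeholder I made up, not something read from the real image, and `read_vectors` is a hypothetical helper name.

```python
import struct


def read_vectors(image):
    """Return (initial_sp, reset_address, thumb_flag) from a flash image.

    The first two little-endian 32-bit words of a Cortex-M3 vector table are
    the initial stack pointer and the reset vector. Bit 0 of the reset vector
    is the Thumb flag, so it is masked off to get the real code address.
    """
    initial_sp, reset = struct.unpack_from("<II", image, 0)
    return initial_sp, reset & ~1, bool(reset & 1)


# Demo on synthetic bytes: 0x20002000 is a placeholder stack pointer
# (assumption), 0x0000011B is the reset vector value quoted in the text.
demo_image = struct.pack("<II", 0x20002000, 0x0000011B)
sp, reset_addr, thumb = read_vectors(demo_image)
print(f"initial SP = 0x{sp:08X}, reset handler = 0x{reset_addr:08X}, thumb = {thumb}")
```

To run it against the real dump, pass in `open("microkey.bin", "rb").read()` instead of the demo bytes.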

IDA Pro showing annotated Cortex M3 code

The first few subroutines after our start point are probing around the peripheral map at 0x40000000, but what does any of it map to? The memory map in that datasheet was a start, but pretty thin on detail. I found a page containing a copy of the C header file for the chip (mb9a310l.h, for those playing at home), which has a more detailed listing of all of the peripheral memory addresses. (Lesson learned: more detail doesn't help) This crap shouldn't be so hard, all we want is a rough idea what addresses match output from the keys and wheels. Let's take advantage of some street-grade debugging!

Right, step one: halt the CPU and get a dump of the full peripheral range named in the datasheet.

J-Link>halt
PC = 000090FA, CycleCnt = 5314F092
R0 = 00000002, R1 = 00000008, R2 = 00000008, R3 = E000E100
R4 = 000000FF, R5 = 1FFFE306, R6 = 00000000, R7 = 00000000
R8 = 00000000, R9 = 00000000, R10= 00000000, R11= 00000000
R12= 4266609C
SP(R13)= 20001FC0, MSP= 20001FC0, PSP= 00000000, R14(LR) = 00009E0F
XPSR = 21000000: APSR = nzCvq, EPSR = 01000000, IPSR = 000 (NoException)
CFBP = 00000000, CONTROL = 00, FAULTMASK = 00, BASEPRI = 00, PRIMASK = 00

J-Link>savebin per_clean.bin 0x40000000 0x61000
Opening binary file for writing... [per_clean.bin]
Reading 397312 bytes from addr 0x40000000 into file...Could not read memory.

Whuuuuuuuut. You can't talk to me like that, I'm the debugger! I pay your salary!
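A failed bulk read like this doesn't say which part of the range is off limits. One way to find out, sketched here in Python, is to split the range into 4KB pages and generate one savebin command per page, using the same `savebin <file>, <addr>, <length>` form as the flash dump earlier. The output file names are hypothetical, and I haven't confirmed how J-Link Commander behaves when fed a long list of these where some fail, so treat it as a starting point rather than a recipe.

```python
def savebin_commands(base, length, page=0x1000, prefix="per"):
    """Build one J-Link Commander savebin line per page of the given range."""
    commands = []
    for offset in range(0, length, page):
        size = min(page, length - offset)  # last page may be short
        addr = base + offset
        # File name pattern is made up for illustration.
        commands.append(f"savebin {prefix}_{addr:08X}.bin, 0x{addr:08X}, 0x{size:X}")
    return commands


# The same range as the failed dump: 0x61000 bytes from 0x40000000.
commands = savebin_commands(0x40000000, 0x61000)
print(f"{len(commands)} commands")
for line in commands[:3]:
    print(line)
```

Pages that come back with "Could not read memory" can then be crossed off as reserved, one at a time.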

Fine. Okay, clearly "reserved" is EE speak for "get off my land", so let's try just the GPIO block. Lots of buttons sounds like General-Purpose Input/Output to me.

J-Link>savebin per_gpio.bin 0x40033000 0x1000
Opening binary file for writing... [per_gpio.bin]
Reading 4096 bytes from addr 0x40033000 into file...O.K.
$ hexdump per_gpio.bin
0000000 001f 0000 0300 0000 0000 0000 0000 0000
0000010 0000 0000 0000 0000 0000 0000 0000 0000
0000020 0003 0000 0000 0000 0000 0000 0000 0000
0000030 0000 0000 0000 0000 000c 0000 0000 0000
0000040 0000 0000 0000 0000 0000 0000 0000 0000
*
0000100 001f 0000 0000 0000 0000 0000 020f 0000
0000110 00c0 0000 0007 0000 0000 0000 0000 0000
0000120 0000 0000 0000 0000 0000 0000 0000 0000
*
0000200 9000 0000 00ff 0000 000a 0000 fc00 0000
0000210 7e00 0000 0000 0000 0006 0000 0000 0000
0000220 0000 0000 0000 0000 0000 0000 0000 0000
*
0000300 0407 0000 0000 0000 0000 0000 320f 0000
0000310 00c0 0000 0007 0000 0001 0000 0000 0000
0000320 0000 0000 0000 0000 0000 0000 0000 0000
*
0000400 0000 0000 0000 0000 0000 0000 3000 0000
0000410 0000 0000 0000 0000 0000 0000 0000 0000
*
0000500 ff00 ffff 0000 0000 0000 0000 0000 0000
0000510 0000 0000 0000 0000 0000 0000 0000 0000
*
0000580 0014 0000 0000 0000 0000 0000 0000 0000
0000590 0000 0000 0000 0000 0000 0000 0000 0000
*
0000600 0000 0003 0000 0000 0000 0000 0000 0000
0000610 0000 0000 0000 0000 0000 0000 0000 0000
*
0000800 0000 0000 000d 0000 0000 0000 0000 0000
0000810 0000 0000 0000 0000 0000 0000 0000 0000
*
0001000

Cracking it open in a hex editor, it's mostly 0s but a few 1s! Pulling a second dump shows it to be identical to the first.

Now! The next step is to pull another dump of the GPIO range, but this time whilst holding down C3 on the keyboard. Then, compare it with our first clean dump.

$ hexdump per_gpio.bin > per_gpio.hex
$ hexdump per_gpioC3.bin > per_gpioC3.hex
$ diff per_gpio.hex per_gpioC3.hex
16c16
< 0000310 00c0 0000 0007 0000 0001 0000 0000 0000
---
> 0000310 00c0 0000 0004 0000 0001 0000 0000 0000

Well whoda thunk it. We can see that bits 0 and 1 of offset 0x314 are zeroed when we hold down C3. (hexdump prints 16-bit little-endian words by default, so that 0007 is the byte pair at 0x314-0x315 with the low byte, 0x07, first.) How about C#3?

$ diff per_gpio.hex per_gpioCsh3.hex
15,16c15,16
< 0000300 0407 0000 0000 0000 0000 0000 320f 0000
< 0000310 00c0 0000 0007 0000 0001 0000 0000 0000
---
> 0000300 0407 0000 0000 0000 0000 0000 320e 0000
> 0000310 00c0 0000 0003 0000 0001 0000 0000 0000

Interesting. Bit 2 of 0x314 and bit 0 of 0x30c got zeroed that time. Strangely, this only worked for keys C3 to D#3, and the two octave buttons. Where's the rest of the keys and the two wheels? Or the key velocity? Given that the wheels are just potentiometers wired to the main board, I have a good feeling that they'd use the M3's built-in analog-to-digital converter, but at a guess I'd say the block of code responsible for activating the ADC sampling wasn't being hit, so no dice.
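Diffing hexdump text works, but the default 16-bit little-endian grouping makes it easy to misjudge byte offsets. A small Python sketch that compares the two binary dumps as 32-bit words and reports the real addresses and flipped bits might look like this. The demo buffers are rebuilt from the two changed words in the C#3 diff above, with everything else zeroed, so they are stand-ins for the real files; `diff_words` is a made-up helper name.

```python
import struct


def diff_words(before, after, base=0):
    """List (address, old, new, changed_bits) for each differing 32-bit word."""
    changes = []
    usable = min(len(before), len(after))
    for offset in range(0, usable - 3, 4):
        (old,) = struct.unpack_from("<I", before, offset)
        (new,) = struct.unpack_from("<I", after, offset)
        if old != new:
            flipped = old ^ new
            bits = [b for b in range(32) if (flipped >> b) & 1]
            changes.append((base + offset, old, new, bits))
    return changes


# Stand-in dumps: only the two words that differ in the C#3 diff are filled in.
clean = bytearray(0x1000)
struct.pack_into("<I", clean, 0x30C, 0x0000320F)
struct.pack_into("<I", clean, 0x314, 0x00000007)
pressed = bytearray(clean)
struct.pack_into("<I", pressed, 0x30C, 0x0000320E)
struct.pack_into("<I", pressed, 0x314, 0x00000003)

# base is the address the GPIO dump was read from.
for addr, old, new, bits in diff_words(clean, pressed, base=0x40033000):
    print(f"0x{addr:08X}: 0x{old:08X} -> 0x{new:08X}, changed bits {bits}")
```

With the real files, read `per_gpio.bin` and `per_gpioCsh3.bin` in binary mode and pass them in place of the stand-ins.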

Okay, so far this seems like a distraction. The Cortex M3 is alive and well; at a guess from stepping through the code with the J-Link I'd say it was stuck in a holding pattern waiting for the USB interface to come up. That leaves the GL850G USB hub-on-a-chip, which has the M3 wired up to one of its downstream ports, along with the two female USB connectors. Everything we care about is getting 5V; the M3, the hub, and the two USB ports. What's not happening is a proper USB handshake!

Test rig next to soldering equipment

And sadly, this is where the current line of investigation stops. The GL850G doesn't have a JTAG/SWD interface; the datasheet does mention a TEST mode and an I2C interface, but buggered if I know how those work (the datasheet certainly doesn't say), and I'm not keen on soldering more stuff to the microscopic chip legs. Dumping the USB traffic using the Linux usbmon driver doesn't reveal anything new; just like the dmesg output says, the device fails to identify itself at seemingly random points during the initial handshake. I guess the next level down from there would be to try a USB logic analyser like the Beagle 12, or maybe one of those fancy digital oscilloscopes that support USB TX, but those cost more than 500 dollars! I think the new plan will be to buy a totally-not-bootleg replacement GL850G on eBay (hello China!), then have a stab at desoldering the current one without ruining everything. Plan C is buy a used microKEY 37 on eBay and steal the main board.

Still, using the J-Link was a fun experience. For anyone looking to brush up their reverse engineering know-how, I highly recommend finding some hardware with one of these Cortex M3 SOCs inside and having a go debugging it with a J-Link. As a beginner to assembly language, I was quite impressed with how easy to read the Cortex M3 instructions were. Then again my last experience was reading 16-bit x86, which even the fans will tell you is a horrific architecture with no redeeming qualities, except maybe the power to induce vomiting. If the GL850G transplant works, we'll be sure to pick up where we left off and continue exploring the many nooks and crannies of the microKEY hardware.