First assembly on nRF51822 microcontroller
I’m having a lot of weird problems trying to put Rust programs on an nRF51822 microcontroller and get them to run correctly. I decided I need to go low level, down to assembly, and debug instruction by instruction. Going down to “first principles”.
I found a great blog series on writing very simple Cortex-M assembly. It’s written for a Cortex-M4 processor, but it feels like it should not need much adjustment to work on the Cortex-M0 I have.
The code
Trying to use the code verbatim from the first part of the series resulted in a bunch of errors like the ones below:
    C:> arm-none-eabi-as -g -mcpu=cortex-m0 -mthumb try.s -o try.o
    try.s: Assembler messages:
    try.s:26: Error: cannot honor width suffix -- `sub r2,r1,r0'
    try.s:34: Error: lo register required -- `add r1,r1,#1'
    try.s:35: Error: lo register required -- `add r0,r0,#1'
    try.s:37: Error: lo register required -- `sub r2,r2,#1'
    try.s:43: Error: cannot honor width suffix -- `ldr r9,=apa'
    try.s:44: Error: cannot honor width suffix -- `ldr r9,[r9]'
    try.s:47: Error: cannot honor width suffix -- `ldr r8,=0x1337BEEF'
After some searching, it seems that some instructions need to be marked explicitly as changing the flags, which is done by adding an `s` suffix. Thus, `sub` needs to become `subs`, and `add` becomes `adds`. Also, on Cortex-M0 (which implements the ARMv6-M architecture), `ldr` can only access registers up to `r7`, so I needed to tweak those instructions to use lower registers instead of `r8` and `r9`.
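To see why only low registers fit, it may help to look at how a 16-bit Thumb instruction is laid out. The sketch below hand-encodes `subs r2, r1, r0`, assuming the three-register SUBS layout as I understand it from the ARMv6-M manual: seven opcode bits, then three bits each for Rm, Rn and Rd. The helper name `encode_subs_reg` is made up for illustration.

```python
def encode_subs_reg(rd: int, rn: int, rm: int) -> int:
    """Encode the 16-bit Thumb `subs rd, rn, rm` (hypothetical helper)."""
    for reg in (rd, rn, rm):
        # Each register field is only 3 bits wide, so r8 and up cannot be encoded.
        if not 0 <= reg <= 7:
            raise ValueError(f"lo register required, got r{reg}")
    opcode = 0b0001101  # assumed opcode bits for SUBS (register)
    return (opcode << 9) | (rm << 6) | (rn << 3) | rd


print(hex(encode_subs_reg(rd=2, rn=1, rm=0)))  # prints 0x1a0a
```

The result, `0x1a0a`, is the same halfword that objdump later shows for this instruction at address `0x1a`, which makes me fairly confident in the layout.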
Also taking into account some improvements made in subsequent articles in the series, my code became:
    .syntax unified
    .cpu cortex-m0
    .thumb

    .global vtable
    .global reset_handler

    .section .text

    vtable:
        .word _estack
        .word reset_handler
        .word 0
        .word hardfault_handler
    @ .size vtable, .-vtable

    .thumb_func
    hardfault_handler:
        b hardfault_handler

    .thumb_func
    reset_handler:
        ldr r0, =_estack
        mov sp, r0
        ldr r0, =_dstart
        ldr r1, =_dend
        subs r2,r1,r0
        ldr r1, =_flash_dstart
    cpy_loop:
        ldrb r3, [r1]
        strb r3, [r0]
        adds r1, r1, #1
        adds r0, r0, #1
        subs r2, r2, #1
        cmp r2, #0
        bne cpy_loop

    main:
        ldr r4, =apa
        ldr r4, [r4]
        ldr r5, =0xF00DF00D
        ldr r6, =0x1337BEEF
    done:
        b done

    .section .data
    apa:
        .word 0xFEEBDAED
The linker script also needed to be changed. I based it on the one used e.g. in embassy-nrf:
    _estack = 0x20010000;

    MEMORY
    {
        FLASH (rx)  : ORIGIN = 0x00000000, LENGTH = 256K
        RAM   (xrw) : ORIGIN = 0x20000000, LENGTH = 16K
    }

    SECTIONS
    {
        .text :
        {
            *(.text)
        } >FLASH

        _flash_dstart = .;

        .data :
        {
            _dstart = .;
            *(.data)
            _dend = .;
        } >RAM AT> FLASH /* Load into FLASH, but live in RAM */
    } /* SECTIONS END */
(I might actually have set up the stack badly; given the smaller RAM, I’m now pretty sure I need to move it down a fair bit. But I’m not actually using the stack yet, so it hasn’t had a chance to bite me…)
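A quick bit of arithmetic backs up that suspicion. The snippet below is only a sanity check using the numbers from the linker script above; the variable names are mine.

```python
RAM_ORIGIN = 0x20000000   # ORIGIN of RAM in the MEMORY block
RAM_LENGTH = 16 * 1024    # LENGTH = 16K
ESTACK = 0x20010000       # the _estack value currently in the script

# First address past the end of RAM.
ram_end = RAM_ORIGIN + RAM_LENGTH

print(f"RAM ends at {ram_end:#x}")
print(f"_estack overshoots RAM by {ESTACK - ram_end:#x} bytes")
```

If I read this right, the initial stack pointer sits 0xc000 bytes past the end of RAM, and something like `_estack = 0x20004000;` would probably be the safer value. I have not tested that change here.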
Flashing
I built and flashed it with:
    C:> arm-none-eabi-as -g -mcpu=cortex-m0 -mthumb try.s -o try.o
    C:> arm-none-eabi-ld try.o -T ./try.ld -o try.elf
    C:> arm-none-eabi-objcopy -O ihex try.elf try.hex
    C:> openocd -f openocd.cfg
    Open On-Chip Debugger 0.12.0 (2023-01-14-23:37)
    Licensed under GNU GPL v2
    For bug reports, read http://openocd.org/doc/doxygen/bugs.html
    WARNING: interface/stlink-v2.cfg is deprecated, please switch to interface/stlink.cfg
    Info : The selected transport took over low-level target control. The results might differ compared to plain JTAG/SWD
    Info : clock speed 1000 kHz
    Info : STLINK V2J27S6 (API v2) VID:PID 0483:3748
    Info : Target voltage: 3.244149
    Info : [nrf51.cpu] Cortex-M0 r0p0 processor detected
    Info : [nrf51.cpu] target has 4 breakpoints, 2 watchpoints
    Info : starting gdb server for nrf51.cpu on 3333
    Info : Listening on port 3333 for gdb connections
    [nrf51.cpu] halted due to debug-request, current mode: Thread
    xPSR: 0xc1000000 pc: 0x00000012 msp: 0x20010000
    ** Programming Started **
    Info : nRF51822-QFAA(build code: CA/C0) 256kB Flash, 16kB RAM
    Warn : Adding extra erase range, 0x00000058 .. 0x000003ff
    ** Programming Finished **
    ** Verify Started **
    ** Verified OK **
    ** Resetting Target **
    shutdown command invoked
Live debugging
Now the fun part: connecting to the chip remotely and watching live code with a debugger! With openocd, this seems to require two terminal windows. One of them hosts openocd, which then opens a TCP port for gdb to connect to:
    C:> openocd -f openocd.cfg
    Open On-Chip Debugger 0.12.0 (2023-01-14-23:37)
    Licensed under GNU GPL v2
    For bug reports, read http://openocd.org/doc/doxygen/bugs.html
    WARNING: interface/stlink-v2.cfg is deprecated, please switch to interface/stlink.cfg
    Info : The selected transport took over low-level target control. The results might differ compared to plain JTAG/SWD
    Info : Listening on port 6666 for tcl connections
    Info : Listening on port 4444 for telnet connections
    Info : clock speed 1000 kHz
    Info : STLINK V2J27S6 (API v2) VID:PID 0483:3748
    Info : Target voltage: 3.244149
    Info : [nrf51.cpu] Cortex-M0 r0p0 processor detected
    Info : [nrf51.cpu] target has 4 breakpoints, 2 watchpoints
    Info : starting gdb server for nrf51.cpu on 3333
    Info : Listening on port 3333 for gdb connections
In the second window, we can now start gdb:
    C:> arm-none-eabi-gdb
    GNU gdb (GNU Tools for ARM Embedded Processors) 7.10.1.20151217-cvs
    Copyright (C) 2015 Free Software Foundation, Inc.
    License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
    This is free software: you are free to change and redistribute it.
    There is NO WARRANTY, to the extent permitted by law. Type "show copying"
    and "show warranty" for details.
    This GDB was configured as "--host=i686-w64-mingw32 --target=arm-none-eabi".
    Type "show configuration" for configuration details.
    For bug reporting instructions, please see:
    <http://www.gnu.org/software/gdb/bugs/>.
    Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.
    For help, type "help".
    Type "apropos word" to search for commands related to "word".
    (gdb)
The first command to enter in gdb, at the `(gdb)` prompt, is to connect to the openocd server:
    (gdb) target remote :3333
    Remote debugging using :3333
    0xfffffffe in ?? ()
To see the contents of the registers, we can now run `i r`:
    (gdb) i r
    r0             0x20000004       536870916
    r1             0x58             88
    r2             0x0              0
    r3             0xfe             254
    r4             0xfeebdaed       -18097427
    r5             0xf00df00d       -267522035
    r6             0x1337beef       322420463
    r7             0xffffffff       -1
    r8             0xffffffff       -1
    r9             0xffffffff       -1
    r10            0xffffffff       -1
    r11            0xffffffff       -1
    r12            0xffffffff       -1
    sp             0x20010000       0x20010000
    lr             0xffffffff       -1
    pc             0x34             0x34
    xPSR           0x61000000       1627389952
And indeed, they contain the expected contents!

- `r4`: `0xfeebdaed`
- `r5`: `0xf00df00d`
- `r6`: `0x1337beef`
- `sp`: `0x20010000` (buggy)
- `pc`: `0x34` - which is the address of the infinite loop at the end of our program, as can be seen in:
    C:> arm-none-eabi-objdump -s -d try.elf
    ...
    0000002c <main>:
      2c:   4c06        ldr     r4, [pc, #24]   ; (48 <done+0x14>)
      2e:   6824        ldr     r4, [r4, #0]
      30:   4d06        ldr     r5, [pc, #24]   ; (4c <done+0x18>)
      32:   4e07        ldr     r6, [pc, #28]   ; (50 <done+0x1c>)

    00000034 <done>:
      34:   e7fe        b.n     34 <done>
    ...
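The `[pc, #24]` operands in this listing puzzled me at first, so here is a small sketch of how I believe the literal address in the trailing comment is derived. It assumes the 16-bit LDR (literal) layout from the ARMv6-M manual as I understand it: five opcode bits `01001`, a 3-bit target register, and an 8-bit immediate counted in words, with `pc` reading as the instruction address plus 4, rounded down to a word boundary. The function name `decode_ldr_literal` is made up.

```python
def decode_ldr_literal(address: int, halfword: int):
    """Decode a 16-bit Thumb `ldr rt, [pc, #imm]` (hypothetical helper).

    Returns (rt, imm, literal_address).
    """
    if halfword >> 11 != 0b01001:
        raise ValueError("not an LDR (literal) encoding")
    rt = (halfword >> 8) & 0x7       # 3-bit register field: r0..r7 only
    imm = (halfword & 0xFF) * 4      # immediate is stored in words
    base = (address + 4) & ~0x3      # pc reads 4 bytes ahead, word-aligned
    return rt, imm, base + imm


# The three pc-relative loads from <main> in the objdump listing.
for address, halfword in [(0x2C, 0x4C06), (0x30, 0x4D06), (0x32, 0x4E07)]:
    rt, imm, literal = decode_ldr_literal(address, halfword)
    print(f"{address:#x}: ldr r{rt}, [pc, #{imm}] ; ({literal:x})")
```

The computed literal addresses come out as `48`, `4c` and `50`, matching the comments objdump prints. The word-alignment step is what makes the load at `0x32` need `#28` rather than `#24`.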
I was however also interested in whether I could step through the program from its start, not just observe the state at the end of the execution. This led me to [another hint][soft reset] on a “magic” trick for doing a “soft reset” on an ARM chip:

[soft reset]: stackoverflow.com/a/47599728/98528
    (gdb) monit reset halt
    [nrf51.cpu] halted due to debug-request, current mode: Thread
    xPSR: 0xc1000000 pc: 0x00000012 msp: 0x20010000
    (gdb) set {unsigned int}0xe000ed0c = 0x05fa0004
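The magic numbers in that `set` command are less mysterious than they look. As far as I can tell, the address is the AIRCR register in the Cortex-M System Control Block, and the value is the required write key combined with the SYSRESETREQ bit. The sketch below rebuilds both from their parts; the constant names follow the ARM documentation as I remember it and should be treated as my assumption.

```python
SCB_BASE = 0xE000ED00      # System Control Block base address
AIRCR_OFFSET = 0x0C        # Application Interrupt and Reset Control Register
VECTKEY = 0x05FA           # key that must be in the top halfword of a write
SYSRESETREQ = 1 << 2       # bit 2 requests a system reset

aircr_address = SCB_BASE + AIRCR_OFFSET
aircr_value = (VECTKEY << 16) | SYSRESETREQ

# Rebuild the gdb command from its parts.
print(f"set {{unsigned int}}{aircr_address:#x} = {aircr_value:#010x}")
```

This prints `set {unsigned int}0xe000ed0c = 0x05fa0004`, the same command as above.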
After executing this, the general-purpose registers no longer hold the values from the previous run (they all read as `0xffffffff`):
    (gdb) i r
    r0             0xffffffff       -1
    r1             0xffffffff       -1
    r2             0xffffffff       -1
    r3             0xffffffff       -1
    r4             0xffffffff       -1
    r5             0xffffffff       -1
    r6             0xffffffff       -1
    r7             0xffffffff       -1
    r8             0xffffffff       -1
    r9             0xffffffff       -1
    r10            0xffffffff       -1
    r11            0xffffffff       -1
    r12            0xffffffff       -1
    sp             0x20010000       0x20010000
    lr             0xffffffff       -1
    pc             0x12             0x12
    xPSR           0xc1000000       -1056964608
As we can see, the `pc` is at `0x12`, which is the beginning of our `reset_handler` function:
    C:> arm-none-eabi-objdump -s -d try.elf
    ...
    00000012 <reset_handler>:
      12:   4809        ldr     r0, [pc, #36]   ; (38 <done+0x4>)
      14:   4685        mov     sp, r0
      16:   4809        ldr     r0, [pc, #36]   ; (3c <done+0x8>)
      18:   4909        ldr     r1, [pc, #36]   ; (40 <done+0xc>)
      1a:   1a0a        subs    r2, r1, r0
    ...
The above matches what we get if we try to disassemble the instructions at that address:
    (gdb) x/5i $pc
    => 0x12:    ldr     r0, [pc, #36]   ; (0x38)
       0x14:    mov     sp, r0
       0x16:    ldr     r0, [pc, #36]   ; (0x3c)
       0x18:    ldr     r1, [pc, #36]   ; (0x40)
       0x1a:    subs    r2, r1, r0
Similarly, we can dump the values at address `0`, to compare the contents of the vector table with the one in our `try.elf` file:
    (gdb) x/4w 0
    0x0:    0x20010000      0x00000013      0x00000000      0x00000011
And in `try.elf` again:
    C:> arm-none-eabi-objdump -s -d try.elf

    try.elf:     file format elf32-littlearm

    Contents of section .text:
     0000 00000120 13000000 00000000 11000000  ... ............
    ...
    Disassembly of section .text:

    00000000 <vtable>:
       0:   20010000    .word   0x20010000
       4:   00000013    .word   0x00000013
       8:   00000000    .word   0x00000000
       c:   00000011    .word   0x00000011
    ...
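Two things in these dumps are easy to trip over: the raw bytes in the "Contents of section" line are little-endian, and the handler addresses in the table are odd. A rough sketch of decoding the first 16 bytes by hand, assuming the odd low bit is the Thumb marker and the real code address is the value with that bit cleared:

```python
import struct

# The first 16 bytes of .text, copied from the objdump "Contents" line.
raw = bytes.fromhex("00000120" "13000000" "00000000" "11000000")

# Four little-endian 32-bit words, in the order the vtable declares them.
initial_sp, reset, unused, hardfault = struct.unpack("<4I", raw)

print(f"initial sp:        {initial_sp:#010x}")
print(f"reset_handler:     {reset & ~1:#x} (stored as {reset:#x})")
print(f"hardfault_handler: {hardfault & ~1:#x} (stored as {hardfault:#x})")
```

With the low bit cleared, the entries point at `0x12` and `0x10`, which is where `reset_handler` and `hardfault_handler` should land given the 16-byte table and the 2-byte `b hardfault_handler` in front of `reset_handler`.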
Let’s now try stepping through the program. One extra option is useful here: it makes gdb show the next instruction to be run after each step:
    (gdb) set disassemble-next-line auto
    (gdb) monit reset halt
    [nrf51.cpu] halted due to debug-request, current mode: Thread
    xPSR: 0xc1000000 pc: 0x00000012 msp: 0x20010000
    (gdb) nexti
    halted: PC: 0x00000014
    0x00000014 in ?? ()
    => 0x00000014:  85 46   mov     sp, r0
    (gdb) nexti
    halted: PC: 0x00000016
    0x00000016 in ?? ()
    => 0x00000016:  09 48   ldr     r0, [pc, #36]   ; (0x3c)
The `reset_handler` is not super interesting to me, so I want to jump further down to `main`. This can be done by setting a breakpoint and then running `continue`:
    (gdb) br *0x2c
    Breakpoint 1 at 0x2c
    (gdb) c
    Continuing.
    Note: automatically using hardware breakpoints for read-only addresses.

    Breakpoint 1, 0x0000002c in ?? ()
    => 0x0000002c:  06 4c   ldr     r4, [pc, #24]   ; (0x48)
    (gdb) i r
    r0             0x20000004       536870916
    r1             0x58             88
    r2             0x0              0
    r3             0xfe             254
    r4             0xffffffff       -1
    r5             0xffffffff       -1
    r6             0xffffffff       -1
    r7             0xffffffff       -1
    r8             0xffffffff       -1
    r9             0xffffffff       -1
    r10            0xffffffff       -1
    r11            0xffffffff       -1
    r12            0xffffffff       -1
    sp             0x20010000       0x20010000
    lr             0xffffffff       -1
    pc             0x2c             0x2c
    xPSR           0x61000000       1627389952
    (gdb) nexti
    halted: PC: 0x0000002e
    0x0000002e in ?? ()
    => 0x0000002e:  24 68   ldr     r4, [r4, #0]
    (gdb) i r r4
    r4             0x20000000       536870912
    (gdb) x/w 0x20000000
    0x20000000:     0xfeebdaed
    (gdb) nexti
    halted: PC: 0x00000030
    0x00000030 in ?? ()
    => 0x00000030:  06 4d   ldr     r5, [pc, #24]   ; (0x4c)
    (gdb) i r r4
    r4             0xfeebdaed       -18097427
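The values of `r0` to `r3` at the breakpoint are what the copy loop in `reset_handler` leaves behind. To convince myself, here is a rough Python model of that loop. The addresses are assumptions: `_dstart` and `_dend` follow from the 4-byte `.data` section at the start of RAM, and `_flash_dstart = 0x54` is inferred from `r1` ending at `0x58`, not read from a map file. The function name `run_copy_loop` is made up.

```python
def run_copy_loop(flash, ram, dstart, dend, flash_dstart):
    """Model of cpy_loop; returns the final (r0, r1, r2, r3)."""
    r0, r1 = dstart, dend
    r2 = r1 - r0                      # subs r2, r1, r0 (bytes to copy)
    r1 = flash_dstart                 # ldr r1, =_flash_dstart
    while True:
        r3 = flash[r1]                # ldrb r3, [r1]
        ram[r0] = r3                  # strb r3, [r0]
        r1 += 1                       # adds r1, r1, #1
        r0 += 1                       # adds r0, r0, #1
        r2 = (r2 - 1) & 0xFFFFFFFF    # subs r2, r2, #1 (32-bit wraparound)
        if r2 == 0:                   # cmp r2, #0 / bne cpy_loop
            break
    return r0, r1, r2, r3


# Assumed layout: the word 0xFEEBDAED stored little-endian at 0x54 in flash.
flash = {0x54 + i: b for i, b in enumerate((0xFEEBDAED).to_bytes(4, "little"))}
ram = {}
r0, r1, r2, r3 = run_copy_loop(flash, ram, 0x20000000, 0x20000004, 0x54)
print(f"r0={r0:#x} r1={r1:#x} r2={r2:#x} r3={r3:#x}")
```

Under those assumptions the model ends with `r0=0x20000004 r1=0x58 r2=0x0 r3=0xfe`, the same as the register dump. It also shows a quirk of the loop as written: since it copies first and tests afterwards, an empty `.data` section would not skip the copy.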
Useful gdb commands
A short list of gdb commands from this session that I found useful:

- `target remote :3333` - connect to the running openocd
- `i r` - show all registers
- `i r r4` - show a single selected register
- `x/5i $pc` - dump 5 instructions at `pc`
- `x/4w 0` - dump 4 words at address `0`
- `set {unsigned int}0xe000ed0c = 0x05fa0004` - [soft reset][]
- `monit reset halt` - reset the target and halt it right away
- `set disassemble-next-line auto` - show the to-be-run instruction after executing one
- `nexti` - execute the current instruction and go to the next one
- `br *0x2c` - set a breakpoint at address `0x2c`
- `c` - continue running the code until a breakpoint, or forever