
-- Edit with latest update on the problem, it is not solved yet :( --

While playing with an ARM embedded Linux system (kernel version 3.8.13), out of curiosity, I created a small "debug" kernel module.

This "debug" module redirects a specific code path in another kernel module to a function in the "debug" module, which executes some logic and then resumes the original flow peacefully.

To avoid cluttering the stack, the first thing this function does is change SP to point at an empty area inside the "debug" module's code section.

The system crashes a couple of seconds after executing mov sp, r2 and entering an infinite loop (r2 holds the address of the empty area).

Note that the crash happens before the "debug" module executes any stack-related opcode (or actually any other opcode, for that matter).

So I conducted the following checks:

  • Made sure the address is 4-byte aligned
  • Made the area writable and big enough to hold a full context that is saved on the stack during a context switch (about 0x300 bytes)
  • Changed sp by small amounts (+-0x100): did not crash the system
  • Changed sp to some arbitrary value: crashed the system
  • Changed sp momentarily and then immediately restored its value: did not crash the system
  • Changed sp momentarily, executed an STMFD operation and then immediately restored sp's value: did not crash the system!
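
As a sanity check on the alignment and size items above, here is a minimal C sketch of where sp should start for a full-descending stack (the kind STMFD uses). STMFD decrements sp before storing, so the initial sp belongs at the top of the free area, not at its start. The function names and the example addresses are hypothetical, and the 8-byte rounding reflects the AAPCS requirement at public interfaces rather than anything stated in the post.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Initial SP for a full-descending stack that occupies [base, base + size).
 * Rounded down to 8 bytes (AAPCS alignment at public interfaces).
 */
uint32_t initial_sp(uint32_t base, uint32_t size)
{
    return (base + size) & ~7u;
}

/*
 * True if pushing `words` 32-bit registers starting from `sp` stays at or
 * above `base`. The upper bound of the area is not checked here.
 */
bool push_fits(uint32_t base, uint32_t sp, uint32_t words)
{
    return sp >= base && (sp - base) >= 4u * words;
}
```

For a hypothetical 0x300-byte area at 0xBF0A1000, `initial_sp` gives 0xBF0A1300. If sp were set to the start of the area instead, even a two-register push would land below it, which `push_fits` reports as false.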

From the last test, it seems pretty clear that when a context switch happens, some values are saved on the stack, and if sp points to that free area at that moment, the system crashes.

There doesn't seem to be any problem with stack operations on the free area itself, yet the system crashes when a context switch happens.

Feasible ideas

  • Is there a constraint on the possible values of the sp register? (Kernel configuration, perhaps? Specific bits of the address?)
  • Does the code section have some sort of protection against being used as a stack from outside the current module?
  • Does a context switch require more than 0x300 bytes of space?
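
On the first idea, one constraint worth checking is how the kernel finds per-thread data. In the stock 32-bit ARM code of this kernel generation, `current_thread_info()` masks sp with `~(THREAD_SIZE - 1)`, so anything that runs in kernel mode on that CPU (an interrupt handler, the scheduler) expects a valid `thread_info` at the base of the block sp points into. The sketch below is a user-space illustration of that masking, assuming the default 8 KiB `THREAD_SIZE`; the helper names and the example addresses are hypothetical, and this particular system's configuration may differ.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 8 KiB kernel stacks, the usual value on 32-bit ARM. */
#define THREAD_SIZE 8192u

/* Address where the kernel would look for thread_info, given sp. */
uint32_t thread_info_base(uint32_t sp)
{
    return sp & ~(THREAD_SIZE - 1u);
}

/* True if new_sp still resolves to the same thread_info block as old_sp. */
bool same_thread_info(uint32_t old_sp, uint32_t new_sp)
{
    return thread_info_base(old_sp) == thread_info_base(new_sp);
}
```

With a hypothetical original sp of 0xC7A3DE50, moving sp by 0x100 stays inside the same 8 KiB block, while a hypothetical module address such as 0xBF0A1200 resolves to a block with no `thread_info` in it. If that is what happens here, it would be consistent with small offsets surviving and arbitrary values crashing once an interrupt arrives.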

Thanks!

  • You say "The system crashes right after executing mov sp, r2". Yet you are able to "momentarily" change sp.
    – xvk3
    Commented Dec 12, 2017 at 5:34
  • Yes, what I meant is that executing mov sp, r2 and then entering a loop crashes the system after a couple of seconds, whereas executing mov r3, sp, then mov sp, r2, and then mov sp, r3 does not crash the system at all. I added a clarification to my post.
    – Tals
    Commented Dec 12, 2017 at 8:27
  • Are there equal numbers of push-pop / call-ret pairs in the loop? Is the stack pointer preserved between each loop iteration?
    – xvk3
    Commented Dec 12, 2017 at 8:35
  • Said loop is just a branch to its own address. As I mentioned in the post, no stack-related opcodes are executed after any of my tests.
    – Tals
    Commented Dec 12, 2017 at 8:44

1 Answer

If you are setting the stack pointer to a location in the code section, most likely that page does not have write permission enabled, so as soon as the processor writes to the stack you get a permission fault (data abort).

You need to set the stack pointer to a valid page in memory that has read/write permissions rather than read-only. Take a look at the memory access control section in the virtual memory system architecture chapter of the ARM Architecture Reference Manual.

  • That sounded like a possibility, so I set SP to that empty space, pushed two values and then immediately restored its value. The system did not crash. This means that the code page is writable and stack operations are working.
    – Tals
    Commented Dec 10, 2017 at 8:46
  • Can you post the kernel panic you are getting?
    – cimarron
    Commented Dec 13, 2017 at 3:22
  • In this specific embedded system, panics are written to a specific file. For some reason nothing is written on these crashes. My best guess is that since SP has been changed, whatever tries to dump the kernel panic message tries to access data through it, fails, and crashes itself.
    – Tals
    Commented Dec 13, 2017 at 8:01
