5

I'm trying to reverse engineer the GNU libc x86 (32 bit) setjmp / longjmp (re a vuln which may allow arbitrary overwrite of the jmp_buf env.

There's a great writeup of the musl setjmp but I can find almost nothing online about the GNU. I've tried to navigate the source, but it's a spaghetti ball of macros, probably due to being so sys dependent. The asm is unusual, using things like:

CALL       dword ptr GS :[0x10 ]

which I don't fully understand (I thought segments were for 16 bit 8088 code! What is the GS:?).

A priori, I would expect that setjmp would simply save a few registers, but it seems much more complicated. I've found posts claiming GNU intentionally obfuscates it, either to prevent programmers from relying on internals, or for some security purpose, both of which I'm skeptical of.

Experimenting with a debugger has shown one thing: the jmp_buf env changes with each invocation, such as that even the same program, with the same params, and the same stack pointers, if you're using a debugger to load the jmp_buf from one invocation into another you get a SIGV. The contents are clearly not a pure function of the program and stack, but somehow change (randomly?) with each invocation.

Are any of the crack REs here able to penetrate setjmp?

1
  • 1
    Check out Later developments. A few of the segment selectors have been repurposed to hold base addresses to specific pieces of data. In Windows, for example fs plays a role in retrieving the TIB/TEB pointer and ultimately the PEB pointer, but also when setting up SEH in a function. Specific to your question read this.
    – 0xC0000022L
    Commented Nov 2, 2021 at 9:12

1 Answer 1

5

I'll start with answering a few basic questions, some of which you didn't even ask!

What are segment registers doing in modern code?

It's been a while since we've needed extra registers to address a memory region. 32, and especially 64, bits are more than enough. OS developers took advantage of those unused registers and nowadays most modern OSes use at least some of the registers to hold OS related data. As mentioned in the comments, on amd64 processors segment registers cannot be used for segmentation but OSes have been doing it on 32 bit processors as well.

You can read more about it here regarding linux, here and here regarding windows, etcetera.

Why can't I restore data from a previous execution of a program

Although you may control some variables of a program's execution (parameters, stack addresses, process loading addresses and heap location) you're still not controlling all variables (locations of specifica allocations, values returned from "external" sources such as the kernel and as we'll see soon, anti-exploitation mitigations might interfere with that sort of thing too).

Generally, you should never expect such a thing to work without taking the necessary adjustments. Let alone in something as low-level and nuanced as setjmp/longjmp.

Why isn't the setjmp/longjmp implementation documented?

Firstly, we're in a reverse engineering community, avoiding documentation does not guarantee confidentiality. Secondly, documentation is in the code :)

I would imagine documentation is rather difficult for such low-level details that may change frequently and are very architecture specific. Which leads us to your next question -

Why is setump/longjmp architecture-dependent?

Obviously, this goes without saying, but for completeness I thought it'd be better to be explicit here. Here are some reason this has to be done on an per-architecture level:

  1. As these functions touch the some of a CPU's ABI (specifically the calling convention), the code has to follow a different conventions.
  2. Accessing registers by name, for their specific purpose is abstracted in C.
  3. C is a procedural language, therefore setjmp/longjmp in their core are direct contradiction to the nature of C since it breaks the boundaries of procedures (functions).
  4. Additional architecture-specific features (that are implemented differently, shadow stack and pointer guard are such examples) might change how setjmp/longjmp need to handle specific cases.

I'll discuss amd64 from now on.

How is setjump implemented

Now, although this isn't C (and isn't the most readable assembly either), the code for setjmp on amd64 can be found in setjmp.S, longjmp is __longjmp.S. It's even quite commented and the code is pretty straight forward!

You can clearly see the registers as they're saved onto the structure (For example, movq %r12, (JB_R12*8)(%rdi)). You can see PTR_MANGLE is called if the aformentinoed pointer guard feature is enabled.

Because your question mostly revolved around finding the code and not reading the code and since the code is quite straight-forward, I'll leave reading the functions as an exercise for the reader for now. I'll come back and add more details later on, so feel free to ask follow-up questions.

How is the jmp_buf structure defined

Since we're dealing with assembly we don't have structures. Instead, there are several #define preprocessor directives to define the jmp_buf structure. Those are located in a dedicated header: jmpbuf-offsets.h

Where are those files located? Where can I find different architecture implementations?

These files are located in the sysdep module, which holds subdirectories for each supported architecture-specific components. aarch64 stands for arm 64-bit, x86 for 32-bit intel 8086 compatible processors, 86_64 for 64 bit intel 8086 CPUs, etcetera.

4
  • Superb answer. Where is PTR_MANGLE defined? gdb suggests it does two things: rol and xor. What is the point of the rotation? It doesn't add any true encryption, of course. Commented Nov 3, 2021 at 1:55
  • Is definition on Intel x64 different than AMD? Or is it just called AMD since they were first? Where is x86-32 bit defined? What is the scheme / layout to find these files? Commented Nov 3, 2021 at 1:56
  • Intel x64, x86_64 and amd64 are practically the same. amd64 indeed comes from amd being the "first". PTR_MANGLE is a MACRO used for Linux's "pointer guard" feature I mentioned in the answer. The SHADOW_STACK_ MACROs (also used by setjmp) are used for the "Shadow stack" feature (also mentioned above).
    – NirIzr
    Commented Nov 3, 2021 at 8:49
  • PTR_MANGLE is defined in the architecture's sysdep.h. x86_64 for example. You can browse for them here
    – NirIzr
    Commented Nov 3, 2021 at 9:01

Not the answer you're looking for? Browse other questions tagged or ask your own question.