Following up on an earlier question (3 questions on assembly - syntax, meaning, and equivalent in high level code (e.g. C++)), I would like to know the following about the same x86 code in AT&T syntax:

xor $0x20, (%eax) 
and $0x20, %ah 
or $0x20, %dh 
dec (%edi) 
dec %si 
dec %sp 
dec %bp

  1. What are the implications of the last two lines of code (decrementing stack pointer and base pointer)?

  2. What are the lines of code doing from a higher level perspective? - E.g. "Takes an input and outputs a string"

  3. What are some Linux commands (that come with distros) that decompile assembly code? - I have only found downloadable software suggestions thus far.

This is not a homework question - I am new to assembly. The example is not taken from a real program - it is only meant to help me get a better understanding and to illustrate my questions.

  • You should post real examples, not something you made up, otherwise the answers would be useless.
    – Igor Skochinsky
    Commented Feb 20, 2017 at 8:12
  • "I'm learning English, so can you explain what 'gut what excite' means? I just made it up because I know some of these words but please translate it" - this is what your questions look like.
    – Igor Skochinsky
    Commented Feb 20, 2017 at 8:22
  • Ok, point made. I will post another question with actual code.
    Commented Feb 20, 2017 at 17:15

1 Answer

What are some Linux commands (that come with distros) that decompile assembly code?

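There are none. You can see this for yourself if you look at the functionality provided by the tools in the GNU binutils collection: objdump, for example, disassembles machine code into assembly, but nothing in the collection decompiles assembly into a high level language.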

Stack Frames

On System V i386 systems, the compiler uses %esp and %ebp to manage stack frames on the runtime stack (it is the compiler that transforms source code into assembly). A stack frame is created on the runtime stack each time a function is called.

From the System V Application Binary Interface Intel386 Architecture Processor Supplement, chapter 3 "Low-Level System Information" section 9 "Function Calling Sequence" (page 37):

  • %esp

    The stack pointer holds the limit of the current stack frame, which is the address of the stack’s bottom-most, valid word. At all times, the stack pointer should point to a word-aligned area.

  • %ebp

    The frame pointer optionally holds a base address for the current stack frame. Consequently, a function has registers pointing to both ends of its frame. Incoming arguments reside in the previous frame, referenced as positive offsets from %ebp, while local variables reside in the current frame, referenced as negative offsets from %ebp. A function must preserve this register’s value for its caller.
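The two register descriptions above can be sketched with a small simulation in C. This is only an illustrative model, not how a real CPU or compiler is implemented: the sim_* names, the 64-word stack, and the stand-in return address 0xdeadbeef are all assumptions made for the example. It mimics a caller pushing one argument and a return address, followed by the usual callee prologue (push %ebp; mov %esp, %ebp; sub $4, %esp).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical simulated machine: a tiny stack of 32-bit words plus the two
   registers discussed above. All sim_* names are illustrative only. */
enum { STACK_WORDS = 64 };

typedef struct {
    uint32_t mem[STACK_WORDS]; /* simulated stack memory, one slot per word */
    uint32_t esp;              /* byte offset of the current top of stack */
    uint32_t ebp;              /* byte offset of the current frame base */
} sim_cpu;

static void sim_init(sim_cpu *c) {
    c->esp = STACK_WORDS * 4;  /* empty stack: esp starts at the high end */
    c->ebp = 0;                /* no frame yet */
}

static void sim_push(sim_cpu *c, uint32_t v) {
    c->esp -= 4;               /* the stack grows toward lower addresses */
    c->mem[c->esp / 4] = v;
}

static uint32_t sim_load(const sim_cpu *c, uint32_t addr) {
    return c->mem[addr / 4];
}

/* Caller pushes one argument and a return address; the callee then sets up
   its frame, stores one local, and adds the argument and the local. */
static uint32_t sim_call_and_sum(sim_cpu *c, uint32_t arg, uint32_t local) {
    sim_push(c, arg);                  /* caller: argument */
    sim_push(c, 0xdeadbeefu);          /* caller: stand-in return address */
    sim_push(c, c->ebp);               /* callee: push %ebp */
    c->ebp = c->esp;                   /* callee: mov %esp, %ebp */
    c->esp -= 4;                       /* callee: sub $4, %esp */
    c->mem[(c->ebp - 4) / 4] = local;  /* local variable at -4(%ebp) */
    assert(c->esp % 4 == 0);           /* stack pointer stays word-aligned */
    /* argument at 8(%ebp), local at -4(%ebp) */
    return sim_load(c, c->ebp + 8) + sim_load(c, c->ebp - 4);
}
```

In this model the argument is found at a positive offset from the frame pointer and the local at a negative offset, matching the ABI text quoted above; the saved %ebp sits at 0(%ebp) and the return address at 4(%ebp).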

Here is a picture of a standard stack frame (from the System V Application Binary Interface Intel386 Architecture Processor Supplement, page 36): [figure: Standard Stack Frame]

And here is a different diagram of a portion of a process runtime stack (from CSAPP chapter 3 "Machine-Level Representation of Programs"): [figure: Stack with multiple frames]

The runtime stack is a region high in a process's virtual memory. For reference, here is a diagram of virtual memory (from TLPI, chapter 6 "Processes"): [figure: Layout of a Process in Virtual Memory]

Now to your question:

What are the implications of the last two lines of code (decrementing stack pointer and base pointer)?

The code that you have provided is not from a called function, so no stack frame would be created for it: no function calls means no stack frame creation. When the program is executed and its process image is created, this code would be mapped from the executable ELF binary's .text section into the text segment in virtual memory. In the context of the code you have provided, the statements dec %sp and dec %bp therefore have no consequence for stack frame management, since there are no function calls and no stack frames to be managed. They do still change the register values: each one decrements only the low 16 bits of %esp or %ebp, so any later push, pop, call, or ret would use the altered stack pointer.
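As a rough illustration of what a 16-bit decrement does to a 32-bit register value, here is a minimal C model. The function names and the sample addresses are made up for the example; real stack addresses will differ from run to run.

```c
#include <assert.h>
#include <stdint.h>

/* Model of a 16-bit "dec" applied to the low half of a 32-bit register
   (as in dec %sp or dec %bp): only bits 0..15 change. */
static uint32_t dec_low16(uint32_t reg32) {
    uint16_t low = (uint16_t)(reg32 & 0xffffu);
    low = (uint16_t)(low - 1);           /* wraps 0x0000 -> 0xffff */
    return (reg32 & 0xffff0000u) | low;  /* upper 16 bits are untouched */
}

/* True when an address is a multiple of 4 (word-aligned on i386). */
static int is_word_aligned(uint32_t addr) {
    return (addr & 3u) == 0;
}

/* Worked examples with made-up addresses. */
void dec_low16_examples(void) {
    assert(dec_low16(0xbffff000u) == 0xbfffefffu);
    assert(dec_low16(0xbfff0000u) == 0xbfffffffu); /* no borrow into bit 16 */
    assert(!is_word_aligned(dec_low16(0xbffff000u)));
}
```

The second example shows why this differs from dec %esp: the 16-bit form wraps within the low half instead of borrowing from the upper half. The third shows that a single decrement leaves a previously word-aligned stack pointer misaligned.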

What are the lines of code doing from a higher level perspective? - E.g. "Takes an input and outputs a string"

There is not much going on here. xor $0x20, (%eax) is an example of indirect addressing, where the value in %eax is treated as a memory address and whatever is at that address is xor'ed with the integer value 32. dec (%edi) results in the value in %edi being treated as a memory address, and whatever is at that address has 1 subtracted from it. As written, the operand size of these two memory accesses is ambiguous; a suffix such as xorb or xorl would make it explicit. The other statements are just bitwise and arithmetic operations performed on values in 8-bit and 16-bit CPU registers. I am not sure how this sequence of computations would be represented in a high level language.
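For a rough feel of the individual operations, here is one possible C rendering of the first five instructions. It is a sketch under several assumptions rather than a faithful translation: the memory operands are taken to be single bytes (the snippet does not say), CPU flags are ignored, the hypothetical regs struct treats %ah as independent of %eax even though on real hardware it is bits 8..15 of that register, and dec %sp and dec %bp are left out because they act on the stack and frame pointers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register file holding only what the snippet touches. */
typedef struct {
    uint8_t *eax;   /* %eax used as a pointer to a byte */
    uint8_t *edi;   /* %edi used as a pointer to a byte */
    uint8_t ah, dh; /* 8-bit sub-registers */
    uint16_t si;    /* 16-bit sub-register */
} regs;

static void run_snippet(regs *r) {
    assert(r->eax && r->edi);  /* both pointers must be valid */
    *r->eax ^= 0x20;  /* xor $0x20, (%eax): toggle bit 5 of the byte in memory */
    r->ah &= 0x20;    /* and $0x20, %ah: keep only bit 5 */
    r->dh |= 0x20;    /* or  $0x20, %dh: set bit 5 */
    *r->edi -= 1;     /* dec (%edi): decrement the byte in memory */
    r->si -= 1;       /* dec %si: 16-bit decrement, wraps at zero */
}
```

Bit 5 (0x20) happens to be the bit that distinguishes upper and lower case ASCII letters, so if %eax pointed at the byte 'a', the xor would turn it into 'A'. Whether that was the intent of the original snippet is a guess.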

Conclusion

The best thing you could do for yourself is learn how to write some basic, working assembly code and step through it with a debugger like gdb, using the stepi and info registers commands. This will allow you to see for yourself what happens as a result of each statement. It will also speed up the learning process and deepen your understanding of assembly and virtual memory.
