2

I'm writing a reversing tool that parses the symbol table to find the address of the main function.

In all the binaries I have checked so far, the main function is still named main in the symbol table.

My question is: can this name change? Right now my check for finding the main function is that the symbol name equals main.

If it can change, what are the possible values? And if there are too many possible values, how can I find main in the binary?

3 Answers

4

Since you mentioned ELF in the tags, WinMain, DllMain, etc. should not be a concern for you. They are naming conventions for Windows.

The main function is the conventional entry point of a C/C++ program, but that does not mean it is the first code executed. You will usually find some initialization code that runs before main is called.

My question is: can this name change? Right now my check for finding the main function is that the symbol name equals main.

Yes. You only find this static symbol because your executable is not stripped. If you run strip(1) on the executable, the static symbol table is removed and you lose this information.
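To make the stripped vs. not-stripped difference concrete, here is a minimal sketch of a symbol lookup that returns None instead of failing when there is no static symbol table. The name find_symbol is made up for illustration, and the sketch assumes a little-endian ELF64 file whose section headers are intact. It is not a complete ELF parser.

```python
import struct

SHT_SYMTAB = 2  # section type of the static symbol table (.symtab)


def find_symbol(elf: bytes, name: str):
    """Return st_value of `name` from .symtab, or None if it is not there.

    Hypothetical helper: only little-endian ELF64 is handled.
    """
    if elf[:4] != b"\x7fELF" or elf[4] != 2 or elf[5] != 1:
        raise ValueError("expected a little-endian ELF64 image")

    # ELF64 header: e_shoff at 0x28, e_shentsize at 0x3A, e_shnum at 0x3C.
    e_shoff = struct.unpack_from("<Q", elf, 0x28)[0]
    e_shentsize, e_shnum = struct.unpack_from("<HH", elf, 0x3A)

    wanted = name.encode()
    for i in range(e_shnum):
        # Elf64_Shdr: name, type, flags, addr, offset, size, link, info,
        # addralign, entsize (64 bytes in total).
        (_, sh_type, _, _, sh_offset, sh_size,
         sh_link, _, _, sh_entsize) = struct.unpack_from(
            "<IIQQQQIIQQ", elf, e_shoff + i * e_shentsize)
        if sh_type != SHT_SYMTAB or sh_entsize == 0:
            continue

        # sh_link is the index of the string table holding the symbol names;
        # its file offset sits at 0x18 inside that section header.
        str_hdr = e_shoff + sh_link * e_shentsize
        str_off = struct.unpack_from("<Q", elf, str_hdr + 0x18)[0]

        for j in range(sh_size // sh_entsize):
            # Elf64_Sym: st_name, st_info, st_other, st_shndx, st_value, st_size.
            st_name, _, _, _, st_value, _ = struct.unpack_from(
                "<IBBHQQ", elf, sh_offset + j * sh_entsize)
            start = str_off + st_name
            end = elf.index(b"\0", start)  # names are NUL-terminated
            if elf[start:end] == wanted:
                return st_value

    # No .symtab (stripped binary) or no symbol with that name.
    return None
```

Something like find_symbol(open("a.out", "rb").read(), "main") would then give an address on an unstripped build and None after strip(1), which is the signal to fall back to another method.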

If it can change, what are the possible values? And if there are too many possible values, how can I find main in the binary?

It can be anything.

For instance, without static symbols, and if your executable was not compiled with the -static switch, you can still retrieve the address of the main function: it is the first argument passed to __libc_start_main. This function is normally imported from the shared object libc.so, so you can locate it through its dynamic symbol.

2
  • You mean that in _start, before calling the function you mentioned, the absolute address of main is always pushed, correct (no matter the compiler)? I want to write a tool that works no matter how the C/C++ code was compiled or what an attacker has done.
    – Max
    Commented Apr 17, 2019 at 5:40
  • It's not necessarily pushed. It depends on the calling convention. For instance, on x86-64 (System V) the first parameter is passed in the register rdi.
    – wisk
    Commented Apr 17, 2019 at 12:08
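As a rough illustration of the rdi point for x86-64, the sketch below scans the bytes of _start for the two instruction encodings that commonly load rdi before the call to __libc_start_main. The function name main_from_start is hypothetical, and which instruction a given toolchain emits is an assumption (an immediate move is typical of non-PIE builds, a RIP-relative lea of PIE builds). It matches raw bytes rather than disassembling, so it can misfire on hand-written or obfuscated start code.

```python
import struct


def main_from_start(code: bytes, start_vaddr: int):
    """Guess main's address from the raw bytes of _start (x86-64 only).

    Hypothetical heuristic: returns the value loaded into rdi by the first
    matching instruction, or None if neither pattern is found.
    """
    for off in range(len(code) - 6):
        opcode = code[off:off + 3]

        # mov rdi, imm32  ->  48 C7 C7 <imm32>
        if opcode == b"\x48\xc7\xc7":
            imm = struct.unpack_from("<i", code, off + 3)[0]
            # The immediate is sign-extended to 64 bits by the CPU.
            return imm & 0xFFFFFFFFFFFFFFFF

        # lea rdi, [rip + rel32]  ->  48 8D 3D <rel32>
        if opcode == b"\x48\x8d\x3d":
            rel = struct.unpack_from("<i", code, off + 3)[0]
            # RIP points at the next instruction, 7 bytes after this one starts.
            return start_vaddr + off + 7 + rel

    return None
```

Here code would be the bytes found at the ELF entry point and start_vaddr the entry point address itself. For a PIE binary the result is relative to the image base, like every other address in the file. On other architectures or calling conventions the first argument lives elsewhere, so the patterns would have to change.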
3

Just a quick note in case you are not aware of this:

$ cat tiny.c
#include <unistd.h>
void _start() {
  _exit(42);
}

On x86-64, here is what I get (you need a static libc, libc.a):

$ gcc -static -ffreestanding -nostartfiles  -s -o tiny tiny.c
$ ./tiny || echo $?
42

Note that file reports the binary as stripped:

$ file tiny
tiny: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, BuildID[sha1]=5557e6655b77976b7c248711af6f508d931fc3af, stripped

and objdump shows only the following, with no symbols at all:

$ objdump -x tiny

tiny:     file format elf64-x86-64
tiny
architecture: i386:x86-64, flags 0x00000102:
EXEC_P, D_PAGED
start address 0x0000000000400180

Program Header:
    LOAD off    0x0000000000000000 vaddr 0x0000000000400000 paddr 0x0000000000400000 align 2**21
         filesz 0x0000000000000240 memsz 0x0000000000000240 flags r-x
    LOAD off    0x0000000000001000 vaddr 0x0000000000601000 paddr 0x0000000000601000 align 2**21
         filesz 0x0000000000000018 memsz 0x0000000000000018 flags rw-
    NOTE off    0x0000000000000158 vaddr 0x0000000000400158 paddr 0x0000000000400158 align 2**2
         filesz 0x0000000000000024 memsz 0x0000000000000024 flags r--
     TLS off    0x0000000000001000 vaddr 0x0000000000601000 paddr 0x0000000000601000 align 2**2
         filesz 0x0000000000000000 memsz 0x0000000000000004 flags r--
   STACK off    0x0000000000000000 vaddr 0x0000000000000000 paddr 0x0000000000000000 align 2**4
         filesz 0x0000000000000000 memsz 0x0000000000000000 flags rw-

Sections:
Idx Name          Size      VMA               LMA               File off  Algn
  0 .note.gnu.build-id 00000024  0000000000400158  0000000000400158  00000158  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  1 .text         0000006a  0000000000400180  0000000000400180  00000180  2**4
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  2 .eh_frame     00000050  00000000004001f0  00000000004001f0  000001f0  2**3
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  3 .tbss         00000004  0000000000601000  0000000000601000  00001000  2**2
                  ALLOC, THREAD_LOCAL
  4 .got.plt      00000018  0000000000601000  0000000000601000  00001000  2**3
                  CONTENTS, ALLOC, LOAD, DATA
  5 .comment      0000002c  0000000000000000  0000000000000000  00001018  2**0
                  CONTENTS, READONLY
SYMBOL TABLE:
no symbols
2
  • I have no idea what you are pointing out with this comment. Care to elaborate?
    Commented Apr 18, 2019 at 23:13
  • 1
    @JohannAydinbas I was simply pointing out that an ELF executable may not have a main symbol (e.g. $ objdump -x tiny | grep main returns nothing in my case).
    – tibar
    Commented Apr 19, 2019 at 7:42
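Even with no symbols at all, the "start address" that objdump prints above is still available, because it comes from the e_entry field of the ELF header rather than from a symbol. A minimal sketch of reading it follows. The function name entry_point is made up for illustration, and only the header fields needed for this are decoded.

```python
import struct


def entry_point(elf: bytes) -> int:
    """Return e_entry from an ELF header (hypothetical helper)."""
    if elf[:4] != b"\x7fELF":
        raise ValueError("not an ELF image")

    ei_class = elf[4]  # 1 = 32-bit, 2 = 64-bit
    ei_data = elf[5]   # 1 = little-endian, 2 = big-endian
    endian = "<" if ei_data == 1 else ">"

    # e_entry follows e_ident (16), e_type (2), e_machine (2), e_version (4),
    # so it starts at offset 0x18 in both the 32-bit and 64-bit layouts.
    if ei_class == 2:
        return struct.unpack_from(endian + "Q", elf, 0x18)[0]
    return struct.unpack_from(endian + "I", elf, 0x18)[0]
```

For a binary like tiny, the entry point is the only reliable starting place: the code there is the program, and there is no main to look for.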
-1

main() means your program is a console application.

WinMain() means the program is a GUI application, that is, it displays windows and dialog boxes instead of showing a console.

DllMain() means the program is a DLL. A DLL cannot be run directly but is used by the above two kinds of applications.

2
  • So these are the only possible values, correct, and I only need to check for these?
    – Max
    Commented Apr 16, 2019 at 6:57
  • There is also wmain() for Unicode apps; I don't know if other values exist. I think everything other than main() is Windows-specific.
    – mailwl
    Commented Apr 16, 2019 at 8:30
