8

I am looking to reverse engineer an 8051 firmware binary file and am not certain where to start. The firmware is for the Realtek RTL8188EE wireless card. It is located here: https://github.com/lwfinger/rtlwifi_new/tree/master/firmware/rtlwifi

I know (think?) it is 8051 because there are several references to it throughout the Linux code for this chip. I have identified from the Linux code that the header is the first 32 bytes.

How do I go about disassembling it? I have run it through IDA Pro with an offset of 0x20 (the 32-byte header), but I get a lot of NOPs and then binary junk at the end that IDA seems to interpret as instructions. I also ran it through an obscure disassembly tool I found online called d52v336, after deleting the first 32 bytes. The output began with a lot of NOPs and a series of ORGs. I put the printout here (https://pastebin.com/1M11wXiF). The first bit of code seems to begin with a 'reti' and 'lcall X4204'. When I look up X4204, it is 'X4204 equ 4204h', which confuses me. Perhaps this is the beginning of the code, but I am not certain.

How do I know if my offset is off? Is this the real beginning of the code? If so, what do I do with that lcall? Confused!

3 Answers

8

I recently made a tool called at51 to make the early stages of 8051 reverse engineering easier, and I will shamelessly use this answer as a showcase.

First off, you want the image to be properly aligned, meaning that each byte's offset in the file matches the address it is loaded at. 8051 firmware files are rarely aligned this way, and this file is no exception. The base subcommand lists the offsets at which the file is most likely loaded:

$ at51 base rtl8188efw.bin
Index by likeliness:
    1: 0x3fe0 with 139
    2: 0x2526 with 63
    3: 0x6a5 with 58

Note that the score of the first match is more than double that of the second, so the file is probably loaded at 0x3fe0 (the code itself starts at 0x4000 because of the 32-byte header).
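The offset arithmetic above can be sketched in a few lines of Python. This is only an illustration, not part of at51: the function name `place_at` and the zero fill byte are assumptions made here. It pads the file in front so that the header lands at 0x3fe0 and the first byte after the 32-byte header lands at 0x4000.

```python
def place_at(blob: bytes, load_offset: int, fill: int = 0x00) -> bytes:
    """Pad `blob` in front so that offset in the result == 8051 code address.

    `load_offset` is the address of the first byte of the file (header
    included). The fill byte is arbitrary; 0x00 decodes as NOP on the 8051.
    """
    if load_offset < 0 or load_offset + len(blob) > 0x10000:
        # Without banking, the 8051 code space is 64 KiB.
        raise ValueError("image would not fit in the 64 KiB code space")
    return bytes([fill]) * load_offset + blob
```

For example, `place_at(open("rtl8188efw.bin", "rb").read(), 0x3fe0)` would give an image in which a disassembler loading at address 0 sees the code at the addresses it was linked for.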

You could now use Ghidra or radare2, both of which have 8051 support. To help with this, since this firmware seems to have been compiled with C51, as most 8051 firmware is, you can also use the libfind subcommand to find standard library functions in the image. For that you need the C51 library files named C51*.LIB (you can obtain them by downloading the trial version of C51, because no one would ever leave a file named C51L.LIB anywhere open on the internet).

Anyway, using the aligned image (for example by using dd if=rtl8188efw.bin of=fw_aligned bs=$((0x3fe0)) seek=1) and the library files, one gets

$ at51 libfind fw_aligned /path/to/lib/C51*.LIB
Address | Name                 | Description
0x42dd    (MAIN)                
0x44a9    ?C?IILDX              
0x44bf    ?C?LAND                long (32-bit) bitwise and
0x44cc    ?C?LOR                 long (32-bit) bitwise or
0x44d9    ?C?LLDXDATA            long (32-bit) load from xdata
0x44e5    ?C?LLDXDATA0           long (32-bit) load from xdata into r3-r0
0x44f1    ?C?OFFXADD            
0x44fd    ?C?PLDXDATA            general pointer load from xdata
0x4506    ?C?PSTXDATA            general pointer store to xdata
0x450f    ?C?CCASE              
0x4573    ?C_START              

Now you have the offsets where some functions start, along with a description for some of them, which should help. Note that MAIN is in parentheses because it is not found in the library itself, but is referenced by it.

One last thing: firmware generated by C51 often contains a structure that stores the values of memory locations to be initialized on startup. You can use the kinit subcommand to read that structure. Its offset is easy to find because it is loaded at the start of ?C_START with mov dptr, #0x45b8. For this firmware image, however, the structure seems to be disabled (by a 0 at that location). Or maybe the linker messed up and inserted the terminating 0 before the structure instead of after it? Anyway, if you skip the zero (the structure actually exists one byte after it), you get

$ at51 kinit -o $((0x45b9)) fw_aligned
xdata[0x8197] = 0x00
xdata[0x8198] = 0x00
xdata[0x81a4] = 0x00
xdata[0x3457..0x3468] = [0x4a, 0x57, 0x36, 0x58, 0x29, 0xc0, 0xe0, 0xc0, 0xf0, 0xc0, 0x83, 0xc0, 0x82, 0xc0, 0xd0, 0x75, 0xd0]

That last one seems to be garbage since it contains 8051 code, so maybe the terminating 0 did accidentally land at the beginning.

2

radare2 supports 8051, if you think that is what it is. Your link points to a directory.

There is no ee file there, but there is an efw file. I ran it in radare2 and it seems to be disassembling it:

radare2 -AA -a 8051 firmware.bin

[0x00000000]> px 16
- offset -   0 1  2 3  4 5  6 7  8 9  A B  C D  E F  0123456789ABCDEF
0x00000000  e188 1000 0800 0000 1025 2156 b02b 0000  .........%!V.+..
[0x00000000]> pd 1
            ;-- b:
            ;-- r4:
            ;-- r5:
            ;-- psw:
            ;-- r7:
/ (fcn) fcn.00000000 14
|   fcn.00000000 ();
\       ,=< 0x00000000  ~   e188           ajmp loc.00000788
[0x00000000]> s 0x788
[0x00000788]> pd 5
|- loc.00000788 9
|   loc.00000788 ();
|       |      ; JMP XREF from 0x00000000 (fcn.00000000)
|       |      ; JMP XREF from 0x00000782 (loc.00000788)
|       |      ; CALL XREF from 0x0000077a (loc.00000788)
|       `=< 0x00000788      80f0           sjmp loc.0000077a
            0x0000078a      e0             movx a, @dptr
/ (fcn) fcn.0000078b 27
|   fcn.0000078b ();
|              ; UNKNOWN XREF from 0x000006db (fcn.0000061c + 191)
|              ; CALL XREF from 0x000006db (fcn.0000061c + 191)
|           0x0000078b      75f003         mov b, #0x03                ; [0x100f0:1]=255
|           0x0000078e      a4             mul ab
|           0x0000078f      24fe           add a, #0xfe
[0x00000788]>
1
  • Pardon, I should have clarified. I'm working on rtl8188efw.bin. However, any one of them should work. From your knowledge of Intel 8051, does this look like the legitimate start of a program rather than radare2 interpreting another chip's ROM? And I'll look into radare2, that looks promising. Thanks a lot! Commented Mar 5, 2018 at 2:20
2

Yes, this looks like 8051 firmware.

The reset entry point is 0x0000, which in this case jumps to 0x0788, which continues with valid 8051 code. There are many calls to addresses above 0x4000, so you are probably missing ROM code.
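To check jump targets like these by hand, here is a small Python sketch of how the two short jump forms in the listing are decoded. The function names are made up for illustration; the encodings are the standard 8051 ones (AJMP carries 11 address bits, SJMP a signed 8-bit displacement).

```python
def ajmp_target(pc: int, op: int, lo: int) -> int:
    """Target of a 2-byte AJMP located at address `pc`.

    The opcode byte is aaa00001: its top three bits are address bits 10..8,
    the second byte holds bits 7..0, and the upper five bits are taken from
    the address of the following instruction.
    """
    if op & 0x1F != 0x01:
        raise ValueError("not an AJMP opcode")
    page = (pc + 2) & 0xF800
    return page | ((op >> 5) << 8) | lo


def sjmp_target(pc: int, rel: int) -> int:
    """Target of a 2-byte SJMP (opcode 0x80) at `pc`; `rel` is a signed byte."""
    if rel >= 0x80:
        rel -= 0x100
    return (pc + 2 + rel) & 0xFFFF
```

With the bytes from the radare2 listing, `ajmp_target(0x0000, 0xE1, 0x88)` gives 0x788 and `sjmp_target(0x0788, 0xF0)` gives 0x77a, matching the disassembly. This only confirms the decoding; it does not by itself show that offset 0 of the file is where execution starts.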

In my experience, the automatic analysis functionality of radare2 (aa, aaa) struggles a lot with 8051. I usually only run aar to get all cross references and then manually markup the code as I go along.
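As a rough way to see where long calls and jumps point without any tool, the following Python sketch counts the 16-bit operands of every byte that looks like an LJMP (0x02) or LCALL (0x12). It is a naive byte scan, not a disassembly and not what aar does internally, so operand or data bytes that happen to equal 0x02 or 0x12 produce false hits. Both function names are hypothetical.

```python
from collections import Counter


def long_branch_targets(image: bytes) -> Counter:
    """Count targets of byte sequences that look like LJMP/LCALL addr16."""
    hits = Counter()
    for i in range(len(image) - 2):
        if image[i] in (0x02, 0x12):
            # addr16 is stored high byte first.
            hits[(image[i + 1] << 8) | image[i + 2]] += 1
    return hits


def share_at_or_above(hits: Counter, limit: int) -> float:
    """Fraction of counted targets that are >= `limit` (0.0 if none)."""
    total = sum(hits.values())
    if total == 0:
        return 0.0
    return sum(n for target, n in hits.items() if target >= limit) / total
```

Calling `share_at_or_above(long_branch_targets(data), 0x4000)` on the raw file contents gives a quick, noisy estimate of how many long branches land at or above 0x4000.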

I also configure e asm.jmpsub=true for better 8051 disassembly (once you have started assigning flags).

See this page for more information about using radare2 to reverse 8051: https://github.com/radareorg/radare2-book/blob/master/src/arch/8051.md
