2

I've come to work with a strange database file format. Each DB comes with two files: one is "database.db" and the other is "database.key".

The ".db" file always starts with the bytes 0x78 0x9C, while the ".key" file always contains the string "1.00 Peter's B Tree" somewhere in the file (the position varies).

Looking online, I found that the header 0x78 0x9C could indicate zlib compression, but I have not found any way to view the contents of the database.

Does anyone here know something that could help me with this format? Thanks :)

Edit 1: It appears that the ".db" file contains more than one zlib-deflated stream: the signature 0x78 0x9C is present not only at the beginning of the file but also at other positions in it. For example, these are some of the streams I can find in one file:

78 9C CB 63 40 07 33 76 5B 6A AF 78 DD 54 23 CE C9 90 C4 78 89 81 89 81 F1 22 86 9A ED 6A D7 44 F6 03 D5 B0 31 30 94 60 91 F6 D4 2A 76 3B 0C 94 E6 63 60 2C 51 B6 63 00 00 22 13 11 57
78 9C CB 63 40 07 2F 53 D7 B8 9F EC 8B B2 E1 7A F1 32 87 F1 12 03 23 03 E3 45 0C 35 4B B7 68 5B CD 90 2E E7 65 67 60 2A 51 B6 63 00 00 A6 E8 0C 5D

By inflating those 2 streams I get 2 new uncompressed streams.

What I did then was write a C# program that loads a ".db" file and creates a list of byte arrays, where each byte array is one deflated stream. To do this I simply split the file at every 78 9C.

This seems to work with some of the ".db" files, but in other cases it gave me errors like "Invalid distance code", with this stream:

78 9C E2 13 FD 2F 14 9F CD 9B 29 3E 65 9F A0 F8 BC 7C 92 E2 93 EF 29 8A CF B0 A7 29 3E 8D FE 4A F1 B9 F2 0C C5 27 C4 B3 14 EF F5 5B 28 DE B5 B7 52 BC FF 6E A3 78 27 DD 4E F1 9E B8 83 E2 DD 6D 27 C5 FB D4 2E FA F0 6A EE A6 78 EF 78 EE EA 2F AA D3 91 FE 1F 2F 94 78 6C

or "Invalid stored block length", with this stream:

78 9C 90 35 CE 34 2F 0C 7D FE A5 57 C9 FF D5 2B 47 5B B7 C4 7F 69 EA 3F 0F AC 25 F4 45 49 3D CC FF 00 E5 AE 30 40

Maybe simply splitting the file at each 78 9C is not the correct way of doing it ...

As for the ".key" files: I was able to open them using Peter Graf's "PBL" library. With "pblKfGetAbs ()" (http://www.mission-base.com/peter/source/pbl/doc/keyfile.html) I managed to get all records related to each key in the file. These records are 4-byte values. Searching for these values with a hex editor in a decompressed ".db" file (one that did not give me errors during the inflate process), I was able to get some hits but nothing more. I don't understand what those records in the key file mean...

Thank you for the help !

3

4 Answers

6

Yes, those are very likely zlib streams stored in the database.

There is nothing keeping 78 9c from appearing in the compressed data, so simply searching for that is not a good way to extract the contents of the file. Also 78 9c is not the only valid zlib header. The easiest way to find the valid zlib streams is to simply start decompressing at every byte. zlib will very quickly rule out most as not having a valid zlib header. For the rest you can decompress until it completes or fails. If it completes with a good integrity check (returning Z_STREAM_END), then it is extremely likely that that was an intentional compressed zlib stream.
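As a rough illustration of the "start decompressing at every byte" approach, here is a minimal Python sketch. The function names `find_zlib_streams` and `looks_like_zlib_header` are made up for this example, and the header pre-filter is only a speed-up based on the two-byte zlib header layout; the real test is whether inflation reaches the end of the stream with a valid checksum.

```python
import zlib


def looks_like_zlib_header(cmf: int, flg: int) -> bool:
    """Cheap pre-filter on the two header bytes (hypothetical helper)."""
    return (
        (cmf & 0x0F) == 8                  # compression method 8 = deflate
        and (cmf >> 4) <= 7                # window size of at most 32 KiB
        and ((cmf << 8) | flg) % 31 == 0   # header check bits
    )


def find_zlib_streams(data: bytes):
    """Return (offset, compressed_length, payload) for each stream that inflates cleanly."""
    view = memoryview(data)  # slicing a memoryview avoids copying the tail each time
    found = []
    offset = 0
    while offset < len(data) - 1:
        if looks_like_zlib_header(data[offset], data[offset + 1]):
            inflater = zlib.decompressobj()
            try:
                payload = inflater.decompress(view[offset:])
            except zlib.error:
                payload = None  # invalid block, distance code, checksum, ...
            # eof is True only if the end of the stream was reached and the
            # Adler-32 check passed (the Z_STREAM_END case).
            if payload is not None and inflater.eof:
                length = len(data) - offset - len(inflater.unused_data)
                found.append((offset, length, payload))
                offset += length  # continue right after this stream
                continue
        offset += 1
    return found
```

Assuming the whole file fits in memory, it could be used as `find_zlib_streams(open("database.db", "rb").read())`. Because `unused_data` tells you where each stream really ends, you no longer depend on the next 78 9C to find the boundary, and a 78 9C that merely happens to occur inside compressed data is rejected instead of producing a bogus split.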

You are trying to reverse-engineer a database format with what appears to be relatively little to go on. This is a detective job that Stack Overflow can't help with, unless someone here knows the format and recognizes it.

4
  • Sorry for the late response. Right now the language or environment is not important. Those are "old" db files that were created by someone in my company who no longer works here, and we must open them. I've tried to decompress the zlib streams with Java on Windows, but the output isn't any less confusing... Commented Sep 16, 2016 at 7:55
  • When you tried to decompress, did it succeed? Success would be no error codes.
    – Mark Adler
    Commented Sep 16, 2016 at 14:44
  • You are right, an error code appeared. The .db file, however, seems to be composed of several deflated data streams: I can find multiple 0x789C inside the file. I've made a program to split the binary file at each 0x789C occurrence, and then inflate each one of these data streams. However, this does not seem to work completely... Commented Sep 20, 2016 at 12:20
  • Please expand your question with what you have tried and in what way it appears to have not fully worked.
    – Mark Adler
    Commented Sep 20, 2016 at 15:19
2

These bytes are a zlib header; zlib streams are widely used by different tools (such as Git, Memcached, etc.).

To uncompress the file, you can use the following command:

printf "\x1f\x8b\x08\x00\x00\x00\x00\x00" | cat - zlib-file.dump | gunzip

To skip some leading bytes first, use dd, e.g.

cat <(printf "\x1f\x8b\x08\x00\x00\x00\x00\x00") <(dd skip=100 if=zlib-file.dump bs=1 of=/dev/stdout) | gunzip

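Expect gunzip to complain about the end of the stream (CRC/length error or unexpected end of file) even on good data, because a zlib stream ends with a 4-byte Adler-32 rather than gzip's CRC-32 and length. If it reports invalid compressed data before producing the output, consider the stream faulty.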

0

The .db files are compressed data; the .key files hold key information used to find the wanted data in those .db files (like an index file). After you open them, you may not find string data in the .db files, because they are runtime databases: these .db files contain binary data like 'packets', and they are compressed as he said.

0

78 9C is the zlib header that indicates the default compression level.
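For reference, the two header bytes can be decoded by hand according to the zlib format (RFC 1950). This is only a small sketch, and `describe_zlib_header` is a name invented for the example:

```python
def describe_zlib_header(cmf: int, flg: int) -> dict:
    """Decode the two-byte zlib header (CMF, FLG) as laid out in RFC 1950."""
    return {
        "method": cmf & 0x0F,                    # 8 means deflate
        "window_bits": (cmf >> 4) + 8,           # 15 means a 32 KiB window
        "preset_dict": bool(flg & 0x20),         # FDICT flag
        "level_hint": flg >> 6,                  # 0 fastest, 1 fast, 2 default, 3 maximum
        "check_ok": ((cmf << 8) | flg) % 31 == 0,
    }


print(describe_zlib_header(0x78, 0x9C))
# {'method': 8, 'window_bits': 15, 'preset_dict': False, 'level_hint': 2, 'check_ok': True}
```

The same decoding shows why 78 9C is not the only header to look for: 78 01, 78 5E and 78 DA are equally valid and differ only in the level hint, which the decompressor does not need.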

Try Aluigi's offzip command-line tool to extract the data.
