I have binary data which I review in hex format with xxd -ps.
I notice that the byte distance between two headers is 48300 (= 805*60) bytes, where the separator is fafafafa.
The beginning of the file, before the first header, should be skipped.
Example data: a hex dump called data26.6.2015.txt, which contains three headers with 48300 bytes between the fafafafa headers, and its nearly equivalent binary called test_27.6.2015.bin, which contains only the first two headers. In both files the data of the last header is not of complete length; otherwise you can assume that the byte offset, i.e. the length of the data between headers, is fixed.
Pseudocode of the algorithm
- find the header end position
- find the first two header positions and take the difference of these positions (d2 - d1) as the distance between events; this event length is fixed (777)
- split the data by byte position (777) - TODO should I split the binary data or the xxd -ps converted data?
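The pseudocode above can be sketched directly on the binary data, without going through a hex dump. This is only a minimal Python sketch, not a definitive implementation: the names find_headers and split_events are hypothetical, the sample data is synthetic, and it assumes that the header marker is the four bytes fa fa fa fa and that the event length equals the distance between the first two headers.

```python
HEADER = bytes.fromhex("fafafafa")  # assumed 4-byte header marker


def find_headers(data: bytes, header: bytes = HEADER) -> list[int]:
    """Return the byte offset of every non-overlapping header occurrence."""
    offsets = []
    pos = data.find(header)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(header, pos + len(header))
    return offsets


def split_events(data: bytes, header: bytes = HEADER) -> list[bytes]:
    """Split data into fixed-length events, skipping everything before the first header."""
    offsets = find_headers(data, header)
    if len(offsets) < 2:
        raise ValueError("need at least two headers to measure the event length")
    start = offsets[0]
    length = offsets[1] - offsets[0]  # d2 - d1 from the pseudocode
    return [data[i:i + length] for i in range(start, len(data), length)]


if __name__ == "__main__":
    # Synthetic stand-in for the real file: a 5-byte preamble, then three events
    # (4-byte header + payload), the last one truncated like in the example files.
    sample = b"\x00" * 5 + (HEADER + b"\x01" * 8) * 2 + HEADER + b"\x02" * 3
    print(find_headers(sample))                    # [5, 17, 29]
    print([len(e) for e in split_events(sample)])  # [12, 12, 7]
```

With a real file, the bytes would come from something like open("test_27.6.2015.bin", "rb").read(). The last chunk is simply shorter when the final event is incomplete.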
I can convert the data back to binary with xxd -r, like xxd -ps | split and store | xxd -r, but I am still unsure if this is necessary.
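Whether the round trip is needed can be checked on a small example. The Python sketch below uses a made-up byte string (an assumption, not real SRS data) to illustrate that a plain hex dump is just a two-characters-per-byte encoding: byte offset n corresponds to hex character offset 2n, so splitting the bytes directly and splitting the hex text at an even offset give the same pieces.

```python
data = bytes.fromhex("0001fafafafa1122fafafafa3344")  # made-up sample

hex_text = data.hex()                   # same digits as a plain hex dump, without line breaks
assert bytes.fromhex(hex_text) == data  # converting back is lossless

cut = 8                                 # byte offset at which to split
left_bin, right_bin = data[:cut], data[cut:]
left_hex, right_hex = hex_text[:2 * cut], hex_text[2 * cut:]

# Splitting the bytes and splitting the hex text at 2 * cut agree.
assert left_bin.hex() == left_hex
assert right_bin.hex() == right_hex
print(left_hex, right_hex)              # 0001fafafafa1122 fafafafa3344
```

Under these assumptions the conversion adds nothing to the split itself; it only doubles the offsets. Note that xxd documents the combination -r -p for reading plain hex dumps back into binary.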
In which stage can you split binary data? Only in the xxd -ps converted format, or also as binary data?
If splitting in the xxd -ps converted format, I think a for loop is the only way to go through the file.
Possible tools for splitting: csplit, split, ..., but I am not sure.
Output from grep (ggrep is GNU grep) on the hex data:
$ xxd -ps r328.raw | ggrep -b -a -o -P 'fafa' | head
49393:fafa
49397:fafa
98502:fafa
98506:fafa
147611:fafa
147615:fafa
196720:fafa
196725:fafa
245830:fafa
245834:fafa
A similar grep on the binary file gives only an empty line as output:
$ ggrep -b -a -o '\xfa' r328.raw
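One possible reason for the empty result is that, without -P, grep does not interpret \xfa as a byte value. A way around this is to search for the raw bytes themselves. The following Python sketch is a rough stand-in for grep -b -o on binary data; the function name byte_offsets is hypothetical and the sample bytes are synthetic.

```python
import re


def byte_offsets(data: bytes, pattern: bytes) -> list[int]:
    """Byte offsets of every non-overlapping match, similar in spirit to grep -b -o."""
    return [m.start() for m in re.finditer(re.escape(pattern), data)]


if __name__ == "__main__":
    # Synthetic sample; with a real file one would read it in binary mode,
    # e.g. data = open("r328.raw", "rb").read()
    data = b"\x10\x20" + b"\xfa" * 4 + b"\x30" * 6 + b"\xfa" * 4 + b"\x40"
    print(byte_offsets(data, b"\xfa\xfa"))          # [2, 4, 12, 14]
    print(byte_offsets(data, b"\xfa\xfa\xfa\xfa"))  # [2, 12]
```

Searching for the two-byte pattern reports each four-byte header as a pair of hits, the same pairing seen in the grep output on the hex data; searching for all four bytes gives one offset per header, already in bytes.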
Documentation
The documentation given to me is found here and here, and the general SRS data format is given as a picture.
In which stage can you split binary data (as binary data or as xxd -ps converted data)?
dd, but without understanding exactly what chunk you want to extract I'm limiting this to a comment rather than an answer.
fafafafad0 starts at character 195 of the hex dump, meaning it's byte 98 of the binary file, but fafafafa6a starts at character 968 of the hex dump, 773 characters of hex later, which means it's 386.5 bytes later, which means it's across a byte boundary. Your "file 001.txt" is 773 characters long, which isn't normally a valid length for a hex dump - hex dumps must have an even number of characters, since each byte of the input is 2 characters.
-P on grep - it's not doing you any favors there. In general, just dump the file with od or strings or whatever and grep the results - you don't need to save a copy of the whole encoded file, though - you already have the other. And grepping stuff like that is already going to be tedious enough, so maybe just keep the actual searches basic if you can.
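The offset arithmetic in the comment about characters 195 and 968 can be reproduced with a small helper. This is a Python sketch under two assumptions: the hex dump is one continuous run of digits with no line breaks, and character positions are counted from 1, as that comment appears to do. The name hex_pos_to_byte is hypothetical.

```python
def hex_pos_to_byte(char_pos: int) -> tuple[int, bool]:
    """Map a 1-based character position in an unwrapped hex dump to
    (1-based byte number, True if the position starts on a byte boundary)."""
    zero_based = char_pos - 1
    return zero_based // 2 + 1, zero_based % 2 == 0


print(hex_pos_to_byte(195))  # (98, True): first digit of byte 98
print(hex_pos_to_byte(968))  # (484, False): second digit of byte 484
print((968 - 195) / 2)       # 386.5 bytes between the two markers
```

If the dump is default xxd -ps output, which wraps at 60 hex characters per line, the newline characters also count toward character positions, so they would have to be removed or accounted for before applying this mapping.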