I split a huge folder:

$ tar cvpf - somedir | split -b 50000m

… and then I transferred the split files to another server and merged them:

$ cat x* > somedir.tar.gz

But when I tried to extract the file, I got errors:

$ tar xvf somedir.tar.gz
tar: This does not look like a tar archive
tar: Skipping to next header
tar: Archive contains obsolescent base-64 headers
tar: Error exit delayed from previous errors

How can I transfer and extract it so that it becomes a folder again? Thanks.


2 Answers

  1. You have not compressed the archive, so do not name the target file somedir.tar.gz; call it somedir.tar.
    • This is not really relevant to your problem: tar xvf should have worked regardless.
  2. Check that your transfer (especially if it was over FTP) was done in binary mode, not text mode.
  3. Finally, check that the files actually transferred correctly.
    • For this, you can compare file sizes on the source and target sides.
    • Better still, compare their MD5 sums on both sides (this also covers point 2).
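The checksum comparison in point 3 can be sketched as below. This is only an illustration: it assumes GNU coreutils `md5sum` is available on both machines and that the pieces still have split's default names (xaa, xab, ...). The function names and the manifest name parts.md5 are made up for this example.

```shell
# On the source machine, in the directory holding the split pieces:
# write one checksum line per piece into a manifest file.
make_manifest() {
    md5sum x* > parts.md5
}

# On the target machine, after copying the pieces AND parts.md5 into one directory:
# recompute each checksum and compare; exits non-zero if any piece differs.
check_manifest() {
    md5sum -c parts.md5
}
```

Run `make_manifest` before the transfer and `check_manifest` after it; a FAILED line names the piece that was damaged on the way.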

If you've checked the above points and the MD5 sums match on both ends, verify the split itself first: do the rejoin and tar tf work on the source machine?

If your somedir is very large, try the workflow with a small directory first.
Use tar cvpf to create a plain tarball, then split it, rejoin the pieces, and test the result.
Next, copy the pieces to your target machine as before, rejoin them there, and test again.
This should help you locate the problem faster.
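A minimal sketch of that small-scale test on a single machine follows. Everything in it is illustrative: the directory smalldir, the small.part. prefix, and the 4k piece size are assumptions chosen so the test runs in a second, not values from the question.

```shell
# Work in a throwaway directory so nothing real is touched.
work=$(mktemp -d)
cd "$work"

# Build a tiny directory tree to stand in for somedir.
mkdir -p smalldir/sub
echo hello > smalldir/a.txt
echo world > smalldir/sub/b.txt

# Create a plain (uncompressed) tarball first.
tar cpf small.tar smalldir

# Split it into small pieces, with an explicit prefix so the glob
# below matches only the pieces and nothing else in the directory.
split -b 4k small.tar small.part.

# Rejoin the pieces and compare byte for byte with the original tarball.
cat small.part.* > rejoined.tar
if cmp -s small.tar rejoined.tar; then
    echo "rejoin matches original"
else
    echo "rejoin differs from original" >&2
fi

# List the rejoined archive to confirm tar can read it.
tar tf rejoined.tar
```

If this passes locally, repeat the `cat`, `cmp` (or a checksum), and `tar tf` steps on the target machine after copying the pieces there.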

  • I checked the MD5 sums; they are OK.
    – user167043
    Commented May 1, 2012 at 2:44
    I would do the md5 sum before splitting and after catting.
    – fstx
    Commented May 1, 2012 at 7:27

Don't name your concatenated tarball somefile.tar.gz, since it is not compressed. To avoid confusing tar's compression auto-detection, you really do need to name it somefile.tar.

Quoting from the tar info manual:

The format recognition algorithm is based on "signatures", a special byte sequences [sic] in the beginning of file, that are specific for certain compression formats. If this approach fails, `tar' falls back to using archive name suffix to determine its format ...

So, what's probably happening is that your tarball doesn't begin with any of the signatures of known compression methods, so tar falls back to using the filename to determine the compression. In your case, the name ends with .gz, so it assumes gzip compression. This leads to the errors you're getting, since the file is not actually compressed.

tl;dr: Name the tarball with the .tar suffix, not .tar.gz.
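To check whether a file really is gzip-compressed before trusting its name, you can look at its first two bytes, which are 1f 8b for gzip data. The helper below is a hypothetical sketch, not something tar provides; the name is_gzip is made up for this example.

```shell
# Hypothetical helper: succeed if the file starts with the gzip magic bytes 1f 8b.
is_gzip() {
    # od prints the first two bytes as hex; tr strips the padding and newline.
    magic=$(od -An -tx1 -N2 "$1" | tr -d ' \n')
    [ "$magic" = "1f8b" ]
}

# Example use (file name taken from the question):
#   if is_gzip somedir.tar.gz; then
#       echo "really gzip-compressed"
#   else
#       mv somedir.tar.gz somedir.tar   # plain tar: give it the matching name
#   fi
```

For the archive in the question, which was created without compression, this check would be expected to fail, which is consistent with renaming it to somedir.tar.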

  • When tar cannot determine format based on the [sic] signatures in the file header, I've usually found the file to be corrupted -- naming with different extensions has never really helped. Getting the correct format archive has always helped.
    – nik
    Commented May 1, 2012 at 4:44
