
I have 22,000 gzip files on a remote server, which I would like to transfer to my machine. I would also like to uncompress these files before/while transferring. Is it possible?

My rsync command currently is:

rsync -chavzP test@server1:/var/www/html/reports/id00003383.gz /home/myself/reports/id00003383.pdf

Note: file id00003383.pdf resides within the gz file id00003383.gz.

If it is not possible, do I have any other alternative?

  • This example shows you rsyncing one file. Where are the rest? Are you calling rsync 22,000 times? How do you know what the file names are?
    – phemmer
    Commented Feb 13, 2014 at 2:07
  • I'm going to assume from the lack of response that one of the 2 answers below solved your question. If that is indeed the case, you should mark one of them as accepted.
    – phemmer
    Commented Feb 15, 2014 at 2:13

2 Answers

As far as I know, rsync doesn't have that capability. I would not recommend decompressing before the transfer anyway: you would send far more data over the network, which only pays off if decompression on your client is slower than the extra transfer time. Transfer compressed, then decompress locally.
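For example, a minimal sketch of that approach, reusing the paths from your question:

# transfer the whole reports directory, still compressed
rsync -chavzP test@server1:/var/www/html/reports/ /home/myself/reports/
# then decompress locally; gunzip replaces each .gz file with its contents
find /home/myself/reports -type f -name '*.gz' -exec gunzip {} +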

If you are into programming, the best advice I can give is to take a look at rdiff-backup, which is a Python-based program on top of the librsync library. You should be able to hook a test-and-decompress step into each file transfer there.

That seems a likely route for catching the data before it is written, and it would be faster than a separate process watching rsync's output and decompressing each file after it has already been written to disk.
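For comparison, here is a rough sketch of that slower watcher approach, using rsync's --out-format option to print the name of each transferred file (paths again assumed from the question):

# every file is written compressed and then rewritten uncompressed,
# which is the extra disk work described above
rsync -az --out-format='%n' test@server1:/var/www/html/reports/ /home/myself/reports/ |
while IFS= read -r name; do
  case $name in
    *.gz) gunzip "/home/myself/reports/$name" ;;
  esac
done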

I can't think of an easy way to make rsync perform the decompression. You could do the decompression via a custom preloaded library or a custom FUSE filesystem (on the remote side, the side with the gzipped files), but either would be a lot of work.

If you can assume that the timestamp of the uncompressed file matches the timestamp of the gzipped file, and it's acceptable to transfer whole files in case of modification, then you don't need to leverage rsync's incremental update capabilities. The same goes if the files are never modified after they are created.

The first hurdle is that this is a remote copy. Eliminate the problem by mounting the remote directory with SSHFS so that it is available as a local filesystem.

Now you can use find to traverse the directory tree, create directories as needed, and decompress the .gz files on the fly:

mkdir server1
sshfs test@server1:/var/www/html/reports server1
cd server1
find . \( -type d -o -type f -name '*.gz' \) -exec sh -c '
  for x; do
    if [ -d "$x" ]; then
      # recreate the directory tree on the local side
      [ -d "$0/$x" ] || mkdir "$0/$x"
    else
      # decompress to the same relative path, minus the .gz suffix
      zcat "$x" >"$0/${x%.gz}"
      # give the output the timestamp of the compressed original
      touch -r "$x" "$0/${x%.gz}"
    fi
  done
' /home/myself/reports {} +
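When everything has been copied, unmount the SSHFS directory (fusermount -u is the standard FUSE unmount command on Linux):

cd ..
fusermount -u server1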
