9

I am moving some 20 GB of files within my local network (100 Mbit/s). The files are from a typical Linux desktop system.

Would compressing them using tar/gzip and then sending them improve performance?

EDIT: I'm moving a developer's workspace, meaning lots of source files and PDFs, and not much multimedia.

  • A worthwhile improvement may be to upgrade your LAN to gigabit Ethernet. I upgraded my switch and some NICs a few years ago and it was definitely worth it. Now most motherboards even come with on-board 10/100/1000 ports, so you might only need to upgrade your switch. Commented Aug 18, 2009 at 14:39

8 Answers

6

It largely depends on the type of files you are moving.

  • If your files are already-compressed formats like PDFs, JPEGs, movies, or installation packages, compressing them again will not gain you much.
  • If they are source files, compression will be quite useful.
  • If there are lots of small files, bundling them into a tar archive will help in any case.

Finally, if your source machine has plenty of processing power and memory,
the compression will finish in useful time; otherwise a plain tar (per the points above) should suffice.

Since your network is just 100 Mbps, you should lean towards compression if the data benefits from it.
But if you are transferring files that cannot be compressed much,
the compression step only adds to the total transfer time.

Alternatively, you could consider other transfer media (like a USB drive or DVD).
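If you are unsure which case applies, a quick check is to compare the raw and compressed archive sizes of a representative sample before the real transfer. A minimal sketch, assuming GNU tar and a hypothetical sample_dir:

$ tar -cf - sample_dir | wc -c    # bytes in the plain archive
$ tar -czf - sample_dir | wc -c   # bytes after gzip; if not clearly smaller, skip the -z

If the two numbers are close, the gzip step is only costing you CPU time.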

6

Aside from the type of the files, this depends especially on the number of files. While transferring bulk data is theoretically possible at full network speed, there is a lot of overhead associated with file system operations such as enumerating files and their properties, creating them, and deleting them.

If you have a large number of small files, the overhead can even exceed the data to be transmitted.

In such cases, archiving the data before transmission can be a huge benefit. If the data compresses badly (encrypted and/or already-compressed data), I recommend not compressing the archive, which saves a lot of time: just use plain tar.

If the files are compressible (uncompressed bitmaps, text), compressing as well might make sense.
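For the badly compressible case, a sketch of the archive-only transfer over ssh (no compression flag, so no CPU is wasted; the host and paths are placeholders):

$ tar -cf - data_dir | ssh remote_machine "cd /dest/parent && tar -xf -"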

  • Agreed. The size saving from compression will be limited. It is almost always faster to transfer one 20 GB file than 100,000 files that together add up to 20 GB.
    – Tony
    Commented Aug 18, 2009 at 12:24
3

Probably the fastest technique is tarring the data up, piping it across the network, and untarring it at the other end.

Something like this:

$ tar -czf - root_dir | ssh -c blowfish remote_machine "cd parent_dir && tar -xzf -"

The -z flag tells tar to compress the stream, which is essentially the same as a separate gzip step; you can run gzip separately instead if you prefer.

If you need to copy or synchronize the data a second time, you can use rsync (-z enables compression). In particular, if the above command is interrupted, rsync will verify what has already arrived and send only what is missing.
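For example, a resumable follow-up sync might look like this (a sketch; the host and paths are placeholders):

$ rsync -avz --partial root_dir/ remote_machine:parent_dir/root_dir/

Here -a preserves permissions and timestamps, -z compresses in transit, and --partial keeps partially transferred files so an interrupted run can resume.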

It is much cleaner if ssh is not prompting you for passwords (set up key-based authentication), but it will work with passwords too.

2

Nik is correct, it depends on the data. Generally speaking:

  • JPEG photos, movies, and music are unlikely to be compressed much further by gzip or zip, as these formats are already effectively compressed.
  • Compressing text files and program binaries will yield significant space (and therefore transmission-time) savings; a quick check follows below.
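You can verify this on your own files, since gzip can report sizes without modifying anything (the file names here are hypothetical):

$ gzip -9 -c main.c | wc -c      # typically a small fraction of the original
$ gzip -9 -c photo.jpg | wc -c   # usually barely smaller, sometimes even larger
$ wc -c main.c photo.jpg         # original sizes for comparison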
2

Technically yes, although on a LAN the gains would be small.

Basically, a network transfer goes through several stages: asking whether the destination is ready to receive your information, sending some data, and checking that it was received OK. Every individual file adds an extra 'new file incoming' and 'finished that file' exchange. So if you zip/tar everything together, you only incur one 'new file incoming' message and one 'finished that file' message, instead of hundreds or thousands of these messages when sending the files uncompressed.

Over a LAN it may take you longer to zip the files and then send them than to simply send them as-is. Change the medium to a WAN, though, and zipping is the way to go.
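If you want to measure which approach wins on your own network, a rough test is to time both on the same tree (the host and paths are placeholders):

$ time scp -r workspace remote_machine:/tmp/                          # file-by-file copy
$ time tar -cf - workspace | ssh remote_machine "tar -xf - -C /tmp"   # one stream

On many small files the second form usually wins, because it avoids the per-file round trips described above.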

1

Compression might take longer overall (compress, send, decompress) than simply moving the original files. However, a few large files often transfer better than a large number of very small ones, so packing them into a single file and then transferring it can be a good option.

1

In addition to the previous answers, you should also bear in mind the load that you will be putting on the network and how this will impact other users. For such an amount of data, particularly if the same data is going to multiple destinations, I would seriously consider using an external drive as the transfer medium.

  • +1 for common sense: External drives work great for local copying. Commented Aug 18, 2009 at 14:41
1

Yes. I have experienced this and use this method to back up large amounts of data. If your sole purpose in copying is to back up files to an external hard disk, then compressing all the files/folders into one or a few ZIP/RAR archives and copying those to the external drive certainly saves a huge amount of time.

Writing one large file to an external drive is far more efficient than writing millions of tiny individual files. The OS incurs a large overhead for every file it creates. When you compress everything and copy just a single file, it transfers and writes continuously, which saves a huge amount of time.
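As a sketch of that single-archive backup (the mount point and source path are assumptions; -C keeps the archive paths relative):

$ tar -czf /mnt/external/workspace-backup.tar.gz -C /home/user workspace
$ tar -xzf /mnt/external/workspace-backup.tar.gz -C /home/user   # to restore later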

