I have a list of about 200 servers, each of which contains links to individual files. For this question, let's just pretend they're .txt files. I need to download every file. However, some servers have only the compressed version and not the original, while others have both. The compression used is bzip2.
That means a server could have the following files:
foo.txt.bz2
bar.txt
bar.txt.bz2
I've told wget to download only .txt and .txt.bz2 files, and I'm using no-clobber to prevent the same file from being downloaded from each server. However, once a compressed file is downloaded, it is decompressed; the decompressed file is kept and the .bz2 file is not. This means that wget downloads the same .bz2 files from every single server, because it never finds a compressed version locally.
How do I tell wget not to download a .bz2 file when it already has the decompressed version (e.g., don't download foo.txt.bz2 if foo.txt already exists)?
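One workaround that might achieve this, given that no-clobber skips any file that already exists locally, is to leave an empty placeholder named foo.txt.bz2 next to every foo.txt before running wget against the next server. This is only a minimal sketch under that assumption, not a tested solution; the function name make_bz2_placeholders is hypothetical and the directory argument is assumed to be wherever wget saves its files.

```shell
#!/bin/sh
# Hypothetical helper (name is illustrative, not part of wget):
# for every NAME.txt in the given directory, create an empty NAME.txt.bz2
# placeholder so that a later `wget --no-clobber` run sees the compressed
# file as already present and skips it.
make_bz2_placeholders() {
    dir=${1:-.}                           # default to the current directory
    for f in "$dir"/*.txt; do
        [ -e "$f" ] || continue           # the glob matched nothing
        [ -e "$f.bz2" ] || : > "$f.bz2"   # never overwrite a real .bz2 file
    done
}
```

It would be called between servers, for example `make_bz2_placeholders /path/to/downloads`. The decompression step would then need to skip the zero-byte placeholders, for instance by checking `[ -s "$file" ]` before decompressing.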
Thanks