
A colleague (anonymized to protect the innocent, the guilty and wherever in this range accidental demon summoners fall) followed a tutorial that included combining a handful of gzipped files with

zcat *gz | pigz --fast -c -p 16 > outfile.gz

The files in question thus were all in the same directory, which was on an NTFS-formatted network share (accessible from both Linux and Windows machines).

He started the process on an Ubuntu machine, went for lunch, and came back to an implausibly large monster of a file and the process still running. He killed the process, deleted the file in the file explorer on his Windows machine (or so he thought), and asked me to help troubleshoot. When we combined the files more sensibly (cat *gz > new_outfile.gz), we noticed that cat complained about outfile.gz not existing.

Well, we'd just deleted it, so it shouldn't, but ls on the Ubuntu machine and a refresh of the file explorer on the Windows one revealed it was back.

I got curious and tried to see what was going on.

file outfile.gz told me this was a "writeable, regular file; no read access".

ls -l in the directory showed the file permissions as -rw-rw-rw-.

Trying to look at the start of the file with zcat outfile.gz | head gave me gzip: outfile.gz: no such file or directory.

After some more unsuccessful poking, I decided to just try and delete the file in the terminal (sudo rm outfile.gz since deleting as a regular user on Windows didn't work and I was hoping this'd make it stick).

And was met with rm: outfile.gz: no such file or directory.

I can exclude hidden characters that didn't get tab-completed (as suggested for another zombie file mystery) - ls -b shows the filename without any escape sequences. The Windows and Ubuntu machines mostly agree that the file is there, except for when I actually want to do something with it.
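For completeness, a few more sanity checks along the same lines; nothing below is specific to my setup except the filename, and the last line just confirms how the directory is actually mounted (I'd still have to double-check whether it's cifs under the hood):

ls | cat -A | grep outfile    # makes non-printing characters in directory entries visible, if there are any
ls -li outfile.gz             # inode number and metadata as the Linux client sees them
stat outfile.gz               # more detailed metadata (or the same "No such file" error)
findmnt -T .                  # which filesystem/mount this directory actually lives on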

Having looked at the files, there's what looks to be the result of another attempt, and it behaves the same way.

What exactly happened here? Did we manage to summon Filethulhu with what just looked like a less efficient way to combine files? (Another colleague apparently managed to combine the files without a hitch, but had separate directories for input and output.) And how exactly do we get rid of this 70+ GB eldritch abomination sitting in our share?

  • These are two separate issues: (1) A filesystem (possibly network filesystem) incoherence. (2) Why the large file? If foo.test does not exist then echo *.test 3>foo.test will not show foo.test because *.test is expanded before 3>foo.test creates a new file. But zcat *gz | pigz --fast -c -p 16 > outfile.gz may read outfile.gz because the two parts of the pipeline run in parallel and the redirection in the second part may happen before *gz is expanded. Try rm foo.test; echo *.test >/dev/tty | : >foo.test many times. Sometimes you get foo.test printed, sometimes not. Commented Dec 9, 2022 at 17:48
  • @KamilMaciorowski: it's getting a bit late in my timezone and it was a... slightly long day, so to make sure I got it correctly: what happens is that outfile.gz gets created once, and then the zcat/pigz pipeline keeps reading that file and feeding it into itself? Commented Dec 9, 2022 at 18:36
  • Yes, it does. It probably processes some other file(s) first, so outfile.gz grows before zcat starts reading it. Then zcat reads the file from the beginning, but pigz writes to the end. The reading process is always behind the writing one, so the file grows and grows. Commented Dec 9, 2022 at 19:05

2 Answers


The files in question thus were all in the same directory, which was on an NTFS-formatted network share (accessible from both Linux and Windows machines).

Behind this statement are a lot of moving parts:

  • samba or smbfs (Linux applications that I am betting are providing your network connectivity to a Windows fileshare)

  • Network

  • Windows general weirdness with filename lengths

  • Differences between Windows and Linux on things like file locking, etc. and samba/smbfs having to translate.

If, for example, there was a network issue, then the Linux and Windows sides might have different ideas of where things stand, so one side might claim a file exists when it doesn't on the other.

And how exactly do we get rid of this 70+ GB eldritch abomination sitting in our share?

  • Stop samba or smbfs on your Linux box, restart the Windows system, delete the file on Windows, then restart samba/smbfs on Linux.
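A rough sketch of the Linux-side commands for this, assuming an Ubuntu machine; /mnt/yourshare is a placeholder for wherever the share is mounted, and smbd/nmbd are the usual Ubuntu service names. Use whichever of the two scenarios applies:

# if the Linux box only mounts the Windows share as a client:
sudo umount /mnt/yourshare
# ... reboot the Windows machine and delete the file there ...
sudo mount /mnt/yourshare      # works as-is if the share has an /etc/fstab entry

# if the Linux box runs Samba services itself:
sudo systemctl stop smbd nmbd
# ... reboot the Windows machine and delete the file there ...
sudo systemctl start smbd nmbd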

To figure out what went wrong exactly, I would begin by looking at log files on the Linux side, possibly in /var/log/samba. There should be a file there named after the computer name or IP address of the Windows system, and looking at it could reveal clues.
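As a rough starting point (assuming default Ubuntu log locations; the exact file names depend on the Samba configuration):

ls -l /var/log/samba/                        # typically one log per client, named log.<machine-name-or-IP>
sudo grep -i outfile /var/log/samba/log.*    # anything mentioning the problem file
dmesg | grep -i cifs                         # kernel-side CIFS client errors, if the share is mounted via cifs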

  • Windows and Linux (at least when using ls or file) agree the file exists, and the filename is shorter than those of other, completely manageable, files in the same dir. (However, I just realized that my colleague actually tried twice and thus produced two files, so whatever it is is reproducible.) Commented Dec 9, 2022 at 17:49

The way to get rid of the files turned out to be simple: the files were undeletable because the CIFS kernel driver held a lock on them, and a reboot of the Linux machine took care of that.
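With hindsight, a few things one could check before resorting to a reboot; /mnt/yourshare stands in for the actual mount point, and I can't say whether any of these would actually have been enough in our case:

sudo lsof +D /mnt/yourshare | grep outfile    # is any local process still holding the file open?
sudo fuser -v /mnt/yourshare/outfile.gz       # same question, asked a different way
sudo cat /proc/fs/cifs/DebugData              # open handles as seen by the cifs kernel module
sudo umount /mnt/yourshare                    # unmounting and remounting is gentler than a reboot, if nothing else uses the share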

(And while that wasn't the reason for it being undeletable, @KamilMaciorowski had a good explanation of how we'd gotten ourselves into that situation to begin with: outfile.gz was created by the redirection in the second part of the pipeline before *gz was expanded in the first part; some of the other files in the directory got processed first, and then the growing outfile.gz was fed into itself until the process was killed.)
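Kamil's test from the comments, wrapped in a loop so the race is easier to watch; run it in an empty scratch directory (the names foo.test and *.test are arbitrary):

cd "$(mktemp -d)"
for i in $(seq 20); do
    rm -f foo.test
    # both sides of the pipe start in parallel: sometimes ": >foo.test" has
    # already created foo.test by the time the left side expands *.test
    echo *.test >/dev/tty | : >foo.test
done
# the output alternates between the literal "*.test" (the glob found nothing yet)
# and "foo.test" (the redirection on the right-hand side won the race)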

