
After running unrar on a RAR archive on my 1 TB NTFS drive, I am left with a file that reportedly has a size of 86T.

Is it safe to delete such a file? What is the best way to safely get rid of it?

Of course, when removed, the file should be unlinked and any actual data belonging to other files should be left unaffected, but I really want to be sure on this one…

Edit 1:

du -h output in the directory containing the archive + the extracted file:

26M .
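
For comparison, the apparent size (the figure that would include the reported 86T, rather than the allocated blocks that du -h counts) could be checked with:

du -h --apparent-size .
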
Edit 2: Progress and Findings
  • chkdsk F: /scan found a corruption in the extracted file.
  • chkdsk F: /f reported Deleting corrupt attribute record (0x80, "") (most likely belonging to this file, as the record segment number matches)
  • The file had 0 size afterwards
  • The 0 size file could be deleted
  • chkdsk F: /scan finds no problems
Edit 3: For the Curious

Before fixing the file with chkdsk F: /f:

  • the file could not be deleted with Double Commander (the file was reported as already in use, by none other than Double Commander itself, probably while trying to read the whole file)
  • the file could not be deleted with rm (on Windows) due to a permission denied error (even with cmd running as Administrator)

2 Answers


That's not necessarily erroneously bigger than your whole drive. Many filesystems including NTFS and ext4 support sparse files, in which areas consisting entirely of 'zero' bytes (00 00 00 00 ...) do not have any disk extents allocated to them – such files can easily have an "apparent" size larger than the filesystem, while the real data allocation (aka Windows "size on disk") is smaller.

You can check whether the file is sparse by comparing du and du --apparent, or by listing files with the ls -s/--size option, or using xfs_io to list the individual extents:

$ echo Test > large.bin
$ truncate -s 10G large.bin
$ ls -l -s -h
4.0K -rw-r--r-- 1 root users 10G Jan 10 12:44 large.bin

$ du -h large.bin; du -h --apparent large.bin
4.0K    large.bin
10G     large.bin

$ xfs_io -r -c "fiemap -v" large.bin 
large.bin:
 EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
   0: [0..7]:          4118216..4118223     8   0x1
   1: [8..20971519]:   hole             20971512
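
If xfs_io is not available, GNU stat can show the same discrepancy between apparent size and allocated blocks. A minimal sketch, reusing the large.bin file created above (the output should look roughly like this):

$ stat -c 'apparent: %s bytes, allocated: %b blocks of %B bytes' large.bin
apparent: 10737418240 bytes, allocated: 8 blocks of 512 bytes

A sparse file reports far fewer allocated blocks than its apparent size would need; a fully allocated 86T file could not physically exist on a 1 TB volume at all.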

If in doubt, connect the filesystem to a Windows system (a VM might do) and run chkdsk X: /scan or chkdsk X: /f and let it verify that no files overlap – or, after deleting the file, use it to verify that the "free space bitmap" doesn't disagree with existing files.

In general, programs cannot create files actually larger than the filesystem can hold: even file archivers do not have that kind of direct disk access. If you do end up with an erroneously large file that isn't sparse, it can only be the fault of the OS filesystem driver, in which case nothing you do with that filesystem from that point onwards can be guaranteed to be safe (deletion is done by the same "bad" driver, after all). Use Windows' CHKDSK to verify the filesystem.

For NTFS on Linux, consider switching between the built-in ntfs3 and the earlier ntfs-3g drivers to verify whether they behave the same. Try also extracting your archive on a Linux native filesystem (e.g. on ext4) to see whether it also creates a large file.
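
A minimal sketch of how to check which driver currently backs the mount and how to remount with the other one; the device /dev/sdb1 and mount point /mnt/data are placeholders for your own:

$ findmnt -no FSTYPE /mnt/data   # "ntfs3" = in-kernel driver, "fuseblk" = FUSE-based ntfs-3g
$ sudo umount /mnt/data
$ sudo mount -t ntfs3 /dev/sdb1 /mnt/data   # in-kernel driver (kernel 5.15 or newer)
$ sudo umount /mnt/data
$ sudo ntfs-3g /dev/sdb1 /mnt/data          # FUSE driver from the ntfs-3g package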

  • Interesting. But I still feel like the file size is a mistake, because 7zip on Windows produced a file with a reasonable file size (13M). I will try checking and let you know.
    – IsawU
    Commented Jan 10, 2023 at 11:02
  • It appears that chkdsk F: /f did the trick. The file size afterwards was 0 and it could be deleted.
    – IsawU
    Commented Jan 10, 2023 at 15:58
  • For completeness: truncate -s 100T is a common way to create a sparse file with arbitrary "apparent size" and minimal allocated size (usually a single extent, like 4 KiB).
    – iBug
    Commented Jan 10, 2023 at 19:12
  • @IsawU: you mean 7zip on Windows extracted the same file from the same RAR archive and it was only 13M? Matching what unrar v foo.rar or 7z l foo.rar showed? The fact that this happened in the first place is worrying; test your RAM to see if it's failing; that's one possible source of corruption of data that was about to be written to disk.
    – Peter Cordes
    Commented Jan 12, 2023 at 2:50
  • @PeterCordes unrar v myarchive.rar indeed shows one file with around 13 M of size. I'm running sudo memtester 10000 5 (although I have 32 GB of RAM, so that was probably not the most "scientific" size to choose). I'm 4 out of 5 tests deep and so far no errors. I might attempt the unrar again, as there wasn't any long-lasting negative effect and I do have good enough backups.
    – IsawU
    Commented Jan 13, 2023 at 12:23

In case there is a problem with the disk, I suggest, as a first step, making sure that you have backups of the data on it.

As the disk is NTFS, it's best handled on Windows as follows.

As a second step, open a Command Prompt as Administrator and enter the following command, using the letter of the affected drive (F: in the question):

chkdsk F:

If it finds any errors, the next step is to fix them using the command:

chkdsk F: /f

As the last step, if everything completes correctly, you can delete the file.
