I've got a new 8TB SATA hard disk that I bought to back up my Ubuntu 16.04 data disks. It's connected to my box via a Technet USB3 docking station (which I've used before without problems).
My backup script is essentially:
/sbin/mkfs.ext4 -c /dev/sdg1    # -c: check for bad blocks before formatting
mount /dev/sdg1 /mnt/backup
mkdir -p /mnt/backup/data1
cp -a /my-pool/my-data1/* /mnt/backup/data1
mkdir -p /mnt/backup/data2
cp -a /my-pool/my-data2/* /mnt/backup/data2
umount /dev/sdg1
This time I ran mkfs separately, after partitioning the disk, rather than as part of the script. mkfs reported it was using 4K blocks when formatting the filesystem. It seemed to finish quite quickly compared to when I've formatted smaller drives with ext3.
The script copies about 4.5 TB from the ZFS filesystems data1 and data2. No errors were shown at all during this process.
Since this was a new disk I thought I would be paranoid and run /sbin/fsck.ext4 against /dev/sdg1, expecting everything to be fine. However, fsck complained that the disk had not been cleanly unmounted (even though it had been, and the system had been powered on continuously) and then proceeded to find large numbers of errors. I got fed up responding manually, so I quit fsck and re-ran it as /sbin/fsck.ext4 -y /dev/sdg1. It found hundreds of errors: multiply-claimed blocks, checksum errors, and others that I didn't manage to note down before fsck filled the terminal history.
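In hindsight, a forced read-only pass with the output captured to a file would have preserved the errors I lost to the terminal scrollback (a sketch; it modifies nothing on disk):

```shell
# -f: force a full check even if the filesystem is marked clean.
# -n: answer "no" to every prompt, so fsck only reports problems.
# tee keeps a complete log of everything fsck prints.
fsck.ext4 -fn /dev/sdg1 2>&1 | tee fsck-sdg1.log
```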
Does this mean the hard drive itself is likely bad, or is there another explanation?
My understanding is that umount should have flushed any pending writes to the disk, so I don't understand why fsck would claim the drive was not cleanly unmounted.
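For what it's worth, next time around I plan to sync explicitly before unmounting and then inspect the superblock to confirm the clean state (a sketch, assuming e2fsprogs is installed):

```shell
# Flush any pending writes, then unmount.
sync
umount /dev/sdg1
# After a clean unmount the superblock should report
# "Filesystem state: clean".
dumpe2fs -h /dev/sdg1 | grep -i 'filesystem state'
```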
Are there some special gotchas with ext4 and large drives, or something else I'm likely doing wrong? Here is a sample of the fsck output:
File ??? (inode #85393418, mod time Thu Jul 20 08:01:35 2017)
has 1 multiply-claimed block(s), shared with 1 file(s):
... (inode #208650243, mod time Tue Dec 24 21:15:08 2097)
Clone multiply-claimed blocks? yes
File ??? (inode #85393421, mod time Thu Jul 20 07:58:56 2017)
has 1 multiply-claimed block(s), shared with 1 file(s):
... (inode #208650243, mod time Tue Dec 24 21:15:08 2097)
Clone multiply-claimed blocks? yes
File ??? (inode #85393422, mod time Thu Jul 20 07:59:38 2017)
has 3 multiply-claimed block(s), shared with 2 file(s):
... (inode #208650243, mod time Tue Dec 24 21:15:08 2097)
... (inode #211472387, mod time Sat Jul 15 23:06:03 2056)
Clone multiply-claimed blocks? yes
File ??? (inode #85393423, mod time Thu Jul 20 07:58:28 2017)
has 3 multiply-claimed block(s), shared with 1 file(s):
... (inode #208650243, mod time Tue Dec 24 21:15:08 2097)
Clone multiply-claimed blocks? yes
Also weird: the same inode (208650243) appears in every error I've looked at.
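To check whether the drive itself is failing, I can at least query its SMART data. A sketch, assuming smartmontools is installed; note that USB-SATA bridges like the docking station often need an explicit `-d sat` to pass SMART commands through:

```shell
# Overall health self-assessment (PASSED/FAILED).
smartctl -H -d sat /dev/sdg
# Full attribute and error-log dump; Reallocated_Sector_Ct and
# Current_Pending_Sector are the attributes to watch.
smartctl -x -d sat /dev/sdg
```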
UPDATE: fsck.ext4 is still running! I left it going since I felt there was at least a chance it would complete and leave me with a usable filesystem, but it's been running for 16+ hours now.
UPDATE2: I had thought it might be because I started using the drive before jbd2 had finished creating all the journalling structures (which apparently happens lazily, see https://askubuntu.com/questions/119742/ext4-jbd2-journaling-active-even-on-empty-filesystem). I wiped the existing partition table, created a new partition, and waited ~6 hours until jbd2 appeared to stop writing to the disk. I then used rsync to back up the filesystem. All seemed good. I waited an hour or so and then used umount to unmount the device. However, I got the same error when I tried to re-mount it:
mount: mount /dev/sdg1 on /mnt/bdavepool failed: Structure needs cleaning
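If lazy initialisation really is the culprit, I could rule it out by formatting with it disabled, so that mkfs writes out all the inode tables and the journal up front instead of in the background (a sketch; this makes mkfs much slower on an 8TB disk):

```shell
# Disable lazy initialisation of the inode tables and the journal,
# so the filesystem is fully initialised before first use.
mkfs.ext4 -E lazy_itable_init=0,lazy_journal_init=0 /dev/sdg1
```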