Some time ago one of my hard drives, a 3TB Seagate ST3000DM001, stopped mounting. It had worked fine up to the day before, except that a week earlier I got some errors from a single file. Some GUI applications gave me errors about it, but cp created a copy that I was able to validate, even after moving it back to the original path to replace the original file. I didn't get any more errors about that file.
I've got Kubuntu 16.04. Starting up the computer with the drive attached takes an unusually long time, and there's a strange clicking sound from the drive. Since the drive wouldn't mount anymore, I've checked and tried a few things.
syslog:
/var/log/syslog.1:Dec 12 23:06:45 grimripper-desktop systemd[1]: dev-disk-by\x2duuid-32798b9c\x2da158\x2d42f4\x2db0d0\x2dec13f1d9f287.device: Job dev-disk-by\x2duuid-32798b9c\x2da158\x2d42f4\x2db0d0\x2dec13f1d9f287.device/start timed out.
Currently there are seven of those from last night but none from today.
sudo fdisk -l /dev/sdb
fdisk: laitetta /dev/sdb ei voi avata: I/O-virhe
(It says device /dev/sdb can't be opened: I/O error)
sudo fsck -n /dev/sdb
fsck – util-linux 2.27.1
e2fsck 1.42.13 (17-May-2015)
fsck.ext2: Attempt to read block from filesystem resulted in short read yritettäessä avata /dev/sdb
Could this be a zero-length partition?
(It says short read while trying to open /dev/sdb)
sudo file -s /dev/sdb
/dev/sdb: ERROR: cannot read `/dev/sdb' (Input/output error)
sudo smartctl -a /dev/sdb
smartctl 6.5 2016-01-24 r4214 [x86_64-linux-4.4.0-210-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Device Model: ST3000DM001
Serial Number: W1F1SHNJ
LU WWN Device Id: 5 000c50 05da67bd0
Firmware Version: CC24
User Capacity: 137 438 952 960 bytes [137 GB]
Sector Size: 512 bytes logical/physical
Rotation Rate: 7200 rpm
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: ATA8-ACS T13/1699-D revision 4
SATA Version is: SATA 3.0, 6.0 Gb/s
Local Time is: Fri Oct 22 22:25:31 2021 EEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
Read SMART Data failed: scsi error badly formed scsi parameters
=== START OF READ SMART DATA SECTION ===
SMART Status command failed: scsi error badly formed scsi parameters
SMART overall-health self-assessment test result: UNKNOWN!
SMART Status, Attributes and Thresholds cannot be read.
Read SMART Log Directory failed: scsi error badly formed scsi parameters
Read SMART Error Log failed: scsi error badly formed scsi parameters
Read SMART Self-test Log failed: scsi error badly formed scsi parameters
Selective Self-tests/Logging not supported
I tried Analyse -> Quick Search with TestDisk, but it didn't find any partitions.
Today I've had ddrescue running (the drive changed from sdb to sdd when I added a new drive):
sudo ddrescue /dev/sdd sdd.dd sdd.log
GNU ddrescue 1.19
Press Ctrl-C to interrupt
rescued: 0 B, errsize: 4142 MB, current rate: 0 B/s
ipos: 129520 kB, errors: 1, average rate: 0 B/s
opos: 129520 kB, run time: 13.47 h, successful read: 13.48 h ago
Scraping failed blocks... (forwards)
It hasn't been able to rescue anything, and it's been going at about 9 MB/h, which would mean decades (roughly 38 years) to finish a 3 TB drive at that speed.
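For a rough sense of scale, here is the arithmetic as a small Python sketch. It assumes a nominal 3 TB (decimal) capacity and a steady 9 MB/h, neither of which is exact, and the function name is just for illustration:

```python
def estimate_years(total_bytes: float, rate_bytes_per_hour: float) -> float:
    """Return how many years a full pass would take at a constant rate."""
    hours = total_bytes / rate_bytes_per_hour
    return hours / (24 * 365.25)


drive_size = 3e12  # assumed nominal capacity: 3 TB in decimal units
rate = 9e6         # assumed steady rate: about 9 MB per hour

years = estimate_years(drive_size, rate)
print(f"about {years:.0f} years")  # about 38 years
```

The rate is only a guess from how far ipos had moved, so the result is an order of magnitude at best.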
I've got about 40 million lines from today in syslog and kern.log, mostly repeating this:
Dec 13 08:27:45 grimripper-desktop kernel: [ 825.485325] ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Dec 13 08:27:45 grimripper-desktop kernel: [ 825.485336] ata5.00: irq_stat 0x40000001
Dec 13 08:27:45 grimripper-desktop kernel: [ 825.485344] ata5.00: failed command: READ DMA
Dec 13 08:27:45 grimripper-desktop kernel: [ 825.485359] ata5.00: cmd c8/00:00:00:00:00/00:00:00:00:00/e0 tag 28 dma 131072 in
Dec 13 08:27:45 grimripper-desktop kernel: [ 825.485359] res 51/04:00:00:00:00/00:00:00:00:00/e0 Emask 0x1 (device error)
Dec 13 08:27:45 grimripper-desktop kernel: [ 825.485367] ata5.00: status: { DRDY ERR }
Dec 13 08:27:45 grimripper-desktop kernel: [ 825.485373] ata5.00: error: { ABRT }
Dec 13 08:27:45 grimripper-desktop kernel: [ 825.485537] ata5.00: configured for UDMA/133
Dec 13 08:27:45 grimripper-desktop kernel: [ 825.485551] ata5: EH complete
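To see what those millions of lines actually consist of, the timestamps can be stripped and the remaining messages counted. A minimal sketch, assuming the syslog line layout shown above (the regular expression and function name are my own, not from any tool):

```python
import re
from collections import Counter

# Matches e.g. "Dec 13 08:27:45 host kernel: [ 825.485325] message"
# and captures only the message part after the kernel timestamp.
KERNEL_LINE = re.compile(
    r"^\w{3}\s+\d+\s+[\d:]{8}\s+\S+\s+kernel:\s+\[\s*\d+\.\d+\]\s+(.*)$"
)


def count_kernel_messages(lines):
    """Count identical kernel messages, ignoring date and uptime stamps."""
    counts = Counter()
    for line in lines:
        match = KERNEL_LINE.match(line.rstrip("\n"))
        if match:
            counts[match.group(1)] += 1
    return counts


sample = [
    "Dec 13 08:27:45 grimripper-desktop kernel: [ 825.485344] ata5.00: failed command: READ DMA",
    "Dec 13 08:27:46 grimripper-desktop kernel: [ 826.112233] ata5.00: failed command: READ DMA",
    "Dec 13 08:27:45 grimripper-desktop kernel: [ 825.485551] ata5: EH complete",
]
for message, n in count_kernel_messages(sample).most_common():
    print(n, message)
```

For the real log, an open file object such as open("/var/log/kern.log") could be passed in place of the sample list.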
I've found a few other tools similar to ddrescue, but seeing how ddrescue is faring, trying them seems a bit pointless and would probably take just as long.
Is there any way to recover the data on this drive?
EDIT 2021-12-17
It works! Maybe? After a combination of tapping (as per confused's answer), spinning, turning it upside down (as in the question davidgo linked), and running ddrescue in reverse (like davidgo suggested), ddrescue appears to be reading from the drive. Or maybe it was the Lidl chicken fillet I sacrificed the night before I started this ddrescue run that did the trick.
GNU ddrescue 1.19
Press Ctrl-C to interrupt
rescued: 4858 GB, errsize: 0 B, current rate: 55705 kB/s
ipos: 3142 GB, errors: 0 B, average rate: 35796 kB/s
opos: 3142 GB, run time: 1.57 d, successful read: 0 s ago
Copying non-tried blocks... Pass 1 (backwards)
There have been no more of those syslog and kern.log errors since I interrupted the previous ddrescue run (after the tapping, spinning and turning upside down, with non-reverse ddrescue).
Except now it has read some 4.9 TB from a 3 TB drive, and judging from the slowly diminishing ipos and opos it seems it still has 3.1 TB to go? The target file is shown as 7.3 TiB in size, even though it's on a 5.5T drive with 4.5T used. Other content on the drive is about 62G.
I've no idea what's happening here. Is it going to keep reading until it reaches the 7.3 TiB reported as the target file's size, or until there's no more space on the drive? Do I dare try just mounting it and reading it normally since it appears to be working now?
I don't even remember what ddrescue originally showed as ipos and opos so I could try and estimate based on how much those have changed how long this might take or how much it might write. But at least I have two more chicken fillets to sacrifice over the weekend.
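One way to check whether that 7.3 TiB is only the file's apparent size rather than space actually used on the target drive is to compare st_size with the allocated blocks. A minimal sketch, assuming the image is the sdd.dd file from the ddrescue command above and that st_blocks is counted in 512-byte units, as it is on Linux (the function name is made up for this example):

```python
import os


def file_sizes(path):
    """Return (apparent_size, allocated_size) in bytes for a file."""
    st = os.stat(path)
    apparent = st.st_size           # length the file claims to have
    allocated = st.st_blocks * 512  # space actually allocated on disk
    return apparent, allocated


path = "sdd.dd"  # assumed image name, taken from the ddrescue command
if os.path.exists(path):
    apparent, allocated = file_sizes(path)
    print(f"apparent:  {apparent / 2**40:.2f} TiB")
    print(f"allocated: {allocated / 2**40:.2f} TiB")
```

If the allocated figure is much smaller than the apparent one, the file has holes in it and the larger number doesn't reflect what has really been written.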
ST3000DM001 HDDs were the subject of a class-action lawsuit against Seagate a few years back, due to over a third of these drives critically failing for no reason. Try a different known-good SATA cable and an external caddy, but if neither works and you need the data off the HDD, you'll likely need to pay a data recovery company with a cleanroom to recover the raw data directly from the platters. (The clicking sound is the heads reaching the end of the platters and being reset again by the voice coil.)