
So, to make a long story short(er): I suspect I had a bad SATA card. I have an 8-drive storage pool in Storage Spaces on Windows 11 (single parity) that had been running great for a few months. Suddenly it started going offline, but restarting the computer and making doubly sure the cables were secure fixed it for a while. Then it stopped working entirely, and Storage Spaces would just freeze if I tried to open it. Today I swapped in a new SATA card and all the drives popped back up, with no more freezing. Great. CrystalDiskInfo, Hard Disk Sentinel, and Windows all report the drives as OK/PERFECT. The pool showed as offline, but otherwise healthy. The Event Viewer had a few of these:

[Screenshot of the Event Viewer entries; image not included]

I figured, no problem, and ran those commands, but the volume didn't come back up. After a bunch of fiddling, I realized running this command:

Get-VirtualDisk | ?{ $_.ObjectId -Match "{bb97ba58-8273-4e7d-95c1-9eb0fa705f15}" } | Get-Disk | Set-Disk -IsOffline  $false

Brings the drive back up just fine in a read-only state. Big relief. At least if all else fails, I can get the data off and try again.

But when I run this command:

Get-VirtualDisk | ?{ $_.ObjectId -Match "{bb97ba58-8273-4e7d-95c1-9eb0fa705f15}" } | Get-Disk | Set-Disk -IsReadOnly $false                  

Or otherwise try to bring the drive online in read/write mode via Disk Management or Storage Spaces, I get an error and the volume goes away. The pool still reads as healthy, but it won't go into write mode. I can't find any other errors that point to a specific drive being bad. Here are some additional diagnostics:

get-storagepool -isprimordial 0

FriendlyName OperationalStatus HealthStatus IsPrimordial IsReadOnly      Size AllocatedSize
------------ ----------------- ------------ ------------ ----------      ---- -------------
Storage pool OK                Healthy      False        False      160.07 TB      16.48 TB

get-storagepool -isprimordial 0 | get-physicaldisk

Number FriendlyName         SerialNumber MediaType CanPool OperationalStatus HealthStatus Usage           Size
------ ------------         ------------ --------- ------- ----------------- ------------ -----           ----
2      ST22000NM001E-3HM103 ZX20E51C     HDD       False   OK                Healthy      Auto-Select 20.01 TB
4      ST22000NM001E-3HM103 ZX201Y4F     HDD       False   OK                Healthy      Auto-Select 20.01 TB
6      ST22000NM001E-3HM103 ZX207NJP     HDD       False   OK                Healthy      Auto-Select 20.01 TB
9      ST22000NM001E-3HM103 ZX204FW5     HDD       False   OK                Healthy      Auto-Select 20.01 TB
8      ST22000NM001E-3HM103 ZX205GC4     HDD       False   OK                Healthy      Auto-Select 20.01 TB
7      ST22000NM001E-3HM103 ZX203GLT     HDD       False   OK                Healthy      Auto-Select 20.01 TB
3      ST22000NM001E-3HM103 ZX214L6H     HDD       False   OK                Healthy      Auto-Select 20.01 TB
5      ST22000NM001E-3HM103 ZX2078LS     HDD       False   OK                Healthy      Auto-Select 20.01 TB

get-storagepool -isprimordial 0 | get-virtualdisk

FriendlyName  ResiliencySettingName FaultDomainRedundancy OperationalStatus HealthStatus      Size FootprintOnPool StorageEfficiency
------------  --------------------- --------------------- ----------------- ------------      ---- --------------- -----------------
Storage space Parity                1                     OK                Healthy      121.18 TB        16.48 TB            66.65%

get-storagepool -isprimordial 0 | get-volume

DriveLetter FriendlyName  FileSystemType DriveType HealthStatus OperationalStatus  SizeRemaining      Size
----------- ------------  -------------- --------- ------------ -----------------  -------------      ----
X           Storage space NTFS           Fixed     Warning      Full Repair Needed     110.23 TB 121.18 TB

The "Full Repair Needed" is replaced by "Healthy" when I take the pool offline. But I figured I'd try repairing. When I run:

Repair-Volume -DriveLetter X -Scan
NoErrorsFound

I get the same result with the other troubleshooting options documented here: https://learn.microsoft.com/en-us/powershell/module/storage/repair-volume?view=windowsserver2022-ps
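Since the volume scan comes back clean, the state flags on the virtual disk and its disk object may say more about why the space refuses to go read/write. The following is only a read-only sketch, not something run above: it reuses the ObjectId filter from the earlier commands, and it assumes an elevated PowerShell session. No output is shown because none was captured.

```powershell
# Diagnostic sketch only: reads state, changes nothing.
# The GUID is the same virtual disk ObjectId fragment used in the Set-Disk commands.
$vd = Get-VirtualDisk | Where-Object { $_.ObjectId -match "{bb97ba58-8273-4e7d-95c1-9eb0fa705f15}" }

# Virtual disk side: whether Storage Spaces reports a detach reason or manual attach.
$vd | Format-List FriendlyName, OperationalStatus, HealthStatus, DetachedReason, IsManualAttach

# Disk side: the offline and read-only flags that Set-Disk toggles.
$vd | Get-Disk | Format-List Number, IsOffline, OfflineReason, IsReadOnly, OperationalStatus

# Any repair or regeneration jobs currently running on the storage subsystem.
Get-StorageJob
```

Running it once while the space is read-only, and again right after a failed attempt to clear the read-only flag, would show which of these values changes when the volume drops.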

At this point, I'm at a loss. If there is a bad drive, I don't see any way to identify it. If there isn't, I don't see how I can repair the space. The only option I can think of is to buy an extra drive, move all the data off, reformat and rebuild the pool, and hope this doesn't happen again. Does anyone have ideas for things I can check before going down that route?
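One check that might help single out a misbehaving drive is the per-disk reliability counters, which can differ between drives even when every HealthStatus reads Healthy. This is a hedged sketch rather than a known fix: it assumes an elevated session and that the drives and the new SATA card actually expose these counters, so some fields may come back empty.

```powershell
# Diagnostic sketch only: reads counters, changes nothing.
# Assumption: the controller passes these counters through; blank values are possible.
Get-StoragePool -IsPrimordial $false |
    Get-PhysicalDisk |
    ForEach-Object {
        $counters = $_ | Get-StorageReliabilityCounter
        # Pair each drive's serial number with its error and latency counters.
        [pscustomobject]@{
            SerialNumber           = $_.SerialNumber
            ReadErrorsTotal        = $counters.ReadErrorsTotal
            ReadErrorsUncorrected  = $counters.ReadErrorsUncorrected
            WriteErrorsTotal       = $counters.WriteErrorsTotal
            WriteErrorsUncorrected = $counters.WriteErrorsUncorrected
            ReadLatencyMax         = $counters.ReadLatencyMax
            WriteLatencyMax        = $counters.WriteLatencyMax
        }
    } | Format-Table -AutoSize
```

A drive whose error counts or maximum latencies stand well apart from the other seven would be the first candidate to look at more closely.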
