
I am trying to clean up a mess on an Ubuntu 12.04 LTS server running ZFS. Here's what zpool status shows:

  pool: TB2
 state: UNAVAIL
status: One or more devices could not be used because the label is missing
        or invalid.  There are insufficient replicas for the pool to continue
        functioning.
action: Destroy and re-create the pool from
        a backup source.
   see: http://zfsonlinux.org/msg/ZFS-8000-5E
  scan: none requested
config:

    NAME        STATE     READ WRITE CKSUM
    TB2         UNAVAIL      0     0     0  insufficient replicas
      sdd       ONLINE       0     0     0
      sde       ONLINE       0     0     0
      sdf       ONLINE       0     0     0
      sdg       ONLINE       0     0     0
      sdh       ONLINE       0     0     0
      sdi       ONLINE       0     0     0
      sdj       ONLINE       0     0     0
      sds       ONLINE       0     0     0
      sdt       UNAVAIL      0     0     0

  pool: TB4
 state: DEGRADED
status: One or more devices could not be used because the label is missing or
        invalid.  Sufficient replicas exist for the pool to continue
        functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: http://zfsonlinux.org/msg/ZFS-8000-4J
  scan: resilvered 2.52T in 16h41m with 0 errors on Tue Feb  6 09:27:46 2018
config:

    NAME                                              STATE     READ WRITE CKSUM
    TB4                                               DEGRADED     0     0     0
      raidz2-0                                        DEGRADED     0     0     0
        ata-Hitachi_HDS724040ALE640_PK1331PAG9MBVS    ONLINE       0     0     0
        ata-Hitachi_HDS724040ALE640_PK2311PAG8G71M    ONLINE       0     0     0
        ata-Hitachi_HDS724040ALE640_PK1331PAGH0LHV    ONLINE       0     0     0
        ata-Hitachi_HDS724040ALE640_PK2331PAG8MV3T    ONLINE       0     0     0
        spare-4                                       DEGRADED     0     0     0
          ata-Hitachi_HDS724040ALE640_PK2311PAG614MM  UNAVAIL      0     0     0
          ata-Hitachi_HDS724040ALE640_PK1331PAGH0EAV  ONLINE       0     0     0
        ata-Hitachi_HDS724040ALE640_PK2331PAGH2XRW    ONLINE       0     0     0
        ata-Hitachi_HDS724040ALE640_PK1331PAG7TGDS    ONLINE       0     0     0
        ata-Hitachi_HDS724040ALE640_PK1331PAGG3K0V    ONLINE       0     0     0
        ata-Hitachi_HDS724040ALE640_PK2311PAG59PYM    ONLINE       0     0     0
    spares
      ata-Hitachi_HDS724040ALE640_PK1331PAGH0EAV      INUSE     currently in use

errors: No known data errors

I want to do two things:

  1. Replace the faulty drive in pool TB4. This I know how to do.
  2. Completely destroy and recreate pool TB2.
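
For reference, I expect the replacement in step 1 to look roughly like this once the new drive is installed (the new disk's by-id name is obviously a placeholder):

    # swap the UNAVAIL member of raidz2-0 for the new disk
    zpool replace TB4 ata-Hitachi_HDS724040ALE640_PK2311PAG614MM \
        /dev/disk/by-id/ata-<new-disk-id>

    # once the resilver completes, the hot spare should return to the spare
    # list on its own; if it doesn't, detach it manually
    zpool detach TB4 ata-Hitachi_HDS724040ALE640_PK1331PAGH0EAV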

Normally, I'd just do a zpool destroy TB2 and start over. However, the previous admin used sd* names for TB2 and disk IDs for TB4. Looking at /dev/disk/by-id, I discovered that two of the TB4 drives (...71M and ...EAV) are symlinked to /dev/sdj and /dev/sds respectively, yet sdj and sds are both listed as members of the TB2 pool. I'm afraid that a zpool destroy TB2 will corrupt the TB4 drives, since the docs say that destroy writes to the member disks. Is there any way to get ZFS to simply forget about TB2 without actually writing anything?
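
For what it's worth, this is how I confirmed the overlap; the two IDs are taken from the TB4 listing above:

    # check which kernel device names the TB4 by-id entries currently resolve to
    readlink -f /dev/disk/by-id/ata-Hitachi_HDS724040ALE640_PK2311PAG8G71M   # -> /dev/sdj
    readlink -f /dev/disk/by-id/ata-Hitachi_HDS724040ALE640_PK1331PAGH0EAV   # -> /dev/sds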

I asked the previous admin why he used two different methods (/dev/sd* and by-id). He said that the assignment of drive letters to specific hard drives didn't seem to be repeatable from boot to boot, so when he created TB4, he used by-id. I guess this entanglement of TB2 and TB4 is a result of that.

My current thought is to do this:

  1. Shut down the machine.
  2. Pull all drives.
  3. Reboot.
  4. zpool destroy -f TB2
  5. Shut down and reinstall the TB4 drives.
  6. Reformat the TB2 drives on another machine.
  7. Reinstall the TB2 drives and create a new pool using disk IDs (not sd*).
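
Before pulling anything, I'm also planning to double-check which pool each of these disks' labels actually claims, with something like the loop below (untested; the label may live on the first partition rather than the whole disk, hence checking both):

    # print the pool name recorded in the ZFS label of each suspected TB2 member
    for d in sdd sde sdf sdg sdh sdi sdj sds sdt; do
        echo "=== $d ==="
        zdb -l /dev/$d    | grep -m1 " name: "
        zdb -l /dev/${d}1 | grep -m1 " name: "
    done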

Does this seem reasonable? Is there an easier way?

Thanks to anyone who can help me out of this mess.

Michael

  • Why not just export TB2 and then create it correctly? Pools are exported anyway when you shut down. On a pool that is currently “unavailable”, this shouldn’t cause any disk changes.
    – Daniel B
    Commented Jun 6, 2018 at 7:55

1 Answer


Your proposed method seems like it would work. However, it’s also needlessly complex. Instead, I would suggest:

  1. zpool export TB2. This will unmount everything associated with the pool, and your system won't try to remount (and possibly write to) those devices unless you run zpool import first.
  2. Repair TB4. (Or you can do this later.)
  3. zpool create <new pool> ..., referencing the disks by ID this time so the pools can't overlap again. You may have to force-create (-f), since ZFS may notice those disks were previously in use by an unimported pool. See the sketch after this list.
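
Put together, and using the device names from your question, it might look something like this; the new pool's name and layout are placeholders (raidz2 is shown only as an example):

    # 1. stop the system from touching TB2; no new labels are written
    zpool export TB2

    # 2. repair TB4 with 'zpool replace' as you planned (now or later)

    # 3. rebuild the old TB2 disks into a new pool, by ID this time;
    #    -f overrides the "disk is part of an exported pool" complaint
    zpool create -f <newpool> raidz2 \
        /dev/disk/by-id/ata-<tb2-disk-1> \
        /dev/disk/by-id/ata-<tb2-disk-2> \
        ...                              # and so on for the remaining TB2 disks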

If you want to do a dry-run of the procedure, I think you can create some volumes on TB4 (zfs create -V 5gb TB4/volume1) and make two “nested pools” out of those (zpool create testpool1 ...) with an overlapping volume. Everything else should work the same as above.
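
A rough sketch of that test setup; the volume and pool names are made up, and it assumes TB4 has a few spare GB:

    # carve three small volumes out of TB4 to stand in for disks
    zfs create -V 5G TB4/testvol1
    zfs create -V 5G TB4/testvol2
    zfs create -V 5G TB4/testvol3

    # testpool1 plays the role of TB2 and shares testvol2
    zpool create testpool1 /dev/zvol/TB4/testvol1 /dev/zvol/TB4/testvol2
    zpool export testpool1

    # testpool2 plays the role of TB4 and takes testvol2 over (-f overrides
    # the stale testpool1 label), leaving testpool1 broken much like TB2
    zpool create -f testpool2 /dev/zvol/TB4/testvol2 /dev/zvol/TB4/testvol3

    # the "healthy" pool should be untouched by any of this; verify it
    zpool status testpool2
    zpool scrub testpool2

It won't reproduce your situation exactly, but it gives you two throwaway pools that share a device, so you can convince yourself that recreating one of them leaves the other alone before you touch the real pools.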
