Based on your question and the comments below it, you're asking whether it's a good idea to have a pool striped between two unequal sized disks. The short answer is, there's nothing inherently problematic about this if:
- Your workload isn't performance-critical. If it is, use a uniform disk type across the entire pool. Otherwise, the disks could have different performance characteristics, which can create very subtle performance problems that are difficult to track down. (For instance, let's say you have two 10K RPM disks made by the same vendor in the same year, one 1TB and one 2TB. No problem, right? Unfortunately, there is one -- the 2TB disk will get roughly twice the sequential throughput, because its higher data density means more data passes under the head per rotation, even though max IOPS will be the same between the drives.)
- You're OK without any redundancy. Note that in any striping situation, you're increasing the likelihood of losing all your data, because you went from the probability of one disk failing to the probability of either disk A or disk B (or both) failing. Even with ZFS keeping multiple metadata copies, with a random ~half of the data missing you'll have a tough time recovering many complete / usable files from your pool.
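For reference, here's a minimal sketch of what creating such a striped pool looks like. The pool and device names are placeholders -- substitute your own (e.g. check `lsblk` first):

```shell
# Listing two whole disks with no mirror/raidz keyword makes each a
# top-level vdev, so ZFS dynamically stripes writes across both:
zpool create tank /dev/sdb /dev/sdc

# Verify the layout -- both disks should appear as top-level vdevs:
zpool status tank
```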
That said, there are still unwise ways to set this up. If one of the disks is an SSD and the other is an HDD, striping will ruin the performance gains you got from using an SSD and probably make you quite sad. In that situation, I'd recommend either:
- Use the larger HDD as the "main data disk" and then split up the SSD into two partitions: one large partition used as an L2ARC (cache) device to speed up reads of frequently-read data, and one small partition used as a ZIL (log) device to speed up synchronous write latencies. This solution is nice because it'll automatically cache the most beneficial stuff on the SSD, so you don't have to think too hard about balancing it. Also, you'll only lose all your data if you lose the HDD in this case (you could lose up to a few seconds of writes if the SSD dies, but that's much better than all your data, like in the striped case above).
- Create a separate pool for each disk, and manually keep stuff you want to be fast (OS, executables, libraries, swap, etc.) on the SSD, and stuff that's OK being slow (movies, photo albums, etc.) on the HDD. This is best if the machine will be rebooted frequently, because data cached in the L2ARC does not persist across reboots. (This is a big weakness in the current L2ARC story for personal computers IMO, but it is being actively worked on.) From a redundancy standpoint, you obviously only lose the stuff that was on the disk that failed.
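Both options above are a few commands each. This is a sketch with hypothetical device names -- here `/dev/sdb` is the HDD and `/dev/sdc` the SSD; for option 1, partition the SSD first (e.g. with `parted`) into a small log partition and a large cache partition:

```shell
# Option 1: HDD as the data disk, SSD split into log + cache.
zpool create tank /dev/sdb        # HDD holds the actual data
zpool add tank log /dev/sdc1      # small SSD partition as ZIL (log) device
zpool add tank cache /dev/sdc2    # large SSD partition as L2ARC (cache) device

# Option 2: one independent pool per disk, placed by hand.
zpool create fastpool /dev/sdc    # SSD: OS, executables, libraries, swap
zpool create slowpool /dev/sdb    # HDD: movies, photo albums, bulk data
```

Note that in option 1, `zpool add ... log` and `zpool add ... cache` attach the partitions as auxiliary devices rather than striping them into the data vdevs, which is what preserves the "only the HDD holds your data" property described above.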
--- Edit since these disks are virtualized ---
Since this is a VM, unless you've specified special parameters for performance of the disks, neither of the performance / redundancy criteria above should prevent you from creating the pool with two mismatched disk sizes. However, it'll be much easier to manage if you just use your virtualization platform to resize the original disk to the sum of the proposed disk sizes. To use that additional space inside the guest, you'll have to run `zpool online -e <pool> <disk>`, and since this is ZoL you may have to fix your partition table first, like in the instructions here.
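As a sketch of that expansion step (pool and device names are placeholders), after growing the virtual disk on the hypervisor side:

```shell
# Optional: have the pool pick up capacity automatically on future resizes.
zpool set autoexpand=on tank

# Expand this vdev to use the newly available space right now:
zpool online -e tank /dev/sdb

# Confirm the pool sees the extra capacity:
zpool list tank
```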
You should strongly prefer this approach because of the ease of management, but one very minor downside is that when you resize, ZFS can’t change its metaslab size. Metaslabs are an internal data structure used for disk space allocation, and until very recently ZFS always created 200 of them per disk regardless of the disk size (there is ongoing work to improve this). Therefore, when you increase the disk size from very small to very large, you could end up with a very high number of metaslabs, which uses a bit more RAM and a bit more disk space. This is not noticeable unless the disk size changes very dramatically (like 10G -> 1T), and even then only when you are pushing your machine to the limit on performance. The performance impact can usually be worked around by giving your VM a little more RAM.
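If you're curious how many metaslabs your pool ended up with after a resize, you can inspect them read-only with `zdb` (pool name is a placeholder):

```shell
# -m prints the metaslab layout per vdev; each line describes one
# metaslab and its allocation state. The output can be long, so trim it:
zdb -m tank | head -n 20
```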
One more question: is `primaryPool` on an SSD? If it is, then you might be better off using some of it as a cache for the 1TB disk anyway.