Optimal ZFS Pool configuration for a home NAS

Question

I am looking to upgrade my home NAS, which I use to store movies (with Plex Media Server), pictures and backups.

Currently, it has 5 3TB disks in a raidz1 pool.

I was limited by the number of SATA ports, but with the acquisition of an 8-port controller, I now have 14 SATA ports (minus 1 for the system, so 13 ports).

As I intend to keep this server for a few years, I want to build it in a performant and future-proof way. I read that it's recommended to use SSDs to store the ZIL and the L2ARC for better write and read performance.

For the ZIL, I read that a few GB is enough but it's recommended to mirror it.

So, this is the setup I thought about :

1 System SSD
11 Storage disks (raidz2)
2 SSDs sliced for ZIL/L2ARC (mirroring the ZIL, and adding the two other slices for caching)

My reasoning behind the sliced SSD is :

I don't need the fastest system ever, so I can accept the tradeoff of the ZIL and L2ARC sharing the IO
I still have the benefit of mirroring the ZIL so in case of a crash, I should be safe

The storage disks are currently 3TB drives, and as I understand the total size of the pool is (numberOfDisks-2)*sizeOfSmallestDisk which would mean : (11-2)*3TB = 27TB

So, my questions are :

Is this setup well balanced ?
Is it doable to slice the SSDs for the logs and cache ?
Is my formula for the pool size correct ?
Will I be able to replace the disks one by one with bigger disks and once they are all updated grow the pool ? For example replace the disks with 4TB disks and get 36TB storage
Will this work fine on Linux? I currently have OpenIndiana and want to migrate to Ubuntu Server.

Thanks in advance.

My main sources of inspiration :

How much memory does your system have? For home use, I doubt you would get any significant gains from an L2ARC. You would probably be better off filling the system up with drives in mirror pairs (like RAID 10). That will perform faster than raidzN, and will be easier to upgrade. — Zoredache, Commented Dec 15, 2016 at 0:46
What kind of backups do you have? 11+2 for a RAIDZ2 pool is a bit large and a resilver could take a LONG time, assuming you even catch the first disk failure pretty quickly and get the failed disk replaced. — Andrew Henle, Commented Dec 15, 2016 at 3:04
I have 16GB of ram for the moment, I thought maybe I should have 32, but don't know for sure. How would I do it in mirror pairs ? For the backups I have Amazon Glacier for the important parts, but the server is already the backup for other machines, so many things can be rebuilt. You say long, but how many hours/days should I expect ? — Stéphane Goetz, Commented Dec 15, 2016 at 19:38
@StéphaneGoetz It depends on your current load (additional hits slow down resilver operations), your disk performance (RPM for HDDs, write IOPS for SSDs), pool fragmentation, amount of data in the pool (by far the largest point) and operating system (Solaris has sequential resilvering, which speeds it up, illumos/OpenZFS not yet). — user121391, Commented Dec 20, 2016 at 10:50

user121391 · Accepted Answer · 2016-12-21 15:30:37Z

Most of your points are correct, so I will just focus on the rest:

Using an SLOG device for the ZIL only helps you with small synced writes, so it is pretty much mandatory if you want to store virtual machines on the pool and pretty useless in most other home use cases, especially backups and streaming media. As you can always add it later and remove it again, you should start without one and add it only if necessary.
L2ARC can increase your read performance, but it is slower than RAM, needs extra RAM and only helps if the same data is read repeatedly. Again, bad for streaming a whole movie or music, but good if you host a website that is heavily accessed or have hundreds of users accessing file shares. Rule of thumb is: first max out your RAM (depending on your board 32, 64, 128 or 256 GB most likely), then think about L2ARC.
ZIL and L2ARC on the same device is usually not a good idea, as their needs are directly opposed:
- ZIL is written to constantly for small random synced IO (large and sequential IO bypasses it, async IO of any kind does not use it at all), which means you want an SSD with very low write latency (Intel is the only vendor I've found that specifies this characteristic even for the cheaper consumer SSDs), acceptable write IOPS (nearly all SSDs are sufficient here), and high amount of TBW so your SSD does not die each year from exhaustion. For size, < 10 GB is usually enough for small systems. Mirroring is preferred to prevent data loss when power and SSD fail at the same time.
- L2ARC on the other hand needs to be several times larger (> 64 GB is common, depending on RAM), is seldom written to but read often, so you want high read IOPS, acceptable read latency and don't care about TBW that much. Mirroring is a waste of money in most cases, as it is only a cache device and can be lost and recreated without problems.
A single-disk root pool is of course possible, but you save yourself some headaches if you mirror it. As it usually does not see much I/O, two slow disks or even USB devices (each mainboard has at least two USB ports as headers internally) are perfectly fine for home use and you gain another usable disk slot. Especially when running without a UPS, two rpool devices really give you peace of mind.
Your pool size is correct, but it may be an option to go for 12 disks with either 2x Z2 (6 disks each) or 1x Z3 (12 disks). As a rule of thumb, when using Z1/2/3, you should populate all your available disk slots from the start, because while upgrading disk size one by one is trivial, adding more disks to an existing vdev later on is not possible.
I don't know about Linux (should work fine), but have you looked at other illumos-based systems? OmniOS is small, simple and stable and can be customized to your needs (it also includes KVM and LX-branded zones). SmartOS is similar, but focused heavily on zones (containers), so you can run all your services independent from one another and can even run Linux guests in those zones for the few services that are not available on Solaris. There are also Delphix and NexentaStor Community Edition, but I have not tested them.
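To make the pool-size arithmetic concrete, here is a minimal Python sketch of the formula from the question, (numberOfDisks - parity) * sizeOfSmallestDisk, applied to a one-by-one disk upgrade. The function name raidz_usable_tb is made up for illustration. The result is only an approximation: it ignores metadata, padding and the TB/TiB difference, so real usable space will be somewhat lower.

```python
def raidz_usable_tb(disk_sizes_tb, parity):
    """Approximate usable capacity of a single RAID-Z vdev, in TB.

    Every member contributes only as much space as the smallest disk,
    and `parity` disks' worth of space is used for redundancy.
    """
    if len(disk_sizes_tb) <= parity:
        raise ValueError("a RAID-Z vdev needs more disks than parity devices")
    return (len(disk_sizes_tb) - parity) * min(disk_sizes_tb)


# The 11-disk RAID-Z2 from the question, built from 3 TB drives.
disks = [3] * 11
print("start:", raidz_usable_tb(disks, parity=2), "TB")

# Replace the drives one at a time with 4 TB models.
for i in range(len(disks)):
    disks[i] = 4
    print(i + 1, "replaced:", raidz_usable_tb(disks, parity=2), "TB")
```

The capacity stays at 27 TB while even one 3 TB drive remains in the vdev and only rises to 36 TB once the last drive has been swapped, which is why the upgrade pays off only after all disks are replaced.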

My personal suggestion:

Use whatever operating system you are comfortable with (if you like the stability of Solaris, try OmniOS or if you want virtualization, try SmartOS)
Use mirrored rpools on USB disks (USB3 HDDs or USB3 sticks with SLC memory) to expose more slots for disks
Use 6 ports from the mainboard and 6 ports from the HBA card so you can lose a controller and your system keeps running
2 free ports can be used in the future for SLOG or L2ARC devices depending on your needs
Layout (12 is a very nice number, 16 would be next best because most controllers are 4 or 8 port):
- If you need maximum performance: 6x2 mirrors, each on both controllers
- If you need maximum resilience: 4x3 mirrors or 1x RAID Z3(12) or 2x RAID Z2(6)
- If you need maximum space: 1x RAID Z2(12)
Maximize RAM first, then anything else
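The space and resilience trade-off between these 12-disk layouts can be tabulated with a short Python sketch. It assumes 3 TB disks as in the question, uses a hypothetical helper describe_layout, and counts raw data capacity only (no metadata or padding overhead). Performance is not modelled at all.

```python
def describe_layout(vdevs, width, redundancy, disk_tb=3):
    """Summarise a pool made of `vdevs` identical vdevs.

    width:      number of disks per vdev
    redundancy: disks each vdev can lose without data loss
                (parity count for RAID-Z, width - 1 for an n-way mirror)
    """
    return {
        "disks": vdevs * width,
        "usable_tb": vdevs * (width - redundancy) * disk_tb,
        "failures_per_vdev": redundancy,
    }


layouts = {
    "6x 2-way mirror": describe_layout(6, 2, 1),
    "4x 3-way mirror": describe_layout(4, 3, 2),
    "1x RAID-Z3 (12)": describe_layout(1, 12, 3),
    "2x RAID-Z2 (6)": describe_layout(2, 6, 2),
    "1x RAID-Z2 (12)": describe_layout(1, 12, 2),
}

for name, info in layouts.items():
    print(f"{name:16} {info['usable_tb']:>3} TB usable, "
          f"survives {info['failures_per_vdev']} failure(s) per vdev")
```

Under these assumptions the mirrors give 18 TB (2-way) or 12 TB (3-way), while the RAID-Z variants give 24 TB (2x Z2), 27 TB (Z3) and 30 TB (single Z2).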

Regarding your follow-up-questions from comments:

I like the idea of using two mirrored USB3 sticks for the system, but is it bootable ?

USB sticks are essentially the same as USB disks, so you can boot from them without problems (except on very old mainboards, but everything from the last ten years should be fine). Some systems like SmartOS or ESXi even advertise it as best practice.

Some on the other hand (like FreeNAS) do not recommend it, because they are not customized for USB sticks and therefore constantly write to the disks and wear out cheaper sticks pretty fast (this is why so many Raspberry Pis fail early - the Linux system thinks it has an indestructible HDD and not some 5 EUR USB stick or SD card that is designed for infrequent writes like from a digital camera).

With SLC sticks (or real SSDs), you do not have these problems. Of course, they are more expensive, about 30 to 40 EUR for 16 GB sticks (MachExtreme MX-ES are about the only worthwhile things in this sector). SSDs can be cheaper (30 EUR for 32 GB), but you would need a USB adapter and they take up more space. You can use them outside of the case for quick backups/swaps or inside for access control (read: if you have children who like shiny toys).

it seems I don't need log/cache disks, should I use all 14 ports for disks ?

Depends on your needs and budget. If you use mirrors, you are flexible to add them later. If you use RAID-Zn, I would set the final amount before creating the pool, because you cannot easily add more. On the other hand, you might want to keep some ports free for backup (using slot-in caddies for 3.5" drives, for example) or for cache purposes if your needs change. It is up to you if you value space more than flexibility (and it depends on how many expansion cards your hardware supports).

Something like 2x7 RaidZ2 striped together ? If I do this and the controller fails, The pool will be failed, but if I replace the controller with an identical one, will it run again ?

Yes, and it works even if you use another controller, because everything is done in software. You just need to add enough disks so that each vdev works, and you can bring the pool back online.

In your case, if your 8-port controller fails, you need to add five (7 - 2) of those disks to the system in any way (the 8th is expendable anyway, because the other six disks are still running), for example with a 4 port controller and a single USB disk (not recommended, just to show that the connection is basically meaningless).

Usually you just replace the controller with the same model, because you know that configuration works without problems and performance is sufficient (with 8 disks per controller, the price of the controller itself is pretty small in relation anyway).
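How many disks have to be reconnected after a controller failure depends on how the vdev members are spread over the controllers. The following Python sketch counts this for two assumed cablings of a 2x7 RAID-Z2 pool with 8 disks on the HBA and 6 on the mainboard. The function disks_to_reconnect and the names "hba", "mainboard", "raidz2-0" and "raidz2-1" are illustrative only.

```python
def disks_to_reconnect(vdevs, failed_controller, parity=2):
    """Per RAID-Z vdev, count the disks from the failed controller that
    must be attached elsewhere before the vdev can work again.

    vdevs: dict mapping vdev name -> list of controller names, one per disk
    """
    needed = {}
    for name, controllers in vdevs.items():
        lost = sum(1 for c in controllers if c == failed_controller)
        # A RAID-Z vdev keeps working while at most `parity` disks are missing.
        needed[name] = max(0, lost - parity)
    return needed


# Cabling A: one vdev entirely on the HBA, the other mostly on the mainboard.
layout_a = {
    "raidz2-0": ["hba"] * 7,
    "raidz2-1": ["hba"] * 1 + ["mainboard"] * 6,
}
print(disks_to_reconnect(layout_a, "hba"))

# Cabling B: both vdevs split as evenly as possible over the controllers.
layout_b = {
    "raidz2-0": ["hba"] * 4 + ["mainboard"] * 3,
    "raidz2-1": ["hba"] * 4 + ["mainboard"] * 3,
}
print(disks_to_reconnect(layout_b, "hba"))
```

Cabling A reproduces the five disks mentioned above, while the even split in cabling B needs two disks per vdev. In both cases the result is only the bare minimum: a vdev brought back this way runs without any remaining redundancy until the other disks are reattached.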

And if I do this, can I grow one part of the pool without touching the other one ?

You can grow the vdevs separately, but only as long as the pool itself is online (meaning after replacement of the controller and resilvering of any errors). Take into consideration that if you grow it "unbalanced", your data will not be rebalanced later on if the other vdev is expanded, except for any new and modified blocks (copy on write does not reorder data on read). This should be no problem for your performance needs, but I thought I'd mention it for completeness.

Thank you for your very detailed answer, it took me a few days to process everything :D I like the idea of using two mirrored USB3 sticks for the system, but is it bootable ? As you explained, it seems I don't need log/cache disks, should I use all 14 ports for disks ? Something like 2x7 RaidZ2 striped together ? If I do this and the controller fails, The pool will be failed, but if I replace the controller with an identical one, will it run again ? And if I do this, can I grow one part of the pool without touching the other one ? — Stéphane Goetz, Commented Dec 21, 2016 at 13:19
@StéphaneGoetz Glad I could help. Your question was very detailed and showed a great deal of research, so it deserved a longer answer than most of the other questions around here. Also, I sometimes (always) like to write long posts. ;) Also, for a nice and easy to deploy management system, take a look at OmniOS + napp-it (napp-it.de/setup_en.html), I can recommend it wholeheartedly. — user121391, Commented Dec 21, 2016 at 15:34
RAIDZ2 with 6 disks is also a sweet spot for low disk usage overhead, besides not putting too much stress on the remaining disks in case of a resilver. I don't remember off hand where I saw the summary, but someone put together a table with something like RAIDZ, RAIDZ2, RAIDZ3 for up to 20 disks each, calculating the overhead for each combination. It's a small detail, and not all that relevant if you are choosing between 2x6 or 1x12 RAIDZ2, but when you are choosing between similar numbers of disks, it's worth considering. — user, Commented Dec 27, 2016 at 12:44
