I'm building a NAS for home/personal use that will use ZFS (probably running FreeNAS) over a SATA SSD array. In terms of usage, I expect the system to be "idle" (not counting ZFS background stuff e.g. scrubs) more often than not, and I expect the main performance bottleneck to be the 1Gb ethernet.
I'm vaguely familiar with the ZIL, but confused about the use of a SLOG / secondary ZIL storage device.
The system is going to be on a UPS, so it's probably more likely to experience something like a kernel panic than a sudden loss of system power. Regardless, as long as it doesn't eat my pool, I'm not particularly bothered that a catastrophic event might cause the last few minutes' data to be lost. (Remember, this is a home system, not something mission critical.)
In particular, is it possible to have only the primary ZIL in RAM (and what would be the "safety" impact of that)? If I don't have a dedicated SLOG device, does that mean I am forced to use the storage pool for SLOG, and what is the actual impact (both performance and wear) of that? If I do need a dedicated device, is a high-performing NVMe SSD (e.g. a modest-sized WD Black SN750) sufficient or do I really need to spend $250 on an Intel 900p? (see update)
Most of what I've been able to find says "yes, that $250 900p is absolutely vital", but none of it really explains what situation I'd be in if I omit it.
Update:
So, most of what I've read says that a) having the ZIL on the primary pool is horrible (it halves performance and, even worse for SSDs, doubles writes), and b) the main reason a dedicated ZIL device is "needed" is to reduce latency for sync writes. Given that my pool is all-SSD (albeit SATA, but OTOH my users are all bottlenecked by, at best, 1Gb LAN), it seems like I have three options:
- Add a SLOG. My understanding is that any device without power-loss protection (PLP) is worse than useless (at least, worse than the two other options to follow), but in theory I'd be okay with an Intel 900p, which seems to be the cheapest option.
- Use RAM-only ZIL. (Basically, lie about sync.) While this "sounds" bad, AFAIU it won't affect the integrity of my pool, and for my usage, I'm not sure it's worth the extra cost. Any sort of failure is going to risk that I lose data, just because the NAS is offline. Most likely I'll know right away that something went wrong and will be able to take remedial steps of some sort (e.g. save to some other location temporarily until I can get the NAS back up).
- Don't use a ZIL; force all writes to go straight to disk. While this would be a performance killer on spinning rust, it's not clear that the ZIL is even useful on an all-SSD system, since the main concerns that make it useful with spinning rust (slower drives, seek latency) don't apply.
As for option #2, I've seen threads where folks note that "data integrity is data integrity"... except, really, it isn't. There is a significant difference between losing whatever file I just tried to write (likely knowing immediately that something went wrong, and with an excellent chance of being able to recover manually) and losing files that were created months or years ago. I can see how, in general, this could go either way, but in my case, I'm more concerned with archival integrity.
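To put a rough number on what option #2 could cost me, here is a back-of-the-envelope sketch of how much data could have arrived over the network but not yet reached the pool. It assumes the 1Gb link is fully saturated with no protocol overhead, and that ZFS commits a transaction group roughly every 5 seconds (my understanding of the usual default, which is tunable). The function name and the `intervals` multiplier are my own, purely for illustration.

```python
# Back-of-the-envelope bound on data "at risk" if sync writes are
# acknowledged from RAM. All figures are assumptions, not measurements.

LINK_BITS_PER_SEC = 1_000_000_000  # 1Gb ethernet (from the post)
TXG_INTERVAL_SEC = 5               # ASSUMED transaction group interval


def max_bytes_at_risk(link_bits_per_sec, txg_interval_sec, intervals=1):
    """Bytes that could arrive at line rate within `intervals` txg intervals.

    `intervals` is a hypothetical safety multiplier, since more than one
    transaction group may be in flight at a time.
    """
    bytes_per_sec = link_bits_per_sec / 8
    return bytes_per_sec * txg_interval_sec * intervals


if __name__ == "__main__":
    line_rate_mb = LINK_BITS_PER_SEC / 8 / 1e6
    print(f"Line rate: {line_rate_mb:.0f} MB/s")
    for n in (1, 2):
        at_risk_mb = max_bytes_at_risk(LINK_BITS_PER_SEC, TXG_INTERVAL_SEC, n) / 1e6
        print(f"{n} interval(s): up to {at_risk_mb:.0f} MB at risk")
```

If those assumptions are roughly right, the exposure is on the order of seconds of writes (hundreds of MB at most), not minutes, which is part of why I'm wondering whether a dedicated device is worth it for my use.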