8

I have a tank consisting of several datasets, only one of which is configured to use deduplication.

How can I see the deduplication ratio for this dataset? I get a ratio of 1.00x for the whole pool, but I imagine this only reflects what is stored directly in the tank itself, which is nothing (all of my data lives in the datasets).

1 Answer

11

What you're referring to as a tank is really a ZFS pool and your datasets are ZFS filesystems within the pool.

ZFS deduplication has pool-wide scope and you can't see the dedup ratio for individual filesystems.

If you turn dedup on for a pool that already contains data, the existing data will not be automatically deduped and your ratio will still be 1.00x. Only newly written data will be deduped and then you may see the ratio increase.
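If you want to read the pool-wide figure from a script, here is a minimal sketch. It assumes an OpenZFS-style `zpool get -H -o value dedupratio <pool>` call that prints a value such as `1.00x`; older or vendor-specific ZFS releases may not accept the same options. The helper names `parse_dedup_ratio` and `read_dedup_ratio` are made up for illustration, and you pass in whatever your pool is called.

```python
import subprocess


def parse_dedup_ratio(text: str) -> float:
    """Convert a ratio string such as '1.00x' (or plain '1.00') to a float."""
    return float(text.strip().rstrip("x"))


def read_dedup_ratio(pool: str) -> float:
    """Ask zpool for the pool-wide dedupratio property.

    Assumption: the installed zpool supports 'get -H -o value'.
    This raises CalledProcessError if the command fails.
    """
    result = subprocess.run(
        ["zpool", "get", "-H", "-o", "value", "dedupratio", pool],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_dedup_ratio(result.stdout)
```

Note that this only ever returns one number per pool; there is no per-filesystem equivalent to query, which matches the point above.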

6
  • But I only turned it on for a 'filesystem' within the pool. If dedup has pool-wide scope, does that mean it's on for the whole pool? I was expecting that only the dataset I created with dedup enabled would be deduped, so shouldn't I be able to see the ratio for that dataset alone?
    – deed02392
    Commented Dec 20, 2012 at 0:29
  • 4
    To improve deduplication, ZFS doesn't limit itself to duplicate blocks within just one filesystem. Instead it looks across the whole pool. If dedup is not enabled for a particular filesystem then block writes are performed without passing through the dedup pipeline, even if duplicate blocks exist.
    Commented Dec 20, 2012 at 1:21
  • But if a duplicate block does exist in the filesystem with it enabled, surely this still yields a greater-than-one deduplication ratio, if only for that filesystem? Why can't I just see the ratio for that filesystem then?
    – deed02392
    Commented Dec 20, 2012 at 11:43
  • 1
    I can't answer that; it may have been a design decision, but I don't know for sure. Deduplication has side effects on reported disk usage and free space, and these tend to make more sense when considering the whole pool rather than just a filesystem within it. Refer to the ZFS Dedup FAQ: "Deduplicated space accounting is reported at the pool level. You must use the zpool list command rather than the zfs list command to identify disk space consumption when dedup is enabled."
    Commented Dec 20, 2012 at 21:48
  • You can also run zdb -S on your pool to get a rough estimate of the deduplication ratio you would see if dedup were applied to the existing data. For more information, see the excellent post about deduplication on Constantin Gonzalez's blog: constantin.glez.de/blog/2011/07/zfs-dedupe-or-not-dedupe
    – user121391
    Commented Jun 21, 2016 at 13:58
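
The behaviour discussed in these comments can be illustrated with a toy model. This is only a sketch and not how ZFS is implemented: the `ToyPool` class, its methods, and the dataset names are all hypothetical. It assumes a single pool-wide table keyed by block checksum, that writes to a dataset without dedup bypass that table entirely, and that the ratio is computed as referenced blocks divided by stored blocks in the table.

```python
import hashlib
from collections import Counter


class ToyPool:
    """Toy model of a pool with one shared dedup table (not real ZFS internals)."""

    def __init__(self):
        self.dedup_enabled = {}   # dataset name -> whether dedup is on
        self.ddt = Counter()      # block checksum -> reference count, pool-wide
        self.plain_blocks = 0     # blocks written without the dedup path

    def create_dataset(self, name, dedup=False):
        self.dedup_enabled[name] = dedup

    def write(self, dataset, block):
        if self.dedup_enabled[dataset]:
            # Dedup path: look the block up in the shared, pool-wide table.
            self.ddt[hashlib.sha256(block).hexdigest()] += 1
        else:
            # No dedup: the block is stored as-is, even if a copy already exists.
            self.plain_blocks += 1

    def dedup_ratio(self):
        referenced = sum(self.ddt.values())  # logical blocks pointing into the table
        stored = len(self.ddt)               # distinct blocks actually kept
        return referenced / stored if stored else 1.0
```

For example, writing blocks `A`, `A`, `B` to a dedup-enabled dataset gives three references to two stored blocks, so the model reports 1.5. Writing any number of copies of `A` to a dataset without dedup leaves that figure unchanged, and the table itself keeps no per-dataset breakdown.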
