
I have a Proxmox cluster consisting of a single node, and I want to start a new VM running PostgreSQL with TimescaleDB. After a lot of reading about how to tune ZFS volumes for this purpose, I still have some doubts about the cache options. We have three caches: the Proxmox one (ARC), the Linux VM one (LRU) and the PostgreSQL one (clock sweep), in order from farthest to nearest to the DB.

I have read a lot of information, some of it contradictory, so I don't know if this is true, but it seems the PG cache isn't designed the same way as a kernel cache, which tries to cache everything and evicts only when there isn't enough room to keep caching. In fact, it seems to be more of a buffer for the data being processed at the moment than a long-term cache. Indeed, it's called shared_buffers. I guess that's why the documentation doesn't recommend setting shared_buffers to a high percentage of the available RAM like ARC does, but somewhere between 25 and 50%. It seems the real PG cache is the kernel one, not shared_buffers.
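To make that concrete, a minimal postgresql.conf sketch along those lines might look like this (the values are illustrative placeholders, not tuned recommendations, and assume roughly 48 GB of VM RAM):

```ini
# postgresql.conf excerpt -- illustrative values only.
# shared_buffers is PG's internal buffer pool; the docs suggest a
# fraction of RAM, not the near-total share you'd give ARC.
shared_buffers = 12GB

# effective_cache_size allocates nothing; it only tells the planner
# how much OS/ARC cache it can expect pages to be sitting in.
effective_cache_size = 36GB
```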

Taking this into consideration there are some possible configurations to take into account:

  1. Create a VM with a moderate amount of RAM (let's say 12 GB) and set shared_buffers to 10 GB. The idea: 1) have a good amount of memory acting as a buffer for the ongoing queries; 2) starve the VM's RAM so it can't really use its own cache, which with its LRU policy should be the worst of the three, and instead rely on the ARC, which has better weighting. The problem with this configuration may be that the cache is outside the VM, which could reduce performance instead of improving it. I'm also not sure how much room I need to leave beyond shared_buffers for the VM OS and the other DB processes.
  2. Create a VM with a high amount of RAM (let's say 48 GB) and keep shared_buffers at the same 10 GB. Also set the ZFS primarycache property to metadata. This way the cache will be nearer to the DB and inside the VM, but with worse logic; LRU seems to be rather bad for databases.
  3. Create a VM with a high amount of RAM and primarycache=all. I think this would be a bad idea because: 1) the VM and Proxmox caches will compete for resources; 2) cache duplication.
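As a sketch of how options 2 and 3 would differ on the host's ZFS side (the zvol name is a placeholder; adjust it to your actual pool layout):

```shell
# Option 2: leave data caching to the VM's page cache; keep ARC for metadata only.
# 'rpool/data/vm-100-disk-0' is a placeholder zvol name.
zfs set primarycache=metadata rpool/data/vm-100-disk-0

# Option 3: ARC caches both data and metadata (the ZFS default), which
# risks holding the same blocks twice, inside and outside the VM.
zfs set primarycache=all rpool/data/vm-100-disk-0

# Optionally cap the ARC so it cannot starve VM RAM (host-wide, in bytes):
echo "options zfs zfs_arc_max=8589934592" > /etc/modprobe.d/zfs.conf  # 8 GiB
```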

To give some context, the node has 64 GB of total RAM, and PG/TimescaleDB will be the most demanding, highest-priority application running on it.

So, are my initial assumptions correct? Which configuration will work better? What would you change?

Best regards, thanks for your time,

Héctor

1 Answer


My recommendation is to use solution #4: create a VM with a high amount of RAM, and on the KVM (Proxmox) side use cache=none for the data disk. This stops Proxmox from using the host page cache at all, effectively passing I/O straight through to the real storage. This way you get as close to bare metal as possible in your VM and can fine-tune your caches there.
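In Proxmox the cache mode is set per virtual disk, for example with `qm set` (the VM ID, bus slot and volume name below are placeholders for your actual setup):

```shell
# Attach/reconfigure the data disk of VM 100 with cache=none, so guest
# I/O bypasses the host page cache entirely.
qm set 100 --scsi1 local-zfs:vm-100-disk-1,cache=none
```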

Be advised that for all databases I know of (including PostgreSQL), the RAM buffer is not just a disk cache: it keeps at least part of the data in a ready-to-read format, as opposed to the on-disk format. This implies that RAM set aside for the DB process is more valuable than RAM used merely as I/O buffers.

If your DB can answer a query from (its own) RAM, it will not go through the I/O stack at all, saving hugely on latency.
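You can observe this directly with PostgreSQL's `EXPLAIN (ANALYZE, BUFFERS)`: "shared hit" pages came straight from shared_buffers, while "read" pages had to go through the OS/ARC/disk path. (Database and table names below are hypothetical.)

```shell
# Run inside the VM against the database in question.
psql -d mydb -c "EXPLAIN (ANALYZE, BUFFERS) SELECT count(*) FROM metrics;"
# In the output, 'Buffers: shared hit=N' means served from shared_buffers;
# 'read=N' means the page went through the I/O stack.
```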

  • You don't think that proxmox ZFS ARC cache will do better than LRU from the VM?
    – Héctor
    Commented Nov 29, 2020 at 18:27
  • On the other hand, I don't know, but I have read in several places that PostgreSQL relies heavily on the OS cache and that you shouldn't oversize the PG buffer, because it will harm performance by causing kernel cache misses. EDIT: for example here in the official docs (first entry): postgresql.org/docs/13/runtime-config-resource.html
    – Héctor
    Commented Nov 29, 2020 at 18:29
  • The ZFS ARC cache is just a disk cache, nothing else. The PG buffers and the VM page cache will make vDisk access unnecessary in many cases. Commented Nov 29, 2020 at 18:45
  • Yes, it is a disk cache. But what I'm trying to say is: since PG needs a disk cache anyway, could it make sense to replace the VM's LRU page cache with the host's ZFS ARC, and keep only the PG buffers and the host cache?
    – Héctor
    Commented Nov 29, 2020 at 18:52
  • To access the HOST cache, the GUEST has to go through the virtual disk stack; this slows things down quite a lot. Commented Nov 30, 2020 at 10:47
