We have a cluster deployment of Milvus on k8s, and our datasets are on the order of 150 million entities. The querynodes are distributed across 16 replicas with 40G of memory each. The current bottleneck for us is partition load time, and we are exploring ways to improve it. We were referring to https://milvus.io/docs/chunk_cache.md and tried setting this up in our deployment. However, we do not see any noticeable improvement in the load times, and the logs give no indication that the chunk cache is being loaded. We have a few questions in this area:
1. Is chunk caching supposed to speed up partition load times in general, or does it only come into play during vector search?
2. How can we verify that the chunk cache is working as expected?
3. Is the cache stored in the local storage of the querynodes under the localStorage directory? In our case that is /var/lib/milvus/data, which is currently empty.
4. Which Grafana panel can be used to check the cache?
5. Which logs specifically indicate that the chunk cache is working?

The user.yaml file on the query nodes has the configuration below:
> kubectl exec -it qa-milvus-querynode -- cat /milvus/configs/user.yaml
Defaulted container "querynode" out of: querynode, config (init)
common:
  security:
    authorizationEnabled: true
proxy:
  maxUserNum: 500
  maxRoleNum: 100
queryNode:
  cache:
    enabled: true
    warmup: async
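
To make "no noticeable improvement" measurable, one option is to time the partition load directly with and without the cache settings. The following is only a minimal sketch, not the exact procedure we ran: it assumes pymilvus is installed on the client, and the host, port, collection name and partition name are hypothetical placeholders.

```python
import time


def time_call(fn, repeats=3):
    """Call fn() `repeats` times and return the wall-clock duration of each call in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations


def reload_partition(collection_name, partition_name, host="localhost", port="19530"):
    """Release and then load one partition; load() blocks until loading completes.

    The host/port defaults and the names passed in are placeholders (assumptions),
    not values from our deployment.
    """
    # Imported lazily so time_call() can be used without pymilvus installed.
    from pymilvus import Collection, connections

    connections.connect(alias="default", host=host, port=port)
    partition = Collection(collection_name).partition(partition_name)
    partition.release()  # make sure the next load starts from an unloaded state
    partition.load()


# Example usage (placeholder names):
# durations = time_call(lambda: reload_partition("my_collection", "my_partition"))
# print([round(d, 2) for d in durations])
```

Running this once with queryNode.cache.enabled set to true and once with it set to false should show whether the setting affects load time at all; note that each duration also includes the release call.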