
I'm trying to set up a file share for a system where users are given access to ephemeral virtual machines to work on projects that read and write a large number of small files. For example, a new project may be 200MB with 12,000+ files. I'm looking to reduce the time it takes to create a new project, but I believe I'm being bottlenecked by the per-request round-trip time (RTT) incurred for each of these files.

Currently, I'm mounting the NFS share using the following mount command.

sudo mount -t nfs -o nfsvers=3,nconnect=16,hard,async,fsc,noatime,nodiratime,relatime <drive>:/fsx /share

Additionally, the NFS server is configured with rw,async to ensure that we're actually using asynchronous writes.

After a few days of tuning and working with nfsstat and nfsiostat, this gave me the fastest results. I also have cachefilesd configured for read caches to speed up those operations. Unfortunately, I'm still getting ~20kb/s of write during the project creation. Writing a large single file yields > 250MB/s of throughput, and nfsiostat indicates around 1ms of latency per request, so it doesn't appear to be a throughput or network issue.
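A rough way to reproduce the gap between small-file and large-file writes is sketched below. The helper name write_small_files, the 16384-byte file size (roughly 200MB divided by 12,000 files), and the example paths are assumptions for illustration, not part of the original setup.

```shell
# Sketch: create many small files in a target directory so the run can be timed.
# write_small_files is a hypothetical helper name.
write_small_files() {
    # $1 = target directory, $2 = number of files, $3 = bytes per file
    mkdir -p "$1" || return 1
    i=0
    while [ "$i" -lt "$2" ]; do
        # Each iteration is a separate create + write + close on the target filesystem
        head -c "$3" /dev/zero > "$1/file_$i" || return 1
        i=$((i + 1))
    done
}

# Example usage (paths are assumptions): compare the share against local disk.
#   time write_small_files /share/bench 12000 16384
#   time write_small_files /tmp/bench   12000 16384
```

If the NFS run is dominated by per-file overhead rather than bandwidth, the elapsed time should scale with the file count far more than with the file size.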

These files will rarely be accessed by anyone besides the file share owner, but the application spec requires all of them to be in a readable format on the file share, so creating these projects on the local disk and tarring them up to the file share at the end of a session unfortunately isn't an option.

Is there some other way to speed up write operations for many small files? It'd really be great if new files could just be written locally and synced up to the NFS share when there's time. I'm not a sysadmin, so I'm just looking for guidance and interested in whether there are any new takes on this issue.

  • Where are the files coming from, the local application or copying from NFS to NFS? Do you have any control over how the files are initially created? Remove nfsvers=3 to use v4, which runs better. Check out some of the tuning recommendations here: admin-magazine.com/HPC/Articles/…
    – Cpt.Whale
    Commented Sep 18, 2023 at 17:57
  • Thanks for the comment - the files are coming from the local application. I do not have much control over how the files are created unfortunately. I tried NFS 4 but saw better results with 3 oddly enough. To add to this - I found a solution that works fairly well given the parameters of my issue. I'll provide a writeup soon for any future StackOverflow users who are looking for a good solution to this problem. Commented Sep 18, 2023 at 23:07

1 Answer


I ended up finding a solution that worked well for me, so I wanted to post a follow-up for anybody else trying to solve this problem.

I didn't have any luck optimizing the NFS connection, as it seemed to be bottlenecked mostly by calls that weren't sped up much by the async flags on either the client or the server side.

Instead, I decided to use overlayfs with the NFS data as a read-only lower layer and a directory on the local disk as the read/write upper layer. This exposes a single mount point where users can see all data in the share, while getting local-disk read/write performance for the files they are actually working on. The script to achieve this looked a little like this.

sudo mkdir -p /application/nfs
sudo mkdir -p /application/overlay
sudo mkdir -p /application/local 
sudo mkdir -p /application/workdir

sudo mount -t nfs -o nfsvers=3,actimeo=60,nconnect=16,hard,rsize=1048576,wsize=1048576,async,fsc <endpoint>:/<share_name> /application/nfs

sudo mount -t overlay -o volatile,lowerdir=/application/nfs,upperdir=/application/local,workdir=/application/workdir none /application/overlay

This gives you a mount point at /application/overlay with much better read/write performance for new working projects. Additionally, you can speed up reads by running cachefilesd and enabling the fsc option on your NFS mount. This caches NFS file data on the local disk, so subsequent reads of data that originally came over the network are much faster.
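Before running the overlay mount above, it can help to confirm that the upper and work directories exist and live on the same filesystem, which overlayfs requires. The following is a minimal sketch; check_overlay_dirs is a hypothetical helper name, and it assumes GNU stat (the -c option) as found on typical Linux systems.

```shell
# Sketch: pre-flight check for the overlay's upperdir and workdir.
# check_overlay_dirs is a hypothetical helper name.
check_overlay_dirs() {
    upper=$1
    work=$2
    # Both directories must already exist
    if [ ! -d "$upper" ] || [ ! -d "$work" ]; then
        echo "missing directory: $upper or $work" >&2
        return 1
    fi
    # stat -c %d prints the device number (GNU coreutils syntax);
    # matching device numbers are a rough sign of the same filesystem.
    if [ "$(stat -c %d "$upper")" != "$(stat -c %d "$work")" ]; then
        echo "upperdir and workdir are on different filesystems" >&2
        return 1
    fi
    return 0
}

# Example usage with the paths from the script above:
#   check_overlay_dirs /application/local /application/workdir && echo "ok to mount"
```

A missing work directory is caught here before the mount command itself has to fail on it.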

At the end of a user session, I rsync the files from /application/local to /application/nfs to push the changes made on the local disk to the NFS share. This system won't work for everybody, depending on your application requirements, but it worked fantastically for what I needed it to do.
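The sync step could look something like the sketch below. The exact rsync flags used in the original setup are not shown, so the options here are assumptions, and sync_upper_to_share is a hypothetical helper name.

```shell
# Sketch: copy new and changed files from the overlay's upper layer to the NFS mount.
# sync_upper_to_share is a hypothetical helper name; the flags are assumptions.
sync_upper_to_share() {
    # $1 = overlay upperdir, $2 = NFS mount point
    # -a     recurse and preserve modes and timestamps
    # --no-D skip device and special files, which is how overlayfs
    #        records deletions (whiteouts) in the upperdir
    rsync -a --no-D "$1"/ "$2"/
}

# Example usage with the paths from the script above:
#   sudo sh -c '. ./sync.sh; sync_upper_to_share /application/local /application/nfs'
```

Note that this only copies additions and modifications. Files from the NFS layer that were deleted through the overlay appear as whiteout entries in the upper directory, and a plain rsync like this one does not turn those into deletions on the share.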
