
I have an HPE ProLiant DL360 Gen9 server with the following specs:

  • CPU: 2× Intel Xeon E5-2687W v3 @ 3.10 GHz, 25 MB L3 cache, 10 cores each
  • RAM: 8x 32GB PC4-17000 DDR4 2133MHz CAS-15 1.2V SDRAM DIMM (256 GB total)

(full server specs here)

The server is running CentOS 7.2 with kernel 3.10.0-327.36.3.el7.x86_64.

I mounted a tmpfs ramdisk on the server using the following entry in /etc/fstab:

tmpfs  /ramdisk  tmpfs  noauto,user  0 0
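
Note: with no size option, tmpfs defaults to half of physical RAM (128 GiB on this box), so the 120 GB test file only just fits. An entry with an explicit limit would look like this (size=200G is an illustrative value):

tmpfs  /ramdisk  tmpfs  noauto,user,size=200G  0 0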

To test write performance on this ramdisk, I then ran the following command:

time sh -c "dd if=/dev/zero of=/ramdisk/120GB_testfile bs=4k count=30000000 && sync"

It reports that it wrote 122,880,000,000 bytes in 58.857s, which is a write speed of 1991 MiB/sec.
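
For the record, that figure is just bytes divided by seconds, converted to MiB:

echo "scale=1; 122880000000 / 58.857 / 1048576" | bc

which prints 1991.0.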

Considering that this memory's rated bandwidth is 17 GB/sec (according to this description of memory data rates), I am surprised by the considerably lower rate when writing to my tmpfs ramdisk. Can anyone explain the disparity, and suggest a faster way to write to a file in memory?

Thanks.

UPDATE

I disabled swapping by setting vm.swappiness to 0, but that yielded no benefit (1712 MiB/sec).
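
For anyone reproducing this, swappiness can be set to zero at runtime with:

sysctl -w vm.swappiness=0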

I also tried increasing the block size (bs=256k count=468750), but again it had little effect (2087 MiB/sec).
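
For reference, that was the same test command with the larger block size (still 120 GB in total):

time sh -c "dd if=/dev/zero of=/ramdisk/120GB_testfile bs=256k count=468750 && sync"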

  • What if you increase the block size (while reducing the count)? Tmpfs is backed by swap; how certain are you that your data is not being paged out and thus slowing things down? Have you set vm.swappiness to 0 or 1?
    – Mokubai
    Commented May 10, 2018 at 16:04
  • @Mokubai, swappiness was set to default (60). I thought I was avoiding swapping by making a file small enough so that it would fit entirely in RAM. I will try disabling swappiness and will update my post w/those results. Thanks.
    – atreyu
    Commented May 10, 2018 at 16:30
  • Repeatedly writing to tmpfs (like your test) gives different speeds for me, usually increasing, but I'm using smaller sizes so I'm not sure if that's the cause; I'm guessing it could be from memory being freed for tmpfs. Any changes when running the same test a few times in a row? Or when using ramfs? (FYI, ramfs should really stay in RAM and be a more accurate "ramdisk", but it doesn't have a size limit IIRC, so watch out; see the mount sketch after these comments.)
    – Xen2050
    Commented Oct 24, 2018 at 2:01
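
A minimal ramfs mount for the comparison suggested in the last comment (a sketch; /mnt/ramfs is an illustrative mount point, and since ramfs enforces no size limit, a runaway dd can exhaust all RAM):

mkdir -p /mnt/ramfs
mount -t ramfs ramfs /mnt/ramfs
time sh -c "dd if=/dev/zero of=/mnt/ramfs/120GB_testfile bs=4k count=30000000 && sync"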

1 Answer


There's more going on than just putting data in RAM when you're using an in-memory filesystem. You still have to handle the data structures associated with the file, including tracking where in memory all the allocations for it are. Writing this information takes time too (in particular, for the testing you're doing, your file size is being updated on every write, which immediately doubles the number of places data is changing in memory).

Also, allocating memory is extremely slow. In fact, it's one of the slowest things you can do on most systems that doesn't involve I/O; the only significantly slower operations are creating a new thread or process. Tools like ramspeed pre-allocate all the memory they will use right when they start up, so they can test the actual memory performance. In contrast, tmpfs has no idea how big a file you are going to create, so it has to allocate everything on demand, and does so in chunks no bigger than the dd block size (I think it caps out at 64k, but I'm not sure). Because of this, every block carries the overhead of allocating the memory to store it.
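
To isolate that allocation overhead, one could pre-create the file and then overwrite it in place, so the pages are already allocated and the file size never changes. A sketch (assuming the tmpfs supports fallocate, which it has since kernel 3.5; conv=notrunc keeps dd from truncating the pre-allocated file):

fallocate -l 120G /ramdisk/120GB_testfile
time sh -c "dd if=/dev/zero of=/ramdisk/120GB_testfile bs=4k count=30000000 conv=notrunc && sync"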

  • Do you think copying a file might be faster, since cp knows how big the file will be while dd doesn't? And how would you measure cp's speed, or create a fixed-size file from something like /dev/zero without already having another file in tmpfs/ramfs to copy (which slows down the test by adding reading)? (See the timing sketch after this thread.)
    – Xen2050
    Commented Oct 24, 2018 at 2:01
  • At their core, cp and dd are doing essentially the same thing: they read data from the source and then write it to the destination. cp is usually a bit faster, but that's because it automatically figures out a near-optimal block size to work with. cp can also be significantly faster in some very special circumstances because it may use various copy-offload mechanisms (such as reflinks on filesystems that support them, or the SCSI XCOPY command on hardware that supports it).
    Commented Oct 24, 2018 at 12:58
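
A minimal way to time cp into tmpfs, per the question above (a sketch; /ramdisk/srcfile is a placeholder for an existing source file, and note the copy's read side is included in the measurement):

time sh -c "cp /ramdisk/srcfile /ramdisk/copy_testfile && sync"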
