
I need 64GB to fit an entire dataset in memory for deep learning, but have only 12GB of RAM. Virtual memory is the next-best alternative: I've learned it can be effectively increased by increasing the pagefile size - but this source suggests doing so would increase system instability.

All other sources state the contrary, noting only reduced SSD lifespan, which isn't a problem for me - but I'd rather not take chances. That said, is there a limit to how much the pagefile size can be increased without causing instability?


Additional info: Win10 OS, 26GB OS-allocated pagefile size (need 52GB + c, c = safe minimum)


PRE-ANSWER: I proceeded as described here, with ~70GB of memory-mapped data; the average data-load speedup is 42-FOLD. I suspect this figure could be pushed to ~130, though I won't work on that for now unless someone answers this. Lastly, this is sustainable and won't degrade the SSD, as the usage is 99.9%+ reads. I will post a full answer with details eventually.
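A minimal sketch of the memory-mapping approach referred to above, assuming the dataset is stored as one raw float32 file and read through NumPy - the file name, shapes, and chunk size are placeholders, not the actual setup:

    import numpy as np

    # Placeholder layout - substitute the real file and dimensions.
    DATA_PATH = "dataset_float32.dat"
    N_SAMPLES, SAMPLE_DIM = 16_000_000, 1024

    # Read-only memory map: nothing is loaded up front; the OS pages data
    # in from the SSD only when a slice is actually touched.
    data = np.memmap(DATA_PATH, dtype=np.float32, mode="r",
                     shape=(N_SAMPLES, SAMPLE_DIM))

    # Walk the data in ~500MB subsets; np.array() copies each chunk into RAM.
    rows_per_chunk = (500 * 1024**2) // (SAMPLE_DIM * 4)
    for start in range(0, N_SAMPLES, rows_per_chunk):
        subset = np.array(data[start:start + rows_per_chunk])
        # ... feed `subset` to the training loop ...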

  • Possible duplicate of Any reason not to disable the Windows pagefile given enough physical RAM?
    – phuclv
    Commented Sep 3, 2019 at 2:20
  • $100 on more memory would be well spent. Asking a school or business for spare PCs or parts would be good too. In my experience, a large page file doesn't hurt, but doesn't help much either. Best to leave the OS on automatic. Commented Sep 10, 2019 at 1:21
  • @ChristopherHostage I would preallocate the page file so you don't get an out of disk space condition while swapping--things tend to puke when that happens. Commented Sep 10, 2019 at 2:27

2 Answers


The page file supports swapping (a.k.a. paging): moving 4K blocks of data in RAM, called pages, out to disk and back.
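For reference, the page size in use can be checked from Python's standard mmap module:

    import mmap
    print(mmap.PAGESIZE)                 # typically 4096 bytes
    print(mmap.ALLOCATIONGRANULARITY)    # mapping granularity; 65536 on Windows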

Code that the CPU is running must live in physical RAM. Also, Windows, like other OSes, uses "unused" RAM to cache disk I/O until it is flushed (and if disk data is only read and re-read, it might stay in that "unused" RAM for a long time).

In a multitasking operating system, some code belongs to tasks that are waiting on an event that hasn't happened in a while, such as user input. It helps system performance to page this out to a disk file and bring it back in when the event occurs, so that code that is actually doing something on your computer can use the freed RAM.

Of course, the operating system can also page out code that is actually doing something but has lower priority, if a sudden request arrives for more memory than the system has. In most cases this is better than denying a program's memory request outright, as long as the request doesn't exceed the available physical RAM by too much.

At some point, if you keep allocating memory that isn't there, your program will be competing with basic Windows services and other programs running on your computer. Plus, you've removed all the unused RAM, so disk I/O won't be cached at all. You will experience a massive decrease in performance that will affect all processes on the system, including system ones.

The instability described as harmful can come from basic Windows functions becoming unresponsive because they are being swapped back and forth between disk and RAM, competing with your machine-learning program and everything else. For example, clicking a desktop icon may take minutes to get a response. The system can appear completely frozen when it is actually just thrashing and will eventually respond.
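If you want to see this coming from inside the training script, one option is to poll RAM and pagefile usage; the sketch below uses the third-party psutil package (an assumption - the question doesn't mention it) and an arbitrary threshold:

    import psutil

    def memory_pressure():
        """Snapshot of physical RAM vs. pagefile (swap) usage."""
        vm = psutil.virtual_memory()   # physical RAM
        sw = psutil.swap_memory()      # backed by the pagefile on Windows
        return {
            "ram_used_pct": vm.percent,
            "ram_available_gb": vm.available / 1024**3,
            "pagefile_used_gb": sw.used / 1024**3,
        }

    stats = memory_pressure()
    if stats["ram_available_gb"] < 2:  # arbitrary threshold - tune as needed
        print("Warning: heavy paging likely -", stats)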

  • In other words, if I (1) keep a good amount of unused RAM, and (2) reserve a good amount of pagefile space for system processes, I'm good to go? Per (1), if RAM usage according to Task Manager is 8GB/12GB, is the remaining 4GB then "unused"? Per (2), is there a standard figure for how much reserve space is sufficient, or does it vary? Response appreciated. Commented Sep 2, 2019 at 21:47
  • I agree with this answer. However, I want to make it clear that if you need 64 GB of RAM and you only have 12, it is possible your application might not work properly, or at all. If a significant amount of paging is occurring and your deep learning app has some sort of timeouts implemented, you are going to run into problems. More physical RAM would be the solution.
    – Keltari
    Commented Sep 2, 2019 at 22:42
  • @Keltari The goal of more RAM is only to speed up training - it's not actually "required"; an NVMe PCIe SSD with 3.5GB/s read speed is already around 10% of RAM speed, which is huge - hence knowing the answer to my comment question matters. Commented Sep 2, 2019 at 23:59
  • @OverLordGoldDragon, increasing your pagefile to increase your "RAM" won't gain you any more speed, if anything it'll make the system slower due to constant swapping to disk. The OS will already dedicate as much physical RAM to your program as it reasonably can, pushing whatever other processes/data it can to your page file.
    – kicken
    Commented Sep 3, 2019 at 2:04
  • @kicken That is false; accessing paged arrays is much faster than loading them from disk into memory - on my system, by an order of magnitude. Commented Sep 3, 2019 at 2:14

It sounds like your program is going to be jumping all over that dataset while it runs, which will cause a tremendous amount of swapping. You point out that a fast SSD can reach 10% of RAM speed--but your program might have wanted 100 bytes of data while the system read a full 4096-byte page off the disk. That 10% doesn't mean things merely take 10x as long to run.

Furthermore, if your program modifies the data as it works with it, things get far worse--dirty pages get written back. If there's much modification of the data you'll deplete the drive of spare blocks, and your write speed can become truly atrocious. (A page must be erased before it can be rewritten. Last I knew, that was an operation measured in a substantial number of milliseconds, although I'm not finding current data. Normally a drive keeps a supply of empty pages around to handle writes, but when writes come in faster than it can wipe pages, the pool gets depleted and a write must wait until the eraser finishes with a page.)
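If the workload really is read-only, mapping the file read-only guarantees no dirty pages are ever produced, so nothing gets written back to the drive. A minimal sketch with Python's standard mmap module (the file name is a placeholder):

    import mmap

    with open("dataset.bin", "rb") as f:
        # ACCESS_READ: pages from this mapping can never become dirty, so the
        # SSD only ever services reads (ACCESS_WRITE would write pages back).
        mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
        chunk = mm[:4096]   # touching a slice pulls in only the pages it spans
        mm.close()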

  • No writes, only reads - the 64GB dataset's divided into 134 subsets, 500MB each, and is read into and out of RAM every 500ms. Any concerns in this case? Also, relevant issue - asked differently, can data be stored persistently in pagefile? Commented Sep 10, 2019 at 2:46
