First, the amount of RAM that actually needs to be saved is surprisingly small. Only mapped dirty pages that are awaiting lazy writeback, private pages that have been written to, and relocated executable code need to be written out.
- The .text segments of executables are always backed by a file mapping. The same is true for at least some DLLs (though not all; it depends on whether they had to be relocated). The sketch after this list shows how these categories of memory can be told apart from user mode.
- Memory that is likewise backed by file mappings can be discarded (provided it is not copy-on-write, or writable and dirty).
- Lazy writeback will still have to occur, but other than that, caches can be discarded.
- Memory that has been allocated but never written to (usually the greater part of application data!) is backed by the zero page and can be discarded.
- The larger part of the memory pages on "standby" status (the actual per-process resident working set on Windows is surprisingly small, a mere 16 MB) will have been copied to the page file in the background at some point and can be discarded.
- Regions of memory that are mapped by certain devices such as the graphics card may (possibly) not need to be saved. Users are sometimes surprised that they plug 8 GiB or 16 GiB into a computer, and 1 GiB or 2 GiB are just "gone" for no apparent reason. The major graphics APIs require applications to cope with buffer contents becoming invalid "under some conditions" (without saying exactly what that means). It is thus not unreasonable to expect that memory pinned by the graphics driver is simply discarded, too. The screen is going to go dark anyway, after all.
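As a rough user-mode illustration of these categories (a sketch for exposition, not how the hibernation code itself works), VirtualQuery reports for every region of an address space whether it is image-backed (MEM_IMAGE), file-backed (MEM_MAPPED), or private (MEM_PRIVATE), which is essentially the line between "can be discarded" and "must be saved if dirty":

```c
#include <windows.h>
#include <stdio.h>

/* Walk this process's address space and classify each committed region.
   Image- and file-backed regions can largely be discarded at hibernation
   time; private regions are the ones that actually have to be written. */
int main(void)
{
    MEMORY_BASIC_INFORMATION mbi;
    SIZE_T image = 0, mapped = 0, priv = 0;

    for (unsigned char *p = NULL;
         VirtualQuery(p, &mbi, sizeof mbi) == sizeof mbi;
         p = (unsigned char *)mbi.BaseAddress + mbi.RegionSize)
    {
        if (mbi.State != MEM_COMMIT)
            continue;
        switch (mbi.Type) {
        case MEM_IMAGE:   image  += mbi.RegionSize; break; /* .text, DLLs  */
        case MEM_MAPPED:  mapped += mbi.RegionSize; break; /* mapped files */
        case MEM_PRIVATE: priv   += mbi.RegionSize; break; /* heap, stacks */
        }
    }
    printf("image-backed : %10zu KiB (discardable)\n",        (size_t)(image  / 1024));
    printf("file-backed  : %10zu KiB (mostly discardable)\n", (size_t)(mapped / 1024));
    printf("private      : %10zu KiB (saved if dirty)\n",     (size_t)(priv   / 1024));
    return 0;
}
```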
Second, unlike an ordinary file copy, dumping the set of RAM pages that need to be saved to disk is a single sequential, contiguous write from the point of view of the drive. The Win32 API even exposes a user-level function for this very operation (see the sketch below). Gather writes are directly supported by the hardware and run as fast as the disk is physically able to accept data (the controller pulls the data directly via DMA).
There are a number of preconditions for this to work (such as alignment, block size, and pinning); it does not play well with caching, and there is no such thing as "lazy writeback" (which is a very desirable optimization under normal operation).
That is why not every write works like this all the time. When the system is saving the hibernation file, however, all preconditions are automatically met (all data is page-aligned, page-sized, and pinned), and caching has just become irrelevant because the computer is going to be turned off in a moment.
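That user-level function is WriteFileGather. A minimal sketch follows (the file name gather.bin is made up and error handling is trimmed); note how the preconditions from the text surface as hard API requirements: the buffers come from VirtualAlloc and are therefore page-aligned, each segment is exactly one page, and the handle must be opened with FILE_FLAG_NO_BUFFERING and FILE_FLAG_OVERLAPPED, i.e. no caching and no lazy writeback:

```c
#include <windows.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    SYSTEM_INFO si;
    GetSystemInfo(&si);
    DWORD page = si.dwPageSize;  /* each gather segment must be one page */

    /* VirtualAlloc returns page-aligned memory, as the API requires. */
    void *a = VirtualAlloc(NULL, page, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
    void *b = VirtualAlloc(NULL, page, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
    memset(a, 'A', page);
    memset(b, 'B', page);

    /* Gather writes require unbuffered, overlapped I/O -- no caching. */
    HANDLE h = CreateFileA("gather.bin", GENERIC_WRITE, 0, NULL, CREATE_ALWAYS,
                           FILE_FLAG_NO_BUFFERING | FILE_FLAG_OVERLAPPED, NULL);
    if (h == INVALID_HANDLE_VALUE)
        return 1;

    /* One element per page, terminated by a NULL entry. */
    FILE_SEGMENT_ELEMENT seg[3] = {0};
    seg[0].Buffer = PtrToPtr64(a);
    seg[1].Buffer = PtrToPtr64(b);

    OVERLAPPED ov = {0};
    ov.hEvent = CreateEventA(NULL, TRUE, FALSE, NULL);

    /* Both pages go out to disk in a single operation. */
    if (!WriteFileGather(h, seg, 2 * page, NULL, &ov) &&
        GetLastError() != ERROR_IO_PENDING)
        return 1;

    DWORD written;
    GetOverlappedResult(h, &ov, &written, TRUE);  /* wait for completion */
    printf("wrote %lu bytes in one gather write\n", written);

    CloseHandle(ov.hEvent);
    CloseHandle(h);
    return 0;
}
```

The kernel locks (pins) the buffers for the duration of the transfer, so the controller can DMA straight out of them; that is the "pinning" precondition mentioned above.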
Third, doing a single contiguous write is very favorable both for spinning disks and for solid state disks.
The swap file and the hibernation file are usually among the earliest files created and reserved on the disk. They usually have one fragment, at most two. Thus, unless sectors are damaged and the disk has to reallocate physical sectors, a logical sequential write translates to a physical sequential write on a spinning disk.
No read-modify-write operations are necessary on the disk when a huge amount of sequential, contiguous data is being written. This problem is less pronounced on spinning hard disks, which can write individual sectors that are quite small (provided you don't write single bytes, which caching usually prevents, the device need not fetch the original contents and write back a modified version).
This is, however, very noticeable on SSDs, where every write means that, for example, a 512 kB block (a typical number, but it could be larger) has to be read and modified by the controller, then written back to a different block. While you can in principle write to (but not overwrite) smaller units on flash disks, you can only ever erase whole blocks; that is how the hardware works. This is why SSDs fare so much better on huge sequential writes, as the back-of-the-envelope calculation below illustrates.
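As a rough sketch (the 512 kB erase block and the 4 kB write size are assumed round numbers, and real controllers soften the worst case with caching and remapping), compare the flash traffic caused by scattered small writes with that of one big contiguous write:

```c
#include <stdio.h>

int main(void)
{
    const long long erase_block = 512 * 1024;               /* assumed erase-block size  */
    const long long write_size  = 4 * 1024;                 /* one small scattered write */
    const long long total       = 4LL * 1024 * 1024 * 1024; /* 4 GiB of payload          */

    /* Worst case: every small write hits a different erase block, so the
       controller reads, modifies, and rewrites a whole block each time. */
    long long scattered = (total / write_size) * erase_block;

    /* Sequential case: whole blocks are filled and nothing is read back. */
    long long sequential = total;

    printf("scattered : %lld GiB of flash traffic (%lldx amplification)\n",
           scattered >> 30, scattered / total);
    printf("sequential: %lld GiB of flash traffic (1x)\n", sequential >> 30);
    return 0;
}
```

In this worst-case model, the scattered writes push 128 times as much data through the flash as the single sequential dump does.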
(You can inspect how many fragments a file such as the hibernation file has with the fsutil utility on Windows.)