I'm using robocopy to copy a large number of files (300,000 files totalling around 4TB) from one NAS drive to another. Here is a sample command:

robocopy \\nas1\myfolder \\nas2\myfolder /E /R:1 /W:5

I've done this hundreds of times before across a wide variety of folders and drives without any issue. In each case, as it finished copying a file, it reported 100% before moving on to the next.

However, I'm suddenly getting less-than-100% transfers being reported. Usually 98-99% but sometimes as low as 30%. See image:

[Image: robocopy file transfers]

The robocopy command is still running (expected to take several hours) so I cannot yet determine the validity of the copied files or do any checks.

In what circumstances would robocopy do this? Is there a problem with the transfer or is it reporting partial progress in threads or something?

Update: Small files (< 1k) tend to be worst affected, reporting < 50% in many cases. Large files (> 10MB) tend to report 98%+

  • Windows network shares can serve multiple clients at once, so even while robocopy is running you can check whether an original file and its copy are equal. Generate the SHA-1 or SHA-256 hash of both files and compare them, or use fc.exe /B on the command line for a direct binary comparison.
    – Robert
    Commented Apr 30, 2020 at 9:51
  • @Robert thanks for this - I have spot checked several files and they are identical. However, it's hard to verify large numbers of files easily. I'm very puzzled why it is reporting < 100%. I'll post the summary at the end of the command, once it's finished running. Commented Apr 30, 2020 at 10:24
  • Verifying a large number of files is easy using the sha1sum utility. Run it over a directory to print a list of files and their SHA-1 hashes. Save that output and use it to verify the files in a second directory. A Win32 build is available here, for example: lists.gnupg.org/pipermail/gnupg-announce/2004q4/000184.html
    – Robert
    Commented Apr 30, 2020 at 10:32
  • Thanks, will check it out. Commented Apr 30, 2020 at 10:53
  • My guess, and it is only a guess: robocopy takes the file size from the "size on disk" that Windows reports (a multiple of the allocation unit size), but counts the bytes actually copied to calculate completion. The minimum "size on disk" for a non-empty file is 4KB by default on an NTFS volume (or 256KB on an exFAT volume), while the minimum number of bytes to transfer is 1. A 1-byte file would then show 0.024% completion, and any file smaller than 2KB would report less than 50%.
    – 7heo.tk
    Commented Mar 26, 2021 at 20:29
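
The hash comparison suggested in the comments can be sketched in a few lines of Python. This is a minimal sketch, not something from the thread: the function names (file_sha256, hash_tree, compare_trees) are made up for illustration, and it assumes both shares are reachable as ordinary paths from the machine running it. It uses SHA-256, one of the two hashes mentioned above.

```python
import hashlib
from pathlib import Path


def file_sha256(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of one file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_tree(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): file_sha256(p)
        for p in root.rglob("*")
        if p.is_file()
    }


def compare_trees(source_root, dest_root):
    """Return (missing, mismatched) lists of relative paths, each sorted.

    missing:    present in the source but absent from the destination
    mismatched: present in both, but with different contents
    """
    source = hash_tree(source_root)
    dest = hash_tree(dest_root)
    missing = sorted(name for name in source if name not in dest)
    mismatched = sorted(
        name for name in source if name in dest and source[name] != dest[name]
    )
    return missing, mismatched
```

A call such as compare_trees(r"\\nas1\myfolder", r"\\nas2\myfolder") would read every byte on both sides, so on a 4TB tree it is slow; pointing it at one subfolder at a time is a reasonable compromise.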

2 Answers


I believe robocopy will see files as having different sizes if the underlying block size of the destination volume differs from that of the source volume (e.g. a file of only a few bytes may have a size-on-disk of 4k on NTFS, but only 1k on the NAS).

This often shows up when people mirror from an NTFS volume to a NAS, but I have only seen it mentioned for the compare phase, not for the copy itself.

I think you can tell Samba to use a specific block size, so if the two do not already match you can configure the destination to use the same block size as the source.
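
To make the size-on-disk idea concrete, here is the arithmetic as a small Python sketch. It only models the hypothesis (percentage = bytes copied divided by allocated size); nothing here is confirmed robocopy behavior, and the function names and the 4096-byte cluster size are assumptions for illustration.

```python
import math


def size_on_disk(file_bytes, cluster_bytes):
    """Round a file's byte count up to a whole number of clusters."""
    if file_bytes == 0:
        return 0
    return math.ceil(file_bytes / cluster_bytes) * cluster_bytes


def hypothetical_percent(file_bytes, cluster_bytes):
    """Percentage that would be shown IF progress were measured as
    bytes copied / size on disk. This is the thread's guess, not a
    documented robocopy formula."""
    allocated = size_on_disk(file_bytes, cluster_bytes)
    if allocated == 0:
        return 100.0
    return 100.0 * file_bytes / allocated


if __name__ == "__main__":
    for size in (1, 500, 2048, 10_000_000):
        print(size, round(hypothetical_percent(size, 4096), 3))
```

Under this model a file smaller than half a cluster shows under 50%, while a file of many megabytes lands very close to 100%, because at most one partially filled cluster is wasted.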


SPECULATION ON WHAT IS HAPPENING W/O ANY TESTING:

  1. Files showing less than 100% may have copied over but are awaiting some integrity test later in the processing (maybe when robocopy is almost finished).
  2. Files showing less than 100% may have failed: you specified a single retry after 5 seconds (/R:1 /W:5), so a network hiccup could have caused the failure.
  3. It is copying multiple files concurrently and is genuinely listing files it has only partially copied so far.

Things you could try:

  • robocopy /J: Copies using unbuffered I/O (recommended for large files)
  • robocopy /NOOFFLOAD: Copies files without using the Windows Copy Offload mechanism
  • robocopy /Z /MT:32 /LOG+:<LogFile>:
    • /Z = Copies files in restartable mode. In restartable mode, should a file copy be interrupted, robocopy can pick up where it left off rather than re-copying the entire file.
    • /MT:32 = Creates multi-threaded copies with n threads (here 32). n must be an integer between 1 and 128. The default value for n is 8. For better performance, redirect your output using the /LOG option.
    • /LOG+ = Appends output to a log file, which is helpful for following up on any files that failed
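
If you would rather script the run than read the console, something like the following Python wrapper is one option. It is a rough sketch: build_command, describe_exit_code, copy_had_failures and run_copy are hypothetical helper names, the switches are the ones from the question plus /Z, /MT:n and /LOG+ from the list above, and the exit-code meanings are robocopy's documented bit flags (anything of 8 or above means at least one failure).

```python
import subprocess

# Robocopy's exit code is a bitmask; these are its documented flags.
EXIT_FLAGS = {
    1: "one or more files were copied",
    2: "extra files or directories exist in the destination",
    4: "mismatched files or directories were detected",
    8: "some files or directories could not be copied",
    16: "serious error; robocopy did not copy any files",
}


def build_command(source, dest, log_file, threads=32):
    """Build the argument list; /MT accepts 1 to 128 threads."""
    if not 1 <= threads <= 128:
        raise ValueError("threads must be between 1 and 128")
    return [
        "robocopy", source, dest,
        "/E", "/R:1", "/W:5",      # switches from the original command
        "/Z",                      # restartable mode
        f"/MT:{threads}",          # multi-threaded copy
        f"/LOG+:{log_file}",       # append output to a log file
    ]


def describe_exit_code(code):
    """Translate an exit code into the list of conditions it encodes."""
    if code == 0:
        return ["no files were copied and no errors occurred"]
    return [text for bit, text in EXIT_FLAGS.items() if code & bit]


def copy_had_failures(code):
    """Exit codes of 8 or higher mean at least one copy failed."""
    return code >= 8


def run_copy(source, dest, log_file):
    """Run robocopy (Windows only) and return its decoded exit code."""
    # check is left at its default (False): non-zero codes below 8
    # are not errors for robocopy.
    result = subprocess.run(build_command(source, dest, log_file))
    return describe_exit_code(result.returncode)
```

Checking the exit code and then searching the log for the failed entries is usually quicker than scanning the percentage column for every one of 300,000 files.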
