I constantly transfer disk images and virtual machine images (usually 800 GB to nearly 1 TB per file) to a cloud server via rclone over SSH, and I wonder how reliable sha1sum and md5sum are for verifying the integrity of very large files.
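To make the kind of check concrete: a checksum of a file this size has to be computed in a streaming fashion, since the file does not fit in memory. Below is a minimal sketch using Python's standard `hashlib`; the function name `file_digest` and the 1 MiB chunk size are illustrative assumptions, not part of sha1sum, md5sum, or rclone.

```python
import hashlib


def file_digest(path, algorithm="sha1", chunk_size=1024 * 1024):
    """Return the hex digest of a file, reading it in fixed-size chunks.

    Hypothetical helper for illustration: memory use stays bounded by
    chunk_size regardless of how large the file is.
    """
    h = hashlib.new(algorithm)  # e.g. "md5", "sha1", "sha256"
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read() until it returns b""
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()
```

Running this on both the local and the remote copy and comparing the two hex strings should be equivalent to comparing the output of `sha1sum` (or `md5sum` with `algorithm="md5"`) on each side.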
I found this question: "How can I verify that a 1TB file transferred correctly?"
However, it deals with performance rather than the reliability of the generated hashes.
Could another file end up with the same hash, considering how many distinct files are out there?
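For a rough sense of scale, the chance of an accidental collision can be estimated with the birthday approximation. The sketch below assumes the hash behaves like a uniformly random function, so it says nothing about deliberately constructed collisions; the function name and the file counts passed to it are hypothetical, chosen only for illustration.

```python
import math


def accidental_collision_probability(num_files, hash_bits):
    """Birthday approximation: p ~ 1 - exp(-n(n-1) / 2^(bits+1)).

    Assumes an idealized hash with uniformly random outputs; this models
    accidental collisions only, not attacks.
    """
    exponent = -num_files * (num_files - 1) / 2 ** (hash_bits + 1)
    # -expm1(x) equals 1 - exp(x) but stays accurate when x is tiny
    return -math.expm1(exponent)


# Illustrative numbers only: a trillion distinct files
print(accidental_collision_probability(10**12, 128))  # MD5-sized digest
print(accidental_collision_probability(10**12, 160))  # SHA-1-sized digest
```

Note that the file size does not appear in this estimate: under the idealized assumption, only the digest length and the number of distinct files matter.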
So how reliable are MD5 and SHA-1 sums on very large files? Thanks.
I also found these regarding collisions: https://stackoverflow.com/questions/4032209/is-md5-still-good-enough-to-uniquely-identify-files
https://www.theregister.co.uk/2017/02/23/google_first_sha1_collision/