
Thanks for reading.

I intend to use Parchive to add redundancy to my storage and backup.

Why? I have been using computers since 1980 (Apple II). I have used cassette tapes, 360KB, 1.2MB and 1.44MB disks, etc., and I have noticed that hard drives are not as reliable as I would like. Worse than that, I sometimes get corrupted files without any warning: I read them from the HD with no errors, but they are corrupted.

I would like to have a way to check the integrity, and also recover.

My intention: have 2 directories:

  • "Data" - My files.

  • "Parchives" - A copy of the "Data" tree, with parchives of each file of the "Data" directories, on the same level.

So I have:

  • "X:\Data\Projects\test1.cpp"
  • "X:\Parchives\Projects\test1.cpp.par2"

This way I have all parchives separated from the data. I can choose to back up "Data" on 3 external HDDs and "Parchives" on 8 external HDDs (relax, it's just an example...)

I intend to create a C# program to keep track of "Data" and "Parchives". It can verify the integrity, and it can also update the tree (files that have been moved, renamed, created, deleted, changed, etc.)
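The mapping between the two trees can be sketched in a few lines. This is only an illustration in Python, not the planned C# program; `par2_path_for` is a hypothetical helper name, and the roots `X:\Data` and `X:\Parchives` are taken from the example paths above.

```python
from pathlib import PureWindowsPath


def par2_path_for(data_file, data_root, par_root):
    """Map a file under data_root to its .par2 path at the same level under par_root."""
    relative = PureWindowsPath(data_file).relative_to(PureWindowsPath(data_root))
    # Append ".par2" to the full file name, so "test1.cpp" becomes "test1.cpp.par2".
    return PureWindowsPath(par_root) / relative.parent / (relative.name + ".par2")


print(par2_path_for(r"X:\Data\Projects\test1.cpp", r"X:\Data", r"X:\Parchives"))
# X:\Parchives\Projects\test1.cpp.par2
```

`PureWindowsPath` is used so the sketch handles the Windows-style paths the same way on any operating system.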

The problem is... errr... I don't really know how to use parchive...

I downloaded "par2cmdline-0.2.x86.win32.zip" and ran some tests.

It creates 8 "par2" files for each source file (8!!!!)

  40.408B   Test_1.par2
  44.012B   Test_1.vol000+01.par2
  87.924B   Test_1.vol001+02.par2
 135.440B   Test_1.vol003+04.par2
 190.164B   Test_1.vol007+08.par2
 259.304B   Test_1.vol015+16.par2
 357.276B   Test_1.vol031+32.par2
 375.296B   Test_1.vol063+37.par2

Total Size: 1.489.824 bytes

I can use the "-n1" option, but it still creates 2 files:

  40.408B   Test_2.par2
 642.656B   Test_2.vol000+100.par2

Total Size: 683.084 bytes

The total size is also smaller, so I guess it is less secure.

Questions:

1) Can I reduce to only 1 "par2" file? No way?...

2) How can I get the same redundancy level when using the "-n1" option? I noticed that with "-n1 -r15" I get almost the same total size as the 8 "par2" files, with only 2 files:

   40.408B  Test_3.par2
1.444.072B  Test_3.vol000+300.par2

Total Size: 1.484.480 bytes

Is this the same thing? (the "-r15" option gives 15% redundancy instead of standard 5%)

3) Am I doing something really stupid? Is there a better way?

Thank you!

  • Maybe a dumb question, but why not use SATA RAID with 2 identical hard drives in a MIRROR? Files won't get corrupted, and once a disk is failing, the SATA RAID controller will let you know during boot. In addition, you can then take out the defective hard drive, replace it with an identical one, and the RAID controller will rebuild the array.
    – LPChip
    Commented Aug 23, 2016 at 18:14
  • Not really a dumb question, but I don't like SATA RAID because...
    – user633285
    Commented Aug 23, 2016 at 19:18
  • Not a dumb question. I tried a NAS with that, but performance really sucks (NAS nature). I don't like SATA RAID because... (1) You must have a special controller and/or a driver (2) I can't tell when to start a Verify job (3) The SATA RAID will tell me I have 2 different files, but which one is good? Sometimes data is read from the HD with no errors. (4) It can't really recover data, just copy the good file over the bad one (5) I want to verify and repair on a PC that does not have the RAID controller. And finally... (6) Yes, I might use a MIRROR RAID, but I will not rely 100% on it. Thanks for your question.
    – user633285
    Commented Aug 23, 2016 at 19:35

1 Answer


I have to assume that no one has answered this because the answers are readily available in the documentation (i.e. man par2cmdline or https://github.com/Parchive/par2cmdline). Having said that, here are the quick answers for anyone else who happens across this question.

1) Firstly, it sounds like you're asking if you can have the index and parity information in a single file. The answer is no. The index file will always be created, and it can be used without parity data to test whether a file is damaged (though you might want to consider md5sum or similar in that case). However, if you're asking if you can create just an index file without any parity data, the answer is yes: supply a zero to the -r option, i.e. -r0. So:

par2 c -r0 X:\Parchives\Projects\test1.cpp.par2 X:\Data\Projects\test1.cpp
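For a tool like the one described in the question, the same command line could be assembled programmatically. The following is a minimal sketch in Python rather than C#; `build_create_command` is a hypothetical helper, it assumes a `par2` executable is on the PATH, and it only uses the `c`, `-r` and `-n` options that appear in this thread.

```python
def build_create_command(par2_file, source_file, redundancy=5, recovery_files=None):
    """Return the argument list for a 'par2 c' (create) invocation."""
    cmd = ["par2", "c", f"-r{redundancy}"]
    if recovery_files is not None:
        # -n limits how many recovery (vol) files are written.
        cmd.append(f"-n{recovery_files}")
    cmd += [par2_file, source_file]
    return cmd


# Index file only, no parity data (same as the command above):
print(build_create_command(r"X:\Parchives\Projects\test1.cpp.par2",
                           r"X:\Data\Projects\test1.cpp", redundancy=0))

# 15% redundancy in a single recovery file, as tried in the question:
print(build_create_command(r"X:\Parchives\Projects\test1.cpp.par2",
                           r"X:\Data\Projects\test1.cpp",
                           redundancy=15, recovery_files=1))
```

The returned list can be passed to `subprocess.run(cmd, check=True)`; building a list instead of a single string avoids quoting problems with paths that contain spaces.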

2) As stated in the documentation, the -n option only controls the number of parity files created; it has no effect on the amount of parity data created or on the amount of damage that can be repaired. Keep in mind that Parchive was originally created for Usenet/news downloads, where the parity files also had to be downloaded. Splitting the parity data into files of increasing size meant that lightly damaged files could be repaired with the smaller parity files, so it was not necessary to download all the available parity information to repair a file with minor damage. However, this split has no real benefit when the files and parity data are available locally, so generating multiple parity files makes no sense there. In that case, the only reason to create more than one parity file is if the amount of parity data itself is too large.
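The file names in the question illustrate this. In par2cmdline's naming scheme, as I understand it, a name like `vol003+04` means "starts at recovery block 3 and holds 4 blocks", so the block counts can be read straight off the names. A small Python sketch (the `recovery_blocks` helper is hypothetical; the file names are copied from the question):

```python
import re

# Matches the ".volSTART+COUNT.par2" suffix of a recovery file name.
VOLUME = re.compile(r"\.vol(\d+)\+(\d+)\.par2$")


def recovery_blocks(names):
    """Sum the block counts encoded in the recovery file names."""
    total = 0
    for name in names:
        match = VOLUME.search(name)
        if match:  # the plain index file (e.g. Test_1.par2) has no vol suffix
            total += int(match.group(2))
    return total


runs = {
    "default": ["Test_1.par2", "Test_1.vol000+01.par2", "Test_1.vol001+02.par2",
                "Test_1.vol003+04.par2", "Test_1.vol007+08.par2",
                "Test_1.vol015+16.par2", "Test_1.vol031+32.par2",
                "Test_1.vol063+37.par2"],
    "-n1": ["Test_2.par2", "Test_2.vol000+100.par2"],
    "-n1 -r15": ["Test_3.par2", "Test_3.vol000+300.par2"],
}

for label, names in runs.items():
    print(label, recovery_blocks(names))
# default 100
# -n1 100
# -n1 -r15 300
```

Going by the names alone, the default run and the -n1 run both hold 100 recovery blocks, while -r15 raises the count to 300.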

3) Personally, no, I don't think you're doing anything stupid. You might consider WinRAR, which can create its own recovery records (similar to Parchive) that are embedded in the RAR file itself, though of course that's a non-free option. The other option is simply to mirror the data to reliable external storage, md5 the data (or use the par2 index data, which can cover multiple files and nested directories, which md5sum can't, at least not directly), and use that to test for corruption and restore from the mirror.
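The checksum approach can be sketched as follows. This is a minimal Python illustration under my own assumptions, not a finished tool: `md5_of`, `build_manifest` and `changed_files` are hypothetical helper names, and the manifest is simply a dictionary of relative path to MD5 digest for every file under a directory tree.

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its MD5 digest, recursively."""
    root = Path(root)
    return {p.relative_to(root).as_posix(): md5_of(p)
            for p in sorted(root.rglob("*")) if p.is_file()}


def changed_files(root, manifest):
    """List files that were added, removed or altered since the manifest was built."""
    current = build_manifest(root)
    return sorted(name for name in manifest.keys() | current.keys()
                  if manifest.get(name) != current.get(name))
```

A manifest built right after a backup can be stored next to the mirror; any name later reported by `changed_files` is a candidate for restoring from the mirror. Note that a checksum alone cannot tell silent corruption apart from a deliberate edit, so the manifest has to be rebuilt whenever files are changed on purpose.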
