Overrank
Veteran Member
> I'm going to point out that the actual reason for his loss is not being addressed in this discussion. It wasn't addressed in his own video, either. Tony Northrup just posted a video about how he just found out that he has lost many old personal photographs without knowing about it, despite doing backups of backups.
>
> The reason for the loss was undiscovered corruption of data in his older stored files, and because he was not aware of the corruption, those damaged files were eventually replicated across all of his backups before he discovered the problem. By then it was too late.
>
> That particular issue requires a special solution, not more backups or different types of backups. It would be best addressed by a method of periodically - and automatically - checking all those stored files to identify any corrupted ones, so they can be restored from good backups while the backups are still good.
>
> There should be a reasonable way to run such a process on a regular basis. Is there?

You just create a hash (a short representation of the data as a number - see for example https://en.wikipedia.org/wiki/MD5 ) of the image when you store it and add it to a database. Periodically you recalculate the hash of every image and check it against the database. This will tell you (with a reasonable degree of accuracy) whether any files have changed. I do this (for a different reason) on my home NAS; it's pretty standard stuff.

> I know about hashes. What I don't know about are the specific (Windows) tools for regular folks to use for automating their creation and running such periodic checks. What are those?

I don't know what Windows tools are available. I wrote my own in Python - I generate the hashes on the NAS using a Linux find command on a regular basis, and then I have a Python program which can tell me whether every hash in one list is contained in the master list (I use it to know that I've copied everything off SD cards etc.).
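The hash-database workflow described above can be sketched in a few lines of Python. This is a minimal illustration, not the poster's actual script: the function names are hypothetical, and SHA-256 is my choice of hash (the post links MD5 only as an example).

```python
import hashlib
from pathlib import Path


def file_sha256(path, chunk_size=1 << 20):
    """Hash one file in chunks so large images don't need to fit in RAM."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def hash_tree(root):
    """Return {relative_path: hex_digest} for every file under root."""
    root = Path(root)
    return {p.relative_to(root).as_posix(): file_sha256(p)
            for p in sorted(root.rglob("*")) if p.is_file()}


def missing_from_master(new_hashes, master_hashes):
    """Hashes in a new list (e.g. an SD card) absent from the master list."""
    return set(new_hashes.values()) - set(master_hashes.values())
```

Running `hash_tree` once and saving the result gives you the "database"; rerunning it later and comparing digests flags silently changed files, and `missing_from_master` answers the "did I copy everything off this card?" question.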
> My main use would be to run these periodic checks on the primary repository for files, my NAS, and maybe some folders on the computer's internal drives. I'd run it on the external backup drives only occasionally. Those normally stay disconnected from everything until they need to be either written or read.

If it's a Linux-based NAS it's pretty simple to generate the master list. If you sorted the master list and your new list into order then you could probably use something like diff ( https://en.wikipedia.org/wiki/Diff ) to tell you what had changed (I'm sure Windows versions will be available).

Which sort of leads me on to another point that gets lost in these discussions: you can only reasonably compare preservation methods if you're doing best in class for all storage mechanisms. So you can only reasonably compare museum-quality storage in temperature-controlled facilities vs rotated backups (*) with off-site storage and data integrity checking. At which point it's down to which format you will be able to read in the future: film - yes; digital - probably, if you choose the right format.

(*) https://en.wikipedia.org/wiki/Backup_rotation_scheme