Logical and economical storage of an image archive

Look, this is pretty simple as I see it. I have about 40,000 NEFs, which take up about 500 GB. That's my data. I need to keep two copies (one copy on each of two drives) in my system, and one in a different place. If your needs are somehow radically different from this, then there is something I'm missing.
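If you want to script the "two local copies" part, something like this works (an untested sketch; the paths are placeholders, and any sync tool such as rsync or robocopy does the same job):

    import filecmp
    import shutil
    from pathlib import Path

    # Placeholder locations -- substitute your own archive and backup drives.
    SOURCE = Path("D:/NEF_archive")
    MIRRORS = [Path("E:/NEF_archive"), Path("F:/NEF_archive")]

    def mirror(src: Path, dst: Path) -> None:
        """Copy every file that is missing from, or differs in, the mirror."""
        for f in src.rglob("*"):
            if not f.is_file():
                continue
            target = dst / f.relative_to(src)
            if not target.exists() or not filecmp.cmp(f, target, shallow=False):
                target.parent.mkdir(parents=True, exist_ok=True)
                shutil.copy2(f, target)  # copy2 preserves timestamps

    for m in MIRRORS:
        mirror(SOURCE, m)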
 
jamesdak wrote:

I think the storage of both copies of the file is critical. Maybe I am just confused as to what you are asking.
What's critical is having multiple copies - the more the better. And having them in physically separate locations reduces risk even further.

When you do that, any one copy isn't critical. You can lose one copy without any repercussions because you can recover from the other(s).

If it comes to a choice between having more copies and making the copies you have more robust, choose more copies - it's safer, because no matter how robust your storage system is, your data is still vulnerable to accidental deletion, corruption, theft, and similar risks.


The only sound reason to choose in favour of more robustness is if you rely on the availability of your data for your livelihood. For example, if you're a wedding photographer and you can't afford to be without your clients' photos for the time it takes, after a drive fails (or is stolen, etc.), to retrieve the other copy and restore it. That's when robustness earns its keep. It doesn't absolve you of the need for external backups, though...
 
BobSC wrote:
malch wrote:
jamesdak wrote:
"5+ years old" is my new stuff, LOL. My newest system was built in 2005.
The reliability of items like power supplies and disk drives plummets pretty badly after 5 years. So building systems out of components that old is unlikely to work out well without a great deal of good luck.

P.S. I generally dispose of disks once they get to 5 years old. The small capacity, high power consumption, and poor reliability make them more of a liability than an asset.
Depends on what they were in the first place. Our Dell business class machines seem to just keep running.
Even enterprise-class drives have a 5-year rating. They can and often will run long past that, but it defeats the point of "archiving" if you're using drives of that age. It would be another reason to use ZFS with frequent disk scrubs (or some sort of automated checksumming) - if you start having to repair blocks each week, it's time to move on.
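For anyone not on ZFS, the automated-checksumming half of that can be approximated in user space. A rough sketch (the archive path is hypothetical) that records SHA-256 sums once and then flags anything that later reads back differently:

    import hashlib
    import json
    from pathlib import Path

    ARCHIVE = Path("/mnt/archive")        # hypothetical mount point
    MANIFEST = ARCHIVE / "manifest.json"

    def sha256(path: Path) -> str:
        """Hash a file in 1 MiB chunks so large images don't need to fit in RAM."""
        h = hashlib.sha256()
        with path.open("rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        return h.hexdigest()

    def build_manifest() -> None:
        sums = {str(p.relative_to(ARCHIVE)): sha256(p)
                for p in ARCHIVE.rglob("*") if p.is_file() and p != MANIFEST}
        MANIFEST.write_text(json.dumps(sums, indent=2))

    def scrub() -> None:
        """Re-read everything and report silent corruption -- a poor man's zpool scrub."""
        sums = json.loads(MANIFEST.read_text())
        for rel, expected in sums.items():
            p = ARCHIVE / rel
            if not p.exists() or sha256(p) != expected:
                print(f"MISMATCH: {rel}")  # restore this one from a known-good copy

    if __name__ == "__main__":
        if not MANIFEST.exists():
            build_manifest()
        else:
            scrub()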

As for the overall thread, it's the same BS as ever. We really need to have a sticky where we can cover myths like

1) raid = backups

or my favorite

2) raid is too hard and expensive

and

3) sneakernetting my offsite backup once a month is actually an intelligent idea.
 
kelpdiver wrote:
As for the overall thread, it's the same BS as ever. We really need to have a sticky where we can cover myths like raid = backups...
One man's BS is another man's treasure, but the sticky would be valuable... :-)
 
Thanks to all who helped with this. My current method still looks like the best approach: multiple copies on multiple drives. I just need to consolidate my USB drives to make life easier for me.
 
malch wrote:
AFAIK, Dell uses the same basic disk drive models in its business-class machines that it does in its consumer products.

However, the power supplies in the business class machines are definitely superior.



It depends on which you get. Some come with, e.g., VelociRaptor drives. I have a feeling (completely unsubstantiated) that superior power supplies help things run longer. Whatever.


Maybe we're just lucky, but for whatever reason, the stuff here just seems to keep going. At work we run Windows Home Server OEM on a Dell OptiPlex that we bought in 2004. It's been running 24/7 since then, excepting power outages. The boot drive /just/ failed the other day. Took about 2 hours to replace. No big deal.

The data lives on a separate external drive (as does the server backup drive). But it also gets backed up by a client PC with excess storage.


The thing most people don't think about when it comes to backup PCs is that if they fail, you've only lost the backup. So long as you don't lose both the original and the backup /at the same time/, it doesn't matter so much if one /or/ the other goes down. Yeah, it's a pain, but you don't lose anything. When you think about what will cause two PCs to stop functioning at the same time, it tends to be things other than the age of the equipment. Things like floods and fires.

It's cheap, and it works. Safety through redundancy. Haven't lost any file through hardware failure since 1991 (knocking on wood).
 
BobSC wrote:

When you think about what will cause two PCs to stop functioning at the same time, it tends to be things other than the age of the equipment. Things like floods and fires.
Not quite. There was a noteworthy white paper a couple of years back about the failure of RAID to deliver on its promise. A key argument was that recovering from a failure is when other weaknesses in the backup get exposed. This was a particularly bad aspect of RAID 5 recovery: while one drive showed an obvious failure, the massive reads and writes of the rebuild exposed bad blocks on another drive, ones that weren't obvious before.

So if you backup to old hardware, you need to do more than just verify you can log into it or read a few files off of it.
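Putting rough numbers on the rebuild risk (a back-of-the-envelope sketch; the drive sizes and the 10^14 URE spec are illustrative, not taken from the paper):

    import math

    URE_PER_BIT = 1e-14     # commonly quoted consumer-drive spec
    DRIVE_BYTES = 2e12      # illustrative 2 TB drives
    SURVIVORS = 3           # e.g. a 4-drive RAID 5 after one drive fails

    # A rebuild must read every surviving drive end to end.
    bits_read = DRIVE_BYTES * 8 * SURVIVORS
    p_hit = 1 - math.exp(-URE_PER_BIT * bits_read)   # Poisson approximation
    print(f"~{p_hit:.0%} chance of hitting at least one URE mid-rebuild")
    # roughly 38% with these numbers -- which is why a latent bad block on a
    # second drive so often surfaces exactly when you can least afford it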
 
BobSC wrote:

Maybe we're just lucky, but for whatever reason, the stuff here just seems to keep going. At work we run Windows Home Server OEM on a Dell Optiplex that we bought in 2004. It's been running 24/7 since then, excepting power outages.
Pretty much the same here. Two things that I do that really help:

1. Run critical systems on a good UPS (versus a $5 Walmart surge protector).

2. Clean the crud out of the vents and fans 3 or 4 times a year.
 
malch wrote:

2. Clean the crud out of the vents and fans 3 or 4 times a year.
I sprung for a very nice Antec P182 case for my latest system, and one of the features that unexpectedly pleased me greatly was the dust filters on the air intakes. They work great, are easily removed, and clean up quickly with a vacuum. It'll be a feature I look for in my next case...

Image from the Club Overclocker web site
 
Sean Nelson wrote:
kelpdiver wrote:
We really need to have a sticky where we can cover myths like
...sneakernetting my offsite backup once a month is actually an intelligent idea.
It's free and it's effective. What's the downside?
Appropriate for the sticky, which, if properly written, will have a fair pro and con for items like this.

IMO, it's neither free nor effective when you need to transport that drive to a secure location that actually protects against earthquakes or large hurricanes. I need it 100 miles away, just as my prior company created its datacenters around the world in pairs with a minimum distance between them.

Having a month-old backup is better than nothing, but it sucks.
 
kelpdiver wrote:
So if you backup to old hardware, you need to do more than just verify you can log into it or read a few files off of it.



But I keep mentioning the drives used for the backup are new -- it's the PCs that are old.
 
BobSC wrote:
But I keep mentioning the drives used for the backup are new -- it's the PCs that are old.
That's a good compromise on cost and age... though you'd still want to be doing block-validation checks even with newer drives. I recently replaced a three-year-old 2TB EADS drive that started out mostly working (with occasional bit rot detected) and progressed to total failure. I have another that I'm leery of, but the replacement I got let me pull it out of rotation, and now I can do some offline disk scans.
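If you want to script such an offline scan rather than rely on the drive's own SMART long test, a sketch (assumes Linux, needs root, and /dev/sdX is a placeholder for the suspect drive):

    DEVICE = "/dev/sdX"   # placeholder -- the suspect drive, NOT one that's mounted
    CHUNK = 1 << 20       # read 1 MiB at a time

    bad = 0
    offset = 0
    with open(DEVICE, "rb", buffering=0) as dev:
        while True:
            try:
                data = dev.read(CHUNK)
            except OSError:
                bad += 1
                print(f"read error near byte {offset:,}")
                offset += CHUNK
                dev.seek(offset)   # skip past the bad region and keep going
                continue
            if not data:           # end of device
                break
            offset += len(data)
    print(f"scan complete, {bad} unreadable region(s) found")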
 
kelpdiver wrote:
IMO, it's neither free nor effective when you need to transport that drive to a secure location that actually protects against earthquakes or large hurricanes. I need it 100 miles away, just as my prior company created its datacenters around the world in pairs with a minimum distance between them.
Living in the SF Bay Area I've considered that.

If my home is destroyed to the point that disks can't be recovered and my backup location (20 miles away) is similarly devastated, I'm not sure I'll be too worried about the data.

Food, water, and survival will be higher priorities, most likely for a long time.

Of course, a Fortune 1000 company would want more resilience, as at least one major disk drive manufacturer discovered following the Thai floods.

OTOH, I'm preparing a Xmas parcel to send to family overseas. Maybe I should include a USB disk or a dozen or so DVDs with my most critical data...
 
malch wrote:
Living in the SF Bay Area I've considered that.

If my home is destroyed to the point that disks can't be recovered and my backup location (20 miles away) is similarly devastated, I'm not sure I'll be too worried about the data.

Food, water, and survival will be higher priorities, most likely for a long time.
For a few weeks sure. Life may suck after a 7+ quake. But after the essentials are taken care of, I'm going to want my lifetime of photos back. Losing it because my house collapsed and so did my office/bank/etc - unacceptable.

20 miles isn't enough. 50 is the minimum, and Sacramento is much better - above any flood plains. (Coincidentally, my last two companies had their primary local DCs in Rancho Cordova and Roseville.)
 
kelpdiver wrote:

For a few weeks sure. Life may suck after a 7+ quake. But after the essentials are taken care of, I'm going to want my lifetime of photos back. Losing it because my house collapsed and so did my office/bank/etc - unacceptable.
I reckon my data (and home) will survive a 7.0 okay and I'm not too worried about that.

But a 9 or 10 could be a real problem for a very long time.
 
kelpdiver wrote:
That's a good compromise on cost and age... though you'd still want to be doing block-validation checks even with newer drives. I recently replaced a three-year-old 2TB EADS drive that started out mostly working (with occasional bit rot detected) and progressed to total failure. I have another that I'm leery of, but the replacement I got let me pull it out of rotation, and now I can do some offline disk scans.
When Retrospect does its backups it does a binary validation: it reads every file, writes it to the backup drive, then reads everything back and compares it against the source. If it starts throwing errors I get concerned. The trick is to remember to check the Retrospect logs routinely.
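The same verify-after-write idea is easy to replicate for ad hoc copies. A sketch (paths are hypothetical) that byte-compares a backup tree against its source, much like that validation pass:

    import filecmp
    from pathlib import Path

    SOURCE = Path("/data/photos")    # hypothetical paths
    BACKUP = Path("/backup/photos")

    for f in SOURCE.rglob("*"):
        if not f.is_file():
            continue
        twin = BACKUP / f.relative_to(SOURCE)
        if not twin.exists():
            print(f"MISSING: {twin}")
        elif not filecmp.cmp(f, twin, shallow=False):  # full byte-for-byte compare
            print(f"DIFFERS: {twin}")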

Also, on Windows Home Server there is no easy way to schedule a backup, but running one manually is easy enough -- log in and click the "backup now" button.

With every file on three different drives in the office and one at a remote location, we feel fairly safe.
 
