
Is RAID 6 Safe Enough?

We have had a busy year recovering data for people, and now seems like a good time to revisit current thinking about how you store your media.  In the following paragraphs I will identify current issues with data storage and simple steps you can take to safeguard your media.

In the good old days of offline editorial, storage integrity was less critical because you always had your master tapes.  In a pinch, you could redigitize or reassemble to get back to where you were if your storage crashed. Where would we have been without timecode!  With today’s file-based production pipelines, recovery from a disaster is not so simple. Storage integrity is THE most important thing to think about.

Many people say “no problem, my material is RAID protected” and in many cases, they go on for years with excellent results.  What could go wrong? As it turns out, a lot!

By way of definition, there are four basic RAID levels we care about in terms of our data storage:

  • RAID 0:  Data is striped across multiple drives to gain drive speed.  There is NO data protection and in the event of a drive failure, you lose all of your data, not just what was on the failed drive.
  • RAID 1: Known as mirroring, this format writes the same thing to two separate drives.  If one drive fails you have a complete replica on the other. It’s great for data recovery but provides no write speed increase and forces you to buy double the amount of storage you need.
  • RAID 5: Okay, now we’re talking.  Using a minimum of three drives, a RAID 5 provides data protection without forcing you to double the amount of storage you buy.  Parity data is striped across all of the member drives, and when one drive fails you simply replace it and the RAID rebuilds.
  • RAID 6:  RAID 6 is now the preferred RAID format for most enterprise storage solutions. It has all of the features of RAID 5 plus the added benefit that you can lose up to two drives from a RAID and still preserve the integrity of your data.
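To put rough numbers on the trade-offs above, here is a minimal Python sketch that computes usable capacity and the number of drive failures each level can survive. The function name raid_summary and the drive counts and sizes in the usage example are made up for illustration. The figures ignore hot spares and file system overhead, and RAID 1 is modeled as a simple mirror set.

```python
def raid_summary(level, drives, drive_tb):
    """Return (usable_tb, failures_tolerated) for one RAID set.

    Simplified model: all drives are the same size, with no hot spares
    and no file system overhead.
    """
    minimum_drives = {0: 2, 1: 2, 5: 3, 6: 4}
    if level not in minimum_drives:
        raise ValueError(f"unsupported RAID level: {level}")
    if drives < minimum_drives[level]:
        raise ValueError(
            f"RAID {level} needs at least {minimum_drives[level]} drives"
        )

    if level == 0:
        # Pure striping: all capacity is usable, nothing is protected.
        return drives * drive_tb, 0
    if level == 1:
        # Mirroring: every drive holds the same data.
        return drive_tb, drives - 1
    if level == 5:
        # One drive's worth of capacity goes to parity.
        return (drives - 1) * drive_tb, 1
    # RAID 6: two drives' worth of capacity goes to parity.
    return (drives - 2) * drive_tb, 2


if __name__ == "__main__":
    # Hypothetical example: eight 4 TB drives (two for the mirror).
    for level in (0, 1, 5, 6):
        count = 2 if level == 1 else 8
        usable, tolerated = raid_summary(level, count, 4)
        print(f"RAID {level}: {usable} TB usable, survives {tolerated} failure(s)")
```

The point of the comparison is that RAID 6 costs only one more drive of capacity than RAID 5 in exchange for surviving a second failure.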

RAID 6 sounds great, you say, so why is it not enough?

Here is a scenario that happened to one of our clients this year.  Names, including the manufacturer’s, have been changed to protect the innocent.  Our client produces episodic television and has over 180 TB of SAN/NAS storage online.  They have reliably cut hundreds of episodes using this storage without loss of speed or data.  Like most independent companies, they scale their staffing based on the number of shows they have in the door.  During a recent hiatus their IT manager took on another show and was not present in the machine room. At some point one of the owners walked in and noticed a blinking red light on the array.  Our technician came on site and realized that this array had actually lost two drives, not one, and if it lost one more they would lose a huge amount of original media. Should just be a quick rebuild, right?  It didn’t turn out that way….

After inserting a new drive, the rebuild process started.  The problem was that during the rebuild, the RAID controller had to read the sectors on the surviving drives and checked that they were not corrupt.  As some of you may know, all hard drives develop errors, and the drives keep track of the bad sectors so that they are not used. These drives had been running for at least three years and as a result, certain sectors had degraded on the good drives.  During the rebuild the controller encountered these bad sectors and declared one of the good drives bad. Uh oh…. Now we had three drives down, and we know what that means. No bueno!
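To get a feel for why a rebuild is such a risky moment, here is a rough back-of-the-envelope sketch in Python. It estimates the chance of hitting at least one unreadable sector while reading a given amount of data. The error rate of one unrecoverable read error per 10^14 bits is an assumption (a commonly quoted spec-sheet figure), not a number from this incident, and the function name and drive sizes are made up for illustration. Real drives, especially aging ones, can do better or worse.

```python
import math


def p_read_error(tb_read, ure_per_bit=1e-14):
    """Chance of at least one unrecoverable read error in tb_read terabytes.

    Assumes every bit fails independently at the same rate, which is a
    simplification; ure_per_bit is an assumed spec-sheet style value.
    """
    bits = tb_read * 1e12 * 8  # decimal terabytes to bits
    # 1 - (1 - p)^bits, computed in a numerically stable way.
    return -math.expm1(bits * math.log1p(-ure_per_bit))


if __name__ == "__main__":
    # Hypothetical rebuilds that must read every surviving drive in full.
    for surviving_drives, drive_tb in ((6, 2), (10, 4), (14, 4)):
        total_tb = surviving_drives * drive_tb
        chance = p_read_error(total_tb)
        print(f"{total_tb} TB read during rebuild: {chance:.0%} chance of a read error")
```

Under these assumptions the odds climb quickly with array size, which is why an array that has already used up its redundancy has so little margin left during a rebuild.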

Fortunately, we were able to “trick” the controller into believing the drive was okay, and we set about recovering the media.  We and our client were very thankful this turned out okay, but it could easily have gone the other way.

There are some simple steps you can take to safeguard your media and avoid a perfect storm of data loss:

  • Set up Notifications – Almost all enterprise storage solutions include the ability to send out an email notification to responsible parties when the RAID detects a problem.
  • Refresh your Drives – Replace, or at a minimum reformat, your RAID drives every three years to minimize downtime.
  • Check your Environment – Bad power or insufficient cooling can really affect the long-term health of your storage.
  • Back it Up – Always make an additional copy of your media.  Always.
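An additional copy only helps if it actually matches the original. As one way to check, here is a minimal Python sketch that walks a source folder and confirms that every file exists in the backup with the same SHA-256 checksum. The function names and the folder paths in the usage example are hypothetical, and on a large media volume a full pass like this will take a long time, so treat it as a starting point rather than a finished tool.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large media files never sit fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup(source_dir, backup_dir):
    """Return a list of (relative_path, problem) for files that failed the check.

    problem is "missing" if the backup has no such file, or "mismatch"
    if the checksums differ. An empty list means every source file matched.
    """
    source_dir = Path(source_dir)
    backup_dir = Path(backup_dir)
    problems = []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_dir)
        dst = backup_dir / rel
        if not dst.is_file():
            problems.append((rel.as_posix(), "missing"))
        elif sha256_of(src) != sha256_of(dst):
            problems.append((rel.as_posix(), "mismatch"))
    return problems


if __name__ == "__main__":
    # Hypothetical paths; substitute your own volumes.
    for name, problem in verify_backup("/Volumes/SAN/Media", "/Volumes/Backup/Media"):
        print(f"{problem}: {name}")
```

Running a check like this on a schedule, and after every big ingest, gives you an early warning that does not depend on anyone noticing a blinking red light.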

 

This article was written by:

John Pankratz, President of Metro Digital Group
