RAID Explained (Simplified)

Whether it is an SSD, an HDD, or a USB flash drive, every storage device will stop working eventually. Data loss can be critical in any scenario: you might lose personal files like family pictures or important work documents. You can reduce the risk of data loss by spreading the job of storage across multiple storage devices. A RAID (Redundant Array of Inexpensive Disks) configuration can make this easy and time-saving.

Disclaimer

By "backup", I simply mean a copy of data. This post is extremely simplified and was made for non-tech-savvy readers. I will only explain RAID 0, 1, 5, 6, and 10 because I feel they are the most useful for the common person. This may be helpful if all you plan to store is scanned papers, personal photos, personal videos, and other minor files. For anything more demanding, consult a storage engineer.

RAID 0

All drives basically merge into a single hard drive volume. No matter how many storage devices are in the array, they show up together as one storage device on your computer. Because each file is spread across the drives, if one of the drives fails, you don't just lose the data contained in the failed drive; you lose the data on all drives.

You are not protected from data loss with RAID 0. Although this sounds pointless in terms of backing up your data, RAID 0 shines when it is combined with other RAID setups, as seen in RAID 10.

This requires at least two drives.
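
To see why one failed drive takes everything with it, here is a minimal sketch of how data could be dealt out across drives. The 4-byte chunk size, the function names, and the sample data are all made up for illustration; a real RAID 0 setup uses much larger chunks and works below the file level.

```python
def stripe(data: bytes, drive_count: int, chunk_size: int = 4) -> list[list[bytes]]:
    """Deal fixed-size chunks of data across the drives in round-robin order."""
    drives = [[] for _ in range(drive_count)]
    for start in range(0, len(data), chunk_size):
        chunk_index = start // chunk_size
        drives[chunk_index % drive_count].append(data[start:start + chunk_size])
    return drives


def unstripe(drives: list[list[bytes]]) -> bytes:
    """Rebuild the original data by reading chunks back in round-robin order."""
    pieces = []
    longest = max(len(drive) for drive in drives)
    for row in range(longest):
        for drive in drives:
            if row < len(drive):
                pieces.append(drive[row])
    return b"".join(pieces)


photo = b"family-photo-bytes-0123456789"  # hypothetical file contents
drives = stripe(photo, drive_count=2)

# With both drives present, the file reads back intact.
assert unstripe(drives) == photo

# Simulate losing the second drive: only every other chunk survives.
surviving = b"".join(drives[0])
print(surviving)
```

What is left on the surviving drive is a file with holes in it, which is why the whole array counts as lost.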

RAID 1

Imagine you have two hard drives. When writing to your storage, e.g., saving a file, the data will be written to both hard drives. The two drives contain the same data. This means that if one of your hard drives fails, you still have a copy. This also implies that, at most, only half of your total storage space will be usable.

To increase your peace of mind, you can add more drives to the array. The more drives you have, the more copies of data you have. You can have up to 32 storage drives in the array, meaning 32 copies of the same data. The chances of all drives failing get lower as you add more. So long as you have at least one drive alive, you still have all of your data. Adding more drives also means your write speeds will decrease because the computer will be writing a multiplied volume of data.

Adding more drives means more redundant drives, not more usable storage space.

This requires at least two drives.
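
Here is a rough sketch of how the odds improve with each extra copy. It assumes drives fail independently of each other, which is optimistic (drives from the same batch or in the same enclosure can fail together), and the 5% figure is a made-up number, not a real failure rate.

```python
def chance_all_copies_lost(per_drive_failure_chance: float, drive_count: int) -> float:
    """Chance that every mirrored drive fails, assuming independent failures."""
    return per_drive_failure_chance ** drive_count


# Hypothetical figure: each drive has a 5% chance of failing before you replace it.
for drive_count in (2, 3, 4):
    chance = chance_all_copies_lost(0.05, drive_count)
    print(f"{drive_count} drives: {chance:.6%} chance of losing every copy")
```

Each added drive multiplies the chance of total loss by the per-drive failure chance, so the number shrinks quickly.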

RAID 5

Like RAID 0, all of the drives are merged so that your computer only sees one storage volume. The difference is that RAID 5 can tolerate the failure of one drive: if one drive is lost, you can still access your data. This works because one drive's worth of storage space is reserved for recovery information (called parity) that can rebuild the missing data. Because that space is reserved, you are left with less usable storage space than with RAID 0.

This requires at least three drives.
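
The reserved space holds parity, which RAID 5 computes with XOR. Below is a minimal sketch of the idea with two data blocks and one parity block. The drive names and contents are made up, and a real RAID 5 array spreads the parity across all drives rather than keeping it on one.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Two data blocks on two drives, with parity stored on a third.
drive_a = b"PHOTO123"
drive_b = b"VIDEO456"
parity = xor_blocks([drive_a, drive_b])

# Drive B dies. XOR-ing everything that is left rebuilds its contents.
rebuilt_b = xor_blocks([drive_a, parity])
assert rebuilt_b == drive_b
print(rebuilt_b)
```

The same trick works no matter which single drive is lost, but it cannot recover two missing blocks at once.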

RAID 6

This behaves almost exactly like RAID 5. While RAID 5 only tolerates the loss of one drive, RAID 6 tolerates the loss of two drives, at the cost of reserving more storage space.

This requires at least four drives.

RAID 10

RAID 10 is basically RAID 1 configurations inside of a RAID 0 configuration.

Thanks to RAID 0, you can increase storage simply by adding more drives. Thanks to RAID 1, you can create as many duplicates of drives as you want. Thanks to RAID 10, you get both. You can have expandable storage space and have copies of the expanded storage.

This requires at least four drives.
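
One consequence of nesting RAID 1 inside RAID 0 is that it matters which drives fail. The sketch below assumes a four-drive array made of two mirrored pairs (the drive names are made up) and checks every possible two-drive failure.

```python
from itertools import combinations

# Hypothetical four-drive layout: two mirrored pairs, striped together.
mirror_pairs = [("A1", "A2"), ("B1", "B2")]
all_drives = [drive for pair in mirror_pairs for drive in pair]


def array_survives(failed: set[str]) -> bool:
    """The array survives if every mirrored pair still has one working drive."""
    return all(any(drive not in failed for drive in pair) for pair in mirror_pairs)


two_drive_failures = list(combinations(all_drives, 2))
survivable = [pair for pair in two_drive_failures if array_survives(set(pair))]
print(f"{len(survivable)} of {len(two_drive_failures)} two-drive failures are survivable")
```

Under these assumptions, any single failure is survivable, and a second failure is only fatal when it hits the partner of the drive that already failed.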

What should I pick?

Picking the best solution depends on many variables. Consult a storage engineer for complex situations. Here is my general guideline:

If you do not care about having extra copies of your data and all you want is to make your multiple drives appear virtually as one, RAID 0 is easily the way to go.

RAID 1 is ideal where data loss is absolutely intolerable, such as for data archival. Having many duplicated drives mathematically means you have a better chance of at least one storage drive surviving. As long as you replace failed drives quickly, it is unlikely that all drives would fail before you can do so.

RAID 5 or 6 is ideal where a lot of storage space with some fault tolerance is your main concern. With RAID 1, at most half of your total storage space is usable. RAID 5 and 6 are more storage-efficient because they reserve only one or two drives' worth of space, no matter how many drives are in the array. You still have some fault tolerance, but not nearly as much as RAID 1's potential.

When you need a different balance of capacity and redundancy, consider combining RAID configurations, as is done in RAID 10, RAID 51, or RAID 61.
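
To put numbers on that trade-off, here is a small sketch that estimates usable space for each level covered in this post. It is a simplified model: it assumes all drives are the same size, that RAID 10 uses two-way mirrors with an even number of drives, and it ignores formatting overhead. The function name and the 4 x 2 TB example are my own.

```python
def usable_tb(level: str, drive_count: int, drive_size_tb: float) -> float:
    """Usable space for an array of equal-size drives (simplified model)."""
    minimum_drives = {"RAID 0": 2, "RAID 1": 2, "RAID 5": 3, "RAID 6": 4, "RAID 10": 4}
    if drive_count < minimum_drives[level]:
        raise ValueError(f"{level} needs at least {minimum_drives[level]} drives")
    if level == "RAID 0":
        data_drives = drive_count        # everything is usable, nothing is protected
    elif level == "RAID 1":
        data_drives = 1                  # every extra drive is another copy
    elif level == "RAID 5":
        data_drives = drive_count - 1    # one drive's worth is reserved
    elif level == "RAID 6":
        data_drives = drive_count - 2    # two drives' worth is reserved
    else:                                # RAID 10, assuming two-way mirrors
        data_drives = drive_count // 2
    return data_drives * drive_size_tb


for level in ("RAID 0", "RAID 1", "RAID 5", "RAID 6", "RAID 10"):
    usable = usable_tb(level, drive_count=4, drive_size_tb=2.0)
    print(f"{level}: {usable} TB usable out of 8.0 TB")
```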

Misc. Noteworthy

In any RAID array, it is recommended to use identical storage drives.

  • Not all storage drives are suited to running 24/7. A randomly chosen storage drive may have a shorter lifespan than expected, and an unexpected failure could be a critical issue. A drive like the Seagate Ironwolf is rated for 24/7 operation.
  • RAID will only read as fast as the drive with the slowest reads and only write as fast as the drive with the slowest writes. Throwing an SSD into an array of HDDs does not mean you will increase speed.
  • The capacity of the smallest storage drive is the most that will be used on each of the other drives. Any extra space on larger drives goes unused.
  • It is sometimes possible to restore data from failed drives through data recovery services. This is not guaranteed, as sometimes data is simply unrecoverable. Even if your data is recoverable, you should expect to pay a large price.
  • Proper maintenance is needed. When a drive fails, you should replace it ASAP. Anticipate that every drive will fail eventually and replace drives on a schedule, perhaps every 3 years.
  • Replacing multiple storage drives at once could result in data loss. Replace them one by one and wait for RAID processing to complete before you replace the next.
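
The point about the smallest drive setting the limit can be sketched with a little arithmetic. The drive sizes below are made up for illustration.

```python
def unused_space_tb(drive_sizes_tb: list[float]) -> float:
    """Space left unused because every drive is treated as the smallest one."""
    smallest = min(drive_sizes_tb)
    return sum(size - smallest for size in drive_sizes_tb)


mixed = [2.0, 2.0, 4.0]  # hypothetical drive sizes in TB
print(f"Each drive is used as {min(mixed)} TB; {unused_space_tb(mixed)} TB goes unused")
```

With identical drives the unused amount is zero, which is one reason matching drives are recommended.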

My hardware for archival storage

For a simple, fast, and cost-effective archival solution, all you need is a few hard drives and a direct-attached storage (DAS) enclosure. The DAS I use is the QNAP TR-004. It is like a case that you plug storage drives into. It has a USB port so you can hook it up to your computer as if it were a USB flash drive. It has four storage drive slots and supports RAID. Since this is a typical archival storage solution, I do not need the extremely high speeds found on SSDs. Modern HDDs like the Seagate Ironwolf offer more than enough performance. The Seagate Ironwolf in particular is well tested for RAID and 24/7 operation.

I bought two 2TB HDDs and put them into the four-slot DAS. I then chose to use RAID 1. I left two slots available in the DAS in case I need to expand my storage space by adding two more drives. In that case, I'd switch to RAID 10, which needs at least four drives.

Summary

All drives are expected to fail eventually; they don't last forever. Picking a RAID level that provides redundancy can help prevent data loss. Put plenty of thought and research into your decision to get the best results.