RAIDs, SSDs, iCLOUD & Performance
A question I am getting more and more these days revolves around something Jerry Thompson asks:
While I am interested in performance and speed between [Thunderbolt and USB 3], I find I am not completely understanding all I need to regarding RAID technology.
Or, as Craig McKenna writes:
[I recently bought] a 120 GB external SSD with Thunderbolt, I’m wondering how you would go about organizing my media.
I’ve spent a lot of time reviewing specific storage products. In this article, I want to take a step back and discuss storage performance in general.
A RAID (Redundant Array of Inexpensive Disks) is a collection of hard drives that creates a pool of storage that is both very large and very fast. To the computer, and on your desktop, it looks like a single very big, very fast hard drive. A RAID builds all the drives into a single box with a single connection to the computer. (Yes, you can custom-build a RAID using stand-alone disks, but let’s just keep this simple.)
RAIDs are categorized into “levels,” which describe a combination of speed, redundancy, and price.
NOTE: “Redundancy” is defined as the ability to recover data in the event that one or more hard drives die.
For the purposes of this example, let’s assume each of the RAIDs below contains 1 TB drives which transfer data at 100 MB/second. (For comparison, the fastest a single drive can transfer data is around 120 MB/sec assuming it is connected internally to a recent-issue Mac.)
- RAID 0 is fast and inexpensive, but provides no redundancy. It generally contains two drives. So, in our example, a RAID 0 would store 2 TB of data, transferring at 200 MB/sec. However, if you lose either drive in a RAID 0, you lose all your data. These are a good choice for speed at a low cost.
- RAID 1 is slow and inexpensive, but with full redundancy. It generally contains two drives that hold identical copies of your data, so it is only as fast as the slowest drive and only holds as much as the smallest drive. So, in our example, a RAID 1 would store 1 TB of data and transfer data at 100 MB/sec. The good news is that if you lose either drive, all your data is safe. I use these on set when capturing tapeless media from the camera.
- RAID 5 is very fast and more expensive, but with full redundancy. RAID 5 means that you can lose one drive and still recover all your data. A RAID 5, though costing more, is the preferred system for video editing. It requires a minimum of three drives, but can contain as many as ten. A RAID 5 provides the combined storage of all drives in the RAID, minus one – which is reserved for data redundancy in the case of a drive failure. So, in our example, a five-drive RAID 5 would store 4 TB (5 TB – 1 TB). It provides 400 MB/sec of data transfer (500 MB/sec – 100 MB/sec).
- RAID 6 is essentially the same as RAID 5, except that you can lose two drives at the same time, without losing your data. To do this, RAID 6 reserves two drives for data redundancy, and requires a minimum of four drives in the unit. A RAID 6 provides the combined storage and speed of all drives in the RAID enclosure, minus two. So, in our example, a five drive RAID 6 would provide 3 TB of storage (5-2) and 300 MB/sec of data transfer (500-200).
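To make the arithmetic above easy to repeat for other drive counts, here is a minimal Python sketch. It assumes the same simplified model used in this article (every drive is the same size and speed, and redundancy costs one or two drives' worth of both storage and speed). The function name `raid_estimate` and its parameters are my own inventions for illustration; real-world RAID speeds depend on the controller and the connection, so treat the results as rough estimates only.

```python
def raid_estimate(level, drives, drive_tb=1.0, drive_mb_s=100.0):
    """Return (usable TB, MB/sec) using this article's simplified model.

    Assumes identical drives; the defaults match the example of
    1 TB drives that each transfer 100 MB/sec.
    """
    minimum_drives = {0: 2, 1: 2, 5: 3, 6: 4}
    if level not in minimum_drives:
        raise ValueError(f"unsupported RAID level: {level}")
    if drives < minimum_drives[level]:
        raise ValueError(
            f"RAID {level} needs at least {minimum_drives[level]} drives"
        )

    if level == 0:
        data_drives = drives        # no redundancy: every drive counts
    elif level == 1:
        data_drives = 1             # mirror: acts like one drive
    elif level == 5:
        data_drives = drives - 1    # one drive's worth reserved
    else:
        data_drives = drives - 2    # RAID 6: two drives' worth reserved

    return data_drives * drive_tb, data_drives * drive_mb_s


# The four examples from the list above.
for level, drives in [(0, 2), (1, 2), (5, 5), (6, 5)]:
    tb, speed = raid_estimate(level, drives)
    print(f"RAID {level} with {drives} drives: {tb:g} TB at {speed:g} MB/sec")
```

Changing `drive_tb` or `drive_mb_s` lets you plug in the drives you are actually considering.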
There are other RAID levels – 3, 10, 11, 50 and 60 – but these four are the most relevant to individual video editors. From my point of view, while RAID 6 provides more security, I am happy with the balance of speed vs security of RAID 5, which is what I recommend most.
NOTE: Drobo is a special case. In general, all RAIDs must use drives of the same size and speed, and all drives need to be installed at the time you first power up the system. Drobo, on the other hand, has invented a new technology which allows you to add drives, or mix and match drives of different sizes, even after you’ve put the RAID into operation. While Drobo does not provide the fastest RAIDs, this flexibility can be a huge benefit.
SIDEBAR: HOW DATA REDUNDANCY WORKS
This is so cool… This works because all digital data is stored as either a 1 or a 0.
Imagine a 3D checkerboard — let’s make it 5 stories high. Look down on the top left square and count the number of checkers on that square for each of the top four layers.
If they total an odd number, put a checker on the same square on the bottom layer. If they total an even number, don’t put a checker on the same square on the bottom layer.
Now, remove the second layer and all its checkers, and put in an empty new checkerboard to take its place. By counting the number of checkers on the remaining top three layers and comparing the total to the indicator on the bottom layer, you can exactly rebuild all the missing checkers on the second layer. For example, if the total of the other three layers is even, and there’s a checker on the bottom layer, add a checker to the new layer. If the total of the other three layers is odd, and there’s a checker on the bottom layer, don’t add a checker to the new layer.
This is exactly how RAID redundancy works, except that each checkerboard represents a hard drive in the RAID. The same counting rule works no matter which drive failed: the RAID only needs to compare the totals on the surviving hard disks with the total stored on the redundancy disk. This technique works whether you have three drives – the minimum – or twenty drives. The only differences are that more drives take longer to count, and that only one drive can fail at a time.
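For readers who like to see it run, the checkerboard trick can be sketched in a few lines of Python. Counting "odd or even" across layers is the same as the XOR operation on bits, so this sketch XORs four pretend data drives together to make a fifth parity drive, then rebuilds a "failed" drive from the survivors. The drive contents and the helper name `parity` are made up for illustration; this is a toy model of the idea, not how any particular RAID controller is implemented.

```python
from functools import reduce


def parity(blocks):
    """XOR equal-length byte blocks together, position by position.

    A result bit is 1 when an odd number of blocks have a 1 in that
    spot, which is the "put a checker on the bottom layer" rule.
    """
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Four pretend data drives (the top four checkerboard layers).
data_drives = [b"EDIT", b"CLIP", b"REEL", b"TAKE"]

# The fifth drive (the bottom layer) stores the parity.
parity_drive = parity(data_drives)

# The second drive fails: keep only the other three data drives.
survivors = data_drives[:1] + data_drives[2:]

# Count the survivors together with the parity drive to rebuild it.
rebuilt = parity(survivors + [parity_drive])
print(rebuilt)  # b'CLIP'
```

The same `parity` function both creates the redundancy data and rebuilds the missing drive, which is why the scheme works regardless of which single drive is lost.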
An SSD (Solid State Drive) is essentially flash memory that has been configured to act like a regular hard disk. You copy and move files around in it the same as a hard disk. And, unlike RAM, it remembers your data when the power is turned off. Depending upon which version of the operating system you are using, an SSD ranges from “so-so” performance to blinding. Later versions of the Mac operating system do a much better job supporting SSD drives.
The big benefit an SSD provides is speed. Its two big limitations are cost and limited storage size.
While you can put an SSD drive anywhere you can put a “normal” hard disk (which we often call “spinning media”), the best place to put an SSD drive is inside your computer as a replacement for your boot drive.
Attaching an SSD drive externally via FireWire 800 will severely limit its performance and is not recommended. You won’t see any significant speed improvement because FireWire 800 is too slow.
NOTE: There is a limitation of SSD, however, in that it only allows a certain number of writes before the unit starts to fail. While the overall longevity of SSD is still being determined, for now, assume that you will need to replace an SSD drive sooner than a spinning media drive – probably after 3-4 years of normal use.
iCloud, and other Internet services like Dropbox and YouSendIt, are essentially file servers that store your files outside of your computer.
If we ignore issues like file security, these services are excellent for backing up data, sharing files between devices, and moving files between computer systems. However, they are not good for storing source media files for editing. It isn’t because they don’t store enough. Just the opposite: these services can store a vast amount of data. The problem is that the connection speed – called the “data transfer rate” – between your computer and the Cloud is too slow. Video editing requires data transfer rates far beyond anything supplied by even the fastest DSL or cable modem.
Use the Cloud for sharing, but not for storing or editing source files.
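To put rough numbers on that gap, here is a small Python sketch. Internet speeds are usually quoted in megabits per second, while drive speeds are quoted in megabytes per second, and there are eight bits in a byte. The 20 Mbit/sec connection and the 100 GB media folder below are purely hypothetical figures I picked for illustration, and the function names are my own; substitute your own numbers.

```python
def mbit_to_mbyte_per_sec(mbit_per_sec):
    """Convert megabits/sec (Internet speeds) to megabytes/sec (drive speeds)."""
    return mbit_per_sec / 8


def hours_to_transfer(size_gb, mbit_per_sec):
    """Hours to move size_gb gigabytes, ignoring overhead (1 GB = 1,000 MB)."""
    seconds = (size_gb * 1000) / mbit_to_mbyte_per_sec(mbit_per_sec)
    return seconds / 3600


connection_mbit = 20  # hypothetical Internet connection speed
media_gb = 100        # hypothetical folder of source media

print(f"{connection_mbit} Mbit/sec is {mbit_to_mbyte_per_sec(connection_mbit)} MB/sec")
print(f"Moving {media_gb} GB takes about "
      f"{hours_to_transfer(media_gb, connection_mbit):.1f} hours")
```

Under these assumptions the connection delivers 2.5 MB/sec, a small fraction of the 100 MB/sec single drive used in the RAID examples earlier.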
WHAT IS THUNDERBOLT?
Thunderbolt is a method for connecting monitors and hard disks to your system. In this regard it is just like FireWire or USB – it’s a cable and communication protocol that moves data to and from your computer and storage.
The big benefit to Thunderbolt is that it is REALLY fast! More than 1 GB/sec of data transfer speed! However, in order for that speed to be realized, you need a REALLY fast RAID. A two-drive RAID 0 won’t begin to fill a Thunderbolt “pipe.”
Thunderbolt is how you connect your drive to your computer. The speed you get will depend upon the speed of the RAID you have attached. Here are some very general expectations for data transfer:
- A two-drive RAID 0 – which is the most popular RAID right now – should support about 180 – 210 MB/sec via Thunderbolt.
- A five-drive RAID 5 should support about 350 – 425 MB/sec.
- A ten-drive RAID 5 should support about 850 MB/sec – 1.0 GB/sec.
NOTE: A single drive connected via Thunderbolt will be only marginally faster than the same drive connected via FireWire. In order to see significant performance improvement, you’ll need to use a RAID that contains at least four hard disks.
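One way to think about these numbers is that the speed you get is set by whichever is slower: the storage or the connection. The Python sketch below models that with a simple `min()`. The interface ceilings are my assumptions: about 100 MB/sec for FireWire 800 (its nominal 800 Mbit/sec divided by eight) and 1,000 MB/sec for Thunderbolt (the "more than 1 GB/sec" figure above, rounded down). Real-world results will be lower than these ceilings, so read this as an illustration of the bottleneck rather than a benchmark.

```python
# Assumed interface ceilings in MB/sec (see the caveats above).
INTERFACE_MB_S = {
    "FireWire 800": 100,   # assumption: nominal 800 Mbit/sec divided by 8
    "Thunderbolt": 1000,   # assumption: "more than 1 GB/sec", rounded down
}


def effective_speed(storage_mb_s, interface):
    """The slower of the storage and the connection sets the speed."""
    return min(storage_mb_s, INTERFACE_MB_S[interface])


# Storage speeds taken from the simplified examples in this article.
storage_examples = {
    "single drive": 120,
    "two-drive RAID 0": 200,
    "five-drive RAID 5": 400,
}

for name, speed in storage_examples.items():
    for interface in INTERFACE_MB_S:
        print(f"{name} via {interface}: "
              f"{effective_speed(speed, interface)} MB/sec")
```

In this model a single drive barely benefits from Thunderbolt, while the five-drive RAID 5 is held back badly by FireWire 800.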
GETTING THE BEST PERFORMANCE
For best performance, I recommend replacing the spinning media hard drive inside your computer – this is also called the boot drive – with an SSD drive.
In general, media should not be stored on your boot drive. This means that only applications and the operating system are stored on the boot drive – along with other files that tend to be small, like email or word processing documents. If you have a large iTunes collection, or large iPhoto library, moving them to an external drive may allow better performance.
If I were setting up a new system, which I am doing next month, I would get a Mac with an SSD drive as the boot drive, and a Thunderbolt RAID 5 drive for media and project files.
My current boot drive uses 148 GB to store all applications and operating system files. I have hundreds of apps which don’t take a lot of storage. So, you don’t need to get a gigantic SSD drive – 250 – 500 GB is more than sufficient.
My media RAID, though, can’t be big enough. I haven’t decided exactly what I’m getting, but I’m looking for 6-10 TB (that’s Terabytes) of storage. I’ve discovered that hard drives have two states: empty or full. I want this one to remain as empty as possible for as long as possible.
This configuration provides a huge speed boost for the operating system and applications, while providing extremely fast access to huge amounts of media, with full redundancy in case of drive failure. This setup also offers a good balance between price and performance.
FOR MORE INFORMATION
Here is an article that explains hard disk and RAID performance and video formats in more detail. I highly recommend you read this article to understand the speeds you can expect from a storage device, how much space it takes to store media, and the data transfer rates of popular video codecs.