It’s a beautiful Saturday morning … the sun is out, and the birds are chirping … and here I am, thinking about RAID-5/6. What’s wrong with me?
Anyway, have you ever wondered why almost all your volumes are in a RAID-5/6 configuration? Like an obedient child, your answer would probably be “Oh, my vendor said it is good for me …”
In storage, the rule is applications-read, applications-write. And different applications have different behaviors but typically, they fall under 2 categories:
- Random access
- Sequential access
The next question to ask is what the Read/Write ratio (or percentage) is within the Random Access behavior, and what it is within the Sequential Access behavior.
We usually pigeonhole transactional databases such as SQL Server and Oracle into OLTP-type characteristics, with random access being the dominant access method. Similarly, email applications such as Exchange, Lotus and even SMTP are lumped into OLTP-type characteristics as well. We typically assume a Read-to-Write ratio of 2:1 or 3:1 for OLTP-type applications, that is, Read heavy with fewer Writes. Data warehouse types of databases tend to be more sequential.
However, even within these OLTP applications, there are sequential access behaviors as well, as the following table for a database shows:
| Operation | Random or Sequential | Read/Write Heavy | Block Size |
| --- | --- | --- | --- |
| DB-Log | Random (Sequential in log recovery) | Write Heavy unless you are doing log recovery | 1KB – 64KB |
| DB-Data Files | Random | Read/Write mix dependent on load | 4KB – 32KB |
| Batch insert | Sequential | Write Heavy | 8KB – 128KB |
| Index scan | Sequential | Read Heavy | 8KB – 128KB |
We will look into 4 RAID-levels in this scenario and see how each RAID-level applies to an OLTP-type of environment. These RAID levels are RAID-0, RAID-1 (1+0, 0+1 included), RAID-5 and RAID-6.
RAID-0 is the baseline, with 1 x Read and 1 x Write being processed as per normal.
RAID-1 requires 2 x Writes for every write, because the write operation is mirrored, while a read still costs 1 x Read. The RAID penalty is 2.
To avoid the cost of RAID-1, RAID-5 is almost always the RAID level of choice (unless you speak to those NetApp fellas). RAID-5 is a parity-based RAID, and every small write requires 2 x Reads (1 to read the old data block and 1 to read the old parity block) AND 2 x Writes (1 to write the modified block and 1 to write the modified parity). Hence it has a RAID penalty of 4.
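To see where the 4 comes from, here is a minimal Python sketch of a RAID-5 small write (read-modify-write). It is a toy model, not how any real controller is implemented: the `CountingDisk` class and the `raid5_small_write` function are hypothetical names made up for illustration, and each "block" is just an integer so the parity can be a plain XOR.

```python
class CountingDisk:
    """Toy disk: maps block number -> integer value and counts every I/O."""

    def __init__(self):
        self.blocks = {}
        self.reads = 0
        self.writes = 0

    def read(self, block):
        self.reads += 1
        return self.blocks.get(block, 0)

    def write(self, block, value):
        self.writes += 1
        self.blocks[block] = value


def raid5_small_write(data_disk, parity_disk, block, new_data):
    """Update one data block and its parity via read-modify-write."""
    old_data = data_disk.read(block)        # Read 1: old data block
    old_parity = parity_disk.read(block)    # Read 2: old parity block
    # XOR the old data out of the parity and XOR the new data in.
    new_parity = old_parity ^ old_data ^ new_data
    data_disk.write(block, new_data)        # Write 1: modified data block
    parity_disk.write(block, new_parity)    # Write 2: modified parity block


# A stripe with two data disks and one parity disk (set up without counting I/O).
data, other, parity = CountingDisk(), CountingDisk(), CountingDisk()
data.blocks[0] = 0b1010
other.blocks[0] = 0b0110
parity.blocks[0] = data.blocks[0] ^ other.blocks[0]

raid5_small_write(data, parity, 0, 0b1111)

total_ios = data.reads + data.writes + parity.reads + parity.writes
print("Back-end I/Os for one write:", total_ios)
print("Parity still consistent:", parity.blocks[0] == data.blocks[0] ^ other.blocks[0])
```

Note that the second data disk is never touched: the old data and old parity are enough to work out the new parity, which is exactly why the cost is 2 reads plus 2 writes.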
RAID-6 was introduced to address the risk of RAID-5, because disk capacities are so freaking large now (3TB just came out). Rebuilding a multi-terabyte drive takes a long time, and the RAID-5 volume is at risk if a second disk fails during the rebuild. Hence the double-parity RAID in RAID-6. But unfortunately, the RAID penalty for RAID-6 is 6 (3 x Reads and 3 x Writes, for the data block and the two parity blocks)!
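For a rough feel of why bigger drives mean a longer window of risk, here is a back-of-the-envelope sketch in Python. The sustained rebuild rate of 50 MB/s is purely an assumption for illustration (real rates depend on the array, the load and the vendor), and `rebuild_hours` is a hypothetical helper, not a formula from any product.

```python
def rebuild_hours(capacity_bytes, rebuild_bytes_per_sec):
    """Best-case hours to rewrite a whole drive at a constant rebuild rate."""
    return capacity_bytes / rebuild_bytes_per_sec / 3600


ASSUMED_REBUILD_RATE = 50 * 10**6  # 50 MB/s, an illustrative assumption only

hours_1tb = rebuild_hours(1 * 10**12, ASSUMED_REBUILD_RATE)
hours_3tb = rebuild_hours(3 * 10**12, ASSUMED_REBUILD_RATE)

print(f"1TB drive: about {hours_1tb:.1f} hours")
print(f"3TB drive: about {hours_3tb:.1f} hours")
```

Whatever rate you plug in, the rebuild time scales linearly with capacity, so a drive three times larger leaves the volume exposed three times longer.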
To summarize the RAID write penalty:
| RAID-level | Number of I/O for Reads | Number of I/O for Writes | RAID Write Penalty |
| --- | --- | --- | --- |
| 0 | 1 | 1 | 1 |
| 1 (1+0, 0+1) | 1 | 2 | 2 |
| 5 | 1 | 4 | 4 |
| 6 | 1 | 6 | 6 |
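As a quick illustration of what these penalties mean for a Read-heavy OLTP mix, here is a small Python sketch. It assumes a hypothetical workload of 900 front-end IOPS at a 2:1 Read-to-Write ratio (numbers picked only to keep the arithmetic clean), and `backend_iops` is a made-up helper name. It simply charges 1 I/O per read and the write penalty per write.

```python
# Write penalties taken from the summary table.
WRITE_PENALTY = {"RAID-0": 1, "RAID-1": 2, "RAID-5": 4, "RAID-6": 6}


def backend_iops(frontend_iops, read_parts, write_parts, raid_level):
    """Disk I/Os per second needed to serve a front-end load at a given R:W ratio."""
    total_parts = read_parts + write_parts
    reads = frontend_iops * read_parts / total_parts
    writes = frontend_iops * write_parts / total_parts
    # Each read costs 1 I/O; each write costs the RAID write penalty.
    return reads + writes * WRITE_PENALTY[raid_level]


# Hypothetical workload: 900 front-end IOPS at a 2:1 Read-to-Write ratio.
results = {level: backend_iops(900, 2, 1, level) for level in WRITE_PENALTY}

for level, iops in results.items():
    print(f"{level}: {iops:.0f} back-end IOPS")
```

Under these assumptions the same 900 front-end IOPS turn into 900, 1200, 1800 and 2400 back-end IOPS for RAID-0, RAID-1, RAID-5 and RAID-6 respectively, and the gap widens as the share of writes grows.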
So, it is well known that RAID-0 has good performance for reads and writes but absolutely no protection. RAID-1 is good for random reads and writes, but it is costly. RAID-5 is good for applications with a high ratio of sequential reads to writes (2:1, 3:1 as mentioned), and RAID-6, errr … should be treated much like RAID-5, with some additional performance penalty.
With that in mind, a storage administrator must question why a particular RAID-level was proposed for the database or any similar application.
I am going out to enjoy the Saturday now … and today, August 13th is the World’s Left-Handed Day. More about this RAID penalty and IOPS in my next entry.