OpenZFS dRAID has risen!

We await the third iteration of TrueNAS® SCALE, version 23.10, codenamed Cobia. The 23.10 designation means October 2023, and we are within weeks of its release.

One of the features I have been waiting for most is dRAID, or distributed RAID. I wrote about dRAID a couple of years back. It was introduced in 2021, in OpenZFS 2.1, but we had not seen a commercial implementation of dRAID … until iXsystems™ TrueNAS® SCALE 23.10. Why am I so excited?

I have followed the technology since Isaac Huang presented dRAID at the OpenZFS Summit in 2015. Over the years that followed, I saw Isaac present dRAID at the summits, and with each iteration, dRAID got closer and closer to being merged into OpenZFS. It was not until 2021, with OpenZFS 2.1, that dRAID became part of the filesystem. And now, dRAID is finally in the TrueNAS® SCALE offering.

Knowing RAID resilvering

RAID rebuilding or reconstruction is a painful and potentially risky process. In OpenZFS and ZFS speak, this process is called resilvering. In simple layman's terms, when a drive (or drives) fails in a parity-based RAID volume (e.g. a RAID-Z1 or RAID-Z2 vdev), the data that was previously on the failed drive is recreated on the newly integrated spare drive. The structural integrity of the RAID volume (and the storage pool) is preserved, but the lost data is painstakingly remade through the mathematics of the volume's parity function.
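To make the parity idea concrete, here is a minimal Python sketch of single-parity reconstruction, the principle behind RAID-Z1. It only illustrates the XOR arithmetic; it is not the actual OpenZFS implementation, and the block values and helper function are mine.

    # Toy single-parity reconstruction: parity is the XOR of the data blocks,
    # so one lost block can be rebuilt by XOR-ing everything that survives.
    from functools import reduce

    def xor_blocks(blocks):
        return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

    data = [b"\x11\x22\x33\x44", b"\xaa\xbb\xcc\xdd", b"\x01\x02\x03\x04"]
    parity = xor_blocks(data)                 # written alongside the data

    lost = 1                                  # pretend drive 1 failed
    survivors = [d for i, d in enumerate(data) if i != lost]
    rebuilt = xor_blocks(survivors + [parity])

    assert rebuilt == data[lost]              # the "resilvered" block

RAID-Z2 extends the same idea with a second, independent parity function so that two simultaneous failures can be tolerated.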

When hard disk drives were small in capacity, say 2TB or less, the RAID resilvering process completed relatively quickly, returning the parity RAID volume to a normal, online state. But today, drives are 22TB and larger, and the traditional RAID resilvering process can take days or even weeks. This leaves the RAID volume vulnerable to another drive failure, further weakening its integrity. Even worse, most modern-day storage arrays have many disk drives, running into the thousands. And yes, solid state drives would probably resilver faster, but the same mechanics largely apply in OpenZFS.
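A back-of-the-envelope calculation shows why capacity growth hurts. Assuming the rebuild is limited by how fast the single replacement drive can be written, say 150 MB/s sustained (an illustrative figure of mine, not a measurement), the time scales linearly with drive size:

    # Rough resilver time if one spare drive's write speed is the bottleneck.
    def resilver_hours(capacity_tb, write_mb_per_s=150):
        return capacity_tb * 1_000_000 / write_mb_per_s / 3600

    print(f"{resilver_hours(2):5.1f} hours for a 2TB drive")    # ~3.7 hours
    print(f"{resilver_hours(22):5.1f} hours for a 22TB drive")  # ~40 hours, in the ideal case

In practice the pool is still serving I/O during the resilver, so the wall-clock time is usually far worse than this ideal figure.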

At the same time, the spare drives are physically assigned and designated to the OpenZFS storage pool, and they are not part of the vdev until the resilvering process kicks in.

Yes, this is pretty much a physical process that takes time, computing resources and patience. Note the operative word here: “physical”.

dRAID resilvering

dRAID speeds up the RAID resilvering process severalfold, returning the RAID volume (or vdev) to full redundancy much faster than the traditional OpenZFS resilvering process. It uses a logical (as opposed to physical) RAID layout and “logical spare drives”. Thus, many spare “blocks” are distributed across the entire dRAID vdev, as shown in the diagram below.

Traditional RAID vdev vs dRAID vdev

As you can see, the green spare blocks are distributed across the entire vdev instead of being dedicated physical spare drives. Because many disk spindles work in parallel, dRAID resilvering is much faster. This is critical in large-scale deployments of TrueNAS® SCALE.
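Here is a small Python sketch of the declustered idea: instead of reserving whole physical spare drives, spare capacity is scattered as slots across every drive in the vdev. The rotating layout below is a simplification of my own for illustration; OpenZFS dRAID actually uses precomputed permutation maps rather than this naive scheme.

    # Toy declustered layout: every drive holds mostly data/parity slots plus
    # a few distributed "spare" slots, so a rebuild can read from and write to
    # all drives at once instead of funnelling into one spare disk.
    DRIVES, ROWS, SPARES_PER_ROW = 12, 24, 1

    layout = [["data/parity"] * DRIVES for _ in range(ROWS)]
    for row in range(ROWS):
        for s in range(SPARES_PER_ROW):
            layout[row][(row + s) % DRIVES] = "SPARE"   # rotate the spare slot each row

    for d in range(DRIVES):
        spares = sum(1 for row in range(ROWS) if layout[row][d] == "SPARE")
        print(f"drive {d:2d}: {spares} spare slot(s)")

Because the spare capacity lives on every drive, a rebuild writes to all of them in parallel rather than serialising onto a single replacement disk.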

The idea of logical RAID resilvering isn't new. I wrote about RAID declustering technology in 2012. Dr. Garth Gibson, one of the co-creators of RAID, started tackling this RAID reconstruction/rebuilding/resilvering challenge more than a decade ago. Below are some early diagrams I used in my 2012 blog post explaining IBM GPFS declustered RAID.

The result and impact of dRAID are significant. As shown in the diagram below, the resilvering work is distributed far more evenly than in traditional RAID resilvering.

dRAID resilvering

The parallelism applied in dRAID shortens the resilvering process significantly.
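Under an idealised model, the speedup is roughly proportional to the number of drives participating in the rebuild. The figures below are illustrative assumptions of mine (per-drive throughput, vdev width), not benchmark results, and they ignore rebuild throttling and CPU overhead:

    # Ideal-scaling comparison: a traditional rebuild is capped by one spare
    # drive's write speed; a declustered rebuild spreads the same work across
    # the surviving drives.
    FAILED_TB, PER_DRIVE_MB_S, VDEV_WIDTH = 22, 150, 60

    traditional_h = FAILED_TB * 1e6 / PER_DRIVE_MB_S / 3600
    declustered_h = traditional_h / (VDEV_WIDTH - 1)

    print(f"traditional resilver : ~{traditional_h:.0f} hours")
    print(f"declustered resilver : ~{declustered_h:.1f} hours")

Real-world numbers will land well short of this ideal, but the direction of the improvement is what the diagram above shows.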

Moving forward

Having introduced and described dRAID, its application in the TrueNAS® SCALE storage array is obvious. dRAID is meant for arrays with large disk counts. Anything with fewer than 30 drives (my opinion; iXsystems™ will publish its best practices soon) is not a good candidate for dRAID. The maximum size for a dRAID vdev is 256 drives. There will be an impact on computing resources as well. Real-world results will be known once dRAID is out and used by TrueNAS® users.
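As a rough sizing aid, here is a hedged Python sketch that sanity-checks a hypothetical dRAID layout against the figures mentioned above. The capacity estimate is my own approximation (it ignores padding and metadata overhead) and is not an iXsystems™ best practice.

    # Sanity-check a hypothetical dRAID layout: enough children to hold one
    # redundancy group plus the distributed spares, within the 256-drive
    # ceiling mentioned above, and a rough usable-capacity estimate.
    def check_draid(children, parity, data, spares, drive_tb):
        assert parity + data + spares <= children <= 256, "layout does not fit"
        usable_tb = (children - spares) * drive_tb * data / (data + parity)
        return round(usable_tb, 1)

    # e.g. a 60-drive vdev of 22TB disks, double parity, 8 data blocks per
    # group, 2 distributed spares -> roughly 1020 TB usable (before overheads)
    print(check_draid(children=60, parity=2, data=8, spares=2, drive_tb=22))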

For now, this is what I am sharing in this blog: a primer on dRAID, coming soon in the TrueNAS® SCALE “Cobia” release.


