The burgeoning world of NVMe

When I wrote the article “Let’s smoke this storage peace pipe” 5 years ago, I said:

NVMe® and NVMe®oF™, as it evolves, can become the Great Peacemaker and bringing both divides and uniting them into a single storage fabric.

I envisioned NVMe® and NVMe®oF™ setting the equilibrium at the storage architecture level, merging the great storage fabrics into one. This balance in the storage ecosystem, at the level of storage interface specifications and protocol language, is rapidly unifying storage today, and we are already seeing end-to-end NVMe paths running directly from the PCIe bus of one host to another, via networks over Ethernet (with RoCE, iWARP, and TCP flavours) and Fibre Channel™. Technically, we can have an endpoint device, for example a tablet, talking the same NVMe language to its embedded storage as well as to cloud NVMe storage in an exascale storage system far, far away. In the past, there were just too many bridges, links, viaducts, aqueducts, bypasses, tunnels and flyovers to cross just to deliver a storage command or a piece of data, encased and encoded (and decoded) in so many different formats and ways.

Colours in equilibrium, like the rainbow

Simple basics of NVMe®

SATA (Serial ATA) and SAS (Serial Attached SCSI) are not optimized for solid state devices. Compared with legacy stuff like AHCI (Advanced Host Controller Interface) in SATA and archaic SCSI-3 primitives in SAS, NVMe® has so much to offer. It can achieve very high bandwidth and supports up to 65,535 I/O queues, each with a queue depth of up to 65,535. The queue depth alone is a massive jump compared to SAS, which has a queue depth limit of 256.
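To put those numbers in perspective, here is a back-of-the-envelope calculation in Python. It only uses the figures quoted in this section (65,535 queues and a depth of 65,535 for NVMe®, 256 for SAS). The variable and function names are mine, and these are theoretical ceilings, not what any real device sustains.

```python
# Figures quoted in this article; theoretical maximums only.
NVME_MAX_IO_QUEUES = 65_535
NVME_MAX_QUEUE_DEPTH = 65_535
SAS_QUEUE_DEPTH = 256


def max_outstanding(queues: int, depth: int) -> int:
    """Upper bound on commands in flight: number of queues times depth per queue."""
    return queues * depth


# NVMe: many queues, each of them deep.
nvme_total = max_outstanding(NVME_MAX_IO_QUEUES, NVME_MAX_QUEUE_DEPTH)

# SAS: modelled here as a single queue with the depth limit quoted above.
sas_total = max_outstanding(1, SAS_QUEUE_DEPTH)

# How much deeper a single NVMe queue is than the SAS limit.
depth_ratio = NVME_MAX_QUEUE_DEPTH / SAS_QUEUE_DEPTH

if __name__ == "__main__":
    print(f"NVMe theoretical commands in flight: {nvme_total:,}")
    print(f"SAS commands in flight:              {sas_total:,}")
    print(f"Per-queue depth ratio (NVMe / SAS):  {depth_ratio:.1f}x")
```

A single NVMe® queue is roughly 256 times deeper than the SAS limit, and that is before multiplying by the number of queues.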

A big part of this is how NVMe® handles I/O processing. It has a submission queue (SQ) and a completion queue (CQ), and together they are known as a Queue Pair (QP). The NVMe® controller handles tens of thousands of I/Os (reads and writes) simultaneously, switching between SQs and CQs very quickly with the help of doorbell registers and MSI or MSI-X interrupts. Think of them as service bells: the host rings a doorbell register to inform the NVMe® controller that there are requests in the SQ, and the controller raises an MSI or MSI-X interrupt to inform the host that there are completed requests in the CQ. There will be plenty of “dings”, but the NVMe® controller handles them very well, with some smart interrupt coalescing.
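The queue pair flow can be sketched as a toy Python model. This is only an illustration of the idea, not how a real driver or controller is written: the class `QueuePair`, the functions `controller_process` and `host_reap`, and the `coalesce` threshold are all names I made up, and the "doorbell" and "interrupt" here are plain counters rather than hardware registers.

```python
from collections import deque


class QueuePair:
    """Toy model of one submission queue (SQ) and completion queue (CQ)."""

    def __init__(self, depth: int = 8):
        self.depth = depth
        self.sq = deque()        # host -> controller commands
        self.cq = deque()        # controller -> host completions
        self.sq_doorbell = 0     # commands the host has announced
        self.interrupts = 0      # interrupts raised towards the host

    def submit(self, command_id: int, opcode: str) -> None:
        """Host places a command in the SQ (not yet visible to the controller)."""
        if len(self.sq) >= self.depth:
            raise RuntimeError("submission queue full")
        self.sq.append((command_id, opcode))

    def ring_doorbell(self) -> None:
        """Host tells the controller how many commands are waiting."""
        self.sq_doorbell = len(self.sq)


def controller_process(qp: QueuePair, coalesce: int = 4) -> None:
    """Consume announced commands, post completions, raise coalesced interrupts."""
    pending = 0
    while qp.sq_doorbell > 0:
        command_id, _opcode = qp.sq.popleft()
        qp.sq_doorbell -= 1
        qp.cq.append((command_id, "SUCCESS"))
        pending += 1
        if pending == coalesce:
            qp.interrupts += 1   # one interrupt covers several completions
            pending = 0
    if pending:
        qp.interrupts += 1       # flush the remainder


def host_reap(qp: QueuePair) -> list:
    """Host drains the CQ after being interrupted."""
    done = list(qp.cq)
    qp.cq.clear()
    return done


if __name__ == "__main__":
    qp = QueuePair(depth=8)
    for i in range(6):
        qp.submit(i, "READ")
    qp.ring_doorbell()
    controller_process(qp, coalesce=4)
    print("completions:", host_reap(qp))
    print("interrupts raised:", qp.interrupts)
```

With six commands and a coalescing threshold of four, the host is interrupted twice instead of six times, which is the point of interrupt coalescing: fewer “dings” for the same amount of work.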

NVMe I/O processing

NVMe® 1.1, as I recall, had only 10 required admin commands and 3 required I/O commands, which made it very lightweight compared to SCSI-3. However, newer commands were added in the NVMe® 2.0 specifications, including command sets for key-value operations and zoned namespaces.

NVMe® – the milestones

Many things happening in the storage world today were concocted and devised years ago. The industry standards people (NVMe®, INCITS/T11, among many) already knew the impact solid state devices would have. Since its inception, we have seen numerous upgrades to the protocol, each bringing NVMe® closer to its objectives and beyond.

The development milestones of NVMe

The NVMe®oF™ networking siblings

NVMe®oF™ (Non-volatile Memory Express over Fabrics) is a natural progression of NVMe. There was already support for NVMe® over Ethernet (notably the iWARP and RoCE v2 flavours) early on. NVMe® over Fibre Channel™ was right alongside the Ethernet specifications and development. As of version 1.4, NVMe® over TCP was ratified as well.

The networking options of NVMe-over-Fabrics

Golden age of SAN to rise again

NVMe®oF™, specifically NVMe® over TCP, is gaining a lot of traction. Several large storage vendors have thrown their weight behind NVMe® over TCP, just as they did behind NVMe® over Fibre Channel™. This is SAN (Storage Area Network) once again, this time carrying the NVMe® payload instead of SCSI. This brings forth FC-SAN and IP-SAN in a completely new light.

Early days of NVMe Storage Area Network (SAN) using Linux as the initiator and the target

With disaggregated and composable storage technologies maturing quickly, and Infrastructure-as-Code with Kubernetes orchestrating all things containers and all things cloud native, think of NVMe® storage resources across any network, anywhere, on every device, all the time, on a mega, mega scale, with speed. Think of the likes of LinkedIn® or Twitter®, handling billions and billions of datasets on all their storage resources across the world, delivered with NVMe® at incredible scale and performance. NVMe® will enable all this, shown as a concept in the 2019 diagram below:

A proposed NVMe over Ethernet disaggregated deployment model (circa 2019)

This disaggregated NVMe®oF™ deployment model is almost here.

What is next? 

Over the course of the next 2-3 years, we will see NVMe completely dominate the storage media landscape. This is forecast in the IDC Worldwide Solid State Drive Forecast 2020-2024, Doc #US4590920, summarized in the chart below:

IDC Worldwide Solid State Drive Forecast 2020-2024, Doc #US4590920, December 2020

What makes NVMe so exciting is that it operates at the PCIe bus layer. Therefore, NVMe has the ability to merge and communicate bytes, pages and blocks at the CPU (and other burgeoning processors like DPUs, IPUs, xPUs, GPUs, VPUs, SmartNICs etc.) and memory complex in the same common dialect. Along with CXL 1.1/2.0, PCIe 5.0, and NVMe 2.0, I am seeing the possibility of a memory cloud, something I have been talking about for almost a decade. Memory, the last bastion for storage, is about to be amalgamated into the storage layer of computing.

This memory cloud concoction is almost ready. I cannot wait to see what the future of storage holds.

About cfheoh

I am a technology blogger with 25+ years of IT experience. I write heavily on technologies related to storage networking and data management because that is my area of interest and expertise. I introduce technologies with the objective of getting readers to *know the facts*, and to use that knowledge to cut through the marketing hype, FUD (fear, uncertainty and doubt) and other fancy stuff. Only then will there be progress. I am involved in SNIA (Storage Networking Industry Association) and as of October 2013, I have been appointed as the SNIA South Asia & SNIA Malaysia non-voting representative to the SNIA Technical Council. I currently run a small system integration and consulting company focusing on storage and cloud solutions, with occasional consulting work on high performance computing (HPC).
