Lately, I have been getting deeper and deeper into the low-level implementation details of storage technologies. In my previous blog entry, I wrote about my learning adventure with Priority Flow Control (PFC), and I intend to explore more Data Center Bridging concepts in future entries.
Before I left for Sydney for a holiday last week, I got sidetracked by the exciting stuff coming out of my daily encounters with friends, old and new. Two significant storage-related technologies fell into my lap. One is NVMe (Non-Volatile Memory Express) and the other is FPGA (Field Programmable Gate Array).
While this blog is going to be about NVMe, I actually found FPGA much more exciting. Through conversations, I found that there are 2 “biggies” in the FPGA world, with the chips designed and manufactured by Xilinx and Altera. I admit that I have not done my homework on FPGA yet, having just returned from Sydney last night. I will blog about FPGA in future entries.
But NVMe is an important technology direction for the storage world as well.
I think most of us are probably already mesmerized by solid state drives. The bombardment of marketing, presentations, advertising and whatever else the vendors do to promote (and self-promote) solid state drives is inundating the intellectual senses of consumers and enterprises alike. And yet, many vendors do not explain both the pros and cons of integrating solid state storage into an IT environment. Even worse, many don’t even know the strengths and weaknesses of solid state storage, and the resulting exaggeration feeds a spiral of inaccuracies. Like a self-feeding frenzy, the industry seems to have placed solid state storage as the saviour of the enterprise storage world. Go figure!
With all the promotion and hype around solid state storage, especially the drives that are connected to SATA controllers and backplanes, one of the weaknesses of solid state drives has surfaced. You see, we “push” performance bottlenecks around the entire infrastructure ecosystem. Sometimes it’s the CPU; sometimes it’s the memory. As technology improves and newer innovations replace older ones, the performance bottleneck is pushed to another segment of the infrastructure. And most of the time, storage is blamed for poor performance of the application.
That blame on storage is not without validity, because storage is a significant contributing factor in how well applications behave when issuing I/O to the storage infrastructure. I recall a funny (yet truthful) rule from the “Sun Performance and Tuning” book by Adrian Cockcroft and Richard Pettit.
Rule #1: Check I/O
Rule #2: Check I/O again
That less than pleasant rule reflects the state of the storage infrastructure most of the time. With the advent of solid state storage, we seem to be caught up in its promise. Yet, many do not seem to notice that SATA is the weakness in the solid state storage conversation. In the case of solid state drives using SATA interfaces, we have just found an “anvil on the head” moment.
(Picture courtesy of http://drawception.com/viewgame/RaGyxEBr1F/this-is-going-to-hurt/)
That’s a conversation killer right there. The SATA specifications and technology were designed for spinning mechanical hard disk drives.
For example, the SATA concept of Native Command Queuing (NCQ) was designed specifically for spinning drives as shown in the diagram below:
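It lets the hard drive controller decide the best order in which to retrieve data from the disk sectors, so that the head and platter travel less. I originally wrote that NCQ does not apply to SSDs, but as a reader points out in the comments below, SSDs can also benefit from NCQ because they can service queued commands across multiple flash chips in parallel. The mechanism itself, though, was designed with spinning drives in mind.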
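To make the reordering idea concrete, here is a toy Python sketch. It is not how any real drive firmware works: it assumes a hypothetical model in which the only cost is the distance between logical block addresses, and it ignores rotational position entirely. The function names and the sample addresses are made up for illustration.

```python
def total_head_travel(start, order):
    """Sum of address distances covered when servicing requests in this order."""
    travel = 0
    pos = start
    for lba in order:
        travel += abs(lba - pos)
        pos = lba
    return travel


def reorder_nearest_first(start, pending):
    """Greedy reordering: always service the pending request closest to the head.

    This is a simplification used for illustration only; it is an assumption,
    not the algorithm any particular drive implements.
    """
    remaining = list(pending)
    pos = start
    order = []
    while remaining:
        nxt = min(remaining, key=lambda lba: abs(lba - pos))
        remaining.remove(nxt)
        order.append(nxt)
        pos = nxt
    return order


# Hypothetical block addresses, in the order the host issued them.
arrival_order = [900, 100, 850, 150]
reordered = reorder_nearest_first(0, arrival_order)

print("arrival order travel:", total_head_travel(0, arrival_order))
print("reordered:", reordered, "travel:", total_head_travel(0, reordered))
```

In this toy model, servicing the requests in arrival order makes the head sweep back and forth, while the reordered sequence covers the same requests with far less travel. That saving is the whole point of letting the drive, rather than the host, pick the order.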
A second example is that SATA has only one command queue per AHCI (Advanced Host Controller Interface) channel, with a maximum of 32 commands per queue. And on most Intel-based x86 system boards, there is only one AHCI channel. NVMe, on the other hand, supports up to 64K queues, each able to hold up to 64K commands.
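A quick back-of-the-envelope calculation shows how large that gap is. The sketch below uses the rounded figures quoted above (1 queue of 32 commands versus 64K queues of 64K commands) as its assumptions. These are upper limits of the interfaces, not what a given device delivers, and real NVMe devices and drivers typically use far fewer queues.

```python
# Rounded interface limits as quoted in this post (assumptions, not measurements).
AHCI_QUEUES = 1
AHCI_DEPTH = 32
NVME_QUEUES = 64 * 1024
NVME_DEPTH = 64 * 1024


def max_outstanding(queues, depth):
    """Upper bound on commands that can be queued at once."""
    return queues * depth


ahci = max_outstanding(AHCI_QUEUES, AHCI_DEPTH)
nvme = max_outstanding(NVME_QUEUES, NVME_DEPTH)

print("AHCI max outstanding commands:", ahci)
print("NVMe max outstanding commands:", nvme)
print("Ratio (NVMe / AHCI):", nvme // ahci)
```

The absolute numbers matter less than the shape: with many independent queues, each CPU core can submit I/O on its own queue instead of contending for a single shared one.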
I found a whitepaper that describes the differences between NVMe and SATA/AHCI and the 2 diagrams below show the inadequacies of SATA/AHCI when compared to NVMe.
There’s still plenty more to come with NVMe, and I have not yet described it in much detail. I intend to learn and share more in the blog entries to come.
Likewise, SATA is evolving with SATAe (SATA Express), which aims to address the current SATA limitations.
It’s going to be exciting times as we move forward and hopefully, that performance bottleneck blame on storage will get pushed away to another part of the ecosystem.
Hmmmm…. Nah! It’s not going to happen!
I thought it should read : ‘most of the time, network is blamed for poor performance of the application.’ ^^
😀 To each its own.
All the best Jay!
NCQ is also useful for SSDs as they can utilize multiple flash chips in parallel. This obviously depends on the internal and temporal layout of data in the flash chips.
Hi Baruch
You are absolutely right. I am correcting that right now.
Cheers!
/Chin-Fah