Disaggregation and Composability vital for AI/DL models to scale

New generations of applications and workloads like AI/DL (Artificial Intelligence/Deep Learning), and HPC (High Performance Computing) are breaking the seams of entrenched storage infrastructure models and frameworks. We cannot continue to scale-up or scale-out the storage infrastructure to meet these inundating fluctuating I/O demands. It is time to look at another storage architecture type of infrastructure technology – Composable Infrastructure Architecture.

Infrastructure is changing. The previous staid infrastructure architecture parts of compute, network and storage have long been thrown of the window, precipitated by the rise of x86 server virtualization almost 20 years now. It triggered a tsunami of virtualizing everything, including storage virtualization, which eventually found a more current nomenclature – Software Defined Storage. Both storage virtualization and software defined storage (SDS) are similar and yet different and should be revered through different contexts and similar goals. This Tech Target article laid out both nicely.

As virtualization raged on, converged infrastructure (CI) which evolved into hyperconverged infrastructure (HCI) went fever pitch for a while. Companies like Maxta, Pivot3, Atlantis, are pretty much gone, with HPE® Simplivity and Cisco® Hyperflex occasionally blipped in my radar. In a market that matured very fast, HCI is now dominated by Nutanix™ and VMware®, with smaller Microsoft®, Dell EMC® following them.

From HCI, the attention of virtualization has shifted something more granular, more scalable in containerization. Despite a degree of complexity, containerization is taking agility and scalability to the next level. Kubernetes, Dockers are now mainstay nomenclature of infrastructure engineers and DevOps. So what is driving composable infrastructure? Have we reached the end of virtualization? Not really.

Evolution of infrastructure. Source: IDC

It is just that one part of the infrastructure landscape is changing. This new generation of AI/ML workloads are flipping the coin to the other side of virtualization. As we see the diagram above, IDC brought this mindset change to get us to Think Composability, the next phase of Infrastructure.

Continue reading

Project COSI

The S3 (Simple Storage Service) has become a de facto standard for accessing object storage. Many vendors claim 100% compatibility to S3, but from what I know, several file storage services integration and validation with the S3 have revealed otherwise. There are certain nuances that have derailed some of the more advanced integrations. I shall not reveal the ones that I know of, but let us use this thought as a basis of our discussion for Project COSI in this blog.

Project COSI high level architecture

What is Project COSI?

COSI stands for Container Object Storage Interface. It is still an alpha stage project in Kubernetes version 1.25 as of September 2022 whilst the latest version of Kubernetes today is version 1.26. To understand the objectives COSI, one must understand the journey and the challenges of persistent storage for containers and Kubernetes.

For me at least, there have been arduous arguments of provisioning a storage repository that keeps the data persistent (and permanent) after containers in a Kubernetes pod have stopped, or replicated to another cluster. And for now, many storage vendors in the industry have settled with the CSI (container storage interface) framework when it comes to data persistence using file-based and block-based storage. You can find a long list of CSI drivers here.

However, you would think that since object storage is the most native storage to containers and Kubernetes pods, there is already a consistent way to accessing object storage services. From the objectives set out by Project COSI, turns out that there isn’t a standard way to provision and accessing object storage as compared to the CSI framework for file-based and block-based storage. So the COSI objectives were set to:

  • Kubernetes Native – Use the Kubernetes API to provision, configure and manage buckets
  • Self Service – A clear delineation between administration and operations (DevOps) to enable self-service capability for DevOps personnel
  • Portability – Vendor neutrality enabled through portability across Kubernetes Clusters and across Object Storage vendors

Further details describing Project COSI can be found here at the Kubernetes site titled “Introducing COSI: Object Storage Management using Kubernetes API“.

Standardization equals technology adoption

Standardization means consistency, control, confidence. The higher the standardization across the storage and containerized apps industry, the higher the adoption of the technology. And given what I have heard from the industry over these few years, Kubernetes, to me, even till this day, is a platform and a framework that are filled and riddled with so many moving parts. Many of the components looks the same, feels the same, and sounds the same, but might not work out the same when deployed.

Therefore, the COSI standardization work is important and critical to grow this burgeoning segment, especially when we are rocketing towards disaggregation of computing service units, resources that be orchestrated to scale up or down at the execution of codes. Infrastructure-as-Code (IAC) is becoming a reality more and more with each passing day, and object storage is at the heart of this transformation for Kubernetes and containers.

Continue reading

I built a 6-node Gluster cluster with TrueNAS SCALE

I haven’t had hands-on with Gluster for over a decade. My last blog about Gluster was in 2011, right after I did a proof-of-concept for the now defunct, Jaring, Malaysia’s first ISP (Internet Service Provider). But I followed Gluster’s development on and off, until I found out that Gluster was a feature in then upcoming TrueNAS® SCALE. That was almost 2 years ago, just before I accepted to offer to join iXsystems™, my present employer.

The eagerness to test drive Gluster (again) on TrueNAS® SCALE has always been there but I waited for SCALE to become GA. GA finally came on February 22, 2022. My plans for the test rig was laid out, and in the past few weeks, I have been diligently re-learning and putting up the scope to built a 6-node Gluster clustered storage with TrueNAS® SCALE VMs on Virtualbox®.

Gluster on OpenZFS with TrueNAS SCALE

Before we continue, I must warn that this is not pretty. I have limited computing resources in my homelab, but Gluster worked beautifully once I ironed out the inefficiencies. Secondly, this is not a performance test as well, for obvious reasons. So, this is the annals along with the trials and tribulations of my 6-node Gluster cluster test rig on TrueNAS® SCALE.

Continue reading

Computational Storage embodies Data Velocity and Locality

I have been earnestly observing the growth of Computational Storage for a number of years now.  It was known by several previous names, with the name “in-situ data processing” stuck with me the most. The Computational Storage nomenclature became more cohesive when SNIA® put together the CMSI (Compute Memory Storage Initiative) some time back. This initiative is where several standards bodies, the major technology players and several SIGs (special interest groups) in SNIA® collaborated to advance Computational Storage segment in the storage technology industry we know of today.

The use cases for Computational Storage are burgeoning, and the functional implementations of Computational Storage are becoming vital to tackle the explosive data tsunami. In 2018 IDC, in its Worldwide Global Datasphere Forecast 2021-2025 report, predicted that the world will have 175 ZB (zettabytes) of data. That number, according to hearsay, has been revised to a heady figure of 250ZB, given the superlative rate data is being originated, spawned and more.

Computational Storage driving factors

If we take the Computer Science definition of in-situ processing, Computational Storage can be distilled as processing data where it resides. In a nutshell, “Bring Compute closer to Storage“. This means that there is a processing unit within the storage subsystem which does not require the host CPU to perform processing. In a very simplistic manner, a RAID card in a storage array can be considered a Computational Storage device because it performs the RAID functions instead of the host CPU. But this new generation of Computational Storage has much more prowess than just the RAID function in a RAID card.

There are many factors in Computational Storage that make a lot sense. Here are a few:

  1. Voluminous data inundate the centralized architecture of the cloud platforms and the enterprise systems today. Much of the data come from end point devices – mobile devices, sensors, IoT, point-of-sales, video cameras, et.al. Pre-processing the data at the origin data points can help filter the data, reduce the size to be processed centrally, and secure the data before they are ingested into the central data processing systems
  2. Real-time processing of the data at the moment the data is received gives the opportunity to create the Velocity of Data Analytics. Much of the data do not need to move to a central data processing system for analysis. Often in use cases like autonomous vehicles, fraud detection, recommendation systems, disaster alerts etc require near instantaneous responses. Performing early data analytics at the data origin point has tremendous advantages.
  3. Moore’s Law is waning. The CPU (central processing unit) is no longer the center of the universe. We are beginning to see CPU offloading technologies to augment the CPU’s duties such as compression, encryption, transcoding and more. SmartNICs, DPUs (data processing units), VPUs (visual processing units), GPUs (graphics processing units), etc have come forth to formulate a new computing paradigm.
  4. Freeing up central resources with Computational Storage also accelerates the overall distributed data processing in the whole data architecture. The CPU and the adjoining memory subsystem are less required to perform context switching caused by I/O interrupts as in most of the compute/storage architecture today. The total effect relieves the CPU and giving back more CPU cycles to perform higher processing tasks, resulting in faster performance overall.
  5. The rise of memory interconnects is enabling a more distributed computing fabric of data processing subsystems. The rising CXL (Compute Express Link™) interconnect protocol, especially after the Gen-Z annex, has emerged a force to be reckoned with. This rise of memory interconnects will likely strengthen the testimony of Computational Storage in the fast approaching future.

Computational Storage Deployment Models

SNIA Computational Storage Universe in 2019

Continue reading

Time to Conflate Storage with Data Services

Around the year 2016, I started to put together a better structure to explain storage infrastructure. I started using the word Data Services Platform before what it is today. And I formed a pictorial scaffold to depict what I wanted to share. This was what I made at that time.

Data Services Platform (circa 2016)- Copyright Heoh Chin Fah

One of the reasons I am bringing this up again is many of the end users and resellers still look at storage from the perspective of capacity, performance and price. And as if two plus two equals five, many storage pre-sales and architects reciprocate with the same type of responses that led to the deteriorated views of the storage technology infrastructure industry as a whole. This situation irks me. A lot.

Continue reading

How well do you know your data and the storage platform that processes the data

Last week was consumed by many conversations on this topic. I was quite jaded, really. Unfortunately many still take a very simplistic view of all the storage technology, or should I say over-marketing of the storage technology. So much so that the end users make incredible assumptions of the benefits of a storage array or software defined storage platform or even cloud storage. And too often caveats of turning on a feature and tuning a configuration to the max are discarded or neglected. Regards for good storage and data management best practices? What’s that?

I share some of my thoughts handling conversations like these and try to set the right expectations rather than overhype a feature or a function in the data storage services.

Complex data networks and the storage services that serve it

I/O Characteristics

Applications and workloads (A&W) read and write from the data storage services platforms. These could be local DAS (direct access storage), network storage arrays in SAN and NAS, and now objects, or from cloud storage services. Regardless of structured or unstructured data, different A&Ws have different behavioural I/O patterns in accessing data from storage. Therefore storage has to be configured at best to match these patterns, so that it can perform optimally for these A&Ws. Without going into deep details, here are a few to think about:

  • Random and Sequential patterns
  • Block sizes of these A&Ws ranging from typically 4K to 1024K.
  • Causal effects of synchronous and asynchronous I/Os to and from the storage

Continue reading

The burgeoning world of NVMe

When I wrote this article “Let’s smoke this storage peace pipe” 5 years ago, I quoted:

NVMe® and NVM®eF‰, as it evolves, can become the Great Peacemaker and bringing both divides and uniting them into a single storage fabric.

I envisioned NVMe® and NVMe®oF™ setting the equilibrium at the storage architecture level, finishing the great storage fabric into one. This balance in the storage ecosystem at the storage interface specifications and language-protocol level has rapidly unifying storage today, and we are already seeing the end-to-end NVMe paths directly from the PCIe bus of one host to another, via networks over Ethernet (with RoCE, iWARP, and TCP flavours) and Fibre Channel™. Technically we can have an end point device, example a tablet, talking the same NVMe language to its embedded storage as well as a cloud NVMe storage in an exascale storage far, far away. In the past, there were just too many bridges, links, viaducts, aqueducts, bypasses, tunnels, flyovers to cross just to deliver a storage command, or a data in a formats, encased and encoded (and decoded) in so many different ways.

Colours in equilibrium, like the rainbow

Simple basics of NVMe®

SATA (Serial Attached ATA) and SAS (Serial Attached SCSI) are not optimized for solid state devices. besides legacy stuff like AHCI (Advanced Host Controller Interface) in SATA, and archaic SCSI-3 primitives in SAS, NVM® has so much to offer. It can achieve very high bandwidth and support 65,535 I/O queues, each with a queue depth of 65,535. The queue depth alone is a massive jump compared to SAS which has a queue depth limit of 256.

A big part of this is how NVMe® handles I/O processing. It has a submission queue (SQ) and a completion queue (CQ), and together they are know as a Queue Pair (QP). The NVMe® controller handles tens of thousands at I/Os (reads and writes) simultaneously, alerted to switch between each SQ and CQ very quickly using the MSI or MSI-X interrupt. Think of MSI and MSI-X as a service bell, a hardware register that informs the NVM® controller when there are requests in the SQ, and informs the hosts that there are completed requests in the CQ. There will be plenty of “dings” by the MSI-X service register but the NVMe® controller can perform it very well, with some smart interrupt coalescing.

NVMe I/O processing

NVMe® 1.1, as I recalled, used to be have 3 admin commands and 10 base commands, which made it very lightweight compared to SCSI-3. However, newer commands were added to NVMe® 2.0 specifications included command sets fo key-value operations and zoned named space.

Continue reading

Storage Elephant Compute Birds

Data movement is expensive. Not just costs, but also latency and resources as well. Thus there were many narratives to move compute closer to where the data is stored because moving compute is definitely more economical than moving data. I borrowed the analogy of the 2 animals from some old NetApp® slides which depicted storage as the elephant, and compute as birds. It was the perfect analogy, because the storage is heavy and compute is light.

“Close up of a white Great Egret perching on top of an African Elephant aa Amboseli national park, Kenya”

Before the animals representation came about I used to use the term “Data locality, Data Mobility“, because of past work on storage technology in the Oil & Gas subsurface data management pipeline.

Take stock of your data movement

I had recent conversations with an end user who has been paying a lot of dollars keeping their “backup” and “archive” in AWS Glacier. The S3 storage is cheap enough to hold several petabytes of data for years, because the IT folks said that the data in AWS Glacier are for “backup” and “archive”. I put both words in quotes because they were termed as “backup” and “archive” because of their enterprise practice. However, the face of their business is changing. They are in manufacturing, oil and gas downstream, and the definitions of “backup” and “archive” data has changed.

For one, there is a strong demand for reusing the past data for various reasons and these datasets have to be recalled from their cloud storage. Secondly, their data movement activities still mimicked what they did in the past during their enterprise storage days. It was a classic lift-and-shift when they moved to the cloud, and not taking stock of  their data movements and the operations they ran on these datasets. Still ongoing, their monthly AWS cost a bomb.

Continue reading

Storage IO straight to GPU

The parallel processing power of the GPU (Graphics Processing Unit) cannot be denied. One year ago, nVidia® overtook Intel® in market capitalization. And today, they have doubled their market cap lead over Intel®,  [as of July 2, 2021] USD$510.53 billion vs USD$229.19 billion.

Thus it is not surprising that storage architectures are changing from the CPU-centric paradigm to take advantage of the burgeoning prowess of the GPU. And 2 announcements in the storage news in recent weeks have caught my attention – Windows 11 DirectStorage API and nVidia® Magnum IO GPUDirect® Storage.

nVidia GPU

Exciting the gamers

The Windows DirectStorage API feature is only available in Windows 11. It was announced as part of the Xbox® Velocity Architecture last year to take advantage of the high I/O capability of modern day NVMe SSDs. DirectStorage-enabled applications and games have several technologies such as D3D Direct3D decompression/compression algorithm designed for the GPU, and SFS Sampler Feedback Streaming that uses the previous rendered frame results to decide which higher resolution texture frames to be loaded into memory of the GPU and rendered for the real-time gaming experience.

Continue reading