Vast Data – Storage Gaga

Intelligent Data Movement and Data Placement dictate the future of AI Data Infrastructure

By cfheoh | July 29, 2025 - 7:48 am |July 29, 2025 100Gigabit Ethernet, Algorithm, Analytics, Artificial Intelligence, BeeGFS, Big Data, Big Switch Networks, Broadcom, compression, Computational Storage, Containers, CXL, Data Direct Networks, Data Management, DDN, Filesystems, Flash, Hammerspace, High Performance Computing, Machine Learning, NFS, nVidia, NVMe, Parallel NFS, Performance Benchmark, Performance Caching, pNFS, Scale-out architecture, Software Defined Storage, Storage Optimization, Storage Tiering, Vast Data, WekaIO

Over the weekend I read a couple of articles that both started by blaming outdated networking infrastructure for slowing down AI ambitions. The 2 articles are:

The AI Infrastructure bottleneck no one talks about (which turned out to be a not-so-subtle play for Netris, a secure multi-tenant network provisioner technology).
Data Infrastructure: The missing link in successful AI adoption (a more subtle introduction of Indicium, an AI data services company).

I do not fully agree that networking infrastructure is the main inhibitor of AI ambitions per se. Not from my experience, nor from what I know so far of the present developments in high performance networking. In fact, AI networking infrastructure has been growing by leaps and bounds, laying down ultra-high throughput plumbing between the GPUs (and, by extension, up the stack to the AI models and applications) and the data storage infrastructure.

The NVIDIA-heavy GPU compute infrastructure is, of course, dominated by NVIDIA’s own networking infrastructure. NVIDIA Spectrum (Ethernet), Quantum (InfiniBand), BlueField (data processing units), ConnectX and LinkX are the mainstays of DGX Cloud, and a big part of NVIDIA NCPs (NVIDIA Cloud Partners) as well.

In fact, at one of DDN’s NCP customers, I have seen a 10-node DDN EXAscaler cluster deliver almost 1.1TB/sec read and 750GB/sec write throughput to the GPU compute cluster, out-of-the-box, all with 200Gbps networking gear.
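To put those figures in perspective, here is a rough back-of-envelope sketch of what they imply per storage node. It is only an illustration under my own assumptions: the load is spread evenly across the 10 nodes, units are decimal (1TB = 1000GB), and protocol overhead is ignored. The function names are hypothetical, and the actual number of network ports per node at that site is not stated above.

```python
import math


def per_node_gbps(aggregate_gb_per_sec: float, nodes: int) -> float:
    """Per-node throughput in gigabits/sec, assuming an even spread across nodes."""
    return aggregate_gb_per_sec / nodes * 8  # 8 bits per byte


def min_links_per_node(aggregate_gb_per_sec: float, nodes: int, link_gbps: float) -> int:
    """Lower bound on links per node needed to carry the per-node throughput.

    Ignores protocol overhead, so a real deployment would need headroom on top.
    """
    return math.ceil(per_node_gbps(aggregate_gb_per_sec, nodes) / link_gbps)


if __name__ == "__main__":
    NODES = 10        # from the cluster described above
    LINK_GBPS = 200   # 200Gbps networking gear
    for label, gb_per_sec in (("read", 1100), ("write", 750)):
        print(
            f"{label}: {per_node_gbps(gb_per_sec, NODES):.0f} Gbps per node, "
            f"at least {min_links_per_node(gb_per_sec, NODES, LINK_GBPS)} x {LINK_GBPS}Gbps links"
        )
```

Under these assumptions, each node would have to push roughly 880Gbps of reads, which already needs at least five 200Gbps links per node. That is a lot of plumbing, and it is available today.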

Continue reading →

Rethinking Storage OKRs for AI Data Infrastructure – Part 1

By cfheoh | January 6, 2025 - 8:00 am |January 13, 2025 Algorithm, Analytics, API, Appliance, Artificial Intelligence, Big Data, Cloud, Clusters, Composable Infrastructure, Containers, Data, Data Direct Networks, Data Management, DDN, Deep Learning, Dell, Digital Transformation, Filesystems, HDS, High Performance Computing, HPE, Infiniband, Kubernetes, Linux, Machine Learning, Mellanox Technologies, NetApp, nVidia, NVMe, Object Storage, Parallel NFS, Performance Benchmark, Pure Storage, Reliability, Scale-out architecture, Software Defined Storage, Storage Optimization, Vast Data, WekaIO

[ Preamble: This analysis focuses on my own journey as I incorporate my past experiences into this new market segment called AI Data Infrastructure, and gain new ones.

There are many elements of HPC (High Performance Computing) at play here. Even though speeds and feeds, features and functions crowd many conversations, as many enterprise storage vendors like them to, these conversations are, in my opinion, secondary. There are more vital operational and technical elements that an organization has to consider prudently, vis-a-vis its ROI (return on investment). They involve asking the hard questions beyond the marketing hype and fluff. I call these elements of consideration Storage Objectives and Key Results (OKRs) for AI Data Infrastructure.

I had to break this blog into 2 parts. It has become TL;DR-ish. This is Part 1 ]

I have just passed my 6-month anniversary with DDN. Coming into the High Performance Storage System (HPSS) market segment, with its strong focus on the distributed parallel filesystem Lustre®, there was a steep learning curve for me. I spent over 3 decades in Enterprise Storage, working with some of the highest-end storage technologies in that market segment. And I had already developed my own approach to enterprise storage, based on the A.P.P.A.R.M.S.C., which I first developed 25 years ago and have honed since.

The rapid adoption of Artificial Intelligence (AI) has created a technology paradigm shift and blurred many lines. It has also shifted my own thoughts, opinions and experiences when it comes to storage for AI.

AI has brought HPSS technologies like Lustre® in the DDN EXAscaler platform, proven in the Supercomputing world, to a new realm – the AI Data Infrastructure market segment. On the other side, many enterprise storage vendors aspire to be suppliers to the AI Data Infrastructure opportunities as well. The result is a convergence of the top storage performers for Supercomputing, the likes of DDN, IBM® (through Storage Scale) and HPE® (through Cray, which by the way often uses the open-source Lustre® edition in its storage portfolio), of the software-defined storage players such as Weka IO, Vast Data and MinIO, and of the enterprise storage array vendors such as NetApp®, Pure Storage®, and Dell®.

[ Note that I take care not to name every storage vendor for AI because many either OEM, or repackage and rebrand, SDS technology into their gear, such as HPE® GreenLake for Files and Hitachi® IQ. You can Google to find out who the original vendors are for each respectively. There are others as well. ]

In these 3 simplified categories (HPSS, SDS, Enterprise Storage Array), I have begun to see a pattern of each calling its technology an “AI Data Infrastructure”. At the same time, I am also developing a new set of storage conversations for the AI Data Infrastructure market segment, one that is based on OKRs (Objectives and Key Results) rather than just the features, features and more features that many SDS and enterprise storage vendors like to tout. Here are a few things end users should look for when considering a high-speed storage solution for their AI journey.

AI Data Infrastructure

GPU is king

In the AI world, the GPU infrastructure is the deity at the altar. The utilization rate of the GPUs must be kept as high as possible to get the maximum return-on-investment (ROI) from the compute infrastructure. Keeping the GPUs resolutely busy is a must. HPSS is very much part of that ecosystem.

These are a few OKRs I would consider for the storage or data infrastructure for AI.

Reliability
Speed
Power Efficiency
Security

Let’s look at each one of them from the point of view of a storage practitioner like me.

Continue reading →

Storage IO straight to GPU

By cfheoh | July 5, 2021 - 9:00 am |July 3, 2021 100Gigabit Ethernet, Algorithm, Analytics, API, Artificial Intelligence, Composable Infrastructure, compression, CXL, Deduplication, Deep Learning, Filesystems, High Performance Computing, Hyperconvergence, Machine Learning, Mellanox, Mellanox Technologies, Microsoft, nVidia, NVMe, RDMA, Vast Data, WekaIO

Exciting the gamers

The Windows DirectStorage API feature is only available in Windows 11. It was announced as part of the Xbox® Velocity Architecture last year to take advantage of the high I/O capability of modern day NVMe SSDs. DirectStorage-enabled applications and games draw on several technologies, such as a Direct3D (D3D) decompression/compression algorithm designed for the GPU, and Sampler Feedback Streaming (SFS), which uses the results of the previously rendered frame to decide which higher resolution textures should be loaded into GPU memory and rendered for the real-time gaming experience.

Continue reading →

Is Software Defined right for Storage?

By cfheoh | April 19, 2021 - 9:00 am |April 18, 2021 Acquisition, API, Appliance, ATA over Ethernet, Ceph, Composable Infrastructure, Coraid, Cray Inc, Data Direct Networks, Deduplication, deduplication, Filesystems, FreeNAS, Gluster, HDS, High Performance Computing, HPE Simplivity, Hyperconvergence, Infiniband, Intel, iXsystems, Liqid, Mellanox Technologies, Minio, NetApp, Nexenta, Nutanix, nVidia, OpenIO, Openstack, Oracle, PCIe, Performance Benchmark, Performance Caching, Pure Storage, RAID, RDMA, Redhat, Simplivity, SNIA, SoftIron, Software Defined Storage, Solaris, Solid State Devices, Storage Optimization, TrueNAS, Vast Data, Virtualization

The prudence needed for storage technology companies

By cfheoh | August 3, 2020 - 9:15 am |August 2, 2020 Acquisition, Actifio, Amazon Web Services, Apple, Blitzscaling, Cloud, Cloudian, Clumio, Cohesity, Data Direct Networks, Datrium, Digital Transformation, Druva, Falconstor, Forrester, Gartner, Google, Huawei, Hyperconvergence, Kaminario, Nasuni, Nutanix, Pure Storage, Qumulo, Rubrik, SolidFire, Tegile, Tintri, Vast Data, Veeam, Veritas, Violin Memory, VMware, WekaIO

Blitzscaling has been on my mind a lot. Ever since I discovered that word a while back, it has returned time and time again to fill my thoughts. In the wake of COVID-19, and in the mire of this devastating pandemic, is blitzscaling still the right strategy for this generation of storage technology, hyperconverged, data management and cloud storage startups?

What the heck is Blitzscaling?

For the uninformed, here’s a video of Reid Hoffman, co-founder of LinkedIn and a member of the PayPal Mafia, explaining Blitzscaling.

Blitzscaling is about hyper growing, scaling ultra fast and rocketing to escape velocity, at the expense of things like management efficiency, financial prudence, profits and others. While this blog focuses on storage companies, blitzscaling is probably most recognizable in the massive expansion (and contraction) of Uber a few years ago. In the US, the ride hailing war is between Uber and Lyft, but over here in South East Asia, just a few years back, it was between Uber and Grab. In China it was Uber and Didi.

From the storage angle, 2 segments exemplified the blitzscaling culture between 2015 and 2020.

All Flash Startups
Hyper Converged Infrastructure Startups

Continue reading →

Storage Performance Considerations for AI Data Paths

By cfheoh | June 17, 2019 - 10:50 am |June 17, 2019 100Gigabit Ethernet, Algorithm, Analytics, API, Artificial Intelligence, Big Data, Cloud, Composable Infrastructure, Data, Data Fabric, Data Management, Data Privacy, Data Security, Digital Transformation, Drivescale, E8 Storage, Edge Computing, Elastifile, Excelero, Filesystems, High Performance Computing, Hyperconvergence, Industry 4.0, Infiniband, Intel, Liqid, Lustre, Machine Learning, Mellanox Technologies, NVMe, Object Storage, Performance Benchmark, Performance Caching, Quantum Corporation, RDMA, Software-defined Datacenter, Storage Optimization, Storage Tiering, ThinkParq, Vast Data, Virtualization, WekaIO

The hype of Deep Learning (DL), Machine Learning (ML) and Artificial Intelligence (AI) has reached an unprecedented frenzy. Every infrastructure vendor, from servers to networking to storage, has something to say, or a play to make, about DL/ML/AI. This prompted me to explore this hyped ecosystem from a storage perspective, notably from a storage performance requirement point-of-view.

One question on my mind

There are plenty of questions on my mind. One stood out and that is related to storage performance requirements.

Reading and learning from one storage technology vendor to another, the context of everyone’s play against their competitors seems to be “They are archaic, they are legacy. Our architecture is built from the ground up, modern, NVMe-enabled”. There are more such juxtapositions, but you get the picture – “We are better, no doubt”.

Are the data patterns and behaviours of AI different? How do they affect the storage design as the data moves through the workflow, the data paths and the lifecycle of the AI ecosystem?

Continue reading →

VAST Data must be something special

By cfheoh | February 28, 2019 - 9:59 pm |March 1, 2019 Analytics, Artificial Intelligence, Big Data, CIFS, Cloud, Clusters, Composable Infrastructure, Data, Data Fabric, Data Management, Data Protection, Deduplication, Edge Computing, Filesystems, High Performance Computing, Infiniband, Machine Learning, NAS, NFS, NVMe, Object Storage, Scale-out architecture, Software Defined Storage, Tech Field Day, Vast Data, XtremIO

[Preamble: I have been invited by GestaltIT as a delegate to their Tech Field Day for Storage Field Day 18 from Feb 27-Mar 1, 2019 in Silicon Valley, USA. My expenses, travel and accommodation were covered by GestaltIT, the organizer, and I was not obligated to blog about or promote the technologies presented at this event. The content of this blog reflects my own opinions and views.]

Vast Data coming out bash!

The delegates of Storage Field Days have always been the lucky bunch. We have witnessed several storage technology companies coming out of stealth at these Tech Field Days. The recent ones in memory for me were Excelero and Hammerspace. But to have one where the venerable storage doyen, Mr. Howard Marks, Vast Data’s new tech evangelist, introduced the deep dive into Vast Data technology was something special.

For those who know Howard, he is fiercely independent, very smart about storage technology, opinionated and not easily impressed. As a storage technology connoisseur myself, I believe Howard must have seen something special in Vast Data. They must be doing something so unique and impressive that someone like Howard could not resist it, and it made him jump to the vendor side. This sets the tone of my blog.
