Flash Archives - Storage Gaga

Accelerated Data Paths of High Performance Storage is the Cornerstone of building AI

By cfheoh | September 16, 2024 - 7:30 am |September 16, 2024 100Gigabit Ethernet, Algorithm, Analytics, API, Appliance, Artificial Intelligence, Big Data, Cloud, Clusters, Containers, DDN, Deep Learning, Fibre Channel, Filesystems, Flash, High Performance Computing, Infiniband, Lustre, Machine Learning, Mellanox, Mellanox Technologies, Nutanix, NVMe, Parallel NFS, Performance Benchmark, Performance Caching, pNFS, RDMA, Scale-out architecture, Storage Optimization


I am 2 months into my new role at DDN as a Solutions Architect. With many revolving doors around me, I have been trying to find the essence, the critical cog of the data infrastructure that supports the accelerated computing of the Nvidia GPU clusters. The more I read and engaged, the more a pattern emerged. I found that cog in the supercharged data paths between the storage infrastructure systems and the GPU clusters. I will share more.

To set the context, let me start with a wonderful article I read on CIO.com back in July 2024. It was titled “Storage: The unsung hero of AI deployments”. It was music to my ears because, as a long-time practitioner in the storage technology industry, I think it is time the storage industry gets the credit it deserves.

What is the data path?

To put it simply, a Data Path, from a storage context, is the communication route taken by the data bits between the compute system’s processing and program memory and the storage subsystem. The links and the established sessions can be within the system components such as the PCIe bus or external to the system through the shared networking infrastructure.

High speed accelerated data paths

In the world of accelerated computing such as AI and HPC, there are additional, more advanced technologies that deliver the data bits even faster. These are the accelerated data paths between the compute nodes and the storage subsystems. Following on, I share a few of these technologies that are less commonly used in the enterprise storage segment.


The All-Important Storage Appliance Mindset for HPC and AI projects

By cfheoh | July 22, 2024 - 7:30 am |November 17, 2024 API, Appliance, Artificial Intelligence, BeeGFS, Big Data, Cloud, Clusters, Data Direct Networks, DDN, Deep Learning, Digital Transformation, Elastifile, Filesystems, Flash, High Performance Computing, Infiniband, iXsystems, Lustre, Machine Learning, MapReduce, NAS, NetApp, NFS, nVidia, NVMe, Object Storage, Parallel NFS, Performance Benchmark, Performance Caching, RDMA, Scale-out architecture, Software Defined Storage, Solid State Devices, Storage Optimization, ThinkParq, WekaIO


I am a strong believer in using the right tool to do the job right. I said this 2 years ago, in my blog “Stating the case for a Storage Appliance approach”. It was written when I was working for an open source storage company. And I am an advocate of the crafter versus assembler mindset, especially in the enterprise and high-performance storage technology segments.

I have joined DDN. Even at DDN, that mindset has not changed a bit. I have been saying all along that the storage appliance model should always be the mindset for the businesses’ peace of mind.

My view of the storage appliance model began almost 25 years ago. I came into the NAS world via Sun Microsystems®. Sun was famous for running NFS servers on general-purpose Sun Solaris servers: NFS services on Unix systems. Back then, I remember arguing with one of the Sun distributors about the tenets of running NFS over 100Mbit/sec Ethernet on Sun servers. I was drinking Sun’s Kool-Aid big time.

When I joined Network Appliance® (now NetApp®) in 2000, my worldview of putting software on general purpose servers changed. Network Appliance® had one product family, the FAS700 (720, 740, 760) family. All NetApp® did in the beginning was serve NFS. They were the NAS filers and nothing else.

I was completely sold on the appliance way with NetApp®. Firstly, it was the very first time I learned that such network storage services could be provisioned with an appliance concept. This was different from Sun. I was used to managing NFS exports on a Sun SPARCstation 20 to Unix clients in the network.

Secondly, my mindset began to take shape: “you have to have the right tool to do the job correctly and extremely well”. Well, the toaster toasts bread very well and does nothing else. And the fridge (an analogy used by Dave Hitz, I think) does what it does very well too. That is what the appliance does. You definitely cannot grill a steak with a bread toaster, just like you can’t run excellent, ultra-high performance storage services for demanding AI and HPC applications on a general server platform. You have to have a storage appliance solution for High-Speed Storage.

That little Network Appliance® toaster award given out to exemplary employees still stands vividly in my mind. The NetApp® tagline back then was “Fast, Simple, Reliable”. That solidified my mindset for high-speed storage in AI and HPC projects in present times.

DDN AI400X2 Turbo Appliance

Costs, Benefits and Risks

I like to think about what the end users are thinking about. There are investment costs involved and, along with them, risks to the investment as well as benefits. Let’s simplify and lump them into a Cost-Benefit-Risk analysis triangle. These variables come into play in the decision making of AI and HPC projects.


As Disk Drive capacity gets larger (and larger), the resilient Filesystem matters

By cfheoh | May 30, 2022 - 8:00 am |May 30, 2022 Appliance, Backup, Ceph, CIFS, Clusters, Data Management, Data Protection, Disks, Filesystems, Flash, FreeNAS, Gluster, Hadoop, High Performance Computing, iXsystems, Joyent, Linux, Lustre, NetApp, Nexenta, NFS, OpenZFS, Oracle, RAID, Reliability, SMB, Snapshots, TrueNAS, Virtualization


I just got home from the wonderful iXsystems™ Sales Summit in Knoxville, Tennessee. The key highlight was to christen the opening of the iXsystems™ Maryville facility, the key operations center that will house iX engineering, support and part of marketing as well. News of this can be found here.

iX datacenter in the new Maryville facility

Western Digital® has always been a big advocate of iX, and at the Summit, they shared their roadmaps for hard disk drives (HDDs), solid state drives (SSDs) and other storage platforms. I felt like a kid in a candy store because I love all the excitement in the disk drive industry. Who says HDDs are going to be usurped by SSDs?

Several disk drive manufacturers, including Western Digital®, have announced larger capacity drives. Here is some news from each vendor in recent months.

Other than the AFR (annualized failure rates) numbers published by Backblaze every quarter, the Capacity factor has always been a measurement of high interest in the storage industry.
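For readers less familiar with the AFR figure, here is a minimal sketch of how an annualized failure rate is commonly worked out from drive-days of service and the number of failures, which is the approach Backblaze describes in its drive stats reports. The function name and the fleet numbers below are made up purely for illustration.

```python
def annualized_failure_rate(failures: int, drive_days: int) -> float:
    """Return AFR as a percentage: failures per drive-year of service."""
    if drive_days <= 0:
        raise ValueError("drive_days must be positive")
    drive_years = drive_days / 365
    return failures / drive_years * 100


# Hypothetical fleet: 1,000 drives running for a 90-day quarter, 3 failures.
afr = annualized_failure_rate(failures=3, drive_days=1000 * 90)
print(f"AFR: {afr:.2f}%")
```

Counting drive-days rather than drives keeps the figure fair when drives are added or retired partway through the quarter.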


Do we still need FAST (and its cohorts)?

By cfheoh | February 1, 2021 - 9:00 am |January 22, 2021 Algorithm, Analytics, Dell, DellEMC, Disks, EMC, Filesystems, Flash, HP, HPE, IBM, ILM, Intel, iXsystems, RAID, Solid State Devices, Storage Optimization, Storage Tiering


In a recent conversation with an iXsystems™ reseller in Hong Kong, the topic of Storage Tiering was brought up. We went about our banter and I brought up inter-array tiering and intra-array tiering.

After that conversation, I started thinking a lot about intra-array tiering, where data blocks within the storage array are moved between fast and slow storage media. The general policy was simple. Find all the least frequently accessed blocks and move them from a fast tier, like the SSD tier, to a slower tier, like the spinning drives with different RPM speeds. And then promote the data blocks back to the faster media when they are accessed frequently. Of course, there were other variables in the mix besides storage media and speeds.
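The general policy described above can be sketched in a few lines. This is only a toy illustration under my own assumptions: the `Block` class, the tier names and the two thresholds are hypothetical, and it is not how Compellent Data Progression or 3PAR Adaptive Optimization actually work.

```python
from dataclasses import dataclass


@dataclass
class Block:
    block_id: int
    tier: str          # "fast" (e.g. SSD) or "slow" (e.g. spinning disk)
    access_count: int  # accesses seen in the current observation window


def retier(blocks, demote_below=2, promote_at=10):
    """Demote cold blocks off the fast tier and promote hot blocks onto it.

    Returns a list of (block_id, old_tier, new_tier) moves and resets the
    access counters so the next observation window starts fresh.
    """
    moves = []
    for b in blocks:
        if b.tier == "fast" and b.access_count < demote_below:
            moves.append((b.block_id, "fast", "slow"))
            b.tier = "slow"
        elif b.tier == "slow" and b.access_count >= promote_at:
            moves.append((b.block_id, "slow", "fast"))
            b.tier = "fast"
        b.access_count = 0
    return moves


blocks = [
    Block(1, "fast", 0),   # cold block on SSD: demoted
    Block(2, "fast", 25),  # hot block on SSD: stays
    Block(3, "slow", 12),  # hot block on disk: promoted
    Block(4, "slow", 1),   # cold block on disk: stays
]
print(retier(blocks))  # [(1, 'fast', 'slow'), (3, 'slow', 'fast')]
```

Using separate demote and promote thresholds leaves a gap between them, so a block with middling activity is not bounced back and forth between tiers on every pass.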

My mind raced back 10 years or more to my first encounter with Compellent and 3PAR. Both were still independent companies then, and I had my first taste of intra-array tiering.

The original Compellent and 3PAR logos

I cannot recall which encounter came first, but I remember that the two events were close in time. I was at Impact Business Solutions, in their office, listening to their Compellent pitch. The Kuching boys (thank you Chyr and Winston!) were very passionate in evangelizing the Compellent Data Progression technology.

At about the same time, I was invited by Ken Chua, PTC Singapore’s GM at the time, to grace their new Malaysian office and listen to a pitch on their latest storage vendor partnership, 3PAR. I had known Ken since my NetApp® days, and he linked me up with Nathan Boeger, 3PAR’s pre-sales consultant. 3PAR had their Adaptive Optimization (AO) disk tiering and Dynamic Optimization (DO) technology.


Trusting your storage – It’s not about performance

By cfheoh | December 28, 2020 - 9:00 am |December 26, 2020 Acquisition, Appliance, Backup, Business Continuity, Cloud, Data, Data Corruption, Data Protection, Elastifile, EMC, Filesystems, Flash, NetApp, OpenZFS, RAID


I have taken some downtime from my blog since late October. Part of my “hiatus” was due to an illness that affected my right kidney, but I am happy to announce that I am well again. During this period, I spent a lot of time reading the loads of storage technology announcements and their marketing calls, and almost every single one of them touts Performance as if it is the single “sellable” feature of the respective storage vendor. None ever positions data integrity, and the technology behind it, as what I believe is the most important and most fundamental feature of any storage technology: reading the right data exactly as it was written into the storage array.

[ Note: Data integrity is even more critical in cloud storage, and data corruption, especially the silent kind, is even more acute in the clouds ]

Sure, this fundamental feature sounds like a given in any storage array but, believe me, there are enterprise storage arrays which have failed to deliver this simple feature properly. Throughout my storage career, I have had end users come to me with database corruption or file corruption, unable to access their data in an acceptable manner. Data corruption is real, folks!
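To make “reading the right data exactly as it was written” concrete, here is a minimal sketch of verify-on-read checksumming. It is a toy under my own assumptions (the `ChecksummedStore` class and its in-memory dictionaries are hypothetical), not any vendor’s implementation, but it shows the general idea: record a digest at write time and refuse to return data that no longer matches it.

```python
import hashlib


class ChecksummedStore:
    """Toy block store that keeps a SHA-256 digest alongside each block."""

    def __init__(self):
        self._blocks = {}   # block_id -> bytes
        self._digests = {}  # block_id -> hex digest recorded at write time

    def write(self, block_id, data: bytes):
        self._blocks[block_id] = data
        self._digests[block_id] = hashlib.sha256(data).hexdigest()

    def read(self, block_id) -> bytes:
        data = self._blocks[block_id]
        if hashlib.sha256(data).hexdigest() != self._digests[block_id]:
            raise IOError(f"checksum mismatch on block {block_id}")
        return data


store = ChecksummedStore()
store.write(7, b"payroll record")
print(store.read(7))

# Simulate silent corruption: change the data behind the store's back.
store._blocks[7] = b"payroll recorD"
try:
    store.read(7)
except IOError as err:
    print("detected:", err)
```

Without the digest check, the second read would happily return the altered bytes, which is exactly what makes silent corruption so dangerous.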

Data corruption.

After several weeks of reading this stuff, I got jaded with so many storage vendors playing leapfrog announcements with their millions of IOPS boasts.

The 3-legged stool

Rewind to circa 2012, just about the time when EMC® acquired XtremIO™. XtremIO™ was a nascent All-Flash startup, and many, including yours truly, saw the EMC® acquisition as being all about a high-performance storage array. I was having an email conversation with Shahar Frank, one of the co-founders of XtremIO™, expressing my views about their performance. What Shahar replied surprised me.

The fundamental strength of a storage array was like a 3-legged stool. 2 legs of the stool would be Performance and Protection, but with 2 legs, the person sitting on the stool would fall. The 3rd leg would stabilize the stool, and this 3rd leg was Reliability. This stumped me because XtremIO™’s most sellable feature was Performance. But the wisdom of Shahar pointed to Reliability, the least exciting and dullest feature of the 3. He was brilliant, of course, and went on to found ElastiFile (acquired by Google™), but that’s another story for another day.


OpenZFS 2.0 exciting new future

By cfheoh | October 19, 2020 - 7:38 pm |October 19, 2020 Clusters, Datto, deduplication, Deduplication, Delphix, Disks, Filesystems, Flash, FreeNAS, High Performance Computing, IBM, Intel, iXsystems, Joyent, Linux, Lustre, NAS, Nexenta, Oracle, Panasas, Panzura, Performance Caching, RAID, Reliability, Snapshots, Software Defined Storage, Solid State Devices


The OpenZFS (virtual) Developer Summit ended just over a weekend ago. I stayed up a bit (not much) to listen to some of the talks because it started at midnight my time, and ran till 5am on the first day and 2am on the second day. Like a giddy schoolboy, I was excited, not because I am working for iXsystems™ now, but because I have been a fan and a follower of the ZFS file system for a long time.

History wise, ZFS came out of Sun Microsystems in 2005. I started working with ZFS by reselling Nexenta in 2009 (my first venture into business with my company nextIQ) after I was professionally released by EMC early that year. I bought a Sun X4150 from one of Sun’s distributors, and started building a lab server. I didn’t like the workings of NexentaStor (and NexentaCore) very much, and it was priced in 8TB increments. Later, I started my second company with a partner, and it was he who showed me the elegance and beauty of ZFS through the command line. The creed of ZFS as a volume manager and a file system at the same time, driven from the CLI, had an effect on me. I was in love.

OpenZFS Developer Summit 2020 Logo

Exciting developments

Among the many talks shared at the OpenZFS Developer Summit 2020, there were a few ideas and developments which were exciting to me. Here are 3 which I liked, with some commentary on each.

Block Reference Table
dRAID (declustered RAID)
Persistent L2ARC


The Edge is coming! The Edge is coming!

By cfheoh | October 12, 2020 - 9:15 am |October 11, 2020 100Gigabit Ethernet, Analytics, Big Data, Containers, Data, Deep Learning, Edge Computing, Flash, Industry 4.0, InfluxDB, Linux, Machine Learning, Mellanox, Mellanox Technologies, Minio, nVidia, NVMe, Pravega, SNIA, Solid State Devices


Actually, Edge Computing is already here. It has been on everyone’s lips for quite some time, but for me and for many others, Edge Computing is still a hodgepodge of many things. The proliferation of devices, IoT, sensors and endpoints being pulled into the ubiquitous term of Edge Computing has made the scope ever changing and difficult to pin down. And it is this proliferation of edge devices that will generate voluminous amounts of data. Obvious questions emerge:

How do you store all the data?
How do you process all the data?
How do you derive competitive value from the data from these edge devices?
How do you securely transfer and share the data?

From the storage technology perspective, it might be easier to observe the traits of the data generated on the edge devices. In this blog, we also look at some new storage technologies out there that could be part of the Edge Computing present and future.

Edge Computing overview – Cloud to Edge to Endpoint

Storage at the Edge

The mantra of putting compute as close to the data as possible, and processing it where it is stored, is the main crux right now, at least where storage of the data is concerned. The latency of the round trip to computing resources in the cloud and back to the edge devices is not conducive, and in many older settings, the edge devices in a factory may not even be network enabled. In my last encounter several years ago, there were more than 40 interfaces, specifications and protocols, most of them proprietary, for the edge devices. And there is no industry-wide standard for these edge devices either.


Kubernetes Persistent Storage Managed Well

By cfheoh | October 5, 2020 - 8:25 am |October 5, 2020 100Gigabit Ethernet, Business Continuity, Cloud, Clusters, Containers, Flash, iSCSI, Kubernetes, Linux, Performance Caching, Scale-out architecture, Software Defined Storage, Storpool, Virtualization


[ Disclosure: This is a StorPool Storage sponsored blog ]

StorPool Storage – Distributed Storage

There is a rapid adoption of Kubernetes in the enterprise and in the cloud. The push for digital transformation to modernize businesses for a cloud native world in the next decade has lifted both containerized applications and the Kubernetes container orchestration platform to an unprecedented level. The application landscape, especially the enterprise, is looking at Kubernetes to address these key areas:

Scale
High performance
Availability and Resiliency
Security and Compliance
Controllable Costs
Simplified

The Persistent Storage Question

Enterprise applications such as relational databases and email servers, and even cloud native ones like NoSQL and analytics engines, demand a single source of truth for their data. Fundamental properties such as ACID (atomicity, consistency, isolation, durability) and BASE (Basic Availability, Soft State, Eventual Consistency) have to have persistent storage as the foundational repository for the data. And thus, persistent storage has rallied under the Container Storage Interface (CSI), which is fast becoming a de facto standard for Kubernetes. At last count, there were more than 80 CSI drivers from 60+ storage and cloud vendors, each providing block-level storage to Kubernetes pods.

However, at this juncture, Kubernetes is still very engineering-centric. Persistent storage is equally challenging, despite all the new developments and hype around it.


Intel is still a formidable force

By cfheoh | August 17, 2020 - 9:15 am |August 17, 2020 Algorithm, Analytics, Artificial Intelligence, Big Data, Clusters, Composable Infrastructure, Cray Inc, Deep Learning, Disks, Edge Computing, Filesystems, Flash, High Performance Computing, Industry 4.0, Intel, IoT, Linux, Machine Learning, Performance Benchmark, Performance Caching, Scale-out architecture, SNIA, Solid State Devices, Storage Field Day, Storage Market Share, Storage Optimization, Storage Tiering, Tech Field Day


It is easy to kick someone who is down. Bad news has stronger ripple effects than good news. Intel® is going through a rough patch, perhaps its worst so far. They delayed their 7nm manufacturing process, one which could have given Intel® breathing room in the CPU war with rival AMD. The process has been pushed back to 2021, possibly 2022.

Intel Apple Collaboration and Partnership started in 2005

Their association with Apple® is coming to an end after 15 years, and more security flaws surfaced after the Spectre and Meltdown debacle. ExtremeTech probably said it best (or worst) last month:

We’ve never seen Intel® struggle like this

If we look deeper (and I am sure you have), all this negative news was related to their processors. Intel® is much, much more than that.

Their Optane™ storage prowess

I have years of association with the folks at Intel® here in Malaysia, dating back 20 years. I hardly ever saw Intel® beating its own drum when it comes to storage technologies, but they are beginning to. The Optane™ revolution in storage has been a game changer. Optane™ enables the implementation of persistent memory or storage class memory, a performance tier that sits between DRAM and the SSD. The speed and, more notably, the latency of Optane™ are several times better than those of enterprise SSDs.

Intel pyramid of tiers of storage medium

If you want to know more about Optane™’s latency and speed, here is a very geeky article from Intel®:

Restoring the Balance between Bandwidth and Latency

The list of storage vendors who have embedded Intel® Optane™ into their gear is long. Vast Data, StorOne™, NetApp® MAX Data, Pure Storage® DirectMemory Modules, HPE 3PAR and Nimble Storage, Dell Technologies PowerMax and PowerScale, and many more cement Intel®’s storage prowess with Optane™.

3D XPoint, the Phase Change Memory technology behind Optane™, came from the joint venture between Intel® and Micron®. That partnership was dissolved in 2019, but it has not diminished the momentum of next generation Optane™. Alder Stream and Barlow Pass are going to be the Gen-2 SSD and Persistent Memory DC DIMM respectively. A screenshot of the Optane™ roadmap appeared in Blocks & Files last week.

Intel next generation Optane roadmap


Down the rabbit hole with Kubernetes Storage

By cfheoh | May 19, 2020 - 9:30 am |May 16, 2020 Acquisition, Algorithm, Amazon Web Services, Analytics, API, Artificial Intelligence, Ceph, Cloud, Clusters, Containers, Data Management, Edge Computing, Elastifile, Filesystems, Flash, Google, Hyperconvergence, Kubernetes, Linux, Minio, NFS, Object Storage


Kubernetes is on fire. Last week VMware® released the State of Kubernetes 2020 report, which surveyed companies with 1,000 employees and above. The results were not surprising, as adoption of this nascent technology is booming. But persistent storage remains the nagging concern for Kubernetes as it serves infrastructure resources to the application instances running in the containers of a pod in a cluster.

The standardization of storage resources has settled on CSI (Container Storage Interface). Storage vendors have almost, kind of, sort of agreed that API objects such as PersistentVolumes, PersistentVolumeClaims and StorageClasses, along with their parameters, would be the way to request storage resources from pre-provisioned volumes via the CSI driver plug-in. There are already more than 50 vendor-specific CSI drivers on GitHub.
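As a rough illustration of what such a request looks like, the sketch below builds a PersistentVolumeClaim manifest as a plain Python dictionary and prints it as JSON, which Kubernetes accepts alongside YAML. The claim name `db-data`, the StorageClass name `example-csi-block` and the 20Gi size are hypothetical placeholders of mine; a real cluster would use whatever StorageClass its CSI driver defines.

```python
import json


def make_pvc(name: str, storage_class: str, size: str) -> dict:
    """Build a PersistentVolumeClaim manifest as a plain dict."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            # ReadWriteOnce: mountable read-write by a single node
            "accessModes": ["ReadWriteOnce"],
            # The StorageClass points at the CSI driver doing the work
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


# Hypothetical names, for illustration only.
pvc = make_pvc("db-data", "example-csi-block", "20Gi")
print(json.dumps(pvc, indent=2))
```

The application only states how much storage it wants and from which class; which array, protocol or cloud volume ends up behind the claim is left to the CSI driver behind that StorageClass.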

Kubernetes and the CSI (Container Storage Interface) logos

The CSI plug-in method is the only way for Kubernetes to scale and keep its dynamic, loadable storage resource integration with external 3rd party vendors, all clamouring to grab a piece of this burgeoning demand both in the cloud and in the enterprise.

