Linux Archives - Storage Gaga

The AI Platformization of Storage – The Data Intelligence Platform

By cfheoh | March 3, 2025 - 7:30 am |March 3, 2025 Algorithm, Analytics, API, Artificial Intelligence, Big Data, Cloud, Clusters, Containers, Data Direct Networks, Data Fabric, Data Governance, Data Management, Data Security, DDN, Deep Learning, Digital Transformation, eDiscovery, Filesystems, High Performance Computing, Infiniband, Kubernetes, Linux, Lustre, Machine Learning, Mellanox Technologies, nVidia, Object Storage, Openstack, RDMA, Scale-out architecture, Software Defined Storage, Software-defined Datacenter, Storage Optimization

Paradigm shift for Data.

For the longest time, networked storage technology has been about data sharing, be it blocks, files or objects. The data from these protocols is delivered over the network, mostly over Fibre Channel and/or Ethernet (although I remember implementing NFS over Asynchronous Transfer Mode at Sarawak Shell in East Malaysia), in a client-server fashion.

From the late 2000s onwards, unified storage or multi-protocol storage (where the storage array is able to serve all 3 of SAN, NAS and S3 services) was all the rage. All the prominent enterprise storage vendors had a solution or two in their portfolios. I started viewing networked storage as a Data Services Platform, which I started explaining in 2017. Within the data services platform, various features revolve around my A.P.P.A.R.M.S.C. framework (I crafted the initial framework in 2000, thanks to Jon Toigo‘s book – The Holy Grail of Data Management). This framework, and the approach I used for my consulting and analyst work, worked well and is still relevant, even after 25 years.

But AI is changing the data landscape. AI is changing the way data is consumed and processed through the networks between the compute layer and the storage layer. It is indeed, for me, a paradigm shift of data, and the storage layer, better known as AI Data Infrastructure now, is shifting as well. And this shift will accelerate the exponential growth in innovations, with AI and super-charged data leading the way.

DDN Infinia Data Intelligence Platform (screencapture from DDN Beyond Artificial webinar)

Continue reading →

Rethinking Storage OKRs for AI Data Infrastructure – Part 1

By cfheoh | January 6, 2025 - 8:00 am |January 13, 2025 Algorithm, Analytics, API, Appliance, Artificial Intelligence, Big Data, Cloud, Clusters, Composable Infrastructure, Containers, Data, Data Direct Networks, Data Management, DDN, Deep Learning, Dell, Digital Transformation, Filesystems, HDS, High Performance Computing, HPE, Infiniband, Kubernetes, Linux, Machine Learning, Mellanox Technologies, NetApp, nVidia, NVMe, Object Storage, Parallel NFS, Performance Benchmark, Pure Storage, Reliability, Scale-out architecture, Software Defined Storage, Storage Optimization, Vast Data, WekaIO

GPU is king

In the AI world, the GPU infrastructure is the deity at the altar. The utilization rate of the GPUs is kept at the highest to get the maximum compute infrastructure return-on-investment (ROI). Keeping the GPUs resolutely busy is a must. HPSS is very much part of that ecosystem.

These are a few OKRs I would consider for the storage or data infrastructure for AI.

Reliability
Speed
Power Efficiency
Security

Let’s look at each one of them from the point of view of a storage practitioner like me.

Continue reading →

Deploying a MinIO SNMD Object Storage Server in TrueNAS SCALE

By cfheoh | May 27, 2024 - 7:30 am |May 27, 2024 Appliance, Cloud, Clusters, Containers, Disks, Docker, Filesystems, FreeNAS, iXsystems, Kubernetes, Linux, Machine Learning, Minio, Object Storage, OpenZFS, RAID, TrueNAS, Virtualization

3 Comments

[ Preamble ] This deployment of MinIO SNMD (single node multi drive) object storage server on TrueNAS® SCALE 24.04 (codename “Dragonfish”) is experimental. I am just deploying this in my home lab for the fun of it. Do not deploy in any production environment.

I have been contemplating this for quite a while. Which MinIO deployment mode on TrueNAS® SCALE should I work on? For one, there are 3 modes – Standalone, SNMD (Single Node Multi Drives) and MNMD (Multi Node Multi Drives). Of course, the ideal lab experiment is the MNMD deployment, the MinIO cluster, and I am still experimenting with this on my meagre lab resources.

In the end, I decided to implement SNMD since this is, most likely, deployed on top of a TrueNAS® SCALE storage appliance instead of on x86 bare metal or in a Kubernetes cluster on Linux systems. Incidentally, the concept of MNMD on top of TrueNAS® SCALE is “Kubernetes cluster”-like, albeit on a different container platform. At the same time, if this is deployed on TrueNAS® SCALE Enterprise, a dual-controller TrueNAS® storage appliance, the appliance will take care of the availability of the “MinIO nodes” through its active-passive HA architecture. Otherwise, it can be a full MinIO cluster distributed across several TrueNAS® storage appliances (minimum 4 nodes in a 2+2 erasure set) in an MNMD deployment scheme.
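To make the "2+2 erasure set" arithmetic concrete, here is a minimal sketch of the capacity trade-off in a generic data+parity layout. The function name and the 10 TB per-node capacity are hypothetical, and this is generic erasure-coding arithmetic rather than MinIO's own sizing logic.

```python
def erasure_set_summary(data_shards: int, parity_shards: int, shard_capacity_tb: float) -> dict:
    """Summarise raw vs usable capacity for a data+parity erasure set."""
    total_shards = data_shards + parity_shards
    return {
        "raw_tb": total_shards * shard_capacity_tb,
        "usable_tb": data_shards * shard_capacity_tb,
        "efficiency": data_shards / total_shards,
        # Data remains reconstructable with up to this many shards missing.
        "shards_that_can_be_lost": parity_shards,
    }


# Assumption: 4 nodes, each contributing one 10 TB shard, in a 2+2 layout.
summary = erasure_set_summary(data_shards=2, parity_shards=2, shard_capacity_tb=10.0)
print(summary)
```

In this layout half of the raw capacity goes to parity, which is the price paid for surviving the loss of two of the four shards.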

Ideally, the MNMD deployment should look like this:

MinIO distributed multi-node cluster architecture (credit: MinIO)

Continue reading →

Enhancing NAS client resiliency and performance with SMB Multichannel and NFS nconnect

By cfheoh | May 13, 2024 - 7:30 am |May 11, 2024 100Gigabit Ethernet, Azure NetApp Files, CIFS, Clusters, Filesystems, FreeNAS, Gluster, High Performance Computing, Huawei, iXsystems, Linux, Microsoft, Microsoft Azure, NAS, NetApp, NFS, NFS+, pNFS, Pure Storage, Ryuusi, SMB, Software Defined Storage, Storage Optimization, TrueNAS, VMware

2 Comments

NAS (network attached storage) is obviously the file-level workhorse for shared resources in the network of any organization. SMB (server message block) for Windows environments and NFS (network file system) for Linux platforms are the 2 most prominent protocols that rule the NAS world. Of course, there are SMB implementations such as Samba on non-Windows platforms like Linux, and NFS implementations on Windows as well.

As the versions of both network file sharing protocols iterated, present versions of SMB v3.x and NFS v4.x (NFS v3 on the supported Linux kernel version) on the client-side have evolved well. Both now have enhanced resiliency and performance improvement features. And there is an underlying similarity of both implementations. This blog looks at the client-side architectures of both.

One TCP connection

NAS is a client-server architecture. Over the network, NAS clients (SMB or NFS) access their corresponding NAS server(s) – SMB or NFS server(s) – through the TCP/IP network.

NAS client-server architecture (Credit: https://hypertecsp.com/en-CA/knowledge-base/nas-vs-san/)

One very important starting point to note is the use of one TCP connection per NAS client-to-server relationship. For both SMB and NFS, there is just one TCP link between the client and the server, even if there are several SMB mapped shares or NFS mount points respectively on the client.

For a long time, this one TCP connection was sufficient for the NAS traffic. But as network file accesses grow, this connection becomes both a single point of failure and a performance bottleneck.
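As the post title suggests, on Linux NFS clients the remedy for the single connection is the `nconnect` mount option, which opens several TCP connections to the same server. Below is a minimal sketch that only assembles such a mount command. The helper name, server name and paths are hypothetical, and the 1 to 16 range reflects the limit in the Linux NFS client as I understand it.

```python
import shlex


def nfs_mount_command(server: str, export: str, mountpoint: str,
                      nconnect: int = 4, vers: str = "4.1") -> list:
    """Build (but do not run) a Linux NFS mount command that uses nconnect."""
    if not 1 <= nconnect <= 16:
        raise ValueError("nconnect must be between 1 and 16")
    options = f"vers={vers},nconnect={nconnect}"
    return ["mount", "-t", "nfs", "-o", options, f"{server}:{export}", mountpoint]


# Hypothetical NAS server and paths, for illustration only.
cmd = nfs_mount_command("nas.example.com", "/mnt/tank/share", "/mnt/share", nconnect=8)
print(shlex.join(cmd))
```

The command is returned as an argument list so it can be reviewed, or passed to `subprocess.run` with root privileges, instead of being executed blindly.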

Continue reading →

Proxmox storage with TrueNAS iSCSI volumes

By cfheoh | November 6, 2023 - 7:00 am |November 7, 2023 Appliance, Business Continuity, Data Availability, Filesystems, FreeNAS, iSCSI, iXsystems, Kubernetes, Linux, OpenZFS, Proxmox, RAID, Reliability, SCSI, Software Defined Storage, Storage Area Network, TrueNAS, Virtualization, VMware

7 Comments

A few weeks ago, I decided to wipe clean my entire lab setup running Proxmox 6.2. I wanted to connect the latest version, Proxmox VE 8.0-2, to iSCSI LUNs from the TrueNAS® system I have with me. I thought it would be fun to have the configuration steps and the process documented. This is my journal on how to provision a TrueNAS® CORE iSCSI LUN to Proxmox storage. This iSCSI volume in Proxmox is where the VMs will be installed.

Here is a simplified network diagram of my setup but it will be expanded to a Proxmox cluster in the future with the shared storage.

Proxmox and TrueNAS network setup

Preparing the iSCSI LUN provisioning

The iSCSI LUN (logical unit number) is provisioned as a logical disk volume to the Proxmox node, where the initiator-target relationship and connection are established.

This part assumes that a zvol has been created from the zpool. At the same time, the IQN (iSCSI Qualified Name) should be known to the TrueNAS® storage as it establishes the connection between Proxmox (iSCSI initiator) and TrueNAS (iSCSI target).

The IQN for Proxmox can be found by viewing the content of /etc/iscsi/initiatorname.iscsi within the Proxmox shell, as shown in the screenshot below.

Where to find the Proxmox iSCSI IQN

The green box shows the IQN of the Proxmox node, which follows the format iqn.year-month.com.domain:generated-hostname. This will be used during the iSCSI target portal configuration in the TrueNAS® webGUI.
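For readers who prefer to script this step, here is a minimal sketch that pulls the IQN out of the text of /etc/iscsi/initiatorname.iscsi and splits it into the parts described above. The sample file content and its IQN value are made up for illustration, and the function names are my own.

```python
import re

# iqn.<year>-<month>.<reversed domain>[:<unique name>]
IQN_PATTERN = re.compile(r"^iqn\.(\d{4})-(\d{2})\.([^:]+)(?::(.+))?$")


def parse_initiator_file(text: str) -> str:
    """Return the InitiatorName value from initiatorname.iscsi content."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")
        if key.strip() == "InitiatorName":
            return value.strip()
    raise ValueError("no InitiatorName entry found")


def split_iqn(iqn: str) -> dict:
    """Split an IQN into its date, naming authority and unique name."""
    match = IQN_PATTERN.match(iqn)
    if match is None:
        raise ValueError(f"not a valid IQN: {iqn!r}")
    year, month, authority, unique = match.groups()
    return {"year": year, "month": month, "authority": authority, "unique": unique}


# Hypothetical file content; on a real node, read the file instead.
sample = "## Example only\nInitiatorName=iqn.1993-08.org.debian:01:abc123def456\n"
iqn = parse_initiator_file(sample)
parts = split_iqn(iqn)
print(iqn, parts)
```

On an actual Proxmox node the text would come from `open("/etc/iscsi/initiatorname.iscsi").read()` rather than the sample string.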

Continue reading →

OpenZFS dRAID has risen!

By cfheoh | October 14, 2023 - 2:25 pm |October 14, 2023 Appliance, Clusters, Data Availability, Delphix, Disks, Filesystems, FreeNAS, High Performance Computing, Hyperconvergence, Intel, iXsystems, Linux, OpenZFS, RAID, Software Defined Storage, TrueNAS

Knowing RAID resilvering

RAID rebuilding or reconstruction is a painful and potentially risky process. In OpenZFS and ZFS speak, this process is called resilvering. In simple layman's terms, when a drive (or drives) fails in a parity-based RAID volume (e.g. a RAID-Z1 or RAID-Z2 vdev), the data which was previously on the failed drive is recreated on the newly integrated spare drive. The structural integrity of the RAID volume (and the storage pool) is preserved, but the data that was lost is painstakingly remade through the mathematical algorithm of the parity function of the RAID volume.
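The idea of remaking lost data from parity can be shown with a minimal single-parity XOR sketch. This is a simplification under stated assumptions: three fixed-size data blocks and one parity block, whereas RAID-Z uses variable-width stripes and additional parity mathematics for RAID-Z2 and above. It is not OpenZFS code.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equally sized byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks, one per drive, plus a parity block on a fourth drive.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# The drive holding data[1] fails; rebuild its block from the survivors.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt == data[1])
```

Every surviving drive has to be read in full to recompute the missing blocks, which is why the process is slow and stressful for the remaining drives.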

When hard disk drives were small in capacity, like 2TB or less, the RAID resilvering process was faster to complete, returning the parity RAID volume to a normal, online state. But today, drives are 22TB and higher, leaving the traditional RAID resilvering process to take days and even weeks. This leaves the RAID volume vulnerable to another possible drive failure, weakening the integrity of the RAID volume. Even worse, most modern-day storage arrays have many disk drives, even into the thousands. And yes, solid state drives would probably be faster in resilvering, but the same mechanics pretty much apply in OpenZFS.

At the same time, the spare drives are assigned physically and designated to the OpenZFS storage pool, and are not part of the vdev until the resilvering process kicks in.

Yes, this is pretty much a physical process that takes time, computing resources and patience. Note the operative word of “physical” here.

dRAID resilvering

dRAID speeds up the RAID resilvering process severalfold, returning the RAID volume (or vdev) to a healthy state much faster than the traditional OpenZFS RAID resilvering process. It uses a logical (as opposed to physical) RAID layout concept and uses “logical spare drives”. Thus, there will be many spare “blocks” distributed across the entire dRAID zpool, as shown in the diagram below.
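A rough back-of-envelope model shows why spreading the spare capacity helps. It assumes the rebuild is limited only by write throughput, with a hypothetical 150 MB/s sustained rate per drive and 10 drives sharing the distributed spare space. Real resilvers are also limited by reads, parity computation and normal pool I/O, so these are best-case figures rather than measurements.

```python
def rebuild_hours(capacity_tb: float, mb_per_s: float, writers: int = 1) -> float:
    """Best-case hours to rewrite capacity_tb across `writers` drives."""
    seconds = (capacity_tb * 1e12) / (mb_per_s * 1e6 * writers)
    return seconds / 3600


# Traditional spare: all rebuilt data funnels into one physical drive.
traditional = rebuild_hours(capacity_tb=22, mb_per_s=150)

# Distributed spare: rebuilt data is written to spare blocks on 10 drives.
distributed = rebuild_hours(capacity_tb=22, mb_per_s=150, writers=10)

print(f"single spare: {traditional:.1f} h, distributed spare: {distributed:.1f} h")
```

Even in this idealised model the single spare drive is the bottleneck, and removing that bottleneck is the essence of the logical spare concept.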

Traditional RAID vdev vs dRAID vdev

Continue reading →

Open Source Storage and Data Responsibility

By cfheoh | September 2, 2023 - 7:08 am |September 2, 2023 Amazon Web Services, Appliance, Business Continuity, Data, Data Archiving, Data Availability, Data Corruption, Data Management, Data Protection, Data Security, Digital Transformation, Disaster Recovery, FreeNAS, iXsystems, Linux, Security, Software Defined Storage, TrueNAS

1 Comment

There was a Super Blue Moon a few days ago. It was a rare sky show. Friends of mine who are photo and moon gazing enthusiasts were showing off their digital captures online. One ignorant friend, who was probably a bit envious of the other people’s attention, quipped that his Oppo Reno 10 Pro Plus can take better pictures. Oppo Reno 10 Pro Plus claims 3x optical zoom and 120x digital zoom. Yes, 120 times!

Yesterday, a WIRED article came out titled “How Much Detail of the Moon Can Your Smartphone Really Capture?” It was a very technical article. I thought the author did an excellent job explaining the physics behind his notes. But I also found the article funny, flippant even, when I juxtaposed this WIRED article with what my envious friend was saying the other day about his phone’s camera.

Super Blue Moon 2023

Open Source storage expectations and outcomes

I work for iXsystems™. Open Source has been in its DNA for over 30 years. Similarly, I have also worked on Open Source (decades before it was called open source) in my home labs ever since I entered the industry. I had the SoftLanding Linux System on 3.5″ diskette (Linux kernel 0.99), and I bought a boxed set of FreeBSD OS from Walnut Creek (photo below). My motivation was to learn as much as possible about the information technology world because I was making my first steps into building my career (I was also quietly trying to prove my father wrong) in the IT industry.

FreeBSD Boxed Set (circa 1993)

Open source has democratized technology. It has placed the power of very innovative technology into the hands of the common people. With Open Source, I saw the IT landscape changing as well, especially for home labbers like myself in the early years. Social media platforms, FAANG (Facebook, Apple, Amazon, Netflix, Google), etc., have amplified that power (to the people). But with great power comes great responsibility. And some users with little technology background start to have hallucinated expectations and outcomes. Just like my friend with the “powerful” Oppo phone.

Likewise, in my world, I have plenty of anecdotes of these types of open source storage users having wild expectations, but few of the skills needed to make them a reality.

Continue reading →

Open Source on my mind

By cfheoh | July 25, 2022 - 6:30 am |July 25, 2022 Ceph, Cloud, Decentralized Storage, Digital Transformation, FreeNAS, iXsystems, Kubernetes, Linux, Minio, NFS, Nutanix, Object Storage, Openstack, OpenZFS, TrueNAS

2 Comments

Last week was packed with topics around Open Source software. I want to voice my opinions here (with a bit of ranting), and I hope not to rouse too many abhorrent comments from different parties and views. This blog is to create conversations, even controversial ones, but we must first agree that there will be disagreements. We must accept disagreements as part of this conversation.

In my 30-year career, Open Source has been a big part of my development and progress. The ideas of freely using (certain) software without any licensing implications, and of this software being openly available, were not always as welcome as they are now. I think the Open Source revolution has created an innovation movement that is still going strong. It has not only permeated completely into the IT industry; Open Source is also now in almost every part of the technology-based industries as well. The Open Source influence is massive.

Open Source word cloud

In the beginning

In the beginning, in my beginning in 1992, the availability of software and its source code was a closed affair. Coming from a VAX/VMS background (I was a system admin for my mathematics department’s mini computers), Unix liberated my thinking. The final 6 months at the university were spent on systems programming in C, and it completely changed how I wanted my career to take shape. The mantra of “Free as in Freedom” in the General Public License (GPL), which I got to know of much later, sat well with my own tenets in life.

If closed source development models led to proprietary software and a centralized way of distributing software with licenses, I would count the Open Source development models as one of the earliest decentralized technology frameworks. Down with the capitalistic corporations (aka Evil Empires)!

It was certainly a wonderful and generous way to make the world what it is today. It is a better world now.

Continue reading →

Stating the case for a Storage Appliance approach

By cfheoh | June 13, 2022 - 8:00 am |June 12, 2022 Analytics, Appliance, Artificial Intelligence, Backup, CIFS, Clusters, Data Management, Data Protection, Deduplication, Disks, Fibre Channel, FreeNAS, Hyperconvergence, Isilon, iXsystems, Jon Toigo, Linux, Lustre, Machine Learning, Minio, NAS, NetApp, Nexenta, NFS, Nimble Storage, NVMe, Oracle, Performance Caching, RAID, Reliability, Scale-out architecture, SMB, Snapshots, SoftIron, Software Defined Storage, Software-defined Datacenter, Storage Area Network, TrueNAS, Virtualization

1 Comment

I was in Indonesia last week to meet with iXsystems™‘ partner PT Maha Data Solusi. I had the wonderful opportunity to meet with many people there and one interesting and often-replayed question arose. Why aren’t iX doing software-defined-storage (SDS)? It was a very obvious and deliberate question.

After all, iX is already providing the free use of the open source TrueNAS® CORE software that runs on many x86 systems as an SDS solution and yet commercially, iX sell the TrueNAS® storage appliances.

This argument between a storage appliance model and a storage software only model has been debated for more than a decade, and it does come into my conversations on and off. I finally want to address this here, with my own views and opinions. And I want to say that I am open to both models because, as a storage consultant, I see that both have their pros and cons. Up front, I gravitate to the storage appliance model, and here’s why.

My story of the storage appliance begins …

Back in the 90s, most of my work was on Fibre Channel and NFS. iSCSI did not exist yet (iSCSI was ratified in 2003). It was almost exclusively on the Sun Microsystems® enterprise storage with Sun’s software resell of the Veritas® software suite that included the Sun Volume Manager (VxVM), Veritas® Filesystem (VxFS), Veritas® Replication (VxVR) and Veritas® Cluster Server (VCS). I didn’t do much Veritas® NetBackup (NBU) although I was trained at Veritas® in Boston in July 1997 (I remember that 2-week trip fondly). It was just over 2 months after Veritas® acquired OpenVision. Backup Plus was the NetBackup.

Between 1998 and 1999, I spent a lot of time working on Sun NFS servers. The prevalent networking speed at that time was 100Mbits/sec. And I remember having this argument with a Sun partner engineer by the name of Wong Teck Seng. Teck Seng was an inquisitive fella (still is) and he was raving about this purpose-built NFS server he knew about and he shared his experience with me. I dismissed him, brushing aside his always-on tech orgasm, and did not see anything great about a NAS storage appliance. Auspex™ was big then, and I knew of them.

I joined NetApp® as Malaysia’s employee #2. It was an odd few months working with a storage appliance but after a couple of months, I started to understand and appreciate the philosophy. The Storage Appliance Model made sense to me, and still does to this day.

Continue reading →

As Disk Drive capacity gets larger (and larger), the resilient Filesystem matters

By cfheoh | May 30, 2022 - 8:00 am |May 30, 2022 Appliance, Backup, Ceph, CIFS, Clusters, Data Management, Data Protection, Disks, Filesystems, Flash, FreeNAS, Gluster, Hadoop, High Performance Computing, iXsystems, Joyent, Linux, Lustre, NetApp, Nexenta, NFS, OpenZFS, Oracle, RAID, Reliability, SMB, Snapshots, TrueNAS, Virtualization

2 Comments

I just got home from the wonderful iXsystems™ Sales Summit in Knoxville, Tennessee. The key highlight was to christen the opening of the iXsystems™ Maryville facility, the key operations center that will house iX engineering, support and part of marketing as well. News of this can be found here.

iX datacenter in the new Maryville facility

Western Digital® has always been a big advocate of iX, and at the Summit, they shared their roadmaps for hard disk drives (HDD), solid state drives (SSD) and other storage platforms. I felt like a kid in a candy store because I love all this excitement in the disk drive industry. Who says HDDs are going to be usurped by SSDs?

Several disk drive manufacturers, including Western Digital®, have announced larger capacity drives. Here is some news from each vendor in recent months.

Other than the AFR (annualized failure rates) numbers published by Backblaze every quarter, the Capacity factor has always been a measurement of high interest in the storage industry.
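For reference, the AFR figure is commonly computed from drive failures and accumulated drive-days, which is the approach Backblaze describes in its reports. The sketch below uses made-up numbers (1,000 drives running for a 90-day quarter with 5 failures) purely for illustration.

```python
def annualized_failure_rate(failures: int, drive_days: float) -> float:
    """AFR in percent: failures per drive-year of operation, times 100."""
    if drive_days <= 0:
        raise ValueError("drive_days must be positive")
    drive_years = drive_days / 365
    return failures / drive_years * 100


# Hypothetical fleet: 1,000 drives x 90 days = 90,000 drive-days, 5 failures.
afr = annualized_failure_rate(failures=5, drive_days=1000 * 90)
print(f"AFR: {afr:.2f}%")
```

Using drive-days rather than a simple drive count keeps the rate fair when drives are added or retired part-way through the period.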

Continue reading →
