NFS – Storage Gaga

Intelligent Data Movement and Data Placement dictate the future of AI Data Infrastructure

By cfheoh | July 29, 2025 - 7:48 am |July 29, 2025 100Gigabit Ethernet, Algorithm, Analytics, Artificial Intelligence, BeeGFS, Big Data, Big Switch Networks, Broadcom, compression, Computational Storage, Containers, CXL, Data Direct Networks, Data Management, DDN, Filesystems, Flash, Hammerspace, High Performance Computing, Machine Learning, NFS, nVidia, NVMe, Parallel NFS, Performance Benchmark, Performance Caching, pNFS, Scale-out architecture, Software Defined Storage, Storage Optimization, Storage Tiering, Vast Data, WekaIO

1 Comment

Over the weekend I read a couple of articles that started by pinning the blame for slowing AI ambitions on outdated networking infrastructure. The 2 articles are:

The AI Infrastructure bottleneck no one talks about (which turned out to be a not-so-subtle play for Netris, a secure multi-tenant network provisioner technology).
Data Infrastructure: The missing link in successful AI adoption (a more subtle introduction of Indicium, an AI data services company).

I do not fully agree that networking infrastructure is the main inhibitor of AI ambitions per se, at least not from my experience and from what I know so far of present developments in high performance networking. In fact, AI networking infrastructure has been growing by leaps and bounds, laying down ultra-high throughput plumbing between the GPUs (and, by extension, up the stack to the AI models and applications) and the data storage infrastructure.

The NVIDIA-heavy GPU compute infrastructure is, of course, dominated by NVIDIA’s own networking infrastructure. NVIDIA Spectrum (Ethernet), Quantum (InfiniBand), BlueField (data processing units), ConnectX and LinkX are the mainstays of DGX Cloud, and a big part of NVIDIA NCPs as well.

In fact, at one of DDN’s NCP customers, I have seen a 10-node DDN EXAscaler cluster deliver almost 1.1TB/sec read and 750GB/sec write throughput to the GPU compute cluster, out-of-the-box, all with 200Gbps networking gear.
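To put those figures in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes decimal units (1TB = 1000GB), an even spread across the 10 nodes, and raw line rate with no protocol overhead. The excerpt does not say how many network ports each node has, so the link count below is only a lower bound implied by the arithmetic, not a description of the actual deployment.

```python
import math


def per_node_gbytes(aggregate_gbytes_per_sec: float, nodes: int) -> float:
    """Average throughput each node must sustain, in GB/sec."""
    return aggregate_gbytes_per_sec / nodes


def min_links(per_node_gbytes_per_sec: float, link_gbits_per_sec: float) -> int:
    """Fewest links of a given speed needed per node, ignoring overhead."""
    link_gbytes_per_sec = link_gbits_per_sec / 8  # bits to bytes
    return math.ceil(per_node_gbytes_per_sec / link_gbytes_per_sec)


# Figures quoted in the post: ~1.1TB/sec read, 750GB/sec write, 10 nodes.
read_per_node = per_node_gbytes(1100, 10)
write_per_node = per_node_gbytes(750, 10)

print(f"read per node:  {read_per_node} GB/sec")
print(f"write per node: {write_per_node} GB/sec")
print(f"200Gbps links needed for reads:  {min_links(read_per_node, 200)}")
print(f"200Gbps links needed for writes: {min_links(write_per_node, 200)}")
```

Under these assumptions each node sustains about 110GB/sec of reads, which is more than four 200Gbps links can carry at line rate, so every node would need at least five of them.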

Continue reading →

The All-Important Storage Appliance Mindset for HPC and AI projects

By cfheoh | July 22, 2024 - 7:30 am |November 17, 2024 API, Appliance, Artificial Intelligence, BeeGFS, Big Data, Cloud, Clusters, Data Direct Networks, DDN, Deep Learning, Digital Transformation, Elastifile, Filesystems, Flash, High Performance Computing, Infiniband, iXsystems, Lustre, Machine Learning, MapReduce, NAS, NetApp, NFS, nVidia, NVMe, Object Storage, Parallel NFS, Performance Benchmark, Performance Caching, RDMA, Scale-out architecture, Software Defined Storage, Solid State Devices, Storage Optimization, ThinkParq, WekaIO


I am a strong believer in using the right tool to do the job right. I said this 2 years ago, in my blog “Stating the case for a Storage Appliance approach“. It was written when I was working for an open source storage company. And I am an advocate of the crafter versus assembler mindset, especially in the enterprise and high-performance storage technology segments.

I have joined DDN. Even at DDN, that same mindset has not changed a bit. I have been saying all along that the storage appliance model should always be the mindset for the businesses’ peace-of-mind.

My view of the storage appliance model began almost 25 years ago. I came into the NAS systems world via Sun Microsystems®. Sun was famous for running NFS servers on general Sun Solaris servers: NFS services on Unix systems. Back then, I remember arguing with one of the Sun distributors about the merits of running NFS over 100Mbit/sec Ethernet on Sun servers. I was drinking Sun’s Kool-Aid big time.

When I joined Network Appliance® (now NetApp®) in 2000, my worldview of putting software on general purpose servers changed. Network Appliance® had one product family, the FAS700 (720, 740, 760) family. All NetApp® did in the beginning was serve NFS. They were the NAS filers and nothing else.

I was completely sold on the appliance way with NetApp®. Firstly, it was the very first time I learned that such network storage services could be provisioned with an appliance concept. This was different from Sun. I was used to managing NFS exports on a Sun SPARCstation 20 to Unix clients in the network.

Secondly, my mindset began to take shape that “you have to have the right tool to do the job correctly and extremely well“. Well, the toaster toasts bread very well and nothing else. And the fridge (an analogy used by Dave Hitz, I think) does what it does very well too. That is what the appliance does. You definitely cannot grill a steak with a bread toaster, just like you can’t run excellent, ultra-high performance storage services to serve the demanding AI and HPC applications on a general server platform. You have to have a storage appliance solution for High-Speed Storage.

That little Network Appliance® toaster award given out to exemplary employees still stands vividly in my mind. The NetApp® tagline back then was “Fast, Simple, Reliable”. That solidified my mindset for high-speed storage in the AI and HPC projects of the present day.

DDN AI400X2 Turbo Appliance

Costs Benefits and Risks

I like to think about what the end users are thinking about. There are investment costs involved and, along with them, risks to the investments as well as their benefits. Let’s just simplify and lump them into a Cost-Benefits-Risk analysis triangle. These variables come into play in the decision making of AI and HPC projects.

Continue reading →

Enhancing NAS client resiliency and performance with SMB Multichannel and NFS nconnect

By cfheoh | May 13, 2024 - 7:30 am |May 11, 2024 100Gigabit Ethernet, Azure NetApp Files, CIFS, Clusters, Filesystems, FreeNAS, Gluster, High Performance Computing, Huawei, iXsystems, Linux, Microsoft, Microsoft Azure, NAS, NetApp, NFS, NFS+, pNFS, Pure Storage, Ryuusi, SMB, Software Defined Storage, Storage Optimization, TrueNAS, VMware

2 Comments

NAS (network attached storage) is obviously the file-level workhorse for shared resources in the network of any organization. SMB (server message block) for Windows environments and NFS (network file system) for Linux platforms are the 2 most prominent protocols that rule the NAS world. Of course, there are SMB implementations such as Samba on non-Windows platforms like Linux, and NFS implementations on Windows as well.

As the versions of both network file sharing protocols iterated, present versions of SMB v3.x and NFS v4.x (NFS v3 on the supported Linux kernel version) on the client-side have evolved well. Both now have enhanced resiliency and performance improvement features. And there is an underlying similarity of both implementations. This blog looks at the client-side architectures of both.

One TCP connection

NAS is a client-server architecture. Over the network, NAS clients (SMB or NFS) access their corresponding NAS server(s) – SMB or NFS server(s) – through the TCP/IP network.

NAS client-server architecture (Credit: https://hypertecsp.com/en-CA/knowledge-base/nas-vs-san/)

One important starting point to note is the use of one TCP connection per NAS client-to-server relationship. For both SMB and NFS, there is just one TCP link between the client and the server, even if there are several SMB mapped shares or NFS mount points respectively on the client.

For a long time, this one TCP connection was sufficient for the NAS traffic. But as network file accesses grow, this connection becomes both a single point of failure and a performance bottleneck.
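One way to see the single-connection behaviour for yourself is to count established TCP connections from a client to each NFS server. The Python sketch below parses text in the column layout printed by the Linux `ss -tn` command and counts peers on TCP port 2049, the standard NFS port. The sample output, addresses and function name are made up for illustration, and the parser is deliberately simple (it only looks at the first five columns).

```python
from collections import Counter

NFS_PORT = "2049"  # standard NFS TCP port

# Hypothetical sample in the layout of `ss -tn`; addresses are documentation IPs.
SAMPLE_SS_OUTPUT = """\
State  Recv-Q  Send-Q  Local Address:Port   Peer Address:Port
ESTAB  0       0       192.0.2.10:790       192.0.2.50:2049
ESTAB  0       0       192.0.2.10:51514     192.0.2.60:445
ESTAB  0       0       192.0.2.10:22        192.0.2.99:40022
"""


def nfs_connections_per_server(ss_output: str) -> Counter:
    """Count established TCP connections to port 2049, grouped by server."""
    counts = Counter()
    for line in ss_output.splitlines():
        fields = line.split()
        # Skip the header line and anything that is not an established socket.
        if len(fields) < 5 or fields[0] != "ESTAB":
            continue
        peer_host, _, peer_port = fields[4].rpartition(":")
        if peer_port == NFS_PORT:
            counts[peer_host] += 1
    return counts


for server, count in nfs_connections_per_server(SAMPLE_SS_OUTPUT).items():
    print(f"{server}: {count} NFS TCP connection(s)")
```

With the classic behaviour described above, the count stays at one per server no matter how many exports from that server are mounted. On a client where a multi-connection feature such as NFS nconnect is in use, the same count would be expected to be higher.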

Continue reading →

Open Source on my mind

By cfheoh | July 25, 2022 - 6:30 am |July 25, 2022 Ceph, Cloud, Decentralized Storage, Digital Transformation, FreeNAS, iXsystems, Kubernetes, Linux, Minio, NFS, Nutanix, Object Storage, Openstack, OpenZFS, TrueNAS

2 Comments

Last week was packed with topics around Open Source software. I want to voice my opinions here (with a bit of ranting), and I hope not to rouse too many abhorrent comments from different parties and views. This blog is to create conversations, even controversial ones, but we must first agree that there will be disagreements. We must accept disagreements as part of this conversation.

In my 30-year career, Open Source has been a big part of my development and progress. The ideas of freely using (certain) software without any licensing implications, and of this software being openly available, were not always as welcome as they are now. I think the Open Source revolution has created an innovation movement that is still going strong. It has not only permeated completely into the IT industry; Open Source is now in almost every part of the technology-based industries as well. The Open Source influence is massive.

Open Source word cloud

In the beginning

In the beginning, in my beginning in 1992, software and its source code were closed. Coming from a VAX/VMS background (I was a system admin for my mathematics department’s minicomputers), Unix liberated my thinking. The final 6 months in the university were spent on systems programming in C, and that completely changed how I wanted to shape my career. The mantra of “Free as in Freedom” in the General Public License (GPL), which I got to know of much later, sat well with my own tenets in life.

If closed source development models led to proprietary software and a centralized way of distributing software with licenses, I would count the Open Source development models as one of the earliest decentralized technology frameworks. Down with the capitalistic corporations (aka Evil Empires)!

It was certainly a wonderful and generous way to make the world what it is today. It is a better world now.

Continue reading →

TrueNAS SCALE Clustered SMB to the fore

By cfheoh | July 4, 2022 - 8:00 am |July 4, 2022 API, Appliance, Business Continuity, CIFS, Clusters, Containers, Data Availability, Data Management, Disaster Recovery, Filesystems, FreeNAS, Gluster, High Performance Computing, iXsystems, Microsoft, NAS, NFS, Ryuusi, Scale-out architecture, SCSI, SMB, Software Defined Storage, TrueNAS, Unified Storage, Virtualization


iXsystems™ released the second iteration of the TrueNAS® SCALE software just over a week ago. It is known as version 22.02.2 or Angelfish.2, with the most notable upgrades being HA (High Availability) for SCALE and Clustered SMB capabilities. This is the perfect excuse for me to learn about Clustered SMB and share what I have learned.

TrueNAS SCALE

For the uninformed, Clustered SMB brings highly available SMB file sharing services to mission critical environments. More importantly, Clustered SMB is high availability in a scale-out clustered architecture.

My view beyond HA SMB

I am not familiar with Clustered SMB in a NAS (Network Attached Storage). The world I am more familiar with is either having CIFS/SMB file services on a dual controller storage appliance or running Windows File Sharing on a Microsoft® Clustered Service (MSCS). Typically, in these 2 types of HA SMB services, the scale-up architecture requires shared access to a consolidated storage volume. Behind the scenes, there are many mechanisms at play to ensure that one, and only one, storage controller or HA host can have write access at any one time. The most common mechanism is SCSI-3 Persistent Reservation, sometimes known as SCSI fencing, which uses the SPC-3 (SCSI Primary Commands) primitives. The whole objective is to prevent 2 nodes or hosts from writing to the shared storage volume at the same time, and to avoid other issues like split-brain.
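The fencing idea can be illustrated with a toy model. The Python sketch below is not the SCSI command set; it is a simplified, hypothetical model (class and method names are my own) of a write-exclusive style reservation in which nodes register a key, one registrant holds the reservation, and a surviving node can preempt a failed peer so that the peer's writes are rejected.

```python
from typing import Optional


class SharedVolume:
    """Toy model of a shared volume guarded by a persistent reservation."""

    def __init__(self) -> None:
        self.registered = set()             # keys of nodes allowed to talk to the volume
        self.holder: Optional[str] = None   # key that currently owns the reservation
        self.blocks = {}                    # block address -> data

    def register(self, key: str) -> None:
        self.registered.add(key)

    def reserve(self, key: str) -> bool:
        """Take the reservation if the node is registered and nobody else holds it."""
        if key not in self.registered:
            return False
        if self.holder is None or self.holder == key:
            self.holder = key
            return True
        return False

    def preempt(self, key: str, victim: str) -> bool:
        """Fence the victim: drop its registration and take over its reservation."""
        if key not in self.registered:
            return False
        self.registered.discard(victim)
        if self.holder == victim:
            self.holder = key
        return True

    def write(self, key: str, lba: int, data: bytes) -> bool:
        """Only the registered reservation holder may write."""
        if key not in self.registered or self.holder != key:
            return False
        self.blocks[lba] = data
        return True


volume = SharedVolume()
volume.register("node-a")
volume.register("node-b")
volume.reserve("node-a")

print(volume.write("node-a", 0, b"from a"))   # holder writes
print(volume.write("node-b", 0, b"from b"))   # non-holder is rejected
volume.preempt("node-b", victim="node-a")     # node-b fences node-a after a failure
print(volume.write("node-a", 1, b"stale"))    # fenced node can no longer write
print(volume.write("node-b", 1, b"from b"))   # new holder writes
```

The point of the model is the last two writes: once the old holder is fenced, even a node that wrongly believes it is still active cannot corrupt the shared volume, which is how the split-brain scenario is kept from damaging data.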

Continue reading →

Stating the case for a Storage Appliance approach

By cfheoh | June 13, 2022 - 8:00 am |June 12, 2022 Analytics, Appliance, Artificial Intelligence, Backup, CIFS, Clusters, Data Management, Data Protection, Deduplication, Disks, Fibre Channel, FreeNAS, Hyperconvergence, Isilon, iXsystems, Jon Toigo, Linux, Lustre, Machine Learning, Minio, NAS, NetApp, Nexenta, NFS, Nimble Storage, NVMe, Oracle, Performance Caching, RAID, Reliability, Scale-out architecture, SMB, Snapshots, SoftIron, Software Defined Storage, Software-defined Datacenter, Storage Area Network, TrueNAS, Virtualization

1 Comment

I was in Indonesia last week to meet with iXsystems™‘ partner PT Maha Data Solusi. I had the wonderful opportunity to meet with many people there and one interesting and often-replayed question arose. Why aren’t iX doing software-defined-storage (SDS)? It was a very obvious and deliberate question.

After all, iX is already providing the free use of the open source TrueNAS® CORE software that runs on many x86 systems as an SDS solution and yet commercially, iX sell the TrueNAS® storage appliances.

This argument between a storage appliance model and a storage software only model has been debated for more than a decade, and it does come into my conversations on and off. I finally want to address this here, with my own views and opinions. And I want to state that I am open to both models because, as a storage consultant, I know both have their pros and cons, advantages and disadvantages. Up front, I gravitate to the storage appliance model, and here’s why.

My story of the storage appliance begins …

Back in the 90s, most of my work was on Fibre Channel and NFS. iSCSI did not exist yet (iSCSI was ratified in 2003). It was almost exclusively on the Sun Microsystems® enterprise storage, with Sun’s software resell of the Veritas® software suite that included the Veritas® Volume Manager (VxVM), Veritas® Filesystem (VxFS), Veritas® Replication (VxVR) and Veritas® Cluster Server (VCS). I didn’t do much Veritas® NetBackup (NBU), although I was trained at Veritas® in Boston in July 1997 (I remember that 2-week trip fondly). It was just over 2 months after Veritas® acquired OpenVision. Backup Plus was the NetBackup.

Between 1998 and 1999, I spent a lot of time working on Sun NFS servers. The prevalent networking speed at that time was 100Mbits/sec. And I remember having this argument with a Sun partner engineer by the name of Wong Teck Seng. Teck Seng was an inquisitive fella (still is) and he was raving about this purpose-built NFS server he knew about, and he shared his experience with me. I dismissed him, brushing aside his always-on tech orgasm, and did not see anything great about a NAS storage appliance. Auspex™ was big then, and I knew of them.

I joined NetApp® as Malaysia’s employee #2. It was an odd few months working with a storage appliance but, after a couple of months, I started to understand and appreciate the philosophy. The Storage Appliance Model made sense to me, and still does to this day.

Continue reading →

As Disk Drive capacity gets larger (and larger), the resilient Filesystem matters

By cfheoh | May 30, 2022 - 8:00 am |May 30, 2022 Appliance, Backup, Ceph, CIFS, Clusters, Data Management, Data Protection, Disks, Filesystems, Flash, FreeNAS, Gluster, Hadoop, High Performance Computing, iXsystems, Joyent, Linux, Lustre, NetApp, Nexenta, NFS, OpenZFS, Oracle, RAID, Reliability, SMB, Snapshots, TrueNAS, Virtualization

2 Comments

I just got home from the wonderful iXsystems™ Sales Summit in Knoxville, Tennessee. The key highlight was to christen the opening of the iXsystems™ Maryville facility, the key operations center that will house iX engineering, support and part of marketing as well. News of this can be found here.

iX datacenter in the new Maryville facility

Western Digital® has always been a big advocate of iX, and at the Summit, they shared their roadmaps for hard disk drives (HDD), solid state drives (SSD) and other storage platforms. I felt like a kid in a candy store because I love all this excitement in the disk drive industry. Who says HDDs are going to be usurped by SSDs?

Several other disk drive manufacturers, including Western Digital®, have announced larger capacity drives. Here is some news from each vendor in recent months

Other than the AFR (annualized failure rates) numbers published by Backblaze every quarter, the Capacity factor has always been a measurement of high interest in the storage industry.

Continue reading →

Unstructured Data Observability with Datadobi StorageMAP

By cfheoh | May 23, 2022 - 8:00 am |May 22, 2022 Artificial Intelligence, Backup, CIFS, Cloud, Data Archiving, Data Management, Data Privacy, Data Protection, Data Security, Datadobi, Digital Transformation, eDiscovery, Filesystems, FreeNAS, ILM, iXsystems, Minio, NAS, NFS, Object Storage, SMB, Storage Tiering, TrueNAS


Let’s face it. Data is bursting through its storage seams. And every organization is now storing so much data that they don’t know what they have.

By 2025, IDC predicts that 80% of the world’s data will be unstructured. IDC‘s report Global Datasphere Forecast 2021-2025 sees global data creation and replication capacity expanding to 181 zettabytes, an unfathomable figure. Organizations are inundated. They struggle with data growth, with little understanding of what data they have, where the data is residing, what to do with the data, and how to manage the voluminous data deluge.

The simple knee-jerk action is to store it in cloud object storage where the price of storage is $0.0000xxx/GB/month. But many IT departments in these organizations often overlook the fact that the data they have parked in the cloud requires movement between the cloud and on-premises. I have been involved in numerous discussions where the customers realized that they had moved the data in the cloud too frequently. Often it was an error of judgement or short term blindness (blinded by the cheap storage costs no doubt), further exacerbated by the pandemic. These oversights have resulted in expensive and painful monthly API call and egress fees. Welcome to reality. Suddenly the cheap cloud storage doesn’t sound so cheap after all.

The same can be said about storing non-active unstructured data on primary storage. Many organizations have not been disciplined enough to practise good data management. The primary Tier 1 storage becomes bloated over time, grinding sluggishly as the data capacity grows. I/O processing becomes painfully slow and backup takes longer and longer. Sounds familiar?
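The effect is easy to see with a small cost model. The Python sketch below is only an illustration: the function name, the three cost components and, above all, the unit prices in the usage example are placeholders chosen for round arithmetic, not any provider's actual price list.

```python
def monthly_cloud_cost(
    stored_gb: float,
    egress_gb: float,
    api_requests: int,
    *,
    storage_price_per_gb: float,
    egress_price_per_gb: float,
    price_per_1000_requests: float,
) -> dict:
    """Break a monthly cloud object storage bill into its three components."""
    storage = stored_gb * storage_price_per_gb
    egress = egress_gb * egress_price_per_gb
    requests = api_requests / 1000 * price_per_1000_requests
    return {
        "storage": storage,
        "egress": egress,
        "requests": requests,
        "total": storage + egress + requests,
    }


# Placeholder workload and prices (assumptions, for illustration only):
# 100TB parked in the cloud, 20TB pulled back on-premises in a month.
bill = monthly_cloud_cost(
    stored_gb=100_000,
    egress_gb=20_000,
    api_requests=5_000_000,
    storage_price_per_gb=0.01,
    egress_price_per_gb=0.1,
    price_per_1000_requests=0.005,
)

for item, amount in bill.items():
    print(f"{item:>8}: ${amount:,.2f}")
```

With these made-up numbers, moving only a fifth of the stored data in a month costs more than storing all of it, which is the trap described above. Plugging in real prices and real movement patterns is what turns this into a useful planning exercise.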

The A in ABC

I brought up the ABC mantra a few blogs ago. A is for Archive First. It is part of my data protection consulting practice conversation repertoire, and I use it often to advise IT organizations to be smart with their data management. Before archiving (some folks like to call it tiering, but I am not going down that argument today), we must know what to archive. We cannot blindly send all sorts of junk data to the secondary or tertiary storage premises. If we do that, it is akin to digging another hole to fill up the first hole.

We must know which unstructured data to move, replicate or sync from the Tier 1 storage to a second (or third) less taxing storage premises. We must be able to see this data, observe its behaviour over time, and decide the best data management practice to apply to this data. Take note that I said best data management practice and not best storage location in the previous sentence. There has to be a clear distinction: a data management strategy is more prudent than a “best” storage premises. The reason is that many organizations ignorantly think the best storage location (the thought of the “cheapest” always seems to creep up) is a good strategy, while ignoring the fact that data is like water. It moves from premises to premises, from on-prem to cloud, from cloud to other clouds. Data mobility is a variable in data management.
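As a very small taste of what "seeing" the data can mean, the Python sketch below walks a directory tree and totals file sizes by age. It is a generic illustration and not how any particular product works: the function name and the 30-day and 365-day buckets are arbitrary assumptions, and it uses the last-modified time because last-access times are often not updated on real filesystems.

```python
import os
import time


def bytes_by_age(root, now=None, bucket_days=(30, 365)):
    """Total file sizes under root, grouped by days since last modification."""
    now = time.time() if now is None else now
    labels = [f"<= {days} days" for days in bucket_days]
    labels.append(f"> {bucket_days[-1]} days")
    totals = {label: 0 for label in labels}

    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.stat(path, follow_symlinks=False)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            age_days = (now - info.st_mtime) / 86400
            for limit, label in zip(bucket_days, labels):
                if age_days <= limit:
                    totals[label] += info.st_size
                    break
            else:
                totals[labels[-1]] += info.st_size  # older than every bucket
    return totals


if __name__ == "__main__":
    for label, size in bytes_by_age(".").items():
        print(f"{label:>12}: {size / 1e9:.2f} GB")
```

A report like this, run over time, is the starting point for deciding which data is a candidate for archiving and which should stay on Tier 1.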

Continue reading →

I built a 6-node Gluster cluster with TrueNAS SCALE

By cfheoh | April 11, 2022 - 8:00 am |April 2, 2022 Acquisition, Appliance, CIFS, Clusters, Composable Infrastructure, Filesystems, Gluster, High Performance Computing, Hyperconvergence, Infiniband, iXsystems, Linux, NAS, NFS, OpenZFS, Performance Benchmark, RDMA, Redhat, Scale-out architecture, SMB, Software Defined Storage, TrueNAS, Unified Storage, Virtualbox

10 Comments

I haven’t had hands-on time with Gluster for over a decade. My last blog about Gluster was in 2011, right after I did a proof-of-concept for the now-defunct Jaring, Malaysia’s first ISP (Internet Service Provider). But I followed Gluster’s development on and off, until I found out that Gluster was a feature in the then-upcoming TrueNAS® SCALE. That was almost 2 years ago, just before I accepted the offer to join iXsystems™, my present employer.

The eagerness to test drive Gluster (again) on TrueNAS® SCALE has always been there but I waited for SCALE to become GA. GA finally came on February 22, 2022. My plans for the test rig were laid out, and in the past few weeks, I have been diligently re-learning and putting together the scope to build a 6-node Gluster clustered storage with TrueNAS® SCALE VMs on Virtualbox®.

Gluster on OpenZFS with TrueNAS SCALE

Before we continue, I must warn that this is not pretty. I have limited computing resources in my homelab, but Gluster worked beautifully once I ironed out the inefficiencies. Secondly, this is not a performance test either, for obvious reasons. So, these are the annals, along with the trials and tribulations, of my 6-node Gluster cluster test rig on TrueNAS® SCALE.

Continue reading →

Nakivo Backup Replication architecture and installation on TrueNAS – Part 1

By cfheoh | March 28, 2022 - 8:00 am |March 24, 2022 API, Appliance, Arcserve, Backup, Carbonite, Cloud, Commvault, compression, Data Archiving, Data Availability, Data Management, Data Protection, Data Security, Deduplication, deduplication, Disaster Recovery, Filesystems, FreeNAS, Infrascale, iSCSI, iXsystems, Linux, Microsoft, Microsoft Azure, Nakivo, NAS, NFS, Nutanix, Oracle, Quest Software, Security, Snapshots, Tape storage, TrueNAS, Veritas, virtual tape library, Virtualization, VMware


Backup and Replication software has received strong mandates in organizations with enterprise mindsets and vision. But lower down the rung, small and medium organizations are less invested in backup and replication software. These organizations know full well that they must back up, replicate and protect their servers, physical and virtual, and also new workloads in the clouds, given that the threat of security breaches and ransomware is looming larger and larger all the time. But many are often put off by the cost of implementing and deploying Backup and Replication software.

So I explored one of the lesser known backup and recovery software products, Nakivo® Backup and Replication (NBR), and took the opportunity to build a backup and replication appliance in my homelab with TrueNAS®. My objective was to create a cost effective option for small and medium organizations to enjoy enterprise-grade protection and recovery without the hefty price tag.

This blog, Part 1, covers the architecture overview of Nakivo® and the installation of the NBR software in TrueNAS® to bake in and create the concept of a backup and replication appliance. Part 2, in a future blog post, will cover the administrative and operations usage of NBR.

Continue reading →
