Accelerated Data Paths of High Performance Storage is the Cornerstone of building AI

It has been 2 months into my new role at DDN as a Solutions Architect. With many revolving doors around me, I have been trying to find the essence, the critical cog of the data infrastructure that supports the accelerated computing of the Nvidia GPU clusters. The more I read and engage, a pattern emerged. I found that cog in the supercharged data paths between the storage infrastructure systems and the GPU clusters. I will share more.

To set the context, let me start with a wonderful article I read in CIO.com back in July 2024. It was titled “Storage: The unsung hero of AI deployments“. It was music to my ears because as a long-time practitioner in the storage technology industry, it is time the storage industry gets its credit it deserves.

What is the data path?

To put it simply, a Data Path, from a storage context, is the communication route taken by the data bits between the compute system’s processing and program memory and the storage subsystem. The links and the established sessions can be within the system components such as the PCIe bus or external to the system through the shared networking infrastructure.

High speed accelerated data paths

In the world of accelerated computing such as AI and HPC, there are additional, more advanced technologies to create even faster delivery of the data bits. This is the accelerated data paths between the compute nodes and the storage subsystems. Following on, I share a few of these technologies that are lesser used in the enterprise storage segment.

Continue reading

Enhancing NAS client resiliency and performance with SMB Multichannel and NFS nconnect

NAS (network attached storage) is obviously the file-level workhorse for shared resources in the network of any organization. SMB (server message block) for Windows environments and NFS (network file system) for Linux platforms are the 2 most prominent protocols that rule the NAS world. Of course we have SMB implementations in the form of Samba and others in non-Windows, Linux and NFS implementations in Windows as well.

As the versions of both network file sharing protocols iterated, present versions of SMB v3.x and NFS v4.x (NFS v3 on the supported Linux kernel version) on the client-side have evolved well. Both now have enhanced resiliency and performance improvement features. And there is an underlying similarity of both implementations. This blog looks at the client-side architectures of both.

One TCP connection

NAS is a client-server architecture. Over the network, NAS clients (SMB or NFS) access their corresponding NAS server(s) – SMB or NFS server(s) – through the TCP/IP network.

NAS client-server architecture (Credit: https://hypertecsp.com/en-CA/knowledge-base/nas-vs-san/)

One very important key starting point to note is the use of one TCP connection per NAS client to the NAS server relationship. For both SMB and NFS, there is just one TCP link between client and the server even if there are several SMB mapped shares or NFS mount points respectively on the clients.

For a long time, this one TCP connection is sufficient for the NAS traffic. But as the network file accesses grow, this connection becomes both a single point of failure as well as a performance bottleneck.

Continue reading