Scale-out architecture – Page 2

Stating the case for a Storage Appliance approach

By cfheoh | June 13, 2022 - 8:00 am |June 12, 2022 Analytics, Appliance, Artificial Intelligence, Backup, CIFS, Clusters, Data Management, Data Protection, Deduplication, Disks, Fibre Channel, FreeNAS, Hyperconvergence, Isilon, iXsystems, Jon Toigo, Linux, Lustre, Machine Learning, Minio, NAS, NetApp, Nexenta, NFS, Nimble Storage, NVMe, Oracle, Performance Caching, RAID, Reliability, Scale-out architecture, SMB, Snapshots, SoftIron, Software Defined Storage, Software-defined Datacenter, Storage Area Network, TrueNAS, Virtualization

1 Comment

I was in Indonesia last week to meet with iXsystems™‘ partner PT Maha Data Solusi. I had the wonderful opportunity to meet with many people there and one interesting and often-replayed question arose. Why aren’t iX doing software-defined-storage (SDS)? It was a very obvious and deliberate question.

After all, iX is already providing the free use of the open source TrueNAS® CORE software that runs on many x86 systems as an SDS solution and yet commercially, iX sell the TrueNAS® storage appliances.

This argument between a storage appliance model and a storage storage only model has been debated for more than a decade, and it does come into my conversations on and off. I finally want to address this here, with my own views and opinions. And I want to inform that I am open to both models, because as a storage consultant, both have their pros and cons, advantages and disadvantages. Up front I gravitate to the storage appliance model, and here’s why.

My story of the storage appliance begins …

Back in the 90s, most of my work was on Fibre Channel and NFS. iSCSI has not existed yet (iSCSI was ratified in 2003). It was almost exclusively on the Sun Microsystems® enterprise storage with Sun’s software resell of the Veritas® software suite that included the Sun Volume Manager (VxVM), Veritas® Filesystem (VxFS), Veritas® Replication (VxVR) and Veritas® Cluster Server (VCS). I didn’t do much Veritas® NetBackup (NBU) although I was trained at Veritas® in Boston in July 1997 (I remembered that 2 weeks’ trip fondly). It was just over 2 months after Veritas® acquired OpenVision. Backup Plus was the NetBackup.

Between 1998-1999, I spent a lot of time working Sun NFS servers. The prevalent networking speed at that time was 100Mbits/sec. And I remember having this argument with a Sun partner engineer by the name of Wong Teck Seng. Teck Seng was an inquisitive fella (still is) and he was raving about this purpose-built NFS server he knew about and he shared his experience with me. I detracted him, brushing aside his always-on tech orgasm, and did not find great things about a NAS storage appliance. Auspex™ was big then, and I knew of them.

I joined NetApp® as Malaysia’s employee #2. It was an odd few months working with a storage appliance but after a couple of months, I started to understand and appreciate the philosophy. The storage Appliance Model made sense to me, even through these days.

Continue reading →

I built a 6-node Gluster cluster with TrueNAS SCALE

By cfheoh | April 11, 2022 - 8:00 am |April 2, 2022 Acquisition, Appliance, CIFS, Clusters, Composable Infrastructure, Filesystems, Gluster, High Performance Computing, Hyperconvergence, Infiniband, iXsystems, Linux, NAS, NFS, OpenZFS, Performance Benchmark, RDMA, Redhat, Scale-out architecture, SMB, Software Defined Storage, TrueNAS, Unified Storage, Virtualbox

10 Comments

I haven’t had hands-on with Gluster for over a decade. My last blog about Gluster was in 2011, right after I did a proof-of-concept for the now defunct, Jaring, Malaysia’s first ISP (Internet Service Provider). But I followed Gluster’s development on and off, until I found out that Gluster was a feature in then upcoming TrueNAS® SCALE. That was almost 2 years ago, just before I accepted to offer to join iXsystems™, my present employer.

The eagerness to test drive Gluster (again) on TrueNAS® SCALE has always been there but I waited for SCALE to become GA. GA finally came on February 22, 2022. My plans for the test rig was laid out, and in the past few weeks, I have been diligently re-learning and putting up the scope to built a 6-node Gluster clustered storage with TrueNAS® SCALE VMs on Virtualbox®.

Gluster on OpenZFS with TrueNAS SCALE

Before we continue, I must warn that this is not pretty. I have limited computing resources in my homelab, but Gluster worked beautifully once I ironed out the inefficiencies. Secondly, this is not a performance test as well, for obvious reasons. So, this is the annals along with the trials and tribulations of my 6-node Gluster cluster test rig on TrueNAS® SCALE.

Continue reading →

Exploring the venerable NFS Ganesha

By cfheoh | February 14, 2022 - 8:00 am |February 13, 2022 API, Clusters, Filesystems, FreeNAS, Gluster, High Performance Computing, Hyperconvergence, iXsystems, Linux, Lustre, NAS, NFS, Nutanix, OpenZFS, Oracle, Panasas, Scale-out architecture, Software Defined Storage, TrueNAS, Unified Storage

2 Comments

As TrueNAS® SCALE approaches its General Availability date in less than 10 days time, one of the technology pieces I am extremely excited about in TrueNAS® SCALE is the NFS Ganesha server. It is still early days to see the full prowess of NFS Ganesha in TrueNAS® SCALE, but the potential of Ganesha’s capabilities in iXsystems™ new scale-out storage technology is very, very promising.

NFS Ganesha

I love Network File System (NFS). It was one of the main reasons I was so attracted to Sun Microsystems® SunOS in the first place. 6 months before I graduated, I took a Unix systems programming course in C in the university. The labs were on Sun 3/60 workstations. Coming from a background of a VAX/VMS system administrator in the school’s lab, Unix became a revelation for me. It completely (and blissfully) opened my eyes to open technology, and NFS was the main catalyst. Till this day, my devotion to Unix remained sacrosanct because of the NFS spark aeons ago.

I don’t know NFS Ganesha. I knew of its existence for almost a decade, but I have never used it. Most of the NFS daemons/servers I worked with were kernel NFS, and these included NFS services in Sun SunOS/Solaris, several Linux flavours – Red Hat®, SuSE®, Ubuntu, BSD variants in FreeBSD and MacOS, the older Unices of the 90s – HP-UX, Ultrix, AIX and Irix along with SCO Unix and Microsoft® Xenix, NetApp® ONTAP™, EMC® Isilon (very briefly), Hitachi® HNAS (née BlueArc) and of course, in these past 5-6 years FreeNAS®/TrueNAS™.

So, as TrueNAS® SCALE beckons, I took to this weekend to learn a bit about NFS Ganesha. Here are what I have learned.

Continue reading →

Celebrating MinIO

By cfheoh | January 31, 2022 - 8:00 am |January 31, 2022 Algorithm, Amazon Web Services, Analytics, API, Artificial Intelligence, Backup, Big Data, Blitzscaling, Cloud, Cloudian, Clusters, Containers, Data Archiving, Deep Learning, Gartner, Gluster, Hadoop, Hadoop Clusters, HDS, High Performance Computing, Hitachi Vantara, IBM, InfluxDB, Kubernetes, Machine Learning, Minio, NetApp, Object Storage, OpenIO, Openstack, Scale-out architecture, Software Defined Storage

2 Comments

“Essentially MinIO is a web server …“

I vaguely recalled Anand Babu Periasamy (AB as he is known), the CEO of MinIO saying that when I first met him in 2017. I was fresh “playing around” with MinIO and instantly I fell in love with software technology. Wait a minute. Object storage wasn’t supposed to be so easy. It was not supposed to be that simple to set up and use, but MinIO burst into my storage universe like the birth of the Infinity Stones. There was a eureka moment. And I was attending one of the Storage Field Days in the US shortly after my MinIO discovery in late 2017. What an opportunity!

I could not recall how I made the appointment to meeting MinIO, but I recalled myself taking an Uber to their cosy office on University Avenue in Palo Alto to meet. Through Andy Watson (one of the CTOs then), I was introduced to AB, Garima Kapoor, MinIO’s COO and his wife, Frank Wessels, Zamin (one of the business people who is no longer there) and Ugur Tigli (East Coast CTO) who was on the Polycom. I was awe struck.

Last week, MinIO scored a major Series B round funding of USD103 million. It was delayed by the pandemic because I recalled Garima telling me that the funding was happening in 2020. But I think the delay made it better, because the world now is even more ready for MinIO than ever before.

Continue reading →

A conceptual distributed enterprise HCI with open source software

By cfheoh | January 10, 2022 - 8:00 am |January 9, 2022 Amazon Web Services, API, Appliance, Business Continuity, Cloud, Containers, Data Fabric, DellEMC, Digital Transformation, Edge Computing, Gluster, Google Anthos, Hyperconvergence, iXsystems, Kubernetes, Linux, Minio, Object Storage, OpenZFS, Redhat, Scale-out architecture, Software Defined Storage, Software-defined Datacenter, SuSE, TrueNAS, Unified Storage, Virtualization, VMware

The next version of HCI

Hyperconverged Infrastructure (HCI) also changed the game. Integration of compute, network and storage became easier, more seamless and less costly when HCI entered the market. Wrapped with a single control plane, the HCI management component can orchestrate VM (virtual machine) resources without much friction. That was HCI 1.0.

But HCI 1.0 was challenged, because several key components of its architecture were based on DAS (direct attached) storage. Scaling storage from a capacity point of view was limited by storage components attached to the HCI architecture. Some storage vendors decided to be creative and created dHCI (disaggregated HCI). If you break down the components one by one, in my opinion, dHCI is just a SAN (storage area network) to HCI. Maybe this should be HCI 1.5.

A new version of an HCI architecture is swimming in as Angelfish

Kubernetes came into the HCI picture in recent years. Without the weights and dependencies of VMs and DAS at the HCI server layer, lightweight containers orchestrated, mostly by, Kubernetes, made distribution of compute easier. From on-premises to cloud and in between, compute resources can easily spun up or down anywhere.

[ 2021 ] Storage Elephant Compute Birds

Continue reading →

Rethinking data processing frameworks systems in real time

By cfheoh | December 27, 2021 - 8:00 am |December 26, 2021 Algorithm, Amazon Web Services, Analytics, API, Artificial Intelligence, Confluent, Containers, Data, Data Management, Data Privacy, Data Protection, Data Security, Digital Transformation, Google, Hadoop, Hadoop Clusters, InfluxDB, Machine Learning, MapReduce, Microsoft Azure, Pravega, Scale-out architecture

How well do you know your data and the storage platform that processes the data

By cfheoh | December 20, 2021 - 7:30 am |December 20, 2021 100Gigabit Ethernet, 10Gigabit Ethernet, Algorithm, Analytics, Appliance, Backup, Big Data, Business Continuity, Cloud, Clusters, Composable Infrastructure, compression, Confluent, Data Archiving, Data Availability, Data Fabric, Data Management, Data Privacy, Data Protection, Data Security, Deduplication, Digital Transformation, Disaster Recovery, Edge Computing, Filesystems, Hyperconvergence, ILM, Industry 4.0, InfluxDB, iRODS, Machine Learning, NAS, NFS, NVMe, Object Storage, Performance Caching, Pravega, Reliability, SATA, Scale-out architecture, Security, Software Defined Storage, Storage Area Network, Storage Optimization, Storage Tiering, Unified Storage, VDI, Virtualization

I/O Characteristics

Applications and workloads (A&W) read and write from the data storage services platforms. These could be local DAS (direct access storage), network storage arrays in SAN and NAS, and now objects, or from cloud storage services. Regardless of structured or unstructured data, different A&Ws have different behavioural I/O patterns in accessing data from storage. Therefore storage has to be configured at best to match these patterns, so that it can perform optimally for these A&Ws. Without going into deep details, here are a few to think about:

Random and Sequential patterns
Block sizes of these A&Ws ranging from typically 4K to 1024K.
Causal effects of synchronous and asynchronous I/Os to and from the storage

Continue reading →

The future of Fibre Channel in the Cloud Era

By cfheoh | September 13, 2021 - 7:30 am |September 13, 2021 100Gigabit Ethernet, Algorithm, Analytics, API, Appliance, Backup, Big Data, Broadcom, Brocade, Business Continuity, Cloud, Containers, Data, Data Availability, Data Fabric, Data Protection, Data Security, Deep Learning, Digital Transformation, Disaster Recovery, Disks, Edge Computing, FCoE, Fibre Channel, HDS, High Performance Computing, Hitachi Vantara, Hyperconvergence, iSCSI, Kubernetes, Machine Learning, NetApp, NVMe, Performance Benchmark, QLogic, RAID, Reliability, SATA, Scale-out architecture, SNIA, Solid State Devices, Storage Area Network, Storage Optimization, Storage Tiering

3 Comments

The world has pretty much settled that hybrid cloud is the way to go for IT infrastructure services today. Straddled between the enterprise data center and the infrastructure-as-a-service in public cloud offerings, hybrid clouds define the storage ecosystems and architecture of choice.

A recent Blocks & Files article, “Broadcom server-storage connectivity sales down but recovery coming” caught my attention. One segment mentioned that the server-storage connectivity sales was down 9% leading me to think “Is this a blip or is it a signal that Fibre Channel, the venerable SAN (storage area network) protocol is on the wane?”

Fibre Channel Sign

Thus, I am pondering the position of Fibre Channel SANs in the cloud era. Where does it stand now and in the near future? Continue reading →

Plotting the Crypto Coin Storage Farm

By cfheoh | May 10, 2021 - 9:00 am |May 10, 2021 Algorithm, Appliance, Blitzscaling, Chia, Cloud, Deduplication, Filesystems, FreeNAS, High Performance Computing, Hyperconvergence, iXsystems, Linux, NAS, Performance Caching, RAID, Reliability, Scale-out architecture, Solid State Devices, TrueNAS

1 Comment

The recent craze of the Chia cryptocurrency got me excited. Mostly because it uses storage as the determinant for the Proof-of-Work consensus algorithm in a blockchain network. Yes, I am always about storage. 😉

I am not a Bitcoin miner nor am I a Chia coin farmer, and my knowledge and experience in both are very shallow. But I recently became interested in the 2 main activities of Chia – plotting and farming, because they both involved storage. I am writing this blog to find out more and document about my learning experience.

[ NB: This blog does not help you make money. It is just informational from a storage technology perspective. ]

Chia Cryptocurrency

Proof of Space and Time

Bitcoin is based on Proof-of-Work (PoW). In a nutshell, there is a complex mathematical puzzle to be solved. Bitcoin miners compete to solve this puzzle and the process uses high computational processing to solve it. Once solved, the miners are rewarded for their work.

Newer entrants like Filecoin and Chia coin (XCH) use an alternate method which is Proof-of-Space (PoS) to validate and verify the transactions. Instead of miners, Chia coin farmers have to prove to have a legitimate amount of disk and/or memory space to solve a mathematical puzzle, conceptually similar to the one in Bitcoin mining. In the beginning, this was great for folks who have unused disk space that can be “rented” out to store the crypto stuff (Note: I am not familiar with the terminology yet, and I did not want to use the word “crypto tokens” incorrectly). Storj was one of the early vendors that I remember in this space touting this method but I have not followed them for a while. Their business model might have changed.

Continue reading →

Kubernetes Persistent Storage Managed Well

By cfheoh | October 5, 2020 - 8:25 am |October 5, 2020 100Gigabit Ethernet, Business Continuity, Cloud, Clusters, Containers, Flash, iSCSI, Kubernetes, Linux, Performance Caching, Scale-out architecture, Software Defined Storage, Storpool, Virtualization

2 Comments

[ Disclosure: This is a StorPool Storage sponsored blog ]

StorPool Storage – Distributed Storage

There is a rapid adoption of Kubernetes in the enterprise and in the cloud. The push for digital transformation to modernize businesses for a cloud native world in the next decade has lifted both containerized applications and the Kubernetes container orchestration platform to an unprecedented level. The application landscape, especially the enterprise, is looking at Kubernetes to address these key areas:

Scale
High performance
Availability and Resiliency
Security and Compliance
Controllable Costs
Simplified

The Persistent Storage Question

Enterprise applications such as relational databases, email servers, and even the cloud native ones like NoSQL, analytics engines, demand a single data source of truth. Fundamentals properties such as ACID (atomicity, consistency, isolation, durability) and BASE (Basic Availability, Soft State, Eventual Consistency) have to have persistent storage as the foundational repository for the data. And thus, persistent storage have rallied under Container Storage Interface (CSI), and fast becoming a de facto standard for Kubernetes. At last count, there are more than 80 CSI drivers from 60+ storage and cloud vendors, each providing block-level storage to Kubernetes pods.

However, at this juncture, Kubernetes is still very engineering-centric. Persistent storage is equally as challenging, despite all the new developments and hype around it.

Continue reading →

Category Archives: Scale-out architecture

Stating the case for a Storage Appliance approach

My story of the storage appliance begins …

I built a 6-node Gluster cluster with TrueNAS SCALE

Exploring the venerable NFS Ganesha

Celebrating MinIO