A conceptual distributed enterprise HCI with open source software

Cloud computing has changed everything, at least at the infrastructure level. Kubernetes is changing everything as well, at the application level. Enterprises are attracted by tenets of cloud computing and thus, cloud adoption has escalated. But it does not have to be a zero-sum game. Hybrid computing can give enterprises a balanced choice, and they can take advantage of the best of both worlds.

Open Source has changed everything too because organizations now has a choice to balance their costs and expenditures with top enterprise-grade software. The challenge is what can organizations do to put these pieces together using open source software? Integration of open source infrastructure software and applications can be complex and costly.

The next version of HCI

Hyperconverged Infrastructure (HCI) also changed the game. Integration of compute, network and storage became easier, more seamless and less costly when HCI entered the market. Wrapped with a single control plane, the HCI management component can orchestrate VM (virtual machine) resources without much friction. That was HCI 1.0.

But HCI 1.0 was challenged, because several key components of its architecture were based on DAS (direct attached) storage. Scaling storage from a capacity point of view was limited by storage components attached to the HCI architecture. Some storage vendors decided to be creative and created dHCI (disaggregated HCI). If you break down the components one by one, in my opinion, dHCI is just a SAN (storage area network) to HCI. Maybe this should be HCI 1.5.

A new version of an HCI architecture is swimming in as Angelfish

Kubernetes came into the HCI picture in recent years. Without the weights and dependencies of VMs and DAS at the HCI server layer, lightweight containers orchestrated, mostly by, Kubernetes, made distribution of compute easier. From on-premises to cloud and in between, compute resources can easily spun up or down anywhere.

Continue reading

Right time for Andrew. The Filesystem that is.

I couldn’t hold my excitement when I discovered Auristor® early last week. I stumbled upon this Computerweekly article “Want to side step Public Cloud? Auristor® offers global file storage.” Given the many news not exactly praising the public cloud storage vendors nowadays, the article’s title caught my attention. Immediately Andrew File System (AFS) was there. I was perplexed at first because I have never seen or heard a commercial version of AFS before. This news gave me goosebumps.

For the curious, I am sure many will ask who is this Andrew anyway? What is my relationship with this Andrew?

One time with Andrew

A bit of my history. I recalled quite vividly helping Intel in Penang, Malaysia to implement their globally distributed file caching mechanism with the NetApp® filer’s NFS. It was probably 2001 and I believed Intel wanted to share their engineering computing (EC) files between their US facilities and Intel Penang Design Center (PDC). As I worked along with the Intel folks, I found out that this distributed file caching technology was called Andrew File System (AFS).

Although I couldn’t really recalled how the project went, I remembered it being a bed of bugs at that time. But being the storage geek that I am, I obviously took some time to get to know Andrew the File System. 20 years have gone by, and I never really thought of AFS coming out as a commercial solution or even knew of it as one, until Auristor®,

Auristor Logo

Continue reading

Open Source Storage Technology Crafters

The conversation often starts with a challenge. “What’s so great about open source storage technology?

For the casual end users of storage systems, regardless of SAN (definitely not Fibre Channel) or NAS on-premises, or getting “files” from the personal cloud storage like Dropbox, OneDrive et al., there is a strong presumption that open source storage technology is cheap and flaky. This is not helped with the diet of consumer brands of NAS in the market, where the price is cheap, but the storage offering with capabilities, reliability and performance are found to be wanting. Thus this notion floats its way to the business and enterprise users, and often ended up with a negative perception of open source storage technology.

Highway Signpost with Open Source wording

Storage Assemblers

Anybody can “build” a storage system with open source storage software. Put the software together with any commodity x86 server, and it can function with the basic storage services. Most open source storage software can do the job pretty well. However, once the completed storage technology is put together, can it do the job well enough to serve a business critical end user? I have plenty of sob stories from end users I have spoken to in these many years in the industry related to so-called “enterprise” storage vendors. I wrote a few blogs in the past that related to these sad situations:

We have such storage offerings rigged with cybersecurity risks and holes too. In a recent Unit 42 report, 250,000 NAS devices are vulnerable and exposed to the public Internet. The brands in question are mentioned in the report.

I would categorize these as storage assemblers.

Continue reading

Windows SMB synchronous writes with OpenZFS

Sometimes I get really pissed off with myself because I have taken a bigoted view, and ended up with eggs on my face. The past week was like that, and the problem was gnawing me on the inside all week, because I was determined to balance my equilibrium by finding the answer.

Early in the week, I was having a conversation with a potential customer. It evolved around the missing 10 seconds or so of the video footage between the users of a popular video editing software. The company had 70% Windows users, and 30% users on the Mac, both sides accessing the NAS device. The issue was the editors on the Windows side will store the raw and edited files to the NAS, but when the Mac users read them, they will often find 10 seconds or so of the stored video files missing.

The likeliest culprit of this problem is the way the SMB protocol write I/O behaves in Windows and in MacOS. Windows SMB, by default, writes I/O asynchronously while SMB on MacOS writes I/O synchronously.

I had a strong conviction I had the answer to this issue but this was not a TrueNAS®, It was another brand of NAS that I did not have knowledge of, and so, I left the conversation feeling quite embarrassed because I had the answer only on the TrueNAS® server side, not on the Windows client side. Bigotry blinded me. Hmmph! 

SMB (Server Message Block) client-server model

Continue reading

First looks into Interplanetary File System

The cryptocurrency craze has elevated another strong candidate in recent months. Filecoin, is leading the voice of a decentralized Internet, the next generation Web 3.0. In this blog, I am not going to write much about the Filecoin frenzy but the underlying distributed file system that powers this phenomenon – The Interplanetary File System.

[ Note: This is still a very new area for me, and the rest of the content of this blog is still nascent and developing ]

Interplanetary File System

Tremulous Client-Server web architecture

The entire Internet architecture is almost client and server. Your clients like browsers, apps, connect to Web services served from a collection of servers. As Web 3.0 approaches (some say it is already here), the client-server model is no longer perceived as the Internet architecture of choice. Billions, and billions of users, applications, devices relying solely on a centralized service would lead to many impactful consequences, and the reasons for decentralization, away from the client-server architecture models of the Internet are cogent.

Continue reading

Plotting the Crypto Coin Storage Farm

The recent craze of the Chia cryptocurrency got me excited. Mostly because it uses storage as the determinant for the Proof-of-Work consensus algorithm in a blockchain network. Yes, I am always about storage. 😉

I am not a Bitcoin miner nor am I a Chia coin farmer, and my knowledge and experience in both are very shallow. But I recently became interested in the 2 main activities of Chia – plotting and farming, because they both involved storage. I am writing this blog to find out more and document about my learning experience.

[ NB: This blog does not help you make money. It is just informational from a storage technology perspective. ]

Chia Cryptocurrency

Proof of Space and Time

Bitcoin is based on Proof-of-Work (PoW). In a nutshell, there is a complex mathematical puzzle to be solved. Bitcoin miners compete to solve this puzzle and the process uses high computational processing to solve it. Once solved, the miners are rewarded for their work.

Newer entrants like Filecoin and Chia coin (XCH) use an alternate method which is Proof-of-Space (PoS) to validate and verify the transactions. Instead of miners, Chia coin farmers have to prove to have a legitimate amount of disk and/or memory space to solve a mathematical puzzle, conceptually similar to the one in Bitcoin mining. In the beginning, this was great for folks who have unused disk space that can be “rented” out to store the crypto stuff (Note: I am not familiar with the terminology yet, and I did not want to use the word “crypto tokens” incorrectly). Storj was one of the early vendors that I remember in this space touting this method but I have not followed them for a while. Their business model might have changed.

Continue reading

Layers in Storage – For better or worse

Storage arrays and storage services are built upon by layers and layers beneath its architecture. The physical components of hard disk drives and solid states are abstracted into RAID volumes, virtualized into other storage constructs before they are exposed as shares/exports, LUNs or objects to the network.

Everyone in the storage networking industry, is cognizant of the layers and it is the foundation of knowledge and experience. The public cloud storage services side is the same, albeit more opaque. Nevertheless, both have layers.

In the early 2000s, SNIA® Technical Council outlined a blueprint of the SNIA® Shared Storage Model, a framework describing layers and properties of a storage system and its services. It was similar to the OSI 7-layer model for networking. The framework helped many industry professionals and practitioners shaped their understanding and the development of knowledge in their respective fields. The layering scheme of the SNIA® Shared Storage Model is shown below:

SNIA Shared Storage Model – The layering scheme

Storage vendors layering scheme

While SNIA® storage layers were generic and open, each storage vendor had their own proprietary implementation of storage layers. Some of these architectures are simple, but some, I find a bit too complex and convoluted.

Here is an example of the layers of the Automated Volume Management (AVM) architecture of the EMC® Celerra®.

EMC Celerra AVM Layering Scheme

I would often scratch my head about AVM. Disks were grouped into RAID groups, which are LUNs (Logical Unit Numbers). Then they were defined as Celerra® dvols (disk volumes), and stripes of the dvols were consolidated into a storage pool.

From the pool, a piece of a storage capacity construct, called a slice volume, were combined with other slice volumes into a metavolume which eventually was presented as a file system to the network and their respective NAS clients. Explaining this took an effort because I was the IP Storage product manager for EMC® between 2007 – 2009. It was a far cry from the simplicity of NetApp® ONTAP 7 architecture of RAID groups and volumes, and the WAFL® (Write Anywhere File Layout) filesystem.

Another complicated layered framework I often gripe about is Ceph. Here is a look of how the layers of CephFS is constructed.

Ceph Storage Layered Framework

I work with the OpenZFS filesystem a lot. It is something I am rather familiar with, and the layered structure of the ZFS filesystem is essentially simpler.

Storage architecture mixology

Engineers are bizarre when they get too creative. They have a can do attitude that transcends the boundaries of practicality sometimes, and boggles many minds. This is what happens when they have their own mixology ideas.

Recently I spoke to two magnanimous persons who had the idea of providing Ceph iSCSI LUNs to the ZFS filesystem in order to use the simplicity of NAS file sharing capabilities in TrueNAS® CORE. From their own words, Ceph NAS capabilities sucked. I had to draw their whole idea out in a Powerpoint and this is the architecture I got from the conversation.

There are 3 different storage subsystems here just to provide NAS. As if Ceph layers aren’t complicated enough, the iSCSI LUNs from Ceph are presented as Cinder volumes to the KVM hypervisor (or VMware® ESXi) through the Cinder driver. Cinder is the persistent storage volume subsystem of the Openstack® project. The Cinder volumes/hypervisor datastore are virtualized as vdisks to the respective VMs installed with TrueNAS® CORE and OpenZFS filesystem. From the TrueNAS® CORE, shares and exports are provisioned via the SMB and NFS protocols to Windows and Linux respectively.

It works! As I was told, it worked!

A.P.P.A.R.M.S.C. considerations

Continuing from the layered framework described above for NAS, other aspects beside the technical work have to be considered, even when it can work technically.

I often use a set of diligent data storage focal points when considering a good storage design and implementation. This is the A.P.P.A.R.M.S.C. Take for instance Protection as one of the points and snapshot is the technology to use.

Snapshots can be executed at the ZFS level on the TrueNAS® CORE subsystem. Snapshots can be trigged at the volume level in Openstack® subsystem and likewise, rbd snapshots at the Ceph subsystem. The question is, which snapshot at which storage subsystem is the most valuable to the operations and business? Do you run all 3 snapshots? How do you execute them in succession in a scheduled policy?

In terms of performance, can it truly maximize its potential? Can it churn out the best IOPS, and deliver at wire speed? What is the latency we can expect with so many layers from 3 different storage subsystems?

And supporting this said architecture would be a nightmare. Where do you even start the troubleshooting?

Those are just a few considerations and questions to think about when such a layered storage architecture along. IMHO, such a design was over-engineered. I was tempted to say “Just because you can, doesn’t mean you should

Elegance in Simplicity

Einstein (I think) quoted:

Einstein’s quote on simplicity and complexity

I am not saying that having too many layers is wrong. Having a heavily layered architecture works for many storage solutions out there, where they are often masked with a simple and intuitive UI. But in yours truly point of view, as a storage architecture enthusiast and connoisseur, there is beauty and elegance in simple designs.

The purpose here is to promote better understanding of the storage layers, and how they integrate and interact with each other to deliver the data services to the network. In the end, that is how most storage architectures are built.

 

Discovering OpenZFS Fusion Pool

Fusion Pool excites me, but unfortunately this new key feature of OpenZFS is hardly talked about. I would like to introduce the Fusion Pool feature as iXsystems™ expands the TrueNAS® Enterprise storage conversations.

I would not say that this technology is revolutionary. Other vendors already have the similar concept of Fusion Pool. The most notable (to me) is NetApp® Flash Pool, and I am sure other enterprise storage vendors have the same. But this is a big deal (for me) for an open source file system in OpenZFS.

What is Fusion Pool  (aka ZFS Allocation Classes)?

To understand Fusion Pool, we have to understand the basics of the ZFS zpool. A zpool is the aggregation (borrowing the NetApp® terminology) of vdevs (virtual devices), and vdevs are a collection of physical drives configured with the OpenZFS RAID levels (RAID-0, RAID-1, RAID-Z1, RAID-Z2, RAID-Z3 and a few nested RAID permutations). A zpool can start with one vdev, and new vdevs can be added on-the-fly, expanding the capacity of the zpool online.

There are several types of vdevs prior to Fusion Pool, and this is as of pre-TrueNAS® version 12.0. As shown below, these are the types of vdevs available to the zpool at present.

OpenZFS zpool and vdev types – Credit: Jim Salter and Arstechnica

Fusion Pool is a zpool that integrates with a new, special type of vdev, alongside other normal vdevs. This special vdev is designed to work with small data blocks between 4-16K, and is highly efficient in handling random reading and writing of these small blocks. This bodes well with the OpenZFS file system metadata blocks and other blocks of small files. And the random nature of the Read/Write I/Os works best with SSDs (can be read or write intensive SSDs).

Continue reading

Storageless shan’t be thy name

Storageless??? What kind of a tech jargon is that???

This latest jargon irked me. Storage vendor NetApp® (through its acquisition of Spot) and Hammerspace, a metadata-driven storage agnostic orchestration technology company, have begun touting the “storageless” tech jargon in hope that it will become an industry buzzword. Once again, the hype cycle jargon junkies are hard at work.

Clear, empty storage containers

Clear, nondescript storage containers

It is obvious that the storageless jargon wants to ride on the hype of serverless computing, an abstraction method of computing resources where the allocation and the consumption of resources are defined by pieces of programmatic code of the running application. The “calling” of the underlying resources are based on the application’s code, and thus, rendering the computing resources invisible, insignificant and not sexy.

My stand

Among the 3 main infrastructure technology – compute, network, storage, storage technology is a bit of a science and a bit of dark magic. It is complex and that is what makes storage technology so beautiful. The constant innovation and technology advancement continue to make storage as a data services platform relentlessly interesting.

Cloud, Kubernetes and many data-as-a-service platforms require strong persistent storage. As defined by NIST Definition of Cloud Computing, the 4 of the 5 tenets – on-demand self-service, resource pooling, rapid elasticity, measured servicedemand storage to be abstracted. Therefore, I am all for abstraction of storage resources from the data services platform.

But the storageless jargon is doing a great disservice. It is not helping. It does not lend its weight glorifying the innovations of storage. In fact, IMHO, it felt like a weighted anchor sinking storage into the deepest depth, invisible, insignificant and not sexy. I am here dutifully to promote and evangelize storage innovations, and I am duly unimpressed with such a jargon.

Continue reading

OpenZFS 2.0 exciting new future

The OpenZFS (virtual) Developer Summit ended over a weekend ago. I stayed up a bit (not much) to listen to some of the talks because it started midnight my time, and ran till 5am on the first day, and 2am on the second day. Like a giddy schoolboy, I was excited, not because I am working for iXsystems™ now, but I have been a fan and a follower of the ZFS file system for a long time.

History wise, ZFS was conceived at Sun Microsystems in 2005. I started working on ZFS reselling Nexenta in 2009 (my first venture into business with my company nextIQ) after I was professionally released by EMC early that year. I bought a Sun X4150 from one of Sun’s distributors, and started creating a lab server. I didn’t like the workings of NexentaStor (and NexentaCore) very much, and it was priced at 8TB per increment. Later, I started my second company with a partner and it was him who showed me the elegance and beauty of ZFS through the command lines. The creed of ZFS as a volume and a file system at the same time with the CLI had an effect on me. I was in love.

OpenZFS Developer Summit 2020 Logo

OpenZFS Developer Summit 2020 Logo

Exciting developments

Among the many talks shared in the OpenZFS Developer Summit 2020 , there were a few ideas and developments which were exciting to me. Here are 3 which I liked and I provide some commentary about them.

  • Block Reference Table
  • dRAID (declustered RAID)
  • Persistent L2ARC

Continue reading