A conceptual distributed enterprise HCI with open source software

Cloud computing has changed everything, at least at the infrastructure level. Kubernetes is changing everything as well, at the application level. Enterprises are attracted by tenets of cloud computing and thus, cloud adoption has escalated. But it does not have to be a zero-sum game. Hybrid computing can give enterprises a balanced choice, and they can take advantage of the best of both worlds.

Open Source has changed everything too, because organizations now have a choice to balance their costs and expenditures with top enterprise-grade software. The challenge is: what can organizations do to put these pieces together using open source software? Integration of open source infrastructure software and applications can be complex and costly.

The next version of HCI

Hyperconverged Infrastructure (HCI) also changed the game. Integration of compute, network and storage became easier, more seamless and less costly when HCI entered the market. Wrapped with a single control plane, the HCI management component can orchestrate VM (virtual machine) resources without much friction. That was HCI 1.0.

But HCI 1.0 was challenged, because several key components of its architecture were based on DAS (direct-attached storage). Scaling storage capacity was limited by the storage components attached to the HCI architecture. Some storage vendors decided to be creative and created dHCI (disaggregated HCI). If you break down the components one by one, in my opinion, dHCI is just a SAN (storage area network) attached to HCI. Maybe this should be HCI 1.5.

A new version of an HCI architecture is swimming in as Angelfish

Kubernetes came into the HCI picture in recent years. Without the weight and dependencies of VMs and DAS at the HCI server layer, lightweight containers, orchestrated mostly by Kubernetes, made distribution of compute easier. From on-premises to cloud and in between, compute resources can easily be spun up or down anywhere.

Continue reading

How well do you know your data and the storage platform that processes the data

Last week was consumed by many conversations on this topic. I was quite jaded, really. Unfortunately, many still take a very simplistic view of storage technology, or should I say, of the over-marketing of storage technology. So much so that end users make incredible assumptions about the benefits of a storage array, a software defined storage platform or even cloud storage. And too often the caveats of turning on a feature and tuning a configuration to the max are discarded or neglected. Regard for good storage and data management best practices? What’s that?

I share some of my thoughts on handling conversations like these, and I try to set the right expectations rather than overhype a feature or a function in the data storage services.

Complex data networks and the storage services that serve it

I/O Characteristics

Applications and workloads (A&W) read from and write to the data storage services platforms. These could be local DAS (direct-attached storage), network storage arrays in SAN and NAS, and now object storage, or cloud storage services. Regardless of structured or unstructured data, different A&Ws have different behavioural I/O patterns in accessing data from storage. Therefore storage has to be configured to match these patterns as closely as possible, so that it can perform optimally for these A&Ws. Without going into deep details, here are a few to think about:

  • Random and Sequential patterns
  • Block sizes of these A&Ws ranging from typically 4K to 1024K.
  • Causal effects of synchronous and asynchronous I/Os to and from the storage
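To put the block size point in numbers, here is a minimal Python sketch of the basic relationship between IOPS, block size and throughput. The 10,000 IOPS figure is a hypothetical device I made up for illustration, and the assumption that a device sustains the same IOPS at every block size is deliberately unrealistic; real storage usually delivers fewer IOPS as blocks get larger.

```python
def throughput_mib_per_s(iops: float, block_size_kib: int) -> float:
    """Throughput in MiB/s for a given IOPS rate and block size in KiB."""
    return iops * block_size_kib / 1024


# Hypothetical device: 10,000 IOPS regardless of block size (an assumption)
for block_kib in (4, 64, 1024):
    mib_s = throughput_mib_per_s(10_000, block_kib)
    print(f"{block_kib:>5}K blocks -> {mib_s:,.1f} MiB/s")
```

The same IOPS number can mean a trickle or a flood of data depending on the block size, which is why an IOPS figure quoted without a block size says very little about how an A&W will behave.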

Continue reading

OpenZFS with Object Storage

At AWS re:Invent last week, Amazon Web Services announced Amazon FSx for OpenZFS. This is the 4th managed service under the Amazon FSx umbrella, joining NetApp® ONTAP™, Lustre and Windows File Server. The highly scalable OpenZFS filesystem can provide high throughput and IOPS bandwidth to Amazon EC2, ECS, EKS and VMware® Cloud on AWS.

I am assuming the AWS OpenZFS uses EBS as the block storage backend, given the announcement that it can deliver 4GB/sec of throughput and 160,000 IOPS from the “drives” without caching. How the OpenZFS is provisioned to the AWS clients is well documented in this blog here. It is an absolute joy (for me) to see the open source OpenZFS filesystem getting the validation and recognition from AWS. This is one hell of a filesystem.

But this blog isn’t about AWS FSx for OpenZFS with block storage. It is about what is coming, and eventually AWS FSx for OpenZFS could expand into AWS’s proficient S3 storage as well.  Can OpenZFS integrate with an S3 object storage backend? This blog looks into the burning question.

At the recently concluded OpenZFS Developer Summit 2021, one of the topics was “ZFS on Object Storage”, and the short answer is a resounding YES!

OpenZFS Developer Summit 2021

Continue reading

Control your Files. Control your Sovereignty.

Data residency, data sovereignty, data localization – the trio of data compliance and governance – have been on my mind a lot lately. I am seeing a disturbing trend. The “Splinternet” has picked up a hurried pace. We are now seeing many countries drawing up digital boundaries in the name of data privacy and data protection with sovereign laws and regulations. Besides these digital demarcations along the lines of data definitions, digital “colonization” is a strong undercurrent as developing countries are accepting larger and more powerful foreign powers into their playpen.

Public cloud services transcend national borders. The breakneck speed in the adoption of public cloud services is causing anxieties and concerns with conservative governments everywhere. On the flip side of the coin, commerce has certainly flourished and bloomed as global collaborations bring new opportunities and new markets – all for capitalism and growth.

[ Note: While we are on this debacle, the voices of decentralization are getting louder as well, but that is a topic for another day ]

Where are your data files now?

Continue reading

Right time for Andrew. The Filesystem that is.

I couldn’t hold my excitement when I discovered Auristor® early last week. I stumbled upon this Computerweekly article “Want to side step Public Cloud? Auristor® offers global file storage.” Given the many news stories not exactly praising the public cloud storage vendors nowadays, the article’s title caught my attention. Immediately, Andrew File System (AFS) was there. I was perplexed at first because I had never seen or heard of a commercial version of AFS before. This news gave me goosebumps.

For the curious, I am sure many will ask who is this Andrew anyway? What is my relationship with this Andrew?

One time with Andrew

A bit of my history. I recall quite vividly helping Intel in Penang, Malaysia to implement their globally distributed file caching mechanism with the NetApp® filer’s NFS. It was probably 2001, and I believe Intel wanted to share their engineering computing (EC) files between their US facilities and Intel Penang Design Center (PDC). As I worked along with the Intel folks, I found out that this distributed file caching technology was called Andrew File System (AFS).

Although I can’t really recall how the project went, I remember it being a bed of bugs at that time. But being the storage geek that I am, I obviously took some time to get to know Andrew the File System. 20 years have gone by, and I never really thought of AFS coming out as a commercial solution or even knew of it as one, until Auristor®.

Auristor Logo

Continue reading

The Starbucks model for Storage-as-a-Service

Starbucks™ is not a coffee shop. It purveys more than coffee, tea and food; it puts together the yuppie beverage experience. The intention is to get customers to stay as long as they can, and to keep purchasing Starbucks’ smorgasbord of high-margin provisions in volume. Wifi, ambience, status, coffee or tea with your name on it (plenty of jokes and memes there), energetic baristas and servers, fancy coffee roasts and beans, et al. All part of the Starbucks™-as-a-Service pleasurable affair that intends to lock the customer in and have them keep coming back.

The Starbucks experience

Data is heavy and they know it

Unlike compute and network infrastructures, storage infrastructures hold data persistently and permanently. Data has to land on a piece of storage medium. Couple that with the fact that data is heavy, forever growing and has gravity, and you have a perfect recipe for lock-in. All storage purveyors, whether on-premises data center enterprise storage, public cloud storage, or in between, have many, many methods to keep the data chained to a storage technology or a storage service for a long time. Storage-as-a-service is like tying the cow to the stake and milking it over and over. This business model is very sticky. This stickiness is also a lock-in mechanism.

Continue reading

Open Source Storage Technology Crafters

The conversation often starts with a challenge. “What’s so great about open source storage technology?”

For the casual end users of storage systems, whether SAN (definitely not Fibre Channel) or NAS on-premises, or getting “files” from personal cloud storage like Dropbox, OneDrive et al., there is a strong presumption that open source storage technology is cheap and flaky. This is not helped by the diet of consumer NAS brands in the market, where the price is cheap but the capabilities, reliability and performance of the storage offering are found wanting. Thus this notion floats its way to business and enterprise users, and often ends up as a negative perception of open source storage technology.

Highway Signpost with Open Source wording

Storage Assemblers

Anybody can “build” a storage system with open source storage software. Put the software together with any commodity x86 server, and it can function with the basic storage services. Most open source storage software can do the job pretty well. However, once the completed storage technology is put together, can it do the job well enough to serve a business critical end user? I have plenty of sob stories from end users I have spoken to in these many years in the industry related to so-called “enterprise” storage vendors. I wrote a few blogs in the past that related to these sad situations:

We have such storage offerings rigged with cybersecurity risks and holes too. In a recent Unit 42 report, 250,000 NAS devices are vulnerable and exposed to the public Internet. The brands in question are mentioned in the report.

I would categorize these as storage assemblers.

Continue reading

Windows SMB synchronous writes with OpenZFS

Sometimes I get really pissed off with myself because I have taken a bigoted view, and ended up with egg on my face. The past week was like that, and the problem was gnawing at me on the inside all week, because I was determined to restore my equilibrium by finding the answer.

Early in the week, I was having a conversation with a potential customer. It revolved around 10 seconds or so of video footage going missing between the users of a popular video editing software. The company had 70% of its users on Windows and 30% on the Mac, both sides accessing the NAS device. The issue was that the editors on the Windows side would store the raw and edited files to the NAS, but when the Mac users read them, they would often find 10 seconds or so of the stored video files missing.

The likeliest culprit of this problem is the way SMB protocol write I/O behaves in Windows and in macOS. Windows SMB, by default, writes I/O asynchronously, while SMB on macOS writes I/O synchronously.
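For readers who have not bumped into the difference before, here is a rough local analogy in Python. It is not SMB, and it is not how either client implements its writes. It only contrasts a buffered write, which returns as soon as the operating system has accepted the data, with a write followed by `os.fsync`, which waits until the operating system reports the data as flushed to the storage device. The function name, file names, write count and block size are all my own choices for illustration.

```python
import os
import tempfile
import time


def timed_writes(path, sync, count=200, block=4096):
    """Write `count` blocks to `path`; optionally flush after every write."""
    payload = b"\0" * block
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC)
    start = time.perf_counter()
    try:
        for _ in range(count):
            os.write(fd, payload)
            if sync:
                # Wait until the OS reports the data as flushed to the device
                os.fsync(fd)
    finally:
        os.close(fd)
    return time.perf_counter() - start


with tempfile.TemporaryDirectory() as tmp:
    buffered = timed_writes(os.path.join(tmp, "buffered.bin"), sync=False)
    flushed = timed_writes(os.path.join(tmp, "flushed.bin"), sync=True)
    print(f"buffered: {buffered:.4f}s, flushed after every write: {flushed:.4f}s")
```

On most systems the flushed run is noticeably slower, which is the price of knowing the data has really landed before moving on.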

I had a strong conviction that I had the answer to this issue, but this was not a TrueNAS®. It was another brand of NAS that I did not have knowledge of, and so I left the conversation feeling quite embarrassed, because I had the answer only on the TrueNAS® server side, not on the Windows client side. Bigotry blinded me. Hmmph!

SMB (Server Message Block) client-server model

Continue reading

Setting up Nextcloud on FreeNAS Part 2

[ Note: ] This is a continuation of Setting up Nextcloud on FreeNAS Part 1 in June 2021 blog.

Nextcloud logo

I mentioned in my previous blog that what I did here was not unique. There are many great open source crafters who have done this better than I have. I stood on the shoulders of giants whose videos helped me to learn and configure Nextcloud on FreeNAS™ (not TrueNAS® CORE, because my weekend exercises were on version 11.2U5). The videos made by Nhan P. Nguyen were instrumental in getting my Nextcloud to work, and I would shamefully admit that I have copied his work almost verbatim.

Continue reading

Enterprise Storage is not just a Label

I have many anecdotes around the topic of Enterprise Storage, but the conversations in the past 2 weeks made it important for me to share this.

Enterprise Storage is …

Amusing, painful, angry

I get riled up whenever people do not want to be educated about Enterprise Storage. Here are a few that happened in the last 2 weeks.

[ Story #1 ]

A guy was building his own storage for cryptocurrency. He was informed by his supplier that the RAID card was enterprise, and he could get the best performance using “Enterprise” RAID-0.

  • Well, the “Enterprise” RAID-0 volume crashed, and he lost all data. Painfully, he said he lost a hefty sum financially.
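The arithmetic behind why RAID-0 is risky is simple enough to sketch. RAID-0 stripes data across all drives with no redundancy, so losing any one drive loses the volume. The Python snippet below assumes drive failures are independent and uses a hypothetical 2% annual failure probability per drive; both are assumptions of mine for illustration, not figures from this story.

```python
def raid0_loss_probability(drive_failure_prob: float, drives: int) -> float:
    """Chance that at least one of `drives` independent drives fails."""
    return 1 - (1 - drive_failure_prob) ** drives


# Hypothetical 2% annual failure probability per drive (an assumption)
for n in (1, 2, 4, 8):
    print(f"{n} drive(s): {raid0_loss_probability(0.02, n):.2%} chance of losing the volume")
```

Every drive added to the stripe adds performance and, at the same time, another way to lose everything. No “Enterprise” label on the RAID card changes that.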

[ Story #2 ]

A media company complained about the reliability of its previous storage vendor. The GM was shopping around and was told that there are “Enterprise” SATA drives and that their reliability is as good as, if not better than, SAS drives.

  • The company wanted a fully reliable Enterprise Storage system with 99.999% availability, and yet the SATA interface was not meant for building highly reliable enterprise storage. The GM insisted on using “Enterprise” SATA drives for his “enterprise” storage system instead of SAS.
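To see what 99.999% availability actually demands, here is a small Python sketch that converts an availability percentage into allowed downtime per year. It assumes a 365-day year, and the function name is my own.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, assuming a 365-day year


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Maximum downtime per year allowed by an availability target."""
    return (100 - availability_percent) / 100 * MINUTES_PER_YEAR


for target in (99.9, 99.99, 99.999):
    minutes = downtime_minutes_per_year(target)
    print(f"{target}% availability -> {minutes:.2f} minutes of downtime per year")
```

Five nines leaves a little over five minutes of downtime in a whole year, which is a very tight budget for any storage system.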

[ Story #3 ]

An IT admin of a manufacturing company claimed that they had an “Enterprise Storage” system for a few years, and could not figure out why his hard disk drives would die every 12-15 months.

  • He figured out that the drives supplied by his vendor were consumer SATA drives, even though he was told it was an “Enterprise Storage” system when he bought the system.

Continue reading