Stating the case for a Storage Appliance approach

I was in Indonesia last week to meet with iXsystems™‘ partner PT Maha Data Solusi. I had the wonderful opportunity to meet with many people there and one interesting and often-replayed question arose. Why aren’t iX doing software-defined-storage (SDS)? It was a very obvious and deliberate question.

After all, iX is already providing the free use of the open source TrueNAS® CORE software that runs on many x86 systems as an SDS solution and yet commercially, iX sell the TrueNAS® storage appliances.

This argument between a storage appliance model and a storage storage only model has been debated for more than a decade, and it does come into my conversations on and off. I finally want to address this here, with my own views and opinions. And I want to inform that I am open to both models, because as a storage consultant, both have their pros and cons, advantages and disadvantages. Up front I gravitate to the storage appliance model, and here’s why.

My story of the storage appliance begins …

Back in the 90s, most of my work was on Fibre Channel and NFS. iSCSI has not existed yet (iSCSI was ratified in 2003). It was almost exclusively on the Sun Microsystems® enterprise storage with Sun’s software resell of the Veritas® software suite that included the Sun Volume Manager (VxVM), Veritas® Filesystem (VxFS), Veritas® Replication (VxVR) and Veritas® Cluster Server (VCS). I didn’t do much Veritas® NetBackup (NBU) although I was trained at Veritas® in Boston in July 1997 (I remembered that 2 weeks’ trip fondly). It was just over 2 months after Veritas® acquired OpenVision. Backup Plus was the NetBackup.

Between 1998-1999, I spent a lot of time working Sun NFS servers. The prevalent networking speed at that time was 100Mbits/sec. And I remember having this argument with a Sun partner engineer by the name of Wong Teck Seng. Teck Seng was an inquisitive fella (still is) and he was raving about this purpose-built NFS server he knew about and he shared his experience with me. I detracted him, brushing aside his always-on tech orgasm, and did not find great things about a NAS storage appliance. Auspex™ was big then, and I knew of them.

I joined NetApp® as Malaysia’s employee #2. It was an odd few months working with a storage appliance but after a couple of months, I started to understand and appreciate the philosophy. The storage Appliance Model made sense to me, even through these days.

Continue reading

As Disk Drive capacity gets larger (and larger), the resilient Filesystem matters

I just got home from the wonderful iXsystems™ Sales Summit in Knoxville, Tennessee. The key highlight was to christian the opening of iXsystems™ Maryville facility, the key operations center that will house iX engineering, support and part of marketing as well. News of this can be found here.

iX datacenter in the new Maryville facility

Western Digital® has always been a big advocate of iX, and at the Summit, they shared their hard disk drives HDD, solid state drives SSD, and other storage platforms roadmaps. I felt like a kid a candy store because I love all these excitements in the disk drive industry. Who says HDDs are going to be usurped by SSDs?

Several other disk drive manufacturers, including Western Digital®, have announced larger capacity drives. Here are some news of each vendor in recent months

Other than the AFR (annualized failure rates) numbers published by Backblaze every quarter, the Capacity factor has always been a measurement of high interest in the storage industry.

Continue reading

Unstructured Data Observability with Datadobi StorageMAP

Let’s face it. Data is bursting through its storage seams. And every organization now is storing too much data that they don’t know they have.

By 2025, IDC predicts that 80% the world’s data will be unstructured. IDC‘s report Global Datasphere Forecast 2021-2025 will see the global data creation and replication capacity expand to 181 zettabytes, an unfathomable figure. Organizations are inundated. They struggle with data growth, with little understanding of what data they have, where the data is residing, what to do with the data, and how to manage the voluminous data deluge.

The simple knee-jerk action is to store it in cloud object storage where the price of storage is $0.0000xxx/GB/month. But many IT departments in these organizations often overlook the fact that that the data they have parked in the cloud require movement between the cloud and on-premises. I have been involved in numerous discussions where the customers realized that they moved the data in the cloud moved too frequently. Often it was an erred judgement or short term blindness (blinded by the cheap storage costs no doubt), further exacerbated by the pandemic. These oversights have resulted in expensive and painful monthly API calls and egress fees. Welcome to reality. Suddenly the cheap cloud storage doesn’t sound so cheap after all.

The same can said about storing non-active unstructured data on primary storage. Many organizations have not been disciplined to practise good data management. The primary Tier 1 storage becomes bloated over time, grinding sluggishly as the data capacity grows. I/O processing becomes painfully slow and backup takes longer and longer. Sounds familiar?

The A in ABC

I brought up the ABC mantra a few blogs ago. A is for Archive First. It is part of my data protection consulting practice conversation repertoire, and I use it often to advise IT organizations to be smart with their data management. Before archiving (some folks like to call it tiering, but I am not going down that argument today), we must know what to archive. We cannot blindly send all sorts of junk data to the secondary or tertiary storage premises. If we do that, it is akin to digging another hole to fill up the first hole.

We must know which unstructured data to move replicate or sync from the Tier 1 storage to a second (or third) less taxing storage premises. We must be able to see this data, observe its behaviour over time, and decide the best data management practice to apply to this data. Take note that I said best data management practice and not best storage location in the previous sentence. There has to be a clear distinction that a data management strategy is more prudent than to a “best” storage premises. The reason is many organizations are ignorantly thinking the best storage location (the thought of the “cheapest” always seems to creep up) is a good strategy while ignoring the fact that data is like water. It moves from premises to premises, from on-prem to cloud, cloud to other cloud. Data mobility is a variable in data management.

Continue reading

I built a 6-node Gluster cluster with TrueNAS SCALE

I haven’t had hands-on with Gluster for over a decade. My last blog about Gluster was in 2011, right after I did a proof-of-concept for the now defunct, Jaring, Malaysia’s first ISP (Internet Service Provider). But I followed Gluster’s development on and off, until I found out that Gluster was a feature in then upcoming TrueNAS® SCALE. That was almost 2 years ago, just before I accepted to offer to join iXsystems™, my present employer.

The eagerness to test drive Gluster (again) on TrueNAS® SCALE has always been there but I waited for SCALE to become GA. GA finally came on February 22, 2022. My plans for the test rig was laid out, and in the past few weeks, I have been diligently re-learning and putting up the scope to built a 6-node Gluster clustered storage with TrueNAS® SCALE VMs on Virtualbox®.

Gluster on OpenZFS with TrueNAS SCALE

Before we continue, I must warn that this is not pretty. I have limited computing resources in my homelab, but Gluster worked beautifully once I ironed out the inefficiencies. Secondly, this is not a performance test as well, for obvious reasons. So, this is the annals along with the trials and tribulations of my 6-node Gluster cluster test rig on TrueNAS® SCALE.

Continue reading

Nakivo Backup Replication architecture and installation on TrueNAS – Part 1

Backup and Replication software have received strong mandates in organizations with enterprise mindsets and vision. But lower down the rung, small medium organizations are less invested in backup and replication software. These organizations know full well that they must backup, replicate and protect their servers, physical and virtual, and also new workloads in the clouds, given the threat of security breaches and ransomware is looming larger and larger all the time. But many are often put off by the cost of implementing and deploying a Backup and Replication software.

So I explored one of the lesser known backup and recovery software called Nakivo® Backup and Replication (NBR) and took the opportunity to build a backup and replication appliance in my homelab with TrueNAS®. My objective was to create a cost effective option for small medium organizations to enjoy enterprise-grade protection and recovery without the hefty price tag.

This blog, Part 1, writes about the architecture overview of Nakivo® and the installation of the NBR software in TrueNAS® to bake in and create the concept of a backup and replication appliance. Part 2, in a future blog post, will cover the administrative and operations usage of NBR.

Continue reading

Exploring the venerable NFS Ganesha

As TrueNAS® SCALE approaches its General Availability date in less than 10 days time, one of the technology pieces I am extremely excited about in TrueNAS® SCALE is the NFS Ganesha server. It is still early days to see the full prowess of NFS Ganesha in TrueNAS® SCALE, but the potential of Ganesha’s capabilities in iXsystems™ new scale-out storage technology is very, very promising.

NFS Ganesha

I love Network File System (NFS). It was one of the main reasons I was so attracted to Sun Microsystems® SunOS in the first place. 6 months before I graduated, I took a Unix systems programming course in C in the university. The labs were on Sun 3/60 workstations. Coming from a background of a VAX/VMS system administrator in the school’s lab, Unix became a revelation for me. It completely (and blissfully) opened my eyes to open technology, and NFS was the main catalyst. Till this day, my devotion to Unix remained sacrosanct because of the NFS spark aeons ago.

I don’t know NFS Ganesha. I knew of its existence for almost a decade, but I have never used it. Most of the NFS daemons/servers I worked with were kernel NFS, and these included NFS services in Sun SunOS/Solaris, several Linux flavours – Red Hat®, SuSE®, Ubuntu, BSD variants in FreeBSD and MacOS, the older Unices of the 90s – HP-UX, Ultrix, AIX and Irix along with SCO Unix and Microsoft® XenixNetApp® ONTAP™, EMC® Isilon (very briefly), Hitachi® HNAS (née BlueArc) and of course, in these past 5-6 years FreeNAS®/TrueNAS™.

So, as TrueNAS® SCALE beckons, I took to this weekend to learn a bit about NFS Ganesha. Here are what I have learned.

Continue reading

Crash consistent data recovery for ZFS volumes

While TrueNAS® CORE and TrueNAS® Enterprise are more well known for its NAS (network attached storage) prowess, many organizations are also confidently placing their enterprise applications such as hypervisors and databases on TrueNAS® via SANs (storage area networks) as well. Both iSCSI and Fibre Channel™ (selected TrueNAS® Enterprise storage models) protocols are supported well.

To reliably protect these block-based applications via the SAN protocols, ZFS snapshot is the key technology that can be dependent upon to restore the enterprise applications quickly. However, there are still some confusions when it comes to the state of recovery from the ZFS snapshots. On that matter, this situations are not unique to the ZFS environments because as with many other storage technologies, the confusion often stem from the (mis)understanding of the consistency state of the data in the backups and in the snapshots.

Crash Consistency vs Application Consistency

To dispel this misunderstanding, we must first begin with the understanding of a generic filesystem agnostic snapshot. It is a point-in-time copy, just like a data copy on the tape or in the disks or in the cloud backup. It is a complete image of the data and the state of the data at the storage layer at the time the storage snapshot was taken. This means that the data and metadata in this snapshot copy/version has a consistent state at that point in time. This state is frozen for this particular snapshot version, and therefore it is often labeled as “crash consistent“.

In the event of a subsystem (application, compute, storage, rack, site, etc) failure or a power loss, data recovery can be initiated using the last known “crash consistent” state, i.e. restoring from the last good backup or snapshot copy. Depending on applications, operating systems, hypervisors, filesystems and the subsystems (journals, transaction logs, protocol resiliency primitives etc) that are aligned with them, some workloads will just continue from where it stopped. It may already have some recovery mechanisms or these workloads can accept data loss without data corruption and inconsistencies.

Some applications, especially databases, are more sensitive to data and state consistencies. That is because of how these applications are designed. Take for instance, the Oracle® database. When an Oracle® database instance is online, there is an SGA (system global area) which handles all the running mechanics of the database. SGA exists in the memory of the compute along with transaction logs, tablespaces, and open files that represent the Oracle® database instance. From time to time, often measured in seconds, the state of the Oracle® instance and the data it is processing have to be synched to non-volatile, persistent storage. This commit is important to ensure the integrity of the data at all times.

Continue reading

How well do you know your data and the storage platform that processes the data

Last week was consumed by many conversations on this topic. I was quite jaded, really. Unfortunately many still take a very simplistic view of all the storage technology, or should I say over-marketing of the storage technology. So much so that the end users make incredible assumptions of the benefits of a storage array or software defined storage platform or even cloud storage. And too often caveats of turning on a feature and tuning a configuration to the max are discarded or neglected. Regards for good storage and data management best practices? What’s that?

I share some of my thoughts handling conversations like these and try to set the right expectations rather than overhype a feature or a function in the data storage services.

Complex data networks and the storage services that serve it

I/O Characteristics

Applications and workloads (A&W) read and write from the data storage services platforms. These could be local DAS (direct access storage), network storage arrays in SAN and NAS, and now objects, or from cloud storage services. Regardless of structured or unstructured data, different A&Ws have different behavioural I/O patterns in accessing data from storage. Therefore storage has to be configured at best to match these patterns, so that it can perform optimally for these A&Ws. Without going into deep details, here are a few to think about:

  • Random and Sequential patterns
  • Block sizes of these A&Ws ranging from typically 4K to 1024K.
  • Causal effects of synchronous and asynchronous I/Os to and from the storage

Continue reading

Right time for Andrew. The Filesystem that is.

I couldn’t hold my excitement when I discovered Auristor® early last week. I stumbled upon this Computerweekly article “Want to side step Public Cloud? Auristor® offers global file storage.” Given the many news not exactly praising the public cloud storage vendors nowadays, the article’s title caught my attention. Immediately Andrew File System (AFS) was there. I was perplexed at first because I have never seen or heard a commercial version of AFS before. This news gave me goosebumps.

For the curious, I am sure many will ask who is this Andrew anyway? What is my relationship with this Andrew?

One time with Andrew

A bit of my history. I recalled quite vividly helping Intel in Penang, Malaysia to implement their globally distributed file caching mechanism with the NetApp® filer’s NFS. It was probably 2001 and I believed Intel wanted to share their engineering computing (EC) files between their US facilities and Intel Penang Design Center (PDC). As I worked along with the Intel folks, I found out that this distributed file caching technology was called Andrew File System (AFS).

Although I couldn’t really recalled how the project went, I remembered it being a bed of bugs at that time. But being the storage geek that I am, I obviously took some time to get to know Andrew the File System. 20 years have gone by, and I never really thought of AFS coming out as a commercial solution or even knew of it as one, until Auristor®,

Auristor Logo

Continue reading

Windows SMB synchronous writes with OpenZFS

Sometimes I get really pissed off with myself because I have taken a bigoted view, and ended up with eggs on my face. The past week was like that, and the problem was gnawing me on the inside all week, because I was determined to balance my equilibrium by finding the answer.

Early in the week, I was having a conversation with a potential customer. It evolved around the missing 10 seconds or so of the video footage between the users of a popular video editing software. The company had 70% Windows users, and 30% users on the Mac, both sides accessing the NAS device. The issue was the editors on the Windows side will store the raw and edited files to the NAS, but when the Mac users read them, they will often find 10 seconds or so of the stored video files missing.

The likeliest culprit of this problem is the way the SMB protocol write I/O behaves in Windows and in MacOS. Windows SMB, by default, writes I/O asynchronously while SMB on MacOS writes I/O synchronously.

I had a strong conviction I had the answer to this issue but this was not a TrueNAS®, It was another brand of NAS that I did not have knowledge of, and so, I left the conversation feeling quite embarrassed because I had the answer only on the TrueNAS® server side, not on the Windows client side. Bigotry blinded me. Hmmph! 

SMB (Server Message Block) client-server model

Continue reading