Crash consistent data recovery for ZFS volumes

While TrueNAS® CORE and TrueNAS® Enterprise are more well known for its NAS (network attached storage) prowess, many organizations are also confidently placing their enterprise applications such as hypervisors and databases on TrueNAS® via SANs (storage area networks) as well. Both iSCSI and Fibre Channel™ (selected TrueNAS® Enterprise storage models) protocols are supported well.

To reliably protect these block-based applications via the SAN protocols, ZFS snapshot is the key technology that can be dependent upon to restore the enterprise applications quickly. However, there are still some confusions when it comes to the state of recovery from the ZFS snapshots. On that matter, this situations are not unique to the ZFS environments because as with many other storage technologies, the confusion often stem from the (mis)understanding of the consistency state of the data in the backups and in the snapshots.

Crash Consistency vs Application Consistency

To dispel this misunderstanding, we must first begin with the understanding of a generic filesystem agnostic snapshot. It is a point-in-time copy, just like a data copy on the tape or in the disks or in the cloud backup. It is a complete image of the data and the state of the data at the storage layer at the time the storage snapshot was taken. This means that the data and metadata in this snapshot copy/version has a consistent state at that point in time. This state is frozen for this particular snapshot version, and therefore it is often labeled as “crash consistent“.

In the event of a subsystem (application, compute, storage, rack, site, etc) failure or a power loss, data recovery can be initiated using the last known “crash consistent” state, i.e. restoring from the last good backup or snapshot copy. Depending on applications, operating systems, hypervisors, filesystems and the subsystems (journals, transaction logs, protocol resiliency primitives etc) that are aligned with them, some workloads will just continue from where it stopped. It may already have some recovery mechanisms or these workloads can accept data loss without data corruption and inconsistencies.

Some applications, especially databases, are more sensitive to data and state consistencies. That is because of how these applications are designed. Take for instance, the Oracle® database. When an Oracle® database instance is online, there is an SGA (system global area) which handles all the running mechanics of the database. SGA exists in the memory of the compute along with transaction logs, tablespaces, and open files that represent the Oracle® database instance. From time to time, often measured in seconds, the state of the Oracle® instance and the data it is processing have to be synched to non-volatile, persistent storage. This commit is important to ensure the integrity of the data at all times.

Continue reading

Control your Files. Control your Sovereignty.

Data residency, data sovereignty, data localization – the trio of data compliance and governance – have been on my mind a lot lately. I am seeing a disturbing trend. “Splinternet” has taken a hurried and hastened pace. We are now seeing many countries drawing up digital boundaries in the name of data privacy and data protection with sovereign laws and regulations. Besides, these digital demarcation along the lines with data definitions, digital “colonization” is a strong undercurrent as developing countries are accepting larger and more powerful foreign powers into their playpen.

Public cloud services transcend national borders. The breakneck speed in the adoption of public cloud services is causing anxieties and concerns with conservative governments everywhere. On the flip side of the coin, commerce has certainly flourished and bloomed as global wide collaborations bring new opportunities, new markets – all for capitalism and growth.

[ Note: While we are on this debacle, the voices of decentralization are getting louder as well, but that is a topic for another day ]

Where are your data files now?

Continue reading

Right time for Andrew. The Filesystem that is.

I couldn’t hold my excitement when I discovered Auristor® early last week. I stumbled upon this Computerweekly article “Want to side step Public Cloud? Auristor® offers global file storage.” Given the many news not exactly praising the public cloud storage vendors nowadays, the article’s title caught my attention. Immediately Andrew File System (AFS) was there. I was perplexed at first because I have never seen or heard a commercial version of AFS before. This news gave me goosebumps.

For the curious, I am sure many will ask who is this Andrew anyway? What is my relationship with this Andrew?

One time with Andrew

A bit of my history. I recalled quite vividly helping Intel in Penang, Malaysia to implement their globally distributed file caching mechanism with the NetApp® filer’s NFS. It was probably 2001 and I believed Intel wanted to share their engineering computing (EC) files between their US facilities and Intel Penang Design Center (PDC). As I worked along with the Intel folks, I found out that this distributed file caching technology was called Andrew File System (AFS).

Although I couldn’t really recalled how the project went, I remembered it being a bed of bugs at that time. But being the storage geek that I am, I obviously took some time to get to know Andrew the File System. 20 years have gone by, and I never really thought of AFS coming out as a commercial solution or even knew of it as one, until Auristor®,

Auristor Logo

Continue reading

The Starbucks model for Storage-as-a-Service

Starbucks™ is not a coffee shop. It purveys beyond coffee and tea, and food and puts together the yuppie beverages experience. The intention is to get the customers to stay as long as they can, and keep purchasing the Starbucks’ smorgasbord of high margin provisions in volume. Wifi, ambience, status, coffee or tea with your name on it (plenty of jokes and meme there), energetic baristas and servers, fancy coffee roasts and beans et. al. All part of the Starbucks™-as-a-Service pleasurable affair that intends to lock the customer in and have them keep coming back.

The Starbucks experience

Data is heavy and they know it

Unlike compute and network infrastructures, storage infrastructures holds data persistently and permanently. Data has to land on a piece of storage medium. Coupled that with the fact that data is heavy, forever growing and data has gravity, you have a perfect recipe for lock-in. All storage purveyors, whether they are on-premises data center enterprise storage or public cloud storage, and in between, there are many, many methods to keep the data chained to a storage technology or a storage service for a long time. The storage-as-a-service is like tying the cow to the stake and keeps on milking it. This business model is very sticky. This stickiness is also a lock-in mechanism.

Continue reading

Open Source Storage Technology Crafters

The conversation often starts with a challenge. “What’s so great about open source storage technology?

For the casual end users of storage systems, regardless of SAN (definitely not Fibre Channel) or NAS on-premises, or getting “files” from the personal cloud storage like Dropbox, OneDrive et al., there is a strong presumption that open source storage technology is cheap and flaky. This is not helped with the diet of consumer brands of NAS in the market, where the price is cheap, but the storage offering with capabilities, reliability and performance are found to be wanting. Thus this notion floats its way to the business and enterprise users, and often ended up with a negative perception of open source storage technology.

Highway Signpost with Open Source wording

Storage Assemblers

Anybody can “build” a storage system with open source storage software. Put the software together with any commodity x86 server, and it can function with the basic storage services. Most open source storage software can do the job pretty well. However, once the completed storage technology is put together, can it do the job well enough to serve a business critical end user? I have plenty of sob stories from end users I have spoken to in these many years in the industry related to so-called “enterprise” storage vendors. I wrote a few blogs in the past that related to these sad situations:

We have such storage offerings rigged with cybersecurity risks and holes too. In a recent Unit 42 report, 250,000 NAS devices are vulnerable and exposed to the public Internet. The brands in question are mentioned in the report.

I would categorize these as storage assemblers.

Continue reading

Windows SMB synchronous writes with OpenZFS

Sometimes I get really pissed off with myself because I have taken a bigoted view, and ended up with eggs on my face. The past week was like that, and the problem was gnawing me on the inside all week, because I was determined to balance my equilibrium by finding the answer.

Early in the week, I was having a conversation with a potential customer. It evolved around the missing 10 seconds or so of the video footage between the users of a popular video editing software. The company had 70% Windows users, and 30% users on the Mac, both sides accessing the NAS device. The issue was the editors on the Windows side will store the raw and edited files to the NAS, but when the Mac users read them, they will often find 10 seconds or so of the stored video files missing.

The likeliest culprit of this problem is the way the SMB protocol write I/O behaves in Windows and in MacOS. Windows SMB, by default, writes I/O asynchronously while SMB on MacOS writes I/O synchronously.

I had a strong conviction I had the answer to this issue but this was not a TrueNAS®, It was another brand of NAS that I did not have knowledge of, and so, I left the conversation feeling quite embarrassed because I had the answer only on the TrueNAS® server side, not on the Windows client side. Bigotry blinded me. Hmmph! 

SMB (Server Message Block) client-server model

Continue reading

Setting up Nextcloud on FreeNAS Part 2

[ Note: ] This is a continuation of Setting up Nextcloud on FreeNAS Part 1 in June 2021 blog.

Nextcloud logo

I mentioned in my previous blog that what I did here was not unique. There were many great open source crafters who have done this better than I did. I stood on the shoulders of giants whose videos have helped me to learn and configure Nextcloud on FreeNAS™ (not TrueNAS® CORE, because my weekend exercises were on version 11.2U5). The videos made by Nhan P. Nguyen were instrumental in getting my Nextcloud to work, and I would shamefully admit that I have copied his work almost verbatim.

Continue reading

RAIDZ expansion and dRAID excellent OpenZFS adventure

RAID (Redundant Array of Independent Disks) is the foundation of almost every enterprise storage array in existence. Thus a technology change to a RAID implementation is a big deal. In recent weeks, we have witnessed not one, but two seismic development updates to the volume management RAID subsystem of the OpenZFS open source storage platform.

OpenZFS logo

For the uninformed, ZFS is one of the rarities in the storage industry which combines the volume manager and the file system as one. Unlike traditional volume management, ZFS merges both the physical data storage representations (eg. Hard Disk Drives, Solid State Drives) and the logical data structures (eg. RAID stripe, mirror, Z1, Z2, Z3) together with a highly reliable file system that scales. For a storage practitioner like me, working with ZFS is that there is always a “I get it!” moment every time, because the beauty is there are both elegances of power and simplicity rolled into one.

Continue reading

First looks into Interplanetary File System

The cryptocurrency craze has elevated another strong candidate in recent months. Filecoin, is leading the voice of a decentralized Internet, the next generation Web 3.0. In this blog, I am not going to write much about the Filecoin frenzy but the underlying distributed file system that powers this phenomenon – The Interplanetary File System.

[ Note: This is still a very new area for me, and the rest of the content of this blog is still nascent and developing ]

Interplanetary File System

Tremulous Client-Server web architecture

The entire Internet architecture is almost client and server. Your clients like browsers, apps, connect to Web services served from a collection of servers. As Web 3.0 approaches (some say it is already here), the client-server model is no longer perceived as the Internet architecture of choice. Billions, and billions of users, applications, devices relying solely on a centralized service would lead to many impactful consequences, and the reasons for decentralization, away from the client-server architecture models of the Internet are cogent.

Continue reading

Setting up Nextcloud on FreeNAS Part 1

I have started to enhance the work that I did last weekend with Nextcloud on FreeNAS™. I promised to share the innards of my work but first I have to set the right expectations for the readers. This blog is just a documentation of the early work I have been doing to get Nextcloud on FreeNAS™ off the ground quickly. Also there are far better blogs than mine on the Nextcloud topic.

Note:

Nextcloud 17 (latest version is version 21)

Continue reading