Sometimes I get really pissed off with myself because I have taken a bigoted view, and ended up with eggs on my face. The past week was like that, and the problem was gnawing me on the inside all week, because I was determined to balance my equilibrium by finding the answer.
Early in the week, I was having a conversation with a potential customer. It evolved around the missing 10 seconds or so of the video footage between the users of a popular video editing software. The company had 70% Windows users, and 30% users on the Mac, both sides accessing the NAS device. The issue was the editors on the Windows side will store the raw and edited files to the NAS, but when the Mac users read them, they will often find 10 seconds or so of the stored video files missing.
The likeliest culprit of this problem is the way the SMB protocol write I/O behaves in Windows and in MacOS. Windows SMB, by default, writes I/O asynchronously while SMB on MacOS writes I/O synchronously.
I had a strong conviction I had the answer to this issue but this was not a TrueNAS®, It was another brand of NAS that I did not have knowledge of, and so, I left the conversation feeling quite embarrassed because I had the answer only on the TrueNAS® server side, not on the Windows client side. Bigotry blinded me. Hmmph!
[ This is part two of “Where are your files living now?”. You can read Part One here ]
“Data locality, Data mobility“. It was a term I like to use a lot when describing about data consolidation, leading to my mention about files and folders, and where they live in my previous blog. The thinking of where the files and folders are now as in everywhere as they can be in a plethora of premises stretches the premise of SSOT (Single Source of Truth). And this expatriation of files with minimal checks and balances disturbs me.
A year ago, just before I joined iXsystems, I was given Google® embargoed news, probably a week before they announced BigQuery Omni. Then I was interviewed by Enterprise IT News, a local Malaysian technology news portal to provide an opinion quote. This was what I quoted:
“’The data warehouse in the cloud’ managed services of Big Query is underpinned by Google® Anthos, its hybrid cloud infra and service management platform based on GKE (Google® Kubernetes Engine). The containerised applications, both on-prem and in the multi-clouds, would allow Anthos to secure and orchestrate infra, services and policy management under one roof.”
I further quoted ” The data repositories remain in each cloud is good to address data sovereignty, data security concerns but it did not mention how it addresses “single source of truth” across multi-clouds.”
Single Source of Truth – regardless of repositories
EMC2 (before the Dell® acquisition) in the 2000s had a tagline called “Where Information Lives™“**. This was before the time of cloud storage. The tagline was an adage of enterprise data storage, proper and contemporaneous to the persistent narrative at the time – Data Consolidation. Within the data consolidation stories, thousands of files and folders moved about the networks of the organizations, from servers to clients, clients to servers. NAS (Network Attached Storage) was, and still is the work horse of many, many organizations.
[ **Side story ] There was an internal anti-EMC joke within NetApp® called “Information has a new address”.
EMC tagline “Where Information Lives”
This was a time where there were almost no concerns about Shadow IT; ransomware were less known; and most importantly, almost everyone knew where their files and folders were, more or less (except in Oil & Gas upstream – to be told in later in this blog). That was because there were concerted attempts to consolidate data, and inadvertently files and folders, in the organization.
Even when these organizations were spread across the world, there were distributed file technologies at the time that could deliver files and folders in an acceptable manner. Definitely not as good as what we have today in a cloudy world, but acceptable. I personally worked a project setting up Andrew File Systems for Intel® in Penang in the mid-90s, almost joined Tacit Networks in the mid-2000s, dabbled on Microsoft®Distributed File System with NetApp® and Windows File Servers while fixing the mountains of issues in deploying the worldwide GUSto (Global Unified Storage) Project in Shell 2006. Somewhere in my chronological listings, Acopia Networks (acquired by F5) and of course, EMC2 Rainfinity and NetApp® NuView OEM, Virtual File Manager.
The point I am trying to make here is most IT organizations had a good grip of where the files and folders were. I do not think this is very true anymore. Do you know where your files and folders are living today?
RAID (Redundant Array of Independent Disks) is the foundation of almost every enterprise storage array in existence. Thus a technology change to a RAID implementation is a big deal. In recent weeks, we have witnessed not one, but two seismic development updates to the volume management RAID subsystem of the OpenZFS open source storage platform.
For the uninformed, ZFS is one of the rarities in the storage industry which combines the volume manager and the file system as one. Unlike traditional volume management, ZFS merges both the physical data storage representations (eg. Hard Disk Drives, Solid State Drives) and the logical data structures (eg. RAID stripe, mirror, Z1, Z2, Z3) together with a highly reliable file system that scales. For a storage practitioner like me, working with ZFS is that there is always a “I get it!” moment every time, because the beauty is there are both elegances of power and simplicity rolled into one.
The Windows DirectStorage API feature is only available in Windows 11. It was announced as part of the Xbox® Velocity Architecture last year to take advantage of the high I/O capability of modern day NVMe SSDs. DirectStorage-enabled applications and games have several technologies such as D3D Direct3D decompression/compression algorithm designed for the GPU, and SFS Sampler Feedback Streaming that uses the previous rendered frame results to decide which higher resolution texture frames to be loaded into memory of the GPU and rendered for the real-time gaming experience.
The cryptocurrency craze has elevated another strong candidate in recent months. Filecoin, is leading the voice of a decentralized Internet, the next generation Web 3.0. In this blog, I am not going to write much about the Filecoin frenzy but the underlying distributed file system that powers this phenomenon – The Interplanetary File System.
[ Note: This is still a very new area for me, and the rest of the content of this blog is still nascent and developing ]
Interplanetary File System
Tremulous Client-Server web architecture
The entire Internet architecture is almost client and server. Your clients like browsers, apps, connect to Web services served from a collection of servers. As Web 3.0 approaches (some say it is already here), the client-server model is no longer perceived as the Internet architecture of choice. Billions, and billions of users, applications, devices relying solely on a centralized service would lead to many impactful consequences, and the reasons for decentralization, away from the client-server architecture models of the Internet are cogent.
I took a week off blogging last week but the lazy days were inundated by bad news. A few more devastating ransomware attacks. This time, Colonial Pipeline in the US was hacked and its networks were shutdown by ransomware. These ransomware threats are never ending, and they are getting more damaging than ever. It is like trying to plug a leaking boat with your hands, and more leaks appear as you plug them.
More ransomware news hitting healthcare around the world last week:
We are forever chasing for a solution, forever losing because almost all technology defenses to protect the data against ransomware are reactive. Why is ransomware still such a big threat then? Time to rethink file security fundamentals.
I am not a Bitcoin miner nor am I a Chia coin farmer, and my knowledge and experience in both are very shallow. But I recently became interested in the 2 main activities of Chia – plotting and farming, because they both involved storage. I am writing this blog to find out more and document about my learning experience.
[ NB: This blog does not help you make money. It is just informational from a storage technology perspective. ]
Proof of Space and Time
Bitcoin is based on Proof-of-Work (PoW). In a nutshell, there is a complex mathematical puzzle to be solved. Bitcoin miners compete to solve this puzzle and the process uses high computational processing to solve it. Once solved, the miners are rewarded for their work.
Newer entrants like Filecoin and Chia coin (XCH) use an alternate method which is Proof-of-Space (PoS) to validate and verify the transactions. Instead of miners, Chia coin farmers have to prove to have a legitimate amount of disk and/or memory space to solve a mathematical puzzle, conceptually similar to the one in Bitcoin mining. In the beginning, this was great for folks who have unused disk space that can be “rented” out to store the crypto stuff (Note: I am not familiar with the terminology yet, and I did not want to use the word “crypto tokens” incorrectly). Storj was one of the early vendors that I remember in this space touting this method but I have not followed them for a while. Their business model might have changed.
The Apple Filing Protocol (AFP) file sharing service in the MacOS Server is gone. The AFP file server capability was dropped in MacOS version 11, aka Big Sur back in December last year. The AFP clientis the last remaining piece in MacOS and may see its days numbered as well as the world of file services evolved from the simple local networks and workgroup collaboration of the 80s and 90s, to something more complex and demanding. The AFP’s decline was also probably aided by the premium prices of Apple hardware, and many past users have switched to Windows for frugality and prudence reasons. SMB/CIFS is the network file sharing services for Windows, and AFP is not offered in Windows natively.
MacOS supports 3 of the file sharing protocols natively – AFP, NFS and SMB/CIFS – as a client. Therefore, it has the capability to collaborate well in many media and content development environments, and sharing and exchanging files easily, assuming that the access control and permissions and files/folders ownerships are worked out properly. The large scale Apple-only network environment is no longer feasible and many studios that continue to use Macs for media and content development have only a handful of machines and users.
NAS vendors that continue to support AFP file server services are not that many too, or at least those who advertise their support for AFP. iXsystems™TrueNAS® is one of the few. This blog shows the steps to setup the AFP file services for MacOS clients.
George Herbert Leigh Mallory, mountaineer extraordinaire, was once asked “Why did you want to climb Mount Everest?“, in which he replied “Because it’s there“. That retort demonstrated the indomitable human spirit and probably exemplified best the relationship between the human being’s desire to conquer the physical limits of nature. The software of humanity versus the hardware of the planet Earth.
Juxtaposing, similarities can be said between software and hardware in computer systems, in storage technology per se. In it, there are a few schools of thoughts when it comes to delivering storage services with the notable ones being the storageappliance model and the software-defined storage model.
There are arguments, of course. Some are genuinely partisan but many a times, these arguments come in the form of the flavour of the moment. I have experienced in my past companies touting the storage appliance model very strongly in the beginning, and only to be switching to a “software company” chorus years after that. That was what I meant about the “flavour of the moment”.