Somewhere, there is a misconception that data processing is cheap. That stems from the well-known pricings of the capacities of public cloud storage that are a fraction of cents per month. But data in storage has to be worked upon, and has to be built up and protected to increase its value. Data has to be processed, moved, shared, and used by applications. Data induce workloads. Nobody keeps data stored forever and never be used again. Nobody buys storage just for capacity alone.
We have a great saying in the industry. No matter, where the data moves, it will land in a storage. So, it is clear that data does not exist in ether. And yet, I often see how little attention and prudence and care, when it comes to data infrastructure and data management technologies, the very components that are foundational to great data.
Great data management for Great AI
AI is driving up costs in data processing
A few recent articles drew my focus into the cost of data processing.
A few weeks ago, I decided to wipe clean my entire lab setup running Proxmox 6.2. I wanted to connect the latest version of Proxmox VE 8.0-2 using iSCSI LUNs from the TrueNAS® system I have with me. I thought it would be fun to have the configuration steps and the process documented. This is my journal on how to provision a TrueNAS® CORE iSCSI LUN to Proxmox storage. This iSCSI volume in Proxmox is where the VMs will be installed into.
Here is a simplified network diagram of my setup but it will be expanded to a Proxmox cluster in the future with the shared storage.
Proxmox and TrueNAS network setup
Preparing the iSCSI LUN provisioning
The iSCSI LUN (logical unit number) is provisioned as a logical disk volume to the Proxmox node, where the initiator-target relationship and connection are established.
This part assumes that a zvol has been created from the zpool. At the same time, the IQN (iSCSI Qualified Name) should be known to the TrueNAS® storage as it establishes the connection between Proxmox (iSCSI initiator) and TrueNAS (iSCSI target).
The IQN for Proxmox can be found by viewing the content of the /etc/iscsi/initiatorname.iscsi within the Proxmox shell as shown in the screenshot below.
Where to find the Proxmox iSCSI IQN
The green box shows the IQN number of the Proxmox node that starts with iqn.year-month.com.domain:generated-hostname. This will be used during the iSCSI target portal configuration in the TrueNAS® webGUI.
I was listening to several storage luminaries in the GestaltIT’s podcast “No one understands Storage anymore” a few of weeks ago. Around the minute of 11.09 in the podcast, Dr. J. Metz, SNIA® Chair, brought up this is powerful quote “Storage does not mean Capacity“. It struck me, not in a funny way. It is what it is, and it something I wanted to say to many who do not understand the storage solutions they are purchasing. It exemplifies what is wrong in the many organizations today in their understanding of investing in a storage infrastructure project.
This is my pet peeve. The first words uttered in most, if not all storage requirements in my line of work are, “I want this many Terabytes of storage“. There are no other details and context of what the other requirement factors are, such as availability, performance, future growth, etc. Or even the goals to achieve when purchasing a storage system and operating it. What is the improvement they are looking for?What are the problems to solve?
Where is the OKR?
It pains me to say this. For the folks who have in the IT industry for years, both end users and IT purveyors alike, most are absolutely clueless about OKR (Objectives and Key Results) for their storage infrastructure project. Many cannot frame the data challenges they are facing, and they have no idea where to go next. There is no alignment. There is no strategy. Even worse, there is no concept of how their storage infrastructure investments will improve their business and operations.
Just the other day, one company director from a renown IT integrator here in Malaysia came calling. He has been in the IT industry since 1989 (I checked his Linkedin profile), asking to for a 100TB storage quote. I asked a few questions about availability, performance, scalability; the usual questions a regular IT guy would ask. He has no idea, and instead of telling me he didn’t know, he gave me a runaround of this and that. Plenty of yada, yada nonsense.
In the end, I told him to buy a consumer grade storage appliance from Taiwan. I will just let him make a fool of himself in front of his customer since he didn’t want to take accountability of ensuring his customer get a proper enterprise storage solution in good faith. His customer is probably in the same mould as well.
Defensive Strategies as Data Foundations
A strong storage infrastructure foundation is vital for good Data Credibility. If you do the right things for your data, there is Data Value, and it will serve your business well. Both Data Credibility and Data Value create confidence. And Confidence equates Trust.
In order to create the defensive strategies let’s look at storage Availability, Protection, Accessibility, Management Security and Compliance. These are 6 of the 8 data points of the A.P.P.A.R.M.S.C. framework.
Offensive Strategies as Competitive Advantage
Once we have achieved stability of the storage infrastructure foundation, then we can turn over and drive towards storage Performance, Recovery, plus things like Scalability and Agility.
With a strong data infrastructure foundation, the organization can embark on the offensive, and begin their business transformation journey, knowing that their data is well run, protection, and performs.
Alignment with Data and Business Goals
Why are the defensive and offensive strategies requiring alignment to business goals?
The fact is simple. It is about improving the business and operations, and setting OKRs is key to measure the ROI (return of investment) of getting the storage systems and the solutions in place. It is about switching the cost-fearing (negative) mindset to a profit-conviction (positive) mindset.
For example, maybe the availability of the data to the business is poor. Maybe there is the need to have access to the data 24×7, because the business is going online. The simple measurable fact is we can move availability from 95% uptime to 99.99% uptime with an HA storage system.
Perhaps there are concerns about recoverability in the deluge of ransomware threats. Setting new RPO goals from 24 hours to 4 hours is a measurable objective to enhance data resiliency.
Or getting the storage systems to deliver higher performance from 350 IOPS to 5000 IOPS for the database.
What I am saying here is these data points are measurable, and they can serve as checkpoints for business and operational improvements. From a management perspective, these can be used as KPI (key performance index) to define continuous improvement of Data Confidence.
Furthermore, it is easy when a OKR dashboard is used to map the improvement markers when organizations use storage to move from point A to point B, where B equates to a new success milestone. The alignment sets the paths to the business targets.
Storage does not mean only Capacity.
The sad part is what the OKRs and the measured goals alignments are glaringly missing in the minds of many organizations purchasing a storage infrastructure and data management solution. The people tasked to source a storage technology solution are not placing a set of goals and objectives. Capacity appears to be the only thing on their mind.
I am about to meet a procurement officer of a customer soon. She asked me this question “Why is your storage so expensive?” over email. I want to change her mindset, just like the many officers and C-levels who hold the purse strings.
Let’s frame the use storage infrastructure in the real world. Nobody buys a storage system just to keep data in there much like a puddle keeps stagnant water. Sooner or later the value of the data in the storage evaporates or the value becomes dull if the data is not used well in any ways, shape or form.
Storage systems and the interconnected pathways from on premises, to the next premises, to the edge and to the clouds serve the greater good for Data. Data is used, shared, shaped, improved, enhanced, protected, moved, and more to deliver Value to the Business.
Storage capacity is just one of the few factors to consider when investing in a storage infrastructure solution. In fact, capacity is probably the least important piece when considering a storage solution to achieve the company’s OKRs. If we think about it deeper, setting the foundation for Data in the defensive manner will help elevate value of the data to be promoted with the offensive strategies to gain the competitive advantage.
Storage infrastructure and storage solutions along with data management platforms may appear to be a cost to the annual budgets. If you know set the OKRs, define A to get to B, alignment the goals, storage infrastructure and the data management platforms and practices are investments that are worth their weight in gold. That is my guarantee.
On the flip side, ignoring and avoiding OKRs, and set the strategies without prudence will yield its comeuppance. Technical debts will prevail.
On the road, seat belt saves lives. So does the motorcycle helmet. But these 2 technologies alone are probably not well received and well applied daily unless there is a strong ecosystem and culture about road safety. For decades, there have been constant and unrelenting efforts to enforce the habits of putting on the seat belt or the helmet. Statistics have shown they reduce road fatalities, but like I said, it is the safety culture that made all this happen.
On the digital front, the ransomware threats are unabated. In fact, despite organizations (and individuals), both large and small, being more aware of cyber-hygiene practices more than ever, the magnitude of ransomware attacks has multiplied. Threat actors still see weaknesses and gaps, and vulnerabilities in the digital realms, and thus, these are lucrative ventures that compliment the endeavours.
Time to look at Data Management
The Cost-Benefits-Risks Conundrum of Data Management
And I have said this before in the past. At a recent speaking engagement, I brought it up again. I said that ransomware is not a cybersecurity problem. Ransomware is a data management problem. I got blank stares from the crowd.
I get it. It is hard to convince people and companies to embrace a better data management culture. I think about the Cost-Benefits-Risk triangle while I was analyzing the lack of data management culture used in many organizations when combating ransomware.
I get it that Cybersecurity is big business. Even many of the storage guys I know wanted to jump into the cybersecurity bandwagon. Many of the data protection vendors are already mashing their solutions with a cybersecurity twist. That is where the opportunities are, and where the cool kids hang out. I get it.
Cybersecurity technologies are more tangible than data management. I get it when the C-suites like to show off shiny new cybersecurity “toys” because they are allowed to brag. Oh, my company has just implemented security brand XXX, and it’s so cool! They can’t be telling their golf buddies that they have a new data management culture, can they? What’s that?
Fibre ChannelSANs (storage area networks) are touted as more secure than IP-based storage networks. In a way, that is true because Fibre Channel is a distinct network separated from the mainstream client-based applications. Moreover, the Fibre Channel protocol is entirely different from IP, and the deep understanding of the protocol, its implementations are exclusive to a selected cohort of practitioners and professionals in the storage technology industry.
The data landscape has changed significantly compared to the days where FC SANs were dominating the enterprise. The era was the mid 90s and early 2000s. EMC® was king; IBM® Shark was a top-tier predator; NetApp® was just getting over its WAFL™ NAS overdose to jump into Fibre Channel. There were other fishes in the Fibre Channel sea.
But the sands of storage networking have been shifting. Today, data is at the center of the universe. Data is the prized possession of every organization, and has also become the most coveted prize for data thieves, threat actors and other malefactors in the digital world. The Fibre Channel protocol has been changing too, under its revised specifications and implementations through its newer iterations in the past decade. This change in advancement of Fibre Channel as a storage networking protocol is less often mentioned, but nevertheless vital in the shift of the Fibre Channel SANs into a Zero Trust world.
Zones, masks and maps
Many storage practitioners are familiar with the type of security measures employed by Fibre Channel in the yesteryears. And this still rings true in many of the FC SANs that we know of today. For specific devices to connect to each other, from hosts to the storage LUNs (logical unit numbers), FC zoning must be configured. This could be hard zoning or soft zoning, where the concept involves segmentation and the grouping of configured FC ports of both the ends to “see” each other and to communicate, facilitated by the FC switches. These ports are either the initiators or the storage target, each with its own unique WWN (World Wide Name).
On top of zoning, storage practitioners also configure LUN masking at the host side, where only certain assigned LUNs from the storage array is “exposed” to the specific host initiators. In conjunction, at the storage array side, the LUNs are also associated to only a group of host initiators that are allowed to connect to the selected LUNs. This is the LUN mapping part.
The slogan of The Washington Post is “Democracy Dies in Darkness“. Although not everyone agrees with the US brand of democracy, the altruism of WaPo‘s (the publication’s informal name) slogan is a powerful one. The venerable newspaper remains the beacon in the US as one of the most trustworthy sources of truthful, honest information.
4 Horsemen of Apocalypse with the 5th joining
Misinformation has become a clear and present danger to humanity. Fake news, misleading information, lies are fueling and propelling the propaganda and agenda of the powerful (and the deranged). Facts are blurred, obfuscated, and even removed and replaced with misinformation to push for the undesirable effects that will affect the present and future generations.
The work of SNIA®
Data preservation is part of Data Management. More than a decade ago, SNIA® has already set up a technical work group (TWG) on Long Term Retention and proposed a format for long-term storage of digital format. It was called SIRF (Self-contained Information Retention Format). In the words of SNIA®, “The SIRF format enables long-term physical storage, cloud storage and tape-based containers effective and efficient ways to preserve and secure digital information for many decades, even with the ever-changing technology landscape.”
I don’t think battling misinformation was SNIA®’s original intent, but the requirements for a vendor-neutral organization as such to present and promote long term data preservation is more needed than ever. The need to protect the truth is paramount.
SNIA® continues to work with many organizations to create and grow the ecosystem for long term information retention and data preservation.
NFTs can save data
Despite the hullabaloo of NFTs (non-fungible tokens), which is very much soiled and discredited by the present day cryptocurrency speculations, I view data (and metadata) preservation as a strong use case for NFTs. The action is to digitalize data into an NFT asset.
Here are a few arguments:
NFTs are unique. Once they are verified and inserted into the blockchain, they are immutable. They cannot be modified, and each blockchain transaction is created with one never to be replicated hashed value.
NFTs are decentralized. Most of the NFTs we know of today are minted via a decentralized process. This means that the powerful cannot (most of the time), effect the NFTs state according to its whims and fancies. Unless the perpetrators know how to manipulate a Sybil attack on the blockchain.
NFTs are secure. I have to set the knowledge that NFTs in itself is mostly very secure. Most of the high profiled incidents related to NFTs are more of internal authentication vulnerabilities and phishing related to poor security housekeeping and hygiene of the participants.
NFTs represent authenticity. The digital certification of the NFTs as a data asset also define the ownership and the originality as well. The record of provenance is present and accounted for.
Since NFTs started as a technology to prove the assets and artifacts of the creative industry, there are already a few organizations that playing the role. Orygin Art is one that I found intriguing. Museums are also beginning to explore the potential of NFTs including validating and verifying the origins of many historical artifacts, and digitizing these physical assets to preserve its value forever.
The technology behind NFTs are not without its weaknesses as well but knowing what we know today, the potential is evident and power of the technology has yet to be explored fully. It does present a strong case in preserving the integrity of truthful data, and the data as historical artifacts.
Protect data safety and data integrity
Misinformation is damaging. Regardless if we believe the Butterfly Effect or not, misinformation can cause a ripple effect that could turn into a tidal wave. We need to uphold the sanctity of Truth, and continue to protect data safety and data integrity. The world is already damaged, and it will be damaged even more if we allow misinformation to permeate into the fabric of the global societies. We may welcome to a dystopian future, unfortunately.
This blog hopes to shake up the nonchalant state that we view “information” and “misinformation” today. There is a famous quote that said “Repeat a lie often enough and it becomes the truth“. We must lead the call to combat misinformation. What we do now will shape the generations of our present and future. Preserve Truth.
I find it blasphemous that with all the rhetoric of data protection and cybersecurity technologies and solutions in the market today, the ransomware threats and damages have grown proportionately larger each year. In a recent report by Kaspersky on Anti-Ransomware Day May 12th, 9 out of 10 of organizations previously attacked by ransomware are willing to pay again if attacked again. A day before my scheduled talk in Surabaya East Java 2 weeks’ back, the chatter through the grapevine was one bank in Indonesia was attacked by ransomware on that day. These news proved how virulent and dangerous the ransomware scourge is and has become.
And the question that everyone wants an answer to is … why are ransomware threats getting bigger and more harmful and there are no solutions to it?
Digital transformation and its data are very attractive targets
Today, all we hear from the data protection and storage vendors are recovery, restore that data blah, blah, blah and more blah, blah, blahs. The end point EDR (endpoint detection and response) solutions say they can stop it; the cybersecurity experts preach depth in defense; and the network security guys say use perimeter fencing. And the anti-phishing chaps say more awareness and education required. One or all have not worked effectively these few years. Ransomware’s threats and damages are getting worse. Why?
I was in Indonesia last week to meet with iXsystems™‘ partner PT Maha Data Solusi. I had the wonderful opportunity to meet with many people there and one interesting and often-replayed question arose. Why aren’t iX doing software-defined-storage (SDS)? It was a very obvious and deliberate question.
After all, iX is already providing the free use of the open source TrueNAS® CORE software that runs on many x86 systems as an SDS solution and yet commercially, iX sell the TrueNAS® storage appliances.
This argument between a storage appliance model and a storage storage only model has been debated for more than a decade, and it does come into my conversations on and off. I finally want to address this here, with my own views and opinions. And I want to inform that I am open to both models, because as a storage consultant, both have their pros and cons, advantages and disadvantages. Up front I gravitate to the storage appliance model, and here’s why.
My story of the storage appliance begins …
Back in the 90s, most of my work was on Fibre Channel and NFS. iSCSI has not existed yet (iSCSI was ratified in 2003). It was almost exclusively on the Sun Microsystems® enterprise storage with Sun’s software resell of the Veritas® software suite that included the Sun Volume Manager (VxVM), Veritas® Filesystem (VxFS), Veritas® Replication (VxVR) and Veritas® Cluster Server (VCS). I didn’t do much Veritas® NetBackup (NBU) although I was trained at Veritas® in Boston in July 1997 (I remembered that 2 weeks’ trip fondly). It was just over 2 months after Veritas® acquired OpenVision. Backup Plus was the NetBackup.
Between 1998-1999, I spent a lot of time working Sun NFS servers. The prevalent networking speed at that time was 100Mbits/sec. And I remember having this argument with a Sun partner engineer by the name of Wong Teck Seng. Teck Seng was an inquisitive fella (still is) and he was raving about this purpose-built NFS server he knew about and he shared his experience with me. I detracted him, brushing aside his always-on tech orgasm, and did not find great things about a NAS storage appliance. Auspex™ was big then, and I knew of them.
I joined NetApp® as Malaysia’s employee #2. It was an odd few months working with a storage appliance but after a couple of months, I started to understand and appreciate the philosophy. The storage Appliance Model made sense to me, even through these days.
I just got home from the wonderful iXsystems™ Sales Summit in Knoxville, Tennessee. The key highlight was to christian the opening of iXsystems™ Maryville facility, the key operations center that will house iX engineering, support and part of marketing as well. News of this can be found here.
iX datacenter in the new Maryville facility
Western Digital® has always been a big advocate of iX, and at the Summit, they shared their hard disk drives HDD, solid state drives SSD, and other storage platforms roadmaps. I felt like a kid a candy store because I love all these excitements in the disk drive industry. Who says HDDs are going to be usurped by SSDs?
Several other disk drive manufacturers, including Western Digital®, have announced larger capacity drives. Here are some news of each vendor in recent months
Unless you have been living under a rock in the past months, the fervent and loud, but vague debates of web3.0 have been causing quite a scene on the Internet. Those tiny murmurs a few months ago have turned into an avalanche of blares and booms, with both believers and detractors crying out their facts and hyperboles.
Within the web3.0, decentralized storage technologies have been rising to a crescendo. So many new names have come forth into the decentralized storage space, most backed by blockchain and incentivized by cryptocurrencies and is putting the 19th century California Gold Rush to shame.
At present, the decentralized storage market segment is fluid, very vibrant and very volatile. Being the perennial storage guy that I am, I would very much like the decentralized storage to be sustainably successful but first, it has to make sense. Logic must prevail before confidence follows.
Classic “Crossing the Chasm”
To understand this decentralization storage chaos, we must understand where it is now, and where it is going. History never forgets to teach us of the past to be intelligible in the fast approaching future.
Geoffrey Moore’s Crossing the Chasm Technology (Disruption) Adoption Cycle
As a new technology enters the market, the adoption is often fueled by the innovators, the testers, the crazy ones. It progresses and the early adopters set in. Here we get the believers, the fanatics, the cults that push the envelope a bit further, going against the institutions and the conventions. This, which is obvious, describes the early adopter stage of the decentralized storage today.
Like all technologies, it has to go mainstream to be profitable and to get there, its value to the masses must be well defined to be accepted. This is the market segment that decentralized storage must move to, to the early majority stage. But there is a gap, rightly pointed out and well defined by Geoffrey Moore. The “Chasm“. [ Note: To read about why the chasm, read this article ].
So how will decentralized storage cross the chasm to the majority of the market?