Somewhere, there is a misconception that data processing is cheap. That stems from the well-known pricings of the capacities of public cloud storage that are a fraction of cents per month. But data in storage has to be worked upon, and has to be built up and protected to increase its value. Data has to be processed, moved, shared, and used by applications. Data induce workloads. Nobody keeps data stored forever and never be used again. Nobody buys storage just for capacity alone.
We have a great saying in the industry. No matter, where the data moves, it will land in a storage. So, it is clear that data does not exist in ether. And yet, I often see how little attention and prudence and care, when it comes to data infrastructure and data management technologies, the very components that are foundational to great data.
Great data management for Great AI
AI is driving up costs in data processing
A few recent articles drew my focus into the cost of data processing.
I was listening to several storage luminaries in the GestaltIT’s podcast “No one understands Storage anymore” a few of weeks ago. Around the minute of 11.09 in the podcast, Dr. J. Metz, SNIA® Chair, brought up this is powerful quote “Storage does not mean Capacity“. It struck me, not in a funny way. It is what it is, and it something I wanted to say to many who do not understand the storage solutions they are purchasing. It exemplifies what is wrong in the many organizations today in their understanding of investing in a storage infrastructure project.
This is my pet peeve. The first words uttered in most, if not all storage requirements in my line of work are, “I want this many Terabytes of storage“. There are no other details and context of what the other requirement factors are, such as availability, performance, future growth, etc. Or even the goals to achieve when purchasing a storage system and operating it. What is the improvement they are looking for?What are the problems to solve?
Where is the OKR?
It pains me to say this. For the folks who have in the IT industry for years, both end users and IT purveyors alike, most are absolutely clueless about OKR (Objectives and Key Results) for their storage infrastructure project. Many cannot frame the data challenges they are facing, and they have no idea where to go next. There is no alignment. There is no strategy. Even worse, there is no concept of how their storage infrastructure investments will improve their business and operations.
Just the other day, one company director from a renown IT integrator here in Malaysia came calling. He has been in the IT industry since 1989 (I checked his Linkedin profile), asking to for a 100TB storage quote. I asked a few questions about availability, performance, scalability; the usual questions a regular IT guy would ask. He has no idea, and instead of telling me he didn’t know, he gave me a runaround of this and that. Plenty of yada, yada nonsense.
In the end, I told him to buy a consumer grade storage appliance from Taiwan. I will just let him make a fool of himself in front of his customer since he didn’t want to take accountability of ensuring his customer get a proper enterprise storage solution in good faith. His customer is probably in the same mould as well.
Defensive Strategies as Data Foundations
A strong storage infrastructure foundation is vital for good Data Credibility. If you do the right things for your data, there is Data Value, and it will serve your business well. Both Data Credibility and Data Value create confidence. And Confidence equates Trust.
In order to create the defensive strategies let’s look at storage Availability, Protection, Accessibility, Management Security and Compliance. These are 6 of the 8 data points of the A.P.P.A.R.M.S.C. framework.
Offensive Strategies as Competitive Advantage
Once we have achieved stability of the storage infrastructure foundation, then we can turn over and drive towards storage Performance, Recovery, plus things like Scalability and Agility.
With a strong data infrastructure foundation, the organization can embark on the offensive, and begin their business transformation journey, knowing that their data is well run, protection, and performs.
Alignment with Data and Business Goals
Why are the defensive and offensive strategies requiring alignment to business goals?
The fact is simple. It is about improving the business and operations, and setting OKRs is key to measure the ROI (return of investment) of getting the storage systems and the solutions in place. It is about switching the cost-fearing (negative) mindset to a profit-conviction (positive) mindset.
For example, maybe the availability of the data to the business is poor. Maybe there is the need to have access to the data 24×7, because the business is going online. The simple measurable fact is we can move availability from 95% uptime to 99.99% uptime with an HA storage system.
Perhaps there are concerns about recoverability in the deluge of ransomware threats. Setting new RPO goals from 24 hours to 4 hours is a measurable objective to enhance data resiliency.
Or getting the storage systems to deliver higher performance from 350 IOPS to 5000 IOPS for the database.
What I am saying here is these data points are measurable, and they can serve as checkpoints for business and operational improvements. From a management perspective, these can be used as KPI (key performance index) to define continuous improvement of Data Confidence.
Furthermore, it is easy when a OKR dashboard is used to map the improvement markers when organizations use storage to move from point A to point B, where B equates to a new success milestone. The alignment sets the paths to the business targets.
Storage does not mean only Capacity.
The sad part is what the OKRs and the measured goals alignments are glaringly missing in the minds of many organizations purchasing a storage infrastructure and data management solution. The people tasked to source a storage technology solution are not placing a set of goals and objectives. Capacity appears to be the only thing on their mind.
I am about to meet a procurement officer of a customer soon. She asked me this question “Why is your storage so expensive?” over email. I want to change her mindset, just like the many officers and C-levels who hold the purse strings.
Let’s frame the use storage infrastructure in the real world. Nobody buys a storage system just to keep data in there much like a puddle keeps stagnant water. Sooner or later the value of the data in the storage evaporates or the value becomes dull if the data is not used well in any ways, shape or form.
Storage systems and the interconnected pathways from on premises, to the next premises, to the edge and to the clouds serve the greater good for Data. Data is used, shared, shaped, improved, enhanced, protected, moved, and more to deliver Value to the Business.
Storage capacity is just one of the few factors to consider when investing in a storage infrastructure solution. In fact, capacity is probably the least important piece when considering a storage solution to achieve the company’s OKRs. If we think about it deeper, setting the foundation for Data in the defensive manner will help elevate value of the data to be promoted with the offensive strategies to gain the competitive advantage.
Storage infrastructure and storage solutions along with data management platforms may appear to be a cost to the annual budgets. If you know set the OKRs, define A to get to B, alignment the goals, storage infrastructure and the data management platforms and practices are investments that are worth their weight in gold. That is my guarantee.
On the flip side, ignoring and avoiding OKRs, and set the strategies without prudence will yield its comeuppance. Technical debts will prevail.
There was a Super Blue Moon a few days ago. It was a rare sky show. Friends of mine who are photo and moon gazing enthusiasts were showing off their digital captures online. One ignorant friend, who was probably a bit envious of the other people’s attention, quipped that his Oppo Reno 10 Pro Plus can take better pictures. Oppo Reno 10 Pro Plus claims 3x optical zoom and 120x digital zoom. Yes, 120 times!
Yesterday, a WIRED article came out titled “How Much Detail of the Moon Can Your Smartphone Really Capture?” It was a very technical article. I thought the author did an excellent job explaining the physics behind his notes. But I also found the article funny, flippant even, when I juxtaposed this WIRED article to what my envious friend was saying the other day about his phone’s camera.
Super Blue Moon 2023
Open Source storage expectations and outcomes
I work for iXsystems™. Open Source has been its DNA for over 30 years. Similarly, I have also worked on Open Source (decades before it was called open source) in my home labs ever since I entered the industry. I had SoftLanding Linux System 3.5″ diskette (Linux kernel 0.99), and I bought a boxed set of FreeBSD OS from Walnut Creek (photo below). My motivation was to learn as much as possible about information technology world because I was making my first steps into building my career (I was also quietly trying to prove my father wrong) in the IT industry.
FreeBSD Boxed Set (circa 1993)
Open source has democratized technology. It has placed the power of very innovative technology into the hands of the common people With Open Source, I see the IT landscape changing as well, especially for home labers like myself in the early years. Social media platforms, FAANG (Facebook, Apple, Amazon, Netflix, Google), etc, etc, have amplified that power (to the people). But with that great power, comes great responsibility. And some users with little technology background start to have hallucinated expectations and outcomes. Just like my friend with the “powerful” Oppo phone.
Likewise, in my world, I have plenty of anecdotes of these types of open source storage users having wild expectations, but little skills to exact the reality.
On the road, seat belt saves lives. So does the motorcycle helmet. But these 2 technologies alone are probably not well received and well applied daily unless there is a strong ecosystem and culture about road safety. For decades, there have been constant and unrelenting efforts to enforce the habits of putting on the seat belt or the helmet. Statistics have shown they reduce road fatalities, but like I said, it is the safety culture that made all this happen.
On the digital front, the ransomware threats are unabated. In fact, despite organizations (and individuals), both large and small, being more aware of cyber-hygiene practices more than ever, the magnitude of ransomware attacks has multiplied. Threat actors still see weaknesses and gaps, and vulnerabilities in the digital realms, and thus, these are lucrative ventures that compliment the endeavours.
Time to look at Data Management
The Cost-Benefits-Risks Conundrum of Data Management
And I have said this before in the past. At a recent speaking engagement, I brought it up again. I said that ransomware is not a cybersecurity problem. Ransomware is a data management problem. I got blank stares from the crowd.
I get it. It is hard to convince people and companies to embrace a better data management culture. I think about the Cost-Benefits-Risk triangle while I was analyzing the lack of data management culture used in many organizations when combating ransomware.
I get it that Cybersecurity is big business. Even many of the storage guys I know wanted to jump into the cybersecurity bandwagon. Many of the data protection vendors are already mashing their solutions with a cybersecurity twist. That is where the opportunities are, and where the cool kids hang out. I get it.
Cybersecurity technologies are more tangible than data management. I get it when the C-suites like to show off shiny new cybersecurity “toys” because they are allowed to brag. Oh, my company has just implemented security brand XXX, and it’s so cool! They can’t be telling their golf buddies that they have a new data management culture, can they? What’s that?
Ho hum. Another day, and another data leak. What else is new?
The latest hullabaloo in my radar was from one of Malaysia’s reverent universities, UiTM, which reported a data leak of 11,891 student applicants’ private details including MyKad (national identity card) numbers of each individual. Reading from the news article, one can deduced that the unsecured link mentioned was probably from a cloud storage service, i.e. file synchronization software such as OneDrive, Google Drive, Dropbox, etc. Those files that can be easily shared via an HTTP/S URL link. Ah, convenience over the data security best practices.
Cloud File Sync software
It irks me when data security practices are poorly practised. And it is likely that there is ignorance of data security practices in the first place.
It also irks me when many end users everywhere I have encountered tell me their file synchronization software is backup. That is just a very poor excuse of a data protection strategy, if any, especially in enterprise and cloud environments. Convenience, set-and-forget mentality. Out of sight. Out of mind. Right?
Convenience is not data security. File Sync is NOT Backup
Many users are used to the convenience of file synchronization. The proliferation of cloud storage services with free Gigabytes here and there have created an IT segment based on BYOD, which transformed into EFSS, and now CCP. The buzzword salad involves the Bring-Your-Own-Device, which evolved into Enterprise-File-Sync-&-Share, and in these later years, Content-Collaboration-Platform.
All these are fine and good. The data industry is growing up, and many are leveraging the power of file synchronization technologies, be it on on-premises and from cloud storage services. Organizations, large and small, are able to use these file synchronization platforms to enhance their businesses and digitally transforming their operational efficiencies and practices. But what is sorely missing in embracing the convenience and simplicity is the much ignored cybersecurity housekeeping practices that should be keeping our files and data safe.
Fibre ChannelSANs (storage area networks) are touted as more secure than IP-based storage networks. In a way, that is true because Fibre Channel is a distinct network separated from the mainstream client-based applications. Moreover, the Fibre Channel protocol is entirely different from IP, and the deep understanding of the protocol, its implementations are exclusive to a selected cohort of practitioners and professionals in the storage technology industry.
The data landscape has changed significantly compared to the days where FC SANs were dominating the enterprise. The era was the mid 90s and early 2000s. EMC® was king; IBM® Shark was a top-tier predator; NetApp® was just getting over its WAFL™ NAS overdose to jump into Fibre Channel. There were other fishes in the Fibre Channel sea.
But the sands of storage networking have been shifting. Today, data is at the center of the universe. Data is the prized possession of every organization, and has also become the most coveted prize for data thieves, threat actors and other malefactors in the digital world. The Fibre Channel protocol has been changing too, under its revised specifications and implementations through its newer iterations in the past decade. This change in advancement of Fibre Channel as a storage networking protocol is less often mentioned, but nevertheless vital in the shift of the Fibre Channel SANs into a Zero Trust world.
Zones, masks and maps
Many storage practitioners are familiar with the type of security measures employed by Fibre Channel in the yesteryears. And this still rings true in many of the FC SANs that we know of today. For specific devices to connect to each other, from hosts to the storage LUNs (logical unit numbers), FC zoning must be configured. This could be hard zoning or soft zoning, where the concept involves segmentation and the grouping of configured FC ports of both the ends to “see” each other and to communicate, facilitated by the FC switches. These ports are either the initiators or the storage target, each with its own unique WWN (World Wide Name).
On top of zoning, storage practitioners also configure LUN masking at the host side, where only certain assigned LUNs from the storage array is “exposed” to the specific host initiators. In conjunction, at the storage array side, the LUNs are also associated to only a group of host initiators that are allowed to connect to the selected LUNs. This is the LUN mapping part.
The deluge of data is astounding. We get bombarded and attacked by data every single waking minute of our day. And it will get even worse. Our senses will be numbed into submission. In the end, I ask in the sense of it all. Do we need this much information force fed to us at every second of our lives?
We have heard about the societies a decade ago living in the Information Age and now, we have touted the Social Age. TikTok, Youtube, Twitter, Spotify, Facebook, Metaverse(s) and so many more are creating societies that are defined by data, controlled by data and governed by data. Data can be gathered so easily now that it is hard to make sense of what is relevant or what is useful. Even worse, private data, information about the individual is out there either roaming without any security guarding it, or sold like a gutted fish in the market. The bigger “whales” are peddled to the highest bidder. So, to the prudent human being, what will it be?
Whatever the ages we are in, Information or Social, does not matter anymore. Data is used to feed the masses; Data is used to influence the population; Data is the universal tool to shape the societies, droning into submission and ruling them to oblivion.
GIGO the TikTok edition
GIGO is Garbage In Garbage Out. It is an age old adage to folks who have worked with data and storage for a long time. You put in garbage data, you get garbage output results. And if you repeat the garbage in enough times, you would have created a long lasting garbage world. So, imagine now that the data is the garbage that is fed into the targeted society. What will happen next is very obvious. A garbage society.
I find the terminology of WORM (Write Once Read Many) coming back into the IT speak in recent years. In the era of rip and burn, WORM was a natural thing where many of us “youngsters” used to copy files to a blank CD or DVD. I got know about how WORM worked when I learned that the laser in the CD burning process alters the chemical compound in a segment on the plastic disc of the CD, rendering the “burned” segment unwritable once it was written but it could be read many times.
At the enterprise level, I got to know about WORM while working with tape drives and tape libraries in the mid-90s. The objective of WORM is to save and archive the data and files in a non-rewritable format for compliance reasons. And it was the data compliance and data protection parts that got me interested into data management. WORM is a big deal in many heavily regulated industries such as finance and banking, insurance, oil and gas, transportation and more.
Obviously things have changed. WORM, while very much alive in the ageless tape industry, has another up-and-coming medium in Object Storage. The new generation of data infrastructure and data management specialists are starting to take notice.
Worm Storage – Image from Hubstor (https://www.hubstor.net/blog/write-read-many-worm-compliant-storage/)
I take this opportunity to take MinIOobject storage for a spin in creating WORM buckets which can be easily architected as data compliance repositories with many applications across regulated industries. Here are some relevant steps.
Public cloud storage services has been a boon for over a decade since Dropbox® entered the scene in 2008. BYOD (bring your own devices) became a constant in every IT person’s lips at that time. And with the teaser of 2GB or more, many still rely on these public cloud storage services with the ability to sync with tablets, smart phones and laptops. But the proliferation of these services also propagated many cybersecurity risks, and yes, ransomware can infect these public cloud storage. Even more noxious, the synchronization of files and folders of these services with on-premises storage devices makes it easy for infected data to spread, often with great efficacy.
Banning these widely available cloud storage applications is more than an inconvenience. Governments like China and India are shoring up their battlegrounds, as the battle for the protection and the privacy of sovereign data will not only escalate but also create a domino effect in the geopolitical dominance in the digital landscape.
This is not the war of 2 of the most populous nations of the world. Underneath these acts, there are more things to come, and it won’t just involve China and India. We will see other nations follow, with some already in the works to draw boundaries and demarcate digital borders in the name of data security, privacy, sovereignty and protection.
I hear of some foreign vendors lamenting about such a move. Most have already either complied with China’s laws or chose to exit that market. This recent move by India may feel like a storm in a teacup, but beneath it all, the undercurrent is getting stronger each day. A digital geopolitical tempest is percolating and brewing.
I find it blasphemous that with all the rhetoric of data protection and cybersecurity technologies and solutions in the market today, the ransomware threats and damages have grown proportionately larger each year. In a recent report by Kaspersky on Anti-Ransomware Day May 12th, 9 out of 10 of organizations previously attacked by ransomware are willing to pay again if attacked again. A day before my scheduled talk in Surabaya East Java 2 weeks’ back, the chatter through the grapevine was one bank in Indonesia was attacked by ransomware on that day. These news proved how virulent and dangerous the ransomware scourge is and has become.
And the question that everyone wants an answer to is … why are ransomware threats getting bigger and more harmful and there are no solutions to it?
Digital transformation and its data are very attractive targets
Today, all we hear from the data protection and storage vendors are recovery, restore that data blah, blah, blah and more blah, blah, blahs. The end point EDR (endpoint detection and response) solutions say they can stop it; the cybersecurity experts preach depth in defense; and the network security guys say use perimeter fencing. And the anti-phishing chaps say more awareness and education required. One or all have not worked effectively these few years. Ransomware’s threats and damages are getting worse. Why?