Processing data has become more expensive.
Somewhere along the way, a misconception took hold that data processing is cheap. It stems from the well-known public cloud storage prices of a fraction of a cent per gigabyte per month. But data in storage has to be worked on; it has to be built up and protected to increase its value. Data has to be processed, moved, shared, and used by applications. Data induces workloads. Nobody keeps data stored forever, never to be used again. Nobody buys storage for capacity alone.
We have a great saying in the industry: no matter where the data moves, it will land in storage. So it is clear that data does not exist in the ether. And yet, I often see how little attention, prudence, and care go into data infrastructure and data management technologies, the very components that are foundational to great data.
AI is driving up costs in data processing
A few recent articles drew my attention to the cost of data processing.
Here is one posted by a friend on Facebook. It is titled “The world is running out of data to feed AI, experts warn.”
My first reaction was, “How can we run out of data?” We have so much data in the world today that the 175 zettabytes predicted by IDC for 2025 might be grossly inaccurate. According to Exploding Topics, we create an estimated 328.77 million terabytes of data per day, or about 120 zettabytes per year. While I cannot vouch for the accuracy of these numbers, they are humongous.
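As a quick sanity check on those figures (a back-of-the-envelope sketch, assuming the widely cited Exploding Topics daily figure of 328.77 million terabytes and decimal units where one zettabyte equals a billion terabytes), the per-day and per-year numbers are consistent:

```python
# Back-of-the-envelope check: does ~328.77 million TB/day line up
# with ~120 zettabytes/year?
tb_per_day = 328.77e6     # 328.77 million terabytes created per day (estimate)
tb_per_zb = 1e9           # decimal units: 1 ZB = 1,000,000,000 TB
zb_per_year = tb_per_day * 365 / tb_per_zb
print(f"{zb_per_year:.1f} ZB/year")  # prints 120.0 ZB/year
```

So the two estimates are the same figure expressed at different timescales, which is why they tend to appear together.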