Falconstor Software Defined Data Preservation for the Next Generation

Falconstor® Software is gaining momentum. After an arduous climb back to the fore, it is beginning to soar again.

Tape technology and Digital Data Preservation

I mentioned that long-term digital data preservation is a segment within the data lifecycle which has merit and prominence. SNIA® has shown that this is a strongly growing market segment through its 2007 and 2017 “100 Year Archive” surveys. Three critical challenges of this long, long-term digital data preservation are to keep the archives:

  • Accessible
  • Undamaged
  • Usable

For the longest time, tape technology has been the king of the hill for digital data preservation. The technology is cheap and mature, and many enterprises have built their long-term strategy around it. And the pulse of the tape technology market is still very healthy.

The challenges of tape remain. Every 5 years or so, companies have to consider moving the data on their existing tape technology to the next generation. It is widely known that LTO drives (through LTO-7) can read tapes from the previous 2 generations and write to tapes from 1 generation before. The tape transcription process of migrating digital data for the sake of data preservation is problematic because it can affect the structural integrity and the quality of the content of the data.
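The compatibility rule above can be sketched in a few lines of Python. This is only an illustration, assuming the "read two generations back, write one generation back" rule as stated in the paragraph; the function names `lto_compat` and `must_migrate` are made up for this example, and newer drives (LTO-8 onwards) have a narrower read window, which the `read_back` parameter can model.

```python
def lto_compat(drive_gen, read_back=2, write_back=1):
    """Return (readable, writable) cartridge generations for a drive.

    Assumes the rule described in the text: a drive reads cartridges
    up to `read_back` generations older than itself and writes to
    cartridges up to `write_back` generations older.
    """
    readable = list(range(max(1, drive_gen - read_back), drive_gen + 1))
    writable = list(range(max(1, drive_gen - write_back), drive_gen + 1))
    return readable, writable


def must_migrate(tape_gen, drive_gen, read_back=2):
    """True if a cartridge is too old to be read by the given drive."""
    return tape_gen < drive_gen - read_back


readable, writable = lto_compat(7)
print("LTO-7 drive reads:", readable, "writes:", writable)
print("LTO-4 tape needs transcription for LTO-7 drive:", must_migrate(4, 7))
```

Run across an inventory of cartridges, a check like `must_migrate` gives a rough count of how many tapes fall out of the readable window each time the drives are refreshed.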

In my time covering Oil & Gas subsurface data management, I have seen NOCs (national oil companies) with 500,000 tapes of all generations, from 1/2″ to DDS, DAT to SDLT, 3590 to LTO 1-7. Millions are spent to transcribe these tapes every few years, and we have folks like Katalyst DM, Troika and more hovering over this landscape for their fill.

Continue reading

Reap at low tide

[ Note: This article was published on LinkedIn more than 6 months ago. Here is the original link to the article ]

[ Update (Apr 13 2020): Amid the COVID-19 pandemic and restricted movement globally, we can turn our pessimistic outlook into an opportunistic one ]

Nature has a way of teaching us. What works and what doesn’t are often hidden in plain sight, but we humans are mostly too occupied to notice the things that work.

Why are they not spending?

This news appeared in my LinkedIn feed. It read “Malaysian Banks Don’t Spend Enough on Tech“. It irked me immensely because in a soft economic climate (the low tide), our Malaysian financial institutions should be spending more on technology (reaping the opportunity) to get ahead.

Why are the storks and the egrets in my page photo above waiting and wading in the knee-deep waters? Because at low tide, when the waves ebb, food is exposed to them abundantly. They scurry for shrimps, small crabs, cockles, mussels and more. This is nature’s way.

From the report, the technology spending average among the Malaysian banks is pathetic.

The negative domino effect on SMEs

When the banks are not spending on technology, the other industries, especially the SMEs (small and medium enterprises), follow suit. The “penny pinching” and “tightening purse strings” effect permeates across industries, slowly and surely sending tech spending into a volatile downward spin-cycle.

From a macro-economic point of view, spending slows down. Buying less means lower demand, which in turn lowers supply, and the cycle rolls on. The law of demand and supply just got dumped into an abyss.

A great opportunity for those who see it

When I was an engineer at Sun Microsystems more than 2 decades ago, I read a comment delivered by one of the executives. It said “When times are bad, those who know will get the best parts“. I took his comment to heart because what he said held true, even until today.

This is the best time, when the country is experiencing an economic downturn. When the competitors are holding back and may be reeling from the negative effects of the economy, the banks are in the best position to grab the best deals. This is the time to gain market share, when the competition is holding back for fear that the economy will become softer.

Furthermore, with the low interest rates across the board, there is no better time than the present to step up the tech spending. Banks should know this very well but I am perplexed.

That is why the Malaysian banks must kick-start their tech spending campaign now. The SMEs will follow, overturning the downturn with demand for the best “parts”. The supply “factories” will be fired up again, leading to positive growth in the economy.

Bank Negara RMiT is that one opportunity

One thing which has been looming is the RMiT (Risk Management in Technology) framework from Bank Negara, Malaysia’s central bank. A new version was released in July 2019 and, to me as an outsider, it is a great opportunity to grab the best parts. Some of these standards will come into effect in January 2020.

Bank Negara is strongly encouraging banks to improve the security of, and the confidence in, the country’s financial industry, and the RMiT framework is really a prod to increase tech spending. Unfortunately, in some of my business interactions with a few of the banks, foot-dragging is prevalent.

Nature’s lesson

The best time to have your best pick is at low tide. This is nature’s lesson for us. What are we waiting for?

StorageGRID gets gritty

[ Disclosure: I was invited by GestaltIT as a delegate to their Storage Field Day 19 event from Jan 22-24, 2020 in Silicon Valley, USA. My expenses, travel, accommodation and conference fees were covered by GestaltIT, the organizer, and I was not obligated to blog or promote the vendors’ technologies presented at the event. The content of this blog reflects my own opinions and views ]

NetApp® presented StorageGRID® Webscale (SGWS) at Storage Field Day 19 last month. It was timely because the general purpose object storage market, in my humble opinion, was becoming disillusioned and was close to depriving itself of the value it was supposed to deliver.

“Cheap and deep”, “Race to Zero” were some of the less storied calls I have come across when discussing object storage, and they were really devaluing the merits of object storage as vendors touted their superficial glory of being in the IDC MarketScape for Object-based Storage 2019.

Almost every single conversation I had in the past 3 years was either explaining what object storage is or “That is cheap storage, right?”

Continue reading

DellEMC Project Nautilus Re-imagine Storage for Streams

Cloud computing will have challenges processing data at the outer reach of its tentacles. Edge Computing, as it melds with the Internet of Things (IoT), needs a different approach to data processing and data storage. Data generated at source has to be processed at source, to respond to the event or events which have happened. Cloud Computing, even with 5G networks, has latency that is too high for an autonomous vehicle to react to pedestrians on the road at speed, for a sprinkler system to activate in a fire, or for a fraud detection system to flag money laundering activities as they occur.

Furthermore, not all sensors, devices, and IoT end-points are connected to the cloud at all times. To understand this new way of data processing and data storage, have a look at this video by Jay Kreps, CEO of Confluent for Kafka® to view this new perspective.

Data is continuously and infinitely generated at source, and this data has to be compiled, controlled and consolidated with nanosecond precision. At Storage Field Day 19, an interesting open source project, Pravega, was introduced to the delegates by DellEMC. Pravega is an open source storage framework for streaming data and is part of Project Nautilus.

Rise of streaming time series data

Processing data at source has a lot of advantages and this has popularized Time Series analytics. Many time series and streams-based databases such as InfluxDB, TimescaleDB, OpenTSDB have sprouted over the years, along with open source projects such as Apache Kafka®, Apache Flink and Apache Druid.

The data generated at source (end-points, sensors, devices) is serialized, timestamped (as events occur), continuous and infinite. These are the properties of a time series data stream, and to make sense of the streaming data, newer data formats such as Avro, Parquet and ORC pepper the landscape along with the more mature JSON and XML, each with its own strengths and weaknesses.
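To make those four properties concrete, here is a minimal Python sketch of an event source. It is an illustration only: the sensor name `pump-1`, the field names and the `sensor_stream` function are invented for this example, and JSON is used simply because it needs no extra libraries, not because it is the best format for streams.

```python
import itertools
import json
import time


def sensor_stream(sensor_id, read_value):
    """Yield an endless sequence of serialized, timestamped events.

    itertools.count() never terminates, which mirrors the "continuous
    and infinite" nature of a stream; the consumer decides how much
    of it to read.
    """
    for seq in itertools.count():
        event = {
            "sensor": sensor_id,
            "seq": seq,                # ordering within the stream
            "ts_ns": time.time_ns(),   # timestamped as the event occurs
            "value": read_value(seq),
        }
        # Serialized to bytes, as it would be before leaving the device.
        yield json.dumps(event).encode("utf-8")


# Consume only the first three events of the unbounded stream.
stream = sensor_stream("pump-1", lambda n: 20.0 + n)
first_three = [json.loads(raw) for raw in itertools.islice(stream, 3)]
for event in first_three:
    print(event)
```

The generator has no end, so storage behind such a source has to accept appends indefinitely while readers take bounded slices of it.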

You can learn more about these data formats in the 2 links below:

DIY is difficult

Many time series projects started as DIY projects in many organizations, and many of them are still DIY projects in production systems as well. They depend on tribal knowledge, and these databases are tied to unmanaged storage which is not congruent with the properties of streaming data.

At the storage end, the technologies today still rely on the SAN and NAS protocols, and in recent years, S3, with object storage. Block, file and object storage introduce layers of abstraction which may not be a good fit for streaming data.

Continue reading

Komprise is a Winner

I, for one, have perhaps seen far too many “file lifecycle and data management” software solutions that involved tiering, hierarchical storage management, ILM or whatever you call them these days. If I do a count, I would have managed or implemented at least 5 to 6 products, including a home-grown one.

The whole thing is a very crowded market and I have seen many which have come and gone, and so when the opportunity to have a session with Komprise came at Storage Field Day 19, I did not carry a lot of enthusiasm.

Continue reading

Open Source and Open Standards open the Future

Western Digital dived into Storage Field Day 19 in full force, as they did in Storage Field Day 18. They delivered a series of high-impact presentations, each curated for the diverse requirements of the audience. Several open source initiatives were shared, all built on open standards to address present inefficiencies, and designed and developed for a greater future.

Zoned Storage

One of the initiatives aims to increase the efficiencies around SMR and SSD zoning capabilities and to remove the complexities and overlaps of both media. This is the Zoned Storage initiative, a technical working proposal to the existing NVMe standards. The outcome will give applications in user space more control over the placement of data blocks on zone-aware devices and zoned SSDs, collectively known as Zoned Block Devices (ZBD). The implementation in the Linux user and kernel space is shown below:

Continue reading

Hadoop is truly dead – LOTR version

I had not planned to write this blog. But a string of events happened in the Storage Field Day 19 week, and I now have the fodder to share my thoughts. Hadoop is indeed dead.

Warning: There are Lord of the Rings references in this blog. You might want to do some research. 😉

Storage metrics never happened

The fellowship of Arjan Timmerman, Keiran Shelden, Brian Gold (Pure Storage) and myself started at the office of Pure Storage in downtown Mountain View, much like Frodo Baggins, Samwise Gamgee, Peregrin Took and Meriadoc Brandybuck forging their journey vows at Rivendell. The podcast was supposed to be on the topic of storage metrics but was unanimously swung to talk about Hadoop under the stewardship of Mr. Stephen Foskett, our host of Tech Field Day. I saw Stephen as Elrond Half-elven, the Lord of Rivendell, moderating the podcast as he would have moderated the plans for destroying the One Ring in Mount Doom.

So there we were talking about Hadoop, or maybe Sauron, or both.

The photo of the Oliphaunt below seemed apt to describe the industry attacks on Hadoop.

Continue reading

AI needs data we can trust

[ Note: This article was published on LinkedIn on Jan 21st 2020. Here is the link to the original article ]

In 2020, the intensity around the topic of Artificial Intelligence will escalate further.

One piece of news that came out last week terrified me. The Sarawak courts want to apply Artificial Intelligence to mete out judgment and punishment, perhaps on a small scale.

Continue reading

Is General Purpose Object Storage disenfranchised?

[Disclosure: I am invited by GestaltIT as a delegate to their Storage Field Day 19 event from Jan 22-24, 2020 in the Silicon Valley USA. My expenses, travel, accommodation and conference fees will be covered by GestaltIT, the organizer and I am not obligated to blog or promote the vendors’ technologies to be presented at this event. The content of this blog is of my own opinions and views]

This is NOT an advertisement for coloured balls.

This is the license to brag for the vendors in the next 2 weeks or so, as we approach the 2020 new year. This, of course, is the latest 2019 IDC MarketScape for Object-based Storage, released last week.

My object storage mentions

I have written extensively about Object Storage since 2011. With different angles and perspectives, here are some of them:

Continue reading