Hadoop is truly dead – LOTR version

[Disclosure: I was invited by GestaltIT as a delegate to their Storage Field Day 19 event from Jan 22-24, 2020 in the Silicon Valley USA. My expenses, travel, accommodation and conference fees were covered by GestaltIT, the organizer and I was not obligated to blog or promote the vendors’ technologies to be presented at this event. The content of this blog is of my own opinions and views]

This blog was not intended because it was not in my plans to write it. But a string of events happened in the Storage Field Day 19 week and I have the fodder to share my thoughts. Hadoop is indeed dead.

Warning: There are Lord of the Rings references in this blog. You might want to do some research. 😉

Storage metrics never happened

The fellowship of Arjan Timmerman, Keiran Shelden, Brian Gold (Pure Storage) and myself started at the office of Pure Storage in downtown Mountain View, much like Frodo Baggins, Samwise Gamgee, Peregrine Took and Meriadoc Brandybuck forging their journey vows at Rivendell. The podcast was supposed to be on the topic of storage metrics but was unanimously swung to talk about Hadoop under the stewardship of Mr. Stephen Foskett, our host of Tech Field Day. I saw Stephen as Elrond Half-elven, the Lord of Rivendell, moderating the podcast as he would have in the plans of decimating the One Ring in Mount Doom.

So there we were talking about Hadoop, or maybe Sauron, or both.

The photo of the Oliphaunt below seemed apt to describe the industry attacks on Hadoop.

Continue reading

AI needs data we can trust

[ Note: This article was published on LinkedIn on Jan 21th 2020. Here is the link to the original article ]

In 2020, the intensity on the topic of Artificial Intelligence will further escalate.

One news which came out last week terrified me. The Sarawak courts want to apply Artificial Intelligence to mete judgment and punishment, perhaps on a small scale.

Continue reading

Zoned Technologies with Western Digital

[Disclosure: I am invited by GestaltIT as a delegate to their Storage Field Day 19 event from Jan 22-24, 2020 in the Silicon Valley USA. My expenses, travel, accommodation and conference fees will be covered by GestaltIT, the organizer and I am not obligated to blog or promote the vendors’ technologies to be presented at this event. The content of this blog is of my own opinions and views]

Storage Field Day 19 is a week away. And one of the vendors presenting is Western Digital, who also presented at Storage Field Day 18 almost a year ago. Here is my blog where I received the full force of Western Digital. In that 10 months or so, Western Digital has sold off their IntelliFlash assets to Data Direct Networks and leaving their ActiveScale object storage platform in limbo.

What is in store from Western D?

I am eager to find out what coming from Western Digital. They have tons of storage technologies that I have yet to encounter, and this anticipation is keeping me excited for the Western D session at Storage Field Day 19.

For a few years I have been keen on a few Western D’s technologies which were moving up the value chain. They are:

In my patch, the signals of the 3 Western D’s technologies have gone weak in the past year. However, there is a lot of momentum right now for Zoned Storage and Zoned Name Space and I believe this could be what is in store for the storage propeller heads like us at Storage Field Day 19.

Continue reading

NAS is the next Ransomware goldmine

I get an email like this almost every day:

It is from one of my FreeNAS customers daily security run logs, emailed to our support@katanalogic.com alias. It is attempting a brute force attack trying to crack the authentication barrier via the exposed SSH port.

Just days after the installation was completed months ago, a bot has been doing IP port scans on our system, and found the SSH port open. (We used it for remote support). It has been trying every since, and we have been observing the source IP addresses.

The new Ransomware attack vector

This is not surprising to me. Ransomware has become more sophisticated and more damaging than ever because the monetary returns from the ransomware are far more effective and lucrative than other cybersecurity threats so far. And the easiest preys are the weakest link in the People, Process and Technology chain. Phishing breaches through social engineering, emails are the most common attack vectors, but there are vhishing (via voicemail) and smshing (via SMS) out there too. Of course, we do not discount other attack vectors such as mal-advertising sites, or exploits and so on. Anything to deliver the ransomware payload.

The new attack vector via NAS (Network Attached Storage) and it is easy to understand why.

Continue reading

Is General Purpose Object Storage disenfranchised?

[Disclosure: I am invited by GestaltIT as a delegate to their Storage Field Day 19 event from Jan 22-24, 2020 in the Silicon Valley USA. My expenses, travel, accommodation and conference fees will be covered by GestaltIT, the organizer and I am not obligated to blog or promote the vendors’ technologies to be presented at this event. The content of this blog is of my own opinions and views]

This is NOT an advertisement for coloured balls.

This is the license to brag for the vendors in the next 2 weeks or so, as we approach the 2020 new year. This, of course, is the latest 2019 IDC Marketscape for Object-based Storage, released last week.

My object storage mentions

I have written extensively about Object Storage since 2011. With different angles and perspectives, here are some of them:

Continue reading

Green Storage? Meh!

Something triggered my thoughts a few days ago. A few of us got together talking about climate change and a friend asked how green was the datacenter in IT. With cloud computing booming, I would say that green computing isn’t really the hottest thing at present. That in turn, leads us to one of the most voracious energy beasts in the datacenter, storage. Where is green storage in the equation?

What is green?

Over the past decade, several storage related technologies were touted as more energy efficient. These include

  • Tape – when tapes are offline, they do not consume power and do not require cooling
  • Virtualization – Virtualization reduces the number of servers and desktops, and of course storage too
  • MAID (Massive Array of Independent Disks) – the arrays spin down the HDDs if idle for a period of time
  • SSD (Solid State Drives) – Compared to HDDs, SSDs consume much less power, and overall reduce the cooling needs
  • Data Footprint Reduction – Deduplication, compression and other technologies to reduce copies of data
  • SMR (Shingled Magnetic Recording) Drives – Higher areal density means less drives but limited by physics.

The largest gorilla in storage technology

HDDs still dominate the market and they are the biggest producers of heat and vibration in a storage array, along with the redundant power supplies and fans. Until and unless SSDs dominate, we have to live with the fact that storage disk drives are not green. The statistics from Statistica below forecasts that in 2021, the shipment of SSDs will surpass HDDs.

Today the areal density of HDDs have increased. With SMR (shingled magnetic recording), the areal density jumped about 25% more than the 1Tb/inch (Terabit per inch) in the CMR (conventional magnetic recording) drives. The largest SMR in the market today is 16TB from Seagate with 18TB SMR in the horizon. That capacity is going to grow significantly when EAMR (energy assisted magnetic recording) – which counts heat assisted and microwave assisted – drives enter the market next year. The areal density will grow to 1.6Tb/inch with a roadmap to 4.0Tb/inch. Continue reading

ZFS Replication and Recovery with FreeNAS

We get requests to recover data from a secondary platform all the time. RPO (recovery point objective) of 30 minutes can be challenging to small to medium sized companies, especially if there is an SLA (service level agreement) to meet.

This week, my team and I took some time to create a FreeNAS replication demo for a potential client. I thought I document the whole thing about ZFS replication, the key steps to set it up and show how recovery is done.

ZFS Snapshots

ZFS replication relies on periodic ZFS snapshots. ZFS snapshot is an inherent feature from the ZFS file system, and often used as a point-in-time copy of the existing ZFS file system tree in memory. Once a snapshot has been triggered, either manually or on schedule (periodic), the file system tree and its metadata in the memory are committed to disk to ensure an updated and consistent state of the file system at all times.

To start, a running snapshot policy on a schedule must be in place. This snapshot policy can be on a specific dataset or zvol, or even the entire zpool. Yeah, I am using quite a few ZFS terminology here – zpool, zvol, dataset. You can read more about each of the structures and more here.

Once the ZFS replication task has been setup, every snapshot occurred in the snapshot policy is automatically duplicated and copied to the target ZFS dataset. Usually, the target ZFS dataset is on a secondary FreeNAS storage server, serving as a disaster recovery platform. Sending and receiving data in the snapshots rely on SSH service.

This is the network diagram explaining the FreeNAS ZFS replication setup.

Continue reading

Veaam to boost Cloud Data Management

Cloud Data Management is a tricky word. Often vague, ambigious, how exactly would you define “Cloud Data Management“?

Fresh off the boat from Commvault GO 2019 in Denver, Colorado last week, I was invited to sample Veeam a few days ago at their Solution Day and soak into their rocketing sales in Asia Pacific, and strong market growth too. They reported their Q3 numbers this week, impressing many including yours truly.

I went to the seminar early in the morning, quite in awe of their vibrant partners and resellers activities and ecosystem compared to the tepid Commvault efforts in Malaysia over the past decade. Veeam’s presence in Malaysia is shorter than Commvault’s but they are able to garner a stronger following with partners and customers alike.

Continue reading