Dell EMC Isilon is an Emmy winner!

[ Disclosure: I was invited by GestaltIT as a delegate to their Storage Field Day 19 event from Jan 22-24, 2020 in the Silicon Valley USA. My expenses, travel, accommodation and conference fees were covered by GestaltIT, the organizer and I was not obligated to blog or promote the vendors’ technologies presented at this event. The content of this blog is of my own opinions and views ]

And the Emmy® goes to …

Yes, the Emmy® goes to Dell EMC Isilon! It was indeed a well deserved accolade and an honour!

Dell EMC Isilon had just won a Technology & Engineering Emmy® Award a week before Storage Field Day 19, for its pioneering work on NAS platform technology that tiers media and broadcasting content according to business value.

A lasting true clustered NAS

This is not a blog to praise Isilon but one that instills respect for a truly clustered, scale-out file system. I have known of OneFS for a long time, but have not really had the opportunity to put my hands on it since 2006 (there is a story). So here is a look at history …

Back in the early to mid-2000s, there was a lot of talk about large scale NAS. There were several players in the nascent scaling NAS market. NetApp was the filer king, with several competitors such as Polyserve, Ibrix, Spinnaker, Panasas and the young upstart Isilon. There were also Procom, BlueArc and NetApp’s predecessor Auspex. By the second half of the 2000s, the market had consolidated and most of these NAS players were acquired.


Open Source and Open Standards open the Future

[Disclosure: I was invited by GestaltIT as a delegate to their Storage Field Day 19 event from Jan 22-24, 2020 in the Silicon Valley USA. My expenses, travel, accommodation and conference fees were covered by GestaltIT, the organizer and I was not obligated to blog or promote the vendors’ technologies to be presented at this event. The content of this blog is of my own opinions and views]

Western Digital dived into Storage Field Day 19 in full force, as they did at Storage Field Day 18, with a series of high impact presentations, each curated for the diverse requirements of the audience. Several open source initiatives were shared, all of them open standards that address present inefficiencies and are designed and developed for a greater future.

Zoned Storage

One of the initiatives aims to increase the efficiencies around SMR and SSD zoning capabilities and to remove the complexities and overlaps of both media. This is the Zoned Storage initiative, a technical working proposal to the existing NVMe standards. The outcome will give applications in the user space more control over the placement of data blocks on zone-aware devices and zoned SSDs, collectively known as Zoned Block Devices (ZBD). The implementation in the Linux user and kernel space is shown below:


Zoned Technologies with Western Digital

[Disclosure: I am invited by GestaltIT as a delegate to their Storage Field Day 19 event from Jan 22-24, 2020 in the Silicon Valley USA. My expenses, travel, accommodation and conference fees will be covered by GestaltIT, the organizer and I am not obligated to blog or promote the vendors’ technologies to be presented at this event. The content of this blog is of my own opinions and views]

Storage Field Day 19 is a week away, and one of the vendors presenting is Western Digital, who also presented at Storage Field Day 18 almost a year ago. Here is my blog where I received the full force of Western Digital. In those 10 months or so, Western Digital has sold off its IntelliFlash assets to DataDirect Networks and left its ActiveScale object storage platform in limbo.

What is in store from Western D?

I am eager to find out what is coming from Western Digital. They have tons of storage technologies that I have yet to encounter, and this anticipation is keeping me excited for the Western D session at Storage Field Day 19.

For a few years I have been keen on a few of Western D’s technologies which were moving up the value chain. They are:

In my patch, the signals from these 3 Western D technologies have gone weak in the past year. However, there is a lot of momentum right now for Zoned Storage and Zoned Namespaces, and I believe this could be what is in store for storage propeller heads like us at Storage Field Day 19.


Green Storage? Meh!

Something triggered my thoughts a few days ago. A few of us got together talking about climate change and a friend asked how green the datacenter was in IT. With cloud computing booming, I would say that green computing isn’t really the hottest thing at present. That, in turn, leads us to one of the most voracious energy beasts in the datacenter: storage. Where is green storage in the equation?

What is green?

Over the past decade, several storage related technologies were touted as more energy efficient. These include

  • Tape – when tapes are offline, they do not consume power and do not require cooling
  • Virtualization – Virtualization reduces the number of servers and desktops, and of course storage too
  • MAID (Massive Array of Idle Disks) – the arrays spin down the HDDs if idle for a period of time
  • SSD (Solid State Drives) – Compared to HDDs, SSDs consume much less power, and overall reduce the cooling needs
  • Data Footprint Reduction – Deduplication, compression and other technologies to reduce copies of data
  • SMR (Shingled Magnetic Recording) Drives – Higher areal density means fewer drives, but the gains are limited by physics

The largest gorilla in storage technology

HDDs still dominate the market and they are the biggest producers of heat and vibration in a storage array, along with the redundant power supplies and fans. Until and unless SSDs dominate, we have to live with the fact that storage disk drives are not green. The statistics from Statista below forecast that in 2021, the shipment of SSDs will surpass that of HDDs.

Today the areal density of HDDs has increased. With SMR (shingled magnetic recording), the areal density jumped to about 25% more than the 1Tb/in² (terabit per square inch) of CMR (conventional magnetic recording) drives. The largest SMR drive in the market today is 16TB from Seagate, with 18TB SMR on the horizon. That capacity is going to grow significantly when EAMR (energy assisted magnetic recording) drives, which include both heat assisted and microwave assisted recording, enter the market next year. The areal density will grow to 1.6Tb/in² with a roadmap to 4.0Tb/in².

ZFS Replication and Recovery with FreeNAS

We get requests to recover data from a secondary platform all the time. An RPO (recovery point objective) of 30 minutes can be challenging for small to medium sized companies, especially if there is an SLA (service level agreement) to meet.

This week, my team and I took some time to create a FreeNAS replication demo for a potential client. I thought I would document the whole thing about ZFS replication, the key steps to set it up, and how recovery is done.

ZFS Snapshots

ZFS replication relies on periodic ZFS snapshots. Snapshots are an inherent feature of the ZFS file system, and a snapshot is often used as a point-in-time copy of the existing ZFS file system tree. Once a snapshot has been triggered, either manually or on a (periodic) schedule, the file system tree and its metadata in memory are committed to disk to ensure an updated and consistent state of the file system at all times.

To start, a running snapshot policy on a schedule must be in place. This snapshot policy can be on a specific dataset or zvol, or even the entire zpool. Yeah, I am using quite a bit of ZFS terminology here – zpool, zvol, dataset. You can read more about each of the structures and more here.

Once the ZFS replication task has been set up, every snapshot taken under the snapshot policy is automatically copied to the target ZFS dataset. Usually, the target ZFS dataset is on a secondary FreeNAS storage server, serving as a disaster recovery platform. Sending and receiving the data in the snapshots relies on the SSH service.
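For readers who prefer the command line view, here is a minimal Python sketch of what a snapshot-then-replicate cycle looks like in terms of the standard `zfs snapshot`, `zfs send` and `zfs receive` commands piped over SSH. It only builds the command strings and does not run them. The pool, dataset, snapshot and host names are made up for illustration, and the assumption that FreeNAS does something roughly equivalent underneath its replication task is mine, not taken from the FreeNAS documentation.

```python
import shlex
from typing import Optional


def snapshot_cmd(dataset: str, snap: str) -> str:
    """Command that takes a point-in-time snapshot of a dataset."""
    return shlex.join(["zfs", "snapshot", f"{dataset}@{snap}"])


def replication_pipeline(dataset: str,
                         prev_snap: Optional[str],
                         new_snap: str,
                         target_host: str,
                         target_dataset: str,
                         user: str = "root") -> str:
    """Shell pipeline that sends a snapshot to a remote dataset over SSH.

    With prev_snap set, only the changes between prev_snap and new_snap are
    sent (incremental); with prev_snap=None the full snapshot is sent.
    """
    send = ["zfs", "send"]
    if prev_snap is not None:
        send += ["-i", f"{dataset}@{prev_snap}"]
    send.append(f"{dataset}@{new_snap}")

    # -F rolls the target back to its latest snapshot before receiving
    recv = ["ssh", f"{user}@{target_host}", "zfs", "receive", "-F", target_dataset]
    return " | ".join(shlex.join(part) for part in (send, recv))


if __name__ == "__main__":
    # Hypothetical names: pool "tank", DR host "dr-nas", target pool "backup"
    print(snapshot_cmd("tank/data", "auto-0930"))
    print(replication_pipeline("tank/data", "auto-0900", "auto-0930",
                               "dr-nas", "backup/data"))
```

The first replication has no earlier snapshot to diff against, so it is a full send (`prev_snap=None`); every run after that can be incremental, which is what keeps a short RPO practical.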

This is the network diagram explaining the FreeNAS ZFS replication setup.


Did Cloud Kill LTFS?

I like LTFS (Linear Tape File System). I was hoping it would take off but it has not. And looking at its future, it is becoming less and less relevant. Here I look at whether Cloud has been a factor in the possible demise of LTFS in the next few years.

What is LTFS?

In a nutshell, Linear Tape File System makes LTO tapes look like a disk with a file system. It takes a tape and divides it into 2 partitions:

  • Index Partition (XML Index Schema with file names, metadata and attributes details)
  • Data Partition (where the data resides)
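To make the two-partition idea concrete, here is a toy Python model. It is not the real LTFS on-tape format (the real index is an XML document with its own schema); the class names, fields and block size are all invented for illustration. What it shows is the division of labour: the index maps file names to extents, and the data partition is an append-only run of blocks.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Extent:
    start_block: int   # first block of the extent in the data partition
    block_count: int   # number of consecutive blocks


@dataclass
class ToyLtfsTape:
    # "Index partition": file name -> extents (stands in for the XML index)
    index: Dict[str, List[Extent]] = field(default_factory=dict)
    # "Data partition": append-only list of blocks, like a tape
    data: List[bytes] = field(default_factory=list)

    def write_file(self, name: str, payload: bytes, block_size: int = 4) -> None:
        """Append the payload at the end of the data and repoint the index."""
        start = len(self.data)
        blocks = [payload[i:i + block_size]
                  for i in range(0, len(payload), block_size)]
        self.data.extend(blocks)
        # Old blocks of a rewritten file stay on the tape; only the index changes
        self.index[name] = [Extent(start, len(blocks))]

    def list_files(self) -> List[str]:
        """Listing needs only the index, not a scan of the data."""
        return sorted(self.index)

    def read_file(self, name: str) -> bytes:
        """Look up the extents in the index, then read those blocks."""
        out = b""
        for ext in self.index[name]:
            out += b"".join(self.data[ext.start_block:ext.start_block + ext.block_count])
        return out


if __name__ == "__main__":
    tape = ToyLtfsTape()
    tape.write_file("a.txt", b"hello world")
    print(tape.list_files(), tape.read_file("a.txt"))
```

Because the directory listing comes from the index alone, a mounted tape can be browsed like a disk without winding through the data first, which is the whole appeal.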

Diagram from https://www.snia.org/sites/default/orig/SDC2011/presentations/tuesday/DavidPease_LinearTape_File_System.pdf

It has a File System module which is implemented on the supported operating systems: Unix/Linux, MacOS and Windows. The mounted file system “tape partition” then shows up as a drive or device.

Assassination attempts

There were many attempts to kill off tapes and so far, none has been successful.

Among the “tape-killer” technologies, I think the most prominent one is the VTL (Virtual Tape Library). There were many VTLs I encountered during my days in the mid-2000s. NetApp had Alacritus and EMC had Clariion Disk Libraries. There were also IBM ProtecTIER, FalconStor VTL (which is still selling today) and Sepaton (which read in reverse is “No Tapes”), among others. Sepaton was acquired by Hitachi Data Systems several years back.

The full force of Western Digital

[Preamble: I have been invited by GestaltIT as a delegate to their Tech Field Day for Storage Field Day 18 from Feb 27-Mar 1, 2019 in the Silicon Valley USA. My expenses, travel and accommodation were covered by GestaltIT, the organizer and I was not obligated to blog or promote their technologies presented at this event. The content of this blog is of my own opinions and views]

3 weeks after Storage Field Day 18, I was still trying to wrap my head around the 3-hour session we had with Western Digital. I was like a kid in a candy store for a while, because there was too much to chew and I couldn’t munch it all.

From “Silicon to System”

Not many storage companies in the world can claim that mantra – “From Silicon to Systems”. Western Digital is probably one of 3 companies (the other 2 being Intel and nVidia) I know of at present which pursue vertical innovation and integration, end to end, from components to platforms to systems.

For a long time, we have known Western Digital as a hard disk company. It owns HGST and SanDisk, providing the drives, the Flash and the Compact Flash for both the consumer and the enterprise markets. However, in recent years, through 2 eyebrow raising acquisitions, Western Digital has been moving itself up the infrastructure stack. In 2015, it acquired Amplidata. 2 years later, it acquired Tegile Systems. At that time, I was wondering why a hard disk manufacturer was buying storage technology companies that were not its usual bread and butter business.


StorPool – Block storage managed well

[Preamble: I have been invited by GestaltIT as a delegate to their Tech Field Day for Storage Field Day 18 from Feb 27-Mar 1, 2019 in the Silicon Valley USA. My expenses, travel and accommodation were covered by GestaltIT, the organizer and I was not obligated to blog or promote their technologies presented at this event. The content of this blog is of my own opinions and views]

Storage technology is complex. Storage infrastructure and data management operations are not trivial, despite what hyperscalers like Amazon Web Services and Microsoft Azure would like you to think. As the adoption of cloud infrastructure services grows, small and medium businesses/enterprises (SMB/SME) are usually left to their own devices to manage the virtual storage infrastructure. Cloud Service Providers (CSPs) addressing the SMB/SME market are looking for easier, worry-free, software-defined storage to elevate their value to their customers.

Managed high performance block storage

Enter StorPool.

StorPool is a scale-out block storage technology, capable of delivering 1 million+ IOPS with sub-millisecond response times. As described by fellow delegate Ray Lucchesi in his recent blog, they were able to achieve these impressive performance numbers in their demo without a high throughput RDMA network or the storage class memory of Intel Optane.

Sexy HPC storage is all the rage

HPC is sexy

There is no denying it. HPC is sexy. HPC Storage is just as sexy.

Looking at the latest buzz from Super Computing Conference 2018, which happened in Dallas 2 weeks ago, the number of storage related vendors participating was staggering. Panasas, Weka.io, Excelero and BeeGFS are the ones that I know of because I have friends posting their highlights. Then there are the perennial vendors like IBM, Dell, HPE, NetApp, Huawei, Supermicro, and so many more. A quick check on the SC18 website showed that there were 391 exhibitors on the floor.

And this is driven by the relentless demand for higher and higher computing performance and, along with it, the demand for faster and faster storage performance. The commercialization of Artificial Intelligence (AI), Deep Learning (DL) and newer applications and workloads, together with the traditional HPC workloads, is driving these ever increasing requirements. However, most enterprise storage platforms were not designed to meet the demands of this new generation of applications and workloads, contrary to what many have been led to believe. Why so?

I had a couple of conversations with a few well known vendors around the topic of HPC Storage. Several of the responses thrown back were simply to put in Flash and NVMe to solve the high demands of HPC storage performance. In my mind, these responses were too trivial, too irresponsible. So I wanted to write this blog to share my views on HPC storage, and not just about its performance.

The HPC lines are blurring

I picked up this video (below) a few days ago. It is insideHPC’s Rich Brueckner interviewing Dr. Goh Eng Lim, HPE CTO and renowned HPC expert, about the convergence of traditional and commercial HPC applications and workloads.

I liked the conversation in the video because it addressed the 2 different approaches. And I welcomed Dr. Goh’s invitation to the Commercial HPC community to work with the Traditional HPC vendors to help push the envelope towards Exascale SuperComputing.


Disaggregation or hyperconvergence?

[Preamble: I have been invited by GestaltIT as a delegate to their TechFieldDay from Oct 17-19, 2018 in the Silicon Valley USA. My expenses, travel and accommodation were covered by GestaltIT, the organizer and I was not obligated to blog or promote their technologies presented at this event. The content of this blog is of my own opinions and views]

There is an argument about NetApp‘s HCI (hyperconverged infrastructure). It is not really a hyperconverged product at all, according to one school of thought. Maybe NetApp is just riding on the hyperconvergence marketing coat tails, and just wants to be associated with the HCI hot streak. On the same spectrum of argument, Datrium decided to call their technology open convergence, clearly trying not to be related to hyperconvergence.

Hyperconvergence has been enjoying a period of renaissance for a few years now. Leaders like Nutanix, VMware vSAN, Cisco Hyperflex and HPE Simplivity have been dominating the scene, touting great IT benefits and the elimination of IT inefficiencies. But in these technologies, performance and capacity are tightly intertwined. That means that in each of the individual hyperconverged nodes, typically starting with a trio of nodes, the processing power and the storage capacity come together. You have to accept both resources as a node. If you want more processing power, you get the additional storage capacity that comes with that node. If you want more storage capacity, you get more processing power whether you like it or not. This means you get underutilized resources over time, and an infrastructure that is definitely not rightsized for the job.
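A quick back-of-the-envelope sketch in Python makes the rightsizing problem visible. The node sizes and workload numbers are invented for illustration and do not come from any vendor's specification; the only thing borrowed from the discussion above is the idea that a hyperconverged node bundles compute and capacity, with a minimum of three nodes.

```python
import math
from typing import Dict, Tuple


def hci_nodes_needed(cores_needed: int, tb_needed: int,
                     cores_per_node: int, tb_per_node: int,
                     min_nodes: int = 3) -> Dict[str, int]:
    """Nodes required when every node bundles compute and storage together."""
    by_compute = math.ceil(cores_needed / cores_per_node)
    by_storage = math.ceil(tb_needed / tb_per_node)
    # The scarcer resource dictates the node count; the other is over-bought
    nodes = max(by_compute, by_storage, min_nodes)
    return {
        "nodes": nodes,
        "idle_cores": nodes * cores_per_node - cores_needed,
        "idle_tb": nodes * tb_per_node - tb_needed,
    }


def disaggregated_nodes_needed(cores_needed: int, tb_needed: int,
                               cores_per_node: int, tb_per_node: int) -> Tuple[int, int]:
    """(compute nodes, storage nodes) when the two are bought separately."""
    return (math.ceil(cores_needed / cores_per_node),
            math.ceil(tb_needed / tb_per_node))


if __name__ == "__main__":
    # Hypothetical capacity-heavy workload: 64 cores, 400 TB,
    # on nodes offering 32 cores and 20 TB each
    print(hci_nodes_needed(64, 400, 32, 20))
    print(disaggregated_nodes_needed(64, 400, 32, 20))
```

In this made-up case the capacity requirement forces 20 hyperconverged nodes, leaving 576 of the 640 cores idle, while separate node types would need only 2 compute nodes alongside the 20 storage nodes.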

And here in Malaysia, we have seen vendors throw in hyperconverged infrastructure solutions for every single requirement. That was why I wrote a piece about some zealots of hyperconverged solutions 3+ years ago. When you think you have a magical hammer, every problem is a nail. 😉

On my radar, NetApp and Datrium are the only 2 vendors that offer separate nodes for compute processing and storage capacity and still fall within the hyperconverged space. This approach obviously benefits the IT planners and the IT architects, and the customers too, because they get what they want for their business. However, the disaggregation of compute processing and storage leads to the argument of whether these 2 companies belong in the hyperconverged infrastructure category.
