OpenZFS 2.0 exciting new future

The OpenZFS (virtual) Developer Summit ended over a weekend ago. I stayed up a bit (not much) to listen to some of the talks because it started midnight my time, and ran till 5am on the first day, and 2am on the second day. Like a giddy schoolboy, I was excited, not because I am working for iXsystems™ now, but I have been a fan and a follower of the ZFS file system for a long time.

History wise, ZFS was conceived at Sun Microsystems in 2005. I started working on ZFS reselling Nexenta in 2009 (my first venture into business with my company nextIQ) after I was professionally released by EMC early that year. I bought a Sun X4150 from one of Sun’s distributors, and started creating a lab server. I didn’t like the workings of NexentaStor (and NexentaCore) very much, and it was priced at 8TB per increment. Later, I started my second company with a partner and it was him who showed me the elegance and beauty of ZFS through the command lines. The creed of ZFS as a volume and a file system at the same time with the CLI had an effect on me. I was in love.

OpenZFS Developer Summit 2020 Logo

OpenZFS Developer Summit 2020 Logo

Exciting developments

Among the many talks shared in the OpenZFS Developer Summit 2020 , there were a few ideas and developments which were exciting to me. Here are 3 which I liked and I provide some commentary about them.

  • Block Reference Table
  • dRAID (declustered RAID)
  • Persistent L2ARC

Continue reading

Kubernetes Persistent Storage Managed Well

[ Disclosure: This is a StorPool Storage sponsored blog ]

StorPool Storage – Distributed Storage

There is a rapid adoption of Kubernetes in the enterprise and in the cloud. The push for digital transformation to modernize businesses for a cloud native world in the next decade has lifted both containerized applications and the Kubernetes container orchestration platform to an unprecedented level. The application landscape, especially the enterprise, is looking at Kubernetes to address these key areas:

  • Scale
  • High performance
  • Availability and Resiliency
  • Security and Compliance
  • Controllable Costs
  • Simplified

The Persistent Storage Question

Enterprise applications such as relational databases, email servers, and even the cloud native ones like NoSQL, analytics engines, demand a single data source of truth. Fundamentals properties such as ACID (atomicity, consistency, isolation, durability) and BASE (Basic Availability, Soft State, Eventual Consistency) have to have persistent storage as the foundational repository for the data. And thus, persistent storage have rallied under Container Storage Interface (CSI), and fast becoming a de facto standard for Kubernetes. At last count, there are more than 80 CSI drivers from 60+ storage and cloud vendors, each providing block-level storage to Kubernetes pods.

However, at this juncture, Kubernetes is still very engineering-centric. Persistent storage is equally as challenging, despite all the new developments and hype around it.

Continue reading

Intel is still a formidable force

It is easy to kick someone who is down. Bad news have stronger ripple effects than the good ones. Intel® is going through a rough patch, and perhaps the worst one so far. They delayed their 7nm manufacturing process, one which could have given Intel® the breathing room in the CPU war with rival AMD. And this delay has been pushed back to 2021, possibly 2022.

Intel Apple Collaboration and Partnership started in 2005

Their association with Apple® is coming to an end after 15 years, and more security flaws surfaced after the Spectre and Meltdown debacle. Extremetech probably said it best (or worst) last month:

If we look deeper (and I am sure you have), all these negative news were related to their processors. Intel® is much, much more than that.

Their Optane™ storage prowess

I have years of association with the folks at Intel® here in Malaysia dating back 20 years. And I hardly see Intel® beating it own drums when it comes to storage technologies but they are beginning to. The Optane™ revolution in storage, has been a game changer. Optane™ enables the implementation of persistent memory or storage class memory, a performance tier that sits between DRAM and the SSD. The speed and more notable the latency of Optane™ are several times faster than the Enterprise SSDs.

Intel pyramid of tiers of storage medium

If you want to know more about Optane™’s latency and speed, here is a very geeky article from Intel®:

The list of storage vendors who have embedded Intel® Optane™ into their gears is long. Vast Data, StorOne™, NetApp® MAX Data, Pure Storage® DirectMemory Modules, HPE 3PAR and Nimble Storage, Dell Technologies PowerMax, PowerScale, PowerScale and many more, cement Intel® storage prowess with Optane™.

3D Xpoint, the Phase Change Memory technology behind Optane™ was from the joint venture between Intel® and Micron®. That partnership was dissolved in 2019, but it has not diminished the momentum of next generation Optane™. Alder Stream and Barlow Pass are going to be Gen-2 SSD and Persistent Memory DC DIMM respectively. A screenshot of the Optane™ roadmap appeared in Blocks & Files last week.

Intel next generation Optane roadmap

Continue reading

Down the rabbit hole with Kubernetes Storage

Kubernetes is on fire. Last week VMware® released the State of Kubernetes 2020 report which surveyed companies with 1,000 employees and above. Results were not surprising as the adoptions of this nascent technology are booming. But persistent storage remained the nagging concern for the Kubernetes serving the infrastructure resources to applications instances running in the containers of a pod in a cluster.

The standardization of storage resources have settled with CSI (Container Storage Interface). Storage vendors have almost, kind of, sort of agreed that the API objects such as PersistentVolumes, PersistentVolumeClaims, StorageClasses, along with the parameters would be the way to request the storage resources from the Pre-provisioned Volumes via the CSI driver plug-in. There are already more than 50 vendor specific CSI drivers in Github.

Kubernetes and CSI initiative

Kubernetes and the CSI (Container Storage Interface) logos

The CSI plug-in method is the only way for Kubernetes to scale and keep its dynamic, loadable storage resource integration with external 3rd party vendors, all clamouring to grab a piece of this burgeoning demands both in the cloud and in the enterprise.

Continue reading

Dell EMC Isilon is an Emmy winner!

[ Disclosure: I was invited by GestaltIT as a delegate to their Storage Field Day 19 event from Jan 22-24, 2020 in the Silicon Valley USA. My expenses, travel, accommodation and conference fees were covered by GestaltIT, the organizer and I was not obligated to blog or promote the vendors’ technologies presented at this event. The content of this blog is of my own opinions and views ]

And the Emmy® goes to …

Yes, the Emmy® goes to Dell EMC Isilon! It was indeed a well deserved accolade and an honour!

Dell EMC Isilon had just won the Technology & Engineering Emmy® Awards a week before Storage Field Day 19, for their outstanding pioneering work on the NAS platform tiering technology of media and broadcasting content according to business value.

A lasting true clustered NAS

This is not a blog to praise Isilon but one that instill respect to a real true clustered, scale-out file system. I have known of OneFS for a long time, but never really took the opportunity to really put my hands on it since 2006 (there is a story). So here is a look at history …

Back in early to mid-2000, there was a lot of talks about large scale NAS. There were several players in the nascent scaling NAS market. NetApp was the filer king, with several competitors such as Polyserve, Ibrix, Spinnaker, Panasas and the young upstart Isilon. There were also Procom, BlueArc and NetApp’s predecessor Auspex. By the second half of the 2000 decade, the market consolidated and most of these NAS players were acquired.

Continue reading

Zoned Technologies with Western Digital

[Disclosure: I am invited by GestaltIT as a delegate to their Storage Field Day 19 event from Jan 22-24, 2020 in the Silicon Valley USA. My expenses, travel, accommodation and conference fees will be covered by GestaltIT, the organizer and I am not obligated to blog or promote the vendors’ technologies to be presented at this event. The content of this blog is of my own opinions and views]

Storage Field Day 19 is a week away. And one of the vendors presenting is Western Digital, who also presented at Storage Field Day 18 almost a year ago. Here is my blog where I received the full force of Western Digital. In that 10 months or so, Western Digital has sold off their IntelliFlash assets to Data Direct Networks and leaving their ActiveScale object storage platform in limbo.

What is in store from Western D?

I am eager to find out what coming from Western Digital. They have tons of storage technologies that I have yet to encounter, and this anticipation is keeping me excited for the Western D session at Storage Field Day 19.

For a few years I have been keen on a few Western D’s technologies which were moving up the value chain. They are:

In my patch, the signals of the 3 Western D’s technologies have gone weak in the past year. However, there is a lot of momentum right now for Zoned Storage and Zoned Name Space and I believe this could be what is in store for the storage propeller heads like us at Storage Field Day 19.

Continue reading

Is General Purpose Object Storage disenfranchised?

[Disclosure: I am invited by GestaltIT as a delegate to their Storage Field Day 19 event from Jan 22-24, 2020 in the Silicon Valley USA. My expenses, travel, accommodation and conference fees will be covered by GestaltIT, the organizer and I am not obligated to blog or promote the vendors’ technologies to be presented at this event. The content of this blog is of my own opinions and views]

This is NOT an advertisement for coloured balls.

This is the license to brag for the vendors in the next 2 weeks or so, as we approach the 2020 new year. This, of course, is the latest 2019 IDC Marketscape for Object-based Storage, released last week.

My object storage mentions

I have written extensively about Object Storage since 2011. With different angles and perspectives, here are some of them:

Continue reading

The full force of Western Digital

[Preamble: I have been invited by GestaltIT as a delegate to their Tech Field Day for Storage Field Day 18 from Feb 27-Mar 1, 2019 in the Silicon Valley USA. My expenses, travel and accommodation were covered by GestaltIT, the organizer and I was not obligated to blog or promote their technologies presented at this event. The content of this blog is of my own opinions and views]

3 weeks after Storage Field Day 18, I was still trying to wrap my head around the 3-hour session we had with Western Digital. I was like a kid in a candy store for a while, because there were too much to chew and I couldn’t munch them all.

From “Silicon to System”

Not many storage companies in the world can claim that mantra – “From Silicon to Systems“. Western Digital is probably one of 3 companies (the other 2 being Intel and nVidia) I know of at present, which develops vertical innovation and integration, end to end, from components, to platforms and to systems.

For a long time, we have always known Western Digital to be a hard disk company. It owns HGST, SanDisk, providing the drives, the Flash and the Compact Flash for both the consumer and the enterprise markets. However, in recent years, through 2 eyebrow raising acquisitions, Western Digital was moving itself up the infrastructure stack. In 2015, it acquired Amplidata. 2 years later, it acquired Tegile Systems. At that time, I was wondering why a hard disk manufacturer was buying storage technology companies that were not its usual bread and butter business.

Continue reading

WekaIO controls their performance destiny

[Preamble: I have been invited by GestaltIT as a delegate to their Tech Field Day for Storage Field Day 18 from Feb 27-Mar 1, 2019 in the Silicon Valley USA. My expenses, travel and accommodation were covered by GestaltIT, the organizer and I was not obligated to blog or promote their technologies presented at this event. The content of this blog is of my own opinions and views]

I was first introduced to WekaIO back in Storage Field Day 15. I did not blog about them back then, but I have followed their progress quite attentively throughout 2018. 2 Storage Field Days and a year later, they were back for Storage Field Day 18 with a new CTO, Andy Watson, and several performance benchmark records.

Blowout year

2018 was a blowout year for WekaIO. They have experienced over 400% growth, placed #1 in the Virtual Institute IO-500 10-node performance challenge, and also became #1 in the SPEC SFS 2014 performance and latency benchmark. (Note: This record was broken by NetApp a few days later but at a higher cost per client)

The Virtual Institute for I/O IO-500 10-node performance challenge was particularly interesting, because it pitted WekaIO against Oak Ridge National Lab (ORNL) Summit supercomputer, and WekaIO won. Details of the challenge were listed in Blocks and Files and WekaIO Matrix Filesystem became the fastest parallel file system in the world to date.

Control, control and control

I studied WekaIO’s architecture prior to this Field Day. And I spent quite a bit of time digesting and understanding their data paths, I/O paths and control paths, in particular, the diagram below:

Starting from the top right corner of the diagram, applications on the Linux client (running Weka Client software) and it presents to the Linux client as a POSIX-compliant file system. Through the network, the Linux client interacts with the WekaIO kernel-based VFS (virtual file system) driver which coordinates the Front End (grey box in upper right corner) to the Linux client. Other client-based protocols such as NFS, SMB, S3 and HDFS are also supported. The Front End then interacts with the NIC (which can be 10/100G Ethernet, Infiniband, and NVMeoF) through SR-IOV (single root IO virtualization), bypassing the Linux kernel for maximum throughput. This is with WekaIO’s own networking stack in user space. Continue reading