Stating the case for a Storage Appliance approach

I was in Indonesia last week to meet with iXsystems™’ partner PT Maha Data Solusi. I had the wonderful opportunity to meet many people there, and one interesting and often-repeated question arose: why isn’t iX doing software-defined storage (SDS)? It was a very obvious and deliberate question.

After all, iX already provides free use of the open source TrueNAS® CORE software, which runs on many x86 systems as an SDS solution, and yet commercially, iX sells the TrueNAS® storage appliances.

This argument between a storage appliance model and a storage software-only model has been debated for more than a decade, and it comes into my conversations on and off. I finally want to address it here, with my own views and opinions. And I want to state that I am open to both models because, as a storage consultant, I see that both have their pros and cons. Up front, I gravitate to the storage appliance model, and here’s why.

My story of the storage appliance begins …

Back in the 90s, most of my work was on Fibre Channel and NFS. iSCSI did not exist yet (iSCSI was ratified in 2003). My work was almost exclusively on Sun Microsystems® enterprise storage, with Sun’s resale of the Veritas® software suite that included the Sun Volume Manager (VxVM), Veritas® Filesystem (VxFS), Veritas® Replication (VxVR) and Veritas® Cluster Server (VCS). I didn’t do much Veritas® NetBackup (NBU), although I was trained at Veritas® in Boston in July 1997 (I remember that 2-week trip fondly). It was just over 2 months after Veritas® acquired OpenVision. Backup Plus was the earlier name of NetBackup.

Between 1998 and 1999, I spent a lot of time working on Sun NFS servers. The prevalent networking speed at that time was 100Mbits/sec. And I remember having this argument with a Sun partner engineer by the name of Wong Teck Seng. Teck Seng was an inquisitive fella (still is) and he was raving about this purpose-built NFS server he knew about, and he shared his experience with me. I dismissed him, brushing aside his always-on tech orgasm, and did not see anything great about a NAS storage appliance. Auspex™ was big then, and I knew of them.

I joined NetApp® as Malaysia’s employee #2. It was an odd few months working with a storage appliance, but after a couple of months, I started to understand and appreciate the philosophy. The storage appliance model made sense to me, and it still does to this day.

Continue reading

OpenZFS with Object Storage

At AWS re:Invent last week, Amazon Web Services announced Amazon FSx for OpenZFS. This is the 4th managed service under the Amazon FSx umbrella, joining NetApp® ONTAP™, Lustre and Windows File Server. The highly scalable OpenZFS filesystem can provide high throughput and IOPS to Amazon EC2, ECS, EKS and VMware® Cloud on AWS.

I am assuming the AWS OpenZFS uses EBS as the block storage backend, given the announcement that it can deliver 4GB/sec of throughput and 160,000 IOPS from the “drives” without caching. How OpenZFS is provisioned to AWS clients is well documented in this blog here. It is an absolute joy (for me) to see the open source OpenZFS filesystem getting validation and recognition from AWS. This is one hell of a filesystem.

But this blog isn’t about AWS FSx for OpenZFS with block storage. It is about what is coming, and eventually AWS FSx for OpenZFS could expand into AWS’s S3 storage as well. Can OpenZFS integrate with an S3 object storage backend? This blog looks into that burning question.

At the recently concluded OpenZFS Developer Summit 2021, one of the topics was “ZFS on Object Storage”, and the short answer is a resounding YES!

OpenZFS Developer Summit 2021

Continue reading

Falconstor Software Defined Data Preservation for the Next Generation

Falconstor® Software is gaining momentum. Given its arduous climb back to the fore, it is beginning to soar again.

Tape technology and Digital Data Preservation

I mentioned that long-term digital data preservation is a segment within the data lifecycle which has merits and prominence. SNIA® has shown that this is a strongly growing market segment through its 2007 and 2017 “100 Year Archive” surveys. The 3 critical challenges of this long, long-term digital data preservation are to keep the archives

  • Accessible
  • Undamaged
  • Usable

For the longest time, tape technology has been the king of the hill for digital data preservation. The technology is cheap and mature, and many enterprises have built their long-term strategy around it. And the pulse in the tape technology market is still very healthy.

The challenges of tape remain. Every 5 years or so, companies have to consider moving the data on the existing tape technology to the next generation. It is widely known that an LTO drive can read tapes of the previous 2 generations, and write to tapes of the generation just before its own. The tape transcription process of migrating digital data for the sake of data preservation is bad because it affects the structural integrity and quality of the content of the data.
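To make the generation rule concrete, here is a minimal Python sketch of the compatibility rule as stated above (read 2 generations back, write 1 generation back). The function name `lto_compatibility` is hypothetical, and the sketch only covers LTO-1 to LTO-7, because drives from LTO-8 onwards read back only one generation.

```python
def lto_compatibility(drive_gen: int) -> dict:
    """Return the tape generations an LTO drive can read and write.

    Encodes the rule from the text: read the drive's own generation plus
    the previous 2, write the drive's own generation plus the previous 1.
    Assumption: only LTO-1 to LTO-7 are modelled, since later generations
    do not follow this rule.
    """
    if not 1 <= drive_gen <= 7:
        raise ValueError("this sketch only models LTO-1 to LTO-7")
    # Clamp at generation 1: there is nothing older than LTO-1.
    readable = [g for g in range(drive_gen - 2, drive_gen + 1) if g >= 1]
    writable = [g for g in range(drive_gen - 1, drive_gen + 1) if g >= 1]
    return {"read": readable, "write": writable}


def can_read(drive_gen: int, tape_gen: int) -> bool:
    """True if a drive of drive_gen can still read a tape of tape_gen."""
    return tape_gen in lto_compatibility(drive_gen)["read"]
```

For example, `can_read(7, 4)` is false under this rule, which is why an archive written on LTO-4 has to be transcribed before the last LTO-6 drive is retired.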

In my time covering Oil & Gas subsurface data management, I have seen NOCs (national oil companies) with 500,000 tapes of all generations, from 1/2″ to DDS, DAT to SDLT, 3590 to LTO 1-7. Millions are spent to transcribe these tapes every few years, and we have folks like Katalyst DM, Troika and more hovering over this landscape for their fill.

Continue reading

The Falcon to soar again

One of the historical feats which had me mesmerized for a long time was the 14-year journey China’s imperial treasures took to escape the Japanese invasion in the early 1930s, sandwiched between rebellions and civil wars in China. More than 20,000 pieces of the imperial treasures took a perilous journey to the west and back again. Divided into 3 routes over a decade and four years, not a single piece of treasure was broken or lost. All in the name of preservation.

Today, those 20,000-plus pieces live in perpetuity in 2 palaces – the Beijing Palace Museum in China and the National Palace Museum Taipei in Taiwan.

Digital data preservation

Digital data preservation is at the other end of the data lifecycle spectrum. More often than not, it is not the part that many pay attention to. In the past 2 decades, digital data has grown so much that it is now paramount to keep the data forever. Mind you, this is not the data hoarding kind; it is about preserving the knowledge and wisdom in the digital content of the data.

[ Note: If you are interested to know more about Data -> Information -> Knowledge -> Wisdom, check out my 2015 article on LinkedIn ]

SNIA (Storage Networking Industry Association) conducted 2 surveys – one in 2007 and another in 2017 – called the 100 Year Archive, and found that the requirement for preserving digital data has grown manyfold over the 10 years. In the end, the final goal is to ensure that the perpetual digital contents are

  • Accessible
  • Undamaged
  • Usable

All at an affordable cost. Therefore, SNIA has the vision that the digital content must transcend the storage medium, the storage system and the technology that holds it.

The Falcon reemerges

A few weeks ago, I had the privilege to speak with Falconstor® Software’s David Morris (VP of Global Product Strategy & Marketing) and Mark Delsman (CTO). It was my first engagement with Falconstor® in almost 9 years! I wrote a piece about Falconstor® in my blog in 2011.

Continue reading

Brainy Commvault

[Disclosure: I was invited by Commvault as a Media person and Social Ambassador to their Commvault GO 2019 Conference and also as a Tech Field Day eXtra delegate from Oct 13-17, 2019 in Denver, CO, USA. My expenses, travel, accommodation and conference fees were covered by Commvault, the organizer, and I was not obligated to blog about or promote the technologies presented at this event. The content of this blog reflects my own opinions and views]

The waltz across the Commvault-Hedvig mine field will not be easy. Commvault will have a lot of open discussions about their acquisition of Hedvig and how Hedvig’s “primary storage platform” will fit into Commvault’s “secondary storage framework”. The outcome of this consummation has yet to take a structured form. The storyline will eventually take shape through Commvault’s diligence in defining their strategy moving forward.

Day 1

Day 1 was my open day at Commvault GO. I was absorbing the first impressions of Commvault again even though this was my third Commvault GO, after Washington DC and Nashville in 2017 and 2018 respectively. There was certainly a “startup” feeling again in Commvault since the appointment of Sanjay Mirchandani as CEO 9 months ago.

A lot of excitement and buzz was generated around Metallic, the Commvault venture into Software-as-a-Service (SaaS). The SaaS solution is targeted at the mid-market, for organizations with a staff count of 500-2500. Its simplicity and pricing were the 2 things which gave me a good feeling all over. There is even a 45-day trial for Metallic.

Getting Brainy

My Day 2 itinerary was more specific because my agenda for this trip was to seek answers to the realization of Commvault-Hedvig.

Commvault took the distinction of using the vision of a DataBrain (#databrain) to define their strategy. From the picture below, the left and right hemispheres of the DataBrain form the Storage Management piece on the left and Data Management on the right.

Continue reading

No Flash in the pan

The storage networking market is now teeming with flash solutions. Consumers are probably sick to their stomachs trying to work out which flash solution they should be considering. There is so much hype, fuzz and buzz, and yet, like a swarm of bees in the chaos of the moment, there is actually a calm and discernible pattern slowly but surely emerging. Storage networking guys would probably know this well, but for the benefit of other readers, how we view flash (and other solid state storage) becomes clear with the picture below, which shows the flash performance gap:

(picture courtesy of http://electronicdesign.com/memory/evolution-solid-state-storage-enterprise-servers)

Right at the top, we have the CPU/Memory complex (labelled as Processor). Our applications, albeit bytes and pieces of them, run in this CPU/Memory complex.

Therefore, we can see Pattern #1 showing up. Continue reading

Washing too much software defined

There’s been practically a firestorm since EMC announced ViPR, its own version of “software-defined storage”, at EMC World last week. Whether you want to call it Virtualization Platform Re-defined or Re-imagined, competitors such as NetApp, HDS and Nexenta have taken pot-shots at EMC while touting their own versions of software-defined storage.

In the release announcement, EMC claimed the following (a cut-&-paste from the announcement):

  • The EMC ViPR Software-Defined Storage Platform uniquely provides the ability to both manage storage infrastructure (Control Plane) and the data residing within that infrastructure (Data Plane).
  • The EMC ViPR Controller leverages existing storage infrastructures for traditional workloads, but provisions new ViPR Object Data Services (with access via Amazon S3 or HDFS APIs) for next-generation workloads. ViPR Object Data Services integrate with OpenStack via Swift and can be run against enterprise or commodity storage.
  • EMC ViPR integrates tightly with VMware’s Software Defined Data Center through industry standard APIs and interoperates with Microsoft and OpenStack.

The separation of the Control Plane and the Data Plane of the ViPR allows the abstraction of 2 main layers.

Layer 1 is the abstraction of the underlying storage hardware infrastructure. Although I don’t have the full details (EMC guys, please enlighten me!), I believe storage administrators no longer need to carve out LUNs from RAID groups or Storage Pools, stripe and slice them, and further provision them into meta file systems before they are exported or shared through NAS protocols. I am, of course, quoting the underlying provisioning architecture of Celerra, which can be quite complex. Anyone who has done manual provisioning with Celerra Manager should know what I mean.

Here’s the provisioning architecture of Celerra:

Continue reading

VMware in step 1 breaking big 6 hegemony

Happy Lunar New Year! This is the Year of the Water Snake, which just commenced 3 days ago.

I have always maintained that VMware has the power to become a storage killer. I mentioned that it was a silent storage killer in my blog post many moons ago.

And this week, VMware is not so silent anymore. Earlier this week, VMware acquired Virsto, a storage hypervisor technology company. News of the acquisition is plentiful on the web and can be found here and here. VMware is seriously pursuing its “Software-Defined Data Center (SDDC)” agenda. Having completed its software-defined networking component with the acquisition of Nicira back in July 2012, VMware now gains another bedrock component of SDDC with Virsto: software-defined storage.

Who is Virsto and what do they do? Well, in a nutshell, they abstract the underlying storage architecture and present a single, global namespace for storage, a big storage pool for VM datastores. I got to know about them last year, when I was researching the topic of storage virtualization.

I was looking at Datacore first, because I was familiar with Datacore. I got to know Roni Putra, Datacore’s CTO, through a mutual friend, when he was back in Malaysia. There was a sense of pride knowing that Roni is a Malaysian. That was back in 2004. But Datacore isn’t the only player in the game, because the market is teeming with folks like Tintri, Nutanix, IBM, HDS and many more. It just so happens that Virsto has caught the eye of VMware as it embarks on its first high-profile step into the storage game (the one where VMware actually steps on the toes of the Storage Big 6). The Big 6 are EMC, NetApp, IBM, HP, HDS and Dell (maybe I should include Fujitsu as well, since it has been taking market share of late).

Virsto installs as a VSA (virtual storage appliance) into ESXi, and in version 2.0, it plugs right in as an almost-native feature of ESXi, not as a vCenter tab like most other storage. It looks and feels very much like a piece of vSphere functionality, and this blurs the lines between storage and VM management. The only times the vSphere administrator needs to be involved in storage administration are when he/she is provisioning storage or expanding it. Those are the only 2 common “touch-points” where a vSphere administrator has to deal with storage. This, therefore, simplifies the administration and management job.

Here’s a look at the Virsto Storage Hypervisor architecture (credits to Google Images):

What Virsto does, as I understand it from a high level, is take any commodity storage, provide a virtual storage layer and consolidate it into a very large storage pool. The storage pool is called vSpace (previously known as LiveSpace?) and “allocates” Virsto vDisks to each VM. Each Virsto vDisk looks like a native zeroed thick VMDK, with the space efficiency of Linked Clones, but without the performance penalty of provisioning them. The Virsto vDisks are presented as NFS exports to each VM.

Another important component is the asynchronous write to the Virsto vLogs. This is configured at the deployment stage and is basically a software-based write cache: it quickly acknowledges all writes for write optimization and, in the background, asynchronously de-stages them to the vSpace. Obviously it will have its own “secret sauce” to optimize the writes.
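The general idea of a log-based write cache can be illustrated with a minimal Python sketch. This is not Virsto’s implementation (that “secret sauce” is not public). The class `WriteLogCache` and its methods are hypothetical names, and the de-stage step is called explicitly here rather than running in the background, purely to keep the example simple.

```python
class WriteLogCache:
    """Toy write-back cache: acknowledge writes from a log, de-stage later.

    Assumption: `log` plays the role of a vLog and `pool` plays the role
    of the vSpace; both are plain in-memory structures for illustration.
    """

    def __init__(self):
        self.log = []   # ordered list of pending (block, data) writes
        self.pool = {}  # backing store: block number -> data

    def write(self, block, data):
        # Append sequentially to the log and acknowledge immediately,
        # without touching the (slower) backing pool.
        self.log.append((block, data))
        return True

    def read(self, block):
        # The newest pending write wins, so search the log backwards
        # before falling back to the backing pool.
        for logged_block, data in reversed(self.log):
            if logged_block == block:
                return data
        return self.pool.get(block)

    def destage(self):
        # Replay the log in order into the pool, then clear it.
        # A real system would do this asynchronously in the background.
        count = len(self.log)
        for block, data in self.log:
            self.pool[block] = data
        self.log.clear()
        return count
```

The point of the design is that the acknowledgement path only ever does a sequential append, while the random placement of blocks into the pool happens later and out of the application’s latency path.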

Within the vSpace, as disk clone groups internal to Virsto, storage-related features such as tiering, thin provisioning, cloning and snapshots are part and parcel of it. Other strong features of Virsto are its workflow wizard for storage provisioning, and its intuitive built-in performance and management console.

As with most technology acquisitions, the company will eventually come to a fork where it has to decide which way to go. VMware has experienced this before with its Nicira acquisition. It had to decide between VxLAN (an IETF standard popularized by Cisco) and Nicira’s own STT (Stateless Transport Tunneling). There is no clear winner because choosing one over the other will have its rewards and losses.

Likewise, the Virsto acquisition will have to be packaged in a friendly manner by VMware. It does not want to step on all the toes of its storage Big 6 partners (yet). It still has to abide by some industry “co-opetition” game rules, but it has started the ball rolling.

And I see 2 critical disruptive points in this acquisition:

  1. It has endorsed the software-defined storage/storage hypervisor/storage virtualization technology and started the commodity storage hardware technology wave. This could be the beginning of the end of proprietary storage hardware. This is also helped by other factors such as the Open Compute Project by Facebook. Read my blog post here.
  2. It is pushing VMware into a monopoly à la Microsoft of yesteryear. But this time around, Microsoft Hyper-V could be the beneficiary of the VMware agenda. No wonder VMware needs to restructure and streamline its business. News of VMware laying off about 900 staff can be read here. The unfavourable news of its shares going down can be read here.

I am sure the Storage Big 6 are on the alert and are probably already building other technologies and partnerships beyond VMware. It is the natural thing to do, but there is no stopping VMware if it wants to step on the Big 6’s toes now!