Huawei Dorado – All about Speed

[Preamble: I was a delegate of Storage Field Day 15 from Mar 7-9, 2018. My expenses, travel and accommodation were paid for by GestaltIT, the organizer and I was not obligated to blog or promote the technologies presented at this event. The content of this blog is of my own opinions and views]

Since Storage Field Day 15 3 weeks ago, the thoughts of the session with Huawei lingered. And one word came to describe Huawei Dorado V3, their flagship All-Flash storage platform is SPEED.

My conversation with Huawei actually started the night before our planned session at their Santa Clara facility the next day. We had a evening get-together at Bourbon Levi’s Stadium. I was with my buddy, Ammar Zolkipli, who coincidentally was in the Silicon Valley for work. Ammar is from Hitachi Vantara Japan, and has been a good friend of mine for over 17 years now.

Shortly, the Huawei team arrived to join the camaraderie. And we introduced ourselves to Chun Liu, not knowing that he is the Chief Architect at Huawei. A big part of that evening was our conversation with him. Ammar and I have immersed in the Oil & Gas EP (Exploration & Production) data management and petrotechnical applications when he was in Schlumberger and after that a reseller of NetApp. I was a Consulting Engineer with NetApp back then. So, the 2 of us started blabbering (yeah, that would be us when we get together to talk technology).

I observed that Chun was very interested to find learn about real world application use cases that would push storage performance to its limits. And I guessed that the best type of I/O characteristics would be small block, random I/O and billions of them, with near-real time latency. After that evening I did some research and could only think of a few, such as deep analytics or some applications with needs for Monte Carlo simulations. Oh, well, maybe I would share that with Chun the following day.

The moment the session started, it was already about the speed prowess of Huawei Storage. It was like the greyhounds unleashed going after the rabbit. In the lineup, the Dorado series stood out.

Continue reading

Cohesity SpanFS – a foundational shift

[Preamble: I was a delegate of Storage Field Day 15 from Mar 7-9, 2018. My expenses, travel and accommodation were paid for by GestaltIT, the organizer and I was not obligated to blog or promote the technologies presented at this event. The content of this blog is of my own opinions and views]

Cohesity SpanFS impressed me. Their filesystem was designed from ground up to meet the demands of the voluminous cloud-scale data, and yes, the sheer magnitude of data everywhere needs to be managed.

We all know that primary data is always the more important piece of data landscape but there is a growing need to address the secondary data segment as well.

Like a floating iceberg, the piece that is sticking out is the more important primary data but the larger piece beneath the surface of the water, which is the secondary data, is becoming more valuable. Applications such as file shares, archiving, backup, test and development, and analytics and insights are maturing as the foundational data management frameworks and fast becoming the bedrock of businesses.

The ability of businesses to bounce back after a disaster; the relentless testing of large data sets to develop new competitive advantage for businesses; the affirmations and the insights of analyzing data to reduce risks in decision making; all these are the powerful back engine applicability that thrust businesses forward. Even the ability to search for the right information in a sea of data for regulatory and compliance reasons is part of the organization’s data management application.

Continue reading

Of Object Storage, Filesystems and Multi-Cloud

Data storage silos everywhere. The early clarion call was to eliminate IT data storage silos by moving to the cloud. Fast forward to the present. Data storage silos are still everywhere, but this time, they are in the clouds. I blogged about this.

Object Storage was all the rage when it first started. AWS, with its S3 (Simple Storage Service) offering, started the cloud storage frenzy. Highly available, globally distributed, simple to access, and fitted superbly into the entire AWS ecosystem. Quickly, a smorgasbord of S3-compatible, S3-like object-based storage emerged. OpenStack Swift, HDS HCP, EMC Atmos, Cleversafe (which became IBM SpectrumScale), Inktank Ceph (which became RedHat Ceph), Bycast (acquired by NetApp to be StorageGrid), Quantum Lattus, Amplidata, and many more. For a period of a few years prior, it looked to me that the popularity of object storage with an S3 compatible front has overtaken distributed file systems.

What’s not to like? Object storage are distributed, they are metadata rich (at a certain structural level), they are immutable (hence secure from a certain point of view), and some even claim self-healing (depending on data protection policies). But one thing that object storage rarely touted dominance was high performance I/O. There were some cases, but they were either fronted by a file system (eg. NFSv4.1 with pNFS extensions), or using some host-based, SAN-client agent (eg. StorNext or Intel Lustre). Object-based storage, in its native form, has not been positioned as high performance I/O storage.

A few weeks ago, I read an article from Storage Soup, Dave Raffo. When I read it, it felt oxymoronic. SwiftStack was just nominated as a visionary in the Gartner Magic Quadrant for Distributed File Systems and Object Storage. But according to Dave’s article, Swiftstack did not want to be “associated” with object storage that much, even though Swiftstack’s technology underpinning was all object storage. Strange.

Continue reading

The power of E8

[Preamble: I was a delegate of Storage Field Day 14 from Nov 8-10, 2017. My expenses, travel and accommodation were paid for by GestaltIT, the organizer and I was not obligated to blog or promote the technologies presented at this event. The content of this blog is of my own opinions and views]

E8 Storage technology update at Storage Field Day 14 was impressive. Out of the several next generation NVMe storage technologies I have explored so far, E8 came out as the most complete. It was no surprise that they won the “Best of Show” in the Flash Memory Summits for the “Most Innovative Flash Memory Technology” in 2016 and “Most Innovative Flash Memory Enterprise Business Application” for 2017.

Who is E8 Storage?

They came out of stealth in August 2016 and have been making waves with very impressive stats. When E8 was announced, their numbers were more than 10 million IOPS, with 100µsecs for reads and 40µsecs for writes. And in the SFD14 demo, they reached and past the 10 million IOPS numbers.

The design philosophy of E8 Storage is different than the traditional dual controller scale-up storage architecture design or the multi-node scale-out cluster design. In fact, from a 30,000 feet view, it is quite similar to a “SAN-client” design advocated by Lustre, leveraging a very high throughput, low latency network.

Continue reading

DellEMC SC progressing well

[Preamble: I was a delegate of Storage Field Day 14. My expenses, travel and accommodation were paid for by GestaltIT, the organizer and I was not obligated to blog or promote the technologies presented at this event. The content of this blog is of my own opinions and views]

I haven’t had a preview of the Compellent technology for a long time. My buddies at Impact Business Solutions were the first to introduce the Compellent technology called Data Progression to the local Malaysian market and I was invited to a preview back then. Around the same time, I also recalled another rather similar preview invitation by PTC Singapore for the 3PAR technology called Adaptive Provisioning (it is called Adaptive Optimization now).

Storage tiering was on the rise in the 2009-2010 years. Both Compellent and 3PAR were neck and neck leading the conversation and mind share of storage tiering, and IBM easyTIER and EMC FAST (Fully Automated Storage Tiering) were nowhere to be seen or heard. Vividly, the Compellent Data Progression technology was much more elegant compared to the 3PAR technology. While both intelligent storage tiering technologies were equally good, I took that the 3PAR founders were ex-Sun Microsystems folks, and Unix folks sucked at UX. In this case, Compellent’s Data Progression was a definitely a leg up better than 3PAR.

History aside, this week I have the chance to get a new preview of the Compellent technology again. Compellent was now rebranded as the SC series and was positioned as the mid-range storage arrays of DellEMC. And together with the other Storage Field Day 14 delegates, I have the pleasure to experience the latest SC Data Progression technology update, as well their latest SC All-Flash.

In Data Progression, one interesting feature which caught my attention was the RAID Tiering. This was a dynamic auto expand and auto contract set of RAID tiersRAID 10 and RAID 5/6 in the Fast Tier and RAID 5/6 in the Lower Tier. RAID 10, RAID 5 and RAID 6 on the same set of drives (including SSDs), and depending on the “hotness” of the data, the location of the data blocks switched between the several RAID tiers in the Fast Tier. Over a longer period, the data blocks would relocate transparently to the Capacity Tier from the Fast Tier.

The Data Progression technology is extremely efficient. The movement of the data between the RAID Tiers and between the Performance/Capacity Tiers are in pages instead of blocks, making the write penalty and bandwidth to a negligible minimum.

The Storage Field Day 14 delegates were also privileged to be the first to get into the deep dive of the new All-Flash SC, just days of the announcement of the All-Flash SC. The All Flash SC redefines and refines the Data Progression to the next level. Among the new optimization, NAND Flash in the SC (both SLCs and MLCs, read-intensive and write-intensive) set the Data Progression default page size from 2MB to 512KB. These smaller 512KB pages enabled reduced bandwidth for tiering between the write-intensive and the read-intensive tier.

I didn’t get the latest SC family photos yet, but I managed to grab a screenshot of the announcement from The Register of the new DellEMC SC Series.

I was very encouraged with the DellEMC Midrange Storage presentation. Besides giving us a fantastic deep dive about the DellEMC SC All-Flash Storage, I was also very impressed by the candid and straightforward attitude of the team, led by their VP of Product Management, Pierluca Chiodelli. An EMC veteran, he was taking up the hard questions onslaught by the SFD14 delegate like a pro. His team’s demeanour was critical in instilling confidence and trust in how the bloggers and the analysts viewed Dell EMC merger, and how the SC and the Unity series would pan out in the technology roadmap.

Unlike the fiasco I went through with the DellEMC Forum 2017 in Malaysia, where I was disturbed with 3 calls in 3 consecutive days by DellEMC Malaysia, I was left with a profound respect for this DellEMC Storage team. They strongly supported their position within the DellEMC storage universe, and imparted their confidence in their technology solution in the marketplace.

Without a doubt, in my point of view, this DellEMC Mid-Range Storage team was the best I have enjoyed in Storage Field Day 14. Thank you.

The rise of RDMA

I have known of RDMA (Remote Direct Memory Access) for quite some time, but never in depth. But since my contract work ended last week, and I have some time off to do some personal development, I decided to look deeper into RDMA. Why RDMA?

In the past 1 year or so, RDMA has been appearing in my radar very frequently, and rightly so. The speedy development and adoption of NVMe (Non-Volatile Memory Express) have pushed All Flash Arrays into the next level. This pushes the I/O and the throughput performance bottlenecks away from the NVMe storage medium into the legacy world of SCSI.

Most network storage interfaces and protocols like SAS, SATA, iSCSI, Fibre Channel today still carry SCSI loads and would have to translate between NVMe and SCSI. NVMe-to-SCSI bridges have to be present to facilitate the translation.

In the slide below, shared at the Flash Memory Summit, there were numerous red boxes which laid out the SCSI connections and interfaces where SCSI-to-NVMe translation (and vice versa) would be required.

Continue reading

The engineering of Elastifile

[Preamble: I was a delegate of Storage Field Day 12. My expenses, travel and accommodation were paid for by GestaltIT, the organizer and I was not obligated to blog or promote the technologies presented in this event]

When it comes to large scale storage capacity requirements with distributed cloud and on-premise capability, object storage is all the rage. Amazon Web Services started the object-based S3 storage service more than a decade ago, and the romance with object storage started.

Today, there are hundreds of object-based storage vendors out there, touting features after features of invincibility. But after researching and reading through many design and architecture papers, I found that many object-based storage technology vendors began to sound the same.

At the back of my mind, object storage is not easy when it comes to most applications integration. Yes, there is a new breed of cloud-based applications with RESTful CRUD API operations to access object storage, but most applications still rely on file systems to access storage for capacity, performance and protection.

These CRUD and CRUD-like APIs are the common semantics of interfacing object storage platforms. But many, many real-world applications do not have the object semantics to interface with storage. They are mostly designed to interface and interact with file systems, and secretly, I believe many application developers and users want a file system interface to storage. It does not matter if the storage is on-premise or in the cloud.

Let’s not kid ourselves. We are most natural when we work with files and folders.

Implementing object storage also denies us the ability to optimally utilize Flash and solid state storage on-premise when the compute is in the cloud. Similarly, when the compute is on-premise and the flash-based object storage is in the cloud, you get a mismatch of performance and availability requirements as well. In the end, there has to be a compromise.

Another “feature” of object storage is its poor ability to handle transactional data. Most of the object storage do not allow modification of data once the object has been created. Putting a NAS front (aka a NAS gateway) does not take away the fact that it is still object-based storage at the very core of the infrastructure, regardless if it is on-premise or in the cloud.

Resiliency, latency and scalability are the greatest challenges when we want to build a true globally distributed storage or data services platform. Object storage can be resilient and it can scale, but it has to compromise performance and latency to be so. And managing object storage will not be as natural as to managing a file system with folders and files.

Enter Elastifile.

Continue reading

FlashForward to Beyond

The flash frenzy has reached its zenith in 2016. We now no longer are interested in listening to storage technology vendors touting the power of solid state storage (NAND Flash included) over spinning drives.

The capacity of 3D NAND Flash SSDs has reached a whopping 15.3TB (that is even bigger than the 12TB 7200RPM HDDs of today), and with deduplication and compression, the storage efficiency has reached a conservative 4:1 or 5:1. Effective capacity of most mid-end storage arrays can easily reach 1-2 Petabytes.

And flash and hybrid platforms have reached maturity in these few short years. So what is next?

The landscape has obviously changed. The performance landscape, the capacity landscape and all related to the storage data points have changed. And the speed of SSDs together with the up-and-coming NVMe and NVDIMM technology in new storage array controllers are also shifting the data bottlenecks to another part of the architecture. The development of I/O communications and interfaces has to change as well, to take advantage of the asynchronous I/Os in storage tiering and caching using NAND Flash.

With this mature and well understood landscape, it is time to take Flash to the next level. This next level comes in the form of an exciting end-user conference in Singapore on 25th April 2017. It is called FlashForward.

The 2016 FlashForward event in Europe has already garnered great support from the cream of the storage technologists around the world, and had fantastic feedbacks from the end-user attendees. That FlashForward event has also seen the birth of an international business and technology exchange in its inaugural introduction.  Yes, it is time to learn from the field experts, and it is time to build on the Flash Platform for new Data Services.

From the sponsorship package brochure I have received, it is definitely an event not to be missed.

The FlashForward Conference in Singapore is exquisitely procured by Evito Ltd, under the stewardship of Mr. Paul Talbut. Paul is a very seasoned veteran in the global circuit as an SNIA director of several initiatives. He has been immensely involved in the development of several SNIA chapters around the world, including South Asia, Malaysia, India, China, and even Brazil. He also leads by example with the SNIA Global Steering Committee (GSC); he is the SNIA Global Education Director and at one time, SNIA DPCO (Data Protection & Capacity Optimization) global proctor.

I have had the honour working with Paul for almost 8 years now, and I am sure he will lead the FlashForward Conference with valuable insights and experiences.

This is probably the greatest period for the industry and end users to get involved in the FlashForward Conference. For one, it is endorsed by SNIA, the vendor-neutral association which has been the growth beacon of the storage networking industry.

Secondly, it is the perfect opportunity for technology vendors to build their mindshare with end users and customers. And with the endorsement of the independent field experts and technology practitioners, end users would have a field day garnering approvals for their decisions, as well as learning the best practices to build upon the Flash technology they have implemented in their data center space.

The sponsorship packages are listed below, and I do encourage technology vendors, especially the All-Flash vendors to use the FlashForward conference as a platform to build their mindshare, and most of all, their branding. Continue reading

Let’s smoke the storage peace pipe

NVMe (Non-Volatile Memory Express) is upon us. And in the next 2-3 years, we will see a slew of new storage solutions and technology based on NVMe.

Just a few days ago, The Register released an article “Seventeen hopefuls fight for the NVMe Fabric array crown“, and it was timely. I, for one, cannot be more excited about the development and advancement of NVMe and the upcoming NVMeF (NVMe over Fabrics).

This is it. This is the one that will end the wars of DAS, NAS and SAN and unite the warring factions between server-based SAN (the sexy name differentiating old DAS and new DAS) and the networked storage of SAN and NAS. There will be PEACE.

Remember this?

nutanix-nosan-buntingNutanix popularized the “No SAN” movement which later led to VMware VSAN and other server-based SAN solutions, hyperconverged techs such as PernixData (acquired by Nutanix), DataCore, EMC ScaleIO and also operated in hyperscalers – the likes of Facebook and Google. The hyperconverged solutions and the server-based SAN lines blurred of storage but still, they are not the usual networked storage architectures of SAN and NAS. I blogged about this, mentioning about how the pendulum has swung back to favour DAS, or to put it more appropriately, server-based SAN. There was always a “Great Divide” between the 2 modes of storage architectures. Continue reading

Considerations of Hadoop in the Enterprise

I am guilty. I have not been tendering this blog for quite a while now, but it feels good to be back. What have I been doing? Since leaving NetApp 2 months or so ago, I have been active in the scenes again. This time I am more aligned towards data analytics and its burgeoning impact on the storage networking segment.

I was intrigued by an article posted by a friend of mine in Facebook. The article (circa 2013) was titled “Never, ever do this to Hadoop”. It described the author’s gripe with the SAN bigots. I have encountered storage professionals who throw in the SAN solution every time, because that was all they know. NAS, to them, was like that old relative smelled of camphor oil and they avoid NAS like a plague. Similar DAS was frowned upon but how things have changed. The pendulum has swung back to DAS and new market segments such as VSANs and Hyper Converged platforms have been dominating the scene in the past 2 years. I highlighted this in my blog, “Praying to the Hypervisor God” almost 2 years ago.

I agree with the author, Andrew C. Oliver. The “locality” of resources is central to Hadoop’s performance.

Consider these 2 models:

moving-compute-storage

In the model on your left (Moving Data to Compute), the delivery process from Storage to Compute is HEAVY. That is because data has dependencies; data has gravity. However, if you consider the model on your right (Moving Compute to Data), delivering data processing to the storage layer is much lighter. Compute or data processing is transient, and the data in the compute layer is volatile. Once compute’s power is turned off, everything starts again from a clean slate, hence the volatile stage.

Continue reading