The rise of RDMA

I have known of RDMA (Remote Direct Memory Access) for quiet some time, but never in depth. But since my contract work ended last week, and I have some time off to do some personal development, I decided to look deeper into RDMA. Why RDMA?

In the past 1 year or so, RDMA has been appearing in my radar very frequently, and rightly so. The speedy development and adoption of NVMe (Non-Volatile Memory Express) have pushed All Flash Arrays into the next level. This pushes the I/O and the throughput performance bottlenecks away from the NVMe storage medium into the legacy world of SCSI.

Most network storage interfaces and protocols like SAS, SATA, iSCSI, Fibre Channel today still carry SCSI loads and would have to translate between NVMe and SCSI. NVMe-to-SCSI bridges have to be present to facilitate the translation.

In the slide below, shared at the Flash Memory Summit, there were numerous red boxes which laid out the SCSI connections and interfaces where SCSI-to-NVMe translation (and vice versa) would be required.

Continue reading

Can NetApp do it a bit better?

[Preamble: I was a delegate of Storage Field Day 12. My expenses, travel and accommodation were paid for by GestaltIT, the organizer and I was not obligated to blog or promote the technologies presented in this event]

In Day 2 of Storage Field Day 12, I and the other delegates were hustled to NetApp’s Sunnyvale campus headquarters. That was a homecoming for me, and it was a bit ironic too.

Just 8 months ago, I was NetApp Malaysia Country Manager. That country sales lead role was my second stint with NetApp. I lasted almost 1 year.

17 years ago, my first stint with NetApp was the employee #2 in Malaysia as an SE. That SE stint went by quickly for 5 1/2 years, and I loved that time. Those Fall Classics NetApp used to have at the Batcave and the Fortress of Solitude left a mark with me, and the experiences still are as vivid as ever.

Despite what has happened in both stints and even outside the circle, I am still one of NetApp’s active cheerleaders in the Asia Pacific region. I even got accused by being biased as a community leader in the SNIA Malaysia Facebook page (unofficial but recognized by SNIA), because I was supposed to be neutral. I have put in 10 years to promote the storage technology community with SNIA Malaysia. [To the guy named Stanley, my response was be “Too bad, pick a religion“.]

The highlight of the SFD12 NetApp visit was of course, having lunch with Dave Hitz, one of the co-founders and the one still remaining. But throughout the presentations, I was unimpressed.

For me, the only one which stood out was CloudSync. I have read about CloudSync since NetApp Insight 2016 and yes, it’s a nice little piece of data shipping service between on-premise and AWS cloud.

Here’s how CloudSync looks like:

Continue reading

The engineering of Elastifile

[Preamble: I was a delegate of Storage Field Day 12. My expenses, travel and accommodation were paid for by GestaltIT, the organizer and I was not obligated to blog or promote the technologies presented in this event]

When it comes to large scale storage capacity requirements with distributed cloud and on-premise capability, object storage is all the rage. Amazon Web Services started the object-based S3 storage service more than a decade ago, and the romance with object storage started.

Today, there are hundreds of object-based storage vendors out there, touting features after features of invincibility. But after researching and reading through many design and architecture papers, I found that many object-based storage technology vendors began to sound the same.

At the back of my mind, object storage is not easy when it comes to most applications integration. Yes, there is a new breed of cloud-based applications with RESTful CRUD API operations to access object storage, but most applications still rely on file systems to access storage for capacity, performance and protection.

These CRUD and CRUD-like APIs are the common semantics of interfacing object storage platforms. But many, many real-world applications do not have the object semantics to interface with storage. They are mostly designed to interface and interact with file systems, and secretly, I believe many application developers and users want a file system interface to storage. It does not matter if the storage is on-premise or in the cloud.

Let’s not kid ourselves. We are most natural when we work with files and folders.

Implementing object storage also denies us the ability to optimally utilize Flash and solid state storage on-premise when the compute is in the cloud. Similarly, when the compute is on-premise and the flash-based object storage is in the cloud, you get a mismatch of performance and availability requirements as well. In the end, there has to be a compromise.

Another “feature” of object storage is its poor ability to handle transactional data. Most of the object storage do not allow modification of data once the object has been created. Putting a NAS front (aka a NAS gateway) does not take away the fact that it is still object-based storage at the very core of the infrastructure, regardless if it is on-premise or in the cloud.

Resiliency, latency and scalability are the greatest challenges when we want to build a true globally distributed storage or data services platform. Object storage can be resilient and it can scale, but it has to compromise performance and latency to be so. And managing object storage will not be as natural as to managing a file system with folders and files.

Enter Elastifile.

Continue reading

FlashForward to Beyond

The flash frenzy has reached its zenith in 2016. We now no longer are interested in listening to storage technology vendors touting the power of solid state storage (NAND Flash included) over spinning drives.

The capacity of 3D NAND Flash SSDs has reached a whopping 15.3TB (that is even bigger than the 12TB 7200RPM HDDs of today), and with deduplication and compression, the storage efficiency has reached a conservative 4:1 or 5:1. Effective capacity of most mid-end storage arrays can easily reach 1-2 Petabytes.

And flash and hybrid platforms have reached maturity in these few short years. So what is next?

The landscape has obviously changed. The performance landscape, the capacity landscape and all related to the storage data points have changed. And the speed of SSDs together with the up-and-coming NVMe and NVDIMM technology in new storage array controllers are also shifting the data bottlenecks to another part of the architecture. The development of I/O communications and interfaces has to change as well, to take advantage of the asynchronous I/Os in storage tiering and caching using NAND Flash.

With this mature and well understood landscape, it is time to take Flash to the next level. This next level comes in the form of an exciting end-user conference in Singapore on 25th April 2017. It is called FlashForward.

The 2016 FlashForward event in Europe has already garnered great support from the cream of the storage technologists around the world, and had fantastic feedbacks from the end-user attendees. That FlashForward event has also seen the birth of an international business and technology exchange in its inaugural introduction.  Yes, it is time to learn from the field experts, and it is time to build on the Flash Platform for new Data Services.

From the sponsorship package brochure I have received, it is definitely an event not to be missed.

The FlashForward Conference in Singapore is exquisitely procured by Evito Ltd, under the stewardship of Mr. Paul Talbut. Paul is a very seasoned veteran in the global circuit as an SNIA director of several initiatives. He has been immensely involved in the development of several SNIA chapters around the world, including South Asia, Malaysia, India, China, and even Brazil. He also leads by example with the SNIA Global Steering Committee (GSC); he is the SNIA Global Education Director and at one time, SNIA DPCO (Data Protection & Capacity Optimization) global proctor.

I have had the honour working with Paul for almost 8 years now, and I am sure he will lead the FlashForward Conference with valuable insights and experiences.

This is probably the greatest period for the industry and end users to get involved in the FlashForward Conference. For one, it is endorsed by SNIA, the vendor-neutral association which has been the growth beacon of the storage networking industry.

Secondly, it is the perfect opportunity for technology vendors to build their mindshare with end users and customers. And with the endorsement of the independent field experts and technology practitioners, end users would have a field day garnering approvals for their decisions, as well as learning the best practices to build upon the Flash technology they have implemented in their data center space.

The sponsorship packages are listed below, and I do encourage technology vendors, especially the All-Flash vendors to use the FlashForward conference as a platform to build their mindshare, and most of all, their branding. Continue reading

Let’s smoke the storage peace pipe

NVMe (Non-Volatile Memory Express) is upon us. And in the next 2-3 years, we will see a slew of new storage solutions and technology based on NVMe.

Just a few days ago, The Register released an article “Seventeen hopefuls fight for the NVMe Fabric array crown“, and it was timely. I, for one, cannot be more excited about the development and advancement of NVMe and the upcoming NVMeF (NVMe over Fabrics).

This is it. This is the one that will end the wars of DAS, NAS and SAN and unite the warring factions between server-based SAN (the sexy name differentiating old DAS and new DAS) and the networked storage of SAN and NAS. There will be PEACE.

Remember this?

nutanix-nosan-buntingNutanix popularized the “No SAN” movement which later led to VMware VSAN and other server-based SAN solutions, hyperconverged techs such as PernixData (acquired by Nutanix), DataCore, EMC ScaleIO and also operated in hyperscalers – the likes of Facebook and Google. The hyperconverged solutions and the server-based SAN lines blurred of storage but still, they are not the usual networked storage architectures of SAN and NAS. I blogged about this, mentioning about how the pendulum has swung back to favour DAS, or to put it more appropriately, server-based SAN. There was always a “Great Divide” between the 2 modes of storage architectures. Continue reading

The reverse wars – DAS vs NAS vs SAN

It has been quite an interesting 2 decades.

In the beginning (starting in the early to mid-90s), SAN (Storage Area Network) was the dominant architecture. DAS (Direct Attached Storage) was on the wane as the channel-like throughput of Fibre Channel protocol coupled by the million-device addressing of FC obliterated parallel SCSI, which was only able to handle 16 devices and throughput up to 80 (later on 160 and 320) MB/sec.

NAS, defined by CIFS/SMB and NFS protocols – was happily chugging along the 100 Mbit/sec network, and occasionally getting sucked into the arguments about why SAN was better than NAS. I was already heavily dipped into NFS, because I was pretty much a SunOS/Solaris bigot back then.

When I joined NetApp in Malaysia in 2000, that NAS-SAN wars were going on, waiting for me. NetApp (or Network Appliance as it was known then) was trying to grow beyond its dot-com roots, into the enterprise space and guys like EMC and HDS were frequently trying to put NetApp down.

It’s a toy”  was the most common jibe I got in regular engagements until EMC suddenly decided to attack Network Appliance directly with their EMC CLARiiON IP4700. EMC guys would fondly remember this as the “NetApp killer“. Continue reading

MASSive, Impressive, Agile, TEGILE

Ah, my first blog after Storage Field Day 6!

It was a fantastic week and I only got to fathom the sensations and effects of the trip after my return from San Jose, California last week. Many thanks to Stephen Foskett (@sfoskett), Tom Hollingsworth (@networkingnerd) and Claire Chaplais (@cchaplais) of Gestalt IT for inviting me over for that wonderful trip 2 weeks’ ago. Tegile was one of the companies I had the privilege to visit and savour.

In a world of utterly confusing messaging about Flash Storage, I was eager to find out what makes Tegile tick at the Storage Field Day session. Yes, I loved Tegile and the campus visit was very nice. I was also very impressed that they have more than 700 customers and over a thousand systems shipped, all within 2 years since they came out of stealth in 2012. However, I was more interested in the essence of Tegile and what makes them stand out.

I have been a long time admirer of ZFS (Zettabyte File System). I have been a practitioner myself and I also studied the file system architecture and data structure some years back, when NetApp and Sun were involved in a lawsuit. A lot of have changed since then and I am very pleased to see Tegile doing great things with ZFS.

Tegile’s architecture is called IntelliFlash. Here’s a look at the overview of the IntelliFlash architecture:

Tegile IntelliFlash Architecture

So, what stands out for Tegile? I deduce that there are 3 important technology components that defines Tegile IntelliFlash ™ Operating System.

  • MASS (Metadata Accelerator Storage System)
  • Media Management
  • Inline Compression and Inline Deduplication

What is MASS? Tegile has patented MASS as an architecture that allows optimized data path to the file system metadata.

Often a typical file system metadata are stored together with the data. This results in a less optimized data access because both the data and metadata are given the same priority. However, Tegile’s MASS writes and stores the filesystem metadata in very high speed, low latency DRAM and Flash SSD. The filesystem metadata probably includes some very fine grained and intimate details about the mapping of blocks and pages to the respective capacity Flash SSDs and the mechanical HDDs. (Note: I made an educated guess here and I would be happy if someone corrected me)

Going a bit deeper, the DRAM in the Tegile hybrid storage array is used as a L1 Read Cache, while Flash SSDs are used as a L2 Read and Write Cache. Tegile takes further consideration that the Flash SSDs used for this caching purpose are different from the denser and higher capacity Flash SSDs used for storing data. These Flash SSDs for caching are obviously the faster, lower latency type of eMLCs and in the future, might be replaced by PCIe Flash optimized by NVMe.

Tegile DRAM-Flash Caching

This approach gives absolute priority, and near-instant access to the filesystem’s metadata, making the Tegile data access incredibly fast and efficient.

Tegile’s Media Management capabilities excite me. This is because it treats every single Flash SSD in the storage array with very precise organization of 3 types of data patterns.

  1. Write caching, which is high I/O is focused on a small segment of the drive
  2. Metadata caching, which has both Read and Write I/O  is targeted to a slight larger segment of the drive
  3. Data is laid out on the rest of the capacity of the drive

Drilling deeper, the write caching (in item 1 above) high I/O writes are targeted at the drive segment’s range which is over-provisioned for greater efficiency and care. At the same time, the garbage collection(GC) of this segment is handled by the respective drive’s controller. This is important because the controller will be performing the GC function without inducing unnecessary latency to the storage array processing cycles, giving further boost to Tegile’s already awesome prowess.

In addition to that, IntelliFlash ™ aligns every block and every page exactly to each segment and each page boundary of the drives. This reduces block and page segmentation, and thereby reduces issues with file locality and free blocks locality. It also automatically adjust its block and page alignments to different drive types and models. Therefore, I believe, it would know how to align itself to a 512-bytes or a 520-bytes sector drives.

The Media Management function also has advanced cell care. The wear-leveling takes on a newer level of advancement where how the efficient organization of blocks and pages to the drives reduces additional and often unnecessary erase and rewrites. Furthermore, the use of Inline Compression and Inline Deduplication also reduces the number of writes to drives media, increasing their longevity.

Tegile Inline Compression and Deduplication

Compression and deduplication are 2 very important technology features in almost all flash arrays. Likewise, these 2 technologies are crucial in the performance of Tegile storage systems. They are both inline i.e – Inline Compression and Inline Deduplication, and therefore both are boosted by the multi-core CPUs as well as the fast DRAM memory.

I don’t have the secret sauce formula of how Tegile designed their inline compression and deduplication. But there’s a very good article of how Tegile viewed their method of data reduction for compression and deduplication. Check out their blog here.

The metadata of data access of each and every customer is probably feeding into their Intellicare, a cloud-based customer care program. Intellicare is another a strong differentiator in Tegile’s offering.

Oh, did I mentioned they are unified storage as well with both SAN and NAS, including SMB 3.0 support?

I left Tegile that afternoon on November 5th feeling happy. I was pleased to catch up with Narayan Venkat, my old friend from NetApp, who is now their Chief Marketing Officer. I was equally pleased to see Tegile advancing ZFS further than the others I have known. With so much technological advancement and more coming, the world is their oyster.

Praying to the hypervisor God

I was reading a great article by Frank Denneman about storage intelligence moving up the stack. It was pretty much in line with what I have been observing in the past 18 months or so, about the storage pendulum having swung back to DAS (direct attached storage). To be more precise, the DAS form factor I am referring to are physical server hardware that houses many disk drives.

Like it or not, the hypervisor has become the center of the universe in the IT space. VMware has become the indomitable force in the hypervisor technology, with Microsoft Hyper-V playing catch-up. The seismic shift of these 2 hypervisor technologies are leading storage vendors to place them on to the altar and revering them as deities. The others, with the likes of Xen and KVM, and to lesser extent Solaris Containers aren’t really worth mentioning.

This shift, as the pendulum swings from networked storage back to internal “direct-attached” storage are dictated by 4 main technology factors:

  • The x86 server architecture
  • Software-defined
  • Scale-out architecture
  • Flash-based storage technology

Anyone remember Thumper? Not the Disney character from the Bambi movie!

thumper-bambi-cartoon-character

When the SunFire X4500 (aka Thumper) was first released in (intermission: checking Wiki for the right year) in 2006, I felt that significant wound inflicted in the networked storage industry. Instead of the usual 4-8 hard disk drives in the all the industry servers at the time, the X4500 4U chassis housed 48 hard disk drives. The design and architecture were so astounding to me, I even went and bought a 1U SunFire X4150 for my personal server collection. Such was my adoration for Sun’s technology at the time.

Continue reading

No Flash in the pan

The storage networking market now is teeming with flash solutions. Consumers are probably sick to their stomach getting a better insight which flash solution they should be considering. There are so much hype, fuzz and buzz and like a swarm of bees, in the chaos of the moment, there is actually a calm and discerning pattern slowly, but surely, emerging. Storage networking guys would probably know this thing well, but for the benefit of the other readers, how we view flash (and other solid state storage) becomes clear with the picture below: Flash performance gap

(picture courtesy of  http://electronicdesign.com/memory/evolution-solid-state-storage-enterprise-servers)

Right at the top, we have the CPU/Memory complex (labelled as Processor). Our applications, albeit bytes and pieces of them, run in this CPU/Memory complex.

Therefore, we can see Pattern #1 showing up. Continue reading

Boosting Solid States beyond SATA

Lately, I have been getting deeper and deeper into low-level implementation related to storage technologies. In my previous blog, I was writing my learning adventure with Priority Flow Control (PFC) and intend to further the Data Center Bridging concepts with future blog entries.

Before I left for Sydney for a holiday last week, I got sidetracked into exciting stuff that’s happening in my daily encounters with friends and new friends. 2 significant storage related technologies fell onto my lap. One is NVMe (Non-Volatile Memory express) and the other FPGA (Field Programmable Gate Array).

While this blog is going to be about NVMe, I actually found FPGA much more exciting to me. Through conversations, I found that there are 2 “biggies” in the FPGA world, and they are designed and manufactured by Xilink and Altera. I admit that I have not done my homework on FPGA yet, having just returned from Sydney last night. I will blog about FPGA in future blogs.

But NVMe is also an important technology direction to the storage world as well.

I think most of us are probably already mesmerized by solid state drives. The bombardment of marketing, presentations, advertising and whatever else the vendors do to promote (and self-promote) solid state drives are inundating the intellectual senses of consumers and enterprises alike. And yet, many vendors do not explain both the pros and cons of integrating solid states into their IT environment. Even worse, many don’t even know the strengths and weaknesses of solid states, hence creating some exaggeration that continues to create a spiral vortex of inaccuracies. Like a self-feeding frenzy, the industry seems to have placed solid state storage as the saviour of the enterprise storage world. Go figure with that!

Continue reading