The rise of RDMA

I have known of RDMA (Remote Direct Memory Access) for quite some time, but never in depth. Since my contract work ended last week and I have some time off for personal development, I decided to look deeper into RDMA. Why RDMA?

Over the past year or so, RDMA has been appearing on my radar very frequently, and rightly so. The speedy development and adoption of NVMe (Non-Volatile Memory Express) have pushed All Flash Arrays to the next level. This moves the I/O and throughput performance bottlenecks away from the NVMe storage medium and into the legacy world of SCSI.

Most storage interfaces and protocols in use today, such as SAS, SATA, iSCSI and Fibre Channel, still carry SCSI loads and would have to translate between NVMe and SCSI. NVMe-to-SCSI bridges have to be present to facilitate the translation.

In the slide below, shared at the Flash Memory Summit, there were numerous red boxes which laid out the SCSI connections and interfaces where SCSI-to-NVMe translation (and vice versa) would be required.


Can NetApp do it a bit better?

[Preamble: I was a delegate of Storage Field Day 12. My expenses, travel and accommodation were paid for by GestaltIT, the organizer, and I was not obligated to blog or promote the technologies presented at this event]

On Day 2 of Storage Field Day 12, the other delegates and I were hustled to NetApp’s Sunnyvale campus headquarters. That was a homecoming for me, and it was a bit ironic too.

Just 8 months ago, I was NetApp Malaysia Country Manager. That country sales lead role was my second stint with NetApp. I lasted almost 1 year.

17 years ago, my first stint with NetApp was as employee #2 in Malaysia, as an SE. That SE stint went by quickly over 5 1/2 years, and I loved that time. Those Fall Classics NetApp used to have at the Batcave and the Fortress of Solitude left a mark on me, and the experiences are still as vivid as ever.

Despite what has happened in both stints and even outside the circle, I am still one of NetApp’s active cheerleaders in the Asia Pacific region. I even got accused of being biased as a community leader on the SNIA Malaysia Facebook page (unofficial but recognized by SNIA), because I was supposed to be neutral. I have put in 10 years to promote the storage technology community with SNIA Malaysia. [To the guy named Stanley, my response was “Too bad, pick a religion”.]

The highlight of the SFD12 NetApp visit was, of course, having lunch with Dave Hitz, one of the co-founders and the one still remaining. But throughout the presentations, I was unimpressed.

For me, the only one which stood out was CloudSync. I have been reading about CloudSync since NetApp Insight 2016 and yes, it’s a nice little data shipping service between on-premises and the AWS cloud.

Here’s what CloudSync looks like:


Let’s smoke the storage peace pipe

NVMe (Non-Volatile Memory Express) is upon us. And in the next 2-3 years, we will see a slew of new storage solutions and technology based on NVMe.

Just a few days ago, The Register released an article, “Seventeen hopefuls fight for the NVMe Fabric array crown”, and it was timely. I, for one, could not be more excited about the development and advancement of NVMe and the upcoming NVMeF (NVMe over Fabrics).

This is it. This is the one that will end the wars of DAS, NAS and SAN and unite the warring factions between server-based SAN (the sexy name differentiating old DAS and new DAS) and the networked storage of SAN and NAS. There will be PEACE.

Remember this?

[Image: Nutanix “No SAN” bunting]

Nutanix popularized the “No SAN” movement, which later led to VMware VSAN and other server-based SAN solutions and hyperconverged techs such as PernixData (acquired by Nutanix), DataCore and EMC ScaleIO. The same approach also operated in the hyperscalers, the likes of Facebook and Google. The hyperconverged solutions and server-based SAN blurred the lines of storage, but they are still not the usual networked storage architectures of SAN and NAS. I blogged about this, mentioning how the pendulum has swung back to favour DAS, or to put it more appropriately, server-based SAN. There was always a “Great Divide” between the 2 modes of storage architectures.

Can CDMI emancipate an interoperable medical records cloud ecosystem?

PREFACE: This is just a thought, an idea. I am by no means an expert in this area. I have researched this to inspire a thought process of how we can bring together 2 disparate worlds of medical records and imaging with the emerging cloud services for healthcare.

Healthcare has been moving out of its archaic shell in the past few years, and digital healthcare technology and services are booming. And this movement is part of the digital transformation which could eventually lead to a secure and compliant distribution and collaboration of health data, medical imaging and electronic medical records (EMR).

It is a blessing that today’s medical imaging industry has consolidated around the DICOM (Digital Imaging and Communications in Medicine) standard. DICOM dictates how medical imaging information and pictures are used, stored, printed, transmitted and exchanged. It is also a communication protocol which runs over TCP/IP and links up different service class providers (SCPs) and service class users (SCUs), as well as backend systems such as PACS (Picture Archiving & Communications Systems) and RIS (Radiology Information Systems).

Another well accepted standard is HL7 (Health Level 7), a similar Layer 7, application-level communication protocol for transferring and exchanging clinical and administrative data.

The diagram below shows a self-contained ecosystem involving the front-end HIS (Hospital Information Systems), and the integration of healthcare, medical systems and other DICOM modalities.

Hospital Enterprise

(Picture courtesy of Meddiff Technologies)


Oops, excuse me but your silo is showing

It is the morning of the SNIA Global Steering Committee reporting session, which is starting soon. I am in the office extremely early, waiting for my turn to share the happenings in SNIA Malaysia.

Of late, I have been getting a lot of calls to catch up on hot technologies, notably All Flash Storage arrays and hyper-converged infrastructure. Even though I am now working for Interica, a company that focuses on Oil & Gas exploration and production software, my free coffee sessions with folks from the IT side have not diminished. I recall a week back in mid-March when I had a coffee overdose!

Flash storage and hyperconvergence are HOT! Despite the hype and frenzy around both, I still believe that integrating either one, or both, has an effect that many IT managers overlook. The effect is a data silo.


Praying to the hypervisor God

I was reading a great article by Frank Denneman about storage intelligence moving up the stack. It was pretty much in line with what I have been observing in the past 18 months or so, about the storage pendulum having swung back to DAS (direct attached storage). To be more precise, the DAS form factor I am referring to is physical server hardware that houses many disk drives.

Like it or not, the hypervisor has become the center of the universe in the IT space. VMware has become the indomitable force in hypervisor technology, with Microsoft Hyper-V playing catch-up. The seismic shift brought by these 2 hypervisor technologies is leading storage vendors to place them on the altar and revere them as deities. The others, the likes of Xen and KVM, and to a lesser extent Solaris Containers, aren’t really worth mentioning.

This shift, as the pendulum swings from networked storage back to internal “direct-attached” storage, is dictated by 4 main technology factors:

  • The x86 server architecture
  • Software-defined
  • Scale-out architecture
  • Flash-based storage technology

Anyone remember Thumper? Not the Disney character from the Bambi movie!

[Image: Thumper, the cartoon character from Bambi]

When the SunFire X4500 (aka Thumper) was first released in 2006 (intermission: I checked Wiki for the right year), I felt that a significant wound had been inflicted on the networked storage industry. Instead of the usual 4-8 hard disk drives found in industry servers at the time, the X4500’s 4U chassis housed 48 hard disk drives. The design and architecture were so astounding to me that I even went and bought a 1U SunFire X4150 for my personal server collection. Such was my adoration for Sun’s technology at the time.


SMB Witness Protection Program

No, no, the FBI is not in the storage business and there are no witnesses to protect.

However, SMB 3.0 has introduced an RPC-based mechanism to inform clients of any state change in the SMB servers. Microsoft calls it the Service Witness Protocol [SWP], and its objective is to provide a much faster notification service that allows SMB 3.0 clients to fail over. In SMB 1.0 and even in SMB 2.x, the SMB clients rely on time-outs. These time-outs, either SMB or TCP, could take as long as 30-45 seconds, and such a delay is disruptive to enterprise applications.

SMB 3.0, as mentioned in my previous post, had a total revamp and is now enterprise ready. In what Microsoft calls the “Continuously Available” File Service, SMB 3.0 supports clustered or scale-out file servers. The SMB shares must be shared as “Continuously Available” shares and mapped to SMB 3.0 clients. The diagram below (from a SNIA webinar) shows an example.

SMB 3.0 CA Shares

Client A maps to a share on Server 1 (\\srv1\CAshr). Client A holds a share “handle” that establishes a connection with a corresponding session state. That session state is synchronously kept consistent with a corresponding state in Server 2.

The Service Witness Protocol is not responsible for the synchronization of the states in the SMB file server cluster. Microsoft has left the HA/cluster/scale-out capability to the proprietary technology of the NAS vendor. However, SWP regularly observes the status of all services under its watch.

Has Object Storage become the everything store?

I picked up a copy of Brad Stone’s latest book, “The Everything Store: Jeff Bezos and the Age of Amazon”, at the airport on my way to Beijing last Saturday. I have been reading it the whole time I have been in Beijing, in awe of the turbulent ups and downs of Amazon.com.

The Everything Store cover

In their own serendipitous way, Object-based Storage Devices (OSDs) have been floating in my universe in the past few weeks. OSDs seem to have been getting a lot of coverage lately and suddenly, while in the shower, I had an epiphany!

Are storage vendors now positioning Object-based Storage Devices (OSDs) as the Everything Store?


The Storage Compass

I am sure many people in IT get pissed with IT jargon and terminology. More so if that person is a customer, especially one who is not well versed in the fundamental concepts behind the technology architecture.

Even after 20 years, most of them in storage, I sometimes have a hard time switching from one vendor’s jargon to another. But it has gotten harder for me lately, since I teach ONTAP courses for NetApp, teach EMC Cloud Infrastructure and do my work with the ZFS stuff. Soon, I will take on EMC VNX, Information Storage Management (ISM) and Big Data courses as well, and I also plan to do some Nexenta training too.

Who would know that an ONTAP NAS volume would be known as a file system in EMC VNX for File (aka Celerra), and a dataset in ZFS? Or that an ONTAP aggregate is almost like a ZFS pool but with some differences, or that a clone might be called a replica in HDS, and so on …

In fact, all the definitions above could be wrong because I am getting confused. 😉 You would be too if you had to switch from one vendor’s jargon to another. And the poor EMC pre-sales who has not been with any vendor other than EMC for his whole career would have a hard time rewiring his brain if he joined another vendor like NetApp. Or IBM, or Dell, or Oracle or anyone for that matter. No wonder the customers are pissed.

Not all SSDs are the same

Happy Lunar New Year! Chinese communities around the world ushered in the Year of the Water Dragon yesterday. To all my friends and family, and readers of my blog, I wish you a prosperous and auspicious Chinese New Year!

Over the holidays, I have been keeping up with the progress of Solid State Drives (SSDs). I am sure many of us are mesmerized by SSDs, and the storage vendors are touting the best that SSDs have to offer. But let me tell you one thing: you are probably getting the least of what the best SSDs have to offer. You might be puzzled why I say things like this.

Let me share a common sales pitch with you. Most (if not all) storage vendors will tout performance (usually IOPS) as the greatest benefit of SSDs. The performance numbers have to be compared to something, and that something is your regular spinning Hard Disk Drives (HDDs). The slowest SSDs, in terms of IOPS, are about 10-15x faster than HDDs. A single SSD can churn out at least 5,000 IOPS, whereas the fastest 15,000 RPM HDDs churn out about 200 IOPS each (depending on the HDD vendor). By those figures, an SSD delivering 5,000 IOPS is about 25x faster than the fastest HDDs, when measured in IOPS.
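The arithmetic behind that comparison is simple enough to sketch. The snippet below is only an illustration using the round figures quoted above (5,000 IOPS for an SSD, about 200 IOPS for a 15,000 RPM HDD). The function names are my own, and real numbers vary by vendor and workload.

```python
import math


def iops_speedup(ssd_iops: float, hdd_iops: float) -> float:
    """Return how many times faster the SSD is than the HDD, measured in IOPS."""
    if hdd_iops <= 0:
        raise ValueError("hdd_iops must be positive")
    return ssd_iops / hdd_iops


def hdds_to_match(ssd_iops: float, hdd_iops: float) -> int:
    """Return how many HDDs are needed to match one SSD on raw IOPS alone."""
    return math.ceil(iops_speedup(ssd_iops, hdd_iops))


# Round figures from the sales pitch, not measured values.
SSD_IOPS = 5_000
HDD_15K_IOPS = 200

print(f"Speedup: {iops_speedup(SSD_IOPS, HDD_15K_IOPS):.0f}x")
print(f"15K RPM HDDs needed to match one SSD: {hdds_to_match(SSD_IOPS, HDD_15K_IOPS)}")
```

This only compares headline IOPS. It says nothing about how those numbers were measured, which is the point of the rest of this post.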

But the intent of this blogger is to share more about SSDs with you. There’s more to know because SSDs are not all built the same. There are write-biased SSDs and read-biased SSDs; there are SLC (single level cell) and MLC (multi level cell) SSDs, and so on. How do you differentiate them if Vendor A touts their SSDs and Vendor B touts their SSDs as well? You are not comparing SSDs and HDDs anymore. How do you know what questions to ask when they show you their performance statistics?

SNIA has recently released a methodology called the “Solid State Storage (SSS) Performance Test Specification (PTS)” that helps customers evaluate and compare SSD performance from a vendor-neutral perspective. There is also a whitepaper related to the SSS PTS. This is very important because we have to continue to educate the community about what is right and what is wrong.

In a recent webcast, the presenters from the SNIA SSS TWG (Technical Working Group) mentioned a few questions that I think we, as vendors and customers, should think about when dealing with an SSD sales pitch. I thought I would share them with you.

  • Was the performance testing done at the SSD device level or at the file system level?
  • Was the SSD pre-conditioned before the testing? If so, how?
  • Were the performance results taken at a steady state?
  • How much data was written during the testing?
  • Where was the data written to?
  • What data pattern was tested?
  • What was the test platform used to test the SSDs?
  • What hardware or software package(s) were used for the testing?
  • Was the HBA bandwidth, queue depth and other parameters sufficient to test the SSDs?
  • What type of NAND Flash was used?
  • What is the target workload?
  • What was the percentage weight of the mix of Reads and Writes?
  • Are there any warranty life design issues?

I thought that these questions were very relevant to understanding SSD performance. I also got to know that SSDs behave differently throughout the life stages of the device. From a performance point of view, there are 3 distinct performance life stages:

  • Fresh out of the box (FOB)
  • Transition
  • Steady State

 

As you can see from the graph below, an SSD fresh out of the box (FOB) displays considerable performance numbers. Over a period of time (the graph shows minutes), it transitions into an intermediate stage of lower IOPS and finally settles into what is called the Steady State. The Steady State is the desirable test range because it gives the most accurate IOPS numbers. Therefore, it is important that your storage vendor’s performance numbers are taken during this life stage.
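To make the idea of "steady state" concrete, here is a rough sketch of how one might detect it from a series of per-round IOPS measurements. The rule used (a 5-round window whose min-max range stays within 20% of its average and whose best-fit slope moves less than 10% of the average across the window) is my simplified reading of this kind of criterion, and the thresholds, function names and sample data are all assumptions for illustration. Consult the SSS PTS itself for the authoritative definition.

```python
from statistics import mean


def is_steady(window, max_range_frac=0.20, max_slope_frac=0.10):
    """Check one measurement window against the assumed steady-state rule."""
    n = len(window)
    avg = mean(window)
    if n < 2 or avg <= 0:
        return False
    # Range check: spread of the samples relative to their average.
    data_range = max(window) - min(window)
    # Slope check: least-squares line through the samples, and how far
    # that line moves from the first to the last round of the window.
    xs = range(n)
    x_mean = mean(xs)
    slope = sum((x - x_mean) * (y - avg) for x, y in zip(xs, window)) / sum(
        (x - x_mean) ** 2 for x in xs
    )
    slope_excursion = abs(slope) * (n - 1)
    return data_range <= max_range_frac * avg and slope_excursion <= max_slope_frac * avg


def first_steady_round(iops_per_round, window_size=5):
    """Return the index of the first round that starts a steady window, or None."""
    for start in range(len(iops_per_round) - window_size + 1):
        if is_steady(iops_per_round[start:start + window_size]):
            return start
    return None


# Hypothetical data: high FOB numbers, a transition, then a plateau.
sample = [50000, 48000, 30000, 18000, 12000, 9000, 8200, 8000, 8100, 7900, 8050, 7950]
print("Steady state begins at round:", first_steady_round(sample))
```

With a check like this, the FOB and transition rounds are rejected, and only numbers from the plateau would be reported.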

Another consideration when interpreting SSD performance numbers is the type of test used. The test could be done at the file system level or at the device level. As shown in the diagram below, the test numbers could be taken from many different elements along the data path stack.

 

Performance for cached data would give impressive numbers, but it is not accurate. File system performance is not useful either, because the data travels through different layers, masking the true performance capability of the SSDs. Therefore, SNIA’s performance testing is based on a synthetic device-level test to achieve consistency and more accurate IOPS numbers.

There are many other factors used to determine the most relevant performance numbers. The SNIA PTS has 4 main test suites that address different aspects of an SSD’s performance. They are:

  • Write Saturation test
  • Latency test
  • IOPS test
  • Throughput test

The SSS PTS would be able to reveal which SSD is better. Here’s a sample report on latency.

Once again, it is important to be informed and not to take vendors’ numbers at face value. As the SSD market continues to grow, the responsibility lies on both sides of the fence: the vendor and the customer.