The changing face of storage

No, we are not a storage company anymore. We are a data management company now.

I was reading a Forbes article interviewing NetApp’s CIO, Bill Miller. It was titled:

NetApp’s CIO Helps Drive Company’s Shift From Data Storage To Data Management

I was fairly surprised about the time it took for that mindset shift messaging from storage to data management. I am sure that NetApp has been doing that for years internally.

To me, the writing has been in the wall for years. But weak perception of storage, at least in this part of Asia, still lingers as that clunky, behind the glassed walls and crufty closets, noisy box of full of hard disk drives lodged with snakes and snakes of orange, turquoise or white cables. ūüėČ

The article may come as a revelation to some, but the world of storage has changed indefinitely. The blurring of the lines began when software defined storage, or even earlier in the form of storage virtualization, took form. I even came up with my definition a couple of years ago about the changing face of storage framework. Instead of calling it data management, I called the new storage framework,  the Data Services Platform.

So, this is my version of the storage technology platform of today. This is the Data Services Platform I have been touting to many for the last couple of years. It is not just storage technology anymore; it is much more than that.

Continue reading

Can CDMI emancipate an interoperable medical records cloud ecosystem?

PREFACE: This is just a thought, an idea. I am by no means an expert in this area. I have researched this to inspire a thought process of how we can bring together 2 disparate worlds of medical records and imaging with the emerging cloud services for healthcare.

Healthcare has been moving out of its archaic shell in the past few years, and digital healthcare technology and services are booming. And this movement is part of the digital transformation which could eventually lead to a secure and compliant distribution and collaboration of health data, medical imaging and electronic medical records (EMR).

It is a blessing that today’s medical imaging industry has been consolidated with the DICOM (Digital Imaging and Communications in Medicine) standard. DICOM dictates the how medical imaging information and pictures are used, stored,¬†printed, transmitted and exchanged. It is also a communication protocol which runs over TCP/IP, and links up different service class providers (SCPs) and service class users (SCUs), and the backend systems such as PACS (Picture Archiving & Communications Systems) and RIS (Radiology Information Systems).

Another well accepted standard is HL7 (Health Level 7), a similar Layer 7, application-level communication protocol for transferring and exchanging clinical and administrative data.

The diagram below shows a self-contained ecosystem involving the front-end HIS (Hospital Information Systems), and the integration of healthcare, medical systems and other DICOM modalities.

Hospital Enterprise

(Picture courtesy of Meddiff Technologies)

Continue reading

And Cloud Storage will make us even stranger

It was a dark and stormy night ….

I was in a car with my host in the stifling traffic jams on the streets of Jakarta. We had just finished dinner and his driver was taking me back to the hotel. It was about 9pm and we were making conversation trying to figure out how we can work together. My host, a wonderful Singaporean who has been residing in Jakarta for more than a decade and a half, owns a distributorship focusing mainly on IT security solutions. He had invited me over to Jakarta to give a talk on Cloud Storage at the Indonesia CIO Network event on January 9th 2013.

I was there to represent SNIA South Asia to give a talk about¬†CDMI¬†(Cloud Data Management Interface), and my host also took the opportunity to introduce¬†Nutanix, a SAN-less 2-tier, high-performance, virtualized data center platform. (Note: That’s quite a mouthful, but gotta include all the buzz-words in there). It was my host’s first foray into storage networking solutions, away from his usual security solutions spread. As the conversation went on in the car, he said “You storage guys are so strange!“.

To many of the IT folks who have been involved in OS, applications, security, and networking, to say a few, storage is like a¬†dark art, some mumbo jumbo, voodoo-like science¬†known to a select few. That’s great, because this perception will keep us relevant, and still have the value and a job. To me, that just fine and dandy, and I like it that way. ūüôā

In preparation to the event, I have to learn up SNIA CDMI.¬†Cloud and Storage … Cloud and Storage … Cloud and Storage. Hmmm …. Continue reading

The marriage in the cloud

Admit it! You are a terabyte junkie! I am sure many of us have one terabyte or more of your personal “stuff” at home. Heck, I even heard from a friend that he has almost 20TB of high definition movies (thank you Torrent!) at home! That’s crazy!

And what the typical Malaysian consumer would do after he or she runs out of hard disk space? In KL (our beloved capital city, Kuala Lumpur), they would throng the Low Yat IT mall or extensions of it, like Digital Mall in PJ Section 14. In other towns and cities in Malaysia, PC fairs are popular, as consumers try to get the best price possible (We Malaysian are good at squeezing the max of a deal)

It is difficult for the not-so-IT-literate consumer to differentiate which brand is the best. Buffalo, Iomega, DLink, Western Digital, etc, etc. But the tides are changing, because these vendors want to tie you down for the rest of your digital life. You see, buying a small NAS for the home now comes with a big carrot, an incentive to keep you wanting for more, and yet you can’t unbind yourself from the tether once you are hooked.

Cloud storage hasn’t taken off in a big way last year. But many cloud storage vendors know there are plenty of opportunities out there but how do they get the consumers to upload their files, photos and whatever stuff they might have, to cloud storage? Ingeniously, they work together with other smaller NAS storage players and use these vendor’s product offerings as baits. They bundle a significantly large FREE capacity or data protection offering in the Cloud Storage as the carrot,¬†and once the consumer decides to put their files in the cloud storage, boom, they are ensnared to become a long term ATM machine to the Cloud Storage Provider.

Sneaky? No?¬†I call this good, smart marketing. You have a market of opportunities out there, but cloud storage isn’t catching on. You have small NAS vendors that is reaching out to the market of consumer, but it’s a brutal, competitive arena and margins are razor thin. It’s a win-win situation for both sides.

And this trend is catching on. When I first read about Drobo (a high-end consumer NAS storage) partnering Carbonite (a remote backup vendor now repackaged as a Cloud storage backup provider), I thought it was a pretty darn good idea. It was a marriage that happened in the cloud. Late last year, another consumer NAS company, QNAP paired up with Symform, a cloud storage and backup vendor.

This was moving towards a market that scratches the itch. The consumers wanted reliable backup too, but consumer-grade disk drives fail ever so often. Laptops get stolen, and files could be infected by viruses. The list goes on, but the point is that the Cloud Storage Providers may have found a silver lining in getting the consumers to leap into the cloud. And the whole idea of small NAS vendor-big Cloud Backup dynamic duo, just got a big endorsement last night. Guess who has decided to dip its grubby hands into the pie?

EMC, the 800-pound gorilla of the information and storage world, through its Iomega subsidiary, wants your money! They had just married Iomega with EMC Atmos. It was quoted:

“EMC subsidiary and data protection specialist Iomega announced the integration between Iomega network storage solutions and EMC Atmos, extending Atmos cloud-based data protection and sharing to Iomega’s network storage product offerings. The new integration gives small and midsize businesses (SMBs), remote offices and distributed enterprises access to any Atmos powered cloud around the world.”

Surprised? Not really, but I guess EMC needs to breath new life into Atmos and this marriage just extended Atmos’ life support system.

Solid?

The next all-Flash product in my review list is SolidFire. Immediately, the niche that SolidFire is trying to carve out is obvious. It’s not for regular commercial customers. It is meant for Cloud Service Providers, because the features and the technology that they have innovated are quite cloud-intended.

Are they solid (pun intended)? Well, if they have managed to secure a Series B funding of USD$25 million (total of USD$37 million overall) from VCs such as NEA and Valhalla, and also angel investors such as Frank Slootman (ex-Data Domain CEO) and Greg Papadopoulus(ex-Sun Microsystems CTO), then obviously there is something more than meets the eye.

The one thing I got while looking up SolidFire is there is probably a lot of technology and innovation behind their ¬†Nodes and their Element OS. They hold their cards very, very close to their chest, and I couldn’t not get much good technology related information from their website or in Google. But here’s a look of how the SolidFire is like:

The SolidFire only has one product model, and that is the 1U SF3010. The SF3010 has 10 x 2.5″ 300GB SSDs giving it a raw total of 3TB per 1U. The minimum configuration is 3 nodes, and it scales to 100 nodes. The reason for starting with 3 nodes is of course, for redundancy. Each SF3010 node has 8GB NVRAM and 72GB RAM and sports 2 x 10GbE ports for iSCSI connectivity, especially when the core engineering talents were from LeftHand Networks. LeftHand Networks product is now HP P4000. There is no Fibre Channel or NAS front end to the applications.

Each node runs 2 x Intel Xeon 2.4GHz 6-core CPUs. The 1U height is important to the cloud provider, as the price of floor space is an important consideration.

Aside from the SF3010 storage nodes, the other important ingredient is their SolidFire Element OS.

Cloud storage needs to be available. The SolidFire Helix Self-Healing data protection is a feature that is capable of handling multiple concurrent failures across all levels of their storage. Data blocks are replicated randomly but intelligently across all storage nodes to ensure that the failure or disruption of access to a particular data block is circumvented with another copy of the data block somewhere else within the cluster. The idea is not new, but effective because solutions such as EMC Centera and IBM XIV employ this idea in their data availability. But still, the ability for self-healing ensures a very highly available storage where data is always available.

To address the efficiency of storage, having 3TB raw in the SF3010 is definitely not sufficient. Therefore, the Element OS always have thin provision, real-time compression and in-line deduplication turned on. These features cannot be turned off and operate at a fine-grained 4K blocks. Also important is the intelligence to reclaim of zeroed blocks, no-reservation,  and no data movement in these innovations. This means that there will be no I/O impact, as claimed by SolidFire.

But the one feature that differentiates SolidFire when targeting storage for Cloud Service Providers is their guaranteed volume level Quality of Service (QOS). This is important and SolidFire has positioned their QOS settings into an advantage. As best practice, Cloud Service Providers should always leverage the QOS functionality to improve their storage utilization

The QOS has:

  • Minimum IOPS – Lower IOPS means lower performance priority (makes good sense)
  • Maximum IOPS
  • Burst IOPS – for those performance spikes moments
  • Maximum and Burst MB/sec

The combination of QOS and storage capacity efficiency gives SolidFire the edge when cloud providers can scale both performance and capacity in a more balanced manner, something that is not so simple with traditional storage vendors that relies on lots of spindles to achieve IOPS performance sacrificing capacity in the process. But then again, with SSDs, the IOPS are plenty (for now). SolidFire does not boast performance numbers of millions of IOPS or having throughput into the tens of Gigabytes like Violin, Virident or Kaminario, but what they want to be recognized as the cloud storage as it should be in a cloud service provider environment.

SolidFire calls this Performance Virtualization. Just as we would get to carve our storage volumes from a capacity pool, SolidFire allows different performance profiles to be carved out from the performance pool. This gives SolidFire the ability to mix storage capacity and storage performance in a seemingly independent manner, customizing the type of storage bundling required of cloud storage.

In fact, SolidFire only claims 50,000 IOPS per storage node (including the IOPS means for replicating data blocks). Together with their native multi-tenancy capability, the 50,000 or so IOPS will align well with many virtualized applications, rather than focusing on a 10x performance improvement on a single applications. Their approach is more about a more balanced and spread-out I/O architecture for cloud service providers and the applications that they service.

Their management is also targeted to the cloud. It has a REST API that integrates easily into OpenStack, Citrix CloudStack and VMware vCloud Director. This seamless and easy integration, is more relevant because the CSPs already have their own management tools. That is why SolidFire API is a REST-ready, integration ready to do just that.

The power of the SolidFire API is probably overlooked by storage professionals trained in the traditional manner. But what SolidFire API has done is to provide the full (I mean FULL) capability of the management and provisioning of the SolidFire storage. Fronting the API with REST means that it is real easy to integrate with existing CSP management interface.

Together with the Storage Nodes and the Element OS, the whole package is aimed towards a more significant storage platform for Cloud Service Providers(CSPs). Storage has always been a tricky component in Cloud Computing (despite what all the storage vendors might claim), but SolidFire touts that their solution focuses on what matters most for CSPs.

CSPs would want to maximize their investment without losing their edge in the cloud offerings to their customers. SolidFire lists their benefits in these 3 areas:

  • Performance
  • Efficiency
  • Management

The edge in cloud storage is definitely solid for SolidFire. Their ability to leverage on their position and steering away from other all-Flash vendors’ battlezone could all make sense, as they aim to gain market share in the Cloud Service Provider space. I only wish they can share more about their technology online.

Fortunately, I found a video by SolidFire’s CEO, Dave Wright which gives a great insight about SolidFire’s technology. Have a look (it’s almost 2 hour long):

[2 hours later]: Phew, I just finished the video above and the technology is solid. Just to summarize,

  • No RAID (which is a Godsend for service providers)
  • Aiming for USD5.00 or less per Gigabyte (a good number!)
  • General availability in Q1 2012

Lots of confidence about the superiority of their technology, as portrayed by their CEO, Dave Wright.

Solid? Yes, Solid!

Amazon makes it easy

I like the way Amazon is building their Cloud Computing services. Amazon Web Services (AWS) is certainly on track to become the most powerful Cloud Computing company in the world. In fact, AWS might already is.  But they are certainly not resting on their laurels when they launched 2 new services in as many weeks РAmazon DynamoDB (last week) and Amazon Storage Gateway (this week).

I am particularly interested in the Amazon Storage Gateway, because it is addressing one of the biggest fears of Cloud Computing head-on. A lot of large corporations are still adamant to keep their data on-premise where it is private and secure. Many large corporations are still very skeptical about it even though Cloud Computing is changing the IT landscape in a massive way. The barrier to entry for large corporations is not something easy, but Amazon is adapting to get more IT divisions and departments to try out Cloud Computing in a less disruptive way.

The new service, is really about data storage and data backup for large corporations. This is important because large corporations have plenty of requirements for data storage and data to be backed up. And as we know, a large portion of the data stored does not need to be transactional or to be accessed frequently. This set of data is usually less frequently used, for archiving or regulatory compliance reasons, particular in the banking and healthcare industry.

In the data backup operations, the reason data is backed up is to provide a data recovery mechanism when a disaster strikes. Large corporations back up tons of data every day, weeks or month and this data only has value when there is a situation that requires data relevance, data immediacy or data recovery. Otherwise, it is just plenty of data taking up storage space, be it on disk or on tape.

Both data storage and data backup cost a lot of money, both CAPEX and OPEX. In CAPEX, you are constantly pressured to buy more storage to store the ever growing data. This leads to greater management and administration costs, both contributing heavily into OPEX costs. And I have not included the OPEX costs of floor space, power and cooling, people (training, salary, time and so on) typically adding up to 3-5x the operations costs relative to the capital investments. Such a model of IT operations related to storage cannot continue forever, and storage in the Cloud offers an alternative.

These 2 scenarios – data storage and data backup – are exactly the type of market AWS is targeting. In order to simplify and pacify large corporations, AWS introduced the Amazon Storage Gateway, that eases the large corporations to take some of their IT storage operations to the Cloud in the form of Amazon S3.

The video below shows the Amazon Storage Gateway:

The Amazon Storage Gateway is a piece of software “appliance” that is installed on-premise in the large corporation’s data center. It seamlessly integrates into the LAN and provides a SSL (Secure Socket Layer) connection to the Amazon S3. The data being transferred to the S3 is also encrypted with AES (Advanced Encryption Standard) 256-bit. Both SSL and AES-256 can give customers a sense of security and AWS claims that the implementation meets the data storage and data recovery standards used in the banking and healthcare industries.

The data storage and backup service regularly protects the customer’s data in snapshots, and giving the customer a rapid recovery platform should the customer experienced on-premise data corruption or data disruption. At the same time, the snapshot copies in the Amazon S3 can also be uploaded into Amazon EBS (Elastic Block Store) and testing or development environments can be evaluated and testing with Amazon EC2 (Elastic Compute Cloud). The simplicity of sharing and combining different Amazon services will no doubt, give customers a peace of mind, easing their adoption of Cloud Computing with AWS.

This new service starts with a 60-day free trial and moving on to a USD$125.00 (about Malaysian Ringgit $400.00) per gateway per month subscription fee. The data storage (inclusive of the backup service), costs only 14 cents per gigabyte per month. For 1TB of data, that is approximately MYR$450 per month. Therefore, minus the initial setup costs, that comes to a total of MYR$850 per month, slightly over MYR$10,000 per year.

At this point, I like to relate an experience I had a year ago when implementing a so-called private cloud for an oil-and-gas customers in KL. They were using the HP EVS (Electronic Vaulting Service) to an undisclosed HP data center hosting site in the Klang Valley. The HP EVS, which was an OEM of Asigra, was not an easy solution to implement but what was more perplexing was the fact that the customer had a poor understanding of what would be the objectives and their 5-year plan in keeping with the data protected.

When the first 3-4TB data storage and backup were almost used up, the customer asked for a quotation for an additional 1TB of the EVS solution. The subscription for 1TB was MYR$70,000 per year. That is 7x time more than the AWS MYR$10,000 per year cost! I have to salute the HP sales rep. It must have been a damn good convincing sell!

In the long run, the customer could be better off running their storage and backup on-premise with their HP EVA4400 and adding an additional of 1TB (and hiring another IT administrator) would have cost a whole lot less.

Amazon Web Services has already operating in Singapore for the past 2 years, and I am sure they are eyeing Malaysia as their regional market. Unless and until Malaysian companies offering Cloud Services know to use economies-of-scale to capitalize the Cloud Computing market, AWS is always going to be a big threat to CSP companies in Malaysia and a boon of any companies seeking cloud computing services anywhere in the world.

I urge customers in Malaysia to start questioning their so-called Cloud Service Providers if they can do what AWS is doing. I have low confidence of what the most local “cloud computing” companies can deliver right now. I hope they stop window dressing their service offerings and start giving real cloud computing services to customers. And for customers, you must continue to research and find out more which cloud services meet your business objectives. Don’t be flashed by the fancy jargons or technical idealism thrown at you. Always, always find out more because your business cost is at stake. Don’t be like the customer who paid MYR$70,000 for 1TB per year.

AWS is always innovating and the Amazon Storage Gateway is just another easy-to-adopt step in their quest for world domination.

Is there IOPS for Cloud Storage? – Nasuni style

I was in Singapore last week attending the Cloud Infrastructure Services course.

In the class, one of the foundation components of Cloud Computing is of course, storage. As the students and the instructor talked about Storage, one very interesting argument surfaced. It revolved around the storage, if it was offered on the cloud. A lot of people assumed that Cloud Storage would be for their databases, and their virtual machines, which of course, is true when the communication between the applications, virtual machines and databases are in the local area network of the Cloud Service Provider (CSP).

However, if the storage is offered through the cloud to applications that are sitting on-premise in the customer’s server room, then we have to think twice of how we perceive Cloud Storage. In this aspect, the Cloud Storage offered by the CSP is a Infrastructure-as-a-Service (IaaS), where the key service is Storage.¬†We have to differentiate that this Storage functions as a data container, and usually not for I/O performance reasons.

Though this concept probably will be easily understood by storage professionals like us, this can cause a bit confusion for someone new to the concept of Cloud Computing and Cloud Storage. This confusion, unfortunately, is caused by many of us who are vendors or solution providers, or even publications and magazines. We are responsible to disseminate correct information to customers, but due to our lack of knowledge and experience in this extremely new market of Cloud Storage, we have created the FUDs (Fear, Uncertainty and Doubt) and hype.

Therefore, it is the duty of this blogger to clear the vapourware, and hopefully pass on the right information to accelerate  the adoption of Cloud Storage in the near future. At this moment, given the various factors such as network costs, high network latency and lack of key network technologies similar to LAN in Cloud Computing, Cloud Storage is, most of the time, for data storage containership and archiving only. And there are no IOPS or any performance related statistics related to Cloud Storage. If any engineer or vendor tells you that they have the fastest Cloud Storage in the industry, do me a favour. Give him/her a knock on the head for me!

Of course, as technologies evolve, this could change in the near future. For now, Cloud Storage is a container, NOT a high performance storage in the cloud. It is usually not meant for transactional data. There are many vendors in the Cloud Storage space from real CSPs to storage companies offering re-packaged storage boxes that are “cloud-ready”. A good example of a CSP offering Cloud Storage is Amazon S3 (Simple Storage Service). And storage vendors such as EMC and HDS are repackaging and rebranding their storage technologies as object storage, ready for the cloud. EMC Atmos is really a repackaged and rebranded Centera, with some slight modifications, while HDS , using their Archiving solution, has HCP (aka HCAP). There’s nothing wrong with what EMC and HDS have done, but before the overhyping of the world of Cloud Computing, these platforms were meant for immutable data archiving reasons. Just thought you should know.

One particular company that captured my imagination and addresses the storage performance portion is Nasuni. Of course, they are quite inventive with the Cloud Storage Gateway approach. Nasuni comes up with a Cloud Storage Gateway filer appliance, which can be either a physical 1U server or as a VMware or Hyper-V virtual appliance sitting on-premise at the customer’s site.

The key to this is “on-premise”, which allows access to data much faster because they are locally-cached¬†in the Nasuni filer appliance itself. This Nasuni filer piece addresses the Cloud Storage “performance” piece but Nasuni do not claim any performance statistics with such implementation. The clever bit is that this addresses data or files that are transactional in nature, i.e. NFS or CIFS, to serve data or files “locally”. (I wonder if Nasuni filer has iSCSI as well. Hmmmm….)

In the Nasuni architecture, they “break up” their “Cloud Storage” into 2 pieces. Piece #1 sits on-premise, at the customer site, and acts as a bridge to the Piece #2, that is sitting in a Cloud Storage. From a simplified view, have a look at the diagram below:

 

 

Piece #1 is the component that handles some of the transactional traffic related to files. In a more technical diagram below, you can see that the Nasuni filer addresses the file sharing portion, using the local disks on the filer appliance as a local caching mechanism.

 

Furthermore, older file pieces are whiffed away to the any Cloud Storage using the Cloud Connector interface, hence giving the customer a sense that their storage capacity needs can be limitless if they want to (for a fee, of course). At the same time, the Nasuni filer support thin provisioning and snapshots. How cool is that!

The Cloud Storage piece (Piece #2) is used for the data container and archiving reasons. This component can be sitting and hosted at Amazon S3, Microsoft Azure, Rackspace Cloud Files, Nirvanix Storage Delivery Network and Iron Mountain Archive Services Platform.

The data communication and transfer between the Nasuni filer is secure, encrypted, deduplication and compressed, giving it the efficiency and security that most customers would be concerned about. The diagram below explains the dat communication and data transfer bit.

 

In this manner, the Nasuni filer can replace traditional NAS platforms and can potentially provide a much lower total cost of ownership (TCO) in the long run. Nasuni does not pretend to be a NAS replacement. To me, this concept is very inventive and could potentially change the way we perceive file sharing and file server, obscuring and blurring concept of NAS.

Again, I would like to reiterate that Nasuni does not attempt to say their solution is a NAS or a performance-based Cloud Storage but what they have cleverly packaged seems to be appealing to customers. Their customer base has grown 78% in Q2 of 2011. It’s just too bad they are not here in Malaysia or this part of the world (yet).

IOPS in Cloud Storage? Not yet.