ARC reactor also caches?

The fictional arc reactor in Iron Man’s suit was the epitome of coolness for us geeks. In the latest edition of Oracle Magazine, Iron Man is on the cover, as well as the other 5 Avengers in a limited edition series (see below).

At just about the same time, I have been reading up on ARC (Adaptive Replacement Cache), the caching algorithm adopted in ZFS. I am learning in depth how ZFS caching works, as opposed to the more popular LRU (Least Recently Used) caching algorithm that is used in most storage cache memory. Having said that, most storage vendors employ a modified LRU algorithm, with the intention of keeping the most recently accessed pages in memory as long as possible. This is true in NetApp’s Data ONTAP (maybe not ONTAP GX, with which I have little experience) and EMC FlareOE. ONTAP goes further by keeping the most frequently accessed pages permanently in memory. EMC folks would probably refer to most recently accessed as spatial locality while most frequently accessed as temporal locality.
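To make the LRU idea concrete, here is a minimal Python sketch of a plain LRU cache. It is only an illustration under my own assumptions: the class name LRUCache and its get/put methods are hypothetical, and it is not how ONTAP, FlareOE or ZFS actually implement their caches.

```python
from collections import OrderedDict


class LRUCache:
    """Tiny LRU cache: evicts the least recently used page when full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()  # oldest entry first, newest entry last

    def get(self, key):
        if key not in self.pages:
            return None  # cache miss
        self.pages.move_to_end(key)  # mark as most recently used
        return self.pages[key]

    def put(self, key, value):
        if key in self.pages:
            self.pages.move_to_end(key)
        self.pages[key] = value
        if len(self.pages) > self.capacity:
            self.pages.popitem(last=False)  # evict the least recently used page


cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")     # "a" becomes the most recently used page
cache.put("c", 3)  # cache is full, so "b" is evicted
```

Note that a pure LRU like this only tracks recency. A page that is read very often but not very recently can still be pushed out by a burst of one-off reads, which is the kind of weakness a recency-plus-frequency scheme tries to address.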

Why is ZFS using ARC and what is ARC?

Server way of locked-in storage

It is kind of interesting that every vendor out there claims to be as open as can be, but the reality is that the competitive nature of the game is forcing storage vendors to speak open while their actions are certainly not.

Confused? I am beginning to see a trend … a trend that is forcing customers to be locked-in with a certain storage vendor. I am beginning to feel that customers are given fewer choices, especially when the brand of the server they select for their applications will have implications on the brand of storage they will be locked into.

And surprise, surprise, SSDs are the pawns of this new cloak-and-dagger game. How? Well, I have been observing this for quite a while now, and when HP announced their SMART portfolio for their storage, I felt it was time for me to say something.

In the announcement, it was reported that HP is coming out with its 8th generation ProLiant servers. As quoted:

The eighth generation ProLiant is turbo-charging its storage with a Smart Array containing solid state drives and Smart Caching.

It also includes two Smart storage items: the Smart Array controllers and Smart Caching, which both feature solid state storage to solve the disk I/O bottleneck problem, as well as Smart Data Services software to use this hardware

From the outside, analysts are claiming this is a reaction to the recent EMC VFCache product (I blogged about it here), and HP was quick to paint the EMC VFCache solution as a first generation product, lacking the smarts (pun intended) of what the HP products have to offer. You can read about its performance prowess in the HP Connect blog.

Similarly, Dell announced their ExpressFlash solution that ties up its 12th generation PowerEdge servers with their flagship (what else), Dell Compellent storage.

The idea is very obvious. Put a PCIe-based flash caching card in the server, and use a corresponding caching/tiering technology that ties the server to a certain brand of storage. Only with this card, which (incidentally) works only with this brand of servers, will you, Mr. Customer, be able to take advantage of the performance power of this brand of storage. Does that sound open to you?

HP is doing it with its ProLiant servers; Dell is doing it with its ExpressFlash; EMC’s VFCache, while not advocating any brand of servers, is doing it because VFCache works only with EMC storage. We have seen Oracle doing it with Oracle Exadata. Oracle Enterprise database works best with Oracle’s own storage and the intelligence is in its SmartScan layer, a proprietary technology that works exclusively with the storage layer in the Exadata. Hitachi Japan, with its Hitachi servers (yes, Hitachi servers that we rarely see in Malaysia), has had such a technology for the last 2 years. I wouldn’t be surprised if IBM and Fujitsu already have something in store (or probably I missed the announcement).

NetApp has been slow in the game, but we hope to see them coming out with their own server-based caching products soon. More pure-play storage vendors are already singing the tune of SSDs (though not necessarily server-based).

The trend is obvious too, because the messaging is almost always about storage performance.

Yes, I totally agree that storage (any storage) has a performance bottleneck, especially when it comes to IOPS, response time and throughput. And every storage vendor is claiming that SSDs, in one form or another, are the knight in shining armour, ready to rid the world of lousy storage performance. Well, SSDs are not the panacea for storage performance headaches because while they solve some performance issues, they introduce new ones somewhere else.

But it is becoming an excuse to introduce storage vendor lock-in, and how have customers responded to this new “concept”? Things are fairly new right now, but I would always advise customers to find out and ask questions.

Cloud storage for no vendor lock-in? Going to the cloud comes with cloud service provider lock-in as well, but that’s another story.


Lightning about to strike

Watch out for February 6th, 2012 folks! The Lightning is about to strike!

Yes, it is likely that EMC will be announcing their server-based, 8-lane PCIe Flash memory card early in February. The PCIe card was dubbed “Project Lightning” when it was first announced at EMC World in May last year. It represents EMC’s first foray into products that sit on the server side, giving the impression that EMC could be entering the server business. I blogged about this way back in September last year. As explained by the EMC folks, they are not going into the server business but rather “extending” their performance tiering into the server space. Think of it like an umbilical cord that sucks the server’s CPU processing power to give maximum performance boost for the EMC storage.

The card will sport a Solid State Drive from LSI (WarpDrive) and comes in 100/200/300GB capacities. Here’s a picture of what the Lightning card would look like:

The SSD is SLC (Single Level Cell) and is capable of delivering 150,000 random read IOPS based on 4K blocks and 190,000 random write IOPS. It can squeeze out 1.4GB/sec in read throughput. While it is not on par with the performance of Fusion-IO, it can definitely do well leveraging EMC’s huge customer base. Furthermore, PCIe-based Flash memory cards such as Fusion-IO will not be able to take advantage of the bridge that links the server and the storage, leaving them confined to the server’s resources. The advantage is definitely EMC’s when you explore the possibilities.

Here’s a view of a slide from Virtual Geek summarizing Project Lightning:

The Lightning card is aimed at customers who demand the highest performance, even higher than Tier 0. It will be integrated with EMC’s FAST (Fully Automated Storage Tiering) technology and is available to the VNX and VMAX platforms.
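As a quick sanity check on those numbers, throughput is simply IOPS multiplied by block size. The small Python sketch below works that out. It assumes 4K means 4,096 bytes and that the write figure also uses 4K blocks, which the spec quoted above does not actually state.

```python
def throughput_mb_per_s(iops, block_bytes):
    """Convert an IOPS figure into MB/s (decimal megabytes) for a given block size."""
    return iops * block_bytes / 1_000_000


BLOCK_4K = 4096  # assumption: "4K" means 4,096 bytes

read_mb = throughput_mb_per_s(150_000, BLOCK_4K)   # random reads
write_mb = throughput_mb_per_s(190_000, BLOCK_4K)  # random writes, block size assumed

print(f"4K random reads:  {read_mb:.1f} MB/s")
print(f"4K random writes: {write_mb:.1f} MB/s")
```

At 4K blocks the random read figure works out to roughly 0.6GB/sec, well under the 1.4GB/sec read throughput, so my guess is that the throughput number was measured with larger transfers.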

So watch out folks, because Lightning is about to strike soon!

Is there IOPS for Cloud Storage? – Nasuni style

I was in Singapore last week attending the Cloud Infrastructure Services course.

In the class, one of the foundation components of Cloud Computing covered was, of course, storage. As the students and the instructor talked about Storage, one very interesting argument surfaced. It revolved around storage offered on the cloud. A lot of people assumed that Cloud Storage would be for their databases and their virtual machines, which, of course, is true when the communication between the applications, virtual machines and databases is in the local area network of the Cloud Service Provider (CSP).

However, if the storage is offered through the cloud to applications that are sitting on-premise in the customer’s server room, then we have to think twice about how we perceive Cloud Storage. In this respect, the Cloud Storage offered by the CSP is an Infrastructure-as-a-Service (IaaS), where the key service is Storage. We have to recognize that this Storage functions as a data container, and is usually not there for I/O performance reasons.

Though this concept will probably be easily understood by storage professionals like us, it can cause a bit of confusion for someone new to the concept of Cloud Computing and Cloud Storage. This confusion, unfortunately, is caused by many of us who are vendors or solution providers, or even publications and magazines. We are responsible for disseminating correct information to customers, but due to our lack of knowledge and experience in this extremely new market of Cloud Storage, we have created the FUD (Fear, Uncertainty and Doubt) and hype.

Therefore, it is the duty of this blogger to clear the vapourware, and hopefully pass on the right information to accelerate the adoption of Cloud Storage in the near future. At this moment, given factors such as network costs, high network latency and the lack of key network technologies similar to LAN in Cloud Computing, Cloud Storage is, most of the time, for data storage containership and archiving only. And there are no IOPS or any other performance statistics for Cloud Storage. If any engineer or vendor tells you that they have the fastest Cloud Storage in the industry, do me a favour. Give him/her a knock on the head for me!

Of course, as technologies evolve, this could change in the near future. For now, Cloud Storage is a container, NOT high performance storage in the cloud. It is usually not meant for transactional data. There are many vendors in the Cloud Storage space, from real CSPs to storage companies offering re-packaged storage boxes that are “cloud-ready”. A good example of a CSP offering Cloud Storage is Amazon S3 (Simple Storage Service). And storage vendors such as EMC and HDS are repackaging and rebranding their storage technologies as object storage, ready for the cloud. EMC Atmos is really a repackaged and rebranded Centera, with some slight modifications, while HDS, using their Archiving solution, has HCP (aka HCAP). There’s nothing wrong with what EMC and HDS have done, but before the overhyping of the world of Cloud Computing, these platforms were meant for immutable data archiving purposes. Just thought you should know.

One particular company that captured my imagination and addresses the storage performance portion is Nasuni. Of course, they are quite inventive with the Cloud Storage Gateway approach. Nasuni has come up with a Cloud Storage Gateway filer appliance, which can be either a physical 1U server or a VMware or Hyper-V virtual appliance sitting on-premise at the customer’s site.

The key to this is “on-premise”, which allows much faster access to data because it is locally cached in the Nasuni filer appliance itself. This Nasuni filer addresses the Cloud Storage “performance” piece, but Nasuni does not claim any performance statistics with such an implementation. The clever bit is that this addresses data or files that are transactional in nature, i.e. NFS or CIFS, serving data or files “locally”. (I wonder if the Nasuni filer has iSCSI as well. Hmmmm….)

In the Nasuni architecture, they “break up” their “Cloud Storage” into 2 pieces. Piece #1 sits on-premise, at the customer site, and acts as a bridge to Piece #2, which sits in the Cloud Storage. For a simplified view, have a look at the diagram below:



Piece #1 is the component that handles some of the transactional traffic related to files. In a more technical diagram below, you can see that the Nasuni filer addresses the file sharing portion, using the local disks on the filer appliance as a local caching mechanism.


Furthermore, older file pieces are whisked away to any Cloud Storage using the Cloud Connector interface, hence giving the customer a sense that their storage capacity can be limitless if they want it to be (for a fee, of course). At the same time, the Nasuni filer supports thin provisioning and snapshots. How cool is that!

The Cloud Storage piece (Piece #2) is used as the data container and for archiving. This component can be hosted at Amazon S3, Microsoft Azure, Rackspace Cloud Files, Nirvanix Storage Delivery Network or Iron Mountain Archive Services Platform.

The data communication and transfer between the Nasuni filer and the Cloud Storage is secure, encrypted, deduplicated and compressed, giving it the efficiency and security that most customers would be concerned about. The diagram below explains the data communication and data transfer bit.
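The gateway idea can be sketched in a few lines of Python. This is a toy model under my own assumptions, not Nasuni’s actual design: the GatewayCache class, its method names and the plain dictionary standing in for the cloud object store are all hypothetical. It only shows the two behaviours described above, which are serving recently used files from the local cache and pushing the oldest files out to the cloud when the local disks fill up.

```python
from collections import OrderedDict


class GatewayCache:
    """Toy cloud storage gateway: small local cache in front of a big cloud store."""

    def __init__(self, local_capacity):
        self.local_capacity = local_capacity
        self.local = OrderedDict()  # on-premise cache, least recently used first
        self.cloud = {}             # stand-in for the cloud object store

    def write(self, name, data):
        self.local[name] = data
        self.local.move_to_end(name)  # newest file is the most recently used
        self._spill()

    def read(self, name):
        if name in self.local:
            self.local.move_to_end(name)
            return self.local[name], "local"  # fast path: served on-premise
        data = self.cloud[name]               # slow path: fetch over the WAN
        self.local[name] = data               # keep a local copy for next time
        self._spill()
        return data, "cloud"

    def _spill(self):
        # Move the least recently used files to the cloud until the cache fits.
        while len(self.local) > self.local_capacity:
            old_name, old_data = self.local.popitem(last=False)
            self.cloud[old_name] = old_data


gw = GatewayCache(local_capacity=2)
gw.write("a.doc", b"A")
gw.write("b.doc", b"B")
gw.write("c.doc", b"C")  # local cache is full, so a.doc moves to the cloud
```

Reading c.doc here is served locally, while reading a.doc has to go to the cloud first and is then cached on-premise again.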


In this manner, the Nasuni filer can take the place of traditional NAS platforms and can potentially provide a much lower total cost of ownership (TCO) in the long run, although Nasuni itself does not pretend to be a NAS replacement. To me, this concept is very inventive and could potentially change the way we perceive file sharing and file servers, blurring the concept of NAS.

Again, I would like to reiterate that Nasuni does not attempt to say their solution is a NAS or a performance-based Cloud Storage but what they have cleverly packaged seems to be appealing to customers. Their customer base has grown 78% in Q2 of 2011. It’s just too bad they are not here in Malaysia or this part of the world (yet).

IOPS in Cloud Storage? Not yet.


Signs of things to come?

I wanted to sign off early tonight but an article in ComputerWorld caught my tired eyes. It was titled “EMC to put hardware into servers, VMs into storage” and after I read it, I couldn’t help but juxtapose the article with what I said earlier in my blogs, here and here.

It is very interesting to note that “EMC runs vSphere directly on the storage controllers and then uses vMotion to migrate VMs from application servers onto the storage array, ..” since the storage boxes have enough compute power to run Virtual Machines on the storage. The traditional and widely accepted view is that VMs should run on servers. Contrary to that belief, EMC has already demonstrated the capability to run VMs on their VNX, Isilon and Symmetrix.

And soon, with EMC’s Project Lightning (announced at EMC World in May 2011), they will be introducing server-side PCIe-based SSDs, à la Fusion-IO. This is different from the NetApp PAM/FlashCache PCIe-based card, which sits on their arrays, not on hosts or servers. And it is also very interesting to note that this EMC server-side PCIe Flash SSD card will become a bridge to EMC’s FAST (Fully Automated Storage Tiering) architecture, enabling it to place hot, warm and cold data strategically on different storage tiers of the applications on VMware’s VMs (now on either the server or the storage), perhaps using vMotion as a data mover on top of the “specialized” link created by the server-side EMC PCIe card.

This also blurs the line between servers and storage and creates a virtual architecture between them, because what used to be a distinct data border of the servers is now being melded into the EMC storage array, virtually.

2 red alerts are flashing in my brain right now.

  1. The “bridge” has just linked the server back to the storage, after years of talking about networked storage. The server is ONE again with the storage. Doesn’t that look to you like a server with plenty of storage? It has come full circle. But what is more interesting, and what I am eager to see, is what more this “bridge” is capable of when it comes to data management. vMotion might be the first of many new “protocol” breeds to enhance data management and mobility with this “bridge”. I am salivating right now at this massive potential.
  2. What else can EMC do with the VMware API? The capability I am writing about right now is made possible by EMC tweaking VMware’s API to get much, much more out of it. As the VMware vStorage API is continually being enhanced, the potential is, again, massive and could change the entire landscape of cloud computing and, subsequently, the entire IT landscape. This is another Pavlov’s dog moment (see the figures below as part of my satirical joke on myself).


Sorry, the diagram below is not related to my blog entry. It is just my way of describing myself right now. 😉

I am extremely impressed with what EMC is doing. A lot of smarts and thinking go into this and these are definitely signs of things to come. The server and the storage are “merging again”. Think of it as Borg assimilation in Star Trek.

Resistance is futile!

Does all-SSD storage make sense?

I have been receiving a lot of email updates from Texas Memory Systems for many months now. I am a subscriber to their updates, and Texas Memory Systems is the granddaddy of flash and DRAM-based storage systems. They are not cheap but they are blazingly fast.

Lately, more and more vendors have been coming out with all-SSD storage arrays. Startups such as Pure Storage, Violin Memory and Nimbus Data Systems have been pushing the envelope selling all-SSD storage arrays. A few days ago, EMC also announced their all-SSD storage array. As quoted, the new EMC VNX5500-F utilizes 2.5-in, single-level cell (SLC) NAND flash drives to deliver 10 times the performance of the hard-drive based VNX arrays. And that is important because EMC has just become one of the earliest big gorillas to jump onto the bandwagon.

But does it make sense? Can one justify investing in an all-SSD storage array?

At this point, especially in this part of the world, I predict that not many IT managers are willing to put their heads on the chopping block and invest in an all-SSD storage array. They would become guinea pigs for a very expensive exercise and the state of the economy is not helping. Therefore automatic storage tiering (AST) might stick better than having an all-SSD storage array. The cautious and prudent approach is less risky, as I have mentioned in a past blog.

I wrote about Pure Storage in a previous blog and the notion that SSDs will offer plenty of IOPS and throughput. If the performance gain translates into higher productivity and getting the job done quicker, then I am all for SSDs. In fact, given the extra performance numbers

There is no denying the fact that the industry is moving towards SSDs, and it makes sense. That day will come in the near future, but not now for customers in this part of the world.

SSDs coming into mainstream … be Ready!

There has been a slew of SSD news in the storage blogosphere with the big one from eBay.

eBay has just announced that it has 100TB of SSDs from Nimbus Data Systems. On top of that, OCZ, SanDisk and STEC, all major SSD manufacturers, have announced a whole lot of new products, with the PCIe SSD cards leading the way. The most interesting thing is that the $/GB has gone down significantly, getting very close to the $/GB of spinning disks. This is indeed good news for the industry because SSDs deliver low latency, high IOPS, low power consumption and many other new benefits.

Side note: As I am beginning to understand more about SSDs, I found out that NAND flash SSDs have latency in the microseconds range, compared to spinning HDDs, which have latency in the milliseconds range. In addition, DRAM SSDs have latency in the nanoseconds range, which is basically memory-type access. DRAM SSDs are, of course, more expensive.

SSDs are coming into the mainstream very soon, and this will inevitably drive a new generation of applications and accelerate growth in knowledge acquisition. We are already seeing the decline of Fibre Channel disks and the rise of SAS and SATA disks, but SSDs in enterprise storage, as far as I am concerned, bring forth 2 new challenges which we, as professionals and users in the storage networking environment, must address.

These challenges can be simplified to:

  1. Are we ready?
  2. Where is the new bottleneck?

To address the first challenge, we must understand the second challenge first.

In system architectures, we know of various performance bottlenecks that exist in the CPU, memory, bus, bridge, buffer, I/O devices and so on. In order to deliver the data to be processed, we have to view the data block/byte service request in its entirety.

When a user requests a file, this is a service request. The end objective is that the user is able to read and write the file he/she requested. The time taken from the beginning of the request to the end of it is known as service time, of which latency plays a big part. We assume that the file resides in a NAS system in the network.

The request for the file begins by going through the file system layer of the host the user is accessing, then through user and kernel space, moving on through the device driver of the NIC and through the TCP/IP stack (which has its own set of buffer overheads and so on), before passing over the physical wire. From there it moves on through the NAS system with its RAID system, file system and so on until it reaches the requested file. Note that I have shortened the entire process for simplicity, but it shows that the service request passes through a whole lot of things in order to complete.

Bottlenecks exist everywhere within the service request path and are also subject to external factors related to that service request. For a long, long time, I/O has been the biggest bottleneck to the processing of the service request because it is usually, and almost always, the slowest component in the entire scheme of things.

The introduction of SSDs will improve the I/O performance tremendously, into the micro- or even nano-seconds range, putting it on almost equal performance terms with other components in the system architecture. The buses and the bridges in the computer systems could be the new locations where the bottleneck of a service request exists. Hence we have to use this understanding to change the modus operandi of the existing types of applications such as databases, email servers and file servers.
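One way to picture the shift is to add up the latency of each stage in the request path and see which stage dominates. In the Python sketch below, every number is a made-up illustration (in microseconds), not a measurement, and the stage names are my own simplification of the path described above.

```python
def service_time(stages):
    """Total service time is the sum of the latencies of every stage in the path."""
    return sum(stages.values())


def bottleneck(stages):
    """The bottleneck is the stage that contributes the most latency."""
    return max(stages, key=stages.get)


# Hypothetical per-stage latencies in microseconds for a NAS file request.
hdd_path = {
    "host file system": 20,
    "NIC and TCP/IP stack": 50,
    "network wire": 100,
    "NAS file system and RAID": 80,
    "disk I/O": 5000,  # spinning disk: milliseconds
}

# Same path, but the disk I/O stage is now an SSD (assumed 60 microseconds).
ssd_path = dict(hdd_path)
ssd_path["disk I/O"] = 60

for label, path in (("HDD", hdd_path), ("SSD", ssd_path)):
    print(f"{label}: {service_time(path)} us total, bottleneck = {bottleneck(path)}")
```

With these assumed numbers the disk dominates the HDD path, but once the SSD is in place the largest contributor is something else in the path. That is the point: the bottleneck does not disappear, it moves.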

The usual tried-and-tested best practices may have to be changed to adapt to the shift of the bottleneck.

So, we have to equip ourselves with an understanding of what SSDs are doing and will do to the industry. We have to be ready and take advantage of this “quiet” period to learn and know more about SSD technology and what the experts are saying. I found a great website that introduces and speaks about SSDs in depth. It is called StorageSearch and it is what I consider the best treasure trove on the web right now for SSD information. It is run by a gentleman named Zsolt Kerekes. Go check it out.

Yup, we must be ready when SSDs hit the mainstream, and ride the wave.



Solid State Drives … are they reliable?

There have been a lot of questions about Solid State Drives (SSDs), also called Enterprise Flash Drives (EFDs) by some vendors. Are they less reliable than our 10K or 15K RPM hard disk drives (HDDs)? I was asked this question on stage, in the middle of presenting the topic of Green Storage 3 weeks ago.

Well, the usual answer from the typical techie is … “It depends”.

We all fear the unknown and given the limited knowledge we have about SSDs (they are fairly new in the enterprise storage market), we tend to be drawn more to the negatives than the positives of what SSDs are and what they can be. I, for one, believe that SSDs have more positives and over time, we will grow to accept that this is all part of the IT evolution. IT has always evolved into something better, stronger, faster, more reliable and so on. As famously quoted by Jeff Goldblum’s character Dr. Ian Malcolm in the movie Jurassic Park, “Life finds a way …”, and IT will always find a way to be just that.

SSDs are typically categorized into MLC (multi-level cell) and SLC (single-level cell) types. They typically have a predictable life expectancy, ranging from tens of thousands of writes to more than a million writes per drive. This, by no means, is a measure of the reliability of SSDs versus HDDs. However, SSD controllers and drives employ various techniques to enhance the durability of the drives. A common method is to balance the I/O accesses across the disk blocks, adapting to the I/O usage patterns. This can prolong the lifespan of the disk blocks (and subsequently the drives themselves) and also ensures that the performance of the drive does not lag, since the I/O is more “spread-out” in the drive. This is known as a “wear-leveling” algorithm.

Most SSDs proposed by enterprise storage vendors are MLC, to meet the market’s price demands in $/IOPS and $/GB, because SLC, with its higher durability, is definitely more expensive. MLCs also have a higher BER (bit error rate): it is known that MLCs have 1 BER per 10,000 writes while SLCs have 1 BER per 100,000 writes.
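The simplest form of wear-leveling can be sketched as “always write to the block that has been written the least”. The Python below is a toy under that assumption. The WearLeveler class is hypothetical, and real SSD controllers are far more sophisticated (they also have to deal with erase blocks, static data and garbage collection).

```python
class WearLeveler:
    """Toy wear-leveler: each write goes to the least-written block."""

    def __init__(self, num_blocks):
        self.write_counts = [0] * num_blocks  # writes seen by each block

    def pick_block(self):
        # Index of the least-worn block; ties go to the lowest index.
        return min(range(len(self.write_counts)),
                   key=self.write_counts.__getitem__)

    def write(self):
        block = self.pick_block()
        self.write_counts[block] += 1
        return block


leveler = WearLeveler(num_blocks=4)
for _ in range(10):
    leveler.write()

print(leveler.write_counts)  # writes end up spread evenly across the blocks
```

No block ever gets more than one write ahead of the others, so no single block wears out long before the rest of the drive.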

But the advantages of SSDs clearly outweigh those of HDDs. Fast access (much lower latency) is one of the main advantages. Higher IOPS is another one. SSDs can provide from several thousand IOPS to more than 1 million IOPS, far more than enterprise HDDs. A typical 7,200 RPM SATA drive has less than 120 IOPS while a 15,000 RPM Fibre Channel or SAS drive ranges from 130-200 IOPS. That IOPS advantage is definitely a vast differentiator when comparing SSDs and HDDs.

We are also seeing both drive-format and card-format SSDs in the market. The drive-format type typically comes in the 2.5″ and 3.5″ profiles and tends to fit into enterprise storage systems as “disk drives”. These are known to provide capacity. On the other hand, there are also card-format SSDs, which sit on a PCIe card that is inserted into host systems. These tend to address the performance requirements of systems and applications. The well-known PCIe vendors are Fusion-IO, which is in the high-end performance market, and NetApp, which peddles the PAM (Performance Acceleration Module) card in its filers. The PAM card has been renamed as FlashCache. Rumour has it that EMC will be coming out with a similar solution soon.

Another thing to note is that SSDs can be read-biased or write-biased. Most SSDs in the market tend to be more read-biased, published with high read IOPS, not write IOPS. Therefore, we have to be prudent and know what is out there. This means that some solutions, such as the NetApp FlashCache, are more suitable for read-heavy I/O than write-heavy I/O. The FlashCache addresses a large segment of the enterprise market because most applications are heavier on reads than writes.

SSDs have been positioned as the Tier 0 layer in the Automated Storage Tiering segment of Enterprise Storage. Vendors such as Dell Compellent, HP 3PAR and also EMC (with FAST2) position themselves with enhanced tiering techniques to automate LUN and sub-LUN tiering, and customers have been lapping up this feature like little puppies.

However, an up-and-coming segment for SSD usage is positioning the SSDs as an extended read or write cache on top of the existing memory of the systems. NetApp’s FlashCache is a PCIe solution that is basically an extended read cache. An interesting feature of Oracle Solaris ZFS called Hybrid Storage Pool allows the creation of read and write caches using SSDs. The Sun fellas even came up with cool names – ReadZilla and LogZilla – for these Hybrid Storage Pool features.

Basically, I have poured out what I know about SSDs (so far) and I intend to learn more about them. SNIA (Storage Networking Industry Association) has a Technical Working Group for Solid State Storage. I advise the readers to check it out.
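For those wondering where the HDD numbers come from, a common rule of thumb is that a drive can do roughly one random I/O per (average seek time + average rotational latency), where the rotational latency is half a revolution. The Python sketch below applies that rule. The seek times are my own assumed typical values, not figures from any datasheet.

```python
def rotational_latency_ms(rpm):
    """Average rotational latency: the time for half a revolution, in ms."""
    return 0.5 * 60_000 / rpm


def hdd_iops(rpm, avg_seek_ms):
    """Rule-of-thumb random IOPS: one I/O per (seek + rotational latency)."""
    return 1000 / (avg_seek_ms + rotational_latency_ms(rpm))


# Assumed average seek times (illustrative only).
sata_7200 = hdd_iops(rpm=7_200, avg_seek_ms=8.5)
sas_15k = hdd_iops(rpm=15_000, avg_seek_ms=3.5)

print(f"7,200 RPM SATA:  about {sata_7200:.0f} IOPS")
print(f"15,000 RPM SAS:  about {sas_15k:.0f} IOPS")
```

With those assumptions, both results land inside the ranges quoted above, and it becomes obvious why a device with no moving parts can be orders of magnitude ahead.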
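At its core, automated tiering is a placement policy: count how often each chunk of data is accessed and put the hottest chunks on the fastest tier. The Python sketch below is a deliberately naive version of that idea. The thresholds, tier names and the assign_tier function are all hypothetical, and it does not represent how Compellent, 3PAR or FAST actually make their decisions.

```python
def assign_tier(access_count, hot_threshold=100, warm_threshold=10):
    """Pick a tier for a chunk of data based on how often it was accessed."""
    if access_count >= hot_threshold:
        return "tier0-ssd"   # hot data goes to SSD
    if access_count >= warm_threshold:
        return "tier1-sas"   # warm data goes to fast spinning disks
    return "tier2-sata"      # cold data goes to high-capacity disks


# Hypothetical access counts per sub-LUN chunk over some sampling window.
access_counts = {"chunk-001": 250, "chunk-002": 42, "chunk-003": 3}

placement = {chunk: assign_tier(count) for chunk, count in access_counts.items()}
print(placement)
```

A real implementation would re-evaluate the counts periodically and move chunks between tiers in the background, which is where the vendors’ “enhanced techniques” come in.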