More specialized appliances at Oracle OpenWorld

I was reading the news from Oracle OpenWorld and a slew of news about specialized appliances are on the menu.

Oracle added Big Data Appliance and Oracle Exalytics Business Intelligence Machine to its previous numero uno, Exadata Database Machine. EMC, also announced its Green Plum Data Computing Appliance and also its VNX Unified Storage for Oracle.

As quoted

The EMC VNX Unified Storage for Oracle is a VNX system that has 
Oracle installed in a VMware vSphere virtual machine environment. 
The system is meant to unify all Oracle environments--database over 
Oracle Direct NFS, application servers over NFS, and testing and 
development over NFS--resulting in less disk space used and faster 
testing. EMC says this configuration was made because 50% of Oracle 
customers are virtualizing their systems today.

The VNX Unified Storage for Oracle includes EMC's Fully Automated 
Storage Tiering (FAST) technology, which migrates most frequently 
used data between a primary Fibre Channel drive and solid state drives 
and migrates less frequently used data to Serial ATA (SATA) drives and 
its FAST Cache. In an Oracle environment, FAST is well-suited to 
database applications that generate a large number of random 
inputs-outputs, that experience sudden bursts in user query activity, 
or a high number of user loads and where the entire working set can 
be contained in the solid state drive cache.

Based on testing carried out on an Oracle Real Application Clusters 
(RAC) 11g database that was configured to access the VNX7500 file 
storage over the Network File System (NFS), using the Oracle 
Direct NFS (dNFS) client, results showed an 100% improvement in 
transactions per minute (TPM), 170% improvement in IOPS, and 
a 79% decrease in response time, the company said.

As for GreenPlum, EMC quoted:

The company also is showing off the EMC Greenplum Data Computing 
Appliance(DCA) for Big Data Analytics configuration, which provides 
a new migration path to Greenplum for Oracle Data Warehouse. This 
system includes the Greenplum Data Computing Appliance, EMC's 
Global Data Warehouse, and EMC's IT Business Intelligence Grid 
infrastructure. The EMC Greenplum DCA consists of 8 to 16 segment 
servers running Red Hat Enterprise Linux. Each segment server 
contains 96 to 192 processor cores, with 384 GB to 768 GB of 
memory per segment server. The DCA includes 12 600-GB Serial 
Attached SCSI (SAS) 15K RPM drives for a total useable and 
compressed capacity of 73 TB to 144 TB. The DCA competes with 
Oracle's Exadata Database Machine.

In tests performed with this server/storage configuration and a 
15-TB Oracle Data Warehouse, the DCA processed a 99 million rows 
query in less than 28 seconds vs. seven minutes in a traditional 
Oracle environment and data loads decreased from six days to 29 
minutes

It is getting pretty obvious that specialized appliances are making waves at Oracle OpenWorld but what’s more interesting is the return of a combined and integrated environment of compute and storage as I have mentioned in my previous blog. And I forsee that these specialized appliances will be one of the building blocks of cloud computing together with general purposes platforms such as x86, JBODs and the glue to all these, virtualization, notably VMware.

The rise of the specialized appliance

Compute and storage are 2 components within the IT infrastructure which are surely converging. SAN and NAS are facing their greatest adversary yet, and could be made insignificant if the cloud and virtualization game had their way. This is giving rise to the a new breed of solution, a specialized appliance where both compute and storage are ONE. Rising from the ashes of shared storage (SAN and NAS, take note), we are beginning to see things going back to way of direct, internal storage.

There were some scuffles in the bushes about 5 years, where Sun (now Oracle) was ahead of its game. The Sun Fire X4500 (aka Thumper) was one of the strong candidates to challenge the SAN/NAS duopoly in this networked storage period. X4500 integrated both the server and the storage components together, using ZFS as a file system and volume manager to deliver a very high throughput on all the JBOD disks very efficiently. ZFS acted as the RAID, so there was no need to have specialized RAID hardware. This proved that a very high performance storage solution can be easily integrated using standard off-the-shelf infrastructure components and the x86 architecture. By combining both compute and storage together, there were hints that the industry was about to rise up to Direct-Attached Storage (DAS) again, despite its perceived weakness against SAN and NAS.

Unfortunately, the applications were not ready for DAS then. Besides ZFS, applications such as databases, emails and file servers were not ready to jump into the DAS bandwagon and watch them ride into the sunset. But the fairy tale seems to be retold again, and this time, the evidence that DAS could rise again is much stronger.

The catalyst to this disruptive force? Virtualization!

I mentioned that VMware is the silent storage killer a few blogs ago. Needless to say, that ruffled a few featheres among the readers. I have no doubt that virtualization is changing how we storage guys look at SAN and NAS. In a traditional setup, the SAN or NAS is setup to provision LUNs or mount points to the data storage for VMFS volumes in the VMware environment. It will then be the storage array to provide snapshots, replications, thin provisioning and so on.

Perhaps VMware is nit picking that managing storage arrays for VMFS volumes is difficult. From the VMware administrators view, they are right. They don’t want to know what’s going on below the VM-level. All they want is storage, any kind of storage and VMware will manage the volumes, snapshots, replication and thin  provisioning. Indeed they were already doing that since vStorage API was introduced. In the new release of VMware version 5.0, the ante has been upped even higher, making networked storage less and less significant.

If you want to know about vStorage API and stuff, below is a diagram of the integration of the various components at the VMware API level.

 

VMware can now use direct, internal storage look like shared storage. The Virtual Storage Appliance (VSA) does just that. VMware already has a thriving market from the community and hobbists for VMware Appliances.

The appliance market has now evolved into new infrastructure too. Using x86 architecture, off-the-shelf infrastructure components (sounds familiar?), companies such as Nutanix and Tintri are taking advantage of this booming trend to introduce specialized VMware appliances as shown in their advertisements on their respective web sites.

Here’s the Nutanix Ad:

 

Here’s the Tintri Ad:

 

Both Tintri and Nutanix are a new breed of appliances – specialized appliances for VMware.

At the same time, other applications are building these specialized appliances as well. I have mentioned Oracle Exadata many times in the past and Oracle Exadata is the perfect example an a fine-tuned, hardcore database engine to make the Oracle run at the best performance possible.

Likewise HP has announced their E5000 Messaging System for Microsoft Exchange. The E5000 is a specialized appliance optimized and well-tuned for the Microsoft Exchange Server 2010. From the words of HP,

“HP E5000 Messaging System is the industry’s first fully self-contained platform built for the next-generation of Microsoft Exchange to deliver enterprise-class messaging to businesses of all sizes. Built as a turnkey solution that can be up and running in a few hours vs. days, the HP E5000 Messaging System gives business users the experience they want most: large mailboxes, centralized archiving of mailboxes files and 24×7 access from any device. IT staffs benefit the solutions simplicity to setup, scale and manage and to meet new demands affordably. Ideal for multi-site enterprises as well as branch office and remote office environments, each HP Messaging System delivers greater simplicity and accelerates deployment with preconfigured solutions starting at 500 mailboxes up to 3000 mailboxes, while delivering large, 1 to 2.5GB mailbox sizes. Clients can grow by adding storage capacity or more appliances within the environment up from hundreds to thousands of mailboxes.”

What are the specs of this E5000 box, you say? Here you go:

 

And look at Row#2 in the table above … Direct, Internal Disks! Look at Row #4, Xeon CPUs! Both Compute and Storage in the same appliance!

While the HP E5000 announcement was recently, Hitachi Data Systems were already in the game early with their Unified Compute Platform and their Converged Platform for Microsoft Exchange with relatively the same idea – specialized appliances.

Perhaps the HDS solutions aren’t exactly direct, internal storage but the concept is still the same – specialized appliance. HDS Unified Compute Platform (UCP) has these components.

 

HDS Converged Platform for MS Exchange provides their specialized “appliance” with Reference Architectures that can support up to 68,000 Microsoft Exchange mailboxes. Here’s an architecture diagram of their “appliance”

 

There’s no denying that the networked storage landscape is changing. So are the computing platforms. We are already seeing the compute and storage components being integrated together, tighter than ever. The wave is rising for specialized appliances and it can only get more intense from now on.

No wonder HP’s Converged Infrastructure vision is betting on x86 architecture, simple storage platforms with SAS/SATA disks and Virtualization. Other vendors are doing the same as well – Cisco, NetApp and VMware with their FlexPod solution and EMC with their VBlocks of VMware, Cisco and EMC Storage.

Hail to the Rise of the Specialized Appliance!

NFS deserves more credit from guys doing virtualization

I was at the RedHat Forum last week when I chanced upon a conversation between an attendee and one of the ECS engineers. The conversation went like this

Attendee: Is the RHEV running on SAN or NAS?

ECS Engineer: Oh, for this demo, it is running NFS but in production, you should run iSCSI or Fibre Channel. NFS is only for labs only, not good for production.

Attendee: I see … (and he went off)

I was standing next to them munching my mini-pizza and in my mind, “Oh, come on, NFS is better than that!”

NAS has always played a smaller brother to SAN but usually for the wrong reasons. Perhaps it is the perception that NAS is low-end and not good enough for high-end production systems. However, this is very wrong because NAS has been growing at a faster rate than Fibre Channel, and at the same time Fibre Channel growth has been tapering and possibly on the wane. And I have always said that NAS is a better suited protocol when it comes to unstructured data and files because the NAS protocol is the new storage networking currency of Internet storage and the Cloud (this could change very soon with the REST protocol, but that’s another story). Where else can you find a protocol where sharing is key. iSCSI, even though it has been growing at a faster pace in production storage, cannot be shared easily because it is block-based.

Now back to NFS. NFS version 3 has been around for more than 15 years and has taken its share of bad raps. I agree that this protocol is still very much in the landscape of most NFS installations. But NFS version 4 is changing all that taking on the better parts of the CIFS protocol, notably the equivalent of opportunistic locking or oplocks. In addition to that it has greatly enhanced its security, incorporating Kerberos-type of authentication. As for performance, NFS v4 added in a compounded in a COMPOUND operations for aggregating operations into a single request.

Today, most virtualization solutions from VMware and RedHat works with NFS natively. Note that the Windows CIFS protocol is not supported, only NFS.

This blog entry is not stating that NFS is better than iSCSI or FC but to give NFS credit where credit is due. NFS is not inferior to these block-based protocols. In fact, there are situations where NFS is better, like for instance, expanding the NFS-based datastore on the fly in a VMware implementation. I will use several performance related examples since performance is often used as a yardstick when these protocols are compared.

In an experiment conducted by VMware based on a version 4.0, with all things being equal, below is a series of graphs that compares these 3 protocols (NFS, iSCSI and FC). Note the comparison between NFS and iSCSI rather than FC because NFS and iSCSI run on Gigabit Ethernet, whereas FC is on a different networking platform (hey, if you got the money, go ahead and buy FC!)

Based a one virtual machine (VM), the Read throughput statistics (higher is better) are:

 

The red circle shows that NFS is up there with iSCSI in terms of read throughput from 4K blocks to 512K blocks. As for write throughput for 1 VM, the graph is shown below:


Even though NFS suffers in write throughput in the smaller blocks less than 16KB, NFS performance write throughput improves over iSCSI when between 16K and 32K range and is equal when it is in 64K, 128K and 512K block tests.

The 2 graphs above are of a single VM. But in most real production environment, a single ESX host will run multiple VMs and here is the throughput graph for multiple VMs.

Again, you can see that in a multiple VMs environment, NFS and iSCSI are equal in throughput, dispelling the notion that NFS is not as good in performance as iSCSI.

Oh, you might say that this is just VMs without any OSes or any applications running in these VMs. Next, I want to share with you another performance testing conducted by VMware for an Microsoft Exchange environment.

The next statistics are produced from an Exchange Load Generator (popularly known as LoadGen) to simulate the load of 16,000 Exchange users running in 8 VMs. With all things being equal again, you will be surprised after you see these graphs.

The graph above shows the average send mail latency of the 3 protocols (lower is better). On the average, NFS has lower latency than iSCSI, better than what most people might think. Another graph shows the 95th percentile of send mail latency below:

 

Again, you can see that the NFS’s latency is lower than iSCSI. Interesting isn’t it?

What about IOPS then? In another test with an 8-hour DoubleHeavy LoadGen simulator, the IOPS graphs for all 3 protocols are shown below:

In the graph above (higher is better), NFS performed reasonably well compared to the other 2 block-based protocols, and even outperforming iSCSI in this 8-hour load testing. Surprising huh?

As I have shown, NFS is not inferior compared to the block-based protocols such as iSCSI. In fact, VMware in version 4.1 has improved all 3 storage protocols significantly as mentioned in the VMware paper. The following are quoted in the paper for NFS and iSCSI.

  1. Using storage microbenchmarks, we observe that vSphere 4.1 NFS shows improvements in the range of 12–40% for Reads,and improvements in the range of 32–124% for Writes, over 10GbE.
  2. Using storage microbenchmarks, we observe that vSphere 4.1 Software iSCSI shows improvements in the range of 6–23% for Reads, and improvements in the range of 8–19% for Writes, over 10GbE

The performance improvement for NFS is significant when the network infrastructure was 10GbE. The percentage jump between 32-124%! That’s a whopping figure compared to iSCSI which ranged from 8-19%. Since both protocols are neck-to-neck in version 4.0, NFS seems to be taking a bigger lead in version 4.1. With the release of VMware version 5.0 a few weeks ago, we shall know the performance of both NFS and iSCSI soon.

To be fair, NFS does take a higher CPU performance hit compared to iSCSI as the graph below shows:

Also note that the load testing are based on NFS version 3. If version 4 was used, I am sure the performance statistics above will take a whole new plateau.

Therefore, NFS isn’t inferior at all compared to iSCSI, even in a 10GbE environment. We just got to know the facts instead of brushing off NFS.

Virtualization and cloud aren’t what they are without storage

I was chatting with a friend yesterday and we were discussing about virtualization and cloud, the biggest things that are happening in the IT industry right now. We were talking about the VMware vSphere 5 arrival, the cool stuff VMware is bringing into the game, pushing the technology juggernaut farther and farther ahead of its rivals Hyper-V, Xen and Virtual Box.

And in the technology section of the newspaper yesterday, I saw news of Jaring OneCloud offering and one of the local IT players just brought in Joyent. Fantastic stuff! But for us in IT, we have been inundated with cloud, cloud and more cloud. The hype, the fuzz and the reality. It’s all there but back to our conversation. We realized that virtualization and cloud aren’t much without storage, the cornerstone of virtualization and cloud. And in the storage networking layer, there are the data management piece, the information infrastructure piece and so on and yet … why are there so few storage networking professional out there in our IT scene.

I have been lamenting this for a long time because we have been facing this problem for a long time. We are facing a shortage of qualified and well experienced storage networking professionals. There are plenty of jobs out there but not enough resources to meet the demand. As SNIA Malaysia Chairman, it is my duty to work with my committee members of HP, IBM, EMC, NetApp, Symantec and Cisco to create the awareness, and more importantly the passion to get the local IT’s storage networking professional voice together. It has been challenging but my advice to all those people out there – “Why be ordinary when you can become extra-ordinary?”

We have to make others realize that storage networking is what makes virtualization and cloud happen. Join us at SNIA Malaysia and be part of something extra-ordinary. Storage networking IS the foundation of virtualization and cloud. You can’t exclude it.