Highroad to Parallel Road

Unless you are working with highly, parallelized access to files in a large scale-out NAS environment, you probably don’t get to work much with Parallel NFS (pNFS). pNFS is part of the NFSv4.1 (RFC 5661) standard, and was introduced in January 2010 to address NFS protocol in the clustered, scale-out NAS environment. It is to provide parallel file access across distributed servers.

pNFS is heavily driven by Panasas, NetApp, EMC, IBM, Sun (now Oracle) among others. And funnily enough, the company that sticks out from the bunch is one that used to tout block storage as the way to go, not files. That’s EMC, the company that more well known for its SAN solutions than its NAS (remember Celerra and IP4700?). And EMC has embraced pNFS in a big, big way. Read EMC’s CTO for Global Marketing, Chuck Hollis’ blog here and here.

However, unknown to many, a lot of the thinking that goes into pNFS are very similar to an EMC product some years ago. That product is EMC Highroad, which in the later years, renamed as Multi-path File System (MPFS).

Note: If you want to know more about the history of HighRoad/MPFS, read this blog.

The cornerstone of EMC MPFS is their File Mapping Protocol or FMP, which is a robust protocol that lines the mapping of files to their addressable blocks in storage. In a nutshell, when I was made responsible for this product during my time at EMC, I used to pitch to companies that MPFS was a file request is through NFS but respond to the requester can be in blocks (iSCSI or Fibre Channel). The beauty of this was, NFSv3 was chatty and heavy but the delivery of data through blocks via iSCSI or Fibre Channel has lesser overhead compared to NFSv3.

Hence the delivery is faster and EMC touted that the performance was 2-4x faster than NFS. Indeed, I have seen some lab tests results from EMC’s work with Schlumberger High Performance Lab in Houston, and the numbers were impressive. I still have them on Powerpoint somewhere.

In circa of 2003-2004, EMC donated the FMP code to IETF and as they say the rest was history.

The picture below basically summarizes what pNFS is all about.

 

A NFSv4 client will basically check with a Metadata server via the pNFS protocol about the layout of the distributed cluster of server. The file, could be striped across multiple nodes of the cluster, and it is the metadata server that provides a map to the client to access the blocks or files from these nodes. This is exactly what the EMC HighRoad/MPFS File Mapping Protocol (FMP) did, mapping the file requests to its respective corresponding blocks. See diagram below:

 

And one of the powerful feature of pNFS is that it is not just about NFS. The green arrow you see in the above diagram is the storage-access protocol. That access protocol can be NFSv4.1, CIFS, iSCSI, Fibre Channel, FCoE, Infiniband, and Object Storage Device (OSD).

In order to have pNFS working, the NFS client must be NFSv4.1 ready and that code has been made available in Linux and OpenSolaris. Other Unix vendors, no doubt, will be coming out with their NFSv4.1 implementation soon. Oooooh, there will be a Windows NFSv4.1 client coming as well!

But I want to dispel the notion that EMC is a SAN company. EMC is a very strong NAS company and if you have seen the IDC market share (ok, ok, many of you out there will argue about it), EMC is #1 in NAS. And their contribution to pNFS is immense.

Novell Filr (How do you pronounce this?)

I let you in on a little secret … I am a great admirer of Novell’s technology.

Ok, ok, they aren’t what they used to be anymore (remember the great heydays of Netware, ZenWorks and Groupwise?) And some of their business decisions didn’t make a lot of fans either. Some notable ones in recent years were the joint patent agreement with Microsoft (November 2006) and their ownership of Unix operating system rights. Though Novell did finally protected the Unix community by being the rightful owner of Unix OS rights, the negativity from the lawsuit and counter lawsuit between SCO and Novell soured the relationship with the faithfuls of Unix. In the end, they were acquired by Attachmate late last year.

However, I have been picking up bits of Novell technology knowledge for the past 3-4 years. Somehow, despite the negative perception that most people I know had about Novell, I strongly believe the ideas and thinking that goes into their solutions and products are smart and innovative.

So, when my buddy (and ex-housemate) of mine, Mr. Ong Tee Kok, the Country Manager of Novell Malaysia, asked me to evaluate a new solution from Novell (it’s not even been released yet), I jumped at the chance.

Novell will soon be announce a solution called Novell Filr. I really don’t know how to pronounce the name, but the concept of Novell Filr makes a lot of sense. I cannot say that it is disruptive but it is coming to meet the changing world of how users are storing and accessing their files and balancing it with the needs of enterprise file management and access.

Yes, Novell Filr is a file virtualization solution. It comes between the user and their files. Previously in a network attached environment, files are presented to the users via the typical file sharing protocols, CIFS for Windows and NFS for Unix/Linux. These protocols have been around for ages, with some recent advents in the last few years for SMB 2.0 and NFS version 4. However, the updates to these protocols address the greater needs of the organizations and the enterprise rather than the needs of the users.

And because of this, users have been flocking to cloud-centric solutions out there such as DropBox, Box.net and Windows Live SkyDrive. These solutions cater to the needs of the users wanting to access their files anywhere, with any device. Unfortunately, the simplicity of file access the “cloud-way” is not there when the users are in the office network. They would have to be routinely reminded by the system administrator to keep the files in some special directory to have their files backed up. Otherwise, they shall be ostracized by the IT department and their straying files will not be backed up.

Well, Novell will be introducing their Novell Filr soon and they have released a video of their solution. Check this out.

I shall be spending some time this week to look into their solution deeper and hoping to see a demo soon. And I have great confidence in the Novell solutions. I intend to share more about them later.

Storage Architects no longer required

I picked up a new article this afternoon from SearchStorage – titled “Enterprise storage trends: SSDs, capacity optimization, auto tiering“. I cannot help but notice some of the things I have been writing about VMware being the storage killer and the rise of Cloud Computing which take away our jobs.

I did receive some feedback about what I wrote in the past and after reading the SearchStorage article, I can’t help but feeling justified. On the side bar, it wrote:

 

The rise of virtual machine-specific and cloud storage suggest that other changes are imminent. In both cases …. and would no longer require storage architects and managers.

Things are changing at an extremely fast pace and for those of us still languishing in the realms of NAS and SAN, our expertise could be rendered obsolete pretty quickly.

But all is not lost because it would be easier for a storage engineer, who already has the foundation to move into the virtualization space than a server virtualization engineer coming down to learn about the storage fundamentals. We can either choose to be dinosaur or be the species of the next generation.

The rise of the specialized appliance

Compute and storage are 2 components within the IT infrastructure which are surely converging. SAN and NAS are facing their greatest adversary yet, and could be made insignificant if the cloud and virtualization game had their way. This is giving rise to the a new breed of solution, a specialized appliance where both compute and storage are ONE. Rising from the ashes of shared storage (SAN and NAS, take note), we are beginning to see things going back to way of direct, internal storage.

There were some scuffles in the bushes about 5 years, where Sun (now Oracle) was ahead of its game. The Sun Fire X4500 (aka Thumper) was one of the strong candidates to challenge the SAN/NAS duopoly in this networked storage period. X4500 integrated both the server and the storage components together, using ZFS as a file system and volume manager to deliver a very high throughput on all the JBOD disks very efficiently. ZFS acted as the RAID, so there was no need to have specialized RAID hardware. This proved that a very high performance storage solution can be easily integrated using standard off-the-shelf infrastructure components and the x86 architecture. By combining both compute and storage together, there were hints that the industry was about to rise up to Direct-Attached Storage (DAS) again, despite its perceived weakness against SAN and NAS.

Unfortunately, the applications were not ready for DAS then. Besides ZFS, applications such as databases, emails and file servers were not ready to jump into the DAS bandwagon and watch them ride into the sunset. But the fairy tale seems to be retold again, and this time, the evidence that DAS could rise again is much stronger.

The catalyst to this disruptive force? Virtualization!

I mentioned that VMware is the silent storage killer a few blogs ago. Needless to say, that ruffled a few featheres among the readers. I have no doubt that virtualization is changing how we storage guys look at SAN and NAS. In a traditional setup, the SAN or NAS is setup to provision LUNs or mount points to the data storage for VMFS volumes in the VMware environment. It will then be the storage array to provide snapshots, replications, thin provisioning and so on.

Perhaps VMware is nit picking that managing storage arrays for VMFS volumes is difficult. From the VMware administrators view, they are right. They don’t want to know what’s going on below the VM-level. All they want is storage, any kind of storage and VMware will manage the volumes, snapshots, replication and thin  provisioning. Indeed they were already doing that since vStorage API was introduced. In the new release of VMware version 5.0, the ante has been upped even higher, making networked storage less and less significant.

If you want to know about vStorage API and stuff, below is a diagram of the integration of the various components at the VMware API level.

 

VMware can now use direct, internal storage look like shared storage. The Virtual Storage Appliance (VSA) does just that. VMware already has a thriving market from the community and hobbists for VMware Appliances.

The appliance market has now evolved into new infrastructure too. Using x86 architecture, off-the-shelf infrastructure components (sounds familiar?), companies such as Nutanix and Tintri are taking advantage of this booming trend to introduce specialized VMware appliances as shown in their advertisements on their respective web sites.

Here’s the Nutanix Ad:

 

Here’s the Tintri Ad:

 

Both Tintri and Nutanix are a new breed of appliances – specialized appliances for VMware.

At the same time, other applications are building these specialized appliances as well. I have mentioned Oracle Exadata many times in the past and Oracle Exadata is the perfect example an a fine-tuned, hardcore database engine to make the Oracle run at the best performance possible.

Likewise HP has announced their E5000 Messaging System for Microsoft Exchange. The E5000 is a specialized appliance optimized and well-tuned for the Microsoft Exchange Server 2010. From the words of HP,

“HP E5000 Messaging System is the industry’s first fully self-contained platform built for the next-generation of Microsoft Exchange to deliver enterprise-class messaging to businesses of all sizes. Built as a turnkey solution that can be up and running in a few hours vs. days, the HP E5000 Messaging System gives business users the experience they want most: large mailboxes, centralized archiving of mailboxes files and 24×7 access from any device. IT staffs benefit the solutions simplicity to setup, scale and manage and to meet new demands affordably. Ideal for multi-site enterprises as well as branch office and remote office environments, each HP Messaging System delivers greater simplicity and accelerates deployment with preconfigured solutions starting at 500 mailboxes up to 3000 mailboxes, while delivering large, 1 to 2.5GB mailbox sizes. Clients can grow by adding storage capacity or more appliances within the environment up from hundreds to thousands of mailboxes.”

What are the specs of this E5000 box, you say? Here you go:

 

And look at Row#2 in the table above … Direct, Internal Disks! Look at Row #4, Xeon CPUs! Both Compute and Storage in the same appliance!

While the HP E5000 announcement was recently, Hitachi Data Systems were already in the game early with their Unified Compute Platform and their Converged Platform for Microsoft Exchange with relatively the same idea – specialized appliances.

Perhaps the HDS solutions aren’t exactly direct, internal storage but the concept is still the same – specialized appliance. HDS Unified Compute Platform (UCP) has these components.

 

HDS Converged Platform for MS Exchange provides their specialized “appliance” with Reference Architectures that can support up to 68,000 Microsoft Exchange mailboxes. Here’s an architecture diagram of their “appliance”

 

There’s no denying that the networked storage landscape is changing. So are the computing platforms. We are already seeing the compute and storage components being integrated together, tighter than ever. The wave is rising for specialized appliances and it can only get more intense from now on.

No wonder HP’s Converged Infrastructure vision is betting on x86 architecture, simple storage platforms with SAS/SATA disks and Virtualization. Other vendors are doing the same as well – Cisco, NetApp and VMware with their FlexPod solution and EMC with their VBlocks of VMware, Cisco and EMC Storage.

Hail to the Rise of the Specialized Appliance!

Will SAN or NAS matter if your customer’s storage is in the Cloud?

An interesting question popped into my head yesterday. With all this push into the Cloud, the customer does not own most of the computer equipment. They are just getting services and when they want storage, do you think they care whether their storage is on a SAN or NAS?

I have mentioned this before, Cloud makes a lot of IT stuff irrelevant. Read my previous blog. This means that the demand for IT techies, sysadmins, consultant will suddenly be squeezed into who’s very good, good, not-so-good and the downright bad ones. Let’s the survival-of-the-fittest games begin!

Yes, the SAN and NAS, or even unified storage story doesn’t hold much weight anymore. However, to the cloud service provider, they will be out there looking for what is best for their bottom line, whether it will be a branded box or just a white box if they are willing to build the storage on their own. For those providers who have strong financials, obviously investing in premium brands like EMC, IBM, NetApp, and so on, makes sense because they need someone to blame and penalize when the shit hits the fan. For those who doesn’t have the financial prowess, this presents a whole new economy that resellers, partners, distributors can tap on to – build for these cloud providers at a cheaper price (hint, hint).

However, storage relies on a strong storage operating system to do just that. They are plenty of open source ones. Hey, you can practically build a simple iSCSI or NAS box with Linux. Consumer grade NAS such as NetGear, Synology and DLink have been using open-source Linux to penetrate the low-end, home storage market for years. The cloud providers will be a different ballgame, but the storage piece is fundamentally the same.

Things are changing folks, and for those consultants, product pre-sales, post-sales, sysadmins, operators of storage, you have to evolve to meet this new market. SAN and NAS do not matter anymore when customers are using the cloud services.

p/s: I have been spending time looking at some very, very cool cloud-ready storage operating systems. If you have the time, leave me a comment and we’ll talk. 😀