NFS-phobic in Malaysia

I taught the EMC Cloud Infrastructure and Services (CIS) class last week and naturally, a few students came from the VMware space. I asked how they were implementing their storage and everyone said Fibre Channel.

I have asked a lot of people in the past whether they use SAN or NAS storage for their VMware environments, and almost 99% say SAN, either FC-SAN or iSCSI-based SAN. Why?

When I ask these people about deploying NFS, the usual reply is about performance.

NFS version 3 won the file sharing protocol race in its early days, when Unix variants were prevalent, but it suffered from the Balkanization of Unices in the 90s. Furthermore, NFS lost quite a bit of ground between NFSv3 in 1995 and the coming out party of NFSv4.1 just 2 years ago. The in-between years were barren and NFS became quite a bit of a joke, nicknamed “Need For Speed” or “No F*king Security”. That could also be a contributing factor to the NFS-phobia we see here in Malaysia.

I have experience with both SAN and NAS and understand the respective protocols of Fibre Channel, iSCSI, NFS and CIFS, and I feel that NFS has been given unfair treatment by people in this country. For the uninformed, NFS is the only NAS protocol supported by VMware. CIFS, the Windows file sharing protocol, is not supported, probably for performance and latency reasons. However, if you read up on high performance computing (HPC), clustering, or MPP (Massively Parallel Processing) resources, you will almost always find NFS involved in delivering very high performance I/O. So, why isn’t NFS proposed with confidence in VMware environments?

I have blogged about this before. And I want to use my blog today to reassert what I believe in and hope that more consideration can be given to NFS when it comes to performance, even for virtualized environments.

NFS performance is competitive when compared to Fibre Channel and, in a lot of cases, better than iSCSI. It is just that the perception of poor NFS performance is stuck in people’s minds and it is hard to change that. However, there are multiple credible sources that state that NFS is comparable to Fibre Channel. Let me share with you one of the sources that compared NFS with other transport protocols:

From the 2 graphs of IOPS and Latency, NFS fares well against other more popular transport protocols in VMware environments. Those NFS performance numbers are probably not RDMA-driven either. Otherwise, RDMA could very well boost the NFS numbers even higher.

What is this RDMA (Remote Direct Memory Access)? RDMA is already quietly making its presence felt, and is being used with transports like InfiniBand and 10 Gigabit Ethernet. In fact, Oracle Solaris version 11 will use RDMA as the default transmission protocol whenever RDMA-enabled NICs are present in the system. The diagram below shows where RDMA fits into the network stack.

RDMA eliminates the need for the OS to participate in the delivery of data, and deposits the data directly from the initiator’s memory into the target’s memory. This eliminates traditional networking overheads such as buffer copying and setting up network data structures for the delivery. A little comparison of RDMA with traditional networking is shown below:
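As a loose analogy only, the difference in copy overhead can be sketched in a few lines of Python. This is not RDMA (which needs RDMA-capable NICs and a verbs library); it simply contrasts data that is re-copied into a staging buffer at each layer with data placed once into a pre-allocated target buffer. The function names `traditional_send` and `direct_placement`, and the figure of three layers, are hypothetical and chosen purely for illustration.

```python
from __future__ import annotations


def traditional_send(payload: bytes, layers: int = 3) -> tuple[bytearray, int]:
    """Pass data through `layers` staging buffers, copying at each one.

    The layer count is an assumption for illustration, not a measurement
    of any real network stack.
    """
    copies = 0
    staged = payload
    for _ in range(layers):
        staged = bytearray(staged)  # each layer allocates and fills its own buffer
        copies += 1
    return staged, copies


def direct_placement(source: bytearray, target: bytearray) -> int:
    """Write straight from source memory into pre-allocated target memory."""
    if len(source) > len(target):
        raise ValueError("target buffer is too small")
    # One transfer into the target buffer, with no intermediate staging buffers.
    memoryview(target)[: len(source)] = source
    return 1


if __name__ == "__main__":
    data = b"block of VM disk data"
    received, staged_copies = traditional_send(data)
    target = bytearray(len(data))
    direct_copies = direct_placement(bytearray(data), target)
    print("staged copies:", staged_copies)
    print("direct copies:", direct_copies)
    print("same data delivered:", bytes(received) == bytes(target))
```

Both paths deliver the same bytes; the point of the sketch is only that the direct path skips the per-layer staging work, which is the kind of overhead RDMA is designed to avoid.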

I was trying to find out how prevalent NFS was in supporting the fastest supercomputers in the world from the Top500 Supercomputing sites. I did not find details of NFS being used, but what I found was that the Top500 supercomputers do not employ Fibre Channel SAN at all! Most use proprietary interconnects, with some on InfiniBand or 10 Gigabit Ethernet. I would presume that NFS figures in most of them, and I am confident that NFS can be a protocol of choice for high performance environments, and even VMware environments.

The future looks bright for NFSv4. We are beginning to hear the term “parallel NFS (pNFS)” being thrown into conversations around here, and the awareness is there. NFS version 4.2 is just around the corner as well, promising further enhancements to the protocol.

5 Responses to NFS-phobic in Malaysia

Pingback: Best Storage Blog Posts of 2012 | Tonian.com
Andre says:

February 6, 2013 at 5:36 pm

Just curious, what setup did you use to produce the relative values on the graphs? Also, how did you get the values? Through the ESXi interface? Also, I presume NFSv4 was used in the test?

In an HPC environment where there are a large number of compute nodes and possibly a large shared storage pool, NFS may not be the way to go. Scalable clustered filesystems like GPFS or CXFS rule the roost. Multiple data stream requests per compute node versus the single NFS stream (per mountpoint) would make NFS a bottleneck in a parallel multi-process job. Looking forward to pNFS to make this possible. =)

- cfheoh says:
  
  February 14, 2013 at 9:11 am
  
  Hi Andre
  
  Sorry for the late reply. I had a packed schedule over the Lunar New Year. Happy Lunar New Year to you!
  
  First of all, the tests were conducted by VMware themselves (there’s a whitepaper out there) and NFSv3 was the protocol. I do agree that NFSv3 is showing its age and the world is slowly adopting NFSv4.1 (to be exact). NFSv4.2 is around the corner as well, but again, the adoption and implementation of NFSv4 will take time. pNFS, introduced as part of v4.1, has proved to be a challenge and the pNFS client code in some of the OSes is not quite ready. I have been keeping in regular contact with SNIA’s ESF (Ethernet Storage Forum) guys in the NFS SIG.
  
  I am not familiar with either GPFS or CXFS, but I have worked with some IBM folks here in Malaysia. The feedback on GPFS, at a personal level, was just so-so because the IBMers found it quite complex. Perhaps you have more experience with that. If you have good notes, do share them with me and help me understand the value of GPFS or CXFS.
  
  I am keeping my fingers crossed for pNFS as well, because I would love to see it get wide adoption in the HPC space.
  
  All the best to you. Thank you
  
  /Chin-Fah
  
Ryan says:

October 9, 2013 at 7:19 am

Supercomputers generally use cluster-based file systems such as Lustre, for performance/scalability/availability reasons.
As much as I agree NFS is a good protocol, I disagree that it is currently suitable for VMware unless you're doing 10GbE. Until we see NFSv4 support in VMware, there’s no multipath and very little reason to use it unless you have no other options.

- cfheoh says:
  
  November 20, 2013 at 2:48 am
  
  Hi Ryan
  
  NFSv3 does have performance limitations as well as high availability concerns, because it does not scale well on present-day Gigabit networks. Even with 10GbE, NFSv3 has its challenges.
  
  However, many VMware farms use NFS because of its manageability. Even if NFS does not have many VAAI-like capabilities in VMware 5.x, the elegance of managing NFS folders as VM datastores outweighs the management complexity of block protocols like FC and iSCSI. Check out this link –> http://www.ciosolutions.com/Nimble+Storage+vs+Netapp+-+CASL+WAFL and read the section “What we miss from NetApp”. The author has clearly and eloquently explained the beauty of NFS in VMware farms.
  
  I have worked with some folks at AMD here in Malaysia with tens of PBs, and NFS clearly beats block-based protocols in terms of manageability.
  
  I agree with you on the shortcomings of NFSv3. The protocol is almost 20 years old and it is already dated. NFSv4.1 (not NFSv4) should be the way to go. 😉
  
  Thanks for sharing your insights.
  
  Regards
  /Chin-Fah
