Why demote archived data access?

We are all familiar with the concept of data archiving. Passive data gets archived from production storage and is migrated to a slower, and often cheaper, storage medium such as tapes or SATA disks. Hence the terms nearline and offline data were created. With that, IT constantly reminds users that archived data is infrequently accessed, and therefore they have to accept slower access to passive, archived data.

Business conditions have certainly changed, because the need for data to be 100% online is becoming more relevant. The competitive nature of business today dictates that data must be at one's fingertips, because speed and agility are the new competitive advantage. Often the total amount of data, production and archived combined, runs into hundreds of TBs, even into petabytes!

The industries I am familiar with (Oil & Gas, and Media & Entertainment) are facing this situation. These industries have a deluge of files and unstructured data in their archives, much of it dormant, inactive and sitting on old tapes of a bygone era. Yet these files and unstructured data have the most potential to be explored, mined and analyzed to realize their value to the organization. In short, the archived data and files must be democratized!

The flip side is that when archived files and unstructured data are coupled with a slow access interface or an unreliable storage infrastructure, the value of the archived data is downgraded, because access gets in the way of what the applications and the business require. How would organizations value archived data more if the access path to it is so damn hard?!

An interesting solution fell into my lap some months ago, and putting A and B together (A + B), I believe the access path to archived data can be unbelievably fast, simple and transparent and, most importantly, can remove the BLOODY PAIN of FILE AND DATA MIGRATION! For storage administrators and engineers familiar with data migration, especially when the size of the migration runs into hundreds of TBs or even PBs, you know what I mean!

I have known of this solution for some time now, because I have been avidly following its development since its founders left NetApp, following their Spinnaker venture, to start Avere Systems.

HDS HNAS kicks ass

I am dusting the cobwebs off my blog. After almost 3 months of inactivity (and trying to avoid falling foul of the Social Media Guidelines of my present company), I have mustered enough energy to start writing again. I am tired, and I am finishing off my previous engagements prior to joining HDS. But I am glad those are coming to an end, with the last job in Beijing next week.

So officially, I will be in HDS as of November 4, 2013. And to get into my employer’s good books, I think I should start with something where HDS has proved many critics wrong. The notion that HDS is poor with NAS solutions has been dispelled by a recent SPECsfs benchmark report, especially when it comes to NFS file performance. HDS has never been much of a big shouter about their HNAS, even back in the days of the OEM arrangement with BlueArc. The period after the BlueArc acquisition was also quiet, in my opinion, though perhaps it was the gestation period for this Kick-Ass announcement a couple of weeks ago. Here is one of the news items circulating on the web, from the ever trusty El-Reg.

HDS has never been as loud as the big guys like EMC and NetApp, who have plenty of marketing dollars to spend. EMC Isilon and NetApp C-Mode have always touted their mighty SPECsfs numbers, usually with a high number of controllers or nodes behind the benchmarks. More often than not, readers focus on the NFSops/sec figures rather than on the number of heads required to generate them.

Unaware of this HDS announcement, I was already asking myself that very question about NFSops/sec per SINGLE controller head. So, on September 26, 2013, I put together a table comparing some key participants in SPECsfs2008_nfs.v3, and here is the table:

[Image: SPECSfs2008_nfs.v3-26-Sept-2013]

In the last columns of the 2 halves (which I have highlighted in red), the NFSops/sec per single controller head numbers are shown. I hope that readers will view the performance numbers more objectively after reading this. I will let you draw your own conclusions but, ultimately, the numbers are what they are. One should not be over-mesmerized by the super million NFSops/sec until one looks under the hood. Secondly, one should also look at things more holistically, using indicators such as $/NFSops/sec, $/ORT (overall response time), $/GB/NFSops/managed and other relevant measures of the systems sold.
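To make the per-head comparison concrete, here is a minimal Python sketch of the arithmetic behind those last columns. The two systems, their head counts and their prices are entirely hypothetical figures I made up for illustration; they are not taken from any published SPECsfs result, and the class and method names are my own.

```python
from dataclasses import dataclass


@dataclass
class BenchmarkResult:
    """One benchmark submission (all values below are hypothetical)."""
    name: str
    nfs_ops_per_sec: float
    controller_heads: int
    list_price_usd: float

    def ops_per_head(self) -> float:
        # Throughput normalized by the number of controller heads.
        return self.nfs_ops_per_sec / self.controller_heads

    def usd_per_op(self) -> float:
        # Cost per unit of throughput ($/NFSops/sec).
        return self.list_price_usd / self.nfs_ops_per_sec


# Made-up systems: A wins on the headline number, B wins per head.
results = [
    BenchmarkResult("System A", 1_000_000, 20, 5_000_000),
    BenchmarkResult("System B", 600_000, 4, 2_400_000),
]

# Rank by per-head throughput instead of the headline figure.
ranked = sorted(results, key=lambda r: r.ops_per_head(), reverse=True)

for r in ranked:
    print(f"{r.name}: {r.ops_per_head():,.0f} NFSops/sec per head, "
          f"${r.usd_per_op():.2f} per NFSops/sec")
```

With these invented inputs, the system with the smaller headline number comes out on top once the figure is divided by the number of heads, which is exactly the kind of reversal the table is meant to expose.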

But I do not want to steal the thunder from HDS’ HNAS platforms in this recent benchmark. In summary,

[Image: HDS SPECbench summary]

To reach a respectable 607,647 NFSops/sec with a sub-millisecond response time is quite incredible. The ORT of 0.59 msecs should not be taken lightly, because shaving off even 0.1 msec is not easy. Therefore, getting this close to 0.5 millisecond is pretty awesome.

This is my first blog entry after 3 months. I am glad to be back and hopefully, with the monkey off my back (I am referring to my outstanding engagements), I can concentrate on writing good stuff again. I know, I know … I still owe some people some entries. It’s great to be back 🙂

SMP than VMware

VMware is not a panacea for all your server virtualization requirements but because they do fantastic marketing (not to mention doing 1 small seminar every 1.5-2 months here in Malaysia last year), everyone thinks they are the only choice for server virtualization.

Efforts from Citrix Xen, Microsoft Hyper-V and RedHat Virtualization do not seem to make a dent in VMware’s armour, and it is beginning to feel as if VMware is the only choice for server virtualization. However, every new server virtualization proposal ends up with the customer buying a brand new, much more powerful server. More CPUs, more cores, and more RAM (I am not going into VMware vRAM licensing issues here, but customers know they are caged in).

You see, VMware’s style of server virtualization is an in-system virtualization. The physical resources within the system are pooled, virtualized and shared with the virtual machines (VMs) in the physical chassis. With the exception of distributed vSwitches (dvSwitch), the CPUs, processing cores and RAM are pretty much confined to what is available in the physical box in most server virtualization environments. You can envision the concept of VMware’s in-system virtualization in the diagram below:

So, the consolidation (and virtualization) phase of older physical servers would involve packing tons of CPU cores and tons of RAM into a newer, high-end server.

I just visited a prospect a few days ago. For about 30 users of an ERP system and perhaps 100 users of Zimbra mailboxes, he lamented that he had to invest in 2 Dell R710 servers with 64GB of RAM each, sporting 2 x 8-core Intel Xeon. That sounded like overkill to me, but that is what is happening here in this part of the world. The customer is fed a sense of doubt and inadequacy when they virtualize their servers. “What if I don’t have enough cores? What if I don’t have enough RAM?” That in itself is the typical Malaysian (and Singaporean) kiasu mentality. Check out the Wikipedia definition of kiasu here.

Such a high-end server costs a lot of moolah. Furthermore, the scalability and performance of the virtualized servers in the VMs are trapped within how far these servers can scale physically. If the server is maxed out at 16 cores and 128GB of RAM, then the customer has to upgrade again with a server forklift. That’s not good.

And one more thing. VMware server virtualization is not ready for High Performance Computing (HPC) …yet.

Let’s look at this in another way. Let’s assume that you can look at server virtualization in an outward manner, rather than with the inward, within-the-box kind of thinking of the VMware in-system method.

What if you can invest in lower-end x86 servers with 1 x quad-core CPU and 8GB of RAM each? What if you can aggregate many of these lower-end x86 servers into a large cluster that behaves as one huge symmetric multiprocessing server, supporting up to 1,024 CPUs, 16,384 cores and 64TB of RAM? Have a look at this video that explains what I just mentioned:
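As a rough back-of-the-envelope illustration of the pooling idea, the Python sketch below simply adds up the resources of identical nodes. It is only arithmetic under my own assumptions: the names NodeSpec, aggregate and nodes_needed are hypothetical, nothing here reflects how ScaleMP actually sizes or licenses a cluster, and it ignores interconnect and hypervisor overheads entirely.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeSpec:
    """Resources of one low-end server (hypothetical helper type)."""
    cpus: int
    cores_per_cpu: int
    ram_gb: int


def aggregate(node: NodeSpec, count: int) -> dict:
    # Naive sum of resources across `count` identical nodes.
    return {
        "cpus": node.cpus * count,
        "cores": node.cpus * node.cores_per_cpu * count,
        "ram_gb": node.ram_gb * count,
    }


def nodes_needed(node: NodeSpec, target_cores: int, target_ram_gb: int) -> int:
    # The scarcer resource decides how many nodes are required.
    by_cores = math.ceil(target_cores / (node.cpus * node.cores_per_cpu))
    by_ram = math.ceil(target_ram_gb / node.ram_gb)
    return max(by_cores, by_ram)


# The low-end box described above: 1 x quad-core CPU, 8GB of RAM.
low_end = NodeSpec(cpus=1, cores_per_cpu=4, ram_gb=8)

# How many such boxes would match a 16-core, 128GB scale-up server?
count = nodes_needed(low_end, target_cores=16, target_ram_gb=128)
pool = aggregate(low_end, count)

print(count, pool)
```

In this toy case RAM is the limiting resource, so matching the 128GB ceiling takes 16 of the small boxes and leaves the pool with far more cores than the single big server had. Growing the pool then means adding another small node rather than forklifting the whole server.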

ScaleMP video

Yeah, yeah .. it’s a marketing video from ScaleMP. But I am looking beyond the company and at the possibility of this out-system type of server virtualization. The ability to pool the CPU processing power of many physical servers, and to aggregate the physical RAM of all the combined servers into a single shared memory architecture, unleashes the true power of server virtualization. This is THE next-generation symmetric multiprocessing (SMP) architecture, and it breaks free from the scalability limitations of the inward virtualization of physical servers.

In the past, SMP systems relied on heavy programming effort in the applications to scale. Applications didn’t necessarily scale on-the-fly with SMP systems, and some level of configuration and programming had to be applied to address the proprietary SMP methods and interconnects. ScaleMP’s vSMP Foundation hypervisor solution removes the proprietary nature of SMP and brings x86 server virtualization up to the demands of HPC.

Here’s a look at the high level architecture of ScaleMP vSMP:

This type of architecture is similar to the RNA Networks solution that I blogged about some time ago. RNA Networks, which was acquired by Dell late last year, based their solution on RDMA technology and protocol, and was more about enhancing scalability and performance with memory pooling via Memory Cloud. ScaleMP’s patent-pending technology goes further. It pools processing cores as well as memory, giving it greater scalability and performance, the much-needed resources for the demands of HPC environments.

The folks at ScaleMP contacted me a couple of weeks back and shared some of their marketing datasheets and whitepapers. While the information passed to me was OK, I wish it had taken a deeper dive into the technology and implementation as well. I hope they can share that, and I don’t mind signing an NDA.

Well, this is done pro bono, because I want everyone to know the choices and possibilities out there. It is my worldly cause to have people educated, because only by being informed can we make better choices. The server virtualization world isn’t always about VMware, you know.