Amazon Archives - Storage Gaga

At the mercy of the cloud deity

By cfheoh | December 13, 2021 - 8:00 am |December 11, 2021 Amazon, Amazon Web Services, API, Backup, Big Data, Business Continuity, Cloud, Data Availability, Data Management, Data Protection, Data Security, Disaster Recovery, Gartner, Google, IBM, ILM, Microsoft Azure, Oracle Cloud, Reliability, Software-defined Datacenter


Amazon Web Services (AWS) went down in the middle of last week. News of the outage was reported:

AWS Management Console unavailable error

Piling the misery

The AWS outage headlines attract the naysayers, the fickle armchair pundits, and the opportunists. Here are a few news articles in which these folks chastise the cloud giant.

Of course, I am one of these critics. I don’t deny that I am. But I read this situation through the lens of the multicloud hyperbole, of which I am not a fan. There is too much multicloud whitewashing by vendors trying to pitch multicloud as a disaster recovery solution without understanding that this is easier said than done.


Fueling the Flywheel of AWS Storage

By cfheoh | January 18, 2021 - 9:00 am |January 18, 2021 Amazon, Amazon Web Services, Backup, Business Continuity, Cloud, Data Availability, Data Management, Data Protection, Data Security, Digital Transformation, Disaster Recovery, Filesystems, Google, Microsoft, Microsoft Azure, Object Storage, Software Defined Storage, Software-defined Datacenter


It was bound to happen. It happened. AWS Storage is the Number 1 Storage Company.

The telltale signs were there when Silicon Angle reported that AWS Storage revenue was around USD$6.5-7.0 billion last year and will reach USD$10 billion at the end of 2021. That news was just a month ago. Last week, IT Brand Pulse went a step further, declaring AWS Storage Number 1 in terms of revenue. Both have the numbers to back it up.

AWS Logo

How did it become that way? How did AWS Storage become numero uno?

Flywheel juggernaut

I became interested in the Flywheel concept some years back. It was conceived in Jim Collins’ book, “Good to Great” almost 20 years ago, and since then, Amazon.com has become the real life enactment of the Flywheel concept.

Amazon.com Flywheel – How each turn becomes sturdier, brawnier.

Every turn of the flywheel requires the same amount of effort, although in the beginning the noticeable effect is minuscule. But as every turn gains momentum, the returns of each turn scale greater and greater relative to the fixed effort of a single turn.


Storage in a shiny multi-cloud space

By cfheoh | September 14, 2020 - 6:22 pm |September 14, 2020 Amazon, Amazon Web Services, Analytics, API, Artificial Intelligence, Backup, Big Data, Business Continuity, Cloud, Clumio, Containers, Data, Data Archiving, Data Availability, Data Fabric, Data Management, Data Privacy, Data Protection, Data Security, Deep Learning, Digital Transformation, Disaster Recovery, Docker, Druva, Gartner, Google, Google Anthos, High Performance Computing, Kubernetes, Machine Learning, Microsoft, Microsoft Azure, Object Storage, Oracle, Oracle Cloud, Rackspace, Software Defined Storage, Software-defined Datacenter, Storage Tiering, Wasabi Cloud


The multi-cloud for infrastructure-as-a-service (IaaS) era is not here (yet). That is what the technology marketers want you to think. The hype, the vapourware, the frenzy. It is what they do. The same goes for technology analysts, who describe visions and futures, and the high level constructs and strategies to get there. The hype of multi-cloud is often thought of as running applications and infrastructure services seamlessly in several public clouds such as Amazon AWS, Microsoft® Azure and Google Cloud Platform, and linking them to on-premises data centers and private clouds. Hybrid is the new black.

Multicloud connectivity to public cloud providers and on-premises private cloud

Multi-Cloud, on-premises, public and hybrid clouds

And the aspiration of multi-cloud is the right one, when it is truly ready. Gartner® wrote a high level article titled “Why Organizations Choose a Multicloud Strategy”. Taking advantage of each individual cloud’s strengths and resiliency in respective geographies makes good business sense, but there are many other considerations that cannot be an afterthought. In this blog, we look at a few of them from a data storage perspective.

In the beginning there was …

For this storage dinosaur, data storage and compute have always been coupled as one. In the mainframe DASD days, these 2 were together. Even with the rise of networking architectures and protocols, from IBM SNA, DECnet, Ethernet & TCP/IP, and Token Ring FC-SAN (sorry, this is just a joke), the SANs and the filers stayed close to the servers, albeit with a network layer in between.

A decade ago, when the public clouds started appearing, data storage and compute were mostly inseparable. There was a demarcation between public clouds and private clouds. The notion of hybrid clouds meant public clouds and private clouds could intermix with on-premises computing and data storage, but in almost all cases, this was confined to a single public cloud provider. That held until these public cloud providers realized they were not able to convincingly entice the larger enterprises to move their IT out of their on-premises data centers to the cloud. So, these public cloud providers decided to reverse their strategy and peddle their cloud services back to on-prem. Today, Amazon AWS has Outposts; Microsoft® Azure has Arc; and Google Cloud Platform has Anthos.


Lift and Shift Begone!

By cfheoh | April 25, 2019 - 5:30 pm |April 25, 2019 Amazon, Amazon Web Services, Cloud, Composable Infrastructure, Data Availability, Data Fabric, Data Management, High Performance Computing, Hyperconvergence, Mellanox Technologies, NetApp, NVMe, Server SAN, Software Defined Storage


I am excited. New technologies are bringing the data (and storage) closer to processing and compute than ever before. I believe the “Lift and Shift” way will be a thing of the past … soon.

Data is heavy

Moving data across the network is painful. Moving data across distributed networks is even more painful. To compile the recent first image of a black hole, 5PB or more of data had to be shipped for central processing. If this had been moved over a 10 Gigabit network, it would have taken weeks.
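To put a rough number on "weeks", here is a minimal back-of-the-envelope sketch. It assumes decimal units (1PB = 10^15 bytes), a single 10 Gigabit per second link, and a hypothetical `efficiency` factor for protocol overhead; none of these figures beyond 5PB and 10 Gigabit come from the black hole project itself.

```python
SECONDS_PER_DAY = 86_400
PB = 10**15  # decimal petabyte, in bytes (assumption)


def transfer_days(data_bytes: float, link_bits_per_sec: float,
                  efficiency: float = 1.0) -> float:
    """Days needed to push data_bytes through a link.

    efficiency is a hypothetical fraction of the line rate that is
    actually usable (1.0 means a perfectly saturated link).
    """
    seconds = (data_bytes * 8) / (link_bits_per_sec * efficiency)
    return seconds / SECONDS_PER_DAY


# 5PB over a fully saturated 10 Gigabit link: roughly 46 days.
ideal = transfer_days(5 * PB, 10e9)

# Same transfer if only 70% of the line rate is usable (assumed figure).
degraded = transfer_days(5 * PB, 10e9, efficiency=0.7)

print(f"ideal: {ideal:.1f} days, at 70% efficiency: {degraded:.1f} days")
```

Even in the ideal case the transfer runs past six weeks, which is why shipping the disks is the practical choice at this scale.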

Furthermore, data has dependencies. Snapshots, clones, and other data relationships with applications and processes render data inert, weighing it down like an anchor of a ship.

When I first started in the industry more than 25 years ago, Direct Attached Storage (DAS) was the dominant storage platform. I had a bulky Sun MultiDisk Pack connected via Fast SCSI to my SPARCstation 2.

Then I was assigned as the implementation engineer for the retail banking project of Hock Hua Bank (now defunct) at their Sibu HQ in East Malaysia. It was the first Sun SPARCstorage 1000 (photo below), running a direct attached Fibre Channel 0.25 Gbps FCAL (Fibre Channel Arbitrated Loop). It was the cusp of the birth of SAN (Storage Area Network).

Photo from https://www.cca.org/dave/tech/sys5/

The proliferation of SAN over the next 2 decades pushed DAS into obscurity, until SAS (Serial Attached SCSI) came about. Added to the mix was the prominence of Cloud Storage. But on-premises storage and Cloud Storage didn’t always come together. There was always a valley between the 2, until the public clouds gained a stronger foothold in the minds of IT and businesses. Today, both on-premises storage and cloud storage are slowly cosying up as one Data Singularity, thanks to the vision and conceptualization of data fabrics. NetApp was an early proponent of the Data Fabric concept 4 years ago.

Sleepless in Malaysia with Object Storage

By cfheoh | January 22, 2019 - 1:42 pm |January 22, 2019 Amazon, Amazon Web Services, Analytics, API, Big Data, Cloud, Cloudian, Clusters, Data Management, Deep Learning, DellEMC, Dropbox, Filesystems, Google, Hadoop, Hadoop Clusters, HDS, IDC, IoT, iSCSI, Linux, Machine Learning, Microsoft, Minio, NAS, NFS, Object Storage, OpenIO, Redhat, Security, swiftstack


Object Storage? What’s that?

For the past couple of months, I have been speaking with a few parties in Malaysia about object storage technology. And I was fairly surprised by the responses.

The 2 reports

For a start, I did not set out to talk about object storage. It kind of fell into my lap. Two recent Hitachi Vantara reports revealed that countries like Australia, Hong Kong and even South East Asian countries were behind in their understanding of what object storage was, and the benefits it brought to the new generation of web scale and enterprise applications.

The first report, an IDC survey sponsored by Hitachi Vantara, mentioned that 41% of the enterprises in Australia are not aware of object storage technology. In a similar survey, this one pointing towards Hong Kong and China, the percentages were 38% and 35% respectively. I would presume that the percentages for countries in South East Asia would not fall too far from the tree.

How is Malaysia doing?

However, I worry that the percentage could be far more dire in Malaysia. In the past 2 months, responses from several conversations painted a darker picture of object storage technology among the companies in Malaysia. These included a reasonably sized hosting company, a well-established systems integrator, a software development company, several storage practitioners in Openstack and a DellEMC regional consultant for unstructured data. The collective conclusion was that object storage technology was not only relatively unknown (probably similar to the percentages in the IDC/Hitachi Vantara reports), but also appeared to be shunned at this juncture. In web scale applications, Redhat Ceph block and file appeared popular in contrast to Openstack Swift. In enterprise applications, it was a toss-up between iSCSI and NFS.

Image from https://zdnet4.cbsistatic.com/hub/i/r/2018/04/24/c79e9dfb-b4a9-46bb-b831-f2c57fdf8a1d/resize/470xauto/5e4846d1bc7a034c382baf6dcbb612ed/cloud-storage.jpg


Microsoft desires Mellanox

By cfheoh | December 20, 2018 - 11:02 am |December 20, 2018 100Gigabit Ethernet, Acquisition, Amazon, Artificial Intelligence, Cloud, Data Fabric, Data Management, Deep Learning, Edge Computing, High Performance Computing, Infiniband, Machine Learning, Mellanox Technologies, Microsoft, NVMe, Storage Field Day, Tech Field Day, Virtualization


My lazy Thursday morning was spurred by a posting by Stephen Foskett, Chief Organizer of Tech Field Days. “Microsoft mulls the acquisition of Mellanox”

The AWS factor

My quick reaction was that this was a strange one. Microsoft of all people, buying a chip company? Does it make sense? However, looking deeper, it starts to make some sense. And I believe the desire is spurred by Amazon Web Services’ announcement of their Graviton processor at AWS re:Invent last month.

AWS acquired Annapurna Labs in early 2015. According to sources, Annapurna was working on low powered, high performance networking chips for the mid-range market. The key words – low powered, high performance, mid-range – are certainly the musical notes to the AWS opus. And that would mean the ability for AWS to control their destiny, even at the edge.

The Return of SAN and NAS with AWS?

By cfheoh | December 3, 2018 - 10:31 am |December 3, 2018 Amazon, Appliance, Artificial Intelligence, Big Data, CIFS, Cloud, Data Availability, Data Management, Data Protection, Data Security, Deep Learning, Excelero, Fibre Channel, High Performance Computing, Hyperconvergence, iSCSI, Machine Learning, Mellanox, NAS, NetApp, NFS, NVMe, Object Storage, Openstack, Oracle, Reliability, Scale-out architecture, Server SAN, SMB, Snapshots, Software-defined Datacenter, Virtualization, VMware


AWS what?

Amazon Web Services announced Outposts at re:Invent last week. It was not much of a surprise for me because when AWS had their partnership with VMware in 2016, the undercurrents were there to have AWS services come right to the doorstep of any datacenter. In my mind, AWS has built so far out in the cloud that eventually, the only way to grow is to come back to the core of IT services – The Enterprise.

Their intentions were indeed stealthy, but I have been a believer of the IT pendulum. What has swung out to the left or right would eventually come back to the centre again. History has proven that, time and time again.

SAN and NAS coming back?

A friend of mine casually spoke about the AWS Outposts announcement. “Does that mean SAN and NAS are coming back?” I couldn’t hide my excitement hearing of the return but … be still, my beating heart!

I am a storage dinosaur now. My era started in the early 90s. SAN and NAS were a big part of my career, but cloud computing has changed and shaped the landscape of on-premises shared storage. SAN and NAS are probably closeted by the younger generation of storage engineers and storage architects, who are more adept at S3 APIs and Infrastructure-as-Code. The nuts and bolts of Fibre Channel, SMB (or CIFS if one still prefers it), and NFS are of lesser prominence, and concepts such as FLOGI, PLOGI, SMB mandatory locking, NFS advisory locking and even iSCSI IQN are probably alien to many of them.

What is AWS Outposts?

In a nutshell, AWS will be selling servers and infrastructure gear. The AWS-branded hardware, starting from a single server to large racks, will be shipped to a customer’s datacenter or any hosting location, packaged with AWS’s popular computing and storage services, and optionally, with VMware technology for virtualized computing resources.

Taken from https://aws.amazon.com/outposts/

In a move à la Azure Stack, Outposts completes the round trip of the IT Pendulum. It has swung to the left; it has swung to the right; it is now back at the centre. AWS is no longer a public cloud computing company. They have just become a hybrid cloud computing company.

Oracle Cloud Infrastructure to prove skeptics wrong

By cfheoh | October 24, 2018 - 6:53 am |October 24, 2018 Amazon, Analytics, Artificial Intelligence, Big Data, Cloud, Clusters, Data Availability, Data Management, Deep Learning, Disaster Recovery, Flash, High Performance Computing, Machine Learning, MapReduce, Object Storage, Oracle, Oracle Cloud, Performance Benchmark, Reliability, Scale-out architecture, Software-defined Datacenter, Storage Field Day, Tech Field Day, Virtualization


[Preamble: I have been invited by GestaltIT as a delegate to their TechFieldDay from Oct 17-19, 2018 in Silicon Valley, USA. My expenses, travel and accommodation are covered by GestaltIT, the organizer, and I was not obligated to blog or promote the technologies presented at this event. The content of this blog is of my own opinions and views]

The much maligned Oracle Cloud is getting a fresh reboot, starting with their Oracle Cloud Infrastructure (OCI), and significant enhancements and technology updates were announced at the Oracle Open World this week. I had the privilege to hear about Oracle Cloud’s new attack plan when they presented at Tech Field Day 17 last week.

Oracle Cloud has not had the best of days in recent months. Thomas Kurian’s resignation as their President of Product Development, in a disagreement with CTO and founder Larry Ellison over cloud software strategy, was highly publicized. Then there was an on-going lawsuit about how Oracle was misrepresenting their cloud revenue growth, which puts Oracle in a bad light.

On the local front here in Malaysia, I have heard from the grapevine of the aggressive nature of Oracle personnel pushing partners and customers to adopt their cloud services using legal scare tactics on their database licensing. A buddy of mine, who was previously the cloud business development manager at CTC Global, also shared Oracle’s cloud shortcomings compared to Amazon Web Services and Microsoft Azure a year ago.

The Oracle Cloud Infrastructure team aimed to overturn the bad perceptions, starting with the delegates of Tech Field Day 17, including yours truly. Their strategy was clear. Oracle Cloud Infrastructure runs the highest performance and the highest enterprise grade Infrastructure-as-a-Service (IaaS), bar none. Unlike the IBM Cloud, which in my opinion is a wishy-washy cloud service platform, Oracle Cloud’s ambition is solid.

They did a demo on the JD Edwards EnterpriseOne application, and they continue to demonstrate their prowess running the highest performance computing experience ever, for all enterprise-grade workloads. And that enterprise pedigree is clear.

Just this week, Amazon Prime Day had an outage. Amazon is in the process of weaning Oracle database from their entire ecosystem by 2020, and this outage clearly showed that the Oracle database and the enterprise applications would only run best on Oracle Cloud Infrastructure.


Magic happening

By cfheoh | March 8, 2018 - 1:30 pm |March 8, 2018 Amazon, Apple, Backup, BYOD, Cloud, Data Management, Deduplication, Disks, Dropbox, Filesystems, Object Storage, Reliability, Scale-out architecture, Security, Software Defined Storage, Storage Field Day, Uncategorized


[Preamble: I am a delegate of Storage Field Day 15 from Mar 7-9, 2018. My expenses, travel and accommodation are paid for by GestaltIT, the organizer and I am not obligated to blog or promote the technologies presented at this event. The content of this blog is of my own opinions and views]

The magic is happening.

Dropbox, the magical disruptor, is going IPO.

When Dropbox first entered the market that was eventually termed BYOD (Bring your Own Device), it was a phenomenon. There was nothing else that matched its simplicity and ease-of-use. A file uploaded into the cloud was instantaneously available on the tablets and smart phones. It was on every storage vendor’s presentation slides, using Dropbox as the perennial name dropping tactic to get end users’ buy-in.

Dropbox was more than that, and it went on to define a whole new market segment known as Enterprise File Synchronization and Sharing (EFSS), together with the likes of Box, Easishare (they are here in South East Asia), and just about everybody else. And the executive team at Dropbox knew they were special too, so much so that they rejected a buyout attempt by Apple in 2011.

Today, Dropbox is beyond BYOD and EFSS. They are a full-fledged collaboration platform that includes project management, project workflow, file versioning, secure file transfer, smart file synchronization and Dropbox Paper. And they offer comprehensive plans from Basic, Plus and Professional to Business and Enterprise. Their upcoming IPO, I am sure, will give them far greater capital to expand, and realize their full potential as the foremost content-based collaboration platform in the world.

Dropbox began their exodus from AWS a couple of years ago. They wanted to control their destiny and have moved more than 500PB of customer data into their own private data center. That was half-an-exabyte, people! And two years later, they had saved $75 million in operating costs after exiting AWS. Today, they have more than 1 Exabyte of customer data! That is just incredible.

And Dropbox’s storage architecture started with a simple foundational design called “Magic Pocket”. Magic Pocket is a “fixed-length, immutable” block storage layer.

Data is stored in blocks fixed at 4MB (for parallel performance and service resumption reasons), which are compressed and deduped (for capacity savings reasons), encrypted (for security reasons) and replicated (for high availability reasons).
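To make the idea of fixed-size, immutable, deduplicated blocks more concrete, here is a toy sketch. It is not Dropbox's implementation: the `BlockStore` class, the SHA-256 content keys, the zlib compression and the in-memory dictionaries standing in for replicas are all assumptions for illustration. Encryption is left as a comment, and the final block of a file may be shorter than 4MB in this sketch.

```python
import hashlib
import zlib
from typing import Dict, Iterator, List

BLOCK_SIZE = 4 * 1024 * 1024  # 4MB blocks, the size mentioned in the post


def split_blocks(data: bytes, size: int = BLOCK_SIZE) -> Iterator[bytes]:
    """Yield consecutive chunks of at most `size` bytes."""
    for offset in range(0, len(data), size):
        yield data[offset:offset + size]


class BlockStore:
    """Toy content-addressed store; each dict stands in for one replica."""

    def __init__(self, replicas: int = 2) -> None:
        self.replicas: List[Dict[str, bytes]] = [{} for _ in range(replicas)]

    def put(self, data: bytes) -> List[str]:
        """Store data and return the ordered list of block keys."""
        keys = []
        for block in split_blocks(data):
            key = hashlib.sha256(block).hexdigest()  # content hash as block key
            if key not in self.replicas[0]:          # dedup: identical block stored once
                packed = zlib.compress(block)        # compress for capacity savings
                # A real system would encrypt `packed` here (omitted in this sketch).
                for replica in self.replicas:        # copy to every replica
                    replica[key] = packed
            keys.append(key)
        return keys

    def get(self, keys: List[str]) -> bytes:
        """Reassemble the original data from its block keys."""
        return b"".join(zlib.decompress(self.replicas[0][k]) for k in keys)
```

Because a block is keyed by the hash of its content, it is never modified in place: changed data simply produces a new block with a new key, which is one common way to get the "immutable" property.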


Of Object Storage, Filesystems and Multi-Cloud

By cfheoh | November 22, 2017 - 12:25 pm |November 22, 2017 Amazon, CIFS, Cloud, Cloudian, Data Availability, Data Fabric, Data Management, Elastifile, Filesystems, High Performance Computing, Hyperconvergence, Nasuni, NFS, Object Storage, OpenIO, Openstack, Performance Benchmark, Performance Caching, Reliability, Scale-out architecture, Scality, Server SAN, SMB, Software Defined Storage, Software-defined Datacenter, Storage Optimization, swiftstack, Uncategorized, Virtualization


Data storage silos everywhere. The early clarion call was to eliminate IT data storage silos by moving to the cloud. Fast forward to the present. Data storage silos are still everywhere, but this time, they are in the clouds. I blogged about this.

Object Storage was all the rage when it first started. AWS, with its S3 (Simple Storage Service) offering, started the cloud storage frenzy. It was highly available, globally distributed, simple to access, and fitted superbly into the entire AWS ecosystem. Quickly, a smorgasbord of S3-compatible, S3-like object-based storage emerged: OpenStack Swift, HDS HCP, EMC Atmos, Cleversafe (which became IBM Cloud Object Storage), Inktank Ceph (which became RedHat Ceph), Bycast (acquired by NetApp to be StorageGrid), Quantum Lattus, Amplidata, and many more. For a period of a few years, it looked to me that the popularity of object storage with an S3 compatible front had overtaken distributed file systems.

What’s not to like? Object stores are distributed, they are metadata rich (at a certain structural level), they are immutable (hence secure from a certain point of view), and some even claim self-healing (depending on data protection policies). But one area where object storage rarely touted dominance was high performance I/O. There were some cases, but they were either fronted by a file system (eg. NFSv4.1 with pNFS extensions), or using some host-based, SAN-client agent (eg. StorNext or Intel Lustre). Object-based storage, in its native form, has not been positioned as high performance I/O storage.

A few weeks ago, I read an article by Dave Raffo on Storage Soup. When I read it, it felt oxymoronic. SwiftStack was just named a visionary in the Gartner Magic Quadrant for Distributed File Systems and Object Storage. But according to Dave’s article, SwiftStack did not want to be “associated” with object storage that much, even though SwiftStack’s technology underpinning was all object storage. Strange.
