Understanding security practices in File Synchronization

Ho hum. Another day, and another data leak. What else is new?

The latest hullabaloo in my radar was from one of Malaysia’s reverent universities, UiTM, which reported a data leak of 11,891 student applicants’ private details including MyKad (national identity card) numbers of each individual. Reading from the news article, one can deduced that the unsecured link mentioned was probably from a cloud storage service, i.e. file synchronization software such as OneDrive, Google Drive, Dropbox, etc. Those files that can be easily shared via an HTTP/S URL link. Ah, convenience over the data security best practices. 

Cloud File Sync software

It irks me when data security practices are poorly practised. And it is likely that there is ignorance of data security practices in the first place.

It also irks me when many end users everywhere I have encountered tell me their file synchronization software is backup. That is just a very poor excuse of a data protection strategy, if any, especially in enterprise and cloud environments. Convenience, set-and-forget mentality. Out of sight. Out of mind. Right? 

Convenience is not data security. File Sync is NOT Backup

Many users are used to the convenience of file synchronization. The proliferation of cloud storage services with free Gigabytes here and there have created an IT segment based on BYOD, which transformed into EFSS, and now CCP. The buzzword salad involves the Bring-Your-Own-Device, which evolved into Enterprise-File-Sync-&-Share, and in these later years, Content-Collaboration-Platform.

All these are fine and good. The data industry is growing up, and many are leveraging the power of file synchronization technologies, be it on on-premises and from cloud storage services. Organizations, large and small, are able to use these file synchronization platforms to enhance their businesses and digitally transforming their operational efficiencies and practices. But what is sorely missing in embracing the convenience and simplicity is the much ignored cybersecurity housekeeping practices that should be keeping our files and data safe.

Continue reading

Reverting the Cloud First mindset

When cloud computing was all the rage, every business wanted to be on-board. Those who resisted felt the heat as the FOMO (fear of missing out) feeling set in, especially those who were doing this thing called “Digital Transformation“. The public cloud service providers took advantage of the cloud computing frenzy, calling for a “Cloud First” strategy. For a number of years, the marketing worked. The cloud first mentality became the tip of the tongue of many, encouraging droves to cloud adoption.

All this was fine and dandy but recently, we are beginning to hear and read about a few high profile cases of cloud repatriation. DHH‘s journal of Basecamp’s exit from AWS in late 2022 reverberated strongly, saying what should be a wake up call for those caught in the Cloud Computing Hotel California’s gilded cage. An even more bizarre claim about cost savings of $400 million over 3 years was made by Ahrefs, a Singapore SEO software maker which chose to use a co-location facility instead of a public cloud service.

Cloud First is not Cool (not sure where is the source is from but I got this off Twitter some months ago)

While these big news jail breaks are going against the grain, most are still in that diaspora to jump into the cloud services everywhere. In droves even. But, on and off, I am beginning to hear some grips, grunts and groans from end users in the cloud. These news have emboldened some to think that there is another choice besides shifting all IT and data services to the cloud.

Continue reading

Project COSI

The S3 (Simple Storage Service) has become a de facto standard for accessing object storage. Many vendors claim 100% compatibility to S3, but from what I know, several file storage services integration and validation with the S3 have revealed otherwise. There are certain nuances that have derailed some of the more advanced integrations. I shall not reveal the ones that I know of, but let us use this thought as a basis of our discussion for Project COSI in this blog.

Project COSI high level architecture

What is Project COSI?

COSI stands for Container Object Storage Interface. It is still an alpha stage project in Kubernetes version 1.25 as of September 2022 whilst the latest version of Kubernetes today is version 1.26. To understand the objectives COSI, one must understand the journey and the challenges of persistent storage for containers and Kubernetes.

For me at least, there have been arduous arguments of provisioning a storage repository that keeps the data persistent (and permanent) after containers in a Kubernetes pod have stopped, or replicated to another cluster. And for now, many storage vendors in the industry have settled with the CSI (container storage interface) framework when it comes to data persistence using file-based and block-based storage. You can find a long list of CSI drivers here.

However, you would think that since object storage is the most native storage to containers and Kubernetes pods, there is already a consistent way to accessing object storage services. From the objectives set out by Project COSI, turns out that there isn’t a standard way to provision and accessing object storage as compared to the CSI framework for file-based and block-based storage. So the COSI objectives were set to:

  • Kubernetes Native – Use the Kubernetes API to provision, configure and manage buckets
  • Self Service – A clear delineation between administration and operations (DevOps) to enable self-service capability for DevOps personnel
  • Portability – Vendor neutrality enabled through portability across Kubernetes Clusters and across Object Storage vendors

Further details describing Project COSI can be found here at the Kubernetes site titled “Introducing COSI: Object Storage Management using Kubernetes API“.

Standardization equals technology adoption

Standardization means consistency, control, confidence. The higher the standardization across the storage and containerized apps industry, the higher the adoption of the technology. And given what I have heard from the industry over these few years, Kubernetes, to me, even till this day, is a platform and a framework that are filled and riddled with so many moving parts. Many of the components looks the same, feels the same, and sounds the same, but might not work out the same when deployed.

Therefore, the COSI standardization work is important and critical to grow this burgeoning segment, especially when we are rocketing towards disaggregation of computing service units, resources that be orchestrated to scale up or down at the execution of codes. Infrastructure-as-Code (IAC) is becoming a reality more and more with each passing day, and object storage is at the heart of this transformation for Kubernetes and containers.

Continue reading

Rethinking data processing frameworks systems in real time

“Row, row, row your boat, gently down the stream…”

Except the stream isn’t gentle at all in the data processing’s new context.

For many of us in the storage infrastructure and data management world, the well known framework is storing and retrieve data from a storage media. That media could be a disk-based storage array, a tape, or some cloud storage where the storage media is abstracted from the users and the applications. The model of post processing the data after the data has safely and persistently stored on that media is a well understood and a mature one. Users, applications and workloads (A&W) process this data in its resting phase, retrieve it, work on it, and write it back to the resting phase again.

There is another model of data processing that has been bubbling over the years and now reaching a boiling point. Still it has not reached its apex yet. This is processing the data in flight, while it is still flowing as it passes through processing engine. The nature of this kind of data is described in one 2018 conference I chanced upon a year ago.

letgo marketplace processing numbers in 2018

  • * NRT = near real time

From a storage technology infrastructure perspective, this kind of data processing piqued my curiosity immensely. And I have been studying this burgeoning new data processing model in my spare time, and where it fits, bringing the understanding back into the storage infrastructure and data management side.

Continue reading

At the mercy of the cloud deity

Amazon Web Services (AWS) went down in the middle of last week. News of the outage were mentioned:

AWS Management Console unavailable error

Piling the misery

The AWS outage headlines attract the naysayers, the fickle armchair pundits, and the opportunists. Here are a few news articles that bring these folks to chastise the cloud giant.

Of course, I am one of these critics. I don’t deny that I am not. But I read this situation from a multicloud hyperbole of which I am not a fan. Too much multicloud whitewashing by vendors trying to pitch multicloud as a disaster recovery solution without understanding that this is easier said than done.

Continue reading

Control your Files. Control your Sovereignty.

Data residency, data sovereignty, data localization – the trio of data compliance and governance – have been on my mind a lot lately. I am seeing a disturbing trend. “Splinternet” has taken a hurried and hastened pace. We are now seeing many countries drawing up digital boundaries in the name of data privacy and data protection with sovereign laws and regulations. Besides, these digital demarcation along the lines with data definitions, digital “colonization” is a strong undercurrent as developing countries are accepting larger and more powerful foreign powers into their playpen.

Public cloud services transcend national borders. The breakneck speed in the adoption of public cloud services is causing anxieties and concerns with conservative governments everywhere. On the flip side of the coin, commerce has certainly flourished and bloomed as global wide collaborations bring new opportunities, new markets – all for capitalism and growth.

[ Note: While we are on this debacle, the voices of decentralization are getting louder as well, but that is a topic for another day ]

Where are your data files now?

Continue reading

Don’t go to the Clouds. Come back!

Almost in tandem last week, Nutanix™ and HPE appeared to have made denigrated comments about Cloud First mandates of many organizations today. Nutanix™ took to the annual .NEXT conference to send the message that cloud is wasteful. HPE campaigned against a UK Public Sector “Cloud First” policy.

Cloud First or Cloud Not First

The anti-cloud first messaging sounded a bit funny and hypocritical when both companies have a foot in public clouds, advocating many of their customers in the clouds. So what gives?

That A16Z report

For a numbers of years, many fear criticizing the public cloud services openly. For me, there are the 3 C bombs in public clouds.

  • Costs
  • Complexity
  • Control (lack of it)

Yeah, we would hear of a few mini heart attacks here and there about clouds overcharging customers, and security fallouts. But vendors then who were looking up to the big 3 public clouds as deities, rarely chastise them for the errors. Until recently.

The Cost of Cloud, a Trillion Dollar Paradox” released by revered VC firm Andreessen Horowitz in May 2021 opened up the vocals of several vendors who are now emboldened to make stronger comments about the shortcomings of public cloud services. The report has made it evident that public cloud services are not panacea of all IT woes.

The report has made it evident that public cloud services are not panacea of all IT woes. And looking at the trends, this will only get louder.

Use ours first. We are better

It is pretty obvious that both Nutanix™ and HPE have bigger stakes outside the public cloud IaaS (infrastructure-as-a-service) offerings. It is also pretty obvious that both are not the biggest players in this cloud-first economy. Given their weights in the respective markets, they are leveraging their positions to swing the mindsets to their turf where they can win.

“Use our technology and services. We are better, even though we are also in the public clouds.”

Not a zero sum game

But IT services and IT technologies are not a zero sum game. Both on-premises IT services and complementary public cloud services can co-exist. Both can leverage on each other’s strengths and support each other’s weaknesses, if you know how to blend and assimilate the best of both worlds. Hybrid cloud is the new black.

Gartner Hype Cycle

The IT pendulum swings. Technology hype goes fever pitch. Everyone thinks there is a cure for cancer. Reality sets in. They realize that they were wrong (not completely) or right (not completely). Life goes on. The Gartner® Hype Cycle explains this very well.

The cloud is OK

There are many merits having IT services provisioned in the cloud. Agility, pay-per-use, OPEX, burst traffic, seemingly unlimited resources and so. You can read more about it at Benefits of Cloud Computing: The pros and cons. Even AWS agrees to Three things every business needs from hybrid cloud, perhaps to the chagrin of these naysayers.

I opined that there is no single solution for everything. There is no Best Storage Technology Ever (a snarky post). And so, I believe there is nothing wrong of Nutanix™ and HPE, and maybe others, being hypocritical of their cloud and non-cloud technology offerings. These companies are adjusting and adapting to the changing landscapes of the IT environments, but it is best not to confuse the customers what tactics, strategy and vision are. Inconsistencies in messaging diminishes trust.

 

 

SSOT of Files

[ This is part two of “Where are your files living now?”. You can read Part One here ]

Data locality, Data mobility“. It was a term I like to use a lot when describing about data consolidation, leading to my mention about files and folders, and where they live in my previous blog. The thinking of where the files and folders are now as in everywhere as they can be in a plethora of premises stretches the premise of SSOT (Single Source of Truth). And this expatriation of files with minimal checks and balances disturbs me.

A year ago, just before I joined iXsystems, I was given Google® embargoed news, probably a week before they announced BigQuery Omni. Then I was interviewed by Enterprise IT News, a local Malaysian technology news portal to provide an opinion quote. This was what I quoted:

“’The data warehouse in the cloud’ managed services of Big Query is underpinned by Google® Anthos, its hybrid cloud infra and service management platform based on GKE (Google® Kubernetes Engine). The containerised applications, both on-prem and in the multi-clouds, would allow Anthos to secure and orchestrate infra, services and policy management under one roof.”

I further quoted ” The data repositories remain in each cloud is good to address data sovereignty, data security concerns but it did not mention how it addresses “single source of truth” across multi-clouds.

Single Source of Truth – regardless of repositories

Continue reading

Where are your files living now?

[ This is Part One of a longer conversation ]

EMC2 (before the Dell® acquisition) in the 2000s had a tagline called “Where Information Lives™**. This was before the time of cloud storage. The tagline was an adage of enterprise data storage, proper and contemporaneous to the persistent narrative at the time – Data Consolidation. Within the data consolidation stories, thousands of files and folders moved about the networks of the organizations, from servers to clients, clients to servers. NAS (Network Attached Storage) was, and still is the work horse of many, many organizations.

[ **Side story ] There was an internal anti-EMC joke within NetApp® called “Information has a new address”.

EMC tagline “Where Information Lives”

This was a time where there were almost no concerns about Shadow IT; ransomware were less known; and most importantly, almost everyone knew where their files and folders were, more or less (except in Oil & Gas upstream – to be told in later in this blog). That was because there were concerted attempts to consolidate data, and inadvertently files and folders, in the organization.

Even when these organizations were spread across the world, there were distributed file technologies at the time that could deliver files and folders in an acceptable manner. Definitely not as good as what we have today in a cloudy world, but acceptable. I personally worked a project setting up Andrew File Systems for Intel® in Penang in the mid-90s, almost joined Tacit Networks in the mid-2000s, dabbled on Microsoft® Distributed File System with NetApp® and Windows File Servers while fixing the mountains of issues in deploying the worldwide GUSto (Global Unified Storage) Project in Shell 2006. Somewhere in my chronological listings, Acopia Networks (acquired by F5) and of course, EMC2 Rainfinity and NetApp® NuView OEM, Virtual File Manager.

The point I am trying to make here is most IT organizations had a good grip of where the files and folders were. I do not think this is very true anymore. Do you know where your files and folders are living today? 

Continue reading

My 2-day weekend with Nextcloud on FreeNAS

In recent weeks, I have been asked by friends and old cust0mers on how to extend their NAS shared drives to work-from-home, the new reality. Malaysia went into a full lockdown as of June 1st several days ago.

I have written about file synchronization stories before but I have never done a Nextcloud blog. I have little experience with TrueNAS® CORE Nextcloud plugin and this was a good weekend to build it up from scratch with Virtualbox with FreeNAS™ 11.2U5 (because my friend was using that version).

[ Note ] FreeNAS™ 11.2U5 has been EOLed.

Nextcloud login screen

So, here it how it went for my little experiment. FYI, this is not a How-to guide. That will come later after I have put all my notes together with screenshots and all. This is just a collection of my thoughts while setting up Nextcloud on FreeNAS™.

Dropbox® is expensive

Using cloud storage with file sync and share capability is not exactly a cheap thing especially when you are a small medium sized business or a school or a charity organization. Here is the pricing table for Dropbox® for Business :

Dropbox for business pricing

I am using Dropbox® as the example here but the same can be said for OneDrive or Google Drive and others. The pricing can quickly add up when the price is calculated per user per month.

Continue reading