First looks into Interplanetary File System

The cryptocurrency craze has elevated another strong candidate in recent months. Filecoin is a leading voice for a decentralized Internet, the next-generation Web 3.0. In this blog, I am not going to write much about the Filecoin frenzy, but rather about the underlying distributed file system that powers this phenomenon: the Interplanetary File System.

[ Note: This is still a very new area for me, and the rest of the content of this blog is still nascent and developing ]

Interplanetary File System

Tremulous Client-Server web architecture

Almost the entire Internet architecture is client-server. Clients such as browsers and apps connect to Web services served from a collection of servers. As Web 3.0 approaches (some say it is already here), the client-server model is no longer perceived as the Internet architecture of choice. With billions and billions of users, applications and devices relying solely on centralized services, a failure or compromise at the center has far-reaching consequences, and the reasons for decentralization, away from the client-server architecture model of the Internet, are cogent.

This reliance on centralization can lead to dire consequences. Heck, just last week, some of us read about WD My Book Live NAS devices executing a factory reset and deleting all data on the devices. Whether this was en masse or not, it showed one of the burgeoning vulnerabilities of a centralized client-server architecture.

IPFS decentralized storage network

Interplanetary File System (IPFS) is a decentralized storage network where clients can store and retrieve data, just as we do today with Google® Photos, Dropbox® and many other cloud storage services. However, instead of talking to a central service, clients issue IPFS protocol put and get operations (similar in spirit to HTTP/S PUT and GET) against the distributed peer-to-peer storage network. Using the diagram below as reference, the clients are the IPFS cubes in the top left corner (Add Data = storing data using ipfs add) and the bottom right corner (Request Data = retrieving data using ipfs get).

Interplanetary File System (IPFS) Peer-to-Peer Network

In the circular structure in the middle of the diagram is the P2P network, where the IPFS nodes exchange information and share data using the IPFS protocols. Each node holds one or more structures (either in part or in their entirety) of a few distributed hash tables (DHT). These house key-value information as a lookup table that links each and every object's content identifier, the CID (the key), with the hashed value of the object. For simplicity's sake, each node is considered a Merkle Directed Acyclic Graph (DAG) node, an implementation variant of the Merkle tree, and is immutable.
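To make the lookup-table idea a little more concrete, here is a toy, single-process sketch of a DHT. It is not how IPFS implements it. I am assuming a Kademlia-style rule, where a record is stored on the peers whose IDs are closest to the key by XOR distance, and I am assuming the stored value is the set of peers that can provide the content. The names ToyDHT, provide and find_providers are made up for illustration.

```python
import hashlib


def key_id(data: bytes) -> int:
    """Map arbitrary bytes to a 256-bit integer ID using SHA-256."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def closest_peers(key: int, peer_ids, k: int = 2):
    """Return the k peer IDs with the smallest XOR distance to the key."""
    return sorted(peer_ids, key=lambda pid: pid ^ key)[:k]


class ToyDHT:
    """Hypothetical stand-in for a DHT: every peer owns a slice of one big table."""

    def __init__(self, peer_names):
        # Each peer gets an ID derived from its name and an empty local table.
        self.peers = {key_id(name.encode()): {} for name in peer_names}

    def provide(self, content: bytes, provider: str) -> int:
        """Record that `provider` holds `content`, on the peers closest to its key."""
        key = key_id(content)
        for pid in closest_peers(key, self.peers):
            self.peers[pid].setdefault(key, set()).add(provider)
        return key

    def find_providers(self, key: int):
        """Ask the peers closest to the key which providers they know about."""
        found = set()
        for pid in closest_peers(key, self.peers):
            found |= self.peers[pid].get(key, set())
        return found


dht = ToyDHT(["peer-a", "peer-b", "peer-c", "peer-d"])
cat_key = dht.provide(b"cat photo bytes", "peer-a")
providers = dht.find_providers(cat_key)
```

The point of the sketch is that no single peer holds the whole table, yet any client that knows the key can work out which peers to ask.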

From another angle, the diagram below shows how clients [ in (2) ] interact with the IPFS P2P Network [ in (3) ], where each IPFS cube (called a Merkle DAG node) houses and dynamically updates the Distributed Hash Table, DHT [ in (4) ], as everything from objects and key-value pairs to nodes changes continuously.

The blockchain aspect [ in (1) ] is not discussed in this blog, but it is the foundational storage and retrieval incentivization system for Filecoin. Terms like storage miners and retrieval miners are now moving out of the technobabble and into the mainstream speak of the cryptocurrency players.

IPFS DHT Key-to-Block Storage example

Content-based addressing

Now that I have lightly explained the IPFS decentralized storage network, we now look at IPFS hashing, which is shown at (5) in the diagram above.

In the client-server architecture model mentioned earlier, the client must be able to identify and recognize the location of the server. This could be a URL (Uniform Resource Locator) for HTTP/S such as https://www.example.com. The domain name www.example.com is an FQDN (fully qualified domain name) resolved by the DNS (Domain Name System), which translates the human-readable www.example.com into an IP address. Thus, the Internet ecosystem has, for decades, pretty much run on a location-based addressing system.

Borrowing from the BitTorrent concept, IPFS shares files via a P2P network protocol. The files, or in IPFS speak, the data objects, are sharded and hashed into a Merkle DAG. The diagram below shows how a cat picture is sharded into multiple commit objects. A content identifier (CID) is computed for tagging and identifying the content.

IPFS sharding and hashing a cat photo
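As a rough illustration of how a CID can be derived purely from content, the sketch below builds a version 1 CID for a single raw block: a SHA-256 digest wrapped in a multihash, prefixed with the CID version and codec, then base32 encoded. This is my assumption of the simplest case. Files added through IPFS are normally wrapped in extra DAG metadata, so the CID that IPFS reports for a real file will generally differ from what this function returns. The function name raw_cid_v1 is my own.

```python
import base64
import hashlib


def raw_cid_v1(block: bytes) -> str:
    """Build a CIDv1 string for one raw block (assumed: raw codec, sha2-256)."""
    digest = hashlib.sha256(block).digest()
    # Multihash = hash function code (0x12 for sha2-256) + digest length + digest.
    multihash = bytes([0x12, len(digest)]) + digest
    # CID = version (0x01) + content codec (0x55 for raw bytes) + multihash.
    cid_bytes = bytes([0x01, 0x55]) + multihash
    # Multibase: lower-case base32 without padding, marked by the prefix "b".
    encoded = base64.b32encode(cid_bytes).decode("ascii").lower().rstrip("=")
    return "b" + encoded


cid = raw_cid_v1(b"meow")
print(cid)
```

The same bytes always produce the same identifier, and changing a single byte produces a completely different one, which is what lets the identifier stand in for the content itself.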

Each commit object holds up to 256KB. Data larger than 256KB will be sharded into multiple chunks, which are then chained through metadata links sequentially from object to object, until the last piece, which is 256KB or smaller.
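The chunk-and-link step can be sketched in a few lines. This is only a minimal illustration under my own assumptions: plain SHA-256 hex digests stand in for real CIDs, and a single root object simply lists the chunk hashes in order, which is one simple way of linking the pieces and not necessarily the exact layout IPFS uses. The names chunk, build_dag and reassemble are hypothetical.

```python
import hashlib

CHUNK_SIZE = 256 * 1024  # 256KB, the chunk size quoted above


def chunk(data: bytes, size: int = CHUNK_SIZE):
    """Split data into pieces of at most `size` bytes; only the last may be shorter."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def build_dag(data: bytes, size: int = CHUNK_SIZE):
    """Hash every chunk, then create a root object that links to them in order.

    Returns (root_hash, blocks), where blocks maps a hash either to chunk bytes
    or, for the root, to the ordered list of chunk hashes.
    """
    blocks = {}
    links = []
    for piece in chunk(data, size):
        piece_hash = hashlib.sha256(piece).hexdigest()
        blocks[piece_hash] = piece  # identical chunks collapse into one entry
        links.append(piece_hash)
    # The root's own hash depends on every link, so any change in any chunk
    # changes the root hash as well.
    root_hash = hashlib.sha256("\n".join(links).encode()).hexdigest()
    blocks[root_hash] = links
    return root_hash, blocks


def reassemble(root_hash: str, blocks) -> bytes:
    """Follow the root's links in order and join the chunks back together."""
    return b"".join(blocks[link] for link in blocks[root_hash])


photo = bytes(range(256)) * 2400  # 600KB of stand-in "cat photo" data
root, blocks = build_dag(photo)
```

With 600KB of input, the root ends up linking to three chunks: two full 256KB pieces and one 88KB remainder.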

The CID represents the object. Once it is requested (for example via ipfs ls, ipfs cat or ipfs get), the network collects the chunks and reassembles the content, with the CID acting as the content's “address label”. This is IPFS content addressing, and this is where it differs from the location-based addressing of many client-server architectures. With the content address, the client can get to the data objects without needing to know where the data objects are located. This removes the centralization of a data storage service provider and partitions the content across the distributed, decentralized IPFS network. This leads to inherent advantages such as deduplication of content (further space savings), versioning (data permanence), immutability and, most of all, security.
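A tiny in-memory store shows why content addressing gives deduplication, immutability and a built-in integrity check. This is a sketch under my own assumptions and not the IPFS API: the address is just a SHA-256 hex digest, and ContentStore, put and get are names I made up for illustration.

```python
import hashlib


class ContentStore:
    """Hypothetical content-addressed store: the address is the hash of the data."""

    def __init__(self):
        self._blocks = {}

    def put(self, data: bytes) -> str:
        """Store data under its own hash and return that hash as the address."""
        address = hashlib.sha256(data).hexdigest()
        # Storing the same bytes twice lands on the same key: deduplication.
        self._blocks.setdefault(address, data)
        return address

    def get(self, address: str) -> bytes:
        """Fetch data by address and verify that it still hashes to that address."""
        data = self._blocks[address]
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("block does not match its address")
        return data

    def __len__(self) -> int:
        return len(self._blocks)


store = ContentStore()
first = store.put(b"cat photo")
again = store.put(b"cat photo")        # same content, same address
edited = store.put(b"cat photo v2")    # new content, new address (a new version)
```

Because an edit always lands at a new address, the old version stays reachable at its old address, and a client can verify whatever it receives from an untrusted peer simply by re-hashing it.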

Decentralization beyond Cryptocurrencies

Cryptocurrencies based on Proof-of-Space (PoS) and Proof-of-Space-Time (PoST) have been gaining in popularity in the past year, reaching a crescendo in the last 2 months. Chia coin has been the loudest, and I blogged about it a few weeks ago. But Filecoin’s ICO (initial coin offering) should not be discounted either, and not just for its financial prospects, because the Internet has now become a realm of social and political contention. These are further exacerbated by the rampant, ever more devastating impacts of ransomware and cybersecurity incidents of late.

With these in mind, decentralization presents a strong case to supplant the current state of the Internet. IPFS is in the leading pack of the decentralization conversation.

One Response to First looks into Interplanetary File System

mycatstinx says:

January 29, 2023 at 1:42 am

Your article is very well put together and makes for interesting reading. Blockchain is developing at present at such a fast pace that ideas and peoples general can quickly vaporise …
Being completely ignorant in the area of writing and understanding code scripts, which puts us in the 95 percent of the human race (yes, you guys and girls who can program overlooked that quirky little percentage!!), it is still possible to see the overall picture, whether it's SVG, JPG or PNG. It's simple: users of Twitter, Facebook and Instagram are feeders. They are like junkies with bad habits, or bad persons in fast sports cars; they ain't interested in trying to work out NFT storage or the IPFS mechanism, and in a sense, why should they? However, eventually image and blockchain will boom. In other words, we will be paying a couple of cents instead of dollars to leave images on-chain, and this is the interesting part: the entities that provide these services will expand in such a way that they will become immensely powerful and, of course, rich and successful … Your article explains that it is too expensive to place images on-chain … that's it right there … not quite true. At present there are places that provide the chance to upload images onto the block, not IPFS, for around 2 dollars a hit, which even at that price is worth the dollar. These guys are like you and … and of course my cat that stinx … we just love progress and freedom … so what should be said here won't be, because that's what you wanna read, and so without doubt my cat stinx!!!!