IPFS is a peer-to-peer (p2p) storage network. Content is accessible through peers located anywhere in the world, and those peers may relay information, store it, or do both. IPFS finds what you ask for using its content address rather than its location.
Three key ideas underpin IPFS:
- Content addressing for unique identification
- Directed acyclic graphs (DAGs) for content linking
- Distributed hash tables (DHTs) for content discovery
These three ideas build on one another to make the IPFS ecosystem possible. Let's start with content addressing and the unique identification of content.
IPFS uses content addressing to identify content by what is in it rather than by where it is located. In IPFS, every piece of content has a content identifier, or CID, which is derived from a hash of that content. Looking for an item by its content is something you already do all the time. For instance, when you look for a book in the library, you usually ask for it by title; that is content addressing, because you are asking for what the book is. If you were using location addressing to find it, you would say, "I want the book that's on the second floor, first stack, third shelf from the bottom, four books from the left." If someone moved that book, you would be out of luck!
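To make the idea concrete, here is a minimal sketch of a content-addressed store. It assumes a plain SHA-256 hex digest as the address; real IPFS CIDs carry extra self-describing information on top of the hash, so `content_address` and `ContentStore` are hypothetical names used only for illustration.

```python
import hashlib


def content_address(data: bytes) -> str:
    """Return a toy content address: the SHA-256 digest of the bytes, in hex."""
    return hashlib.sha256(data).hexdigest()


class ContentStore:
    """A tiny in-memory store keyed by what the data is, not where it lives."""

    def __init__(self) -> None:
        self._blocks = {}

    def put(self, data: bytes) -> str:
        address = content_address(data)
        self._blocks[address] = data  # identical content maps to the same key
        return address

    def get(self, address: str) -> bytes:
        return self._blocks[address]


store = ContentStore()
address = store.put(b"Moby-Dick, full text")
same_address = store.put(b"Moby-Dick, full text")   # same content, same address
other_address = store.put(b"Moby-Dick, abridged")   # different content, new address
```

Because the address depends only on the bytes, it does not matter which shelf (or which peer) the data sits on: anyone holding the same content computes the same address.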
IPFS and many other distributed systems use a data structure called a directed acyclic graph, or DAG. Specifically, IPFS uses Merkle DAGs, in which each node has a unique identifier that is a hash of the node's contents. Sound familiar? This is the content identifier (CID) concept from content addressing: identifying a data object by the value of its hash is exactly what content addressing means. If you're interested in learning more about Merkle DAGs, check out our guide on them.
A Merkle DAG can be structured in many different ways, and IPFS uses one that is optimized for representing directories and files. Git, by contrast, uses a Merkle DAG that holds many versions of your repository.
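The following is a toy sketch of a Merkle DAG, not the IPLD format that IPFS actually uses. It assumes that a node's identifier is the SHA-256 hash of its data plus the identifiers of the nodes it links to; `node_id` and `MerkleDag` are hypothetical names.

```python
import hashlib
import json


def node_id(data: bytes, links: list) -> str:
    """Hash a node's data together with the IDs of the nodes it links to."""
    payload = json.dumps({"data": data.hex(), "links": links}).encode()
    return hashlib.sha256(payload).hexdigest()


class MerkleDag:
    def __init__(self) -> None:
        self.nodes = {}  # identifier -> (data, list of child identifiers)

    def add(self, data: bytes, links=()) -> str:
        links = list(links)
        identifier = node_id(data, links)
        self.nodes[identifier] = (data, links)
        return identifier


dag = MerkleDag()
readme = dag.add(b"hello")
logo = dag.add(b"logo bytes")
docs_v1 = dag.add(b"dir:docs", [readme, logo])

# Editing one file changes its ID, which in turn changes the parent's ID.
readme_v2 = dag.add(b"hello, world")
docs_v2 = dag.add(b"dir:docs", [readme_v2, logo])
```

Two properties fall out of this construction. The unchanged `logo` node is stored once and shared by both directory versions, and a node can only link to nodes whose identifiers already exist, so the graph cannot contain a cycle.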
To find which peers are hosting the content you're after (discovery), IPFS uses a distributed hash table, or DHT. A hash table is a database of keys and their values. A distributed hash table is one in which the table is split across all the peers in a distributed network. To find content, you ask these peers.
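As a rough illustration of a table split across peers, the sketch below assigns each key to the peer whose ID is closest to the key's hash under XOR distance. This is a simplifying assumption: `ToyDht` has a global view of every peer, whereas a real DHT has to route queries from peer to peer. All names here are hypothetical.

```python
import hashlib


def key_id(name: str) -> int:
    """Map a peer name or content key into one shared integer ID space."""
    return int.from_bytes(hashlib.sha256(name.encode()).digest(), "big")


class Peer:
    def __init__(self, name: str) -> None:
        self.name = name
        self.id = key_id(name)
        self.table = {}  # this peer's slice of the overall table


class ToyDht:
    def __init__(self, peers: list) -> None:
        self.peers = peers

    def responsible_peer(self, key: str) -> Peer:
        """Pick the peer whose ID is closest to the key (XOR distance)."""
        target = key_id(key)
        return min(self.peers, key=lambda peer: peer.id ^ target)

    def put(self, key: str, value: str) -> None:
        self.responsible_peer(key).table[key] = value

    def get(self, key: str):
        return self.responsible_peer(key).table.get(key)


peers = [Peer(f"peer-{n}") for n in range(8)]
dht = ToyDht(peers)
for n in range(100):
    dht.put(f"key-{n}", f"value-{n}")
```

No single peer holds the whole table, yet any key can be found again because every participant applies the same rule to decide which peer is responsible for it.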
The libp2p project is the part of the IPFS ecosystem that provides the DHT and handles peers connecting and talking to each other. (Note that, just like IPLD, libp2p can also be used as a tool for other distributed systems, not only IPFS.)
Once you know where your content is (or, more precisely, which peers are storing each of the blocks that make up the content you're after), you use the DHT again to find the current location of those peers (routing). So, to get to content, you use libp2p to query the DHT twice.
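The two queries can be sketched with two lookup tables standing in for the DHT. The record names, CIDs, peer IDs, and addresses below are made-up placeholders (the IP addresses come from ranges reserved for documentation), and `locate` is a hypothetical helper, not a libp2p function.

```python
# First lookup (discovery): which peers provide a given block?
provider_records = {
    "cid-of-block-a": ["peer-1", "peer-2"],
    "cid-of-block-b": ["peer-2"],
}

# Second lookup (routing): where can each of those peers be reached right now?
peer_records = {
    "peer-1": ["/ip4/192.0.2.10/tcp/4001"],
    "peer-2": ["/ip4/198.51.100.7/tcp/4001"],
}


def locate(cid: str) -> dict:
    """Resolve a CID to the current addresses of the peers that provide it."""
    providers = provider_records.get(cid, [])  # query 1: CID -> peer IDs
    # query 2: peer ID -> network addresses
    return {peer: peer_records.get(peer, []) for peer in providers}


addresses = locate("cid-of-block-a")
```

Keeping the two kinds of record separate means a peer can change its network address without anyone having to update the records that say which content it holds.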
You have discovered your content and found its current location(s). Now you need to connect to the peers that hold it and retrieve it. To request blocks from and send blocks to other peers, IPFS currently uses a module called Bitswap.
Bitswap lets you connect to the peer or peers that have the content you want, send them your wantlist (a list of all the blocks you're interested in), and have them send you the blocks you requested. Once those blocks arrive, you can verify them by hashing their content to obtain CIDs and comparing those to the CIDs you requested. These CIDs also let you deduplicate blocks if needed.
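The verification step can be sketched as follows. This is not the Bitswap wire protocol: the remote peer is modelled as a plain dictionary, a SHA-256 hex digest stands in for a real CID, and `toy_cid` and `fetch_blocks` are hypothetical names.

```python
import hashlib


def toy_cid(block: bytes) -> str:
    """Stand-in for a CID: the SHA-256 digest of the block, in hex."""
    return hashlib.sha256(block).hexdigest()


def fetch_blocks(wantlist, remote_blocks):
    """Request every wanted block and keep only those whose hash matches."""
    received = {}
    for cid in set(wantlist):  # duplicate wantlist entries are requested once
        block = remote_blocks.get(cid)
        if block is None:
            continue  # the peer does not have this block
        if toy_cid(block) != cid:
            continue  # content does not match the requested CID, so discard it
        received[cid] = block
    return received


good = b"chapter one"
tampered_key = toy_cid(b"chapter two")
remote = {toy_cid(good): good, tampered_key: b"something else"}
wantlist = [toy_cid(good), tampered_key, toy_cid(good), toy_cid(b"chapter three")]
result = fetch_blocks(wantlist, remote)
```

Only the block whose content hashes to the requested identifier survives, which is why you do not need to trust the peer that sent it.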
Other content replication protocols are also under discussion, the most developed of which is Graphsync. There is also a proposal under discussion to extend the Bitswap protocol with functionality around requests and responses.
As you may have gathered from this discussion, the IPFS ecosystem is made up of many modular libraries that each support a specific part of any distributed system. You can use any part of the stack on its own or combine the parts in novel ways.
The IPFS ecosystem gives CIDs to content and links that content together by generating IPLD Merkle DAGs. You can discover content using a DHT provided by libp2p, open a connection to any provider of that content, and download it over a multiplexed connection. All of this is held together by the middle of the stack: linked, unique identifiers, the essential part on which IPFS is built. That, in short, is how IPFS works.




















