P2P networks of Bitcoin and Ethereum

P2P network

What is P2P? P2P (peer-to-peer) is a type of network topology in which nodes connect directly to one another. Each node in the network is both a provider and a consumer of resources and services. The following figure shows the basic structure of the P2P model.

Simple model of P2P network
From the above figure, we can also see some characteristics of a P2P network:

Decentralization: The resources and services in the network are distributed across all nodes, each node can store data (in a blockchain network, a full node stores all of it), and information can be transmitted directly between nodes without passing through an intermediary.
Scalability: Users can join the network at any time, and the system's resources and service capacity grow as they do. In theory, its scalability is almost unlimited.
Robustness: Because services are spread across many nodes, damage to some nodes or to part of the network has little impact on the rest, so P2P is resistant to attack and highly fault tolerant. When some nodes fail, a P2P network can generally adjust its overall topology automatically to keep the remaining nodes connected.
Cost-effectiveness: A P2P architecture can make effective use of the large number of ordinary nodes scattered across the Internet by distributing computing tasks or stored data among them. Using their idle computing power and storage space makes high-performance computing and mass storage achievable.
Privacy protection: In a P2P network, information is transmitted between scattered nodes without passing through a centralized link, so the chance of users' private information being eavesdropped on or leaked is greatly reduced.
Load balancing: Since each node is both a server and a client, the demands on server computing power and storage found in the traditional C/S model are reduced. At the same time, because resources are distributed across multiple nodes, load is balanced better across the whole network.
How should nodes join the network? There are generally two ways to join a P2P network:

First, through a known peer: the new node connects to a peer that already exists in the P2P network and is active, and that peer then shares the connection information it holds for the network.
Second, through a tracker server: before entering the P2P network, the new node visits the tracker server to obtain the node information of the P2P network, then connects to the peers listed there and so joins the network.
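
The two joining methods can be sketched in Python as follows. This is a minimal in-memory sketch, not a real network client: the Peer and Tracker classes and all of their method names are hypothetical, introduced only for illustration, and no sockets are involved.

```python
from dataclasses import dataclass, field


@dataclass
class Peer:
    """Hypothetical in-memory stand-in for a P2P node."""
    address: str
    known: set = field(default_factory=set)  # addresses of peers this node knows

    def share_peers(self) -> set:
        # An active peer shares the addresses it knows, plus its own.
        return self.known | {self.address}

    def join_via_peer(self, seed: "Peer") -> None:
        # Method 1: connect to one known active peer and copy its peer list.
        self.known |= seed.share_peers() - {self.address}
        seed.known.add(self.address)

    def join_via_tracker(self, tracker: "Tracker") -> None:
        # Method 2: ask the tracker for the peer list, then register with it.
        self.known |= tracker.list_peers() - {self.address}
        tracker.register(self.address)


@dataclass
class Tracker:
    """Hypothetical tracker that only stores peer addresses."""
    peers: set = field(default_factory=set)

    def list_peers(self) -> set:
        return set(self.peers)

    def register(self, address: str) -> None:
        self.peers.add(address)
```

In both cases the new node ends up with a list of addresses to connect to. The difference is only where that list comes from: an ordinary peer or a dedicated server.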
Now that we know the characteristics of a P2P network and how nodes join it, which network models does P2P include? There are four: centralized, purely distributed, hybrid, and structured. Each is briefly introduced below.

Centralized: There is a central node that saves the index information of all other nodes. The index information generally includes each node's IP address, port, and resources.

Purely distributed: The central node is removed and the P2P nodes form a random network: a newly added node establishes a connection with a randomly chosen node already in the P2P network, which produces a random topology.

Purely distributed organization

Hybrid: A mixture of the centralized and distributed structures. Multiple super nodes form a distributed network among themselves, and each super node, together with the common nodes attached to it, forms a local centralized network.

Structured: A distributed network in which all nodes are organized in an orderly manner according to a certain structure, such as a ring or a tree. Structured networks are generally implemented on the idea of the DHT (Distributed Hash Table). DHT is only a network model, not a specific implementation. Specific implementations include algorithms such as Chord, Pastry, CAN, and Kademlia. Ethereum uses the Kademlia algorithm.
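
To make the ring structure concrete, here is a minimal sketch of how a key can be assigned to a node on a hash ring. It follows the successor rule used by Chord-style DHTs but is not the full Chord protocol (there are no finger tables or node joins). The 16-bit ID space and the names ring_id and successor are assumptions chosen for illustration.

```python
import bisect
import hashlib

RING_BITS = 16  # assumed small ID space, for illustration only


def ring_id(name: str) -> int:
    """Hash a name onto the ring [0, 2**RING_BITS)."""
    digest = hashlib.sha1(name.encode()).digest()
    return int.from_bytes(digest, "big") % (2 ** RING_BITS)


def successor(node_ids: list, key_id: int) -> int:
    """Return the first node ID at or after key_id, wrapping around the ring."""
    ordered = sorted(node_ids)
    index = bisect.bisect_left(ordered, key_id)
    return ordered[index % len(ordered)]
```

For example, with nodes at positions 10, 200, and 5000, a key hashed to position 150 is stored on node 200, while a key at position 6000 wraps around to node 10.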

That concludes the introduction to P2P. Different blockchains use different network models. Next, we introduce the two most representative blockchain networks: the Bitcoin network and the Ethereum network.

Second, the P2P network in Bitcoin
The nodes in the Bitcoin network have four main functions: wallet, mining, blockchain database, and network routing. Every node has the routing function, but not every node has the others. Generally, a Bitcoin Core node contains all four functions. A node containing all functions is also called a full node.

Four major functions of Bitcoin network nodes
Except for the Bitcoin Core wallet, which is a full node, most other wallets are light nodes. Through these light node wallets, users can check their account balances, manage wallet addresses and private keys, and initiate transactions. Another type of node is the mining node. A mining node that stores all the block data is also a full node, and is generally an independent miner.
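
The way node types combine the four functions can be modeled with bit flags. The sketch below is only an illustration: the names NodeFunction, FULL_NODE, LIGHT_WALLET, and SOLO_MINER, and the exact function sets assigned to the light wallet and the independent miner, are assumptions rather than definitions taken from Bitcoin Core.

```python
from enum import Flag, auto


class NodeFunction(Flag):
    WALLET = auto()
    MINER = auto()
    BLOCKCHAIN = auto()  # stores the full blockchain database
    ROUTING = auto()     # network routing, present on every node


# Assumed example combinations of the four functions.
FULL_NODE = (NodeFunction.WALLET | NodeFunction.MINER
             | NodeFunction.BLOCKCHAIN | NodeFunction.ROUTING)
LIGHT_WALLET = NodeFunction.WALLET | NodeFunction.ROUTING
SOLO_MINER = NodeFunction.MINER | NodeFunction.BLOCKCHAIN | NodeFunction.ROUTING


def has_all_functions(functions: NodeFunction) -> bool:
    """True if the node provides all four functions."""
    return functions == FULL_NODE


def stores_full_chain(functions: NodeFunction) -> bool:
    """True if the node keeps the complete blockchain database."""
    return NodeFunction.BLOCKCHAIN in functions
```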

Some mining nodes do not mine independently but join with other nodes in a mining pool to mine collectively; these are called collective miners. They form a mining pool network, which is a centralized network: the central node is the mining pool server, and the other miners connect to it. Communication between the miners and the mining pool server does not use the Bitcoin protocol but a dedicated mining pool protocol, the mainstream one being the Stratum protocol.

Bitcoin network
In addition, after a miner creates a new block, the block needs to be broadcast to all nodes in the network. Only after the network accepts the block is the miner's reward valid, and only then does hash calculation for the next block begin. Miners therefore want to minimize the time between broadcasting a new block and starting work on the next one, which calls for a dedicated broadcast network that speeds up block propagation. This propagation network is called the Bitcoin Relay Network.
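
Why propagation time matters can be seen in a simple flooding model, where each node forwards a new block to its neighbors the first time it sees it. The sketch below is an assumption-laden illustration, not Bitcoin's actual relay logic: the flood function and the adjacency-dictionary graph are hypothetical, and every hop is treated as taking the same time.

```python
from collections import deque


def flood(graph: dict, origin: str) -> dict:
    """Return the hop count at which each reachable node first sees the block."""
    hops = {origin: 0}
    queue = deque([origin])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, []):
            if neighbor not in hops:  # forward only to nodes that have not seen it
                hops[neighbor] = hops[node] + 1
                queue.append(neighbor)
    return hops
```

The largest hop count in the result approximates how long the whole network takes to learn about the block. A relay network aims to reduce that number by giving miners shorter paths to one another.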

In the Bitcoin network, a node can send its own peer list to neighboring nodes. After the initial node discovery, therefore, a node obtains copies of the peer lists held by its neighbors.

Third, the P2P network in Ethereum
Like Bitcoin, Ethereum nodes also have four major functions: wallet, mining, blockchain database, and network routing. There are also different types of nodes. The biggest difference from Bitcoin's P2P network is that Ethereum's P2P network is structured. It is implemented using the Kademlia (Kad) algorithm, which can route to and locate nodes and data quickly and accurately.

Kad's routing table is built from data structures called K buckets, which record each node's NodeId, distance, endpoint, IP, and other information. The K buckets are sorted according to the distance from the target node. There are 256 K buckets in total, and each K bucket holds up to 16 nodes.
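
A routing table of this shape can be sketched as follows. Kademlia measures the distance between two IDs as their bitwise XOR. The sketch assumes node IDs are plain 256-bit integers and places each node in the bucket given by the highest bit in which its ID differs from the local ID, which is the usual Kademlia arrangement rather than a detail taken from an Ethereum client. The names RoutingTable, bucket_index, and xor_distance are hypothetical.

```python
BUCKET_COUNT = 256  # from the text: 256 K buckets
BUCKET_SIZE = 16    # from the text: up to 16 nodes per K bucket


def xor_distance(a: int, b: int) -> int:
    """Kademlia distance between two node IDs."""
    return a ^ b


def bucket_index(local_id: int, node_id: int) -> int:
    """Bucket for node_id: position of the highest bit that differs from local_id."""
    distance = xor_distance(local_id, node_id)
    if distance == 0:
        raise ValueError("a node does not store itself")
    return distance.bit_length() - 1


class RoutingTable:
    def __init__(self, local_id: int):
        self.local_id = local_id
        self.buckets = [[] for _ in range(BUCKET_COUNT)]

    def add(self, node_id: int) -> bool:
        """Insert node_id; return False if already present or the bucket is full."""
        bucket = self.buckets[bucket_index(self.local_id, node_id)]
        if node_id in bucket or len(bucket) >= BUCKET_SIZE:
            return False
        bucket.append(node_id)
        return True

    def closest(self, target_id: int, count: int = BUCKET_SIZE) -> list:
        """Return up to count known nodes, nearest to target_id first."""
        nodes = [n for bucket in self.buckets for n in bucket]
        return sorted(nodes, key=lambda n: xor_distance(n, target_id))[:count]
```

Real implementations also decide what to do when a bucket is full, typically by checking whether an existing entry is still online before dropping the newcomer. This sketch simply rejects the new node.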

Communication between nodes in the Kad network is based on UDP and consists mainly of the following commands. If the PING-PONG handshake between two nodes succeeds, the corresponding node is considered online.

PING: probes a node to determine whether it is online
PONG: the response to PING
FINDNODE: asks a node for nodes that are close to the target node ID
NEIGHBORS: the response to FINDNODE, returning the nodes in the K bucket that are close to the target node ID
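
The request and response pairing of these four commands can be sketched without any real networking. In the sketch below, the Command enum and the handle and is_online functions are hypothetical names, node IDs are plain integers, and closeness is measured with the XOR metric that Kademlia uses. Packet encoding, signatures, and UDP transport are deliberately left out.

```python
from enum import Enum


class Command(Enum):
    PING = "PING"
    PONG = "PONG"
    FINDNODE = "FINDNODE"
    NEIGHBORS = "NEIGHBORS"


def handle(command: Command, known_nodes: list, target_id: int = 0, count: int = 16):
    """Return the (command, payload) reply to a request, or None if no reply is due."""
    if command is Command.PING:
        return (Command.PONG, None)
    if command is Command.FINDNODE:
        # Reply with the known nodes nearest to the target, by XOR distance.
        nearest = sorted(known_nodes, key=lambda n: n ^ target_id)[:count]
        return (Command.NEIGHBORS, nearest)
    return None  # PONG and NEIGHBORS are themselves replies


def is_online(reply) -> bool:
    """A node counts as online once its PONG arrives."""
    return reply is not None and reply[0] is Command.PONG
```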

When it is started for the first time, the system randomly generates the NodeId of the local node, recorded as LocalId; it stays fixed after generation. The local node is recorded as local-eth.
The system reads the public node information and writes it into the K bucket after the ping-pong handshake is completed.
The system refreshes the K bucket every 7200ms.
The process of refreshing the K bucket is as follows:

a. Randomly generate the target node ID, recorded as TargetId, and start counting the number of discoveries (from 1) and the refresh time.
b. Calculate the distance between TargetId and LocalId, recorded as Dlt.
c. For each node in the K bucket, whose NodeId is recorded as KadId, calculate the distance between KadId and TargetId, recorded as Dkt.
d. Find the nodes in the K bucket for which Dlt is greater than Dkt, recorded as K bucket nodes, and send each of them the FINDNODE command, which contains TargetId.
e. On receiving the FINDNODE command, a K bucket node also executes steps b-d and uses the NEIGHBORS command to send the nodes it finds in its own K bucket back to the local node.
f. On receiving NEIGHBORS, the local node writes the received nodes into its K bucket.
g. If the number of searches does not exceed 8 and the refresh time does not exceed 600ms, return to step b and repeat.
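
The refresh steps above, read in order as steps a through g, can be simulated in memory. This is a sketch under several assumptions: the routing table is flattened to a set of integer node IDs, distance is the XOR of two IDs, and the network argument is a hypothetical stand-in for the UDP exchange that maps a node ID to the IDs that node knows. The early exit when a round learns nothing new is an addition for the sketch and is not part of the described process.

```python
import random
import time

MAX_ROUNDS = 8     # from the text: at most 8 searches
MAX_SECONDS = 0.6  # from the text: at most 600 ms per refresh


def refresh(local_id, table, network, target_id=None, id_bits=256):
    """Run one refresh, updating the set `table` in place; return rounds used."""
    if target_id is None:
        target_id = random.getrandbits(id_bits)  # step a: random TargetId
    started = time.monotonic()
    rounds = 0
    # step g: loop while both the round and the time limits hold
    while rounds < MAX_ROUNDS and time.monotonic() - started < MAX_SECONDS:
        rounds += 1
        d_lt = local_id ^ target_id  # step b: Dlt
        # steps c-d: nodes whose Dkt is smaller than Dlt receive FINDNODE
        closer = [n for n in table if (n ^ target_id) < d_lt]
        found = set()
        for node in closer:
            # step e: the remote node repeats b-d and answers with NEIGHBORS
            d_nt = node ^ target_id
            found |= {m for m in network.get(node, ()) if (m ^ target_id) < d_nt}
        new = found - table - {local_id}
        if not new:
            break  # assumption: stop early when nothing new was learned
        table |= new  # step f: write received nodes into the table
    return rounds
```

With a local ID of 0, a target of 15, a table containing only node 8, and a network in which 8 knows 12, 12 knows 14, and 14 knows 15, each round moves one node closer to the target until the table holds 8, 12, 14, and 15.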