Rethinking Sharding and Smart Contracts For Maximizing Blockchain Throughput

in blockchain •  5 years ago  (edited)

Taking a break from our series, David Beberman wrote this post after viewing the Scaling Ethereum 2019 LIVE — Day Two presentation.

A  lot of research has gone into how to make blockchains scale through  sharding. From what I can tell, the main concept is to enable parallel  execution of multiple transactions on separate shards, without  compromising the immutability and security of the blockchain, including  all the shards. Most of the research I’ve been able to find focuses on  algorithms for consensus on shards. Although all of this research looks  very promising, I want to take a look at sharding from a different  angle.

For  the sake of argument, let’s say that a consensus algorithm for sharding  exists. Further, let’s say that this algorithm runs on an open  permissionless blockchain, and is available today. My question is, even  with this, do we get the scaling that is hoped for with sharding. To get  a feel for this, lets take a quick look at Amdahl’s Law[1]:


                                                                          

      Amdahl's Law

This  shows that the theoretical speedup of the execution of the whole task  increases with the improvement of the resources of the system and that  regardless of the magnitude of the improvement, the theoretical speedup  is always limited by the part of the task that cannot benefit from the  improvement.[2] This essentially states that the  throughput of a system, once all the parallelizable portions are  maximized, is limited to the throughput of the serialized portions.

For blockchain sharding, I interpret this as meaning that the  potential increase in throughput is currently limited to the quantity  of transactions that can be executed simultaneously on separate shards. That is, if a transaction on a shard needs data from another shard it has to synchronize the transfer of the data from the other shard. This is a point of serialization and given a large pool of shards, according to Amdahl’s Law dominates the throughput.

Individual Native Coin Transactions

A  transaction between two accounts limited explicitly to only  transferring coin balances between the accounts, such as sending coin  between two accounts, does not need data from any other account.  Therefore, provided the data for both accounts is available on a  particular shard, the transaction can be executed asynchronously with  other transactions on other accounts. This scales with the number of  shards and the number of disjoint transactions between pairs of  accounts. As the number of accounts grows, one would expect that the  opportunity for sharding such independent transactions grows as well. In  the limit the throughput is dominated by the time it takes to execute a  single transaction regardless of how many transactions are being  executed at any given point in time. This is exactly the situation we  want for improving blockchain throughput.

Smart Contracts For Sharding

Things  do not workout so well for smart contracts. A smart contract is  implemented on the blockchain as a single ledger account with data state  associated with the program code. Each transaction runs through the  code changing the data state. Each change is recorded on the blockchain  and verified with a hash representing the state. To maintain a  deterministic, consistent state of the blockchain, each smart contract  can only be executed on one shard at a time, unless the state of the  smart contract account itself can be sharded. In general, this makes the  smart contract account execution the limiting factor for all of the  accounts that are sending transactions to the smart contract. As smart  contracts are used for tokenization, it is highly likely that as a given  token increases in circulation, its smart contract becomes a bottleneck  for throughput, regardless of how many shards exists, as predicted by  Amdahl’s Law.

With the current implementation model for smart contracts I see only two possible ways for scaling with sharding:

· Use multiple smart contracts segregated on shards

· Use deterministic multi-threaded smart contracts, (aka SIMD)

Multiple  smart contracts can take advantage of sharding.  For example, if each  smart contract representing a token is assigned to a separate shard,  then transactions for a given token do not affect transactions on other  tokens. Although each individual token’s transactions are limited to the  throughput of the smart contract on its specific shard, with a large  and growing number of tokens, the number of shards can grow linearly  with them. This doesn’t solve the problem of the throughput of an  individual smart contract, but it is an improvement over all the smart  contracts on a single blockchain without sharding. Even with this  approach, issues crop up if any state is needed from a user account that  is shared among the shards (e.g. native coin to pay for the  transactions).

A  technique from supercomputing called vectorization enables a program to  execute sections of its code in parallel. This is known as single  instruction multiple data (“SIMD”). Programs written for SIMD are  written to be deterministic. In essence SIMD machines, which today are  general purpose graphics processor units (“GPGPU”), shard their data  across an array of processors. This works very well for certain classes  of applications, such as matrix operations for graphics and similar.

This  technique could be applied to blockchain smart contracts theoretically.  That is, a smart contract could be written explicitly to support  parallel execution of transactions in some manner. I don’t know exactly  how this would be implemented. Perhaps blockchain researches will come  up with something innovative in this area. However, even with such a  solution, writing a smart contract that is SIMD capable becomes  significantly more complex, one would expect.

Neither  multiple smart contracts nor SIMD smart contracts are ideal solutions,  although both may provide some opportunity for scale.

Smart Object Assets Help Enable Sharding

Limiting  points for sharding with smart contracts is both sharding of state and  code execution. If there were a means to avoid transactions serializing  on a single smart contract state and code execution, sharding could  increase throughput scaling. To put this another way, if there was a  means for multiple instruction multiple data (“MIMD”) execution, the  opportunity for blockchain sharding would be significantly improved.

As was described in “Rethinking The Blockchain Account Concept”[3],  if each user account had its own state, instead of using separate smart  contracts, then each user account could contain objects that represent  assets, whether as tokens or other types of entities. As described in  “Extensible Smart Object Assets, Smart Object Asset Ownership and  Fractional Smart Object Asset Ownership With the DataGrid Blockchain Extensible Blockchain Object Model”[4],  (XSOA)’s and references to XSOA’s could be used to transfer ownership  between accounts with transactions directly between the account states.

For  example, given two sets of transactions, where each transaction is  between different accounts, that is: one transaction is from account A  to account B; and another transaction is between account C to account D,  then the transactions can be executed on different shards  simultaneously. Further, because the code for the XSOA’s is independent  of any of the accounts, and may be different code for each of the  transactions we can realize sharding for a MIMD model. That is different  code on each shard and different data on each shard.

The  limiting point for scale here is the number of transactions that can  take place simultaneously between disjoint account sets. We would expect  that as the quantity of accounts grows, the opportunity for disjoint  account sets within any group of transactions grows as well, which in  turn would result in a growing opportunity for sharding.

Conclusion

Using  as a given the availability of a sharding consensus algorithm, an  outstanding question is how to make use of such technology. Smart  contracts inherently serialize transactions and other than a complex  SIMD type solution, only offer scaling by using multiple separate  isolated smart contracts. Even with that, each smart contract’s  throughput is limited to a single shard’s throughput. By rethinking the  user account to include state information, and using the XBOM model, the  DataGrid Blockchain offers a solution to sharding scalability that  scales with the number of accounts and disjoint transactions among the  accounts. In addition to enabling inheritance and live code reuse, we  believe that this is a significant solution to the blockchain scaling  problem.

[1] https://en.wikipedia.org/wiki/Amdahl%27s_law

[2] ibid

[3] https://medium.com/@dbeberman/rethinking-the-blockchain-account-concept-6c94748f8021

[4] https://medium.com/@dbeberman/extensible-smart-object-assets-smart-object-asset-ownership-and-fractional-smart-object-asset-995c259a8508

Authors get paid when people like you upvote their post.
If you enjoyed what you read here, create your account today and start earning FREE STEEM!