[TSMC] Adaptive Resource Scheduling in Permissionless Sharded-Blockchains: A Decentralized Multiagent Deep Reinforcement Learning Approach

Guangsheng Yu, Xu Wang, Wei Ni, Qinghua Lu, Xiwei Xu, Ren Ping Liu, Liming Zhu

IEEE Transactions on Systems, Man, and Cybernetics: Systems

Permissionless sharded-Blockchains have recently come on the scene. However, systematic formulations and experiments regarding the behaviors of individual miners are lacking. In this article, we interpret block mining in a permissionless sharded-Blockchain as a repeated M-player noncooperative game with finite actions, and propose a new multiagent deep reinforcement learning (MADRL) framework that allows the miners to maximize their profits in a decentralized fashion by scheduling their resources across the shards without centralized coordination. We formulate the rewards and design a two-scale action space for each miner to reduce the action space and expedite convergence. We also propose a new MADRL model, named Rainbow-WoLF-PHC, which allows each miner to learn its resource allocation online and converge quickly to a mixed-strategy Nash equilibrium. Extensive experiments show the superiority of Rainbow-WoLF-PHC over its alternatives in terms of convergence, stability, and profitable actions. This work provides a promising design of an end-user-friendly permissionless sharded-Blockchain.
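To make the game-theoretic framing concrete, the following is a minimal sketch of plain tabular WoLF-PHC (Win or Learn Fast, Policy Hill-Climbing) in a stateless repeated game. It is not the Rainbow-WoLF-PHC model, whose details the abstract does not give. The toy reward (a per-shard payout split evenly among the miners that chose that shard), the single-shard action per round, and all names and parameter values (`WoLFPHCAgent`, `simulate`, `shard_rewards`, the learning rates) are assumptions made for illustration only.

```python
import random


class WoLFPHCAgent:
    """Tabular WoLF-PHC learner for a stateless repeated game (illustrative)."""

    def __init__(self, n_actions, alpha=0.1, delta_win=0.01, delta_lose=0.04, seed=None):
        if n_actions < 2:
            raise ValueError("need at least two actions")
        self.n = n_actions
        self.alpha = alpha            # Q-value learning rate (assumed value)
        self.delta_win = delta_win    # small policy step when "winning"
        self.delta_lose = delta_lose  # larger policy step when "losing"
        self.q = [0.0] * n_actions
        self.pi = [1.0 / n_actions] * n_actions      # current mixed strategy
        self.avg_pi = [1.0 / n_actions] * n_actions  # running average strategy
        self.count = 0
        self.rng = random.Random(seed)

    def act(self):
        # Sample an action from the current mixed strategy.
        return self.rng.choices(range(self.n), weights=self.pi)[0]

    def update(self, action, reward):
        # Stateless Q update: there is no next-state bootstrap term.
        self.q[action] += self.alpha * (reward - self.q[action])

        # Incrementally update the average strategy.
        self.count += 1
        for a in range(self.n):
            self.avg_pi[a] += (self.pi[a] - self.avg_pi[a]) / self.count

        # "Winning" means the current strategy has a higher expected Q-value
        # than the average strategy; learn slowly then, fast otherwise.
        v_pi = sum(p * q for p, q in zip(self.pi, self.q))
        v_avg = sum(p * q for p, q in zip(self.avg_pi, self.q))
        delta = self.delta_win if v_pi > v_avg else self.delta_lose

        # Hill-climb: shift probability mass toward the greedy action,
        # never taking more from an action than it currently holds.
        best = max(range(self.n), key=self.q.__getitem__)
        for a in range(self.n):
            if a != best:
                step = min(self.pi[a], delta / (self.n - 1))
                self.pi[a] -= step
                self.pi[best] += step


def simulate(n_miners, shard_rewards, rounds, seed=0):
    """Run independent learners with no central coordinator.

    Hypothetical toy reward: each shard's payout is split evenly among
    the miners that picked it in that round.
    """
    n_shards = len(shard_rewards)
    agents = [WoLFPHCAgent(n_shards, seed=seed + i) for i in range(n_miners)]
    for _ in range(rounds):
        choices = [agent.act() for agent in agents]
        counts = [choices.count(s) for s in range(n_shards)]
        for agent, s in zip(agents, choices):
            agent.update(s, shard_rewards[s] / counts[s])
    return agents


if __name__ == "__main__":
    for i, agent in enumerate(simulate(4, [1.0, 2.0, 3.0], rounds=5000)):
        print(i, [round(p, 3) for p in agent.pi])
```

Each agent observes only its own action and reward, which mirrors the decentralized setting described above. The two step sizes are the core of WoLF: a miner adjusts cautiously while it is doing better than its historical average and adapts quickly otherwise.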
