Learning Robust Scheduling with Search and Attention

Time: 2023-03-15 21:57:38

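Allocating physical-layer resources to users according to channel quality, buffer size, requirements, and constraints is one of the core optimization problems in radio resource management. The solution space grows combinatorially with the cardinality of each dimension, so under strict time requirements it is hard to find an optimal solution with exhaustive search or even with classical optimization algorithms. The problem is more pronounced still in MU-MIMO scheduling, where the scheduler can assign multiple users to the same time-frequency physical resources. Traditional approaches therefore resort to heuristics that trade optimality for feasibility of execution. In this work, the authors treat MU-MIMO scheduling as a tree-structured combinatorial problem and, drawing on the recent success of AlphaGo Zero, investigate whether a combination of Monte Carlo Tree Search and reinforcement learning can feasibly search for the best-performing solutions. To suit the nature of the problem, such as the lack of an intrinsic ordering among users and the importance of dependencies between user combinations, they fundamentally modify the neural network architecture by introducing a self-attention mechanism. They then show that, in the presence of measurement uncertainty and finite buffers, the resulting approach is not only feasible but also substantially outperforms state-of-the-art heuristic-based scheduling methods.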

Original title: Learning Robust Scheduling with Search and Attention

Original text: Allocating physical layer resources to users based on channel quality, buffer size, requirements and constraints represents one of the central optimization problems in the management of radio resources. The solution space grows combinatorially with the cardinality of each dimension making it hard to find optimal solutions using an exhaustive search or even classical optimization algorithms given the stringent time requirements. This problem is even more pronounced in MU-MIMO scheduling where the scheduler can assign multiple users to the same time-frequency physical resources. Traditional approaches thus resort to designing heuristics that trade optimality in favor of feasibility of execution. In this work we treat the MU-MIMO scheduling problem as a tree-structured combinatorial problem and, borrowing from the recent successes of AlphaGo Zero, we investigate the feasibility of searching for the best performing solutions using a combination of Monte Carlo Tree Search and Reinforcement Learning. To cater to the nature of the problem at hand, like the lack of an intrinsic ordering of the users as well as the importance of dependencies between combinations of users, we make fundamental modifications to the neural network architecture by introducing the self-attention mechanism. We then demonstrate that the resulting approach is not only feasible but vastly outperforms state-of-the-art heuristic-based scheduling approaches in the presence of measurement uncertainties and finite buffers.
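
The abstract motivates self-attention by the lack of an intrinsic ordering among users. One way to see why this fits is that self-attention without positional encoding is permutation-equivariant: reordering the users only reorders the outputs. The sketch below is a minimal, standard-library-only illustration of that property and is not the authors' architecture. The feature sizes, the random weights, and the names `self_attention`, `users`, `w_q`, `w_k`, `w_v` are all assumptions made for the example.

```python
import math
import random

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def self_attention(users, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention, no positional encoding."""
    q, k, v = matmul(users, w_q), matmul(users, w_k), matmul(users, w_v)
    scale = math.sqrt(len(q[0]))
    # scores[i][j]: how strongly user i attends to user j
    scores = [[sum(a * b for a, b in zip(qi, kj)) / scale for kj in k] for qi in q]
    weights = [softmax(r) for r in scores]
    return matmul(weights, v)

rng = random.Random(0)
n_users, d_feat, d_model = 4, 3, 2  # hypothetical sizes

def rand_matrix(rows, cols):
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

# Hypothetical per-user features (e.g. channel quality, buffer size).
users = rand_matrix(n_users, d_feat)
w_q, w_k, w_v = (rand_matrix(d_feat, d_model) for _ in range(3))

out = self_attention(users, w_q, w_k, w_v)

# Feed the same users in a different order.
perm = [2, 0, 3, 1]
out_perm = self_attention([users[i] for i in perm], w_q, w_k, w_v)

# Row i of the permuted output should match row perm[i] of the original.
max_diff = max(
    abs(a - b)
    for i, p in enumerate(perm)
    for a, b in zip(out_perm[i], out[p])
)
equivariant = max_diff < 1e-9
print(equivariant)
```

Because each output row depends on all other users through the attention weights, the same mechanism can also represent dependencies between user combinations, which is the second property the abstract highlights.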