zl程序教程


Modular Networks Prevent Catastrophic Interference in Model-Based Multi-Task Reinforcement Learning

Posted: 2023-03-15 21:57:34

In a multi-task reinforcement learning setting, the learner commonly benefits from training on multiple related tasks by exploiting similarities among them. At the same time, the trained agent is able to solve a wider range of different problems. While this effect is well documented for model-free multi-task methods, we demonstrate a detrimental effect when a single learned dynamics model is used for multiple tasks. We therefore address the fundamental question of whether model-based multi-task reinforcement learning benefits from a shared dynamics model in the same way that model-free methods benefit from a shared policy network. Using a single dynamics model, we see clear evidence of task confusion and reduced performance. As a remedy, enforcing an internal structure on the learned dynamics model by training isolated sub-networks for each task notably improves performance while using the same number of parameters. We illustrate our findings by comparing both methods on a simple gridworld and a more complex ViZDoom multi-task experiment.

Original title: Modular Networks Prevent Catastrophic Interference in Model-Based Multi-Task Reinforcement Learning

Original abstract: In a multi-task reinforcement learning setting, the learner commonly benefits from training on multiple related tasks by exploiting similarities among them. At the same time, the trained agent is able to solve a wider range of different problems. While this effect is well documented for model-free multi-task methods, we demonstrate a detrimental effect when using a single learned dynamics model for multiple tasks. Thus, we address the fundamental question of whether model-based multi-task reinforcement learning benefits from shared dynamics models in a similar way model-free methods do from shared policy networks. Using a single dynamics model, we see clear evidence of task confusion and reduced performance. As a remedy, enforcing an internal structure for the learned dynamics model by training isolated sub-networks for each task notably improves performance while using the same amount of parameters. We illustrate our findings by comparing both methods on a simple gridworld and a more complex vizdoom multi-task experiment.
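The remedy the abstract describes is an architectural constraint: instead of one monolithic dynamics network shared across tasks, the model is partitioned into isolated sub-networks, one per task, so gradients from one task cannot overwrite weights used by another. The following is a minimal NumPy sketch of that idea only, not the authors' implementation; the class name, parameter shapes, and the two-layer MLP form are all illustrative assumptions, and training code is omitted.

```python
import numpy as np

class ModularDynamicsModel:
    """Hypothetical sketch: one isolated sub-network per task.

    Each task gets its own small two-layer MLP mapping
    (state, action) -> predicted next state. Because the
    parameter sets are disjoint, updating task i's weights
    cannot interfere with task j's predictions. To match a
    monolithic baseline's parameter budget, each sub-network
    would simply be made proportionally smaller.
    """

    def __init__(self, n_tasks, state_dim, action_dim, hidden_dim, seed=0):
        rng = np.random.default_rng(seed)
        in_dim = state_dim + action_dim
        # One independent (W1, b1, W2, b2) parameter set per task.
        self.params = [
            (
                rng.standard_normal((in_dim, hidden_dim)) * 0.1,
                np.zeros(hidden_dim),
                rng.standard_normal((hidden_dim, state_dim)) * 0.1,
                np.zeros(state_dim),
            )
            for _ in range(n_tasks)
        ]

    def predict(self, state, action, task_id):
        # Route the input through the sub-network owned by task_id;
        # no other task's parameters are touched.
        W1, b1, W2, b2 = self.params[task_id]
        x = np.concatenate([state, action])
        h = np.tanh(x @ W1 + b1)
        return h @ W2 + b2  # predicted next state
```

Under this sketch, the "task confusion" failure mode of a single shared model is ruled out by construction: each task's transition function is fit by weights no other task ever updates.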