On the Unreasonable Effectiveness of Feature Propagation in Learning on Graphs with Missing Node Features

2023-03-20 14:50:38

Original title: On the Unreasonable Effectiveness of Feature Propagation in Learning on Graphs with Missing Node Features

Abstract: While Graph Neural Networks (GNNs) have recently become the de facto standard for modeling relational data, they impose a strong assumption on the availability of the node or edge features of the graph. In many real-world applications, however, features are only partially available; for example, in social networks, age and gender are available only for a small subset of users. We present a general approach for handling missing features in graph machine learning applications that is based on minimization of the Dirichlet energy and leads to a diffusion-type differential equation on the graph. The discretization of this equation produces a simple, fast, and scalable algorithm which we call Feature Propagation. We experimentally show that the proposed approach outperforms previous methods on seven common node-classification benchmarks and can withstand surprisingly high rates of missing features: on average, we observe only around a 4% relative accuracy drop when 99% of the features are missing. Moreover, it takes only 10 seconds to run on a graph with ~2.5M nodes and ~123M edges on a single GPU.
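The abstract describes the algorithm only at a high level: minimize the Dirichlet energy, which gives a diffusion equation, then discretize it. The following is a minimal sketch of that idea, not the authors' implementation. It assumes the unnormalized energy E(x) = 1/2 · Σ over edges (i, j) of (x_i − x_j)², whose minimizer with the observed values held fixed makes every missing entry the average of its neighbors. The function name `feature_propagation`, the plain neighbor averaging, the zero initialization, and the default of 40 iterations are all assumptions made for illustration; the paper may use a different normalization of the adjacency matrix.

```python
def feature_propagation(edges, num_nodes, features, known, num_iters=40):
    """Fill missing node features by repeated neighbor averaging.

    edges:    list of undirected (u, v) node-index pairs
    features: per-node lists of floats (values at unknown entries are ignored)
    known:    per-node lists of booleans, True where the feature is observed
    Assumption: plain neighbor averaging; other normalizations are possible.
    """
    # Build adjacency lists for an undirected graph.
    neighbors = [[] for _ in range(num_nodes)]
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)

    num_feats = len(features[0])
    # Start from the observed values, with zeros in the missing entries.
    x = [[features[i][f] if known[i][f] else 0.0 for f in range(num_feats)]
         for i in range(num_nodes)]

    for _ in range(num_iters):
        new_x = [[0.0] * num_feats for _ in range(num_nodes)]
        for i in range(num_nodes):
            if not neighbors[i]:
                continue  # isolated node: nothing to diffuse from
            for j in neighbors[i]:
                for f in range(num_feats):
                    new_x[i][f] += x[j][f] / len(neighbors[i])
        # Clamp: observed entries are reset to their original values.
        for i in range(num_nodes):
            for f in range(num_feats):
                if known[i][f]:
                    new_x[i][f] = features[i][f]
        x = new_x
    return x


# Path graph 0-1-2-3 with one feature, observed only at the two end nodes.
edges = [(0, 1), (1, 2), (2, 3)]
features = [[0.0], [0.0], [0.0], [3.0]]
known = [[True], [False], [False], [True]]
filled = feature_propagation(edges, 4, features, known)
print([round(row[0], 3) for row in filled])
```

On this toy path graph the two missing values converge to the linear interpolation between the observed end points (about 1.0 and 2.0), which is the smoothest assignment in the Dirichlet-energy sense. Each iteration touches every edge once per feature, so the cost is linear in the number of edges, consistent with the scalability the abstract reports.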