zl程序教程


Graph Neural Network Training with Data Tiering

Posted: 2023-03-14 22:37:12

Graph Neural Networks (GNNs) have shown success in learning from graph-structured data, with applications to fraud detection, recommendation, and knowledge graph reasoning. However, training GNNs efficiently is challenging because: 1) GPU memory capacity is limited and can be insufficient for large datasets, and 2) the graph-based data structure causes irregular data access patterns. In this work, we provide a method to statistically analyze and identify, ahead of GNN training, the data that will be accessed more frequently. Our data tiering method exploits not only the structure of the input graph but also insights gained from the actual GNN training process to achieve better predictions. Building on data tiering, we additionally provide a new data placement and access strategy that further reduces CPU-GPU communication overhead. We also take multi-GPU GNN training into account and demonstrate the effectiveness of our strategy on a multi-GPU system. Evaluation results show that our work reduces CPU-GPU traffic by 87-95% and improves GNN training speed over existing solutions by 1.6-2.1x on graphs with hundreds of millions of nodes and billions of edges.

Original title: Graph Neural Network Training with Data Tiering
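The abstract only sketches the idea at a high level. As a minimal, hypothetical illustration of structure-based data tiering (not the paper's actual algorithm; all function names here are made up for the example), the sketch below scores nodes by in-degree as a cheap proxy for how often their features will be fetched under neighbor sampling, then pins the hottest features in the GPU tier:

```python
import numpy as np

def access_scores(edges, num_nodes):
    # In neighbor-sampling GNN training, a node's feature vector is
    # fetched whenever that node is drawn as a neighbor, so nodes with
    # many in-edges tend to be fetched more often. In-degree is a simple
    # structural proxy for this access frequency (the paper additionally
    # uses observations from the actual training process to refine it).
    scores = np.zeros(num_nodes, dtype=np.int64)
    np.add.at(scores, edges[:, 1], 1)  # count in-edges per destination node
    return scores

def build_tiers(scores, gpu_capacity):
    # Tier 1: the `gpu_capacity` hottest node features live in GPU memory;
    # everything else stays in CPU memory (pinned, so the GPU can still
    # access it directly when a cache miss occurs).
    order = np.argsort(-scores)
    gpu_tier = set(order[:gpu_capacity].tolist())
    return gpu_tier

# Toy graph: node 0 is a hub that every other node points to.
edges = np.array([[i, 0] for i in range(1, 10)] + [[0, 1]])
scores = access_scores(edges, num_nodes=10)
gpu_tier = build_tiers(scores, gpu_capacity=2)
print(0 in gpu_tier)  # the hub's features are cached on the GPU -> True
```

With a skewed (e.g. power-law) degree distribution, a small GPU-resident tier covers most feature accesses, which is how a large reduction in CPU-GPU traffic becomes possible even when the full feature table far exceeds GPU memory.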