
机器学习学术速递[12.22]

时间:2023-04-18 16:13:12

cs.LG 方向,今日共计98篇

Graph相关(图学习|图神经网络|图优化等)(6篇)

【1】 Controversy Detection: a Text and Graph Neural Network Based Approach 标题:争议检测:一种基于文本和图形神经网络的方法 链接:https://arxiv.org/abs/2112.11445

作者:Samy Benslimane,Jérome Azé,Sandra Bringay,Maximilien Servajean,Caroline Mollevi 机构: LIRMM UMR , CNRS, University of Montpellier, Montpellier, France, AMIS, Paul Valery University, Montpellier, France, Institut du Cancer Montpellier (ICM), Montpellier, France, IDESP, UMR Inserm - Université de Montpellier, Montpellier, France 摘要:有争议的内容是指吸引正面和负面反馈的任何内容。它的自动识别,特别是在社交媒体上的自动识别,是一项具有挑战性的任务,因为它需要在大量不断变化、主题多样的帖子上完成。大多数现有方法依赖于主题讨论的图结构和/或消息的内容。本文提出了一种同时基于讨论图结构和文本特征的争议检测方法。我们提出的方法依赖图神经网络(GNN),在执行图分类任务之前将图表示(包括其文本)编码为嵌入向量,随后将帖子分类为有争议或无争议。本文提出了两种争议检测策略。第一种基于层次化的图表示学习:图中的用户节点被分层、迭代地嵌入,从而计算整个图的嵌入向量。第二种基于注意力机制,允许每个用户节点在计算节点嵌入时对其邻居赋予或多或少的重要性。我们使用不同的真实数据集进行实验来评估我们的方法。实验表明,结合文本特征和结构信息对性能有积极影响。 摘要:Controversial content refers to any content that attracts both positive and negative feedback. Its automatic identification, especially on social media, is a challenging task as it should be done on a large number of continuously evolving posts, covering a large variety of topics. Most of the existing approaches rely on the graph structure of a topic-discussion and/or the content of messages. This paper proposes a controversy detection approach based on both graph structure of a discussion and text features. Our proposed approach relies on Graph Neural Network (GNN) to encode the graph representation (including its texts) in an embedding vector before performing a graph classification task. The latter will classify the post as controversial or not. Two controversy detection strategies are proposed. The first one is based on a hierarchical graph representation learning. Graph user nodes are embedded hierarchically and iteratively to compute the whole graph embedding vector. The second one is based on the attention mechanism, which allows each user node to give more or less importance to its neighbors when computing node embeddings. We conduct experiments to evaluate our approach using different real-world datasets. Conducted experiments show the positive impact of combining textual features and structural information in terms of performance.
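
下面给出一个极简的示意性代码草图(并非论文原始实现),用来说明第二种策略中“注意力加权的邻居聚合 + 图级读出 + 二分类”的大致流程;其中的网络结构、维度与随机小图数据均为假设。

```python
import torch
import torch.nn as nn

class AttnGraphClassifier(nn.Module):
    """示意:注意力加权的邻居聚合 + 均值读出 + 争议/非争议二分类。"""
    def __init__(self, in_dim, hid_dim):
        super().__init__()
        self.proj = nn.Linear(in_dim, hid_dim)
        self.attn = nn.Linear(2 * hid_dim, 1)   # 为每条边打一个注意力分数
        self.out = nn.Linear(hid_dim, 2)

    def forward(self, x, adj):
        # x: [N, in_dim] 节点特征(如用户文本嵌入); adj: [N, N] 0/1 邻接矩阵
        h = torch.relu(self.proj(x))                      # [N, hid]
        n = h.size(0)
        pair = torch.cat([h.unsqueeze(1).expand(n, n, -1),
                          h.unsqueeze(0).expand(n, n, -1)], dim=-1)
        scores = self.attn(pair).squeeze(-1)              # [N, N]
        scores = scores.masked_fill(adj == 0, float("-inf"))
        alpha = torch.softmax(scores, dim=-1)             # 每个节点对其邻居的注意力
        alpha = torch.nan_to_num(alpha)                   # 处理没有邻居的节点
        h = torch.relu(alpha @ h)                         # 注意力加权聚合
        g = h.mean(dim=0)                                 # 图级读出(均值池化)
        return self.out(g)                                # 争议/非争议 logits

# 用法示意:5 个用户节点、16 维文本特征的随机小图
x = torch.randn(5, 16)
adj = (torch.rand(5, 5) > 0.5).float()
logits = AttnGraphClassifier(16, 32)(x, adj)
print(logits.shape)  # torch.Size([2])
```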

【2】 Supervised Graph Contrastive Pretraining for Text Classification 标题:有监督图对比预训练在文本分类中的应用 链接:https://arxiv.org/abs/2112.11389

作者:Samujjwal Ghosh,Subhadeep Maji,Maunendra Sankar Desarkar 机构:IIT Hyderabad, Amazon 备注:A condensed version of this paper has been accepted to ACM SAC'22. DOI: this https URL 摘要:文本分类的对比预训练技术在无监督环境下得到了广泛的研究。但是,通常可以获得来自相关任务的标注数据,这些任务与当前任务共享标签语义。我们假设有效地使用这些标注数据可以在当前任务上取得更好的泛化。在本文中,我们提出了一种新的基于图的有监督对比学习方法,有效地利用相关任务中的标注数据。我们通过将样本级的监督信息外推到令牌来构造令牌图。由此得到的嵌入空间中,属于同一类别概率高/低的令牌彼此靠近/远离。我们还给出了详细的理论分析,作为我们方法的动机。在$13$个数据集上的实验中,我们表明我们的方法比已有预训练方案平均高出$2.5\%$,比基于样本级对比学习的方案平均高出$1.8\%$。此外,我们还展示了我们的方法在零样本设置下的跨域有效性,平均提升$3.91\%$。最后,我们还证明了我们的方法可以在知识蒸馏设置中作为带噪教师,显著提升基于Transformer的模型在低标注数据情形下的性能,平均提高$4.57\%$。 摘要:Contrastive pretraining techniques for text classification has been largely studied in an unsupervised setting. However, oftentimes labeled data from related tasks which share label semantics with current task is available. We hypothesize that using this labeled data effectively can lead to better generalization on current task. In this paper, we propose a novel way to effectively utilize labeled data from related tasks with a graph based supervised contrastive learning approach. We formulate a token-graph by extrapolating the supervised information from examples to tokens. Our formulation results in an embedding space where tokens with high/low probability of belonging to same class are near/further-away from one another. We also develop detailed theoretical insights which serve as a motivation for our method. In our experiments with $13$ datasets, we show our method outperforms pretraining schemes by $2.5\%$ and also example-level contrastive learning based formulation by $1.8\%$ on average. In addition, we show cross-domain effectiveness of our method in a zero-shot setting by $3.91\%$ on average. Lastly, we also demonstrate our method can be used as a noisy teacher in a knowledge distillation setting to significantly improve performance of transformer based models in low labeled data regime by $4.57\%$ on average.
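
下面给出有监督对比损失的一个通用最小实现草图(并非论文的令牌图构造本身),用于说明“同类样本在嵌入空间相互拉近、异类拉远”这一核心目标;温度参数与示例数据均为假设。

```python
import torch
import torch.nn.functional as F

def sup_con_loss(emb, labels, tau=0.1):
    """有监督对比损失的简化版:同标签样本互为正例。emb: [N, d], labels: [N]"""
    z = F.normalize(emb, dim=-1)
    sim = z @ z.t() / tau                              # 余弦相似度 / 温度
    n = z.size(0)
    self_mask = torch.eye(n, dtype=torch.bool)
    sim = sim.masked_fill(self_mask, float("-inf"))    # 去掉与自身的比较
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    pos = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    pos_cnt = pos.sum(1).clamp(min=1)
    # 对每个锚点,对其所有正例的 log 概率取平均(没有正例的样本跳过)
    loss = -(log_prob * pos).sum(1) / pos_cnt
    return loss[pos.sum(1) > 0].mean()

emb = torch.randn(8, 32, requires_grad=True)
labels = torch.tensor([0, 0, 1, 1, 2, 2, 0, 1])
print(sup_con_loss(emb, labels))
```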

【3】 An Inference Approach To Question Answering Over Knowledge Graphs 标题:一种基于知识图的问答推理方法 链接:https://arxiv.org/abs/2112.11070

作者:Aayushee Gupta,K. M. Annervaz,Ambedkar Dukkipati,Shubhashis Sengupta 机构:International Institute of Information Technology, Bangalore, Annervaz K M, Indian Institute of Science & Accenture Technology Labs 备注:10 pages, 4 figures, 4 tables 摘要:知识图(KG)是从大型自然语言文本语料库中提取信息的重要工具。知识图上的自然语言查询问题对于人类对这些信息的消费至关重要。通常通过将自然语言查询转换为结构化查询,然后在KG上启动结构化查询来解决此问题。文献中关于知识图的直接回答模型很少。查询转换模型和直接模型都需要与知识图领域相关的特定训练数据。在这项工作中,我们将知识图上的自然语言查询问题转化为前提假设对上的推理问题。通过对转换后的代理推理问题使用经过训练的深度学习模型,我们为原始自然语言查询问题提供了解决方案。我们的方法在MetaQA数据集上实现了90%以上的准确率,超过了现有的最新技术。我们还提出了一个推理模型,称为分层循环路径编码器(HRPE)。推理模型可以进行微调,以便在训练数据较少的领域中使用。我们的方法不需要大量特定于领域的训练数据来查询来自不同领域的新知识图。 摘要:Knowledge Graphs (KG) act as a great tool for holding distilled information from large natural language text corpora. The problem of natural language querying over knowledge graphs is essential for the human consumption of this information. This problem is typically addressed by converting the natural language query to a structured query and then firing the structured query on the KG. Direct answering models over knowledge graphs in literature are very few. The query conversion models and direct models both require specific training data pertaining to the domain of the knowledge graph. In this work, we convert the problem of natural language querying over knowledge graphs to an inference problem over premise-hypothesis pairs. Using trained deep learning models for the converted proxy inferencing problem, we provide the solution for the original natural language querying problem. Our method achieves over 90% accuracy on MetaQA dataset, beating the existing state-of-the-art. We also propose a model for inferencing called Hierarchical Recurrent Path Encoder(HRPE). The inferencing models can be fine-tuned to be used across domains with less training data. Our approach does not require large domain-specific training data for querying on new knowledge graphs from different domains.

【4】 ANUBIS: A Provenance Graph-Based Framework for Advanced Persistent Threat Detection 标题:Anubis:一种基于起源图的高级持久威胁检测框架 链接:https://arxiv.org/abs/2112.11032

作者:Md. Monowar Anjum,Shahrear Iqbal,Benoit Hamelin 机构:National Research Council, Fredericton, NB, Canada, Tutte Institute for Mathematics and Computing, Ottawa, ON, Canada 备注:Accepted for publication in the 37th ACM SIGAPP Symposium on Applied Computing (SAC 2022) 摘要:我们提出了一种基于机器学习的高效APT检测系统ANUBIS。我们的ANUBIS设计理念包括两个主要部分。首先,我们希望网络响应团队能够有效利用ANUBIS。因此,预测可解释性是ANUBIS设计的主要重点之一。其次,ANUBIS使用系统起源图来捕获因果关系,从而实现高检测性能。ANUBIS预测能力的核心是一个贝叶斯神经网络,它可以告诉我们自己对预测的信心有多大。我们根据最近的APT数据集(DARPA OpTC)对ANUBIS进行了评估,结果表明,ANUBIS可以高精度地检测类似于APT活动的恶意活动。此外,ANUBIS还了解了高级模式,可以向威胁分析师解释其预测。高预测性能和可解释的攻击故事重建使ANUBIS成为企业网络防御的有效工具。 摘要:We present ANUBIS, a highly effective machine learning-based APT detection system. Our design philosophy for ANUBIS involves two principal components. Firstly, we intend ANUBIS to be effectively utilized by cyber-response teams. Therefore, prediction explainability is one of the main focuses of ANUBIS design. Secondly, ANUBIS uses system provenance graphs to capture causality and thereby achieve high detection performance. At the core of the predictive capability of ANUBIS, there is a Bayesian Neural Network that can tell how confident it is in its predictions. We evaluate ANUBIS against a recent APT dataset (DARPA OpTC) and show that ANUBIS can detect malicious activity akin to APT campaigns with high accuracy. Moreover, ANUBIS learns about high-level patterns that allow it to explain its predictions to threat analysts. The high predictive performance with explainable attack story reconstruction makes ANUBIS an effective tool to use for enterprise cyber defense.
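
论文预测能力的核心是一个能给出预测置信度的贝叶斯神经网络。下面用 MC Dropout 这一常见近似(并非原文所用的具体贝叶斯方法)示意“多次随机前向、以预测均值和标准差刻画置信度”的做法;网络结构与输入特征均为假设。

```python
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Dropout(0.3), nn.Linear(64, 2))

def predict_with_confidence(x, samples=50):
    """推断时保持 dropout 开启做多次前向,返回恶意概率的均值与标准差(不确定性)。"""
    net.train()                      # 保持 dropout 随机性
    with torch.no_grad():
        probs = torch.stack([torch.softmax(net(x), dim=-1)[:, 1] for _ in range(samples)])
    return probs.mean(0), probs.std(0)

events = torch.randn(4, 20)          # 4 条由溯源图派生的行为特征向量(示意数据)
mean, std = predict_with_confidence(events)
for m, s in zip(mean, std):
    print(f"恶意概率 {m:.2f} ± {s:.2f}")
```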

【5】 ACGNet: Action Complement Graph Network for Weakly-supervised Temporal Action Localization 标题:ACGNet:弱监督时间动作定位的动作补充图网络 链接:https://arxiv.org/abs/2112.10977

作者:Zichen Yang,Jie Qin,Di Huang 机构: State Key Laboratory of Software Development Environment, Beihang University, Beijing , China, College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing , China 备注:Accepted to AAAI 2022 摘要:未剪辑视频中的弱监督时间动作定位(WTAL)已经成为一项实际但具有挑战性的任务,因为只有视频级别的标签可用。现有方法通常利用现成的段级特征,这些特征存在空间不完整性和时间不一致性,从而限制了它们的性能。在本文中,我们从一个新的角度来解决这个问题,通过一个简单而有效的图卷积网络,即动作补充图网络(ACGNet)来增强段级表示。它有助于当前视频片段感知来自其他视频片段的时空依赖性,这些视频片段可能传达互补的线索,从而隐式地减轻上述两个问题造成的负面影响。通过这种方式,段级特征对时空变化更具辨别力和鲁棒性,有助于提高定位精度。更重要的是,所提出的ACGNet作为一个通用模块工作,可以灵活地插入不同的WTAL框架,同时保持端到端的训练方式。在THUMOS'14和ActivityNet1.2基准上进行了广泛的实验,最先进的结果清楚地表明了所提方法的优越性。 摘要:Weakly-supervised temporal action localization (WTAL) in untrimmed videos has emerged as a practical but challenging task since only video-level labels are available. Existing approaches typically leverage off-the-shelf segment-level features, which suffer from spatial incompleteness and temporal incoherence, thus limiting their performance. In this paper, we tackle this problem from a new perspective by enhancing segment-level representations with a simple yet effective graph convolutional network, namely action complement graph network (ACGNet). It facilitates the current video segment to perceive spatial-temporal dependencies from others that potentially convey complementary clues, implicitly mitigating the negative effects caused by the two issues above. By this means, the segment-level features are more discriminative and robust to spatial-temporal variations, contributing to higher localization accuracies. More importantly, the proposed ACGNet works as a universal module that can be flexibly plugged into different WTAL frameworks, while maintaining the end-to-end training fashion. Extensive experiments are conducted on the THUMOS'14 and ActivityNet1.2 benchmarks, where the state-of-the-art results clearly demonstrate the superiority of the proposed approach.

【6】 GCN-Geo: A Graph Convolution Network-based Fine-grained IP Geolocation Framework 标题:GCN-GEO:一种基于图卷积网络的细粒度IP地理定位框架 链接:https://arxiv.org/abs/2112.10767

作者:Shichang Ding,Fan Zhang,Xiangyang Luo,Fenlin Liu 备注:Under Review 摘要:经典的基于细粒度测量的IP地理定位算法通常依赖于某些特定的线性延迟距离规则。这可能会在延迟-距离关系为非线性的实际网络环境中导致不可靠的地理定位结果。近年来,基于学习的IP地理定位算法开始受到研究者的关注。这些数据驱动算法利用多层感知器(MLP)对网络环境进行建模。它们不需要关于线性延迟距离规则的强预设假设,并且能够学习非线性关系。从理论上讲,它们应该提高IP地理定位在不同网络中的泛化能力。然而,网络本质上是用图来表示的。MLP并不适合对以图形式组织的信息进行建模。基于MLP的IP地理定位方法将目标IP地址视为孤立的数据实例,忽略目标之间的连接信息。这将导致次优表示并限制地理定位性能。图卷积网络(GCN)是一种新兴的用于图数据表示的深度学习方法。在这项工作中,我们研究如何用GCN为细粒度IP地理定位的计算机网络建模。首先,我们将IP地理定位任务形式化为一个属性图节点回归问题。然后,提出了一个基于GCN的IP地理定位框架GCN-Geo来预测每个IP地址的位置。最后,在三个真实世界数据集(纽约州、香港和上海)上的实验结果表明,所提出的GCN-Geo框架在平均误差距离、中位数误差距离和最大误差距离上明显优于最先进的基于规则和基于学习的基线。这验证了GCN在细粒度IP地理定位中的潜力。 摘要:Classical fine-grained measurement-based IP geolocation algorithms often rely on some specific linear delay-distance rules. This could cause unreliable geolocation results in actual network environments where the delay-distance relationship is non-linear. Recently, researchers begin to pay attention to learning-based IP geolocation algorithms. These data-driven algorithms leverage multi-layer perceptron (MLP) to model the network environments. They do not need strong pre-assumptions about the linear delay-distance rule and are capable to learn non-linear relationships. In theory, they should improve the generalization ability of IP geolocation in different networks. However, networks are fundamentally represented as graphs. MLP is not well suited to model information structured as graphs. MLP-based IP geolocation methods treat target IP addresses as isolated data instances and ignore the connection information between targets. This would lead to suboptimal representations and limit the geolocation performance. Graph convolutional network (GCN) is an emerging deep learning method for graph data presentation. In this work, we research how to model computer networks for fine-grained IP geolocation with GCN. First, we formulate the IP geolocation task as an attributed graph node regression problem. Then, a GCN-based IP geolocation framework named GCN-Geo is proposed to predict the location of each IP address. Finally, the experimental results in three real-world datasets (New York State, Hong Kong, and Shanghai) show that the proposed GCN-Geo framework clearly outperforms the state-of-art rule-based and learning-based baselines on average error distance, median error distance and max error distance. This verifies the potential of GCN in fine-grained IP geolocation.
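
下面给出“把 IP 地理定位形式化为属性图上的节点回归”这一思路的极简示意(并非 GCN-Geo 的原始实现):用对称归一化邻接矩阵做两层图卷积,再回归经纬度;数据与维度均为随意假设。

```python
import torch
import torch.nn as nn

def norm_adj(adj):
    """对称归一化 A_hat = D^{-1/2} (A + I) D^{-1/2}"""
    a = adj + torch.eye(adj.size(0))
    d = a.sum(1).pow(-0.5)
    return d.unsqueeze(1) * a * d.unsqueeze(0)

class GCNGeoSketch(nn.Module):
    def __init__(self, in_dim, hid_dim):
        super().__init__()
        self.w1 = nn.Linear(in_dim, hid_dim)
        self.w2 = nn.Linear(hid_dim, 2)          # 输出 (纬度, 经度)

    def forward(self, x, a_hat):
        h = torch.relu(a_hat @ self.w1(x))       # 第一层图卷积
        return a_hat @ self.w2(h)                # 第二层,直接回归坐标

# 用法示意:20 个 IP 节点,特征为时延等测量量(此处随机生成)
x = torch.randn(20, 8)
adj = (torch.rand(20, 20) > 0.8).float()
adj = ((adj + adj.t()) > 0).float()              # 对称化
pred = GCNGeoSketch(8, 16)(x, norm_adj(adj))
print(pred.shape)   # torch.Size([20, 2])
```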

Transformer(1篇)

【1】 Voice Quality and Pitch Features in Transformer-Based Speech Recognition 标题:基于Transformer的语音识别中的音质和基音特征 链接:https://arxiv.org/abs/2112.11391

作者:Guillermo Cámbara,Jordi Luque,Mireia Farrús 机构:TALN Research Group, Universitat Pompeu Fabra, Barcelona, Spain, Telefónica I+D, Research, Barcelona, Spain, Language and Computation Centre, Universitat de Barcelona, Spain 备注:5 pages, 3 figures, submitted to Speech Prosody 2022 conference 摘要:基频微扰(jitter)和振幅微扰(shimmer)的测量已被证明是音质和韵律信息的载体,可提高说话人识别、说话人分离(diarization)或自动语音识别(ASR)等任务的性能。然而,此类特征很少用于基于神经网络的ASR,因为在这种情况下,频谱特征通常占主导地位。在这项工作中,我们研究了将音质和基音特征一同及分别加入基于Transformer的ASR模型中的效果,直觉是注意力机制可能会利用潜在的韵律特征。为此,我们提出了韵律和频谱特征各自独立的卷积前端,表明这种架构选择比将此类基音和音质特征简单地与mel频谱图滤波器组拼接产生更好的结果。此外,在LibriSpeech基准上,我们发现平均词错误率最多可相对降低5.6%。这些发现推动了对韵律知识应用的进一步研究,以提高基于Transformer的ASR的鲁棒性。 摘要:Jitter and shimmer measurements have shown to be carriers of voice quality and prosodic information which enhance the performance of tasks like speaker recognition, diarization or automatic speech recognition (ASR). However, such features have been seldom used in the context of neural-based ASR, where spectral features often prevail. In this work, we study the effects of incorporating voice quality and pitch features altogether and separately to a Transformer-based ASR model, with the intuition that the attention mechanisms might exploit latent prosodic traits. For doing so, we propose separated convolutional front-ends for prosodic and spectral features, showing that this architectural choice yields better results than simple concatenation of such pitch and voice quality features to mel-spectrogram filterbanks. Furthermore, we find mean Word Error Rate relative reductions of up to 5.6% with the LibriSpeech benchmark. Such findings motivate further research on the application of prosody knowledge for increasing the robustness of Transformer-based ASR.
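
下面用一个极简草图说明“韵律特征与频谱特征各走各的卷积前端、再拼接送入后续编码器”这一结构选择(并非论文原始网络),通道数与卷积核大小均为假设。

```python
import torch
import torch.nn as nn

class DualFrontEnd(nn.Module):
    """频谱(mel 滤波器组)与韵律(基频/jitter/shimmer)各自的一维卷积前端。"""
    def __init__(self, n_mels=80, n_prosody=3, d_model=144):
        super().__init__()
        self.spec = nn.Conv1d(n_mels, 128, kernel_size=3, padding=1)
        self.pros = nn.Conv1d(n_prosody, 16, kernel_size=3, padding=1)
        self.proj = nn.Linear(128 + 16, d_model)   # 之后可送入 Transformer 编码器

    def forward(self, mel, prosody):
        # mel: [B, T, n_mels], prosody: [B, T, n_prosody](逐帧 F0、jitter、shimmer)
        s = torch.relu(self.spec(mel.transpose(1, 2)))
        p = torch.relu(self.pros(prosody.transpose(1, 2)))
        h = torch.cat([s, p], dim=1).transpose(1, 2)   # [B, T, 144]
        return self.proj(h)

frontend = DualFrontEnd()
out = frontend(torch.randn(2, 100, 80), torch.randn(2, 100, 3))
print(out.shape)   # torch.Size([2, 100, 144])
```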

GAN|对抗|攻击|生成相关(6篇)

【1】 On the Adversarial Robustness of Causal Algorithmic Recourse 标题:论因果算法追索权的对抗稳健性 链接:https://arxiv.org/abs/2112.11313

作者:Ricardo Dominguez-Olmedo,Amir-Hossein Karimi,Bernhard Schölkopf 机构:Max-Planck-Institute for Intelligent Systems, Tübingen, Germany, University of Tübingen, Germany, ETH Zürich, Switzerland 备注:NeurIPS 2021 WHY-21 Workshop (Oral) 摘要:算法追索(algorithmic recourse)旨在为个人提供可操作的建议,以扭转自动决策系统给出的不利结果。理想情况下,追索建议应当对寻求追索的个人特征中合理的小幅不确定性保持鲁棒。在这项工作中,我们形式化了对抗鲁棒追索问题,并表明提供成本最低追索方案的方法并不鲁棒。然后,我们提出了在线性和可微情况下生成对抗鲁棒追索方案的方法。为了确保追索方案的鲁棒性,个人需要付出比原本更多的努力。为了将鲁棒性负担的一部分从决策主体转移到决策者身上,我们提出了一种模型正则化器,鼓励降低寻求鲁棒追索方案的额外成本。我们表明,使用我们提出的模型正则化器(其惩罚依赖不可操作特征进行预测的行为)训练的分类器,能够提供所需努力可能更少的追索方案。 摘要:Algorithmic recourse seeks to provide actionable recommendations for individuals to overcome unfavorable outcomes made by automated decision-making systems. Recourse recommendations should ideally be robust to reasonably small uncertainty in the features of the individual seeking recourse. In this work, we formulate the adversarially robust recourse problem and show that recourse methods offering minimally costly recourse fail to be robust. We then present methods for generating adversarially robust recourse in the linear and in the differentiable case. To ensure that recourse is robust, individuals are asked to make more effort than they would have otherwise had to. In order to shift part of the burden of robustness from the decision-subject to the decision-maker, we propose a model regularizer that encourages the additional cost of seeking robust recourse to be low. We show that classifiers trained with our proposed model regularizer, which penalizes relying on unactionable features for prediction, offer potentially less effortful recourse.

【2】 Diagnostic Assessment Generation via Combinatorial Search 标题:基于组合搜索的诊断评估生成 链接:https://arxiv.org/abs/2112.11188

作者:Daehan Kim,Hyeonseong Choi,Guik Jung 备注:12 pages, 3 figures 摘要:初始评估测试对于以一致的方式获取学习者的知识状态至关重要。除了构思问题本身,将相关问题组合成问题单也是一个耗时的过程。在这项工作中,我们提出了一个问题集合的通用公式和一个基于遗传算法的方法,可以从原始问题解决历史生成评估测试。首先,我们估计学习者问题知识矩阵(snapshot)。每个矩阵元素代表学习者正确回答特定问题的概率。我们将任务表述为对该快照的组合搜索。为了确保诊断测试具有代表性和区别性,选择的问题(1)相对于整个问题库的均方根误差较低,(2)学习者表现的标准偏差较高。实验结果表明,在一个私有数据集和四个公共数据集上,该方法的性能大大优于贪婪和随机基线。我们还对生成的九年级学生评估测试进行了定性分析,该测试在整个九年级课程中具有良好的问题分散性,难度水平分布良好。 摘要:Initial assessment tests are crucial in capturing learner knowledge states in a consistent manner. Aside from crafting questions itself, putting together relevant problems to form a question sheet is also a time-consuming process. In this work, we present a generic formulation of question assembly and a genetic algorithm based method that can generate assessment tests from raw problem-solving history. First, we estimate the learner-question knowledge matrix (snapshot). Each matrix element stands for the probability that a learner correctly answers a specific question. We formulate the task as a combinatorial search over this snapshot. To ensure representative and discriminative diagnostic tests, questions are selected (1) that has a low root mean squared error against the whole question pool and (2) high standard deviation among learner performances. Experimental results show that the proposed method outperforms greedy and random baseline by a large margin in one private dataset and four public datasets. We also performed qualitative analysis on the generated assessment test for 9th graders, which enjoys good problem scatterness across the whole 9th grader curriculum and decent difficulty level distribution.
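
以下是该组合搜索思路的一个最小化示意(并非论文实现):在“学习者×题目”的答对概率快照上,适应度同时考虑所选题目子集相对全题库的均方根误差(越低越好)与学习者得分的标准差(越高越好);数据、权重与遗传算法细节均为假设。

```python
import numpy as np

rng = np.random.default_rng(0)
snapshot = rng.random((200, 500))      # 学习者 x 题目:答对概率快照(示意数据)
K = 20                                  # 试卷题目数

def fitness(idx):
    sub = snapshot[:, idx].mean(1)                 # 所选题目下每个学习者的预期得分
    full = snapshot.mean(1)                        # 全题库下的预期得分
    rmse = np.sqrt(((sub - full) ** 2).mean())     # 代表性:与全题库越接近越好
    spread = sub.std()                             # 区分度:学习者之间差异越大越好
    return spread - rmse

def genetic_search(pop=40, gens=100, mut=0.2):
    population = [rng.choice(snapshot.shape[1], K, replace=False) for _ in range(pop)]
    for _ in range(gens):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop // 2]
        children = []
        for p in parents:
            child = p.copy()
            if rng.random() < mut:                 # 变异:随机换掉一道题
                child[rng.integers(K)] = rng.integers(snapshot.shape[1])
            uniq = np.unique(child)
            children.append(uniq[:K] if len(uniq) >= K else p)
        population = parents + children
    return max(population, key=fitness)

best = genetic_search()
print(sorted(best.tolist()))
```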

【3】 Adversarial Gradient Driven Exploration for Deep Click-Through Rate Prediction 标题:对抗性梯度驱动的深度点击率预测探索 链接:https://arxiv.org/abs/2112.11136

作者:Kailun Wu,Weijie Bian,Zhangming Chan,Lejian Ren,Shiming Xiang,Shuguang Han,Hongbo Deng,Bo Zheng 机构:Alibaba Group, People's Republic of China, Institute of Automation, Chinese Academy of Sciences 摘要:如今,数据驱动的深层神经模型在点击率(CTR)预测方面已经取得了显著的进展。不幸的是,当数据不足时,这些模型的有效性可能会失效。为了解决这个问题,研究人员通常采用探索策略,根据估计的奖励来考察物品,例如UCB或Thompson抽样。在CTR预测的利用与探索(Exploitation-and-Exploration)背景下,最近的研究试图将预测不确定性与模型预测一起用作奖励分数。然而,我们认为这种方法可能会使最终排序分数偏离原始分布,从而影响在线系统中的模型性能。在本文中,我们提出了一种新的探索方法,称为对抗梯度驱动探索(Adversarial Gradient Driven Exploration,AGE)。具体来说,我们提出了一个伪探索模块来模拟梯度更新过程,该模块可以近似估计待探索物品样本对模型的影响。此外,为了提高探索效率,我们提出了一种动态阈值单元来消除低潜在CTR样本的影响。我们的方法的有效性在一个开放获取的学术数据集上得到了证明。与此同时,AGE还被部署在真实世界的展示广告平台上,所有在线指标都得到了显著改进。 摘要:Nowadays, data-driven deep neural models have already shown remarkable progress on Click-through Rate (CTR) prediction. Unfortunately, the effectiveness of such models may fail when there are insufficient data. To handle this issue, researchers often adopt exploration strategies to examine items based on the estimated reward, e.g., UCB or Thompson Sampling. In the context of Exploitation-and-Exploration for CTR prediction, recent studies have attempted to utilize the prediction uncertainty along with model prediction as the reward score. However, we argue that such an approach may make the final ranking score deviate from the original distribution, and thereby affect model performance in the online system. In this paper, we propose a novel exploration method called \textbf{A}dversarial \textbf{G}radient Driven \textbf{E}xploration (AGE). Specifically, we propose a Pseudo-Exploration Module to simulate the gradient updating process, which can approximate the influence of the samples of to-be-explored items for the model. In addition, for better exploration efficiency, we propose a Dynamic Threshold Unit to eliminate the effects of those samples with low potential CTR. The effectiveness of our approach was demonstrated on an open-access academic dataset. Meanwhile, AGE has also been deployed in a real-world display advertising platform and all online metrics have been significantly improved.
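
下面是“伪探索模块”核心想法的一个粗略示意(并非原文实现):假设把某个待探索样本当作被点击的正样本做一步虚拟梯度更新,再比较更新前后的预测差异,以此近似该样本对模型的影响;学习率与模型结构均为假设。

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 1))

def pseudo_explore_score(x, lr=0.1):
    """对单个候选物品 x 做一步假想的梯度更新,返回更新前后 CTR 预测的变化量。"""
    x = x.unsqueeze(0)
    base = torch.sigmoid(model(x))
    loss = F.binary_cross_entropy_with_logits(model(x), torch.ones(1, 1))  # 假设其被点击
    grads = torch.autograd.grad(loss, model.parameters())
    # 虚拟更新:不真正修改模型参数,只用 w - lr*g 重新前向一次
    params = [p - lr * g for p, g in zip(model.parameters(), grads)]
    h = torch.relu(F.linear(x, params[0], params[1]))
    new = torch.sigmoid(F.linear(h, params[2], params[3]))
    return (new - base).item()   # 变化越大,说明探索该样本带来的信息量越大

print(pseudo_explore_score(torch.randn(16)))
```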

【4】 Enabling NAS with Automated Super-Network Generation 标题:通过自动超级网络生成实现NAS 链接:https://arxiv.org/abs/2112.10878

作者:J. Pablo Muñoz,Nikolay Lyalyushkin,Yash Akhauri,Anastasia Senina,Alexander Kozlov,Nilesh Jain 机构:Intel Labs,Intel Corporation 备注:Accepted at AAAI2022 - Practical Deep Learning in the Wild 摘要:最近的神经体系结构搜索(NAS)解决方案在训练超级网络,然后派生子网络方面取得了令人印象深刻的结果,即从预定义的搜索空间中生成的子模型,其性能优于专家构建的模型。可以为资源受限的边缘设备选择高效、健壮的子网络,使其在野外运行良好。然而,为任意体系结构构建超级网络仍然是一个挑战,常常阻碍这些方法的采用。为了应对这一挑战,我们提出了BootstrapNAS,这是一个用于自动生成NAS超级网络的软件框架。BootstrapNAS从流行的体系结构(如ResNet-50)或有效的定制设计中获取预先训练的模型,并自动从中创建超级网络,然后使用最先进的NAS技术训练超级网络,从而生成大大优于给定预先训练模型的子网络。我们通过从任意模型库生成超级网络来演示该解决方案,并使生成的超级网络可用于结果的再现性。 摘要:Recent Neural Architecture Search (NAS) solutions have produced impressive results training super-networks and then deriving subnetworks, a.k.a. child models that outperform expert-crafted models from a pre-defined search space. Efficient and robust subnetworks can be selected for resource-constrained edge devices, allowing them to perform well in the wild. However, constructing super-networks for arbitrary architectures is still a challenge that often prevents the adoption of these approaches. To address this challenge, we present BootstrapNAS, a software framework for automatic generation of super-networks for NAS. BootstrapNAS takes a pre-trained model from a popular architecture, e.g., ResNet- 50, or from a valid custom design, and automatically creates a super-network out of it, then uses state-of-the-art NAS techniques to train the super-network, resulting in subnetworks that significantly outperform the given pre-trained model. We demonstrate the solution by generating super-networks from arbitrary model repositories and make available the resulting super-networks for reproducibility of the results.

【5】 TFDPM: Attack detection for cyber-physical systems with diffusion probabilistic models 标题:TFDPM:基于扩散概率模型的网络物理系统攻击检测 链接:https://arxiv.org/abs/2112.10774

作者:Tijin Yan,Tong Zhou,Yufeng Zhan,Yuanqing Xia 机构:Key Laboratory of Intelligent Control Decision of Complex Systems, School of Automation, Beijing Institute of Technology, Beijing , P. R. China. 备注:27 pages, 11 figures 摘要:随着AIoT的发展,基于数据驱动的网络物理系统(CPSs)攻击检测方法受到了广泛关注。然而,现有的方法通常采用易于处理的分布来近似数据分布,这不适用于复杂系统。此外,不同通道中数据的相关性没有引起足够的重视。为了解决这些问题,我们使用基于能量的生成模型,它对数据分布的函数形式限制较少。此外,图形神经网络用于显式建模不同通道中数据的相关性。最后,我们提出了一个通用的攻击检测框架TFDPM。在给定历史数据的情况下,同时提取时间模式和特征模式。然后将提取的特征发送到条件扩散概率模型。利用条件生成网络可以得到预测值,并根据预测值和观测值之间的差异检测攻击。此外,为了实现实时检测,提出了一种条件噪声调度网络来加速预测过程。实验结果表明,TFDPM的性能优于现有的最先进的攻击检测方法。噪声调度网络将检测速度提高了三倍。 摘要:With the development of AIoT, data-driven attack detection methods for cyber-physical systems (CPSs) have attracted lots of attention. However, existing methods usually adopt tractable distributions to approximate data distributions, which are not suitable for complex systems. Besides, the correlation of the data in different channels does not attract sufficient attention. To address these issues, we use energy-based generative models, which are less restrictive on functional forms of the data distribution. In addition, graph neural networks are used to explicitly model the correlation of the data in different channels. In the end, we propose TFDPM, a general framework for attack detection tasks in CPSs. It simultaneously extracts temporal pattern and feature pattern given the historical data. Then extract features are sent to a conditional diffusion probabilistic model. Predicted values can be obtained with the conditional generative network and attacks are detected based on the difference between predicted values and observed values. In addition, to realize real-time detection, a conditional noise scheduling network is proposed to accelerate the prediction process. Experimental results show that TFDPM outperforms existing state-of-the-art attack detection methods. The noise scheduling network increases the detection speed by three times.

【6】 Covert Communications via Adversarial Machine Learning and Reconfigurable Intelligent Surfaces 标题:基于对抗性机器学习和可重构智能表面的隐蔽通信 链接:https://arxiv.org/abs/2112.11414

作者:Brian Kim,Tugba Erpek,Yalin E. Sagduyu,Sennur Ulukus 机构:Department of Electrical and Computer Engineering, University of Maryland, College Park, MD , USA, Intelligent Automation, Inc., Rockville, MD , USA 摘要:通过为软件定义的无线系统从大型天线移动到天线表面,可重构智能表面(RIS)依靠单元单元阵列来控制信号的散射和反射剖面,减轻传播损耗和多径衰减,从而提高覆盖范围和频谱效率。本文考虑了RIS存在时的隐蔽通信。虽然RIS促进了正在进行的传输,但预期的接收者和窃听者都各自尝试使用自己的深层神经网络(DNN)分类器检测该传输。RIS交互向量是通过平衡两个(潜在冲突的)目标来设计的,这两个目标是将发送信号聚焦到接收器,并使发送信号远离窃听者。为了加强隐蔽通信,在发射机的信号中加入对抗性干扰,以欺骗窃听者的分类器,同时保持对接收机的低影响。来自不同网络拓扑的结果表明,对抗性干扰和RIS交互向量可以联合设计,以有效地提高接收器处的信号检测精度,同时降低窃听者处的检测精度,从而实现隐蔽通信。 摘要:By moving from massive antennas to antenna surfaces for software-defined wireless systems, the reconfigurable intelligent surfaces (RISs) rely on arrays of unit cells to control the scattering and reflection profiles of signals, mitigating the propagation loss and multipath attenuation, and thereby improving the coverage and spectral efficiency. In this paper, covert communication is considered in the presence of the RIS. While there is an ongoing transmission boosted by the RIS, both the intended receiver and an eavesdropper individually try to detect this transmission using their own deep neural network (DNN) classifiers. The RIS interaction vector is designed by balancing two (potentially conflicting) objectives of focusing the transmitted signal to the receiver and keeping the transmitted signal away from the eavesdropper. To boost covert communications, adversarial perturbations are added to signals at the transmitter to fool the eavesdropper's classifier while keeping the effect on the receiver low. Results from different network topologies show that adversarial perturbation and RIS interaction vector can be jointly designed to effectively increase the signal detection accuracy at the receiver while reducing the detection accuracy at the eavesdropper to enable covert communications.
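
下面给出“在发射信号上叠加对抗扰动以欺骗窃听者分类器、同时尽量少影响接收者”这一思路的 FGSM 式示意草图(并非论文算法本身);两个分类器、损失权重与扰动强度均为假设。

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

rx_clf = nn.Sequential(nn.Linear(64, 2))   # 接收者的检测器(示意)
ev_clf = nn.Sequential(nn.Linear(64, 2))   # 窃听者的检测器(示意)

def covert_perturbation(sig, eps=0.05, alpha=0.5):
    """对发射信号 sig 生成小扰动:提高窃听者的检测损失,同时压低对接收者的影响。"""
    sig = sig.clone().detach().requires_grad_(True)
    present = torch.ones(sig.size(0), dtype=torch.long)     # 标签:存在传输
    loss = F.cross_entropy(ev_clf(sig), present) \
           - alpha * F.cross_entropy(rx_clf(sig), present)  # 骗窃听者,保接收者
    loss.backward()
    return (sig + eps * sig.grad.sign()).detach()            # 沿梯度符号方向走一步

x = torch.randn(8, 64)          # 8 个信号样本的实数化特征(示意)
x_adv = covert_perturbation(x)
print((x_adv - x).abs().max())  # 扰动幅度受 eps 限制
```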

半/弱/无/有监督|不确定性|主动学习(4篇)

【1】 Unsupervised deep learning techniques for powdery mildew recognition based on multispectral imaging 标题:基于多光谱成像的无监督深度学习白粉病识别技术 链接:https://arxiv.org/abs/2112.11242

作者:Alessandro Benfenati,Paola Causin,Roberto Oberti,Giovanni Stefanello 机构: Dept. of Environmental Science and Policy, Università degli Studi di Milano, Milano, Dept. of Mathematics, Università degli Studi di Milano, Milano, Italy, Dept. of Agricultural and Environmental Sciences - Production, Landscape 摘要:目标。植物病害的可持续管理是一项具有相关经济和环境影响的公开挑战。最佳策略依赖于人类专业知识,在有利条件下进行实地侦察,以评估疾病症状的当前存在和程度。这项劳动密集型任务由于要侦察的大面积区域以及要检测的早期症状的毫米级大小而变得复杂。有鉴于此,基于图像的早期疾病症状检测是自动化该过程的一种有吸引力的方法,能够以可持续的成本实现潜在的高通量监测。方法。深度学习已成功地应用于各个领域,通过训练过程学习滤波器,自动选择相关图像特征。深度学习最近也进入了植物病害检测领域:基于这一思想,我们提出了一种自动识别黄瓜叶片白粉病的深度学习方法。我们专注于应用于多光谱成像数据的无监督深度学习技术,并建议使用自动编码器架构来研究两种疾病检测策略:i)压缩空间中的特征聚类;ii)异常检测。结果。这两种方法已通过定量指标进行了评估。聚类方法本身不足以给出准确的预测,但确实能提供相关的有用信息。相反,异常检测表现出可观的分辨潜力,可进一步将其作为先验,用于标注样本数量极其有限的有监督架构。 摘要:Objectives. Sustainable management of plant diseases is an open challenge which has relevant economic and environmental impact. Optimal strategies rely on human expertise for field scouting under favourable conditions to assess the current presence and extent of disease symptoms. This labor-intensive task is complicated by the large field area to be scouted, combined with the millimeter-scale size of the early symptoms to be detected. In view of this, image-based detection of early disease symptoms is an attractive approach to automate this process, enabling a potential high throughput monitoring at sustainable costs. Methods. Deep learning has been successfully applied in various domains to obtain an automatic selection of the relevant image features by learning filters via a training procedure. Deep learning has recently entered also the domain of plant disease detection: following this idea, in this work we present a deep learning approach to automatically recognize powdery mildew on cucumber leaves. We focus on unsupervised deep learning techniques applied to multispectral imaging data and we propose the use of autoencoder architectures to investigate two strategies for disease detection: i) clusterization of features in a compressed space; ii) anomaly detection. Results. The two proposed approaches have been assessed by quantitative indices. The clusterization approach is not fully capable by itself to provide accurate predictions but it does cater relevant information. Anomaly detection has instead a significant potential of resolution which could be further exploited as a prior for supervised architectures with a very limited number of labeled samples.
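
下面用一个极简的卷积自编码器草图说明第二种策略(异常检测)的思路(并非论文原始网络):只用健康叶片的多光谱图块训练重构,推断时以重构误差超过阈值判为疑似病斑;波段数、图块大小与阈值取法均为假设。

```python
import torch
import torch.nn as nn

class PatchAE(nn.Module):
    """多光谱图块(示意:5 个波段、16x16)的卷积自编码器。"""
    def __init__(self, bands=5):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv2d(bands, 16, 3, stride=2, padding=1), nn.ReLU(),
                                 nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU())
        self.dec = nn.Sequential(nn.ConvTranspose2d(32, 16, 2, stride=2), nn.ReLU(),
                                 nn.ConvTranspose2d(16, bands, 2, stride=2))
    def forward(self, x):
        return self.dec(self.enc(x))

ae = PatchAE()
opt = torch.optim.Adam(ae.parameters(), lr=1e-3)
healthy = torch.rand(64, 5, 16, 16)                      # 仅用“健康”图块训练(示意数据)
for _ in range(5):
    opt.zero_grad()
    loss = ((ae(healthy) - healthy) ** 2).mean()          # 重构损失
    loss.backward()
    opt.step()

test = torch.rand(10, 5, 16, 16)
err = ((ae(test) - test) ** 2).mean(dim=(1, 2, 3))        # 每个图块的重构误差
thresh = err.mean() + 2 * err.std()                       # 简单的经验阈值
print((err > thresh).nonzero().flatten())                 # 疑似含早期病斑的图块索引
```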

【2】 Geometry-Aware Unsupervised Domain Adaptation 标题:几何感知的无监督域自适应 链接:https://arxiv.org/abs/2112.11041

作者:You-Wei Luo,Chuan-Xian Ren,Zi-Ying Chen 机构:School of Mathematics, Sun Yat-Sen University 摘要:无监督域自适应(Unsupervised Domain Adaptation,UDA)的目的是在存在数据集偏移的情况下,将知识从有标注的源域迁移到无标注的目标域。大多数现有方法无法很好地解决域对齐和类区分问题,这可能会扭曲下游任务(例如分类)的固有数据结构。为此,我们提出了一种新的几何感知模型,通过核范数优化同时学习可迁移性和可判别性。我们从子空间几何的角度为UDA引入了域相干性和类正交性。域相干性将确保模型具有更大的学习可分离表示的能力,而类正交性将最小化簇之间的相关性,以缓解错配。因此,它们是一致的,可以相互受益。此外,我们对UDA中基于范数的学习文献提供了理论见解,这确保了我们模型的可解释性。我们发现,域和簇的范数预期将分别变大和变小,以分别增强可迁移性和可判别性。在标准UDA数据集上的大量实验结果证明了我们的理论和模型的有效性。 摘要:Unsupervised Domain Adaptation (UDA) aims to transfer the knowledge from the labeled source domain to the unlabeled target domain in the presence of dataset shift. Most existing methods cannot address the domain alignment and class discrimination well, which may distort the intrinsic data structure for downstream tasks (e.g., classification). To this end, we propose a novel geometry-aware model to learn the transferability and discriminability simultaneously via nuclear norm optimization. We introduce the domain coherence and class orthogonality for UDA from the perspective of subspace geometry. The domain coherence will ensure the model has a larger capacity for learning separable representations, and class orthogonality will minimize the correlation between clusters to alleviate the misalignment. So, they are consistent and can benefit from each other. Besides, we provide a theoretical insight into the norm-based learning literature in UDA, which ensures the interpretability of our model. We show that the norms of domains and clusters are expected to be larger and smaller to enhance the transferability and discriminability, respectively. Extensive experimental results on standard UDA datasets demonstrate the effectiveness of our theory and model.

【3】 Mining Drifting Data Streams on a Budget: Combining Active Learning with Self-Labeling 标题:基于预算的漂移数据流挖掘:主动学习与自我标注相结合 链接:https://arxiv.org/abs/2112.11019

作者:Łukasz Korycki,Bartosz Krawczyk 机构:Virginia Commonwealth University 摘要:挖掘数据流带来了许多挑战,包括数据的连续性和非平稳性、要处理的大量信息以及对计算资源的限制。虽然文献中针对这一问题提出了许多有监督的解决方案,但大多数都假设可以不受限制地获得真实标签(ground truth),并且在更新学习系统时可以立即利用这些信息。这是不现实的,因为必须考虑获取标签的潜在成本。因此,需要能够降低流式场景中对真实标签需求的解决方案。在本文中,我们提出了一个在有限标注预算下挖掘漂移数据流的新框架,它结合了来自主动学习和自标注的信息。我们介绍了几种策略,这些策略可以同时利用智能实例选择和半监督过程,并考虑到潜在的概念漂移。这种混合方法允许在实际的标注预算内有效地探索和利用流数据结构。由于我们的框架是作为包装器(wrapper)工作的,所以它可以应用于不同的学习算法。在一组具有各种类型概念漂移的真实数据流上进行的实验研究,证明了所提策略在对类标签访问高度受限时的有效性。当不能增加标注预算或替换效率低下的分类器时,所提出的混合方法尤其可行。我们就我们策略的适用领域提出了一系列建议。 摘要:Mining data streams poses a number of challenges, including the continuous and non-stationary nature of data, the massive volume of information to be processed and constraints put on the computational resources. While there is a number of supervised solutions proposed for this problem in the literature, most of them assume that access to the ground truth (in form of class labels) is unlimited and such information can be instantly utilized when updating the learning system. This is far from being realistic, as one must consider the underlying cost of acquiring labels. Therefore, solutions that can reduce the requirements for ground truth in streaming scenarios are required. In this paper, we propose a novel framework for mining drifting data streams on a budget, by combining information coming from active learning and self-labeling. We introduce several strategies that can take advantage of both intelligent instance selection and semi-supervised procedures, while taking into account the potential presence of concept drift. Such a hybrid approach allows for efficient exploration and exploitation of streaming data structures within realistic labeling budgets. Since our framework works as a wrapper, it may be applied with different learning algorithms. Experimental study, carried out on a diverse set of real-world data streams with various types of concept drift, proves the usefulness of the proposed strategies when dealing with highly limited access to class labels. The presented hybrid approach is especially feasible when one cannot increase a budget for labeling or replace an inefficient classifier. We deliver a set of recommendations regarding areas of applicability for our strategies.
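
下面给出“主动学习 + 自标注”混合策略的一个骨架示意(并非论文中的具体策略):对每个到达的流样本,置信度低且预算未用尽时才请求真实标签,置信度足够高时用伪标签自我更新;阈值、预算与基分类器均为假设。

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

clf = SGDClassifier(loss="log_loss")
classes = np.array([0, 1])
budget, spent = 0.10, 0          # 最多为 10% 的样本请求真实标签

rng = np.random.default_rng(1)
stream_x = rng.normal(size=(2000, 10))
stream_y = (stream_x[:, 0] > 0).astype(int)     # 示意数据的真实标签(实际中按需查询)

clf.partial_fit(stream_x[:20], stream_y[:20], classes=classes)   # 少量初始标注
for t in range(20, len(stream_x)):
    x = stream_x[t : t + 1]
    proba = clf.predict_proba(x)[0]
    conf = proba.max()
    if conf < 0.6 and spent < budget * t:        # 低置信度:主动请求真实标签
        clf.partial_fit(x, stream_y[t : t + 1])
        spent += 1
    elif conf > 0.9:                             # 高置信度:用自标注(伪标签)更新
        clf.partial_fit(x, np.array([proba.argmax()]))

print("请求真实标签的比例:", spent / len(stream_x))
```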

【4】 Augmented Contrastive Self-Supervised Learning for Audio Invariant Representations 标题:音频不变表示的增广对比自监督学习 链接:https://arxiv.org/abs/2112.10950

作者:Melikasadat Emami,Dung Tran,Kazuhito Koishida 机构:† ECE Department, University of California, Los Angeles, Los Angeles, CA, ♦ Applied Sciences Group, Microsoft, Redmond, WA 备注:4 pages, 4 figures 摘要:由于标记数据的稀缺性,提高泛化能力是音频分类的一个主要挑战。自我监督学习(SSL)方法通过利用未标记的数据学习下游分类任务的有用特征来解决这一问题。在这项工作中,我们提出了一个增强的对比SSL框架来学习未标记数据的不变表示。我们的方法将各种扰动应用于未标记的输入数据,并利用对比学习来学习对此类扰动具有鲁棒性的表示。在Audioset和DESED数据集上的实验结果表明,我们的框架在声音/事件分类任务上显著优于最先进的SSL和监督学习方法。 摘要:Improving generalization is a major challenge in audio classification due to labeled data scarcity. Self-supervised learning (SSL) methods tackle this by leveraging unlabeled data to learn useful features for downstream classification tasks. In this work, we propose an augmented contrastive SSL framework to learn invariant representations from unlabeled data. Our method applies various perturbations to the unlabeled input data and utilizes contrastive learning to learn representations robust to such perturbations. Experimental results on the Audioset and DESED datasets show that our framework significantly outperforms state-of-the-art SSL and supervised learning methods on sound/event classification tasks.

迁移|Zero/Few/One-Shot|自适应(3篇)

【1】 Tackling System and Statistical Heterogeneity for Federated Learning with Adaptive Client Sampling 标题:利用自适应客户端抽样解决联合学习的系统和统计异构性 链接:https://arxiv.org/abs/2112.11256

作者:Bing Luo,Wenli Xiao,Shiqiang Wang,Jianwei Huang,Leandros Tassiulas 机构:∗Shenzhen Institute of Artificial Intelligence and Robotics for Society, China, †School of Science and Engineering, The Chinese University of Hong Kong, Shenzhen, China, ‡IBM T. J. Watson Research Center, Yorktown Heights, NY, USA 备注:Accepted in IEEE INFOCOM 2022 摘要:当参与者数量较多且服务器的通信带宽有限时,联邦学习(FL)算法通常会在每一轮(部分参与)中对一小部分客户机进行采样。最近关于FL收敛性分析的工作集中于无偏客户抽样,例如,均匀随机抽样,由于高度的系统异质性和统计异质性,其收敛时间较慢。本文旨在设计一种自适应客户机采样算法,该算法同时处理系统和统计异构性,以最小化挂钟收敛时间。对于具有任意客户抽样概率的FL算法,我们得到了一个新的易于处理的收敛界。基于这个界,我们解析地建立了总学习时间和抽样概率之间的关系,从而得到了训练时间最小化的非凸优化问题。我们设计了一个有效的算法来学习收敛界中的未知参数,并开发了一个低复杂度的算法来近似求解非凸问题。硬件原型和仿真的实验结果表明,与几种基线采样方案相比,我们提出的采样方案显著缩短了收敛时间。值得注意的是,在硬件原型中,我们的方案比统一采样基线在达到相同目标损耗方面花费的时间少73%。 摘要:Federated learning (FL) algorithms usually sample a fraction of clients in each round (partial participation) when the number of participants is large and the server's communication bandwidth is limited. Recent works on the convergence analysis of FL have focused on unbiased client sampling, e.g., sampling uniformly at random, which suffers from slow wall-clock time for convergence due to high degrees of system heterogeneity and statistical heterogeneity. This paper aims to design an adaptive client sampling algorithm that tackles both system and statistical heterogeneity to minimize the wall-clock convergence time. We obtain a new tractable convergence bound for FL algorithms with arbitrary client sampling probabilities. Based on the bound, we analytically establish the relationship between the total learning time and sampling probabilities, which results in a non-convex optimization problem for training time minimization. We design an efficient algorithm for learning the unknown parameters in the convergence bound and develop a low-complexity algorithm to approximately solve the non-convex problem. Experimental results from both hardware prototype and simulation demonstrate that our proposed sampling scheme significantly reduces the convergence time compared to several baseline sampling schemes. Notably, our scheme in hardware prototype spends 73% less time than the uniform sampling baseline for reaching the same target loss.
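
下面是“任意客户端采样概率下仍保持无偏聚合”这一要点的数值示意(并非论文的自适应优化算法本身):按概率 q_i 抽取客户端,并以与 q_i 成反比的权重聚合其更新,使聚合结果在期望意义下与全员参与一致;概率取值纯属假设。

```python
import numpy as np

rng = np.random.default_rng(0)
N, K, d = 50, 10, 4                         # 客户端总数、每轮采样数、模型维度
updates = rng.normal(size=(N, d))           # 每个客户端的本地模型更新(示意)
q = rng.random(N); q /= q.sum()             # 任意(非均匀)采样概率

full = updates.mean(axis=0)                 # 全员参与时的平均更新(对照)

def one_round():
    idx = rng.choice(N, size=K, replace=True, p=q)
    # 重要性加权:被采到的客户端按 1/(K*N*q_i) 缩放,保证 E[聚合结果] = full
    return sum(updates[i] / (K * N * q[i]) for i in idx)

est = np.mean([one_round() for _ in range(20000)], axis=0)
print(np.abs(est - full).max())             # 多轮平均后应接近 full,验证无偏性
```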

【2】 Learned ISTA with Error-based Thresholding for Adaptive Sparse Coding 标题:基于差错阈值的自适应稀疏编码学习ISTA 链接:https://arxiv.org/abs/2112.10985

作者:Ziang Li,Kailun Wu,Yiwen Guo,Changshui Zhang 机构: and Changshui Zhang are with the Institute for Arti-ficial Intelligence, Tsinghua University (THUAI), Department of Automation, TsinghuaUniversity 摘要:学习的迭代收缩阈值算法(LISTA)在一些用于稀疏编码的收缩函数中引入了具有可学习阈值的深度展开模型。根据一些理论见解,我们为LISTA提出了一种基于错误的阈值(EBT)机制,该机制利用分层重建错误的函数为每一层上的每个观察建议一个合适的阈值。我们表明,EBT机制很好地将收缩函数中的可学习参数从重建误差中分离出来,使其更适合各种观测。通过严格的理论分析,我们发现所提出的EBT除了具有更高的自适应性外,还可以在LISTA及其变体的基础上实现更快的收敛。大量的实验结果证实了我们的理论分析,并验证了我们方法的有效性。 摘要:The learned iterative shrinkage thresholding algorithm (LISTA) introduces deep unfolding models with learnable thresholds in some shrinkage functions for sparse coding. Drawing on some theoretical insights, we advocate an error-based thresholding (EBT) mechanism for LISTA, which leverages a function of the layer-wise reconstruction error to suggest an appropriate threshold value for each observation on each layer. We show that the EBT mechanism well disentangles the learnable parameters in the shrinkage functions from the reconstruction errors, making them more adaptive to the various observations. With rigorous theoretical analyses, we show that the proposed EBT can lead to a faster convergence on the basis of LISTA and its variants, in addition to its higher adaptivity. Extensive experimental results confirm our theoretical analyses and verify the effectiveness of our methods.
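
下面用一个数值小例子说明“把阈值写成逐层重构误差的函数”这一 EBT 机制与固定阈值 ISTA 的区别(并非论文的 LISTA 网络实现);其中的系数与步长均为假设。

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, k = 32, 64, 5
A = rng.normal(size=(m, n)) / np.sqrt(m)
x_true = np.zeros(n)
x_true[rng.choice(n, k, replace=False)] = rng.normal(size=k)
y = A @ x_true

def soft(v, th):
    """软阈值算子"""
    return np.sign(v) * np.maximum(np.abs(v) - th, 0.0)

def ista(iters=100, step=0.15, c=0.05, ebt=True):
    x = np.zeros(n)
    for _ in range(iters):
        r = y - A @ x                                   # 当前重构误差
        th = c * np.linalg.norm(r) if ebt else 0.05     # EBT:阈值随误差自适应变化
        x = soft(x + step * A.T @ r, th)
    return x

for flag in (False, True):
    x_hat = ista(ebt=flag)
    print("EBT" if flag else "固定阈值", np.linalg.norm(x_hat - x_true))
```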

【3】 Generalized Few-Shot Semantic Segmentation: All You Need is Fine-Tuning 标题:广义Few-Shot语义分割:你所需要的只是微调 链接:https://arxiv.org/abs/2112.10982

作者:Josh Myers-Dean,Yinan Zhao,Brian Price,Scott Cohen,Danna Gurari 机构:University of Colorado, Boulder, +University of Texas at Austin, †Adobe Research 备注:Includes supplementary materials 摘要:广义Few-Shot语义分割的提出,使评估不再局限于模型在新类上的Few-Shot分割能力,还要测试其对基类的记忆能力。虽然目前所有的方法都基于元学习,但它们的表现很差,在只观察了少量样本之后,学习就饱和了。我们提出了第一个微调解决方案,并证明它解决了饱和问题,同时在两个数据集PASCAL-$5^i$和COCO-$20^i$上实现了最先进的结果。我们还表明,无论是微调最后几层还是仅微调最后一层,它都优于现有方法。最后,我们提出了一个三元组损失(triplet loss)正则化,它展示了如何在新类别和基类之间重新分配性能平衡,从而使它们之间的差距更小。 摘要:Generalized few-shot semantic segmentation was introduced to move beyond only evaluating few-shot segmentation models on novel classes to include testing their ability to remember base classes. While all approaches currently are based on meta-learning, they perform poorly and saturate in learning after observing only a few shots. We propose the first fine-tuning solution, and demonstrate that it addresses the saturation problem while achieving state-of-art results on two datasets, PASCAL-$5^i$ and COCO-$20^i$. We also show it outperforms existing methods whether fine-tuning multiple final layers or only the final layer. Finally, we present a triplet loss regularization that shows how to redistribute the balance of performance between novel and base categories so that there is a smaller gap between them.

强化学习(4篇)

【1】 Interpretable Preference-based Reinforcement Learning with Tree-Structured Reward Functions 标题:基于树结构奖励函数的可解释偏好强化学习 链接:https://arxiv.org/abs/2112.11230

作者:Tom Bewley,Freddy Lecue 机构:University of Bristol, Bristol, United Kingdom, CortAIx, Thales, Montréal, Canada 备注:Accepted for publication at the 21st International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2022) 摘要:强化学习(RL)提供一致且性能良好的代理的潜力部分受到奖励工程问题的制约。启发式试错法的一种替代方法是基于偏好的RL(PbRL),其中奖励函数是从稀疏的人类反馈推断出来的。然而,以前的PbRL方法缺乏对学习到的奖励结构的可解释性,这妨碍了评估稳健性和一致性的能力。我们提出了一种在线的主动偏好学习算法,该算法利用树的内在可解释的组合结构构造奖励函数。通过使用合成反馈和人工反馈,我们展示了在几种环境中对树结构奖励函数的样本有效学习,然后利用增强的解释能力来探索和调试对齐。 摘要:The potential of reinforcement learning (RL) to deliver aligned and performant agents is partially bottlenecked by the reward engineering problem. One alternative to heuristic trial-and-error is preference-based RL (PbRL), where a reward function is inferred from sparse human feedback. However, prior PbRL methods lack interpretability of the learned reward structure, which hampers the ability to assess robustness and alignment. We propose an online, active preference learning algorithm that constructs reward functions with the intrinsically interpretable, compositional structure of a tree. Using both synthetic and human-provided feedback, we demonstrate sample-efficient learning of tree-structured reward functions in several environments, then harness the enhanced interpretability to explore and debug for alignment.

【2】 Model-Based Safe Reinforcement Learning with Time-Varying State and Control Constraints: An Application to Intelligent Vehicles 标题:具有时变状态和控制约束的基于模型的安全强化学习:在智能车辆中的应用 链接:https://arxiv.org/abs/2112.11217

作者:Xinglong Zhang,Yaoqian Peng,Biao Luo,Wei Pan,Xin Xu,Haibin Xie 机构: National University of Defense Technology, Biao Luo is with the School of Automation, Central South University, Wei Pan is with the Department of Cognitive Robotics, Delft University of Technology 备注:13 pages, 7 figures 摘要:近年来,基于屏障函数的安全强化学习(RL)受到了越来越多的关注,这类方法采用演员-评论家(actor-critic)结构,用于连续控制任务。学习具有安全性和收敛性保证的近似最优控制策略仍然是一个挑战。此外,很少有研究涉及时变安全约束下的安全RL算法设计。针对具有时变状态和控制约束的非线性系统的最优控制问题,本文提出了一种基于模型的安全RL算法。在该方法中,我们构造了一种新的基于屏障函数的控制策略结构,可以保证控制安全。提出了一种多步策略评估机制,用于预测时变安全约束下的策略安全风险,并指导策略安全地更新。证明了稳定性和鲁棒性的理论结果,并分析了演员-评论家学习算法的收敛性。在模拟的Safety Gym环境中,该算法的性能优于几种最新的RL算法。此外,该方法还应用于两个实际智能车辆的路径跟踪与避撞一体化问题。使用差速驱动车辆和Ackermann驱动车辆分别验证离线部署性能和在线学习性能。我们的方法在实验中显示了令人印象深刻的仿真到现实(sim-to-real)迁移能力和令人满意的在线控制性能。 摘要:Recently, barrier function-based safe reinforcement learning (RL) with the actor-critic structure for continuous control tasks has received increasing attention. It is still challenging to learn a near-optimal control policy with safety and convergence guarantees. Also, few works have addressed the safe RL algorithm design under time-varying safety constraints. This paper proposes a model-based safe RL algorithm for optimal control of nonlinear systems with time-varying state and control constraints. In the proposed approach, we construct a novel barrier-based control policy structure that can guarantee control safety. A multi-step policy evaluation mechanism is proposed to predict the policy's safety risk under time-varying safety constraints and guide the policy to update safely. Theoretical results on stability and robustness are proven. Also, the convergence of the actor-critic learning algorithm is analyzed. The performance of the proposed algorithm outperforms several state-of-the-art RL algorithms in the simulated Safety Gym environment. Furthermore, the approach is applied to the integrated path following and collision avoidance problem for two real-world intelligent vehicles. A differential-drive vehicle and an Ackermann-drive one are used to verify the offline deployment performance and the online learning performance, respectively. Our approach shows an impressive sim-to-real transfer capability and a satisfactory online control performance in the experiment.

【3】 A Scalable Deep Reinforcement Learning Model for Online Scheduling Coflows of Multi-Stage Jobs for High Performance Computing 标题:面向高性能计算的多阶段作业在线调度的可扩展深度强化学习模型 链接:https://arxiv.org/abs/2112.11055

作者:Xin Wang,Hong Shen 机构:•Authors are with School of Computer Science and Engineering, Sun Yat-sen University 备注:12 pages, 18 figures 摘要:Coflow是最近提出的一种网络抽象,用于帮助提高数据并行计算作业的通信性能。在多阶段作业中,每个作业由多个共流组成,并由有向无环图(DAG)表示。有效地调度协同流对于提高数据中心的数据并行计算性能至关重要。与手工优化的调度启发式算法相比,现有的工作DeepWeave[1]利用强化学习(RL)框架自动生成高效的同流调度策略。它采用图形神经网络(GNN)将作业信息编码为一组嵌入向量,并将包含整个作业信息的平面嵌入向量反馈给策略网络。然而,该方法的可扩展性较差,因为它无法处理由任意大小和形状的DAG表示的作业,这需要一个大型策略网络来处理难以训练的高维嵌入向量。本文首先利用有向无环图神经网络(DAGNN)对输入进行处理,提出了一种新的流水线DAGNN,它可以有效地加快DAGNN的特征提取过程。接下来,我们将由可调度共流组成的嵌入序列而不是所有共流的平面嵌入馈送到策略网络,并输出优先级序列,这使得策略网络的大小仅取决于特征的维度,而不是作业DAG中维度和节点数的乘积。此外,为了提高优先级调度策略的准确性,我们将自注意机制引入到深度RL模型中,以捕获嵌入序列不同部分之间的交互,从而使输出优先级得分相关。基于此模型,我们开发了一种在线多阶段作业的同流调度算法。 摘要:Coflow is a recently proposed networking abstraction to help improve the communication performance of data-parallel computing jobs. In multi-stage jobs, each job consists of multiple coflows and is represented by a Directed Acyclic Graph (DAG). Efficiently scheduling coflows is critical to improve the data-parallel computing performance in data centers. Compared with hand-tuned scheduling heuristics, existing work DeepWeave [1] utilizes Reinforcement Learning (RL) framework to generate highly-efficient coflow scheduling policies automatically. It employs a graph neural network (GNN) to encode the job information in a set of embedding vectors, and feeds a flat embedding vector containing the whole job information to the policy network. However, this method has poor scalability as it is unable to cope with jobs represented by DAGs of arbitrary sizes and shapes, which requires a large policy network for processing a high-dimensional embedding vector that is difficult to train. In this paper, we first utilize a directed acyclic graph neural network (DAGNN) to process the input and propose a novel Pipelined-DAGNN, which can effectively speed up the feature extraction process of the DAGNN. Next, we feed the embedding sequence composed of schedulable coflows instead of a flat embedding of all coflows to the policy network, and output a priority sequence, which makes the size of the policy network depend on only the dimension of features instead of the product of dimension and number of nodes in the job's DAG.Furthermore, to improve the accuracy of the priority scheduling policy, we incorporate the Self-Attention Mechanism into a deep RL model to capture the interaction between different parts of the embedding sequence to make the output priority scores relevant. Based on this model, we then develop a coflow scheduling algorithm for online multi-stage jobs.

【4】 Reinforcement Learning based Sequential Batch-sampling for Bayesian Optimal Experimental Design 标题:基于强化学习的序贯批抽样贝叶斯最优试验设计 链接:https://arxiv.org/abs/2112.10944

作者:Yonatan Ashenafi,Piyush Pandita,Sayan Ghosh 机构:University of Alberta, Edmonton, T,G ,R, AB, Canada, General Electric Research, Niskayuna, New York, United States 摘要:使用复杂的数学方法建模的工程问题,或以昂贵的测试或实验为特征的工程问题,都被有限的预算或有限的计算资源所困扰。此外,行业中的实际情况会根据物流和偏好对实验进行的方式施加限制。例如,材料供应可能只允许在一次试验中进行少量试验,或者在计算模型的情况下,可能会面临基于共享计算资源的大量等待时间。在这种情况下,人们通常会以一种允许最大化自己的知识状态,同时满足上述实际约束条件的方式进行实验。顺序实验设计(SDOE)是一套流行的方法,近年来在不同的工程和实际问题上取得了有希望的结果。利用贝叶斯形式主义的一种常见策略是贝叶斯SDOE,它通常最适用于提前一步或短视场景,即在一系列实验的每一步选择单个实验。在这项工作中,我们的目标是扩展SDOE策略,在一批输入中查询实验或计算机代码。为此,我们利用基于深度强化学习(RL)的策略梯度方法,提出批量查询,这些查询的选择考虑了手头的全部预算。该算法保留了SDOE固有的顺序性,同时结合了基于深度RL领域任务的奖励元素。所提出的方法的一个独特功能是,一旦对其进行训练,它就能够应用于多个任务,例如优化功能。我们在一个综合问题和一个具有挑战性的高维工程问题上演示了该算法的性能。 摘要:Engineering problems that are modeled using sophisticated mathematical methods or are characterized by expensive-to-conduct tests or experiments, are encumbered with limited budget or finite computational resources. Moreover, practical scenarios in the industry, impose restrictions, based on logistics and preference, on the manner in which the experiments can be conducted. For example, material supply may enable only a handful of experiments in a single-shot or in the case of computational models one may face significant wait-time based on shared computational resources. In such scenarios, one usually resorts to performing experiments in a manner that allows for maximizing one's state-of-knowledge while satisfying the above mentioned practical constraints. Sequential design of experiments (SDOE) is a popular suite of methods, that has yielded promising results in recent years across different engineering and practical problems. A common strategy, that leverages Bayesian formalism is the Bayesian SDOE, which usually works best in the one-step-ahead or myopic scenario of selecting a single experiment at each step of a sequence of experiments. In this work, we aim to extend the SDOE strategy, to query the experiment or computer code at a batch of inputs. To this end, we leverage deep reinforcement learning (RL) based policy gradient methods, to propose batches of queries that are selected taking into account entire budget in hand. The algorithm retains the sequential nature, inherent in the SDOE, while incorporating elements of reward based on task from the domain of deep RL. A unique capability of the proposed methodology is its ability to be applied to multiple tasks, for example optimization of a function, once its trained. We demonstrate the performance of the proposed algorithm on a synthetic problem, and a challenging high-dimensional engineering problem.

分层学习(2篇)

【1】 Hierarchical Over-the-Air Federated Edge Learning 标题:分层空中联合边缘学习 链接:https://arxiv.org/abs/2112.11167

作者:Ozan Aygün,Mohammad Kazemi,Deniz Gündüz,Tolga M. Duman 机构:Dept. of Electrical and Electronics Engineering, Bilkent University, Ankara, Turkey, Dept. of Electrical and Electronic Engineering, Imperial College London, London, UK 备注:6 pages, 5 figures 摘要:考虑了无线通信信道上的联邦学习(FL),特别是空中(OTA)模型聚合框架。在OTA无线设置中,可以通过增加参数服务器(PS)上的接收天线数量来减轻不利的信道影响,该参数服务器(PS)执行模型聚合。然而,OTA FL的性能受到远离PS的移动用户(MU)的限制。在本文中,为了缓解这一限制,我们提出了分层空中联邦学习(HOTAFL),它利用中间服务器(is)在MU附近形成集群。我们对所提出的设置进行了收敛性分析,并通过理论和实验结果证明,在全局聚合之前,每个集群中的局部聚合比OTA FL具有更好的性能和更快的收敛速度。 摘要:Federated learning (FL) over wireless communication channels, specifically, over-the-air (OTA) model aggregation framework is considered. In OTA wireless setups, the adverse channel effects can be alleviated by increasing the number of receive antennas at the parameter server (PS), which performs model aggregation. However, the performance of OTA FL is limited by the presence of mobile users (MUs) located far away from the PS. In this paper, to mitigate this limitation, we propose hierarchical over-the-air federated learning (HOTAFL), which utilizes intermediary servers (IS) to form clusters near MUs. We provide a convergence analysis for the proposed setup, and demonstrate through theoretical and experimental results that local aggregation in each cluster before global aggregation leads to a better performance and faster convergence than OTA FL.
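
下面给出“先由中间服务器(IS)在簇内做本地聚合、再由参数服务器做全局聚合”的极简数值示意(并非论文中含空中计算信道模型的版本);簇划分与样本数权重均为假设。

```python
import numpy as np

rng = np.random.default_rng(0)
clusters = {f"IS-{c}": rng.normal(size=(5, 8)) for c in range(3)}   # 3 个簇,各 5 个 MU 的模型更新
sizes = {name: rng.integers(50, 200, size=5) for name in clusters}  # 各 MU 的本地样本数

# 第一级:每个中间服务器(IS)对簇内更新做加权平均
cluster_avg = {name: np.average(u, axis=0, weights=sizes[name])
               for name, u in clusters.items()}

# 第二级:参数服务器按各簇总样本数再做一次加权平均,得到全局模型更新
totals = np.array([sizes[name].sum() for name in clusters])
global_update = np.average(np.stack(list(cluster_avg.values())), axis=0, weights=totals)
print(global_update.shape)   # (8,)
```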

【2】 Provable Hierarchical Lifelong Learning with a Sketch-based Modular Architecture 标题:基于草图的模块化体系结构可证明的分层终身学习 链接:https://arxiv.org/abs/2112.10919

作者:Zihao Deng,Zee Fryer,Brendan Juba,Rina Panigrahy,Xin Wang 机构:Washington U. St. Louis, Google Research 摘要:我们提出了一个模块化的体系结构,用于分层结构任务的终身学习。具体地说,我们证明了我们的体系结构在理论上能够学习可以由函数解决的任务,这些函数是可学习的,只要能够访问作为子例程的其他先前学习的任务的函数。我们的经验表明,一些我们可以通过这种方式学习的任务在实践中并不是通过标准的训练方法学习的;事实上,之前的研究表明,如果没有简单任务的帮助,一些这样的任务是无法通过任何有效的方法学习的。我们还考虑自动识别任务的方法,而不依赖于明确给出的指标。 摘要:We propose a modular architecture for the lifelong learning of hierarchically structured tasks. Specifically, we prove that our architecture is theoretically able to learn tasks that can be solved by functions that are learnable given access to functions for other, previously learned tasks as subroutines. We empirically show that some tasks that we can learn in this way are not learned by standard training methods in practice; indeed, prior work suggests that some such tasks cannot be learned by any efficient method without the aid of the simpler tasks. We also consider methods for identifying the tasks automatically, without relying on explicitly given indicators.

医学相关(3篇)

【1】 Automated Drug-Related Information Extraction from French Clinical Documents: ReLyfe Approach 标题:从法国临床文献中自动提取与药物相关的信息:ReLyfe方法 链接:https://arxiv.org/abs/2112.11439

作者:Azzam Alwan,Maayane Attias,Larry Rubin,Adnan El Bakri 机构:R&D Department, ReLyfe - Medical Intelligence, Paris, France, Computer Science, Ecole Polytechnique, Palaiseau, France, BeCareLink, New York, United States 备注:None 摘要:在法国,构建医疗数据仍然是一项挑战,主要是因为出于隐私考虑,缺乏医疗数据,以及缺乏处理法语的方法和途径。这些挑战之一是在法国临床文献中构建药物相关信息。据我们所知,在过去十年中,研究法国处方的相关论文不到五篇。本文提出了一种从法国临床扫描文档中提取药物相关信息的新方法,同时保护患者的隐私。此外,我们在一个健康数据管理平台中部署了我们的方法,用于构建药物医疗数据并帮助患者组织他们的药物计划。它可以在任何web或移动平台上实现。这项工作通过创建适用于实际生产问题的应用程序,缩小了理论工作和实际工作之间的差距。它是基于规则的阶段和深度学习方法的结合。最后,数值结果表明了该方法的优越性和相关性。 摘要:Structuring medical data in France remains a challenge mainly because of the lack of medical data due to privacy concerns and the lack of methods and approaches on processing the French language. One of these challenges is structuring drug-related information in French clinical documents. To our knowledge, over the last decade, there are less than five relevant papers that study French prescriptions. This paper proposes a new approach for extracting drug-related information from French clinical scanned documents while preserving patients' privacy. In addition, we deployed our method in a health data management platform where it is used to structure drug medical data and help patients organize their drug schedules. It can be implemented on any web or mobile platform. This work closes the gap between theoretical and practical work by creating an application adapted to real production problems. It is a combination of a rule-based phase and a Deep Learning approach. Finally, numerical results show the outperformance and relevance of the proposed methodology.

【2】 Predicting infections in the Covid-19 pandemic -- lessons learned 标题:预测冠状病毒大流行中的感染--吸取的教训 链接:https://arxiv.org/abs/2112.11187

作者:Sharare Zehtabian,Siavash Khodadadeh,Damla Turgut,Ladislau Bölöni 机构:Department of Computer Science, University of Central Florida, Orlando, Florida 摘要:在整个COVID-19大流行期间,人们投入了大量努力来开发能够在各种公共政策和非药物干预假设下预测感染人数的技术。虽然可用数据、AI模型的复杂程度和可用算力都超过了前几年的水平,但预测方法的总体成功程度非常有限。在本文中,我们从为XPrize流行病应对挑战赛提出的预测算法出发,考虑了几个可能改进它们的方向。然后,我们考察了它们在长达数月的中期预测中的表现。我们发现,通过加入有关所建模地区文化的额外信息来增强算法,并结合传统的仓室模型和最新的深度学习架构,可以提高短期预测的性能;但中期预测的准确性仍然很低,要使这些模型成为公共政策工具箱中的可靠组成部分,还需要大量的后续研究。 摘要:Throughout the Covid-19 pandemic, a significant amount of effort had been put into developing techniques that predict the number of infections under various assumptions about the public policy and non-pharmaceutical interventions. While both the available data and the sophistication of the AI models and available computing power exceed what was available in previous years, the overall success of prediction approaches was very limited. In this paper, we start from prediction algorithms proposed for XPrize Pandemic Response Challenge and consider several directions that might allow their improvement. Then, we investigate their performance over medium-term predictions extending over several months. We find that augmenting the algorithms with additional information about the culture of the modeled region, incorporating traditional compartmental models and up-to-date deep learning architectures can improve the performance for short term predictions, while the accuracy of medium-term predictions is still very low and a significant amount of future research is needed to make such models a reliable component of a public policy toolbox.

【3】 HarmoFL: Harmonizing Local and Global Drifts in Federated Learning on Heterogeneous Medical Images 标题:HarmoFL:异构医学图像联合学习中局部和全局漂移的协调 链接:https://arxiv.org/abs/2112.10775

作者:Meirui Jiang,Zirui Wang,Qi Dou 机构: Department of Computer Science and Engineering, The Chinese University of Hong Kong, School of Biological Science and Medical Engineering, Beihang University 摘要:多个医疗机构使用联合学习(FL)协作训练模型已成为最大限度地发挥数据驱动模型潜力的一个有希望的解决方案,但医学图像中的非独立同分布(非iid)数据仍然是现实世界实践中的一个突出挑战。不同的扫描程序或协议导致的特征异质性在本地(客户端)和全局(服务器)优化的学习过程中引入了漂移,这损害了收敛性和模型性能。以前的许多工作都试图通过局部或全局解决漂移来解决非iid问题,但如何联合解决这两个基本耦合的漂移仍然不清楚。在这项工作中,我们专注于处理本地和全球漂移,并引入了一个新的协调框架HarmoFL。首先,我们建议通过将变换到频域的图像的振幅归一化以模拟统一的成像设置来减轻局部更新漂移,以便在局部客户机上生成协调的特征空间。其次,基于协调特征,我们设计了一个客户权重扰动,引导每个局部模型达到平坦最优,其中局部最优解的邻域具有一致的低损失。在没有任何额外通信开销的情况下,扰动通过聚合多个局部平坦最优解,帮助全局模型朝着收敛最优解进行优化。我们对所提出的方法进行了理论分析,并在三个医学图像分类和分割任务上进行了大量实验,结果表明HarmoFL的性能优于一组最新的具有良好收敛性能的方法。 摘要:Multiple medical institutions collaboratively training a model using federated learning (FL) has become a promising solution for maximizing the potential of data-driven models, yet the non-independent and identically distributed (non-iid) data in medical images is still an outstanding challenge in real-world practice. The feature heterogeneity caused by diverse scanners or protocols introduces a drift in the learning process, in both local (client) and global (server) optimizations, which harms the convergence as well as model performance. Many previous works have attempted to address the non-iid issue by tackling the drift locally or globally, but how to jointly solve the two essentially coupled drifts is still unclear. In this work, we concentrate on handling both local and global drifts and introduce a new harmonizing framework called HarmoFL. First, we propose to mitigate the local update drift by normalizing amplitudes of images transformed into the frequency domain to mimic a unified imaging setting, in order to generate a harmonized feature space across local clients. Second, based on harmonized features, we design a client weight perturbation guiding each local model to reach a flat optimum, where a neighborhood area of the local optimal solution has a uniformly low loss. Without any extra communication cost, the perturbation assists the global model to optimize towards a converged optimal solution by aggregating several local flat optima. We have theoretically analyzed the proposed method and empirically conducted extensive experiments on three medical image classification and segmentation tasks, showing that HarmoFL outperforms a set of recent state-of-the-art methods with promising convergence behavior.
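
下面是“把图像变换到频域并归一化幅度谱、以模拟统一成像设置”这一缓解本地漂移思路的 numpy 示意(并非 HarmoFL 原实现);这里的归一化方式(幅度除以其均值)是假设的简化写法。

```python
import numpy as np

def harmonize_amplitude(img):
    """对单通道图像做 FFT,归一化幅度谱后与原相位重组;相位保留了结构信息。"""
    f = np.fft.fft2(img)
    amp, phase = np.abs(f), np.angle(f)
    amp = amp / (amp.mean() + 1e-8)          # 简化的幅度归一化(示意)
    f_new = amp * np.exp(1j * phase)
    return np.real(np.fft.ifft2(f_new))

rng = np.random.default_rng(0)
client_a = rng.random((64, 64)) * 3.0        # 模拟不同扫描协议导致的强度差异
client_b = rng.random((64, 64)) * 0.5
print(harmonize_amplitude(client_a).std(), harmonize_amplitude(client_b).std())
```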

推荐(2篇)

【1】 FedPOIRec: Privacy Preserving Federated POI Recommendation with Social Influence 标题:FedPOIRec:具有社交影响的隐私保护联邦POI推荐 链接:https://arxiv.org/abs/2112.11134

作者:Vasileios Perifanis,George Drosatos,Giorgos Stamatelatos,Pavlos S. Efraimidis 机构:Institute for Language and Speech Processing, Athena Research Center, Kimmeria, Xanthi, Greece, Department of Electrical and Computer Engineering, Democritus University of Thrace, Kimmeria, Xanthi, Greece 摘要:随着基于位置的社交网络数量的不断增加,隐私保护位置预测已成为帮助用户发现新兴趣点(POI)的主要任务。传统的系统考虑集中式的方法,需要传输和收集用户的私有数据。在这项工作中,我们介绍了FedPOIRec,这是一种保护隐私的联合学习方法,它通过来自用户社交圈的特征增强了top-$N$ POI推荐。首先,FedPOIRec框架基于本地数据永远不会离开所有者设备的原则构建,而本地更新由参数服务器盲目聚合。其次,本地推荐器通过允许用户交换所学参数来实现个性化,从而实现朋友之间的知识转移。为此,我们利用CKKS完全同态加密方案的特性,提出了一种隐私保护协议,用于在联邦计算后集成用户朋友的偏好。为了评估FedPOIRec,我们使用两个推荐模型将我们的方法应用于五个真实世界的数据集。大量实验表明,FedPOIRec实现了与集中式方法相当的推荐质量,而社会整合协议在用户端产生了较低的计算和通信开销。 摘要:With the growing number of Location-Based Social Networks, privacy preserving location prediction has become a primary task for helping users discover new points-of-interest (POIs). Traditional systems consider a centralized approach that requires the transmission and collection of users' private data. In this work, we present FedPOIRec, a privacy preserving federated learning approach enhanced with features from users' social circles for top-$N$ POI recommendations. First, the FedPOIRec framework is built on the principle that local data never leave the owner's device, while the local updates are blindly aggregated by a parameter server. Second, the local recommenders get personalized by allowing users to exchange their learned parameters, enabling knowledge transfer among friends. To this end, we propose a privacy preserving protocol for integrating the preferences of a user's friends after the federated computation, by exploiting the properties of the CKKS fully homomorphic encryption scheme. To evaluate FedPOIRec, we apply our approach into five real-world datasets using two recommendation models. Extensive experiments demonstrate that FedPOIRec achieves comparable recommendation quality to centralized approaches, while the social integration protocol incurs low computation and communication overhead on the user side.
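摘要提出利用CKKS全同态加密在联邦计算后融合好友偏好。下面用TenSEAL库给出一个对加密向量求平均的最小示意(上下文参数与偏好向量内容均为假设值,仅演示CKKS的同态加法与标量乘法,并非论文协议本身):

```python
import tenseal as ts

# 创建CKKS上下文(参数为常见示例值,非论文设定)
context = ts.context(ts.SCHEME_TYPE.CKKS, poly_modulus_degree=8192,
                     coeff_mod_bit_sizes=[60, 40, 40, 60])
context.global_scale = 2 ** 40
context.generate_galois_keys()

# 假设的好友偏好向量(明文),分别加密
friend_prefs = [[0.2, 0.5, 0.1], [0.4, 0.3, 0.3], [0.1, 0.6, 0.2]]
encrypted = [ts.ckks_vector(context, p) for p in friend_prefs]

# 同态求和,再乘以1/n得到加密的平均偏好
agg = encrypted[0]
for vec in encrypted[1:]:
    agg = agg + vec
avg = agg * (1.0 / len(encrypted))

print(avg.decrypt())  # 持有私钥的一方解密,得到近似的平均偏好
```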

【2】 Synthetic Data and Simulators for Recommendation Systems: Current State and Future Directions 标题:推荐系统中的合成数据和模拟器:现状和未来发展方向 链接:https://arxiv.org/abs/2112.11022

作者:Adam Lesnikowski,Gabriel de Souza Pereira Moreira,Sara Rabhi,Karl Byleen-Higley 备注:7 pages, included in SimuRec 2021: Workshop on Simulation Methods for Recommender Systems at ACM RecSys 2021, October 2nd, 2021, Amsterdam, NL and online 摘要:合成数据和模拟器有可能显著提高推荐系统的性能和鲁棒性。这些方法已经在其他机器学习驱动的领域产生了有益的影响。我们确定并讨论了在过去关于推荐系统的合成数据和模拟器的工作中,数据保真度和隐私之间的关键权衡。对于从合成数据预测真实数据的算法排名的重要用例,我们提供了动机和当前的成功与局限。最后,我们概述了推荐系统的一些令人兴奋的未来方向,我们认为这些方向值得进一步关注和工作,包括混合真实和合成数据、数据集生成中的反馈、鲁棒模拟和隐私保护方法。 摘要:Synthetic data and simulators have the potential to markedly improve the performance and robustness of recommendation systems. These approaches have already had a beneficial impact in other machine-learning driven fields. We identify and discuss a key trade-off between data fidelity and privacy in the past work on synthetic data and simulators for recommendation systems. For the important use case of predicting algorithm rankings on real data from synthetic data, we provide motivation and current successes versus limitations. Finally we outline a number of exciting future directions for recommendation systems that we believe deserve further attention and work, including mixing real and synthetic data, feedback in dataset generation, robust simulations, and privacy-preserving methods.

聚类(1篇)

【1】 What are Attackers after on IoT Devices? An approach based on a multi-phased multi-faceted IoT honeypot ecosystem and data clustering 标题:攻击者在物联网设备上的目标是什么?一种基于多阶段多层面物联网蜜罐生态系统和数据聚类的方法 链接:https://arxiv.org/abs/2112.10974

作者:Armin Ziaie Tabari,Xinming Ou,Anoop Singhal 机构:University of South Florida, Tampa, FL, USA, National Institute of Standards and, Technology, Gaithersburg, Maryland, USA 备注:arXiv admin note: text overlap with arXiv:2003.01218 摘要:物联网(IoT)设备数量的不断增加,使得人们必须意识到它们在网络安全方面所面临的现实威胁。虽然蜜罐历来被用作诱饵设备,以帮助研究人员/组织更好地了解网络上威胁的动态及其影响,但由于设备及其物理连接的多样性,物联网设备为此提出了独特的挑战。在这项工作中,通过观察真实世界攻击者在低交互度蜜罐生态系统中的行为,我们(1)提出了一种创建多阶段、多方面蜜罐生态系统的新方法,该方法逐渐提高了蜜罐与对手交互的复杂性,(2)为摄像头设计并开发了一个低交互蜜罐,使研究人员能够更深入地了解攻击者的目标;(3)设计了一种创新的数据分析方法,以确定对手的目标。我们的蜜罐已经活跃了三年多。我们能够在每个阶段收集越来越复杂的攻击数据。此外,我们的数据分析表明,蜜罐中捕获的绝大多数攻击活动具有显著的相似性,可以进行聚类和分组,以便更好地了解野生物联网攻击的目标、模式和趋势。 摘要:The growing number of Internet of Things (IoT) devices makes it imperative to be aware of the real-world threats they face in terms of cybersecurity. While honeypots have been historically used as decoy devices to help researchers/organizations gain a better understanding of the dynamic of threats on a network and their impact, IoT devices pose a unique challenge for this purpose due to the variety of devices and their physical connections. In this work, by observing real-world attackers' behavior in a low-interaction honeypot ecosystem, we (1) presented a new approach to creating a multi-phased, multi-faceted honeypot ecosystem, which gradually increases the sophistication of honeypots' interactions with adversaries, (2) designed and developed a low-interaction honeypot for cameras that allowed researchers to gain a deeper understanding of what attackers are targeting, and (3) devised an innovative data analytics method to identify the goals of adversaries. Our honeypots have been active for over three years. We were able to collect increasingly sophisticated attack data in each phase. Furthermore, our data analytics points to the fact that the vast majority of attack activities captured in the honeypots share significant similarity, and can be clustered and grouped to better understand the goals, patterns, and trends of IoT attacks in the wild.
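摘要提到用数据聚类来归纳蜜罐捕获的攻击活动。下面给出一个示意流程:对捕获的攻击命令串做TF-IDF向量化后用KMeans聚类(命令内容为虚构示例,聚类数为假设值,并非论文采用的具体分析方法):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

# 虚构的蜜罐会话命令(示意数据)
sessions = [
    "wget http://x/mirai.sh; chmod +x mirai.sh; ./mirai.sh",
    "cat /proc/cpuinfo; uname -a",
    "wget http://y/bot.sh; chmod 777 bot.sh; sh bot.sh",
    "echo root:12345 | chpasswd; uname -a",
]

vec = TfidfVectorizer(token_pattern=r"[^\s;|]+")  # 以空白和分隔符切分命令片段
X = vec.fit_transform(sessions)

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(km.labels_)   # 相似的攻击会话被分到同一簇,便于归纳攻击目标与模式
```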

超分辨率|去噪|去模糊|去雾(1篇)

【1】 Can We Use Neural Regularization to Solve Depth Super-Resolution? 标题:我们能用神经正则化来解决深度超分辨问题吗? 链接:https://arxiv.org/abs/2112.11085

作者:Milena Gazdieva,Oleg Voynov,Alexey Artemov,Youyi Zheng,Luiz Velho,Evgeny Burnaev 机构:Skolkovo Institute of Science and Technology, Moscow, Russia, State Key Lab, Zhejiang University, Hangzhou, China, Instituto Nacional de Matemática Pura e Aplicada, Rio de Janeiro, Brazil 备注:9 pages 摘要:使用商品传感器捕获的深度图通常需要超分辨率才能在应用中使用。在这项工作中,我们研究了一种基于Tikhonov正则化的变分问题陈述的超分辨率方法,其中正则化子用深度神经网络参数化。这种方法曾成功地应用于光声层析成像。我们的实验表明,将其应用于深度图超分辨率是困难的,并就其中的原因提出了一些建议。 摘要:Depth maps captured with commodity sensors often require super-resolution to be used in applications. In this work we study a super-resolution approach based on a variational problem statement with Tikhonov regularization where the regularizer is parametrized with a deep neural network. This approach was previously applied successfully in photoacoustic tomography. We experimentally show that its application to depth map super-resolution is difficult, and provide suggestions about the reasons for that.
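摘要研究的是带Tikhonov型正则项的变分问题,其中正则项由神经网络参数化。下面用PyTorch写一个最小示意:数据保真项取下采样一致性,正则项R_theta为一个小卷积网络输出的平方均值(网络结构、损失形式与正则权重均为示例假设,并非论文实现):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NeuralRegularizer(nn.Module):
    """用小型卷积网络参数化的正则项 R_theta(u)(示意结构)。"""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
                                 nn.Conv2d(8, 1, 3, padding=1))
    def forward(self, u):
        return (self.net(u) ** 2).mean()

low_res = torch.rand(1, 1, 32, 32)                   # 观测到的低分辨率深度图(随机示例)
u = torch.rand(1, 1, 128, 128, requires_grad=True)   # 待求的高分辨率深度图
reg = NeuralRegularizer()
opt = torch.optim.Adam([u], lr=1e-2)
lam = 0.1                                             # 正则权重(假设值)

for step in range(200):
    opt.zero_grad()
    down = F.interpolate(u, size=(32, 32), mode="bilinear", align_corners=False)
    loss = F.mse_loss(down, low_res) + lam * reg(u)   # 数据项 + 神经正则项
    loss.backward()
    opt.step()
print(float(loss))
```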

点云|SLAM|雷达|激光|深度RGBD相关(2篇)

【1】 Deep Learning Based 3D Point Cloud Regression for Estimating Forest Biomass 标题:基于深度学习的三维点云回归估计森林生物量 链接:https://arxiv.org/abs/2112.11335

作者:Stefan Oehmcke,Lei Li,Jaime Revenga,Thomas Nord-Larsen,Katerina Trepekli,Fabian Gieseke,Christian Igel 机构:University of Copenhagen, University of Münster 摘要:了解森林生物量存量及其发展对于实施有效的气候变化缓解措施十分重要。它是研究驱动造林、再造林和毁林过程的必要条件,也是碳核算的先决条件。利用机载激光雷达进行遥感可以大范围测量植被生物量。我们提出了一种深度学习系统,用于直接从3D激光雷达点云数据预测木材体积、地上生物量(AGB)以及随后的碳。我们为点云回归设计了不同的神经网络结构,并在国家森林资源清查中通过实地测量获得AGB估计值的地区的遥感数据上对其进行评估。我们采用Minkowski卷积神经网络进行回归得到了最好的结果。与基于点云基本统计数据的最新方法相比,深层神经网络产生了更精确的木材体积、AGB和碳估算,我们预计这一发现将对基于激光雷达的陆地生态系统动力学分析产生重大影响。 摘要:Knowledge of forest biomass stocks and their development is important for implementing effective climate change mitigation measures. It is needed for studying the processes driving af-, re-, and deforestation and is a prerequisite for carbon-accounting. Remote sensing using airborne LiDAR can be used to measure vegetation biomass at large scale. We present deep learning systems for predicting wood volume, above-ground biomass (AGB), and subsequently carbon directly from 3D LiDAR point cloud data. We devise different neural network architectures for point cloud regression and evaluate them on remote sensing data of areas for which AGB estimates have been obtained from field measurements in a national forest inventory. Our adaptation of Minkowski convolutional neural networks for regression gave the best results. The deep neural networks produced significantly more accurate wood volume, AGB, and carbon estimates compared to state-of-the-art approaches operating on basic statistics of the point clouds, and we expect this finding to have a strong impact on LiDAR-based analyses of terrestrial ecosystem dynamics.

【2】 Predicting Defects in Laser Powder Bed Fusion using in-situ Thermal Imaging Data and Machine Learning 标题:基于原位热成像数据和机器学习的激光粉床融合缺陷预测 链接:https://arxiv.org/abs/2112.11212

作者:Sina Malakpour Estalaki,Cody S. Lough,Robert G. Landers,Edward C. Kinzel,Tengfei Luo 机构:Department of Aerospace and Mechanical Engineering, University of Notre Dame, Notre Dame, Indiana, Department of Mechanical and Aerospace Engineering, Missouri University of Science and Technology, Rolla, MO, USA 摘要:在增材制造(AM)的激光粉末床熔合(LPBF)过程中,局部热历史的变化会导致微孔缺陷。有人提议采用原位传感来监测AM过程,以最大限度地减少缺陷,但成功需要在传感数据和孔隙度之间建立定量关系,这对于大量变量和计算成本来说尤其具有挑战性。在这项工作中,我们开发了机器学习(ML)模型,可以使用原位热成像数据预测LPBF不锈钢材料的微孔。这项工作考虑了从热历史中确定的两个关键特征:高于视熔化阈值(\tau)的时间和最大辐射(T_{max})。这些特征将被计算,存储在构建材质中的每个体素中,并用作输入。每个体素的二元状态(缺陷或正常)为输出。针对二元分类任务,对不同的ML模型进行了训练和测试。除了使用每个体素的热特征来预测其自身的状态外,还包括相邻体素的热特征作为输入。这被证明可以提高预测精度,这与每个体素周围的热传输物理对其最终状态的贡献是一致的。在训练的模型中,对于随机森林,测试集的F1分数达到0.96以上。基于ML模型的特征重要性分析表明,T_{max}对体素状态的重要性大于\tau。分析还发现,当前体素上方体素的热历史比下方体素的热历史影响更大。 摘要:Variation in the local thermal history during the laser powder bed fusion (LPBF) process in additive manufacturing (AM) can cause microporosity defects. in-situ sensing has been proposed to monitor the AM process to minimize defects, but the success requires establishing a quantitative relationship between the sensing data and the porosity, which is especially challenging for a large number of variables and computationally costly. In this work, we develop machine learning (ML) models that can use in-situ thermographic data to predict the microporosity of LPBF stainless steel materials. This work considers two identified key features from the thermal histories: the time above the apparent melting threshold (\tau) and the maximum radiance (T_{max}). These features are computed, stored for each voxel in the built material, are used as inputs. The binary state of each voxel, either defective or normal, is the output. Different ML models are trained and tested for the binary classification task. In addition to using the thermal features of each voxel to predict its own state, the thermal features of neighboring voxels are also included as inputs. This is shown to improve the prediction accuracy, which is consistent with thermal transport physics around each voxel contributing to its final state. Among the models trained, the F1 scores on test sets reach above 0.96 for random forests. Feature importance analysis based on the ML models shows that T_{max} is more important to the voxel state than \tau. The analysis also finds that the thermal history of the voxels above the present voxel is more influential than those beneath it.
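摘要以每个体素的热特征(\tau 与 T_{max})连同邻域体素的同类特征作为输入训练二分类模型。下面是一个用随机森林和F1分数的最小示意(数据为随机合成,邻域特征个数与标签生成方式均为示例假设):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2000
tau   = rng.random(n)          # 高于表观熔化阈值的时间(归一化,合成数据)
t_max = rng.random(n)          # 最大辐射(归一化,合成数据)
neigh = rng.random((n, 6))     # 假设的6个邻域体素热特征
X = np.column_stack([tau, t_max, neigh])
y = (t_max + 0.3 * rng.standard_normal(n) > 0.8).astype(int)  # 合成的缺陷标签

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
print("F1:", f1_score(y_te, clf.predict(X_te)))
print("特征重要性(前两维对应 tau 与 T_max):", clf.feature_importances_[:2])
```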

推理|分析|理解|解释(3篇)

【1】 Offloading Algorithms for Maximizing Inference Accuracy on Edge Device Under a Time Constraint 标题:时间约束下最大化边缘设备推理精度的卸载算法 链接:https://arxiv.org/abs/2112.11413

作者:Andrea Fresa,Jaya Prakash Champati 机构:IMDEA Networks Institute, Madrid, Spain 摘要:随着边缘计算的出现,边缘设备(ED)和边缘服务器(ES)之间的作业卸载问题在过去受到了极大的关注。由于越来越多的应用程序使用机器学习(ML)推理,我们通过考虑以下新的方面来研究卸载推理作业的问题:1)与典型的计算作业相比,推理作业的处理时间取决于ML模型的大小,2)最近提出的用于资源受限设备的深度神经网络(DNN)提供了缩放模型大小的选择。我们制定了一个分配问题,目的是在最大完工时间(makespan)受时间约束T的前提下,最大化边缘设备(ED)上可用的n个数据样本的总推理精度。我们提出了近似算法AMR2,并证明其最大完工时间至多为2T,且其总精度与最优总精度之差不超过一个小常数。作为概念证明,我们在配备MobileNet的Raspberry Pi上实现了AMR2,并将其连接到配备ResNet的服务器上,研究了AMR2在图像分类应用中的总体精度和完成时间性能。 摘要:With the emergence of edge computing, the problem of offloading jobs between an Edge Device (ED) and an Edge Server (ES) received significant attention in the past. Motivated by the fact that an increasing number of applications are using Machine Learning (ML) inference, we study the problem of offloading inference jobs by considering the following novel aspects: 1) in contrast to a typical computational job, the processing time of an inference job depends on the size of the ML model, and 2) recently proposed Deep Neural Networks (DNNs) for resource-constrained devices provide the choice of scaling the model size. We formulate an assignment problem with the aim of maximizing the total inference accuracy of n data samples available at the ED, subject to a time constraint T on the makespan. We propose an approximation algorithm AMR2, and prove that it results in a makespan at most 2T, and achieves a total accuracy that is lower by a small constant from optimal total accuracy. As proof of concept, we implemented AMR2 on a Raspberry Pi, equipped with MobileNet, and is connected to a server equipped with ResNet, and studied the total accuracy and makespan performance of AMR2 for image classification application.

【2】 Toward Explainable AI for Regression Models 标题:回归模型的可解释人工智能 链接:https://arxiv.org/abs/2112.11407

作者:Simon Letzgus,Patrick Wagner,Jonas Lederer,Wojciech Samek,Klaus-Robert Müller,Gregoire Montavon 机构:Gr´egoire Montavon∗, Machine Learning Group, Technische Universit¨at Berlin, Berlin, Germany, Department of Artificial Intelligence, Fraunhofer Heinrich Hertz Institute, Berlin, Germany 备注:17 pages, 10 figures, preprint 摘要:除了机器学习(ML)模型令人印象深刻的预测能力外,最近出现了解释方法,可以解释复杂的非线性学习模型,如深度神经网络。获得更好的理解尤其重要,例如对于安全关键的ML应用程序或医疗诊断等。虽然此类可解释AI(XAI)技术在分类器中已经非常流行,但到目前为止,很少有人关注回归模型(XAIR)的XAI。在这篇综述中,我们阐明了XAI在回归和分类任务中的基本概念差异,为XAIR建立了新的理论见解和分析,提供了XAIR在真实实际回归问题上的演示,最后讨论了该领域仍然存在的挑战。 摘要:In addition to the impressive predictive power of machine learning (ML) models, more recently, explanation methods have emerged that enable an interpretation of complex non-linear learning models such as deep neural networks. Gaining a better understanding is especially important e.g. for safety-critical ML applications or medical diagnostics etc. While such Explainable AI (XAI) techniques have reached significant popularity for classifiers, so far little attention has been devoted to XAI for regression models (XAIR). In this review, we clarify the fundamental conceptual differences of XAI for regression and classification tasks, establish novel theoretical insights and analysis for XAIR, provide demonstrations of XAIR on genuine practical regression problems, and finally discuss the challenges remaining for the field.

【3】 Doubly-Valid/Doubly-Sharp Sensitivity Analysis for Causal Inference with Unmeasured Confounding 标题:不可测混杂因果推断的双有效/双锐度灵敏度分析 链接:https://arxiv.org/abs/2112.11449

作者:Jacob Dorn,Kevin Guo,Nathan Kallus 机构:Princeton University, Stanford University, Cornell University 摘要:在Tan(2006)的边际敏感性模型下,我们研究了在存在未观察到的混杂因素的情况下构建平均治疗效果的界限的问题。结合现有的涉及对抗倾向得分的特征和问题的新分布稳健特征,我们提出了这些界的新估计量,我们称之为“双有效/双锐”(DVDS)估计量。双锐度对应于这样一个事实,即即使两个干扰参数中的一个被错误指定,DVDS估计器也会一致地估计灵敏度模型所暗示的最紧可能(即锐度)的界限,并且在所有干扰参数适当一致时达到半参数效率。双重有效性是部分识别的一个全新属性:DVDS估计器即使在大多数讨厌的参数被错误指定的情况下,仍然提供有效的,但不尖锐的界限。事实上,即使在DVDS点估计不能渐近正态的情况下,标准Wald置信区间也可能仍然有效。在二元结果的情况下,DVDS估计器特别方便,并且在结果回归和倾向得分方面具有封闭形式的表达式。我们在模拟研究以及右心导管插入术的案例研究中展示了DVDS估计器。 摘要:We study the problem of constructing bounds on the average treatment effect in the presence of unobserved confounding under the marginal sensitivity model of Tan (2006). Combining an existing characterization involving adversarial propensity scores with a new distributionally robust characterization of the problem, we propose novel estimators of these bounds that we call "doubly-valid/doubly-sharp" (DVDS) estimators. Double sharpness corresponds to the fact that DVDS estimators consistently estimate the tightest possible (i.e., sharp) bounds implied by the sensitivity model even when one of two nuisance parameters is misspecified and achieve semiparametric efficiency when all nuisance parameters are suitably consistent. Double validity is an entirely new property for partial identification: DVDS estimators still provide valid, though not sharp, bounds even when most nuisance parameters are misspecified. In fact, even in cases when DVDS point estimates fail to be asymptotically normal, standard Wald confidence intervals may remain valid. In the case of binary outcomes, the DVDS estimators are particularly convenient and possesses a closed-form expression in terms of the outcome regression and propensity score. We demonstrate the DVDS estimators in a simulation study as well as a case study of right heart catheterization.

检测相关(2篇)

【1】 Sports Video: Fine-Grained Action Detection and Classification of Table Tennis Strokes from Videos for MediaEval 2021 标题:体育视频:面向MediaEval 2021的视频乒乓球击球动作细粒度检测与分类 链接:https://arxiv.org/abs/2112.11384

作者:Pierre-Etienne Martin,Jordan Calandre,Boris Mansencal,Jenny Benois-Pineau,Renaud Péteri,Laurent Mascarilla,Julien Morlier 机构:CCP Department, Max Planck Institute for Evolutionary Anthropology, D-, Leipzig, Germany, MIA, La Rochelle University, La Rochelle, France, Univ. Bordeaux, CNRS, Bordeaux INP, LaBRI, Talence, France, IMS, University of Bordeaux, Talence, France 备注:MediaEval 2021, Dec 2021, Online, Germany 摘要:由于应用领域的多样性,体育视频分析是一个流行的研究课题,从具有用户定制摘要的多媒体智能设备到运动员表现分析。体育视频任务是MediaEval 2021基准的一部分。此任务处理视频中的细粒度动作检测和分类。重点是乒乓球比赛的录像。该任务自2019年开始运行,针对在自然条件下记录的未经修剪的视频(每一次击球的时间边界已知)提出了分类挑战。今年,该数据集得到了扩展,此外,它还提供了一个来自未经剪辑的无注释视频的检测挑战。这项工作旨在为体育教练和运动员创建工具,以便分析运动成绩。运动分析和运动员档案可以建立在这些技术的基础上,以丰富运动员的训练经验,提高他们的成绩。 摘要:Sports video analysis is a prevalent research topic due to the variety of application areas, ranging from multimedia intelligent devices with user-tailored digests up to analysis of athletes' performance. The Sports Video task is part of the MediaEval 2021 benchmark. This task tackles fine-grained action detection and classification from videos. The focus is on recordings of table tennis games. Running since 2019, the task has offered a classification challenge from untrimmed video recorded in natural conditions with known temporal boundaries for each stroke. This year, the dataset is extended and offers, in addition, a detection challenge from untrimmed videos without annotations. This work aims at creating tools for sports coaches and players in order to analyze sports performance. Movement analysis and player profiling may be built upon such technology to enrich the training experience of athletes and improve their performance.

【2】 A Pilot Study on Detecting Unfairness in Human Decisions With Machine Learning Algorithmic Bias Detection 标题:机器学习算法偏差检测在人类决策不公平性检测中的初步研究 链接:https://arxiv.org/abs/2112.11279

作者:Zhe Yu,Xiaoyin Xi 机构:Rochester Institute of Technology, Rochester, NY, USA 摘要:决策公平是我们社会的一个长期问题。尽管在机器学习模型中关于减少不公平的研究活动越来越多,但很少有研究集中于减少人类决策中的不公平。人类决策中的公平性与机器学习模型中的公平性同等重要(如果不是更重要的话),因为存在人类做出最终决策的过程,并且机器学习模型可以从他们接受训练的人类决策中继承偏见。因此,这项工作旨在检测人类决策中的不公平,这是解决不公平人类决策问题的第一步。本文提出利用现有的机器学习公平性检测机制来检测人类决策中的不公平性。这背后的基本原理是,虽然很难直接测试人类是否做出了不公平的决定,但根据目前关于机器学习公平性的研究,现在很容易以低成本大规模测试机器学习模型是否不公平。通过在四个一般机器学习公平性数据集和一个图像处理数据集上合成不公平标签,本文表明,该方法能够检测(1)训练数据中是否存在不公平标签,以及(2)不公平的程度和方向。我们相信这项工作展示了利用机器学习公平性检测人类决策公平性的潜力。在这项工作之后,可以进行以下研究:(1)防止未来的不公平决策,(2)修复先前的不公平决策,(3)训练更公平的机器学习模型。 摘要:Fairness in decision-making has been a long-standing issue in our society. Despite the increasing number of research activities on unfairness mitigation in machine learning models, there is little research focusing on mitigating unfairness in human decisions. Fairness in human decisions is as important as, if not more important than, fairness in machine learning models since there are processes where humans make the final decisions and machine learning models can inherit bias from the human decisions they were trained on. As a result, this work aims to detect unfairness in human decisions, the very first step of solving the unfair human decision problem. This paper proposes to utilize the existing machine learning fairness detection mechanisms to detect unfairness in human decisions. The rationale behind this is, while it is difficult to directly test whether a human makes unfair decisions, with current research on machine learning fairness, it is now easy to test, on a large scale at a low cost, whether a machine learning model is unfair. By synthesizing unfair labels on four general machine learning fairness datasets and one image processing dataset, this paper shows that the proposed approach is able to detect (1) whether or not unfair labels exist in the training data and (2) the degree and direction of the unfairness. We believe that this work demonstrates the potential of utilizing machine learning fairness to detect human decision fairness. Following this work, research can be conducted on (1) preventing future unfair decisions, (2) fixing prior unfair decisions, and (3) training a fairer machine learning model.
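摘要的思路是把带有(可能不公平的)人类决策标签的数据交给机器学习公平性检测机制。下面给出一个示意:在合成数据上计算"统计均等差"(demographic parity difference),其符号与大小可粗略反映不公平的方向与程度(数据、偏置方式与阈值均为假设,并非论文的完整流程):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
group = rng.integers(0, 2, n)                  # 受保护属性:0 / 1
score = rng.random(n)                          # 申请者真实资质(合成)
# 合成的"人类决策"标签:对 group==1 人为施加负向偏置
label = (score - 0.15 * group + 0.05 * rng.standard_normal(n) > 0.5).astype(int)

rate0 = label[group == 0].mean()
rate1 = label[group == 1].mean()
dpd = rate1 - rate0                            # 统计均等差:负值表示对组1不利
print(f"组0通过率={rate0:.3f}, 组1通过率={rate1:.3f}, 差值={dpd:.3f}")
if abs(dpd) > 0.05:                            # 判定阈值为示例假设
    print("检测到疑似不公平标签,受不利影响的是组", 1 if dpd < 0 else 0)
```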

分类|识别(3篇)

【1】 Jamming Pattern Recognition over Multi-Channel Networks: A Deep Learning Approach 标题:多通道网络干扰模式识别:一种深度学习方法 链接:https://arxiv.org/abs/2112.11222

作者:Ali Pourranjbar,Georges Kaddoum,Walid Saad 机构:Department of Electrical Engineering ETS - University of Quebec, Montreal QC, Canada 摘要:随着智能干扰机的出现,干扰攻击已成为无线系统性能的更严重威胁。智能干扰机能够改变其策略以最小化被合法节点跟踪的概率。因此,对抗此类干扰机需要一种能够不断调整干扰策略的抗干扰机制。值得注意的是,现有的抗干扰方法在这里不适用,因为它们主要集中在用不变的干扰策略来减轻干扰攻击,并且很少考虑智能干扰机作为对手。因此,本文提出采用干扰类型识别技术与抗干扰技术相结合。所提出的识别方法采用了一种递归神经网络,该网络将干扰机占用的信道作为输入,并输出干扰机类型。在该方案下,首先确定实时干扰策略,然后选择最合适的对策。因此,使用建议的识别技术,可以立即检测到干扰机策略的任何变化,从而允许快速切换到适合新干扰策略的新抗干扰方法。为了评估所提出的识别方法的性能,将检测精度作为干扰机策略切换时间的函数进行推导。仿真结果表明,当干扰机每5个时隙切换其策略时,所有考虑的用户数的检测精度均大于70%,当干扰机策略切换时间为45时,检测精度提高到90%。 摘要:With the advent of intelligent jammers, jamming attacks have become a more severe threat to the performance of wireless systems. An intelligent jammer is able to change its policy to minimize the probability of being traced by legitimate nodes. Thus, an anti-jamming mechanism capable of constantly adjusting to the jamming policy is required to combat such a jammer. Remarkably, existing anti-jamming methods are not applicable here because they mainly focus on mitigating jamming attacks with an invariant jamming policy, and they rarely consider an intelligent jammer as an adversary. Therefore, in this paper, to employ a jamming type recognition technique working alongside an anti-jamming technique is proposed. The proposed recognition method employs a recurrent neural network that takes the jammer's occupied channels as inputs and outputs the jammer type. Under this scheme, the real-time jammer policy is first identified, and, then, the most appropriate countermeasure is chosen. Consequently, any changes to the jammer policy can be instantly detected with the proposed recognition technique allowing for a rapid switch to a new anti-jamming method fitted to the new jamming policy. To evaluate the performance of the proposed recognition method, the accuracy of the detection is derived as a function of the jammer policy switching time. Simulation results show the detection accuracy for all the considered users numbers is greater than 70% when the jammer switches its policy every 5 time slots and the accuracy raises to 90% when the jammer policy switching time is 45.
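摘要中的识别网络以干扰机占用的信道序列为输入、输出干扰类型。下面用PyTorch写一个最小的LSTM分类器骨架(信道数、干扰类型数、序列长度与隐藏维度均为假设值,仅示意输入输出形状,并非论文原模型):

```python
import torch
import torch.nn as nn

NUM_CHANNELS, NUM_JAMMER_TYPES, SEQ_LEN = 16, 4, 50   # 假设的规模

class JammerTypeRNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.lstm = nn.LSTM(input_size=NUM_CHANNELS, hidden_size=64, batch_first=True)
        self.fc = nn.Linear(64, NUM_JAMMER_TYPES)
    def forward(self, x):            # x: (batch, seq_len, num_channels) 的信道占用指示
        out, _ = self.lstm(x)
        return self.fc(out[:, -1])   # 用最后时刻的隐状态预测干扰类型

model = JammerTypeRNN()
occupied = (torch.rand(8, SEQ_LEN, NUM_CHANNELS) > 0.8).float()  # 随机占用序列
logits = model(occupied)
loss = nn.CrossEntropyLoss()(logits, torch.randint(0, NUM_JAMMER_TYPES, (8,)))
loss.backward()
print(logits.shape)   # (8, 4)
```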

【2】 Subject-Independent Drowsiness Recognition from Single-Channel EEG with an Interpretable CNN-LSTM model 标题:基于可解释CNN-LSTM模型的单通道脑电被试独立嗜睡识别 链接:https://arxiv.org/abs/2112.10894

作者:Jian Cui,Zirui Lan,Tianhu Zheng,Yisi Liu,Olga Sourina,Lipo Wang,Wolfgang Müller-Wittig 机构:Fraunhofer Singapore, Nanyang Technological University 备注:None 摘要:对于基于EEG的嗜睡识别,需要使用独立于主体的识别,因为对每个主体进行校准非常耗时。在本文中,我们提出了一种新的卷积神经网络(CNN)-长-短期记忆(LSTM)模型,用于从单通道EEG信号中识别独立于主体的睡意。与现有的深度学习模型(大多被视为黑盒分类器)不同,该模型可以通过揭示样本的哪些部分包含模型识别的重要特征来解释其对每个输入样本的决策。这是通过利用LSTM层输出的隐藏状态的可视化技术实现的。结果表明,在公共数据集上,该模型对11名受试者的平均识别准确率为72.97%,高于传统的基线方法55.42%-69.27%和先进的深度学习方法。可视化结果表明,该模型发现了与不同受试者不同心理状态相关的EEG信号的有意义模式。 摘要:For EEG-based drowsiness recognition, it is desirable to use subject-independent recognition since conducting calibration on each subject is time-consuming. In this paper, we propose a novel Convolutional Neural Network (CNN)-Long Short-Term Memory (LSTM) model for subject-independent drowsiness recognition from single-channel EEG signals. Different from existing deep learning models that are mostly treated as black-box classifiers, the proposed model can explain its decisions for each input sample by revealing which parts of the sample contain important features identified by the model for classification. This is achieved by a visualization technique by taking advantage of the hidden states output by the LSTM layer. Results show that the model achieves an average accuracy of 72.97% on 11 subjects for leave-one-out subject-independent drowsiness recognition on a public dataset, which is higher than the conventional baseline methods of 55.42%-69.27%, and state-of-the-art deep learning methods. Visualization results show that the model has discovered meaningful patterns of EEG signals related to different mental states across different subjects.
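摘要描述的是"CNN提取局部特征 + LSTM建模时序 + 利用LSTM隐状态做可视化"的结构。下面是一个单通道EEG二分类的最小PyTorch骨架(采样点数、卷积核尺寸与隐藏维度均为假设值,且不包含论文的可视化部分):

```python
import torch
import torch.nn as nn

class CNNLSTM(nn.Module):
    """单通道EEG嗜睡二分类的示意模型(结构为假设,非论文原实现)。"""
    def __init__(self, n_classes=2):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=64, stride=8), nn.ReLU(),
            nn.Conv1d(16, 32, kernel_size=8, stride=2), nn.ReLU())
        self.lstm = nn.LSTM(input_size=32, hidden_size=64, batch_first=True)
        self.fc = nn.Linear(64, n_classes)

    def forward(self, x):                 # x: (batch, 1, n_samples)
        feat = self.conv(x)               # (batch, 32, T')
        feat = feat.transpose(1, 2)       # (batch, T', 32) 交给LSTM
        hidden, _ = self.lstm(feat)       # 各时刻隐状态,可用于定位重要片段
        return self.fc(hidden[:, -1])

eeg = torch.randn(4, 1, 3000)              # 假设每个样本3000个采样点
print(CNNLSTM()(eeg).shape)                # (4, 2)
```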

【3】 Multiple Time Series Fusion Based on LSTM An Application to CAP A Phase Classification Using EEG 标题:基于LSTM的多时间序列融合在脑电CAP A相分类中的应用 链接:https://arxiv.org/abs/2112.11218

作者:Fábio Mendonça,Sheikh Shanawaz Mostafa,Diogo Freitas,Fernando Morgado-Dias,Antonio G. Ravelo-García 机构:a Interactive Technologies Institute (ITIARDITILARSyS and M-ITI),-, Funchal, Portugal., b University of Madeira,-, Funchal, Portugal., c NOVA Laboratory for Computer Science and Informatics,-, Caparica, Portugal. 备注:47 pages, 7 figures, 4 tables, for journal publication 摘要:生物医学决策涉及来自不同传感器或不同渠道的多种信号处理。在这两种情况下,信息融合都发挥着重要作用。针对脑电循环交替模式A相位分类问题,提出了一种基于深度学习的脑电通道特征级融合方法。采用遗传算法和粒子群优化算法对信道选择、融合和分类过程进行优化。通过融合夜间额叶癫痫患者和无任何神经障碍患者的多个脑电图通道的信息来评估所开发的方法,与其他最先进的工作相比,这是一项更具挑战性的工作。结果表明,两种优化算法都选择了具有相似特征级融合的可比结构,由三个符合CAP协议的脑电图通道组成,以确保CAP检测的多通道唤醒。此外,两个优化模型在接收器工作特性曲线下的区域达到了0.82,平均精度在77%到79%之间,这一结果在专家协议的上限范围内。尽管数据集比较困难,但所提出的方法仍处于最先进作品的上限,并且具有提供全自动分析而无需任何手动程序的优势。最终,这些模型显示出抗噪声能力和对多信道损耗的恢复能力。 摘要:Biomedical decision making involves multiple signal processing, either from different sensors or from different channels. In both cases, information fusion plays a significant role. A deep learning based electroencephalogram channels' feature level fusion is carried out in this work for the electroencephalogram cyclic alternating pattern A phase classification. Channel selection, fusion, and classification procedures were optimized by two optimization algorithms, namely, Genetic Algorithm and Particle Swarm Optimization. The developed methodologies were evaluated by fusing the information from multiple electroencephalogram channels for patients with nocturnal frontal lobe epilepsy and patients without any neurological disorder, which was significantly more challenging when compared to other state of the art works. Results showed that both optimization algorithms selected a comparable structure with similar feature level fusion, consisting of three electroencephalogram channels, which is in line with the CAP protocol to ensure multiple channels' arousals for CAP detection. Moreover, the two optimized models reached an area under the receiver operating characteristic curve of 0.82, with average accuracy ranging from 77% to 79%, a result which is in the upper range of the specialist agreement. The proposed approach is still in the upper range of the best state of the art works despite a difficult dataset, and has the advantage of providing a fully automatic analysis without requiring any manual procedure. Ultimately, the models revealed to be noise resistant and resilient to multiple channel loss.

表征(2篇)

【1】 Integral representations of shallow neural network with Rectified Power Unit activation function 标题:具有整流幂单元(RePU)激活函数的浅层神经网络的积分表示 链接:https://arxiv.org/abs/2112.11157

作者:Ahmed Abdeljawad,Philipp Grohs 机构:Johann Radon Institute, University of Vienna 备注:22 pages, This is the first version. Some revisions in the near future is expected to be performed. arXiv admin note: text overlap with arXiv:1910.01635 by other authors 摘要:在这项工作中,我们推导了具有整流功率单元激活函数的浅层神经网络的积分表示公式。我们的第一个结果主要涉及RePU浅层网络表示能力的单变量情况。本文的多维结果刻画了可以用有界范数和可能的无界宽度表示的函数集。 摘要:In this effort, we derive a formula for the integral representation of a shallow neural network with the Rectified Power Unit activation function. Mainly, our first result deals with the univariate case of representation capability of RePU shallow networks. The multidimensional result in this paper characterizes the set of functions that can be represented with bounded norm and possibly unbounded width.

【2】 NN2Poly: A polynomial representation for deep feed-forward artificial neural networks 标题:NN2Poly:深度前馈人工神经网络的一种多项式表示 链接:https://arxiv.org/abs/2112.11397

作者:Pablo Morala,Jenny Alexandra Cifuentes,Rosa E. Lillo,Iñaki Ucar 机构:auc,m-Santander Big Data Institute, Universidad Carlos III de Madrid. Spain., Department of Statistics, Universidad Carlos III de Madrid. Spain., ICADE, Department of Quantitative Methods, Administration, Universidad Pontificia Comillas. Spain. 摘要:神经网络的可解释性及其潜在的理论行为仍然是一个开放的研究领域,即使在其实际应用取得巨大成功之后,特别是随着深度学习的出现。在这项工作中,提出了NN2Poly:一种理论方法,允许获得多项式,为已经训练的深层神经网络提供一种替代表示。这扩展了arXiv:2102.03865中提出的仅限于单隐层神经网络的先前想法,以在回归和分类任务中使用任意深度前馈神经网络。本文的目标是通过在每一层对激活函数使用泰勒展开,然后使用允许识别所需多项式系数的若干组合属性来实现的。讨论了实现该理论方法时的主要计算限制,并给出了NN2Poly工作所需的神经网络权重约束示例。最后,给出了一些仿真结果,得出结论,使用NN2Poly可以获得给定神经网络的表示,且获得的预测之间的误差较小。 摘要:Interpretability of neural networks and their underlying theoretical behaviour remain being an open field of study, even after the great success of their practical applications, particularly with the emergence of deep learning. In this work, NN2Poly is proposed: a theoretical approach that allows to obtain polynomials that provide an alternative representation of an already trained deep neural network. This extends the previous idea proposed in arXiv:2102.03865, which was limited to single hidden layer neural networks, to work with arbitrarily deep feed-forward neural networks in both regression and classification tasks. The objective of this paper is achieved by using a Taylor expansion on the activation function, at each layer, and then using several combinatorial properties that allow to identify the coefficients of the desired polynomials. The main computational limitations when implementing this theoretical method are discussed and it is presented an example of the constraints on the neural network weights that are necessary for NN2Poly to work. Finally, some simulations are presented were it is concluded that using NN2Poly it is possible to obtain a representation for the given neural network with low error between the obtained predictions.

3D|3D重建等相关(1篇)

【1】 PONet: Robust 3D Human Pose Estimation via Learning Orientations Only 标题:PONet:仅通过学习方向实现鲁棒的三维人体姿态估计 链接:https://arxiv.org/abs/2112.11153

作者:Jue Wang,Shaoli Huang,Xinchao Wang,Dacheng Tao 摘要:传统的三维人体姿态估计依赖于首先检测二维人体关键点,然后解决二维到三维的对应问题。尽管取得了有希望的结果,但这种学习模式高度依赖于2D关键点检测器的质量,2D关键点检测器不可避免地容易受到遮挡和图像缺失的影响。在本文中,我们提出了一种新的姿态定向网络(PONet),该网络仅通过学习方向就能够可靠地估计3D姿态,从而在没有图像证据的情况下绕过容易出错的关键点检测器。对于具有部分不可见肢体的图像,PONet通过利用局部图像证据来恢复3D姿势来估计这些肢体的3D方向。此外,PONet能够通过利用可见肢体之间的方向相关性来补充估计的姿势,甚至从具有完全不可见肢体的图像推断出完整的3D姿势,从而进一步提高3D姿势估计的鲁棒性。我们在多个数据集上评估我们的方法,包括Human3.6M、MPII、MPI-INF-3DHP和3DPW。在理想设置下,我们的方法取得了与最先进技术相当的结果,同时显著消除了对关键点检测器的依赖及相应的计算负担。在非常具有挑战性的场景中,例如截断和擦除,我们的方法执行非常稳健,并且与最新技术相比产生了更优的结果,显示了其在实际应用中的潜力。 摘要:Conventional 3D human pose estimation relies on first detecting 2D body keypoints and then solving the 2D to 3D correspondence problem. Despite the promising results, this learning paradigm is highly dependent on the quality of the 2D keypoint detector, which is inevitably fragile to occlusions and out-of-image absences. In this paper, we propose a novel Pose Orientation Net (PONet) that is able to robustly estimate 3D pose by learning orientations only, hence bypassing the error-prone keypoint detector in the absence of image evidence. For images with partially invisible limbs, PONet estimates the 3D orientation of these limbs by taking advantage of the local image evidence to recover the 3D pose. Moreover, PONet is competent to infer full 3D poses even from images with completely invisible limbs, by exploiting the orientation correlation between visible limbs to complement the estimated poses, further improving the robustness of 3D pose estimation. We evaluate our method on multiple datasets, including Human3.6M, MPII, MPI-INF-3DHP, and 3DPW. Our method achieves results on par with state-of-the-art techniques in ideal settings, yet significantly eliminates the dependency on keypoint detectors and the corresponding computation burden. In highly challenging scenarios, such as truncation and erasing, our method performs very robustly and yields much superior results as compared to state of the art, demonstrating its potential for real-world applications.

优化|敛散性(3篇)

【1】 Soft Actor-Critic with Cross-Entropy Policy Optimization 标题:基于交叉熵策略优化的软行动者-批评者 链接:https://arxiv.org/abs/2112.11115

作者:Zhenyang Shi,Surya P. N. Singh 机构: University of Queensland 摘要:软参与者批评(SAC)是一种基于最大熵的非策略强化学习(RL)算法。SAC在一系列连续控制任务中表现良好,具有良好的稳定性和鲁棒性。SAC学习一个随机高斯策略,该策略可以最大化总期望报酬和策略熵之间的权衡。为了更新策略,SAC最小化当前策略密度和软值函数密度之间的KL差异。然后使用重新参数化技巧获得该散度的近似梯度。在本文中,我们提出了具有交叉熵策略优化的软行动者批评(SAC-CEPO),它使用交叉熵方法(CEM)来优化SAC的策略网络。最初的想法是使用CEM对最接近软值函数密度的分布进行迭代采样,并使用结果分布作为更新策略网络的目标。为了降低计算复杂度,我们还引入了一种解耦策略结构,该结构将高斯策略解耦为一个学习均值的策略和另一个学习偏差的策略,使得CEM只训练均值策略。我们证明了这种解耦的策略结构确实收敛到一个最优解,并且我们还通过实验证明SAC-CEPO实现了与原始SAC相比的竞争性能。 摘要:Soft Actor-Critic (SAC) is one of the state-of-the-art off-policy reinforcement learning (RL) algorithms that is within the maximum entropy based RL framework. SAC is demonstrated to perform very well in a list of continous control tasks with good stability and robustness. SAC learns a stochastic Gaussian policy that can maximize a trade-off between total expected reward and the policy entropy. To update the policy, SAC minimizes the KL-Divergence between the current policy density and the soft value function density. Reparameterization trick is then used to obtain the approximate gradient of this divergence. In this paper, we propose Soft Actor-Critic with Cross-Entropy Policy Optimization (SAC-CEPO), which uses Cross-Entropy Method (CEM) to optimize the policy network of SAC. The initial idea is to use CEM to iteratively sample the closest distribution towards the soft value function density and uses the resultant distribution as a target to update the policy network. For the purpose of reducing the computational complexity, we also introduce a decoupled policy structure that decouples the Gaussian policy into one policy that learns the mean and one other policy that learns the deviation such that only the mean policy is trained by CEM. We show that this decoupled policy structure does converge to a optimal and we also demonstrate by experiments that SAC-CEPO achieves competitive performance against the original SAC.

【2】 A Theoretical View of Linear Backpropagation and Its Convergence 标题:线性反向传播及其收敛性的理论观点 链接:https://arxiv.org/abs/2112.11018

作者:Ziang Li,Yiwen Guo,Haodi Liu,Changshui Zhang 机构: Zhang are with the Institute for Artificial Intelli-gence, Tsinghua University (THUAI), Department of Automation, TsinghuaUniversity 摘要:反向传播广泛用于计算深度神经网络(DNN)中的梯度。反向传播通常与随机梯度下降(SGD)或其变体一起应用,在包括DNN训练和对抗性攻击/防御在内的各种机器学习任务中,反向传播被认为是一种事实上的选择。最近,Guo等人引入了一种称为LinBP的线性BP,用于生成更多可转移的黑箱对抗攻击的对抗示例。然而,尚未对其进行理论研究,也缺乏这种方法的收敛性分析。本文对郭等人的论文进行了补充和扩展,对涉及对抗性攻击和模型训练等学习任务的神经网络中的LinBP进行了理论分析。我们证明,有点令人惊讶的是,与BP相比,LinBP可以在相同的超参数设置下更快地收敛于这些任务。我们通过大量实验证实了我们的理论结果。 摘要:Backpropagation is widely used for calculating gradients in deep neural networks (DNNs). Applied often along with stochastic gradient descent (SGD) or its variants, backpropagation is considered as a de-facto choice in a variety of machine learning tasks including DNN training and adversarial attack/defense. Recently, a linear variant of BP named LinBP was introduced for generating more transferable adversarial examples for black-box adversarial attacks, by Guo et al. Yet, it has not been theoretically studied and the convergence analysis of such a method is lacking. This paper serves as a complement and somewhat an extension to Guo et al.'s paper, by providing theoretical analyses on LinBP in neural-network-involved learning tasks including adversarial attack and model training. We demonstrate that, somewhat surprisingly, LinBP can lead to faster convergence in these tasks in the same hyper-parameter settings, compared to BP. We confirm our theoretical results with extensive experiments.
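按摘要的描述,LinBP是BP的线性变体;通常的理解是前向仍使用ReLU等非线性,反向传播时则对其做线性化(直接透传梯度)。下面用PyTorch自定义autograd函数给出一个最小示意(仅演示这一梯度替换的思路,函数名为假设,并非原论文或本文的完整实现):

```python
import torch

class LinearBackpropReLU(torch.autograd.Function):
    """前向为ReLU,反向直接透传梯度(LinBP式线性化的示意)。"""
    @staticmethod
    def forward(ctx, x):
        return torch.clamp(x, min=0.0)
    @staticmethod
    def backward(ctx, grad_output):
        return grad_output            # 不再乘以ReLU的0/1掩码

x = torch.randn(5, requires_grad=True)
y = LinearBackpropReLU.apply(x).sum()
y.backward()
print(x.grad)      # 全为1:负值输入处的梯度也被保留
```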

【3】 Nearly Optimal Policy Optimization with Stable at Any Time Guarantee 标题:具有随时稳定保证的近最优策略优化 链接:https://arxiv.org/abs/2112.10935

作者:Tianhao Wu,Yunchang Yang,Han Zhong,Liwei Wang,Simon S. Du,Jiantao Jiao 机构:University of California, Berkeley, Center for Data Science, Peking University, University of Washington, Key Laboratory of Machine Perception (MOE), School of EECS, Peking University 摘要:策略优化方法是强化学习(RL)算法中应用最广泛的一类。然而,对这些方法的理论理解仍然不够。即使在幕式(时间不均匀)表格环境中,基于策略的方法在Shani等人(2020)中的最新理论结果也只有$\tilde{O}(\sqrt{S^2AH^4K})$,其中$S$是状态数,$A$是行动数,$H$是时间跨度(horizon),$K$是幕数,与信息论下限$\tilde{\Omega}(\sqrt{SAH^3K})$相比,存在$\sqrt{SH}$差距。为了弥补这一差距,我们提出了一种新的基于参考的、具有随时稳定保证的策略优化算法(RPO-SAT),其特性为"随时稳定"。我们证明了我们的算法实现了$\tilde{O}(\sqrt{SAH^3K}+\sqrt{AH^4})$遗憾。当$S>H$时,忽略对数因子,我们的算法是极小极大最优的。据我们所知,RPO-SAT是第一个计算效率高、近似极小极大最优的基于策略的表格RL算法。 摘要:Policy optimization methods are one of the most widely used classes of Reinforcement Learning (RL) algorithms. However, theoretical understanding of these methods remains insufficient. Even in the episodic (time-inhomogeneous) tabular setting, the state-of-the-art theoretical result of policy-based method in Shani et al. (2020) is only $\tilde{O}(\sqrt{S^2AH^4K})$ where $S$ is the number of states, $A$ is the number of actions, $H$ is the horizon, and $K$ is the number of episodes, and there is a $\sqrt{SH}$ gap compared with the information theoretic lower bound $\tilde{\Omega}(\sqrt{SAH^3K})$. To bridge such a gap, we propose a novel algorithm Reference-based Policy Optimization with Stable at Any Time guarantee (RPO-SAT), which features the property "Stable at Any Time". We prove that our algorithm achieves $\tilde{O}(\sqrt{SAH^3K} + \sqrt{AH^4})$ regret. When $S > H$, our algorithm is minimax optimal when ignoring logarithmic factors. To our best knowledge, RPO-SAT is the first computationally efficient, nearly minimax optimal policy-based algorithm for tabular RL.

预测|估计(3篇)

【1】 Neural network guided adjoint computations in dual weighted residual error estimation 标题:双重加权残差估计中神经网络引导的伴随计算 链接:https://arxiv.org/abs/2112.11360

作者:Ayan Chakraborty,Thomas Wick,Xiaoying Zhuang,Timon Rabczuk 机构:Division of Computational Mechanics, Ton Duc Thang University, Ho Chi Minh City, Vietnam, Department of Geotechnical Engineering,College of Civil Engineering, Tongji University, Shanghai, China 摘要:深度学习已成功应用于视觉识别和某些人工智能任务。深度学习也被认为是一个强大的工具,具有高度的灵活性来近似函数。在本工作中,设计了具有期望性质的函数来近似偏微分方程的解。我们的方法是基于后验误差估计,在后验误差估计中,伴随问题被解决用于误差定位,从而在神经网络的框架内形成误差估计。提出了一种利用对偶加权残值法获得多目标泛函后验误差估计的高效且易于实现的算法,然后利用神经网络计算原始解和伴随解。目前的研究表明,即使训练数据相对较少,这种基于数据驱动模型的学习也能更好地近似感兴趣的数量。新算法的发展得到了数值试验实例的证实。论证了使用深度神经网络优于浅层神经网络的优点,并给出了收敛增强技术 摘要:Deep learning has shown successful application in visual recognition and certain artificial intelligence tasks. Deep learning is also considered as a powerful tool with high flexibility to approximate functions. In the present work, functions with desired properties are devised to approximate the solutions of PDEs. Our approach is based on a posteriori error estimation in which the adjoint problem is solved for the error localization to formulate an error estimator within the framework of neural network. An efficient and easy to implement algorithm is developed to obtain a posteriori error estimate for multiple goal functionals by employing the dual-weighted residual approach, which is followed by the computation of both primal and adjoint solutions using the neural network. The present study shows that such a data-driven model based learning has superior approximation of quantities of interest even with relatively less training data. The novel algorithmic developments are substantiated with numerical test examples. The advantages of using deep neural network over the shallow neural network are demonstrated and the convergence enhancing techniques are also presented

【2】 AutoCTS: Automated Correlated Time Series Forecasting -- Extended Version 标题:AutoCTS:自动相关时间序列预测--扩展版 链接:https://arxiv.org/abs/2112.11174

作者:Xinle Wu,Dalin Zhang,Chenjuan Guo,Chaoyang He,Bin Yang,Christian S. Jensen 机构:Aalborg University, Denmark, University of Southern California, USA 备注:to appear in PVLDB 2022 摘要:相关时间序列(CTS)预测在许多网络物理系统中起着至关重要的作用,在这些系统中,多个传感器发出时间序列来捕获相互关联的过程。基于深度学习的解决方案提供最先进的CTS预测性能,采用各种时空(ST)块,能够对时间序列之间的时间依赖性和空间相关性进行建模。然而,仍然存在两个挑战。首先,ST块是手工设计的,这既耗时又昂贵。其次,现有的预测模型只是将相同的ST块多次叠加,这限制了模型的潜力。为了应对这些挑战,我们提出了能够自动识别高度竞争的ST块以及使用不同拓扑连接的异构ST块预测模型的AutoCT,而不是使用简单堆叠连接的相同ST块。具体而言,我们设计了一个微观和宏观搜索空间来模拟ST块的可能结构以及异构ST块之间的连接,并提供了一种搜索策略,能够联合探索搜索空间以确定最佳预测模型。在八个常用CTS预测基准数据集上进行的大量实验证明了我们的设计选择是正确的,并证明AutoCTS能够自动发现性能优于最先进的人类设计模型的预测模型。这是“自动相关时间序列预测”的扩展版本,将出现在PVLDB 2022中。 摘要:Correlated time series (CTS) forecasting plays an essential role in many cyber-physical systems, where multiple sensors emit time series that capture interconnected processes. Solutions based on deep learning that deliver state-of-the-art CTS forecasting performance employ a variety of spatio-temporal (ST) blocks that are able to model temporal dependencies and spatial correlations among time series. However, two challenges remain. First, ST-blocks are designed manually, which is time consuming and costly. Second, existing forecasting models simply stack the same ST-blocks multiple times, which limits the model potential. To address these challenges, we propose AutoCTS that is able to automatically identify highly competitive ST-blocks as well as forecasting models with heterogeneous ST-blocks connected using diverse topologies, as opposed to the same ST-blocks connected using simple stacking. Specifically, we design both a micro and a macro search space to model possible architectures of ST-blocks and the connections among heterogeneous ST-blocks, and we provide a search strategy that is able to jointly explore the search spaces to identify optimal forecasting models. Extensive experiments on eight commonly used CTS forecasting benchmark datasets justify our design choices and demonstrate that AutoCTS is capable of automatically discovering forecasting models that outperform state-of-the-art human-designed models. This is an extended version of ``AutoCTS: Automated Correlated Time Series Forecasting'', to appear in PVLDB 2022.

【3】 RetroComposer: Discovering Novel Reactions by Composing Templates for Retrosynthesis Prediction 标题:RetroComposer:通过合成用于逆向合成预测的模板来发现新的反应 链接:https://arxiv.org/abs/2112.11225

作者:Chaochao Yan,Peilin Zhao,Chan Lu,Yang Yu,Junzhou Huang 机构:University of Texas at Arlington , Tencent AI Lab 备注:10 pages 摘要:逆合成的主要目标是递归地将所需的分子分解成可用的构建块。现有的基于模板的反转录合成方法遵循模板选择模式,并且训练模板有限,这阻碍了它们发现新的反应。为了克服这一局限性,我们提出了一种创新的反向合成预测框架,它可以在训练模板的基础上合成新的模板。据我们所知,这是第一种可以找到新的模板进行逆转录合成预测的方法。此外,我们还提出了一个有效的反应物候选评分模型,该模型能够捕获原子级的变换信息,并且有助于我们的方法大大优于现有的方法。实验结果表明,我们的方法可以为USPTO-50K数据集中的328个测试反应生成新的模板,包括21个未被训练模板覆盖的测试反应。 摘要:The main target of retrosynthesis is to recursively decompose desired molecules into available building blocks. Existing template-based retrosynthesis methods follow a template selection stereotype and suffer from the limited training templates, which prevents them from discovering novel reactions. To overcome the limitation, we propose an innovative retrosynthesis prediction framework that can compose novel templates beyond training templates. So far as we know, this is the first method that can find novel templates for retrosynthesis prediction. Besides, we propose an effective reactant candidates scoring model that can capture atom-level transformation information, and it helps our method outperform existing methods by a large margin. Experimental results show that our method can produce novel templates for 328 test reactions in the USPTO-50K dataset, including 21 test reactions that are not covered by the training templates.

其他神经网络|深度学习|模型|建模(31篇)

【1】 Max-Margin Contrastive Learning 标题:最大裕度对比学习 链接:https://arxiv.org/abs/2112.11450

作者:Anshul Shah,Suvrit Sra,Rama Chellappa,Anoop Cherian 机构:Johns Hopkins University, Baltimore, MD, Massachusetts Institute of Technology, Cambridge, MA, Mitsubishi Electric Research Labs, Cambridge, MA 备注:Accepted at AAAI 2022 摘要:标准的对比学习方法通常需要大量的负样本才能实现有效的无监督学习,并且往往表现出缓慢的收敛性。我们怀疑这种行为是由于选择了次优的负样本来与正样本形成对比。我们从支持向量机(SVM)中汲取灵感,提出了最大边际对比学习(MMCL),从而克服了这一困难。我们的方法通过二次优化问题将负样本选为稀疏支持向量,并通过最大化决策裕度来增强对比性。由于支持向量机优化的计算代价可能较高,特别是在端到端的环境中,我们提出了简化方法,以减轻计算负担。我们在标准的视觉基准数据集上验证了我们的方法,证明了与最新技术相比,我们在无监督表示学习方面有更好的性能,同时具有更好的经验收敛特性。 摘要:Standard contrastive learning approaches usually require a large number of negatives for effective unsupervised learning and often exhibit slow convergence. We suspect this behavior is due to the suboptimal selection of negatives used for offering contrast to the positives. We counter this difficulty by taking inspiration from support vector machines (SVMs) to present max-margin contrastive learning (MMCL). Our approach selects negatives as the sparse support vectors obtained via a quadratic optimization problem, and contrastiveness is enforced by maximizing the decision margin. As SVM optimization can be computationally demanding, especially in an end-to-end setting, we present simplifications that alleviate the computational burden. We validate our approach on standard vision benchmark datasets, demonstrating better performance in unsupervised representation learning over state-of-the-art, while having better empirical convergence properties.

【2】 Machine Learning Emulation of Urban Land Surface Processes 标题:城市陆面过程的机器学习仿真 链接:https://arxiv.org/abs/2112.11429

作者:David Meyer,Sue Grimmond,Peter Dueben,Robin Hogan,Maarten van Reeuwijk 机构:Department of Meteorology, University of Reading, Reading, UK, Department of Civil and Environmental Engineering, Imperial College London, London, UK, European Centre for Medium-Range Weather Forecasts, Reading, UK, Key Points:, • 摘要:我们能用机器学习(ML)改进城市地表过程的建模吗?先前对城市地表模型(ULSM)的比较发现,没有一个模型能够“最佳”预测所有的共同地表通量。在这里,我们开发了一个城市神经网络(UNN),该网络根据一个地点22个超低硫柴油的平均预测通量进行训练。UNN精确模拟ULSM的平均输出。与参考ULSM(城镇能量平衡;TEB)相比,UNN相对于通量观测具有更高的精度、更少的计算成本,并且需要更少的输入参数。当与使用TensorFlow绑定的天气研究预测(WRF)模型耦合时,WRF-UNN比参考WRF-TEB更稳定、更准确。虽然目前应用受到训练数据(1个站点)的限制,但我们展示了一种新的方法,通过使用ML将多个ULSM的强度组合成一个来改进表面通量的建模。 摘要:Can we improve the modeling of urban land surface processes with machine learning (ML)? A prior comparison of urban land surface models (ULSMs) found that no single model is 'best' at predicting all common surface fluxes. Here, we develop an urban neural network (UNN) trained on the mean predicted fluxes from 22 ULSMs at one site. The UNN emulates the mean output of ULSMs accurately. When compared to a reference ULSM (Town Energy Balance; TEB), the UNN has greater accuracy relative to flux observations, less computational cost, and requires fewer input parameters. When coupled to the Weather Research Forecasting (WRF) model using TensorFlow bindings, WRF-UNN is stable and more accurate than the reference WRF-TEB. Although the application is currently constrained by the training data (1 site), we show a novel approach to improve the modeling of surface fluxes by combining the strengths of several ULSMs into one using ML.

【3】 Deep Learning and Earth Observation to Support the Sustainable Development Goals 标题:支持可持续发展目标的深度学习和地球观测 链接:https://arxiv.org/abs/2112.11367

作者:Claudio Persello,Jan Dirk Wegner,Ronny Hänsch,Devis Tuia,Pedram Ghamisi,Mila Koeva,Gustau Camps-Valls 机构:This is the pre-acceptance version of the paper. The final, version will appear in the IEEE Geoscience and Remote, Sensing Magazine. 摘要:深度学习模型与地球观测的协同结合有望在支持可持续发展目标(SDG)方面取得重大进展。新的发展和大量的应用已经改变了人类面对地球生存挑战的方式。本文回顾了当前地球观测数据的深度学习方法,以及它们在监测和实现受地球观测深度学习快速发展影响最大的可持续发展目标方面的应用。我们系统地审查案例研究,以1)实现零饥饿,2)可持续城市,3)提供土地保有权保障,4)缓解和适应气候变化,以及5)保护生物多样性。涉及重要的社会、经济和环境影响。令人兴奋的时刻即将到来,算法和地球数据将帮助我们努力解决气候危机,支持更可持续的发展。 摘要:The synergistic combination of deep learning models and Earth observation promises significant advances to support the sustainable development goals (SDGs). New developments and a plethora of applications are already changing the way humanity will face the living planet challenges. This paper reviews current deep learning approaches for Earth observation data, along with their application towards monitoring and achieving the SDGs most impacted by the rapid development of deep learning in Earth observation. We systematically review case studies to 1) achieve zero hunger, 2) sustainable cities, 3) deliver tenure security, 4) mitigate and adapt to climate change, and 5) preserve biodiversity. Important societal, economic and environmental implications are concerned. Exciting times ahead are coming where algorithms and Earth data can help in our endeavor to address the climate crisis and support more sustainable development.

【4】 PrimSeq: a deep learning-based pipeline to quantitate rehabilitation training 标题:PrimSeq:一种基于深度学习的康复训练量化途径 链接:https://arxiv.org/abs/2112.11330

作者:Avinash Parnandi,Aakash Kaku,Anita Venkatesan,Natasha Pandit,Audre Wirtanen,Haresh Rajamohan,Kannan Venkataramanan,Dawn Nilsen,Carlos Fernandez-Granda,Heidi Schambra 机构:Affiliations:, Department of Neurology, New York University Langone Health, New York, USA, Center for Data Science, New York University, New York, USA, Department of Rehabilitation and Regenerative Medicine, Columbia University, New York 摘要:中风康复旨在通过反复练习功能性运动来增加神经可塑性,但由于重复次数不足,对恢复的影响可能最小。目前还不知道最佳训练内容和数量,因为没有实用的工具来衡量这些内容和数量。在这里,我们介绍PrimSeq,这是一个对中风康复训练中的功能性运动进行分类和计数的管道。我们的方法集成了可穿戴传感器以捕捉上半身运动、预测运动序列的深度学习模型以及记录运动的算法。训练后的模型准确地将康复活动分解为部件功能运动,优于竞争性机器学习方法。PrimSeq进一步量化了这些运动,其时间和人力成本仅为人类专家的一小部分。我们展示了PrimSeq在先前未发现的患有一系列上肢运动障碍的中风患者中的功能。我们期望这些进展将支持中风康复定量给药试验所需的严格测量。 摘要:Stroke rehabilitation seeks to increase neuroplasticity through the repeated practice of functional motions, but may have minimal impact on recovery because of insufficient repetitions. The optimal training content and quantity are currently unknown because no practical tools exist to measure them. Here, we present PrimSeq, a pipeline to classify and count functional motions trained in stroke rehabilitation. Our approach integrates wearable sensors to capture upper-body motion, a deep learning model to predict motion sequences, and an algorithm to tally motions. The trained model accurately decomposes rehabilitation activities into component functional motions, outperforming competitive machine learning methods. PrimSeq furthermore quantifies these motions at a fraction of the time and labor costs of human experts. We demonstrate the capabilities of PrimSeq in previously unseen stroke patients with a range of upper extremity motor impairment. We expect that these advances will support the rigorous measurement required for quantitative dosing trials in stroke rehabilitation.

【5】 Accurate online training of dynamical spiking neural networks through Forward Propagation Through Time 标题:通过时间前向传播实现动态尖峰神经网络的精确在线训练 链接:https://arxiv.org/abs/2112.11231

作者:Bojian Yin,Federico Corradi,Sander M. Bohte 机构:CWI, IMEC, Sander M. Bohté 备注:12 pages, 4 figures 摘要:大脑中尖峰神经元之间的事件驱动和稀疏通信特性为灵活高效的人工智能带来了巨大的希望。学习算法的最新进展表明,与标准的递归神经网络相比,尖峰神经元的递归网络可以有效地训练,以获得具有竞争力的性能。尽管如此,由于这些学习算法使用时间误差反向传播(BPTT),它们的内存需求很高,训练速度很慢,并且与在线学习不兼容。这限制了这些学习算法在相对较小的网络和有限的时间序列长度上的应用。已经提出了具有较低计算和内存复杂度的BPTT在线近似(e-prop,OSTL),但在实践中也受到内存限制,并且作为近似,其性能并不优于标准BPTT训练。在这里,我们展示了一种新开发的BPTT替代方案,即通过时间的前向传播(FPTT)如何应用于尖峰神经网络。与BPTT不同,FPTT试图最大限度地降低持续的动态规范化损失风险。因此,FPTT可以在线计算,并且相对于序列长度具有固定的复杂性。结合一种新的动态尖峰神经元模型——液体时间常数神经元,我们证明了用FPTT训练的SNN优于在线BPTT近似,在时间分类任务上接近或超过离线BPTT精度。因此,这种方法可以在长序列上以对记忆友好的在线方式训练SNN,并将SNN扩展到新颖复杂的神经结构。 摘要:The event-driven and sparse nature of communication between spiking neurons in the brain holds great promise for flexible and energy-efficient AI. Recent advances in learning algorithms have demonstrated that recurrent networks of spiking neurons can be effectively trained to achieve competitive performance compared to standard recurrent neural networks. Still, as these learning algorithms use error-backpropagation through time (BPTT), they suffer from high memory requirements, are slow to train, and are incompatible with online learning. This limits the application of these learning algorithms to relatively small networks and to limited temporal sequence lengths. Online approximations to BPTT with lower computational and memory complexity have been proposed (e-prop, OSTL), but in practice also suffer from memory limitations and, as approximations, do not outperform standard BPTT training. Here, we show how a recently developed alternative to BPTT, Forward Propagation Through Time (FPTT) can be applied in spiking neural networks. Different from BPTT, FPTT attempts to minimize an ongoing dynamically regularized risk on the loss. As a result, FPTT can be computed in an online fashion and has fixed complexity with respect to the sequence length. When combined with a novel dynamic spiking neuron model, the Liquid-Time-Constant neuron, we show that SNNs trained with FPTT outperform online BPTT approximations, and approach or exceed offline BPTT accuracy on temporal classification tasks. This approach thus makes it feasible to train SNNs in a memory-friendly online fashion on long sequences and scale up SNNs to novel and complex neural architectures.

【6】 Energy-bounded Learning for Robust Models of Code 标题:代码健壮模型的能量受限学习 链接:https://arxiv.org/abs/2112.11226

作者:Nghi D. Q. Bui,Yijun Yu 机构:Trustworthiness Lab, Huawei Ireland Research Centre 备注:arXiv admin note: text overlap with arXiv:2010.03759 by other authors 摘要:在编程中,学习代码表示有多种应用,包括代码分类、代码搜索、注释生成、错误预测等。已经提出了以标记、语法树、依赖关系图、代码导航路径或其变体的组合表示代码的各种方法,但是,现有的普通学习技术在鲁棒性方面有一个主要限制,即,当输入以微妙的方式改变时,模型很容易做出错误的预测。为了增强鲁棒性,现有的方法侧重于识别对抗性样本,而不是特定分布之外的有效样本,我们称之为分布外(out-of-distribution,OOD)样本。识别此类OOD样本是本文研究的新问题。为此,我们建议首先使用分布外样本扩充分布内数据集,以便在一起训练时,它们将增强模型的鲁棒性。我们建议使用能量有界学习目标函数为分布内样本分配较高的分数,为分布外样本分配较低的分数,以便将这种分布外样本纳入源代码模型的训练过程。在OOD检测和对抗性样本检测方面,我们的评估结果表明,现有源代码模型具有更强的鲁棒性,能够更准确地识别OOD数据,同时更能抵抗对抗性攻击。此外,所提出的能量有界分数大大优于所有现有的OOD检测分数,包括softmax置信分数、Mahalanobis分数和ODIN。 摘要:In programming, learning code representations has a variety of applications, including code classification, code search, comment generation, bug prediction, and so on. Various representations of code in terms of tokens, syntax trees, dependency graphs, code navigation paths, or a combination of their variants have been proposed, however, existing vanilla learning techniques have a major limitation in robustness, i.e., it is easy for the models to make incorrect predictions when the inputs are altered in a subtle way. To enhance the robustness, existing approaches focus on recognizing adversarial samples rather than on the valid samples that fall outside a given distribution, which we refer to as out-of-distribution (OOD) samples. Recognizing such OOD samples is the novel problem investigated in this paper. To this end, we propose to first augment the in-distribution datasets with out-of-distribution samples such that, when trained together, they will enhance the model's robustness. We propose the use of an energy-bounded learning objective function to assign a higher score to in-distribution samples and a lower score to out-of-distribution samples in order to incorporate such out-of-distribution samples into the training process of source code models. In terms of OOD detection and adversarial samples detection, our evaluation results demonstrate a greater robustness for existing source code models to become more accurate at recognizing OOD data while being more resistant to adversarial attacks at the same time. Furthermore, the proposed energy-bounded score outperforms all existing OOD detection scores by a large margin, including the softmax confidence score, the Mahalanobis score, and ODIN.
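摘要所说的"能量有界学习目标"与基于能量的OOD检测中的常见做法类似:以负的logsumexp(logits)作为能量,并用铰链项约束分布内样本能量更低、分布外样本能量更高(若以负能量作为"分数",即分布内样本分数更高)。下面的PyTorch片段只是按这一思路写的示意,边界m_in、m_out与权重lam均为假设值,并非论文原始实现:

```python
import torch
import torch.nn.functional as F

def energy_score(logits):
    """能量:E(x) = -logsumexp(logits);能量越低表示越像分布内样本。"""
    return -torch.logsumexp(logits, dim=1)

def energy_bounded_loss(logits_in, labels_in, logits_ood,
                        m_in=-25.0, m_out=-7.0, lam=0.1):
    ce = F.cross_entropy(logits_in, labels_in)
    e_in, e_out = energy_score(logits_in), energy_score(logits_ood)
    # 分布内能量高于 m_in 时惩罚;分布外能量低于 m_out 时惩罚
    reg = (F.relu(e_in - m_in) ** 2).mean() + (F.relu(m_out - e_out) ** 2).mean()
    return ce + lam * reg

logits_in = torch.randn(16, 10)
logits_ood = torch.randn(16, 10)
labels = torch.randint(0, 10, (16,))
print(float(energy_bounded_loss(logits_in, labels, logits_ood)))
```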

【7】 Interpretable Knowledge Tracing: Simple and Efficient Student Modeling with Causal Relations 标题:可解释知识追踪:具有因果关系的简单有效的学生建模 链接:https://arxiv.org/abs/2112.11209

作者:Sein Minn,Jill-Jenn Vie,Koh Takeuchi,Hisashi Kashima,Feida Zhu 机构: Univ. Lille, Inria, CNRS, Centrale Lille, UMR , - CRIStAL, F-, Lille, France, Universit´e Paris-Saclay, Inria, CEA, Palaiseau, France, Kyoto University, Japan, Singapore Management University, Singapore 备注:AAAI Symposium on Educational Advances in Artificial Intelligence EAAI-22. arXiv admin note: text overlap with arXiv:2012.12218 摘要:智能教学系统在未来的学习环境中变得至关重要。知识追踪(KT)是该系统的关键部分。它是关于推断学生的技能掌握情况并预测他们的表现,从而相应地调整课程。与传统模型相比,基于深度学习的KT模型具有显著的预测性能。然而,很难从神经网络中成千上万个与认知理论相关的参数中提取出有心理学意义的解释。有几种方法可以实现学生成绩预测的高准确性,但诊断和预测推理在学习科学中更为关键。由于KT问题几乎没有可观察的特征(问题ID和学生在每次练习中的正确性),我们使用机器学习和数据挖掘技术从学生的反应数据中提取有意义的潜在特征。在这项工作中,我们提出了可解释知识追踪(IKT),这是一个简单的模型,依赖于三个有意义的潜在特征:个人技能掌握、能力概况(跨技能学习迁移)和问题难度。IKT对未来学生表现的预测是使用树增强朴素贝叶斯分类器(TAN)进行的,因此其预测比基于深度学习的学生模型更容易解释。IKT也显示出比基于深度学习的学生模型更好的学生成绩预测,而不需要大量的参数。我们对每个特征进行消融研究,以检查它们对学生成绩预测的贡献。因此,IKT在现实教育系统中提供具有因果推理的自适应个性化教学方面具有巨大潜力。 摘要:Intelligent Tutoring Systems have become critically important in future learning environments. Knowledge Tracing (KT) is a crucial part of that system. It is about inferring the skill mastery of students and predicting their performance to adjust the curriculum accordingly. Deep Learning-based KT models have shown significant predictive performance compared with traditional models. However, it is difficult to extract psychologically meaningful explanations from the tens of thousands of parameters in neural networks, that would relate to cognitive theory. There are several ways to achieve high accuracy in student performance prediction but diagnostic and prognostic reasoning is more critical in learning sciences. Since KT problem has few observable features (problem ID and student's correctness at each practice), we extract meaningful latent features from students' response data by using machine learning and data mining techniques. In this work, we present Interpretable Knowledge Tracing (IKT), a simple model that relies on three meaningful latent features: individual skill mastery, ability profile (learning transfer across skills), and problem difficulty. IKT's prediction of future student performance is made using a Tree-Augmented Naive Bayes Classifier (TAN), therefore its predictions are easier to explain than deep learning-based student models. IKT also shows better student performance prediction than deep learning-based student models without requiring a huge amount of parameters. We conduct ablation studies on each feature to examine their contribution to student performance prediction. Thus, IKT has great potential for providing adaptive and personalized instructions with causal reasoning in real-world educational systems.

【8】 Developing and Validating Semi-Markov Occupancy Generative Models: A Technical Report 标题:半马尔可夫占用生成模型的开发与验证:技术报告 链接:https://arxiv.org/abs/2112.11111

作者:Soumya Kundu,Saptarshi Bhattacharya,Himanshu Sharma,Veronica Adetola 机构:Prepared for the U.S. Department of Energy, Under contract DE-AC,-,RL, arXiv:,.,v, [cs.LG] , Dec 摘要:本报告记录了太平洋西北国家实验室(PNNL)在开发和验证商业建筑随机占用模型方面的最新技术工作,作为美国能源部(DOE)建筑技术办公室(BTO)传感器影响评估和验证项目的一部分。在本报告中,我们介绍了我们在开发和验证非齐次半马尔可夫链模型方面的工作,该模型用于生成商业建筑中区域级别的占用存在和占用计数序列。真实数据集用于学习和验证生成占用模型。使用标准化Jensen-Shannon距离(NJSD)等相关度量来证明模型能够表达真实的占用行为模式。 摘要:This report documents recent technical work on developing and validating stochastic occupancy models in commercial buildings, performed by the Pacific Northwest National Laboratory (PNNL) as part of the Sensor Impact Evaluation and Verification project under the U.S. Department of Energy (DOE) Building Technologies Office (BTO). In this report, we present our work on developing and validating inhomogeneous semi-Markov chain models for generating sequences of zone-level occupancy presence and occupancy counts in a commercial building. Real datasets are used to learn and validate the generative occupancy models. Relevant metrics such as normalized Jensen-Shannon distance (NJSD) are used to demonstrate the ability of the models to express realistic occupancy behavioral patterns.

【9】 Aerial Base Station Positioning and Power Control for Securing Communications: A Deep Q-Network Approach 标题:用于保密通信的空中基站定位和功率控制:一种深度Q网络方法 链接:https://arxiv.org/abs/2112.11090

作者:Aly Sabri Abdalla,Ali Behfarnia,Vuk Marojevic 机构:∗Department of Electrical and Computer Engineering, Mississippi State University, MS , USA, †Department of Engineering, University of Tennessee at Martin, TN, USA 备注:This article has been accepted for publication in the IEEE Wireless Communications and Networking Conference 摘要:无人机(UAV)是支持包括通信在内的多种服务的技术突破之一。无人机将在增强无线网络物理层安全方面发挥关键作用。本文定义了地面用户与作为空中基站(ABS)的无人机之间链路的窃听问题。提出了增强学习算法Q-learning和深度Q网络(DQN)来优化ABS的位置和传输功率,以提高地面用户的数据传输速率。这在系统不知道窃听者位置的情况下增加了保密能力。仿真结果表明,与Q-学习和基线方法相比,该算法收敛速度快,保密能力高。 摘要:The unmanned aerial vehicle (UAV) is one of the technological breakthroughs that supports a variety of services, including communications. UAV will play a critical role in enhancing the physical layer security of wireless networks. This paper defines the problem of eavesdropping on the link between the ground user and the UAV, which serves as an aerial base station (ABS). The reinforcement learning algorithms Q-learning and deep Q-network (DQN) are proposed for optimizing the position of the ABS and the transmission power to enhance the data rate of the ground user. This increases the secrecy capacity without the system knowing the location of the eavesdropper. Simulation results show fast convergence and the highest secrecy capacity of the proposed DQN compared to Q-learning and baseline approaches.
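下面给出一个通用的 DQN 更新步骤示意,用来说明摘要中"以 DQN 联合优化空中基站位置与发射功率"所依赖的基本机制;状态维度、动作离散化方式(移动方向与功率等级的组合)与网络结构均为假设,并非论文实现。

```python
import random
import torch
import torch.nn as nn

class QNet(nn.Module):
    def __init__(self, state_dim, n_actions):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(state_dim, 128), nn.ReLU(),
                                 nn.Linear(128, n_actions))
    def forward(self, s):
        return self.net(s)

def dqn_update(q, q_target, optimizer, batch, gamma=0.99):
    s, a, r, s_next, done = batch  # 来自经验回放的张量批次
    q_sa = q(s).gather(1, a.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        # TD 目标由目标网络给出,终止状态不再引导
        target = r + gamma * (1 - done) * q_target(s_next).max(dim=1).values
    loss = nn.functional.mse_loss(q_sa, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

def epsilon_greedy(q, s, n_actions, eps=0.1):
    # 动作可离散化为"基站移动方向 x 发射功率等级"的组合
    if random.random() < eps:
        return random.randrange(n_actions)
    return int(q(s.unsqueeze(0)).argmax(dim=1))
```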

【10】 Explanation of Machine Learning Models Using Shapley Additive Explanation and Application for Real Data in Hospital 标题:基于Shapley加性解释的机器学习模型解释及其在医院真实数据中的应用 链接:https://arxiv.org/abs/2112.11071

作者:Yasunobu Nohara,Koutarou Matsumoto,Hidehisa Soejima,Naoki Nakashima 机构:Nakashima, Kumamoto University, Kumamoto, JAPAN, Kurume University, Fukuoka, JAPAN, Saiseikai Kumamoto Hospital, Kumamoto, JAPAN, Kyushu University Hospital, Fukuoka, JAPAN 备注:Computer Methods and Programs in Biomedicine, Vol. 214, Article 106584 摘要:在决策过程中使用机器学习技术时,模型的可解释性非常重要。在本论文中,我们采用Shapley加法解释(SHAP),该解释基于多个利益相关者之间的公平利润分配,取决于他们的贡献,用于解释使用医院数据的梯度提升决策树模型。为了更好的解释性,我们提出了以下两种新技术:(1)使用SHAP的新特征重要性度量;(2)称为特征打包的技术,该技术将多个相似特征打包为一个分组特征,以便在不重建模型的情况下更容易理解模型。然后,我们比较了SHAP框架和现有方法之间的解释结果。此外,我们利用我们的医院数据和建议的技术,展示了A/G比率如何作为脑梗死的一个重要预后因素。 摘要:When using machine learning techniques in decision-making processes, the interpretability of the models is important. In the present paper, we adopted the Shapley additive explanation (SHAP), which is based on fair profit allocation among many stakeholders depending on their contribution, for interpreting a gradient-boosting decision tree model using hospital data. For better interpretability, we propose two novel techniques as follows: (1) a new metric of feature importance using SHAP and (2) a technique termed feature packing, which packs multiple similar features into one grouped feature to allow an easier understanding of the model without reconstruction of the model. We then compared the explanation results between the SHAP framework and existing methods. In addition, we showed how the A/G ratio works as an important prognostic factor for cerebral infarction using our hospital data and proposed techniques.
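下面用 shap 库给出计算梯度提升树模型 SHAP 值、并按"特征打包"思路把同组特征贡献合并的示意代码;数据为随机生成,分组方式与此处采用的平均绝对SHAP值度量均为假设,论文提出的新特征重要性度量细节请以原文为准。

```python
import numpy as np
import shap
from sklearn.ensemble import GradientBoostingClassifier

# 假设 X, y 为表格型医疗数据(示例中随机生成)
X = np.random.rand(200, 6)
y = (X[:, 0] + X[:, 1] > 1).astype(int)
model = GradientBoostingClassifier(n_estimators=50).fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)          # 形状: (样本数, 特征数)

# "特征打包"示意:把相似特征(如第0、1列)的SHAP贡献合并为一个分组特征
groups = {"packed_0_1": [0, 1], "f2": [2], "f3": [3], "f4": [4], "f5": [5]}
packed = {name: np.abs(shap_values[:, idx].sum(axis=1)).mean()
          for name, idx in groups.items()}
print(packed)  # 每个(分组)特征的平均绝对SHAP值,可作为一种重要性度量
```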

【11】 Distributed Machine Learning and the Semblance of Trust 标题:分布式机器学习与表面上的信任 链接:https://arxiv.org/abs/2112.11040

作者:Dmitrii Usynin,Alexander Ziller,Daniel Rueckert,Jonathan Passerat-Palmbach,Georgios Kaissis 机构:Department of Computing, Imperial College London, Institute for Artificial Intelligence and Informatics in Medicine, Technical University of Munich, Institute of Diagnostic and Interventional Radiology, Technical University of Munich 备注:Accepted at The Third AAAI Workshop on Privacy-Preserving Artificial Intelligence 摘要:为了促进对许多有意义问题的科学洞察,需要大规模利用大型和多样化的机器学习(ML)数据集。然而,由于GDPR等数据治理法规以及道德问题,个人和敏感数据的聚合存在问题,这促使开发了分布式ML(DML)等替代策略。联邦学习(FL)等技术允许数据所有者维护数据治理并在本地执行模型训练,而无需共享数据。FL和相关技术通常被描述为隐私保护。我们解释了为什么这个术语不合适,并概述了与过度依赖未考虑隐私正式定义的协议相关的风险。我们进一步提供建议和示例,说明如何增强此类算法,从而为一般ML受众提供治理、安全、隐私和可验证性保证,而无需事先接触正式的隐私技术。 摘要:The utilisation of large and diverse datasets for machine learning (ML) at scale is required to promote scientific insight into many meaningful problems. However, due to data governance regulations such as GDPR as well as ethical concerns, the aggregation of personal and sensitive data is problematic, which prompted the development of alternative strategies such as distributed ML (DML). Techniques such as Federated Learning (FL) allow the data owner to maintain data governance and perform model training locally without having to share their data. FL and related techniques are often described as privacy-preserving. We explain why this term is not appropriate and outline the risks associated with over-reliance on protocols that were not designed with formal definitions of privacy in mind. We further provide recommendations and examples on how such algorithms can be augmented to provide guarantees of governance, security, privacy and verifiability for a general ML audience without prior exposure to formal privacy techniques.

【12】 Mapping industrial poultry operations at scale with deep learning and aerial imagery 标题:利用深度学习和航空影像大规模绘制工业化家禽养殖设施分布图 链接:https://arxiv.org/abs/2112.10988

作者:Caleb Robinson,Ben Chugg,Brandon Anderson,Juan M. Lavista Ferres,Daniel E. Ho 机构:E. Ho, Microsoft AI for Good Research Lab, Redmond, WA, Stanford RegLab, Stanford, CA 摘要:集中动物饲养作业(CAFO)对空气、水和公共健康构成严重风险,但已证明其监管具有挑战性。美国政府问责局指出,一个基本挑战是缺乏关于CAFO的全面位置信息。我们使用美国农业部的国家农业图像计划(NAIP)1m/像素航空图像检测美国大陆的家禽CAFO。我们训练卷积神经网络(CNN)模型来识别单个家禽禽舍,并将性能最佳的模型应用于超过42 TB的图像,以创建第一个国家级、开放源代码的家禽CAFO数据集。我们根据加利福尼亚州10个手动标记县的家禽CAFO设施位置的留出验证集验证模型预测,并证明该方法具有填补环境监测空白的巨大潜力。 摘要:Concentrated Animal Feeding Operations (CAFOs) pose serious risks to air, water, and public health, but have proven to be challenging to regulate. The U.S. Government Accountability Office notes that a basic challenge is the lack of comprehensive location information on CAFOs. We use the USDA's National Agricultural Imagery Program (NAIP) 1m/pixel aerial imagery to detect poultry CAFOs across the continental United States. We train convolutional neural network (CNN) models to identify individual poultry barns and apply the best performing model to over 42 TB of imagery to create the first national, open-source dataset of poultry CAFOs. We validate the model predictions against held-out validation set on poultry CAFO facility locations from 10 hand-labeled counties in California and demonstrate that this approach has significant potential to fill gaps in environmental monitoring.

【13】 Compact Multi-level Sparse Neural Networks with Input Independent Dynamic Rerouting 标题:具有输入独立动态重路由的紧凑型多级稀疏神经网络 链接:https://arxiv.org/abs/2112.10930

作者:Minghai Qin,Tianyun Zhang,Fei Sun,Yen-Kuang Chen,Makan Fardad,Yanzhi Wang,Yuan Xie 机构:Western Digital Research, Cleveland State University, Alibaba DAMO Academy, Syracuse University, Northeastern University 摘要:深度神经网络(DNN)在许多实际应用中表现出卓越的性能,但其巨大的计算成本和存储需求使其无法部署到许多边缘和物联网(IoT)设备上。稀疏深度神经网络的大部分权值参数为零,可以大大降低模型的计算复杂度和内存消耗。在实际使用场景中,设备在不同的环境下可能会受到可用计算和内存资源的大幅波动,并且由于存在延迟较大的长尾推断,服务质量(QoS)难以维持。面对现实生活中的挑战,我们建议训练一个支持多个稀疏层次的稀疏模型。即,权重满足一种层次结构,使得较稀疏子模型中非零参数的位置和取值是较不稀疏子模型非零参数的子集。通过这种方式,可以在推理过程中动态选择合适的稀疏度级别,而存储成本由最不稀疏的子模型来限制。我们已经在各种DNN模型和任务上验证了我们的方法,包括ResNet-50、PointNet++、GNMT和图注意力网络。我们得到的稀疏子模型平均权重为13.38%,FLOPs为14.97%,而精度与对应的密集模型相当。在相对精度损失仅为3.25%的情况下,可以获得权重为5.38%和FLOPs为4.47%的更稀疏子模型,这些子模型是较不稀疏子模型的子集。 摘要:Deep neural networks (DNNs) have shown to provide superb performance in many real life applications, but their large computation cost and storage requirement have prevented them from being deployed to many edge and internet-of-things (IoT) devices. Sparse deep neural networks, whose majority weight parameters are zeros, can substantially reduce the computation complexity and memory consumption of the models. In real-use scenarios, devices may suffer from large fluctuations of the available computation and memory resources under different environment, and the quality of service (QoS) is difficult to maintain due to the long tail inferences with large latency. Facing the real-life challenges, we propose to train a sparse model that supports multiple sparse levels. That is, a hierarchical structure of weights are satisfied such that the locations and the values of the non-zero parameters of the more-sparse sub-model are a subset of the less-sparse sub-model. In this way, one can dynamically select the appropriate sparsity level during inference, while the storage cost is capped by the least sparse sub-model. We have verified our methodologies on a variety of DNN models and tasks, including the ResNet-50, PointNet++, GNMT, and graph attention networks. We obtain sparse sub-models with an average of 13.38% weights and 14.97% FLOPs, while the accuracies are as good as their dense counterparts. More-sparse sub-models with 5.38% weights and 4.47% of FLOPs, which are subsets of the less-sparse ones, can be obtained with only 3.25% relative accuracy loss.
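下面是一个按权重幅值生成"嵌套"多级稀疏掩码的示意:较稀疏掩码的非零位置是较不稀疏掩码的子集,推理时可按可用资源动态选择稀疏级别。保留比例与按幅值排序的取阈方式均为假设,仅用于演示论文所述的层次结构约束本身。

```python
import torch

def nested_sparsity_masks(weight, keep_ratios=(0.5, 0.25, 0.1)):
    """按幅值从大到小排序,生成互为子集的多级掩码。
    keep_ratios 需从大到小排列;返回的 masks[i+1] 的非零位置是 masks[i] 的子集。"""
    flat = weight.abs().flatten()
    order = torch.argsort(flat, descending=True)
    masks = []
    for r in keep_ratios:
        k = int(r * flat.numel())
        mask = torch.zeros_like(flat)
        mask[order[:k]] = 1.0          # 幅值最大的前 k 个位置保留
        masks.append(mask.view_as(weight))
    return masks

w = torch.randn(64, 64)
m50, m25, m10 = nested_sparsity_masks(w)
# 验证嵌套关系:更稀疏的掩码完全落在较不稀疏的掩码之内
assert torch.all(m10 <= m25) and torch.all(m25 <= m50)
# 推理时按需选择稀疏级别,例如:w_sparse = w * m25
```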

【14】 Load-balanced Gather-scatter Patterns for Sparse Deep Neural Networks 标题:稀疏深度神经网络的负载平衡集散模式 链接:https://arxiv.org/abs/2112.10898

作者:Fei Sun,Minghai Qin,Tianyun Zhang,Xiaolong Ma,Haoran Li,Junwen Luo,Zihao Zhao,Yen-Kuang Chen,Yuan Xie 机构:Alibaba DAMO Academy, Western Digital Research, Cleveland State University, Northeastern Unviersity, Fudan University 摘要:深度神经网络(DNN)已被证明能有效地解决许多实际问题,但其高昂的计算成本使得这些模型无法应用于边缘设备。剪枝作为一种将零引入模型权重的方法,已被证明是在模型精度和计算效率之间提供良好折衷的有效方法,并且是生成压缩模型的一种广泛使用的方法。但是,修剪的粒度会做出重要的权衡。在相同的稀疏度水平下,粗粒度结构化稀疏模式在传统硬件上效率更高,但精度更差,而细粒度非结构化稀疏模式可以获得更好的精度,但在现有硬件上效率较低。另一方面,一些现代处理器配备了快速片上暂存器存储器和收集/分散引擎,可对这些存储器执行间接加载和存储操作。在这项工作中,我们提出了一组新的稀疏模式,称为聚集-分散(GS)模式,以利用草稿行存储器和聚集/分散引擎来加速神经网络推理。相应地,我们提出了一种紧凑的稀疏格式。提出的稀疏模式集,以及一种新的修剪方法,解决了负载不平衡问题,使模型的质量接近非结构化稀疏模型,计算效率接近结构化稀疏模型。我们的实验表明,与传统的结构化稀疏模式相比,GS模式在精度和计算效率之间始终具有更好的权衡。GS模式可以在相同的精度级别将DNN组件的运行时间减少两到三倍。这在三种不同的深度学习任务和流行模型上得到了证实,即用于机器翻译的GNMT、用于图像识别的ResNet50和用于声学语音识别的Japser。 摘要:Deep neural networks (DNNs) have been proven to be effective in solving many real-life problems, but its high computation cost prohibits those models from being deployed to edge devices. Pruning, as a method to introduce zeros to model weights, has shown to be an effective method to provide good trade-offs between model accuracy and computation efficiency, and is a widely-used method to generate compressed models. However, the granularity of pruning makes important trade-offs. At the same sparsity level, a coarse-grained structured sparse pattern is more efficient on conventional hardware but results in worse accuracy, while a fine-grained unstructured sparse pattern can achieve better accuracy but is inefficient on existing hardware. On the other hand, some modern processors are equipped with fast on-chip scratchpad memories and gather/scatter engines that perform indirect load and store operations on such memories. In this work, we propose a set of novel sparse patterns, named gather-scatter (GS) patterns, to utilize the scratchpad memories and gather/scatter engines to speed up neural network inferences. Correspondingly, we present a compact sparse format. The proposed set of sparse patterns, along with a novel pruning methodology, address the load imbalance issue and result in models with quality close to unstructured sparse models and computation efficiency close to structured sparse models. Our experiments show that GS patterns consistently make better trade-offs between accuracy and computation efficiency compared to conventional structured sparse patterns. GS patterns can reduce the runtime of the DNN components by two to three times at the same accuracy levels. This is confirmed on three different deep learning tasks and popular models, namely, GNMT for machine translation, ResNet50 for image recognition, and Japser for acoustic speech recognition.

【15】 VELVET: a noVel Ensemble Learning approach to automatically locate VulnErable sTatements 标题:VELVET:一种新的自动定位易受攻击语句的集成学习方法 链接:https://arxiv.org/abs/2112.10893

作者:Yangruibo Ding,Sahil Suneja,Yunhui Zheng,Jim Laredo,Alessandro Morari,Gail Kaiser,Baishakhi Ray 机构:∗Columbia University, †IBM Research 备注:Accepted by the Research Track of 29th IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER 2022) 摘要:在源代码中自动定位易受攻击的语句对于确保软件安全和减轻开发人员的调试工作至关重要。在当今的软件生态系统中,这一点变得更为重要,易受攻击的代码可以在GitHub这样的软件存储库中轻松、不知不觉地流动。在数百万行代码中,传统的静态和动态方法难以扩展。尽管现有的基于机器学习的方法在这样的环境下看起来很有前途,但大多数工作都是在更高的粒度上检测易受攻击的代码——在方法或文件级别。因此,开发人员仍然需要检查大量代码来定位需要修复的易受攻击的语句。本文提出了一种新的集成学习方法VELVET来定位易受攻击的语句。我们的模型结合了基于图和基于序列的神经网络,成功地捕获程序图的局部和全局上下文,并有效地理解代码语义和易受攻击的模式。为了研究VELVET的有效性,我们使用了现成的合成数据集和最近发布的真实数据集。在静态分析设置中,在未提前检测到易受攻击函数的情况下,VELVET在真实数据上的性能是基线静态分析器的4.5倍。对于孤立的漏洞定位任务,我们假设函数的漏洞是已知的,而特定的漏洞语句是未知的,我们将VELVET与几个同样关注局部和全局代码上下文的神经网络进行比较。在合成数据和真实数据上,VELVET分别达到99.6%和43.6%的top-1准确率,优于基线深度学习模型5.3-29.0%。 摘要:Automatically locating vulnerable statements in source code is crucial to assure software security and alleviate developers' debugging efforts. This becomes even more important in today's software ecosystem, where vulnerable code can flow easily and unwittingly within and across software repositories like GitHub. Across such millions of lines of code, traditional static and dynamic approaches struggle to scale. Although existing machine-learning-based approaches look promising in such a setting, most work detects vulnerable code at a higher granularity -- at the method or file level. Thus, developers still need to inspect a significant amount of code to locate the vulnerable statement(s) that need to be fixed. This paper presents VELVET, a novel ensemble learning approach to locate vulnerable statements. Our model combines graph-based and sequence-based neural networks to successfully capture the local and global context of a program graph and effectively understand code semantics and vulnerable patterns. To study VELVET's effectiveness, we use an off-the-shelf synthetic dataset and a recently published real-world dataset. In the static analysis setting, where vulnerable functions are not detected in advance, VELVET achieves 4.5x better performance than the baseline static analyzers on the real-world data. For the isolated vulnerability localization task, where we assume the vulnerability of a function is known while the specific vulnerable statement is unknown, we compare VELVET with several neural networks that also attend to local and global context of code. VELVET achieves 99.6% and 43.6% top-1 accuracy over synthetic data and real-world data, respectively, outperforming the baseline deep-learning models by 5.3-29.0%.

【16】 PRONTO: Preamble Overhead Reduction with Neural Networks for Coarse Synchronization 标题:PRONTO:利用神经网络降低粗同步的前导码开销 链接:https://arxiv.org/abs/2112.10885

作者:Nasim Soltani,Debashri Roy,Kaushik Chowdhury 机构:Electrical and Computer Engineering Department, Northeastern University, Boston, MA 摘要:在基于IEEE 802.11 WiFi的波形中,接收机使用称为传统短训练场(L-STF)的前导码的第一个字段执行粗略的时间和频率同步。L-STF占用多达40%的前导码长度,并占用多达32 us的广播时间。为了减少通信开销,我们提出了一种改进的波形,其中通过消除L-STF来减少前导长度。为了解码这种修改后的波形,我们提出了一种称为PRONTO的基于机器学习(ML)的方案,该方案使用其他前导字段,特别是传统的长训练字段(L-LTF),执行粗略的时间和频率估计。我们的贡献有三个方面:(i)我们提出了PRONTO,其特点是定制卷积神经网络(CNN)用于数据包检测和粗CFO估计,以及用于稳健训练的数据增强步骤。(ii)我们提出了一种广义决策流程,使PRONTO与包括标准L-STF的传统波形兼容。(iii)我们验证了来自软件定义无线电(SDR)测试平台的无线WiFi数据集的结果。我们的评估表明,PRONTO能够以100%的准确率执行数据包检测,并且能够以小到3%的误差执行粗略的CFO估计。我们证明了PRONTO在无误码率(BER)下降的情况下提供高达40%的前导码长度缩减。最后,我们通过实验展示了PRONTO通过GPU并行化在相应的纯CPU实现上实现的加速。 摘要:In IEEE 802.11 WiFi-based waveforms, the receiver performs coarse time and frequency synchronization using the first field of the preamble known as the legacy short training field (L-STF). The L-STF occupies upto 40% of the preamble length and takes upto 32 us of airtime. With the goal of reducing communication overhead, we propose a modified waveform, where the preamble length is reduced by eliminating the L-STF. To decode this modified waveform, we propose a machine learning (ML)-based scheme called PRONTO that performs coarse time and frequency estimations using other preamble fields, specifically the legacy long training field (L-LTF). Our contributions are threefold: (i) We present PRONTO featuring customized convolutional neural networks (CNNs) for packet detection and coarse CFO estimation, along with data augmentation steps for robust training. (ii) We propose a generalized decision flow that makes PRONTO compatible with legacy waveforms that include the standard L-STF. (iii) We validate the outcomes on an over-the-air WiFi dataset from a testbed of software defined radios (SDRs). Our evaluations show that PRONTO can perform packet detection with 100% accuracy, and coarse CFO estimation with errors as small as 3%. We demonstrate that PRONTO provides upto 40% preamble length reduction with no bit error rate (BER) degradation. Finally, we experimentally show the speedup achieved by PRONTO through GPU parallelization over the corresponding CPU-only implementations.

【17】 Learning Bayesian Networks in the Presence of Structural Side Information 标题:结构边信息存在下的贝叶斯网络学习 链接:https://arxiv.org/abs/2112.10884

作者:Ehsan Mokhtarian,Sina Akbari,Fateme Jamshidi,Jalal Etesami,Negar Kiyavash 机构: Department of Computer and Communication Science, EPFL, Lausanne, Switzerland, College of Management of Technology, EPFL, Lausanne, Switzerland 备注:20 pages, 7 figures, 5 tables, AAAI 2022 conference 摘要:我们研究了当系统的结构侧信息可用时,学习一组变量的贝叶斯网络(BN)的问题。众所周知,学习一般BN的结构在计算和统计上都具有挑战性。然而,在许多应用程序中,关于底层结构的附加信息可能会降低学习的复杂性。在本文中,我们开发了一种基于递归约束的算法,该算法可以有效地将这些知识(即边信息)融入到学习过程中。特别是,我们研究了两类关于基本BN的结构侧信息:(I)其团数的上界已知,或(II)无钻石。我们为学习算法提供理论保证,包括每个场景中所需的最坏情况测试数。作为我们工作的结果,我们证明了有界树宽BNs可以以多项式复杂度学习。此外,我们评估了我们的算法在合成和真实结构中的性能和可伸缩性,并表明它们优于最先进的结构学习算法。 摘要:We study the problem of learning a Bayesian network (BN) of a set of variables when structural side information about the system is available. It is well known that learning the structure of a general BN is both computationally and statistically challenging. However, often in many applications, side information about the underlying structure can potentially reduce the learning complexity. In this paper, we develop a recursive constraint-based algorithm that efficiently incorporates such knowledge (i.e., side information) into the learning process. In particular, we study two types of structural side information about the underlying BN: (I) an upper bound on its clique number is known, or (II) it is diamond-free. We provide theoretical guarantees for the learning algorithms, including the worst-case number of tests required in each scenario. As a consequence of our work, we show that bounded treewidth BNs can be learned with polynomial complexity. Furthermore, we evaluate the performance and the scalability of our algorithms in both synthetic and real-world structures and show that they outperform the state-of-the-art structure learning algorithms.

【18】 AGPNet -- Autonomous Grading Policy Network 标题:AGPNet--自主平整(Grading)策略网络 链接:https://arxiv.org/abs/2112.10877

作者:Chana Ross,Yakov Miron,Yuval Goldfracht,Dotan Di Castro 机构:Charney School of Marine Sciences, University of Haifa, Israel 备注:7 pages, paper submitted to IEEE International Conference on Robotics and Automation 摘要:在这项工作中,我们建立了启发式和学习策略,用于自动控制推土机在布满沙堆的不均匀区域进行平整作业。我们将该问题形式化为一个马尔可夫决策过程,设计了一个演示agent与环境交互的仿真,最后将我们的模拟器与一个真正的推土机原型进行了比较。我们使用强化学习、行为克隆和对比学习的方法来训练混合策略。我们经过训练的代理AGPNet达到了人类水平的性能,并且在自主平整任务方面优于当前最先进的机器学习方法。此外,我们的代理能够从随机场景推广到看不见的现实世界问题。 摘要:In this work, we establish heuristics and learning strategies for the autonomous control of a dozer grading an uneven area studded with sand piles. We formalize the problem as a Markov Decision Process, design a simulation which demonstrates agent-environment interactions and finally compare our simulator to a real dozer prototype. We use methods from reinforcement learning, behavior cloning and contrastive learning to train a hybrid policy. Our trained agent, AGPNet, reaches human-level performance and outperforms current state-of-the-art machine learning methods for the autonomous grading task. In addition, our agent is capable of generalizing from random scenarios to unseen real world problems.

【19】 Efficient Tensor Robust PCA under Hybrid Model of Tucker and Tensor Train 标题:Tucker和张量训练混合模型下的有效张量鲁棒PCA 链接:https://arxiv.org/abs/2112.10771

作者:Yuning Qiu,Guoxu Zhou,Zhenhao Huang,Qibin Zhao,Shengli Xie 机构:School of Automation, Guangdong University of Technology 摘要:张量稳健主成分分析(TRPCA)是机器学习和计算机视觉的基本模型。最近,张量序列(TT)分解已被证实能有效地捕获张量恢复任务的全局低秩相关性。然而,由于实际应用中存在大规模的张量数据,以往的TRPCA模型往往具有较高的计算复杂度。在这封信中,我们在Tucker和TT的混合模型下提出了一种有效的TRPCA。具体来说,在理论上,我们揭示了原始大张量的TT核范数(TTNN)可以通过Tucker压缩格式等价地转换为更小张量的TT核范数,从而显著降低奇异值分解(SVD)的计算成本。在合成和真实张量数据上的数值实验验证了该模型的优越性。 摘要:Tensor robust principal component analysis (TRPCA) is a fundamental model in machine learning and computer vision. Recently, tensor train (TT) decomposition has been verified effective to capture the global low-rank correlation for tensor recovery tasks. However, due to the large-scale tensor data in real-world applications, previous TRPCA models often suffer from high computational complexity. In this letter, we propose an efficient TRPCA under hybrid model of Tucker and TT. Specifically, in theory we reveal that TT nuclear norm (TTNN) of the original big tensor can be equivalently converted to that of a much smaller tensor via a Tucker compression format, thereby significantly reducing the computational cost of singular value decomposition (SVD). Numerical experiments on both synthetic and real-world tensor data verify the superiority of the proposed model.

【20】 Logarithmic Unbiased Quantization: Practical 4-bit Training in Deep Learning 标题:对数无偏量化:深度学习中的实用4位训练 链接:https://arxiv.org/abs/2112.10769

作者:Brian Chmiel,Ron Banner,Elad Hoffer,Hilla Ben Yaacov,Daniel Soudry 机构:†Habana Labs – An Intel company, Caesarea, Israel, ◦Department of Electrical Engineering - Technion, Haifa, Israel 摘要:量化权重和激活是减少深度神经网络(DNN)训练计算量的主要方法之一。当前的方法能够对前向阶段进行4位量化。然而,这只占训练过程的三分之一。减少整个训练过程的计算量需要量化神经梯度,即相对于中间神经层输出的损失梯度。在这项工作中,我们检验了无偏量化在量化神经网络训练中的重要性,在何处保持无偏量化,以及如何保持无偏量化。在此基础上,我们提出了一种$\textit{对数无偏量化}$(LUQ)方法,将前向和后向阶段量化为4位,在无开销的情况下实现4位训练的最新结果。例如,在ImageNet上的ResNet50中,精度仅下降了1.18%。在单历元的高精度微调和方差减少方法相结合后,我们进一步将其改善为仅下降0.64%——这两种方法都增加了与先前建议的方法相当的开销。最后,我们提出了一种使用低精度格式的方法,以避免在三分之二的训练过程中进行乘法运算,从而将乘法器使用的面积减少5倍。 摘要:Quantization of the weights and activations is one of the main methods to reduce the computational footprint of Deep Neural Networks (DNNs) training. Current methods enable 4-bit quantization of the forward phase. However, this constitutes only a third of the training process. Reducing the computational footprint of the entire training process requires the quantization of the neural gradients, i.e., the loss gradients with respect to the outputs of intermediate neural layers. In this work, we examine the importance of having unbiased quantization in quantized neural network training, where to maintain it, and how. Based on this, we suggest a $\textit{logarithmic unbiased quantization}$ (LUQ) method to quantize both the forward and backward phase to 4-bit, achieving state-of-the-art results in 4-bit training without overhead. For example, in ResNet50 on ImageNet, we achieved a degradation of 1.18%. We further improve this to degradation of only 0.64% after a single epoch of high precision fine-tuning combined with a variance reduction method -- both add overhead comparable to previously suggested methods. Finally, we suggest a method that uses the low precision format to avoid multiplications during two-thirds of the training process, thus reducing by 5x the area used by the multiplier.
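下面给出"对数域无偏量化"核心思想的一个示意实现:把数值随机舍入到相邻的两个2的幂,使量化结果的期望等于原值(无偏)。位宽限制、梯度量化的完整流程等细节此处从略,具体以论文为准。

```python
import torch

def log_unbiased_quantize(x, eps=1e-30):
    """把 |x| 随机舍入到相邻的 2^k 或 2^(k+1),期望值保持不变(无偏)。"""
    sign = torch.sign(x)
    mag = x.abs().clamp_min(eps)
    k = torch.floor(torch.log2(mag))          # 下界指数
    low, high = 2.0 ** k, 2.0 ** (k + 1)
    # 取 high 的概率 p 满足 p*high + (1-p)*low = mag,即 p = (mag-low)/(high-low)
    p = (mag - low) / (high - low)
    take_high = (torch.rand_like(mag) < p).float()
    q = low * (1 - take_high) + high * take_high
    return sign * q

x = torch.full((100000,), 3.0)
print(log_unbiased_quantize(x).mean())   # 应接近 3.0,体现无偏性
```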

【21】 Improving Learning-to-Defer Algorithms Through Fine-Tuning 标题:通过微调改进学习延迟算法 链接:https://arxiv.org/abs/2112.10768

作者:Naveen Raman,Michael Yee 机构:Department of Computer Science, University of Maryland, College Park, MD, MIT Lincoln Laboratory, Lexington, MA 摘要:人工智能的普遍存在导致了人类和人工智能共同工作的情况,因此需要学习延迟算法,以确定如何在人工智能和人类之间划分任务。当与特定个体配对时,我们通过合并两种微调算法并使用合成和图像数据集测试其有效性来改进延迟算法的学习。我们发现,微调可以捕捉简单的人类技能模式,但要与细微差别作斗争,我们建议今后的工作使用稳健的半监督模式来改进学习。 摘要:The ubiquity of AI leads to situations where humans and AI work together, creating the need for learning-to-defer algorithms that determine how to partition tasks between AI and humans. We work to improve learning-to-defer algorithms when paired with specific individuals by incorporating two fine-tuning algorithms and testing their efficacy using both synthetic and image datasets. We find that fine-tuning can pick up on simple human skill patterns, but struggles with nuance, and we suggest future work that uses robust semi-supervised to improve learning.

【22】 A Grid-Structured Model of Tubular Reactors 标题:管式反应器的网格结构模型 链接:https://arxiv.org/abs/2112.10765

作者:Katsiaryna Haitsiukevich,Samuli Bergman,Cesar de Araujo Filho,Francesco Corona,Alexander Ilin 机构:Aalto University, Neste, Espoo, Finland, ©, IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including 备注:2021 IEEE 19th International Conference on Industrial Informatics (INDIN) 摘要:我们提出了一个管状反应器的网格状计算模型。该体系结构的灵感来自于偏微分方程解算器的计算,偏微分方程描述了管式反应器内化学过程的动力学。所提出的模型可能完全基于偏微分方程的已知形式,也可能包含通用的机器学习组件,如多层感知器。我们表明,所提出的模型可以使用有限的数据来描述固定床催化反应器的状态。经过训练的模型可以使用沿反应器的入口浓度和温度测量值重建未测量的状态,如催化剂活性。 摘要:We propose a grid-like computational model of tubular reactors. The architecture is inspired by the computations performed by solvers of partial differential equations which describe the dynamics of the chemical process inside a tubular reactor. The proposed model may be entirely based on the known form of the partial differential equations or it may contain generic machine learning components such as multi-layer perceptrons. We show that the proposed model can be trained using limited amounts of data to describe the state of a fixed-bed catalytic reactor. The trained model can reconstruct unmeasured states such as the catalyst activity using the measurements of inlet concentrations and temperatures along the reactor.

【23】 Physics-informed neural network method for modelling beam-wall interactions 标题:模拟束流-腔壁相互作用的物理信息神经网络方法 链接:https://arxiv.org/abs/2112.11323

作者:Kazuhiro Fujita 机构:Department of Information Systems, Saitama Institute of Technology, Fusaiji, Fukaya City, Japan 备注:3 pages, 3 figures, submitted for IET possible publications 摘要:提出了一种模拟粒子加速器中束流-腔壁相互作用的无网格方法。我们的方法的关键思想是使用深度神经网络作为一组偏微分方程(涉及粒子束和表面阻抗概念)的替代解。将该方法应用于具有薄导电涂层的加速器真空室的耦合阻抗,并与现有的解析公式进行了比较验证。 摘要:A mesh-free approach for modelling beam-wall interactions in particle accelerators is proposed. The key idea of our method is to use a deep neural network as a surrogate for the solution to a set of partial differential equations involving the particle beam, and the surface impedance concept. The proposed approach is applied to the coupling impedance of an accelerator vacuum chamber with thin conductive coating, and also verified in comparison with the existing analytical formula.
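下面是物理信息神经网络(PINN)方法骨架的最小示意:用全连接网络替代PDE的解,并通过自动微分把PDE残差加入损失。此处以一维泊松型方程 u''(x) = -sin(x) 作演示,与论文中涉及束流与表面阻抗的具体方程组无关,网络结构、配点数量等均为假设。

```python
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(1, 64), nn.Tanh(),
                    nn.Linear(64, 64), nn.Tanh(),
                    nn.Linear(64, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

def pde_residual(x):
    # 演示方程:u''(x) = -sin(x),配合零边界条件时解析解为 u = sin(x)
    x = x.requires_grad_(True)
    u = net(x)
    du = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
    d2u = torch.autograd.grad(du.sum(), x, create_graph=True)[0]
    return d2u + torch.sin(x)

for step in range(2000):
    x_in = torch.rand(128, 1) * 2 * torch.pi          # 区域内部配点
    x_bc = torch.tensor([[0.0], [2 * torch.pi]])       # 边界点
    loss = (pde_residual(x_in) ** 2).mean() + (net(x_bc) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```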

【24】 Deep Learning Based Cloud Cover Parameterization for ICON 标题:基于深度学习的ICON模式云量参数化 链接:https://arxiv.org/abs/2112.11317

作者:Arthur Grundner,Tom Beucler,Fernando Iglesias-Suarez,Pierre Gentine,Marco A. Giorgetta,Veronika Eyring 机构:Deutsches Zentrum f¨ur Luft- und Raumfahrt e.V. (DLR), Institut f¨ur Physik der Atmosph¨are, Oberpfaffenhofen, Germany, Columbia University, Center for Learning the Earth with Artificial intelligence And Physics (LEAP), New York, NY , USA 备注:40 pages, 16 figures, Submitted to 'Journal of Advances in Modeling Earth Systems' (JAMES) 摘要:改进气候模型内的云参数化和气候预测的一个有希望的方法是将深度学习与风暴分辨率模型(SRM)模拟的训练数据相结合。二十面体非流体静力(ICON)建模框架允许从数值天气预测到气候预测的模拟,这使其成为为亚网格尺度过程开发基于神经网络(NN)的参数化的理想目标。在ICON框架内,我们基于真实的区域和全球ICON SRM模拟,使用粗粒度数据训练基于神经网络的云量参数化。我们建立了三种不同类型的NNs,它们在从粗粒度大气状态变量诊断云量时假设的垂直局部化程度不同。NNs从与训练数据具有相似地理特征的粗粒度数据中准确估计亚网格尺度的云量。此外,经过全局训练的NNs可以重现区域SRM模拟的亚网格尺度云量。利用基于博弈论的可解释性库SHapley加法解释,我们发现过分强调特定湿度和云冰是基于列的神经网络无法从全球粗粒度SRM数据完美推广到区域粗粒度SRM数据的原因。可解释性工具还有助于可视化区域和全球训练的柱状NNs之间特征重要性的相似性和差异,并揭示其云量预测和热力学环境之间的局部关系。我们的结果显示了深入学习从全球SRM中获得准确但可解释的云量参数化的潜力,并表明基于邻域的模型可能是精度和可推广性之间的一个很好的折衷方案。 摘要:A promising approach to improve cloud parameterizations within climate models and thus climate projections is to use deep learning in combination with training data from storm-resolving model (SRM) simulations. The Icosahedral Non-Hydrostatic (ICON) modeling framework permits simulations ranging from numerical weather prediction to climate projections, making it an ideal target to develop neural network (NN) based parameterizations for sub-grid scale processes. Within the ICON framework, we train NN based cloud cover parameterizations with coarse-grained data based on realistic regional and global ICON SRM simulations. We set up three different types of NNs that differ in the degree of vertical locality they assume for diagnosing cloud cover from coarse-grained atmospheric state variables. The NNs accurately estimate sub-grid scale cloud cover from coarse-grained data that has similar geographical characteristics as their training data. Additionally, globally trained NNs can reproduce sub-grid scale cloud cover of the regional SRM simulation. Using the game-theory based interpretability library SHapley Additive exPlanations, we identify an overemphasis on specific humidity and cloud ice as the reason why our column-based NN cannot perfectly generalize from the global to the regional coarse-grained SRM data. The interpretability tool also helps visualize similarities and differences in feature importance between regionally and globally trained column-based NNs, and reveals a local relationship between their cloud cover predictions and the thermodynamic environment. Our results show the potential of deep learning to derive accurate yet interpretable cloud cover parameterizations from global SRMs, and suggest that neighborhood-based models may be a good compromise between accuracy and generalizability.

【25】 Preserving gauge invariance in neural networks 标题:在神经网络中保持规范不变性 链接:https://arxiv.org/abs/2112.11239

作者:Matteo Favoni,Andreas Ipp,David I. Müller,Daniel Schuh 机构:Institute for Theoretical Physics, TU Wien, Wiedner Hauptstr., Vienna, Austria, Speaker and corresponding author 备注:8 pages, 3 figures, proceedings for vConf 2021 摘要:在本会议论文中,我们提出了格点规范等变卷积神经网络(L-CNN),它能够处理来自格点规范理论模拟的数据,同时精确地保持规范对称性。我们回顾了体系结构的各个方面,并展示了L-CNN如何在格上表示一大类规范不变和等变函数。我们使用非线性回归问题比较了L-CNN和非等变网络的性能,并证明了非等变模型的规范不变性是如何被打破的。 摘要:In these proceedings we present lattice gauge equivariant convolutional neural networks (L-CNNs) which are able to process data from lattice gauge theory simulations while exactly preserving gauge symmetry. We review aspects of the architecture and show how L-CNNs can represent a large class of gauge invariant and equivariant functions on the lattice. We compare the performance of L-CNNs and non-equivariant networks using a non-linear regression problem and demonstrate how gauge invariance is broken for non-equivariant models.

【26】 Manifold learning via quantum dynamics 标题:基于量子动力学的流形学习 链接:https://arxiv.org/abs/2112.11161

作者:Akshat Kumar,Mohan Sarovar 机构:Department of Mathematics, Clarkson University, Potsdam, NY, USA; Instituto de Telecomunicações, Lisbon, Portugal; Sandia National Laboratories, Livermore, California, USA 备注:This is a companion paper to "On a quantum-classical correspondence: from graphs to manifolds" arXiv:2112.10748 摘要:我们介绍了一种在采样流形上计算测地线的算法,该算法依赖于在采样数据的图嵌入上模拟量子动力学。我们的方法利用了半经典分析和量子经典对应中的经典结果,并为学习数据集采样的流形以及随后高维数据集的非线性降维技术奠定了基础。我们使用从模型流形采样的数据演示了新算法,并基于COVID-19移动性数据给出了一个聚类示例。最后,我们的方法揭示了数据采样和量化提供的离散化之间有趣的联系。 摘要:We introduce an algorithm for computing geodesics on sampled manifolds that relies on simulation of quantum dynamics on a graph embedding of the sampled data. Our approach exploits classic results in semiclassical analysis and the quantum-classical correspondence, and forms a basis for techniques to learn the manifold from which a dataset is sampled, and subsequently for nonlinear dimensionality reduction of high-dimensional datasets. We illustrate the new algorithm with data sampled from model manifolds and also by a clustering demonstration based on COVID-19 mobility data. Finally, our method reveals interesting connections between the discretization provided by data sampling and quantization.

【27】 High pressure hydrogen by machine learning and quantum Monte Carlo 标题:基于机器学习和量子蒙特卡罗的高压氢研究 链接:https://arxiv.org/abs/2112.11099

作者:Andrea Tirelli,Giacomo Tenti,Kousuke Nakano,Sandro Sorella 机构:International School for Advanced Studies (SISSA), Via Bonomea, Trieste, Italy, School of Information Science, JAIST, Asahidai, Nomi, Ishikawa, Japan, Computational Materials Science Research Team 备注:6 + 14 pages, comments welcome! 摘要:我们开发了一种结合量子蒙特卡罗描述电子关联的准确性和机器学习势(MLP)效率的技术。我们使用核线性回归结合SOAP(平滑重叠原子位置)方法,在这里以一种非常有效的方式实现。关键要素是:i)基于最远点采样的稀疏化技术,确保MLP的通用性和可转移性;ii)所谓的$\Delta$-学习,允许小的训练数据集,这是高精度但计算要求高的计算的基本属性,比如基于量子蒙特卡罗的那些。作为第一个应用,我们对高压氢的液-液转变进行了基准研究,并通过强调高精度对于这个非常有争议的主题的重要性,展示了我们的MLP的质量。在这个主题中,实验室实验很困难,理论还远未得出结论。 摘要:We have developed a technique combining the accuracy of quantum Monte Carlo in describing the electron correlation with the efficiency of a machine learning potential (MLP). We use kernel linear regression in combination with SOAP (Smooth Overlap Atomic Position) approach, implemented here in a very efficient way. The key ingredients are: i) a sparsification technique, based on farthest point sampling, ensuring generality and transferability of our MLPs and ii) the so called $\Delta$-learning, allowing a small training data set, a fundamental property for highly accurate but computationally demanding calculations, such as the ones based on quantum Monte Carlo. As a first application we present a benchmark study of the liquid-liquid transition of high-pressure hydrogen and show the quality of our MLP, by emphasizing the importance of high accuracy for this very debated subject, where experiments are difficult in the lab, and theory is still far from being conclusive.
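下面用核岭回归示意"$\Delta$-学习"的基本思路:不直接拟合高精度(如QMC)能量,而是拟合高精度与低成本方法之间的差值,从而只需很小的训练集。描述符此处用随机特征代替SOAP,"低成本能量"与"高精度能量"均为人工构造的玩具数据,仅作方法示意。

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 16))             # 每个构型的描述符(实际应为SOAP等)
E_cheap = X.sum(axis=1)                    # 低成本方法给出的能量(玩具示例)
E_high = E_cheap + 0.1 * np.sin(X[:, 0])   # 高精度方法能量(玩具示例)

# Delta-learning:只学习 E_high - E_cheap 这个修正量
model = KernelRidge(kernel="rbf", alpha=1e-6, gamma=0.1)
model.fit(X, E_high - E_cheap)

X_new = rng.normal(size=(5, 16))
E_pred = X_new.sum(axis=1) + model.predict(X_new)   # 低成本能量 + 学到的修正
print(E_pred)
```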

【28】 Differentiated uniformization: A new method for inferring Markov chains on combinatorial state spaces including stochastic epidemic models 标题:微分均匀化:一种推断组合状态空间(含随机流行病模型)上马尔可夫链的新方法 链接:https://arxiv.org/abs/2112.10971

作者:Kevin Rupp,Rudolf Schill,Jonas Süskind,Peter Georg,Maren Klever,Andreas Lösch,Lars Grasedyck,Tilo Wettig,Rainer Spang 机构:Department of Statistical Bioinformatics, University of Regensburg, Regensburg, Germany, Department of Physics, University of Regensburg, Regensburg, Germany, Institut für Geometrie und Praktische Mathematik, RWTH Aachen University, Aachen, Germany 摘要:动机:我们考虑由转移速率矩阵$Q$(依赖于参数$\theta$)描述动态系统随机演化的连续时间马尔可夫链。计算$t$时刻状态的概率分布需要矩阵指数$\exp(tQ)$,而从数据推断$\theta$则需要其导数$\partial\exp(tQ)/\partial\theta$。当状态空间以及$Q$的规模很大时,两者都很难计算。当状态空间由几个相互作用的离散变量取值的所有组合构成时,就会发生这种情况,此时通常甚至不可能存储$Q$。然而,当$Q$可以写为张量积之和时,通过均匀化方法计算$\exp(tQ)$是可行的,它不需要显式存储$Q$。结果:这里我们给出了一个计算$\partial\exp(tQ)/\partial\theta$的类似算法,即微分均匀化方法。我们在传染病传播的随机SIR模型上演示了该算法,并证明其$Q$可以写成张量积之和。我们估计了奥地利第一波COVID-19大流行期间每月的感染率和恢复率,并在完整的贝叶斯分析中量化了其不确定性。可用性:实现和数据见https://github.com/spang-lab/TenSIR。 摘要:Motivation: We consider continuous-time Markov chains that describe the stochastic evolution of a dynamical system by a transition-rate matrix $Q$ which depends on a parameter $\theta$. Computing the probability distribution over states at time $t$ requires the matrix exponential $\exp(tQ)$, and inferring $\theta$ from data requires its derivative $\partial\exp\!(tQ)/\partial\theta$. Both are challenging to compute when the state space and hence the size of $Q$ is huge. This can happen when the state space consists of all combinations of the values of several interacting discrete variables. Often it is even impossible to store $Q$. However, when $Q$ can be written as a sum of tensor products, computing $\exp(tQ)$ becomes feasible by the uniformization method, which does not require explicit storage of $Q$. Results: Here we provide an analogous algorithm for computing $\partial\exp\!(tQ)/\partial\theta$, the differentiated uniformization method. We demonstrate our algorithm for the stochastic SIR model of epidemic spread, for which we show that $Q$ can be written as a sum of tensor products. We estimate monthly infection and recovery rates during the first wave of the COVID-19 pandemic in Austria and quantify their uncertainty in a full Bayesian analysis. Availability: Implementation and data are available at https://github.com/spang-lab/TenSIR.
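下面给出均匀化方法计算 $\exp(tQ)v$(矩阵指数对向量的作用,无需显式构造 $\exp(tQ)$)的最小示意;论文提出的"微分均匀化"进一步给出对参数 $\theta$ 的导数,此处未实现,截断项数的选取方式也仅为示意。

```python
import numpy as np
from scipy.stats import poisson

def expm_action_uniformization(Q, v, t, tol=1e-10):
    """用均匀化方法近似 exp(tQ) @ v,其中 Q 为转移速率矩阵(行和为0)。"""
    Lam = max(-Q.diagonal().min(), 1e-12)       # 均匀化常数 Λ >= max_i |q_ii|
    P = np.eye(Q.shape[0]) + Q / Lam             # 随机矩阵 P = I + Q/Λ
    mu = Lam * t
    K = int(poisson.ppf(1.0 - tol, mu)) + 1      # 截断项数,使泊松尾部足够小
    out = np.zeros_like(v, dtype=float)
    term = v.astype(float)
    for k in range(K + 1):
        out += poisson.pmf(k, mu) * term         # Σ_k Pois(k; Λt) P^k v
        term = P @ term
    return out

# 小例子:两状态链的生成矩阵(行和为0)
Q = np.array([[-1.0, 1.0], [2.0, -2.0]])
v = np.array([1.0, 0.0])
print(expm_action_uniformization(Q, v, t=0.5))   # 与 scipy.linalg.expm(0.5*Q) @ v 一致
```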

【29】 Joint Learning of Linear Time-Invariant Dynamical Systems 标题:线性时不变动态系统的联合学习 链接:https://arxiv.org/abs/2112.10955

作者:Aditya Modi,Mohamad Kazem Shirani Faradonbeh,Ambuj Tewari,George Michailidis 机构:Computer Science and Engineering, University of Michigan & Microsoft Inc., Department of Statistics, University of Georgia, Department of Statistics, University of Michigan, Department of Statistics, University of Florida 摘要:学习线性时不变动力系统(LTIDS)的参数是当前感兴趣的问题。在许多应用中,人们感兴趣的是联合学习多个相关LTID的参数,这些参数迄今尚未被探索。为此,我们开发了一种联合估计,用于学习共享公共基矩阵的LTID的转移矩阵。此外,我们还建立了有限时间误差界,该误差界取决于潜在的样本大小、维数、任务数和转移矩阵的谱特性。结果是在温和的规律性假设下获得的,与单独学习每个系统相比,展示了跨LTID汇集信息的收益。我们还研究了错误指定转移矩阵的联合结构的影响,并表明在存在中度错误指定的情况下,已建立的结果是稳健的。 摘要:Learning the parameters of a linear time-invariant dynamical system (LTIDS) is a problem of current interest. In many applications, one is interested in jointly learning the parameters of multiple related LTIDS, which remains unexplored to date. To that end, we develop a joint estimator for learning the transition matrices of LTIDS that share common basis matrices. Further, we establish finite-time error bounds that depend on the underlying sample size, dimension, number of tasks, and spectral properties of the transition matrices. The results are obtained under mild regularity assumptions and showcase the gains from pooling information across LTIDS, in comparison to learning each system separately. We also study the impact of misspecifying the joint structure of the transition matrices and show that the established results are robust in the presence of moderate misspecifications.

【30】 Surrogate Model for Shallow Water Equations Solvers with Deep Learning 标题:基于深度学习的浅水方程求解代理模型 链接:https://arxiv.org/abs/2112.10889

作者:Yalan Song,Chaopeng Shen,Xiaofeng Liu 机构:• A surrogate model for shallow water equations solver was developed using, a point-to-point prediction approach., • The new model overcomes the limitations in raster-image based approaches. 摘要:浅水方程是洪水和河流水力学分析的大多数模型的基础。这些基于物理的模型通常价格昂贵且运行缓慢,因此不适合实时预测或参数反演。一个有吸引力的替代模型是代理模型。本文介绍了一种基于深度学习的高效、准确、灵活的代理模型NN-p2p,它可以在非结构化或不规则网格上进行点对点预测。对新方法进行了评估,并与现有的基于卷积神经网络(CNN)的方法进行了比较,后者只能在结构化或规则网格上进行图像间预测。在NN-p2p中,输入包括空间坐标和边界特征,可以描述水工结构的几何结构,如桥墩。所有代理模型都能很好地预测训练域中不同类型桥墩周围的水流。然而,当执行空间外推时,只有NN-p2p工作良好。基于CNN的方法的局限性根源于其光栅图像性质,无法准确捕获边界几何和流动特征,这对流体动力学至关重要。NN-p2p在预测神经网络未观测到的桥墩周围流量方面也具有良好的性能。NN-p2p模型也更严格地遵守守恒定律。通过计算桥墩的阻力系数$C_D$证明了所提出的代理模型的应用,发现$C_D$与桥墩长宽比的对数变换之间存在新的线性关系。 摘要:Shallow water equations are the foundation of most models for flooding and river hydraulics analysis. These physics-based models are usually expensive and slow to run, thus not suitable for real-time prediction or parameter inversion. An attractive alternative is surrogate model. This work introduces an efficient, accurate, and flexible surrogate model, NN-p2p, based on deep learning and it can make point-to-point predictions on unstructured or irregular meshes. The new method was evaluated and compared against existing methods based on convolutional neural networks (CNNs), which can only make image-to-image predictions on structured or regular meshes. In NN-p2p, the input includes both spatial coordinates and boundary features that can describe the geometry of hydraulic structures, such as bridge piers. All surrogate models perform well in predicting flow around different types of piers in the training domain. However, only NN-p2p works well when spatial extrapolation is performed. The limitations of CNN-based methods are rooted in their raster-image nature which cannot capture boundary geometry and flow features exactly, which are of paramount importance to fluid dynamics. NN-p2p also has good performance in predicting flow around piers unseen by the neural network. The NN-p2p model also respects conservation laws more strictly. The application of the proposed surrogate model was demonstrated by calculating the drag coefficient $C_D$ for piers and a new linear relationship between $C_D$ and the logarithmic transformation of pier's length/width ratio was discovered.

【31】 Machine learning discovery of new phases in programmable quantum simulator snapshots 标题:可编程量子模拟器快照中新阶段的机器学习发现 链接:https://arxiv.org/abs/2112.10789

作者:Cole Miles,Rhine Samajdar,Sepehr Ebadi,Tout T. Wang,Hannes Pichler,Subir Sachdev,Mikhail D. Lukin,Markus Greiner,Kilian Q. Weinberger,Eun-Ah Kim 机构:Department of Physics, Cornell University, Ithaca, NY , USA, Department of Physics, Harvard University, Cambridge, MA , USA, Institute for Theoretical Physics, University of Innsbruck A-, Austria, Institute for Quantum Optics and Quantum Information 备注:9 pages, 5 figures + 12 pages, 10 figures appendix 摘要:机器学习最近成为研究具有丰富数据集特征的复杂现象的一种很有前途的方法。特别是,以数据为中心的方法有可能自动发现手动检查可能遗漏的实验数据集中的结构。在这里,我们介绍了一种可解释的无监督混合机器学习方法,即混合相关卷积神经网络(hybrid CCNN),并将其应用于基于里德堡原子阵列的可编程量子模拟器生成的实验数据。具体来说,我们应用混合CCNN来分析具有可编程相互作用的正方形晶格上的新量子相位。最初的无监督降维和聚类阶段首先揭示了五个不同的量子相区域。在第二个监督阶段,我们通过训练完全可解释的CCNN并提取每个相位的相关关系来细化这些相位边界并描述每个相位。每个相位中特定识别的特征空间权重和关联片段捕获了条纹相位中的量子涨落,并识别了两个先前未检测到的相位,菱形相位和边界有序相位。这些观察结果表明,可编程量子模拟器与机器学习相结合,可以作为详细探索物质相关量子态的有力工具。 摘要:Machine learning has recently emerged as a promising approach for studying complex phenomena characterized by rich datasets. In particular, data-centric approaches lend to the possibility of automatically discovering structures in experimental datasets that manual inspection may miss. Here, we introduce an interpretable unsupervised-supervised hybrid machine learning approach, the hybrid-correlation convolutional neural network (Hybrid-CCNN), and apply it to experimental data generated using a programmable quantum simulator based on Rydberg atom arrays. Specifically, we apply Hybrid-CCNN to analyze new quantum phases on square lattices with programmable interactions. The initial unsupervised dimensionality reduction and clustering stage first reveals five distinct quantum phase regions. In a second supervised stage, we refine these phase boundaries and characterize each phase by training fully interpretable CCNNs and extracting the relevant correlations for each phase. The characteristic spatial weightings and snippets of correlations specifically recognized in each phase capture quantum fluctuations in the striated phase and identify two previously undetected phases, the rhombic and boundary-ordered phases. These observations demonstrate that a combination of programmable quantum simulators with machine learning can be used as a powerful tool for detailed exploration of correlated quantum states of matter.

其他(15篇)

【1】 Deliberation of Streaming RNN-Transducer by Non-autoregressive Decoding 标题:基于非自回归解码的流式RNN-Transducer假设推敲(Deliberation) 链接:https://arxiv.org/abs/2112.11442

作者:Weiran Wang,Ke Hu,Tara Sainath 机构:Tara N. Sainath, Google, Inc. 摘要:我们建议考虑流式RNN-T模型与先前提出的Align-Refine非自回归解码方法及其改进版本的假设一致性。该方法执行几个细化步骤,其中每个步骤共享一个转换器解码器,该解码器同时关注文本特征(从对齐中提取)和音频特征,并输出完整的更新对齐。transformer解码器使用CTC损失进行训练,这有助于并行贪婪解码,并执行完全上下文注意以捕获标签依赖。我们通过引入级联编码器(在细化之前捕获更多音频上下文)和对齐增强(强制学习标签依赖)来改进对齐细化。我们表明,在流式RNN-T模型的假设对齐的条件下,我们的方法比第一次通过的RNN-T获得了更准确的识别结果,并且只有少量的模型参数。 摘要:We propose to deliberate the hypothesis alignment of a streaming RNN-T model with the previously proposed Align-Refine non-autoregressive decoding method and its improved versions. The method performs a few refinement steps, where each step shares a transformer decoder that attends to both text features (extracted from alignments) and audio features, and outputs complete updated alignments. The transformer decoder is trained with the CTC loss which facilitates parallel greedy decoding, and performs full-context attention to capture label dependencies. We improve Align-Refine by introducing cascaded encoder that captures more audio context before refinement, and alignment augmentation which enforces learning label dependency. We show that, conditioned on hypothesis alignments of a streaming RNN-T model, our method obtains significantly more accurate recognition results than the first-pass RNN-T, with only small amount of model parameters.

【2】 Implicit Neural Video Compression 标题:隐式神经网络视频压缩 链接:https://arxiv.org/abs/2112.11312

作者:Yunfan Zhang,Ties van Rozendaal,Johann Brehmer,Markus Nagel,Taco Cohen 机构:Taco S. Cohen, Qualcomm AI Research 摘要:我们提出了一种用隐式神经表示法压缩全分辨率视频序列的方法。每个帧都表示为一个神经网络,将坐标位置映射到像素值。我们使用一个单独的隐式网络来调制坐标输入,从而实现帧间的有效运动补偿。再加上一个小的剩余网络,这使得我们能够相对于前一帧有效地压缩P帧。通过使用学习的整数量化存储网络权重,我们进一步降低了比特率。我们称之为隐式像素流(IPF)的方法对已建立的神经视频编解码器进行了一些简化:它不要求接收器访问预训练的神经网络,不使用昂贵的基于插值的扭曲操作,也不需要单独的训练数据集。我们证明了对图像和视频数据进行神经隐式压缩的可行性。 摘要:We propose a method to compress full-resolution video sequences with implicit neural representations. Each frame is represented as a neural network that maps coordinate positions to pixel values. We use a separate implicit network to modulate the coordinate inputs, which enables efficient motion compensation between frames. Together with a small residual network, this allows us to efficiently compress P-frames relative to the previous frame. We further lower the bitrate by storing the network weights with learned integer quantization. Our method, which we call implicit pixel flow (IPF), offers several simplifications over established neural video codecs: it does not require the receiver to have access to a pretrained neural network, does not use expensive interpolation-based warping operations, and does not require a separate training dataset. We demonstrate the feasibility of neural implicit compression on image and video data.
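下面是"把一帧图像表示为坐标到像素值的神经网络"这一隐式神经表示思想的最小示意;网络结构与正弦位置编码只是常见做法的假设,不代表论文中IPF的具体设计,也不包含帧间调制与权重量化部分。

```python
import torch
import torch.nn as nn

class CoordMLP(nn.Module):
    """输入归一化坐标 (x, y),输出该位置的 RGB 值。"""
    def __init__(self, n_freq=8, hidden=128):
        super().__init__()
        self.n_freq = n_freq
        in_dim = 2 * 2 * n_freq                      # sin/cos 位置编码后的维度
        self.net = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 3), nn.Sigmoid())

    def encode(self, xy):
        freqs = 2.0 ** torch.arange(self.n_freq, device=xy.device) * torch.pi
        ang = xy.unsqueeze(-1) * freqs               # (N, 2, n_freq)
        return torch.cat([ang.sin(), ang.cos()], dim=-1).flatten(1)

    def forward(self, xy):
        return self.net(self.encode(xy))

# 训练示意:frame 为 (H, W, 3) 的目标帧,拟合后网络权重即为该帧的压缩表示
H, W = 64, 64
frame = torch.rand(H, W, 3)
ys, xs = torch.meshgrid(torch.linspace(-1, 1, H), torch.linspace(-1, 1, W), indexing="ij")
coords = torch.stack([xs, ys], dim=-1).reshape(-1, 2)
model = CoordMLP()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(200):
    pred = model(coords)
    loss = nn.functional.mse_loss(pred, frame.reshape(-1, 3))
    opt.zero_grad()
    loss.backward()
    opt.step()
```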

【3】 Extending CLIP for Category-to-image Retrieval in E-commerce 标题:扩展CLIP用于电子商务中的类目到图像检索 链接:https://arxiv.org/abs/2112.11294

作者:Mariya Hendriksen,Maurits Bleeker,Svitlana Vakulenko,Nanne van Noord,Ernst Kuiper,Maarten de Rijke 机构: AIRLab, University of Amsterdam 备注:15 pages, accepted as a full paper at ECIR 2022 摘要:电子商务提供了丰富的多式联运数据,在实践中几乎没有得到利用。该数据的一个方面是用于搜索和推荐的类别树。然而,在实践中,在用户会话期间,给定类别的文本表示和视觉表示之间往往不匹配。基于这一问题,我们将分类任务引入到电子商务中的图像检索中,并提出了一个任务模型CLIP-ITA。该模型利用来自多个模态(文本、视觉和属性模态)的信息来创建产品表示。我们将探讨从多个模态(文本、视觉和属性模态)添加信息如何影响模型的性能。特别是,我们观察到CLIP-ITA显著优于仅利用视觉模态的可比模型和利用视觉和属性模态的可比模型。 摘要:E-commerce provides rich multimodal data that is barely leveraged in practice. One aspect of this data is a category tree that is being used in search and recommendation. However, in practice, during a user's session there is often a mismatch between a textual and a visual representation of a given category. Motivated by the problem, we introduce the task of category-to-image retrieval in e-commerce and propose a model for the task, CLIP-ITA. The model leverages information from multiple modalities (textual, visual, and attribute modality) to create product representations. We explore how adding information from multiple modalities (textual, visual, and attribute modality) impacts the model's performance. In particular, we observe that CLIP-ITA significantly outperforms a comparable model that leverages only the visual modality and a comparable model that leverages the visual and attribute modality.

【4】 VW-SDK: Efficient Convolutional Weight Mapping Using Variable Windows for Processing-In-Memory Architectures 标题:VW-SDK:在内存处理结构中使用可变窗口的高效卷积权重映射 链接:https://arxiv.org/abs/2112.11282

作者:Johnny Rhe,Sungmin Moon,Jong Hwan Ko 机构:∗Department of Electrical and Computer Engineering, Sungkyunkwan University, Suwon, South Korea, †College of Information and Communication Engineering, Sungkyunkwan University, Suwon, South Korea 备注:Accepted as a conference paper at Design, Automation & Test in Europe Conference & Exhibition (DATE) 2022 摘要:内存处理(PIM)阵列具有较高的能量效率,越来越多地用于卷积神经网络(CNN)推理。在基于PIM的CNN推断中,计算延迟和能量取决于CNN权重如何映射到PIM阵列。最近的一项研究提出了移位和重复内核(SDK)映射,该映射使用一个并行窗口单元重用输入特征映射,该窗口与重复内核进行卷积以并行获得多个输出元素。然而,现有的基于SDK的映射算法并不总是导致最小的计算周期,因为它只映射一个正方形的平行窗口与整个通道。在本文中,我们介绍了一种新的映射算法,称为可变窗口SDK(VW-SDK),它自适应地确定并行窗口的形状,从而使给定卷积层和PIM阵列的计算周期最小。通过允许带有部分通道的矩形窗口,VW-SDK可以更有效地利用PIM阵列,从而进一步减少计算周期。使用512x512 PIM阵列和Resnet-18进行的仿真表明,与现有基于SDK的算法相比,VW-SDK的推理速度提高了1.69倍。 摘要:With their high energy efficiency, processing-in-memory (PIM) arrays are increasingly used for convolutional neural network (CNN) inference. In PIM-based CNN inference, the computational latency and energy are dependent on how the CNN weights are mapped to the PIM array. A recent study proposed shifted and duplicated kernel (SDK) mapping that reuses the input feature maps with a unit of a parallel window, which is convolved with duplicated kernels to obtain multiple output elements in parallel. However, the existing SDK-based mapping algorithm does not always result in the minimum computing cycles because it only maps a square-shaped parallel window with the entire channels. In this paper, we introduce a novel mapping algorithm called variable-window SDK (VW-SDK), which adaptively determines the shape of the parallel window that leads to the minimum computing cycles for a given convolutional layer and PIM array. By allowing rectangular-shaped windows with partial channels, VW-SDK utilizes the PIM array more efficiently, thereby further reduces the number of computing cycles. The simulation with a 512x512 PIM array and Resnet-18 shows that VW-SDK improves the inference speed by 1.69x compared to the existing SDK-based algorithm.

【5】 Value Activation for Bias Alleviation: Generalized-activated Deep Double Deterministic Policy Gradients 标题:用于缓解偏差的价值激活:广义激活的深度双重确定性策略梯度 链接:https://arxiv.org/abs/2112.11216

作者:Jiafei Lyu,Yu Yang,Jiangpeng Yan,Xiu Li 机构:• We propose a novel generalized-activated weighting operator for bias alleviation in deep reinforcement learning., • We theoretically and experimentally show that the distance between the max operator and the generalized-activated 备注:13 pages 摘要:在深度强化学习(DRL)中,准确估计值函数是至关重要的,这样agent就可以执行正确的操作,而不是次优的操作。然而,现有的演员批评方法或多或少地受到低估或高估偏差的影响,这对他们的表现产生了负面影响。在本文中,我们揭示了一个简单但有效的原理:适当的值校正有利于减少偏差,我们提出了广义激活加权算子,它使用任何非递减函数,即激活函数,作为更好的值估计的权重。特别地,我们将广义激活加权算子集成到值估计中,并引入了一种新的算法——广义激活深度双确定性策略梯度(GD3)。我们从理论上证明了GD3能够缓解潜在的估计偏差。我们有趣地发现,简单的激活函数可以在不增加额外技巧的情况下获得令人满意的性能,并且有助于更快的收敛。在大量具有挑战性的连续控制任务上的实验结果表明,具有任务特定激活的GD3优于常用的基线方法。我们还发现了一个事实,即微调多项式激活函数可以在大多数任务中获得更好的结果。 摘要:It is vital to accurately estimate the value function in Deep Reinforcement Learning (DRL) such that the agent could execute proper actions instead of suboptimal ones. However, existing actor-critic methods suffer more or less from underestimation bias or overestimation bias, which negatively affect their performance. In this paper, we reveal a simple but effective principle: proper value correction benefits bias alleviation, where we propose the generalized-activated weighting operator that uses any non-decreasing function, namely activation function, as weights for better value estimation. Particularly, we integrate the generalized-activated weighting operator into value estimation and introduce a novel algorithm, Generalized-activated Deep Double Deterministic Policy Gradients (GD3). We theoretically show that GD3 is capable of alleviating the potential estimation bias. We interestingly find that simple activation functions lead to satisfying performance with no additional tricks, and could contribute to faster convergence. Experimental results on numerous challenging continuous control tasks show that GD3 with task-specific activation outperforms the common baseline methods. We also uncover a fact that fine-tuning the polynomial activation function achieves superior results on most of the tasks.

【6】 Discrete fully probabilistic design: a tool to design control policies from examples 标题:离散全概率设计:从实例设计控制策略的工具 链接:https://arxiv.org/abs/2112.11210

作者:Enrico Ferrentino,Pasquale Chiacchio,Giovanni Russo 机构:Dept. of Information and Electrical Eng. & Applied Math. at the University of Salerno, Italy 摘要:我们提出了一种离散化设计,阐述了Gagliardi和Russo(2021)最近引入的一种算法,用于从约束、可能是随机和非线性系统的示例中综合控制策略。在可能有噪声的示例数据中不需要满足约束条件,而这些数据又可能从不同于受控系统的系统中收集。对于这种离散化设计,我们讨论了一些性质,并给出了一个设计管道。该设计,我们称之为离散全概率设计,在一个示例上进行了数值基准测试,该示例涉及从不满足系统特定驱动约束的物理不同摆采集的数据开始,控制具有驱动约束的倒立摆。 摘要:We present a discretized design that expounds an algorithm recently introduced in Gagliardi and Russo (2021) to synthesize control policies from examples for constrained, possibly stochastic and nonlinear, systems. The constraints do not need to be fulfilled in the possibly noisy example data, which in turn might be collected from a system that is different from the one under control. For this discretized design, we discuss a number of properties and give a design pipeline. The design, which we term as discrete fully probabilistic design, is benchmarked numerically on an example that involves controlling an inverted pendulum with actuation constraints starting from data collected from a physically different pendulum that does not satisfy the system-specific actuation constraints.

【7】 How are cities pledging net zero? A computational approach to analyzing subnational climate strategies 标题:城市是如何承诺净零的呢?一种分析国家以下气候战略的计算方法 链接:https://arxiv.org/abs/2112.11207

作者:Siddharth Sachdeva,Angel Hsu,Ian French,Elwin Lim 机构: Data-Driven EnviroLab, University of North Carolina-Chapel Hill, S. Columbia Street, Chapel Hill, NC , Department University of North Carolina-Chapel Hill, S. Columbia Street, Chapel Hill, NC, Yale-NUS College, College Ave W, Singapore 备注:14 pages, 6 figures, submitted to nature urban sustainability 摘要:城市已经成为气候变化的主要参与者,并且越来越多地制定旨在实现净零排放的目标。国家以下各级政府“竞相实现零排放”并阐明自己的气候缓解计划的迅速扩散,需要进行更仔细的检查,以了解这些行为者打算如何实现这些目标。然而,城市气候政策文件的分散性、不完整性和异质性使得其系统分析具有挑战性。我们使用基于机器学习的自然语言处理(NLP)技术分析了318份气候行动文件,这些文件来自承诺实现净零目标的城市或加入了一项跨国气候倡议。我们使用这些方法来实现两个主要目标:1)确定预测“雄心勃勃”净零排放目标的文本模式,其中我们将雄心勃勃的目标定义为包含国家以下各级政府经济范围排放量的目标;2)进行部门分析,以确定气候行动主题(即土地利用、工业、建筑等)的模式和权衡。我们发现,定义了雄心勃勃的气候行动的城市往往在其计划中强调量化指标和特定的高排放部门,并提到治理和公民参与。城市在其计划中主要强调与能源有关的行动,特别是在建筑、运输和供暖部门,但往往以牺牲其他部门为代价,包括土地使用和气候影响。本文提出的方法为分析气候行动计划提供了一种可复制、可扩展的方法,也是促进跨城市学习的第一步。 摘要:Cities have become primary actors on climate change and are increasingly setting goals aimed at net-zero emissions. The rapid proliferation of subnational governments "racing to zero" emissions and articulating their own climate mitigation plans warrants closer examination to understand how these actors intend to meet these goals. The scattered, incomplete and heterogeneous nature of city climate policy documents, however, has made their systemic analysis challenging. We analyze 318 climate action documents from cities that have pledged net-zero targets or joined a transnational climate initiative with this goal using machine learning-based natural language processing (NLP) techniques. We use these approaches to accomplish two primary goals: 1) determine text patterns that predict "ambitious" net-zero targets, where we define an ambitious target as one that encompasses a subnational government's economy-wide emissions; and 2) perform a sectoral analysis to identify patterns and trade-offs in climate action themes (i.e., land-use, industry, buildings, etc.). We find that cities that have defined ambitious climate actions tend to emphasize quantitative metrics and specific high-emitting sectors in their plans, supported by mentions of governance and citizen participation. Cities predominantly emphasize energy-related actions in their plans, particularly in the buildings, transport and heating sectors, but often at the expense of other sectors, including land-use and climate impacts. The method presented in this paper provides a replicable, scalable approach to analyzing climate action plans and a first step towards facilitating cross-city learning.

【8】 Dynamically Stable Poincaré Embeddings for Neural Manifolds 标题:神经流形的动态稳定Poincaré嵌入 链接:https://arxiv.org/abs/2112.11172

作者:Jun Chen,Yuang Liu,Xiangrui Zhao,Yong Liu 机构: Institute of Cyber-Systems and Control 摘要:在黎曼流形中,Ricci流是一个偏微分方程,用于将度量演化为更规则的度量。我们希望这些度量的拓扑结构可以用来帮助机器学习的任务。然而,这部分工作仍然缺失。在本文中,我们通过对神经流形进行动态稳定的庞加莱嵌入,弥合了Ricci流和深度神经网络之间的鸿沟。因此,我们证明了,如果初始度量有一个偏离Poincaré球上双曲度量的$L^2$-范数扰动,则此类度量的标度Ricci-DeTurck流平滑且指数收敛于双曲度量。具体地说,Ricci流的作用是自然演化为稳定的庞加莱球,然后映射回欧几里德空间。对于Ricci流下的这种动态稳定的神经流形,嵌入这种流形的神经网络的收敛性不受扰动的影响。我们还表明,这种Ricci流辅助神经网络在图像分类任务(CIFAR数据集)上的表现优于其全欧几里德版本。 摘要:In a Riemannian manifold, the Ricci flow is a partial differential equation for evolving the metric to become more regular. We hope that topological structures from such metrics may be used to assist in the tasks of machine learning. However, this part of the work is still missing. In this paper, we bridge this gap between the Ricci flow and deep neural networks by dynamically stable Poincaré embeddings for neural manifolds. As a result, we prove that, if initial metrics have an $L^2$-norm perturbation which deviates from the Hyperbolic metric on the Poincaré ball, the scaled Ricci-DeTurck flow of such metrics smoothly and exponentially converges to the Hyperbolic metric. Specifically, the role of the Ricci flow is to serve as naturally evolving to the stable Poincaré ball that will then be mapped back to the Euclidean space. For such dynamically stable neural manifolds under the Ricci flow, the convergence of neural networks embedded with such manifolds is not susceptible to perturbations. And we show that such Ricci flow assisted neural networks outperform with their all Euclidean versions on image classification tasks (CIFAR datasets).
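
作为背景补充(以下公式为通用定义,并非论文原文的精确设定),Ricci 流与 Poincaré 球上的双曲度量可写作:

```latex
% Ricci 流:度量 g 随时间 t 演化
\partial_t g_{ij} = -2\,\mathrm{Ric}_{ij}(g)
% Poincaré 球 \{ x \in \mathbb{R}^n : |x| < 1 \} 上的双曲度量
g^{\mathrm{hyp}}_{ij}(x) = \frac{4\,\delta_{ij}}{\left(1 - |x|^2\right)^2}
```

摘要中所说的扰动,即初始度量相对该双曲度量的 $L^2$-范数偏差;Ricci-DeTurck 流则是在 Ricci 流上加入微分同胚修正项、使方程严格抛物的标准变体。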

【9】 RepMLPNet: Hierarchical Vision MLP with Re-parameterized Locality 标题:RepMLPNet:具有重新参数化局部性的分层视觉MLP 链接:https://arxiv.org/abs/2112.11081

作者:Xiaohan Ding,Honghao Chen,Xiangyu Zhang,Jungong Han,Guiguang Ding 机构: Beijing National Research Center for Information Science and Technology (BNRist);, School of Software, Tsinghua University, Beijing, China, Institute of Automation, Chinese Academy of Sciences, MEGVII Technology 备注:The code and models are available at this https URL arXiv admin note: text overlap with arXiv:2105.01883 摘要:与卷积层相比,完全连接(FC)层在建模长距离依赖方面更好,但在捕获局部模式方面更差,因此通常不太适合用于图像识别。在本文中,我们提出了一种称为局部注入(Locality Injection)的方法,通过将并行卷积核的已训练参数合并到FC核中,把局部先验注入FC层。局部注入可以看作是一种新的结构再参数化方法,因为它通过变换参数来等价地转换结构。在此基础上,我们提出了一种多层感知器(MLP)块RepMLP块,该块使用三层FC来提取特征,并提出了一种新的结构RepMLPNet。分层设计将RepMLPNet与其他同时提出的视觉MLP区分开来。由于它能生成不同层次的特征图,因此可以作为下游任务(如语义分割)的主干模型。我们的结果表明:1)局部注入是MLP模型的一种通用方法;2) 与其他MLP相比,RepMLPNet具有良好的精度-效率权衡;3) RepMLPNet是第一个可无缝迁移到Cityscapes语义分割的MLP。有关代码和模型,请访问https://github.com/DingXiaoH/RepMLP. 摘要:Compared to convolutional layers, fully-connected (FC) layers are better at modeling the long-range dependencies but worse at capturing the local patterns, hence usually less favored for image recognition. In this paper, we propose a methodology, Locality Injection, to incorporate local priors into an FC layer via merging the trained parameters of a parallel conv kernel into the FC kernel. Locality Injection can be viewed as a novel Structural Re-parameterization method since it equivalently converts the structures via transforming the parameters. Based on that, we propose a multi-layer-perceptron (MLP) block named RepMLP Block, which uses three FC layers to extract features, and a novel architecture named RepMLPNet. The hierarchical design distinguishes RepMLPNet from the other concurrently proposed vision MLPs. As it produces feature maps of different levels, it qualifies as a backbone model for downstream tasks like semantic segmentation. Our results reveal that 1) Locality Injection is a general methodology for MLP models; 2) RepMLPNet has favorable accuracy-efficiency trade-off compared to the other MLPs; 3) RepMLPNet is the first MLP that seamlessly transfer to Cityscapes semantic segmentation. The code and models are available at https://github.com/DingXiaoH/RepMLP.
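
下面是"把并行卷积核等价并入 FC 核"这一思路的最小数值示意(非官方实现,张量尺寸与变量均为假设;官方代码见 https://github.com/DingXiaoH/RepMLP):

```python
# 思路示意:用单位基输入"探测"卷积,构造与其等价的 FC 矩阵,再与已有 FC 核相加
import torch
import torch.nn.functional as F

def conv_to_fc(conv_weight, in_ch, h, w):
    """返回与给定卷积(stride=1、same padding)等价的 FC 权重,形状为 (out_dim, in_dim)。"""
    in_dim = in_ch * h * w
    eye = torch.eye(in_dim).reshape(in_dim, in_ch, h, w)        # 每一行是一个单位基"图像"
    out = F.conv2d(eye, conv_weight, padding=conv_weight.shape[-1] // 2)
    return out.reshape(in_dim, -1).t()

in_ch, h, w = 2, 5, 5
fc = torch.randn(in_ch * h * w, in_ch * h * w)                   # 假设:已训练的 FC 核
conv = torch.randn(in_ch, in_ch, 3, 3)                           # 假设:已训练的并行 3x3 卷积核

fc_merged = fc + conv_to_fc(conv, in_ch, h, w)                   # 局部注入:参数合并

x = torch.randn(1, in_ch, h, w)
y_two_branch = x.flatten(1) @ fc.t() + F.conv2d(x, conv, padding=1).flatten(1)
y_merged = x.flatten(1) @ fc_merged.t()
print(torch.allclose(y_two_branch, y_merged, atol=1e-4))          # 期望输出 True:两分支与合并后等价
```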

【10】 Nonlinear Transform Source-Channel Coding for Semantic Communications 标题:语义通信中的非线性变换信源信道编码 链接:https://arxiv.org/abs/2112.10961

作者:Jincheng Dai,Sixian Wang,Kailin Tan,Zhongwei Si,Xiaoqi Qin,Kai Niu,Ping Zhang 摘要:在本文中,我们提出了一类新的高效的深度联合信源信道编码方法,这种方法能够紧密适应非线性变换下的信源分布,可以被命名为非线性变换信源信道编码(NTSCC)。在所考虑的模型中,发射机首先学习非线性分析变换,将源数据映射到潜在空间,然后通过深度联合信源信道编码将潜在表示发送给接收机。该模型将非线性变换作为强先验,有效地提取信源语义特征,为信源信道编码提供旁侧信息。与现有的深度联合信源信道编码方法不同,所提出的NTSCC本质上学习信源潜在表示和熵模型作为潜在表示的先验。因此,开发了新的自适应速率传输和超先验辅助编解码器细化机制来升级深度联合信源信道编码。整个系统设计被描述为一个优化问题,其目标是在既定的感知质量指标下最小化端到端传输率失真性能。通过简单的示例源和测试图像源,我们发现所提出的NTSCC传输方法通常优于使用标准深度联合信源信道编码的模拟传输和基于经典分离的数字传输。值得注意的是,由于其强大的内容感知能力,所提出的NTSCC方法可能支持未来的语义通信。 摘要:In this paper, we propose a new class of high-efficient deep joint source-channel coding methods that can closely adapt to the source distribution under the nonlinear transform, it can be collected under the name nonlinear transform source-channel coding (NTSCC). In the considered model, the transmitter first learns a nonlinear analysis transform to map the source data into latent space, then transmits the latent representation to the receiver via deep joint source-channel coding. Our model incorporates the nonlinear transform as a strong prior to effectively extract the source semantic features and provide side information for source-channel coding. Unlike existing conventional deep joint source-channel coding methods, the proposed NTSCC essentially learns both the source latent representation and an entropy model as the prior on the latent representation. Accordingly, novel adaptive rate transmission and hyperprior-aided codec refinement mechanisms are developed to upgrade deep joint source-channel coding. The whole system design is formulated as an optimization problem whose goal is to minimize the end-to-end transmission rate-distortion performance under established perceptual quality metrics. Across simple example sources and test image sources, we find that the proposed NTSCC transmission method generally outperforms both the analog transmission using the standard deep joint source-channel coding and the classical separation-based digital transmission. Notably, the proposed NTSCC method can potentially support future semantic communications due to its vigorous content-aware ability.
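
下面用一个高度简化的 PyTorch 草图示意"非线性分析变换 → 深度联合信源信道编码 → AWGN 信道 → 重建"的端到端链路(网络结构为假设,超先验与自适应速率等细节均被省略,并非论文的完整 NTSCC 实现):

```python
# 示意:极简的"非线性变换 + 深度 JSCC"端到端链路
import torch
import torch.nn as nn

class TinyNTSCC(nn.Module):
    def __init__(self, c=16):
        super().__init__()
        self.analysis = nn.Sequential(                          # 非线性分析变换:图像 -> 潜在表示
            nn.Conv2d(3, c, 5, stride=2, padding=2), nn.GELU(),
            nn.Conv2d(c, c, 5, stride=2, padding=2))
        self.jscc_enc = nn.Conv2d(c, c, 1)                      # 深度联合信源信道编码(此处简化为 1x1 卷积)
        self.jscc_dec = nn.Conv2d(c, c, 1)
        self.synthesis = nn.Sequential(                         # 综合变换:潜在表示 -> 重建图像
            nn.ConvTranspose2d(c, c, 5, stride=2, padding=2, output_padding=1), nn.GELU(),
            nn.ConvTranspose2d(c, 3, 5, stride=2, padding=2, output_padding=1))

    def forward(self, x, snr_db=10.0):
        z = self.analysis(x)
        s = self.jscc_enc(z)
        s = s / s.pow(2).mean().sqrt()                          # 发送符号平均功率归一化
        noise_std = (10 ** (-snr_db / 10)) ** 0.5
        s_hat = s + noise_std * torch.randn_like(s)             # AWGN 信道
        return self.synthesis(self.jscc_dec(s_hat))

model = TinyNTSCC()
x = torch.rand(1, 3, 32, 32)
x_hat = model(x)
loss = nn.functional.mse_loss(x_hat, x)                         # 端到端失真项(示意;论文采用感知质量指标)
loss.backward()
print(x_hat.shape, float(loss))
```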

【11】 Common Misconceptions about Population Data 标题:关于人口数据的常见误解 链接:https://arxiv.org/abs/2112.10912

作者:Peter Christen,Rainer Schnell 机构:School of Computing, The Australian National University, Canberra ACT , Australia, Research Methodology Group, University of Duisburg-Essen, Duisburg, Germany 摘要:涵盖所有人口个体的数据库越来越多地用于从公共卫生到社会科学等领域的研究。政府和企业对利用人口数据支持数据驱动的决策也越来越感兴趣。这种数据库的庞大规模常常被误认为是对感兴趣的人群进行有效推断的保证。然而,人口数据的特点使其难以使用,包括如何收集此类数据以及对其应用了何种处理方式等各种假设。此外,人口数据的全部潜力往往只有在与其他数据库链接时才能释放,这一过程增加了新的挑战。这篇文章讨论了关于人口数据的各种误解,我们认为任何使用这些数据的人都需要注意这些误解。这些误解中的许多并没有在科学出版物中得到很好的记录,只是在研究人员和实践者之间进行了轶事性的讨论。最后,我们提出了一套使用人口数据进行推断的建议。 摘要:Databases covering all individuals of a population are increasingly used for research studies in domains ranging from public health to the social sciences. There is also growing interest by governments and businesses to use population data to support data-driven decision making. The massive size of such databases is often mistaken as a guarantee for valid inferences on the population of interest. However, population data have characteristics that make them challenging to use, including various assumptions being made how such data were collected and what types of processing have been applied to them. Furthermore, the full potential of population data can often only be unlocked when such data are linked to other databases, a process that adds fresh challenges. This article discusses a diverse range of misconceptions about population data that we believe anybody who works with such data needs to be aware of. Many of these misconceptions are not well documented in scientific publications but only discussed anecdotally among researchers and practitioners. We conclude with a set of recommendations for inference when using population data.

【12】 Natural language processing to identify lupus nephritis phenotype in electronic health records 标题:自然语言处理识别电子健康档案中狼疮性肾炎表型 链接:https://arxiv.org/abs/2112.10821

作者:Yu Deng,Jennifer A. Pacheco,Anh Chung,Chengsheng Mao,Joshua C. Smith,Juan Zhao,Wei-Qi Wei,April Barnado,Chunhua Weng,Cong Liu,Adam Cordon,Jingzhi Yu,Yacob Tedla,Abel Kho,Rosalind Ramsey-Goldman,Theresa Walunas,Yuan Luo 机构:Affiliations: ,Center for Health Information Partnerships, Feinberg School of Medicine, Northwestern Univer-, Center for Genetic Medicine, Feinberg School of Medicine, Northwestern University, Chicago, USA 摘要:系统性红斑狼疮(SLE)是一种罕见的自身免疫性疾病,其特征是不可预测的发作和缓解过程,表现多样。狼疮性肾炎是SLE的主要疾病表现之一,可导致器官损害和死亡,是狼疮分类标准的关键组成部分。因此,在电子健康记录(EHR)中准确识别狼疮性肾炎将有利于大型队列观察性研究和临床试验,其中患者群体的特征对于招募、研究设计和分析至关重要。狼疮性肾炎可以通过程序代码和结构化数据识别,如实验室检查。然而,记录狼疮性肾炎的其他关键信息,如肾活检的组织学报告和先前的病史叙述,需要复杂的文本处理来从病理报告和临床记录中挖掘信息。在这项研究中,我们开发了使用EHR数据识别狼疮性肾炎的算法,无论是否使用自然语言处理(NLP)。我们开发了四种算法:仅使用结构化数据的基于规则的算法(基线算法)和使用不同NLP模型的三种算法。三个NLP模型基于正则化逻辑回归,分别使用不同的特征集,包括概念唯一标识符(CUI)的正面提及、CUI的出现次数以及三个成分的混合。在范德比尔特大学医学中心(VUMC)的数据集上对基线算法和性能最佳的NLP算法进行了外部验证。与基线狼疮肾炎算法相比,在NMEDW(0.41 vs 0.79)和VUMC(0.62 vs 0.96)数据集中,我们的最佳NLP模型结合了结构化数据、正则表达式概念和映射CUI的特征,提高了F度量。 摘要:Systemic lupus erythematosus (SLE) is a rare autoimmune disorder characterized by an unpredictable course of flares and remission with diverse manifestations. Lupus nephritis, one of the major disease manifestations of SLE for organ damage and mortality, is a key component of lupus classification criteria. Accurately identifying lupus nephritis in electronic health records (EHRs) would therefore benefit large cohort observational studies and clinical trials where characterization of the patient population is critical for recruitment, study design, and analysis. Lupus nephritis can be recognized through procedure codes and structured data, such as laboratory tests. However, other critical information documenting lupus nephritis, such as histologic reports from kidney biopsies and prior medical history narratives, require sophisticated text processing to mine information from pathology reports and clinical notes. In this study, we developed algorithms to identify lupus nephritis with and without natural language processing (NLP) using EHR data. We developed four algorithms: a rule-based algorithm using only structured data (baseline algorithm) and three algorithms using different NLP models. The three NLP models are based on regularized logistic regression and use different sets of features including positive mention of concept unique identifiers (CUIs), number of appearances of CUIs, and a mixture of three components respectively. The baseline algorithm and the best performed NLP algorithm were external validated on a dataset from Vanderbilt University Medical Center (VUMC). Our best performing NLP model incorporating features from both structured data, regular expression concepts, and mapped CUIs improved F measure in both the NMEDW (0.41 vs 0.79) and VUMC (0.62 vs 0.96) datasets compared to the baseline lupus nephritis algorithm.
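
作为补充,下面给出一个与"CUI 出现次数特征 + 正则化逻辑回归"思路对应的极简示意(患者数据与特征名均为虚构,并非该研究的算法实现或真实数据):

```python
# 示意:以 CUI 计数为特征的 L2 正则化逻辑回归
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# 每位患者一条记录:从笔记/报告中抽取的概念及其出现次数(特征名与数值均为虚构)
patients = [
    {"C_nephritis": 3, "C_biopsy": 1},
    {"C_lupus": 2},
]
labels = [1, 0]   # 1 = 狼疮性肾炎,0 = 否

model = make_pipeline(
    DictVectorizer(sparse=True),
    LogisticRegression(penalty="l2", C=1.0, max_iter=1000),
)
model.fit(patients, labels)
print(model.predict([{"C_nephritis": 1}]))
```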

【13】 More is Less: Inducing Sparsity via Overparameterization 标题:多即是少:过度参数化导致稀疏性 链接:https://arxiv.org/abs/2112.11027

作者:Hung-Hsu Chou,Johannes Maly,Holger Rauhut 摘要:在深度学习中,通常会对神经网络进行过参数化,即使用比训练样本更多的参数。非常令人惊讶的是,通过(随机)梯度下降训练神经网络,可以得到泛化性非常好的模型,而按照经典统计学的观点这本应导致过拟合。为了理解这一隐式偏差现象,我们研究了稀疏恢复(压缩感知)这一本身就很有意义的特殊情况。更准确地说,为了从欠定线性测量中重构向量,我们引入了一个相应的过参数化平方损失泛函,其中要重构的向量被深度分解为多个向量。我们证明,在测量矩阵的一个非常温和的假设下,过参数化损失泛函的普通(vanilla)梯度流收敛到最小$\ell_1$-范数的解。后者以促进稀疏解而著称。作为副产品,我们的结果显著改进了已有工作中压缩感知的样本复杂度。该理论准确预测了数值实验中的恢复率。在证明中,我们引入了"解熵"(solution entropy)的概念,它绕过了非凸性带来的障碍,其本身也具有独立的意义。 摘要:In deep learning it is common to overparameterize the neural networks, that is, to use more parameters than training samples. Quite surprisingly training the neural network via (stochastic) gradient descent leads to models that generalize very well, while classical statistics would suggest overfitting. In order to gain understanding of this implicit bias phenomenon we study the special case of sparse recovery (compressive sensing) which is of interest on its own. More precisely, in order to reconstruct a vector from underdetermined linear measurements, we introduce a corresponding overparameterized square loss functional, where the vector to be reconstructed is deeply factorized into several vectors. We show that, under a very mild assumption on the measurement matrix, vanilla gradient flow for the overparameterized loss functional converges to a solution of minimal $\ell_1$-norm. The latter is well-known to promote sparse solutions. As a by-product, our results significantly improve the sample complexity for compressive sensing in previous works. The theory accurately predicts the recovery rate in numerical experiments. For the proofs, we introduce the concept of \textit{solution entropy}, which bypasses the obstacles caused by non-convexity and should be of independent interest.
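
下面用一个玩具实验示意摘要所述的隐式偏差现象:以 Hadamard 型过参数化 $x = u \odot u - v \odot v$(这是对论文中"深度分解"的一个简化假设)做梯度下降,小初始化下会收敛到近似最小 $\ell_1$-范数、也就是稀疏的解:

```python
# 玩具实验:过参数化 + 小初始化 + 梯度下降 => 稀疏恢复(示意,非论文的精确设定)
import numpy as np

rng = np.random.default_rng(0)
n, m, s = 100, 40, 3                              # 信号维数、测量数、稀疏度
A = rng.normal(size=(m, n)) / np.sqrt(m)
x_true = np.zeros(n)
x_true[rng.choice(n, size=s, replace=False)] = rng.normal(size=s)
y = A @ x_true                                    # 欠定线性测量

alpha, lr, steps = 1e-3, 0.01, 50000              # 小初始化尺度与步长
u = np.full(n, alpha)
v = np.full(n, alpha)
for _ in range(steps):
    x = u * u - v * v
    g = A.T @ (A @ x - y)                         # 平方损失对 x 的梯度
    u = u - 2 * lr * g * u                        # 链式法则:dL/du =  2 g ⊙ u
    v = v + 2 * lr * g * v                        # 链式法则:dL/dv = -2 g ⊙ v

x_hat = u * u - v * v
print("相对重构误差:", np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))
```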

【14】 The entropic barrier is n-self-concordant 标题:熵屏障是n-自协调的 链接:https://arxiv.org/abs/2112.10947

作者:Sinho Chewi 机构:Massachusetts Institute of Technology (MIT) 备注:13 pages 摘要:对于任何凸体$K \subseteq \mathbb{R}^n$,S.Bubeck和R.Eldan在$K$上引入了熵势垒,并证明它是$(1+o(1))\,n$-自协调势垒。在本文中,我们观察到,由于维数型Brascamp-Lieb不等式,自协调参数的最优界为$n$。 摘要:For any convex body $K \subseteq \mathbb{R}^n$, S. Bubeck and R. Eldan introduced the entropic barrier on $K$ and showed that it is a $(1+o(1))\, n$-self-concordant barrier. In this note, we observe that the optimal bound of $n$ on the self-concordance parameter holds as a consequence of the dimensional Brascamp-Lieb inequality.
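
作为背景,熵势垒的构造(按 Bubeck 与 Eldan 的定义,记号为此处补充)大致如下:

```latex
% 对数拉普拉斯变换
f(\theta) = \log \int_{K} e^{\langle \theta, x \rangle}\, \mathrm{d}x
% 熵势垒取其 Fenchel 共轭,定义在 K 的内部
f^{*}(x) = \sup_{\theta \in \mathbb{R}^{n}} \bigl\{ \langle \theta, x \rangle - f(\theta) \bigr\}
% 本文结论:f^{*} 的自协调参数可取到最优的 n
```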

【15】 The effective noise of Stochastic Gradient Descent 标题:随机梯度下降的有效噪声 链接:https://arxiv.org/abs/2112.10852

作者:Francesca Mignacco,Pierfrancesco Urbani 机构:Université Paris-Saclay, CNRS, CEA, Institut de physique théorique, Gif-sur-Yvette, France. 备注:7 pages + appendix, 5 figures 摘要:随机梯度下降法(SGD)是深度学习技术的主要算法。在训练阶段的每一步,从训练数据集中抽取一小批样本,并根据该特定样本子集的性能调整神经网络的权重。小批量抽样过程在梯度下降过程中引入了随机动力学,并带有非平凡的状态相关噪声。在一个典型的神经网络模型中,我们描述了SGD的随机性和最近引入的变体,持久性SGD。在欠参数化区域,当最终训练误差为正时,SGD动力学达到平稳状态,我们根据波动耗散定理定义了有效温度,该定理由动力学平均场理论计算得出。我们使用有效温度来量化SGD噪声的大小,作为问题参数的函数。在训练误差为零的过参数化区域,我们通过计算具有相同初始化和两种不同SGD噪声实现的系统的两个副本之间的平均距离来测量SGD的噪声幅度。我们发现,作为问题参数的函数,这两种噪声度量的行为相似。此外,我们观察到,噪声较大的算法会导致相应约束满足问题的决策边界变宽。 摘要:Stochastic Gradient Descent (SGD) is the workhorse algorithm of deep learning technology. At each step of the training phase, a mini batch of samples is drawn from the training dataset and the weights of the neural network are adjusted according to the performance on this specific subset of examples. The mini-batch sampling procedure introduces a stochastic dynamics to the gradient descent, with a non-trivial state-dependent noise. We characterize the stochasticity of SGD and a recently-introduced variant, persistent SGD, in a prototypical neural network model. In the under-parametrized regime, where the final training error is positive, the SGD dynamics reaches a stationary state and we define an effective temperature from the fluctuation-dissipation theorem, computed from dynamical mean-field theory. We use the effective temperature to quantify the magnitude of the SGD noise as a function of the problem parameters. In the over-parametrized regime, where the training error vanishes, we measure the noise magnitude of SGD by computing the average distance between two replicas of the system with the same initialization and two different realizations of SGD noise. We find that the two noise measures behave similarly as a function of the problem parameters. Moreover, we observe that noisier algorithms lead to wider decision boundaries of the corresponding constraint satisfaction problem.
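
下面是摘要中"过参数化区域用双副本距离度量 SGD 噪声"这一思路的玩具示意(模型、数据与超参数均为假设,仅说明度量方式):

```python
# 示意:相同初始化、不同小批量抽样(即不同 SGD 噪声实现)的两个副本,训练后测参数距离
import copy
import torch
import torch.nn as nn

torch.manual_seed(0)
X, y = torch.randn(512, 20), torch.randint(0, 2, (512,))
net_a = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))
net_b = copy.deepcopy(net_a)                       # 两个副本共享同一初始化

def train(net, seed, steps=500, bs=32, lr=0.1):
    g = torch.Generator().manual_seed(seed)        # 不同 seed = 不同的小批量抽样噪声
    opt = torch.optim.SGD(net.parameters(), lr=lr)
    for _ in range(steps):
        idx = torch.randint(0, X.shape[0], (bs,), generator=g)
        loss = nn.functional.cross_entropy(net(X[idx]), y[idx])
        opt.zero_grad(); loss.backward(); opt.step()

train(net_a, seed=1)
train(net_b, seed=2)

dist = sum((pa - pb).pow(2).sum() for pa, pb in zip(net_a.parameters(), net_b.parameters())).sqrt()
print("两副本参数向量的欧氏距离(示意):", float(dist))
```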

机器翻译,仅供参考