zl Programming Tutorials


Neural Networks with Linear Threshold Activations: Structure and Algorithms

Posted: 2023-03-15 21:57:45

In this article, we present new results on neural networks with linear threshold activation functions. We precisely characterize the class of functions representable by such neural networks, and show that 2 hidden layers are both necessary and sufficient to represent any representable function in this class. This is a surprising result in light of recent exact-representability investigations for neural networks using other popular activation functions, such as rectified linear units (ReLU). We also give precise bounds on the size of the neural network needed to represent any function in this class. Finally, we design an algorithm that solves the empirical risk minimization (ERM) problem to global optimality for these neural networks with a fixed architecture. If the input dimension and the size of the network architecture are treated as fixed constants, the algorithm's running time is polynomial in the size of the data sample. The algorithm is unique in that it works for architectures with any number of layers, whereas previous polynomial-time globally optimal algorithms apply only to very restricted classes of architectures.
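To make the object of study concrete, here is a minimal sketch of a feedforward network with linear threshold (Heaviside step) activations. The 2-hidden-layer example below computes the indicator function of the open unit square in the plane; the specific weights and biases are hypothetical, chosen purely for illustration and are not from the paper.

```python
import numpy as np

def threshold(z):
    # Linear threshold (Heaviside) activation: 1 if z > 0, else 0.
    return (z > 0).astype(float)

def forward(x, layers):
    """Forward pass of a linear-threshold network.

    layers: list of (W, b) pairs; the threshold activation is applied
    after every layer except the last, which is a plain linear output.
    """
    a = x
    for W, b in layers[:-1]:
        a = threshold(a @ W + b)
    W, b = layers[-1]
    return a @ W + b

# Hypothetical 2-hidden-layer network computing the indicator of the
# open unit square (0,1)^2 in R^2.
W1 = np.array([[1., -1., 0., 0.],
               [0.,  0., 1., -1.]])
b1 = np.array([0., 1., 0., 1.])   # units fire for x>0, x<1, y>0, y<1
W2 = np.array([[1.], [1.], [1.], [1.]])
b2 = np.array([-3.5])             # fires only if all four conditions hold
W3 = np.array([[1.]])
b3 = np.array([0.])

net = [(W1, b1), (W2, b2), (W3, b3)]
print(forward(np.array([0.5, 0.5]), net))  # inside  -> [1.]
print(forward(np.array([2.0, 0.5]), net))  # outside -> [0.]
```

Because the activation outputs only 0 or 1, every function such a network represents is piecewise constant; the paper's structural result says that, unlike for ReLU networks, depth beyond 2 hidden layers adds no further representational power.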

Original title: Neural networks with linear threshold activations: structure and algorithms

Original: In this article we present new results on neural networks with linear threshold activation functions. We precisely characterize the class of functions that are representable by such neural networks and show that 2 hidden layers are necessary and sufficient to represent any function representable in the class. This is a surprising result in the light of recent exact representability investigations for neural networks using other popular activation functions like rectified linear units (ReLU). We also give precise bounds on the sizes of the neural networks required to represent any function in the class. Finally, we design an algorithm to solve the empirical risk minimization (ERM) problem to global optimality for these neural networks with a fixed architecture. The algorithm's running time is polynomial in the size of the data sample, if the input dimension and the size of the network architecture are considered fixed constants. The algorithm is unique in the sense that it works for any architecture with any number of layers, whereas previous polynomial time globally optimal algorithms work only for very restricted classes of architectures.