Paper, DL, BP: "Understanding the difficulty of training deep feedforward neural networks"
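Reading the original paper
Original paper: Understanding the difficulty of training deep feedforward neural networks
Content and key points
Limitations of the four-layer sigmoid network
With sigmoid activations, the test loss and training loss stay at about 0.5 for many epochs, and only then drop to a barely satisfactory value of about 0.1.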
We hypothesize that this behavior is due to the combination of random initialization and the fact that an hidden unit output of 0 corresponds to a saturated sigmoid. Note that deep networks with sigmoids but initialized from unsupervised pre-training (e.g. from RBMs) do not suffer from this saturation behavior.
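To make the saturation argument concrete, here is a minimal sketch using only the Python standard library. The helper names `sigmoid` and `sigmoid_grad` are chosen for illustration and do not come from the paper. It shows that as a sigmoid unit's output approaches 0, its derivative approaches 0 too, so very little gradient passes back through that unit.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic sigmoid 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x: float) -> float:
    """Derivative of the sigmoid, expressed through its output: s * (1 - s)."""
    s = sigmoid(x)
    return s * (1.0 - s)


# The derivative peaks at x = 0 (output 0.5) and vanishes as the output nears 0.
for x in (0.0, -2.0, -5.0, -10.0):
    print(f"x={x:6.1f}  output={sigmoid(x):.5f}  gradient={sigmoid_grad(x):.5f}")
```

By contrast, tanh and softsign output 0 at the point where their slope is largest, so an output of 0 is not a saturated state for them.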
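Limitations of the five-layer tanh and softsign networks
After switching to the tanh function, the network converges well and quickly.
Conclusions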
1. The normalization factor may therefore be important when initializing deep networks because of the multiplicative effect through layers, and we suggest the following initialization procedure to approximately satisfy our objectives of maintaining activation variances and back-propagated gradients variance as one moves up or down the network. We call it the normalized initialization.
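The normalized initialization in the paper draws each weight from a uniform distribution U[-√6/√(n_j + n_{j+1}), √6/√(n_j + n_{j+1})], where n_j and n_{j+1} are the sizes of the two layers the weight matrix connects. Below is a minimal sketch using only the Python standard library. The function name `normalized_init` and the list-of-lists weight layout are assumptions made for illustration.

```python
import math
import random


def normalized_init(n_in: int, n_out: int, rng: random.Random) -> list:
    """Return an n_in x n_out weight matrix drawn from U[-limit, limit],
    with limit = sqrt(6) / sqrt(n_in + n_out)."""
    limit = math.sqrt(6.0) / math.sqrt(n_in + n_out)
    return [[rng.uniform(-limit, limit) for _ in range(n_out)]
            for _ in range(n_in)]


# Example: a 300 -> 200 layer (sizes chosen arbitrarily for this sketch).
rng = random.Random(0)
weights = normalized_init(300, 200, rng)
flat = [w for row in weights for w in row]
empirical_var = sum(w * w for w in flat) / len(flat)

# A uniform U[-a, a] has variance a^2 / 3, so the target is 2 / (n_in + n_out).
print("target variance   :", 2.0 / (300 + 200))
print("empirical variance:", empirical_var)
```

The variance 2 / (n_in + n_out) is a compromise between the forward condition (variance 1 / n_in keeps activation variance stable) and the backward condition (variance 1 / n_out keeps gradient variance stable).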
2. The results show that the activation values are distributed more evenly with normalized initialization.
Activation values normalized histograms with hyperbolic tangent activation, with standard (top) vs normalized initialization (bottom). Top: 0-peak increases for higher layers.
Several conclusions can be drawn from these error curves:
(1) The more classical neural networks with sigmoid or hyperbolic tangent units and standard initialization fare rather poorly, converging more slowly and apparently towards ultimately poorer local minima.
(2) The softsign networks seem to be more robust to the initialization procedure than the tanh networks, presumably because of their gentler non-linearity.
(3) For tanh networks, the proposed normalized initialization can be quite helpful, presumably because the layer-to-layer transformations maintain magnitudes of activations (flowing upward) and gradients (flowing backward).
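Point (3) can be checked on a toy example. The sketch below is not the paper's experiment. It pushes random Gaussian inputs through five equal-width tanh layers with zero biases and records the standard deviation of the activations at each layer, once with the standard initialization U[-1/√n, 1/√n] and once with the normalized initialization. The width, depth, batch size, and all function names are assumptions chosen to keep the example small.

```python
import math
import random


def init_weights(n_in, n_out, limit, rng):
    """n_in x n_out matrix with entries drawn from U[-limit, limit]."""
    return [[rng.uniform(-limit, limit) for _ in range(n_out)]
            for _ in range(n_in)]


def forward_tanh(batch, weights):
    """Apply tanh(x @ W) to every row x of the batch (biases are zero)."""
    n_out = len(weights[0])
    out = []
    for x in batch:
        z = [sum(x[i] * weights[i][j] for i in range(len(x)))
             for j in range(n_out)]
        out.append([math.tanh(v) for v in z])
    return out


def activation_std(batch):
    """Standard deviation over all activation values in the batch."""
    values = [v for row in batch for v in row]
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))


def layer_stds(limit_fn, width=64, depth=5, batch_size=32, seed=0):
    """Activation std after each of `depth` tanh layers of equal width."""
    rng = random.Random(seed)
    h = [[rng.gauss(0.0, 1.0) for _ in range(width)]
         for _ in range(batch_size)]
    stds = []
    for _ in range(depth):
        w = init_weights(width, width, limit_fn(width, width), rng)
        h = forward_tanh(h, w)
        stds.append(activation_std(h))
    return stds


standard = layer_stds(lambda n_in, n_out: 1.0 / math.sqrt(n_in))
normalized = layer_stds(lambda n_in, n_out: math.sqrt(6.0 / (n_in + n_out)))

for k, (s, n) in enumerate(zip(standard, normalized), start=1):
    print(f"layer {k}: standard std={s:.4f}  normalized std={n:.4f}")
```

With the standard initialization, n · Var(W) = 1/3, so in the near-linear regime of tanh the activation variance shrinks by roughly that factor at every layer. With the normalized initialization and equal layer widths, n · Var(W) = 1, so the activations keep a comparable magnitude from layer to layer.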
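3. In the labels, "Sigmoid 5" denotes a sigmoid network with 5 layers and "N" denotes normalized initialization. The results indicate that pre-training reaches a lower error.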