[Stats385] Lecture 01-02, warm up with some questions


Theories of Deep Learning

Using this course as a way into the local battles over strategically important ground; red text marks concepts worth digging into, along with points that came to mind and deserve attention.

 

Lecture 01

Lecture 01: Deep Learning Challenge. Is There Theory? (Donoho/Monajemi/Papyan)

Video link

Purely an introduction; not much of substance.

 


Lecture 02

Video: Stats385 - Theories of Deep Learning - David Donoho - Lecture 2

Reference: http://deeplearning.net/reading-list/ [a somewhat interesting link]

 

Readings for this lecture

1 A mathematical theory of deep convolutional neural networks for feature extraction
2 Energy propagation in deep convolutional neural networks
3 Discrete deep feature extraction: A theory and new architectures
4 Topology reduction in deep convolutional feature extraction networks

 

Key points:

Unfamiliar concepts: energy propagation, topology reduction

Lecturer said: 

"Deep learning is simply an era where brute force has suddenly exploded its potential."

"How to use brute-force (with limited scope) methods to yield results."

 

ImageNet is introduced (nothing much to add), followed by basic back-propagation.

 

A passing remark:

Newton, the inventor of Newton's method, never imagined it being applied to something like neural networks. Awkward small talk.

 

Common cost functions for the output [supplementary]; three are introduced:

Assume z is the actual output and t is the target output.

squared error: E = (z - t)^2 / 2
cross entropy: E = -t log(z) - (1 - t) log(1 - z)
softmax: E = -(z_i - log Σ_j exp(z_j)), where i is the correct class.
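
A minimal numpy sketch of the three costs (function and variable names are my own, not from the lecture):

```python
import numpy as np

def squared_error(z, t):
    """E = (z - t)^2 / 2, summed over outputs."""
    return 0.5 * np.sum((z - t) ** 2)

def cross_entropy(z, t):
    """E = -t log(z) - (1 - t) log(1 - z), for sigmoid outputs z in (0, 1)."""
    return np.sum(-t * np.log(z) - (1 - t) * np.log(1 - z))

def softmax_nll(z, i):
    """E = -(z_i - log Σ_j exp(z_j)), for raw scores z and correct class i."""
    return -(z[i] - np.log(np.sum(np.exp(z))))

# tiny usage example
z = np.array([2.0, 1.0, 0.1])
print(softmax_nll(z, 0))   # negative log-likelihood of class 0
```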

 

Difficulty 1:

The big name Yann LeCun: http://yann.lecun.com/exdb/publis/pdf/lecun-88.pdf

Understanding back-propagation via Lagrange multipliers (the Lagrangian formulation), taken from the introduction of the linked paper.
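
A rough sketch of that Lagrangian view, in my own notation (not copied from the paper): write the network as constraints x_k = f_k(x_{k-1}, w_k) for k = 1..L with cost C(x_L), and form

```latex
% Lagrangian for a feed-forward network (sketch, my notation)
\mathcal{L}(x, w, \lambda)
  = C(x_L) + \sum_{k=1}^{L} \lambda_k^{\top}\,\bigl(x_k - f_k(x_{k-1}, w_k)\bigr)
```

Setting ∂L/∂x_L = 0 gives λ_L = -∇C(x_L); setting ∂L/∂x_k = 0 for k < L gives the backward recursion λ_k = (∂f_{k+1}/∂x_k)^T λ_{k+1}; and the weight gradients come out as ∂L/∂w_k = -(∂f_k/∂w_k)^T λ_k, which is exactly the back-propagation sweep.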

 

 

Next, the common convolutional network architectures are introduced, along with the features each one brought in.

On the regularization side, it is mentioned that dropout has an effect roughly equivalent to ridge regression.

In the loss function, weight decay is the coefficient placed in front of the regularization term. The regularization term generally measures model complexity, so weight decay controls how strongly model complexity affects the loss: with a large weight decay, a complex model incurs a large loss.
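
To make the two points above concrete, a tiny numpy sketch (all names are mine): weight decay as the coefficient on an L2 complexity term, plus an inverted-dropout mask for the forward pass.

```python
import numpy as np

def regularized_loss(data_loss, weights, weight_decay=1e-4):
    """Total loss = data loss + weight_decay * (L2 complexity term).

    A larger weight_decay makes complex (large-weight) models pay a larger penalty.
    """
    complexity = 0.5 * sum(np.sum(w ** 2) for w in weights)
    return data_loss + weight_decay * complexity

def dropout(x, p=0.5, rng=np.random.default_rng(0)):
    """Inverted dropout: zero each unit with probability p, rescale survivors by 1/(1-p)."""
    mask = (rng.random(x.shape) >= p)
    return x * mask / (1.0 - p)
```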


Difficulty 2:

The comparison of AlexNet vs. Olshausen and Field leads to some deeper questions:

  • Why does AlexNet learn filters similar to Olshausen/Field?
  • Is there an implicit sparsity promotion in training the network?
  • How would classification results change if we replaced the learned filters in the first layer with analytically defined wavelets, e.g. Gabors?
  • Filters in the first layer are spatially localized, oriented and bandpass. What properties do the filters in the remaining layers satisfy?
  • Can we derive this mathematically?
These topics seem to be developed in later lectures; noting them here (a small Gabor-filter sketch follows below).
Ref reading: sparse coding paper
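
On the Gabor question above: an analytically defined Gabor filter takes only a few lines of numpy. A minimal sketch (parameter values are arbitrary, for illustration only):

```python
import numpy as np

def gabor_kernel(size=11, theta=0.0, sigma=2.0, wavelength=4.0, gamma=0.5, psi=0.0):
    """Analytically defined Gabor filter: spatially localized, oriented, bandpass."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    x_r = x * np.cos(theta) + y * np.sin(theta)       # rotate coordinates by theta
    y_r = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(x_r ** 2 + (gamma * y_r) ** 2) / (2 * sigma ** 2))
    carrier = np.cos(2 * np.pi * x_r / wavelength + psi)
    return envelope * carrier

# a small bank of orientations, as one might drop into a first conv layer
bank = [gabor_kernel(theta=t) for t in np.linspace(0, np.pi, 8, endpoint=False)]
```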
 
 
Batch Normalization: 
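
As a quick reminder of what batch normalization computes at training time, a minimal numpy sketch (variable names mine):

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Normalize each feature over the batch, then rescale and shift.

    x: (batch, features); gamma, beta: learned per-feature parameters.
    """
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta
```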

 

 
A question raised here is quite interesting:

Does this imply filters can be learned in an unsupervised manner?

 

 

Difficulty 3:

On visualizing convolutional filters, and the principle behind DeepDream.
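
Roughly, DeepDream runs gradient ascent on the input image so that whatever a chosen layer already responds to gets amplified. A minimal PyTorch-flavoured sketch of one step, assuming `model` is a placeholder that returns the activations of the chosen layer:

```python
import torch

def deepdream_step(model, image, lr=0.05):
    """One gradient-ascent step: increase the norm of the chosen layer's activations."""
    image = image.clone().detach().requires_grad_(True)
    activations = model(image)          # activations of the chosen layer
    loss = activations.norm()           # 'amplify what is already there'
    loss.backward()
    with torch.no_grad():
        # normalized gradient step on the input image itself
        image += lr * image.grad / (image.grad.abs().mean() + 1e-8)
    return image.detach()
```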

  

Difficulty 4:

One more difficulty to add: strategies for weight initialization.
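
Two standard strategies as a numpy sketch (Xavier/Glorot and He initialization; whether the lecture covers exactly these variants, I have not re-checked):

```python
import numpy as np

rng = np.random.default_rng(0)

def xavier_init(fan_in, fan_out):
    """Glorot/Xavier: variance ~ 2 / (fan_in + fan_out), suited to tanh/sigmoid units."""
    std = np.sqrt(2.0 / (fan_in + fan_out))
    return rng.normal(0.0, std, size=(fan_in, fan_out))

def he_init(fan_in, fan_out):
    """He: variance ~ 2 / fan_in, suited to ReLU units."""
    std = np.sqrt(2.0 / fan_in)
    return rng.normal(0.0, std, size=(fan_in, fan_out))
```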

 


 

Links: 

For the key points and difficulties mentioned above, the corresponding blog links will be attached here later.