PyTorch: Initialization Functions Implemented in torch.nn.init
Reference: the official PyTorch documentation
1. Uniform distribution
torch.nn.init.uniform_(tensor, a=0.0, b=1.0)
Explanation:
Fills the input Tensor with values drawn from the uniform distribution $\mathcal{U}(a, b)$.
Parameters:
- tensor – an n-dimensional torch.Tensor
- a – the lower bound of the uniform distribution
- b – the upper bound of the uniform distribution
Example:
import torch

w = torch.empty(2, 2)
print('before init w = \n', w)
torch.nn.init.uniform_(w, a=0.0, b=1.0)
print('after init w = \n', w)
Result:
before init w =
tensor([[1.4013e-45, 0.0000e+00],
[0.0000e+00, 0.0000e+00]])
after init w =
tensor([[0.8658, 0.3711],
[0.8950, 0.1419]])
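Note that the trailing underscore marks an in-place operation, so these functions can also be applied directly to a layer's parameters. A minimal sketch (the Linear layer and its sizes are illustrative, not from the original example):
from torch import nn

layer = nn.Linear(4, 3)  # weight shape: (3, 4)
torch.nn.init.uniform_(layer.weight, a=0.0, b=1.0)  # fills the parameter in place (runs under no_grad internally)
print(layer.weight)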
2. Normal (Gaussian) distribution
torch.nn.init.normal_(tensor, mean=0.0, std=1.0)
Explanation:
Fills the input Tensor with values drawn from the normal distribution $\mathcal{N}(\text{mean}, \text{std}^{2})$.
Parameters:
- tensor – an n-dimensional torch.Tensor
- mean – the mean of the normal distribution
- std – the standard deviation of the normal distribution
Example:
w = torch.empty(2, 2)
print('before init w = \n', w)
torch.nn.init.normal_(w, mean=10, std=0.01)
print('after init w = \n', w)
Result:
before init w =
tensor([[2.3877e-38, 1.0010e+01],
[2.2421e-44, 0.0000e+00]])
after init w =
tensor([[10.0128, 10.0086],
[10.0064, 9.9983]])
3. Constant initialization
torch.nn.init.constant_(tensor, val)
Explanation:
Fills the input Tensor with the value val.
Parameters:
- tensor – an n-dimensional torch.Tensor
- val – the value to fill the tensor with
Example:
w = torch.empty(2, 2)
print('before init w = \n', w)
torch.nn.init.constant_(w, 18)
print('after init w = \n', w)
Result:
before init w =
tensor([[1.4013e-45, 0.0000e+00],
[0.0000e+00, 0.0000e+00]])
after init w =
tensor([[18., 18.],
[18., 18.]])
4. Initialization to all ones
torch.nn.init.ones_(tensor)
Explanation:
Fills the input Tensor with the scalar value 1.
Parameters:
- tensor – an n-dimensional torch.Tensor
Example:
w = torch.empty(2, 2)
print('before init w = \n', w)
torch.nn.init.ones_(w)
print('after init w = \n', w)
Result:
before init w =
tensor([[9.1477e-41, 0.0000e+00],
[8.4078e-44, 0.0000e+00]])
after init w =
tensor([[1., 1.],
[1., 1.]])
5. Initialization to all zeros
torch.nn.init.zeros_(tensor)
Explanation:
Fills the input Tensor with the scalar value 0.
Parameters:
- tensor – an n-dimensional torch.Tensor
Example:
w = torch.empty(2, 2)
print('before init w = \n', w)
torch.nn.init.zeros_(w)
print('after init w = \n', w)
Result:
before init w =
tensor([[9.1477e-41, 0.0000e+00],
[4.4842e-44, 0.0000e+00]])
after init w =
tensor([[0., 0.],
[0., 0.]])
6. Identity matrix initialization
torch.nn.init.eye_(tensor)
Explanation:
Fills the 2-dimensional input Tensor with the identity matrix. Preserves the identity of the inputs in Linear layers, where as many inputs are preserved as possible.
Parameters:
- tensor – a 2-dimensional torch.Tensor
Example:
w = torch.empty(2, 2)
print('before init w = \n', w)
torch.nn.init.eye_(w)
print('after init w = \n', w)
Result:
before init w =
tensor([[1., 1.],
[1., 1.]])
after init w =
tensor([[1., 0.],
[0., 1.]])
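To see what "preserves the identity of the inputs in Linear layers" means, here is a small sketch (assuming a square Linear layer with its bias zeroed, which goes beyond the original example): with an identity weight matrix, the layer returns its input unchanged.
from torch import nn

layer = nn.Linear(2, 2)
torch.nn.init.eye_(layer.weight)  # weight becomes the 2x2 identity matrix
torch.nn.init.zeros_(layer.bias)  # zero the bias so the layer computes y = I @ x
x = torch.tensor([[3.0, -1.0]])
print(layer(x))                   # tensor([[ 3., -1.]]) - the input passes through unchanged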
7. Xavier uniform distribution
torch.nn.init.xavier_uniform_(tensor, gain=1.0)
Explanation:
Fills the input Tensor with values according to the method described in Understanding the difficulty of training deep feedforward neural networks - Glorot, X. & Bengio, Y. (2010), using a uniform distribution. The resulting tensor will have values sampled from $\mathcal{U}(-a, a)$ where
$a = \operatorname{gain} \times \sqrt{\frac{6}{\text{fan\_in} + \text{fan\_out}}}$
Parameters:
- tensor – an n-dimensional torch.Tensor
- gain – an optional scaling factor
Example:
from torch import nn

w = torch.empty(2, 2)
print('before init w = \n', w)
torch.nn.init.xavier_uniform_(w, gain=nn.init.calculate_gain('relu'))
print('after init w = \n', w)
Result:
before init w =
tensor([[1.4013e-45, 0.0000e+00],
[0.0000e+00, 0.0000e+00]])
after init w =
tensor([[ 0.6120, -0.9743],
[-1.5010, 0.5827]])
Example:
gain = nn.init.calculate_gain('relu')
gain
Result:
1.4142135623730951
Example:
gain = nn.init.calculate_gain('sigmoid')
gain
Result:
1
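Putting this together for the 2×2 example above: fan_in = fan_out = 2 and gain = √2 for 'relu', so the bound is $a = \sqrt{2} \times \sqrt{6/4} = \sqrt{3} \approx 1.732$, which matches the range of the sampled values. A quick check (the fan values assume the same 2×2 shape):
import math

fan_in, fan_out = 2, 2                 # both dimensions of the 2x2 tensor
gain = nn.init.calculate_gain('relu')  # sqrt(2)
bound = gain * math.sqrt(6.0 / (fan_in + fan_out))
print(bound)                           # 1.7320508075688772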
8. Xavier normal distribution
torch.nn.init.xavier_normal_(tensor, gain=1.0)
Explanation:
Fills the input Tensor with values according to the method described in Understanding the difficulty of training deep feedforward neural networks - Glorot, X. & Bengio, Y. (2010), using a normal distribution. The resulting tensor will have values sampled from $\mathcal{N}(0, \text{std}^{2})$ where
$\operatorname{std} = \operatorname{gain} \times \sqrt{\frac{2}{\text{fan\_in} + \text{fan\_out}}}$
Parameters:
- tensor – an n-dimensional torch.Tensor
- gain – an optional scaling factor
Example:
w = torch.empty(2, 2)
print('before init w = \n', w)
torch.nn.init.xavier_normal_(w, gain=nn.init.calculate_gain('relu'))
print('after init w = \n', w)
Result:
before init w =
tensor([[0., 0.],
[0., 0.]])
after init w =
tensor([[ 0.9703, 1.0088],
[ 1.1271, -0.0602]])
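For the 2×2 example this gives $\operatorname{std} = \sqrt{2} \times \sqrt{2/4} = 1$, i.e. the values above are drawn from a standard normal. A quick check (same shape assumptions as the previous sketch):
import math

gain = nn.init.calculate_gain('relu')  # sqrt(2)
std = gain * math.sqrt(2.0 / (2 + 2))  # fan_in + fan_out = 4
print(std)                             # 1.0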
9. He (Kaiming) uniform distribution
torch.nn.init.kaiming_uniform_(tensor, a=0, mode='fan_in', nonlinearity='leaky_relu')
Explanation:
Fills the input Tensor with values according to the method described in Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification - He, K. et al. (2015), using a uniform distribution. The resulting tensor will have values sampled from $\mathcal{U}(-\text{bound}, \text{bound})$ where
$\text{bound} = \operatorname{gain} \times \sqrt{\frac{3}{\text{fan\_mode}}}$
Parameters:
- tensor – an n-dimensional torch.Tensor
- a – the negative slope of the rectifier used after this layer (only used with 'leaky_relu')
- mode – either 'fan_in' (default) or 'fan_out'. Choosing 'fan_in' preserves the magnitude of the variance of the weights in the forward pass. Choosing 'fan_out' preserves the magnitudes in the backward pass.
- nonlinearity – the non-linear function (nn.functional name), recommended to use only with 'relu' or 'leaky_relu' (default).
Example:
w = torch.empty(2, 2)
print('before init w = \n', w)
torch.nn.init.kaiming_uniform_(w, mode='fan_in', nonlinearity='relu')
print('after init w = \n', w)
Result:
before init w =
tensor([[-3.6893e+19, 1.3658e+00],
[ 2.2421e-44, 0.0000e+00]])
after init w =
tensor([[-0.8456, 1.3498],
[-0.8480, -1.1506]])
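For the 2×2 example with mode='fan_in' and nonlinearity='relu': fan_mode = fan_in = 2 and gain = √2, so $\text{bound} = \sqrt{2} \times \sqrt{3/2} = \sqrt{3} \approx 1.732$, and the sampled values above indeed fall inside ±bound. A quick check (the fan_in value assumes the 2×2 shape):
import math

fan_in = 2                             # w.size(1) for a 2-D tensor
gain = nn.init.calculate_gain('relu')  # sqrt(2)
bound = gain * math.sqrt(3.0 / fan_in)
print(bound)                           # 1.7320508075688772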
10. He (Kaiming) normal distribution
torch.nn.init.kaiming_normal_(tensor, a=0, mode='fan_in', nonlinearity='leaky_relu')
Explanation:
Fills the input Tensor with values according to the method described in Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification - He, K. et al. (2015), using a normal distribution. The resulting tensor will have values sampled from $\mathcal{N}(0, \text{std}^{2})$ where
$\operatorname{std} = \frac{\operatorname{gain}}{\sqrt{\text{fan\_mode}}}$
Parameters:
- tensor – an n-dimensional torch.Tensor
- a – the negative slope of the rectifier used after this layer (only used with 'leaky_relu')
- mode – either 'fan_in' (default) or 'fan_out'. Choosing 'fan_in' preserves the magnitude of the variance of the weights in the forward pass. Choosing 'fan_out' preserves the magnitudes in the backward pass.
- nonlinearity – the non-linear function (nn.functional name), recommended to use only with 'relu' or 'leaky_relu' (default).
Example:
w = torch.empty(2, 2)
print('before init w = \n', w)
torch.nn.init.kaiming_normal_(w, mode='fan_out', nonlinearity='relu')
print('after init w = \n', w)
Result:
before init w =
tensor([[-0.8456, 1.3498],
[-0.8480, -1.1506]])
after init w =
tensor([[-1.0357, -1.1732],
[ 0.1517, 0.4935]])
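Here mode='fan_out' gives fan_mode = 2 for the 2×2 tensor, so std = √2 / √2 = 1. In practice these initializers are usually applied to a whole model rather than to loose tensors; a minimal sketch using Module.apply (the model architecture below is illustrative, not from the original article):
from torch import nn

def init_weights(m):
    # He-initialize every Linear layer; leave other modules untouched.
    if isinstance(m, nn.Linear):
        nn.init.kaiming_normal_(m.weight, mode='fan_in', nonlinearity='relu')
        nn.init.zeros_(m.bias)

model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4))
model.apply(init_weights)  # recursively calls init_weights on every submodule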