PyTorch Learning
- torch.nn.init.xavier_uniform_(tensor, gain=1.0)
During network training, vanishing or exploding gradients occur easily, so most of the gradients obtained through backpropagation either have no effect or work against the optimization. A sensible weight-initialization method is therefore needed to keep the numerical distributions throughout the computation more stable.
Xavier initialization, also called Glorot initialization, was proposed in the paper *Understanding the difficulty of training deep feedforward neural networks*.
The resulting values are sampled from \(\mathcal{U}(-a,a)\), where
\[a=gain\times\sqrt{\frac{6}{fan\_in+fan\_out}}\]
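As a quick sanity check, here is a minimal sketch (layer sizes 256 and 128 are chosen purely for illustration) verifying that the weights of a Linear layer initialized with xavier_uniform_ fall inside \([-a,a]\):

```python
import math
import torch.nn as nn

# Initialize the weight of a Linear layer with Xavier uniform initialization.
linear = nn.Linear(256, 128)            # weight shape (128, 256): fan_in=256, fan_out=128
nn.init.xavier_uniform_(linear.weight, gain=1.0)

# a = gain * sqrt(6 / (fan_in + fan_out))
a = 1.0 * math.sqrt(6.0 / (256 + 128))
print(linear.weight.min().item() >= -a)  # True: all values are bounded below by -a
print(linear.weight.max().item() <= a)   # True: all values are bounded above by a
```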
A similar function is torch.nn.init.xavier_normal_, whose results are sampled from \(\mathcal{N}(0,std^2)\), where
\[std=gain\times\sqrt{\frac{2}{fan\_in+fan\_out}}\]
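Similarly, a minimal sketch (same illustrative layer sizes) comparing the empirical standard deviation of xavier_normal_ samples against the formula above:

```python
import math
import torch.nn as nn

# Initialize the weight of a Linear layer with Xavier normal initialization.
linear = nn.Linear(256, 128)
nn.init.xavier_normal_(linear.weight, gain=1.0)

# std = gain * sqrt(2 / (fan_in + fan_out))
expected_std = 1.0 * math.sqrt(2.0 / (256 + 128))
print(expected_std)                # ~0.0722
print(linear.weight.std().item())  # empirical std, should be close to expected_std
```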
nonlinearity | gain |
---|---|
Linear/Identity | \(1\) |
Conv{1,2,3}D | \(1\) |
Sigmoid | \(1\) |
Tanh | \(\frac{5}{3}\) |
ReLU | \(\sqrt{2}\) |
Leaky ReLU | \(\sqrt{\frac{2}{1+negative\_slope^2}}\) |
SELU | \(\frac{3}{4}\) |
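The recommended gain for each nonlinearity can be queried with torch.nn.init.calculate_gain; a short sketch checking a few entries of the table above:

```python
import torch.nn as nn

# calculate_gain returns the recommended gain factor for a nonlinearity.
print(nn.init.calculate_gain('linear'))           # 1
print(nn.init.calculate_gain('tanh'))             # 5/3 ≈ 1.6667
print(nn.init.calculate_gain('relu'))             # sqrt(2) ≈ 1.4142
print(nn.init.calculate_gain('leaky_relu', 0.2))  # sqrt(2 / (1 + 0.2**2)) ≈ 1.3868
print(nn.init.calculate_gain('selu'))             # 3/4
```

In practice this gain is passed to the Xavier initializers, e.g. nn.init.xavier_uniform_(w, gain=nn.init.calculate_gain('relu')).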