### Recent Residual Block (Residual Block) Designs in Deep Learning Architectures
The residual block was first proposed by He et al. Its core idea is to introduce skip connections so that deep neural networks can optimize their weights more easily while mitigating the vanishing-gradient problem[^1]. This mechanism markedly improves training efficiency and convergence speed.
#### ResNet and Its Improved Variants
The original ResNet passes the input directly to later layers through a simple addition, yielding an expression of the form:
\[ \text{output} = F(x, W_i) + x \]
where \(F(x, W_i)\) denotes the output of the convolutional transformation and \(x\) is the input tensor. To further improve performance, researchers have proposed several refinements. Pre-Activation ResNet, for example, moves Batch Normalization and the activation function ahead of each convolution (BN → ReLU → conv), which stabilizes training and leaves the identity path free of nonlinearities[^2].
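To make this concrete, here is a minimal sketch of a pre-activation basic block, assuming PyTorch; the class name `PreActBlock` and its single `channels` parameter are illustrative choices, not part of any official implementation:
```python
import torch.nn as nn

class PreActBlock(nn.Module):
    """Minimal pre-activation residual block: BN -> ReLU -> conv, applied twice."""
    def __init__(self, channels):
        super(PreActBlock, self).__init__()
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3,
                               padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3,
                               padding=1, bias=False)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        # Normalization and activation come BEFORE each convolution,
        # so the identity path carries the raw input unchanged.
        out = self.conv1(self.relu(self.bn1(x)))
        out = self.conv2(self.relu(self.bn2(out)))
        return out + x  # identity shortcut: F(x) + x
```
Because the shortcut here is a pure identity, gradients can flow through the addition without passing through any nonlinearity, which is what eases optimization in very deep stacks.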
#### Newer Variants: Bottleneck Structures with Attention Mechanisms
In recent years, attention mechanisms have increasingly been woven into traditional residual module designs. These newer structures retain the advantages of the original framework while sharpening the network's focus on informative features. SE-ResNet, for example, uses inter-channel relationships to adaptively reweight each channel, while CBAM builds attention maps along both the spatial and channel dimensions[^3].
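The full example further below implements the SE design; for CBAM, a rough sketch of its two sequential attention steps might look like the following. This is a simplified illustration assuming PyTorch; the class `SimpleCBAM` and its parameters are hypothetical names, not the reference implementation:
```python
import torch
import torch.nn as nn

class SimpleCBAM(nn.Module):
    """Illustrative CBAM-style module: channel attention, then spatial attention."""
    def __init__(self, channels, reduction=16, spatial_kernel=7):
        super(SimpleCBAM, self).__init__()
        # Channel attention: a shared MLP scores avg- and max-pooled descriptors
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )
        # Spatial attention: a conv turns (avg, max) maps into a single mask
        self.spatial = nn.Conv2d(2, 1, kernel_size=spatial_kernel,
                                 padding=spatial_kernel // 2, bias=False)

    def forward(self, x):
        b, c, _, _ = x.size()
        # --- channel attention ---
        avg = self.mlp(x.mean(dim=(2, 3)))         # (B, C) from average pooling
        mx = self.mlp(x.amax(dim=(2, 3)))          # (B, C) from max pooling
        ca = torch.sigmoid(avg + mx).view(b, c, 1, 1)
        x = x * ca                                 # reweight channels
        # --- spatial attention ---
        avg_map = x.mean(dim=1, keepdim=True)      # (B, 1, H, W)
        max_map = x.amax(dim=1, keepdim=True)      # (B, 1, H, W)
        sa = torch.sigmoid(self.spatial(torch.cat([avg_map, max_map], dim=1)))
        return x * sa                              # reweight spatial positions
```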
Below is a typical PyTorch implementation of a bottleneck residual unit with an SE attention mechanism:
```python
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-Excitation: reweights channels using global context."""
    def __init__(self, channel, reduction=16):
        super(SEBlock, self).__init__()
        self.avg_pool = nn.AdaptiveAvgPool2d(1)  # squeeze: global average pooling
        self.fc = nn.Sequential(                 # excitation: bottleneck MLP
            nn.Linear(channel, channel // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channel // reduction, channel),
            nn.Sigmoid()                         # per-channel weights in (0, 1)
        )

    def forward(self, x):
        b, c, _, _ = x.size()
        y = self.avg_pool(x).view(b, c)          # (B, C, H, W) -> (B, C)
        y = self.fc(y).view(b, c, 1, 1)          # learned channel weights
        return x * y.expand_as(x)                # rescale each channel

class ResidualBlockWithSE(nn.Module):
    expansion = 4  # bottleneck expands channels 4x, as in ResNet-50/101/152

    def __init__(self, inplanes, planes, stride=1, downsample=None):
        super(ResidualBlockWithSE, self).__init__()
        # 1x1 conv: reduce channels
        self.conv1 = nn.Conv2d(inplanes, planes, kernel_size=1, bias=False)
        self.bn1 = nn.BatchNorm2d(planes)
        self.relu = nn.ReLU(inplace=True)
        # 3x3 conv: spatial processing (applies stride when downsampling)
        self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=stride,
                               padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(planes)
        # 1x1 conv: expand channels back
        self.conv3 = nn.Conv2d(planes, planes * self.expansion, kernel_size=1,
                               bias=False)
        self.bn3 = nn.BatchNorm2d(planes * self.expansion)
        self.se = SEBlock(planes * self.expansion)
        self.downsample = downsample  # projects identity path when shapes differ

    def forward(self, x):
        residual = x

        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)

        out = self.conv2(out)
        out = self.bn2(out)
        out = self.relu(out)

        out = self.conv3(out)
        out = self.bn3(out)
        out = self.se(out)  # channel recalibration before the addition

        if self.downsample is not None:
            residual = self.downsample(x)

        out += residual     # skip connection
        out = self.relu(out)
        return out
```
The code above defines a standard bottleneck residual block augmented with a Squeeze-and-Excitation component: the input passes through three successive convolutions with different kernel sizes (1×1, 3×3, 1×1), after which the SE module injects global context awareness to recalibrate the channel responses of the final output[^4].
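As a quick usage sketch (the shapes follow standard ResNet conventions; the 1×1 projection passed as `downsample` is needed whenever the stride or channel count changes, and the concrete numbers here are illustrative):
```python
import torch
import torch.nn as nn

# The identity path must match the expanded output (planes * 4 channels,
# spatially downsampled), so a 1x1 projection is supplied as `downsample`.
downsample = nn.Sequential(
    nn.Conv2d(64, 64 * ResidualBlockWithSE.expansion, kernel_size=1,
              stride=2, bias=False),
    nn.BatchNorm2d(64 * ResidualBlockWithSE.expansion),
)
block = ResidualBlockWithSE(inplanes=64, planes=64, stride=2,
                            downsample=downsample)

x = torch.randn(1, 64, 56, 56)
print(block(x).shape)  # torch.Size([1, 256, 28, 28])
```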