What year was the Swin Transformer proposed?
### The Year Swin Transformer Was Proposed, and Related Background
Swin Transformer is a hierarchical Transformer-based architecture designed for computer vision tasks. The model was proposed in **2021**[^1]. By introducing the shifted window mechanism, Swin Transformer substantially reduces computational complexity while improving its ability to model local features[^1].
Key characteristics of the Swin Transformer:
- Hierarchical structure: the image is divided into non-overlapping windows, and self-attention is computed independently within each window, producing a layered feature representation.
- Shifted window mechanism: by alternating between regular and shifted window partitions, the model strengthens cross-window feature interaction while keeping computational cost low.
- Multi-scale feature extraction: by progressively merging neighboring patches between stages, the model builds multi-scale feature maps suited to a range of vision tasks such as image classification, object detection, and semantic segmentation (see the patch-merging sketch after the code block below).
```python
import tensorflow as tf
from tensorflow.keras import layers

def window_partition(x, window_size):
    # Split a feature map into non-overlapping windows of size
    # (window_size, window_size).
    batch_size, height, width, channels = x.shape
    patch_height = height // window_size
    patch_width = width // window_size
    x = tf.reshape(x, (batch_size, patch_height, window_size,
                       patch_width, window_size, channels))
    windows = tf.transpose(x, (0, 1, 3, 2, 4, 5))
    # Flatten to (batch_size * num_windows, window_size, window_size, channels).
    windows = tf.reshape(windows, (-1, window_size, window_size, channels))
    return windows

def shifted_window_attention(x, window_size, num_heads):
    # Simplified shifted-window self-attention. For brevity this sketch omits
    # the attention mask the paper uses to block interactions between pixels
    # that were not adjacent before the cyclic shift, as well as the reverse
    # shift and window merge that restore the original spatial layout.
    _, height, width, channels = x.shape
    shift_size = window_size // 2
    # Cyclically shift the feature map so that window boundaries move.
    x_shifted = tf.roll(x, shift=[-shift_size, -shift_size], axis=[1, 2])
    windows = window_partition(x_shifted, window_size)
    # Project to queries, keys, and values with one dense layer.
    # (A real model would create this layer once, not on every call.)
    qkv = layers.Dense(3 * channels)(windows)
    qkv = tf.reshape(qkv, (-1, window_size * window_size, 3,
                           num_heads, channels // num_heads))
    qkv = tf.transpose(qkv, (2, 0, 3, 1, 4))
    q, k, v = qkv[0], qkv[1], qkv[2]
    # Scaled dot-product attention within each window.
    attention_scores = tf.matmul(q, k, transpose_b=True) / tf.math.sqrt(
        tf.cast(channels // num_heads, tf.float32))
    attention_probs = tf.nn.softmax(attention_scores, axis=-1)
    output = tf.matmul(attention_probs, v)
    output = tf.transpose(output, (0, 2, 1, 3))
    output = tf.reshape(output, (-1, window_size, window_size, channels))
    return output
```
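To make the shapes concrete, here is a minimal usage sketch of the two functions above on a dummy feature map. The input size (56×56, 96 channels) mirrors the first stage of Swin-T, but the values here are assumptions for illustration only:

```python
import tensorflow as tf

# Dummy feature map: (batch, height, width, channels).
x = tf.random.normal((2, 56, 56, 96))

windows = window_partition(x, window_size=7)
print(windows.shape)  # (128, 7, 7, 96): 2 * (56/7) * (56/7) = 128 windows

out = shifted_window_attention(x, window_size=7, num_heads=3)
print(out.shape)  # (128, 7, 7, 96), still in per-window layout
```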
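The multi-scale feature extraction described above relies on a patch-merging step between stages: the paper concatenates each 2×2 group of neighboring patches along the channel axis and linearly projects the 4C-dimensional result down to 2C, halving the spatial resolution while doubling the channel width. A minimal sketch of that idea (the function name `patch_merging` is ours, and the layer would be created once in a real model):

```python
import tensorflow as tf
from tensorflow.keras import layers

def patch_merging(x):
    # x: (batch, height, width, channels); height and width assumed even.
    # Gather the four patches of every 2x2 neighborhood along the channel axis.
    x0 = x[:, 0::2, 0::2, :]  # top-left
    x1 = x[:, 1::2, 0::2, :]  # bottom-left
    x2 = x[:, 0::2, 1::2, :]  # top-right
    x3 = x[:, 1::2, 1::2, :]  # bottom-right
    x = tf.concat([x0, x1, x2, x3], axis=-1)  # (batch, H/2, W/2, 4C)
    # Linear projection from 4C down to 2C, as in the Swin Transformer paper.
    channels = x.shape[-1] // 4
    x = layers.Dense(2 * channels, use_bias=False)(x)
    return x
```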