PyTorch Lightning中的迁移学习实践指南-CSDN博客

本文链接：https://blog.csdn.net/gitblog_00756/article/details/148360192

PyTorch Lightning中的迁移学习实践指南

pytorch-lightning Lightning-AI/pytorch-lightning: PyTorch Lightning 是一个轻量级的高级接口，用于简化 PyTorch 中深度学习模型的训练流程。它抽象出了繁杂的工程细节，使研究者能够专注于模型本身的逻辑和实验设计，同时仍能充分利用PyTorch底层的灵活性。项目地址: https://gitcode.com/gh_mirrors/py/pytorch-lightning

什么是迁移学习

迁移学习(Transfer Learning)是深度学习中的一种重要技术，它允许我们将一个预训练模型的知识迁移到新的任务上。这种方法特别适用于数据量有限的新任务，因为它可以利用在大规模数据集上训练得到的通用特征表示。

为什么选择PyTorch Lightning进行迁移学习

PyTorch Lightning作为一个轻量级的PyTorch封装框架，为迁移学习提供了简洁而强大的支持。它保持了PyTorch的灵活性，同时通过标准化的训练流程简化了模型开发。

基础使用方式

使用任意PyTorch模型

在PyTorch Lightning中，任何继承自torch.nn.Module的模型都可以无缝使用，因为LightningModule本身就是nn.Module的子类。这意味着现有的PyTorch模型可以轻松集成到Lightning框架中。

使用预训练的LightningModule

下面是一个典型的使用预训练自动编码器作为特征提取器的示例：

class CIFAR10Classifier(LightningModule):
    def __init__(self):
        super().__init__()
        # 加载预训练模型
        self.feature_extractor = AutoEncoder.load_from_checkpoint(PATH)
        self.feature_extractor.freeze()  # 冻结参数
        
        # 添加新的分类层
        self.classifier = nn.Linear(100, 10)  # 假设自动编码器输出100维特征
        
    def forward(self, x):
        representations = self.feature_extractor(x)
        return self.classifier(representations)

在这个例子中，我们冻结了预训练模型的所有参数，只训练新添加的分类层，这是迁移学习的常见做法。

计算机视觉应用实例：ImageNet预训练模型

对于计算机视觉任务，我们可以利用在ImageNet上预训练的模型：

class ImagenetTransferLearning(LightningModule):
    def __init__(self):
        super().__init__()
        # 加载预训练的ResNet50
        backbone = models.resnet50(weights="DEFAULT")
        num_filters = backbone.fc.in_features
        
        # 移除最后的全连接层
        self.feature_extractor = nn.Sequential(*list(backbone.children())[:-1])
        self.feature_extractor.eval()  # 设置为评估模式
        
        # 添加新的分类层
        self.classifier = nn.Linear(num_filters, 10)  # CIFAR-10有10个类别
        
    def forward(self, x):
        with torch.no_grad():  # 不计算梯度
            representations = self.feature_extractor(x).flatten(1)
        return self.classifier(representations)

微调与预测

# 微调模型
model = ImagenetTransferLearning()
trainer = Trainer()
trainer.fit(model)

# 预测
model = ImagenetTransferLearning.load_from_checkpoint(PATH)
model.freeze()  # 冻结所有参数
predictions = model(x)  # x是你的输入数据

在实际应用中，我们通常在小型数据集上微调预训练模型，然后用于特定领域的预测任务。

自然语言处理应用实例：BERT模型

PyTorch Lightning同样适用于NLP领域的迁移学习，例如使用BERT模型：

class BertMNLIFinetuner(LightningModule):
    def __init__(self):
        super().__init__()
        # 加载预训练的BERT模型
        self.bert = BertModel.from_pretrained("bert-base-cased", output_attentions=True)
        self.bert.train()  # 设置为训练模式
        
        # 添加新的分类层
        self.W = nn.Linear(self.bert.config.hidden_size, 3)  # MNLI有3个类别
        self.num_classes = 3
        
    def forward(self, input_ids, attention_mask, token_type_ids):
        h, _, attn = self.bert(
            input_ids=input_ids,
            attention_mask=attention_mask,
            token_type_ids=token_type_ids
        )
        h_cls = h[:, 0]  # 获取[CLS]标记的隐藏状态
        return self.W(h_cls), attn