Handwritten Digit Recognition with an AlexNet Model in PyCharm
### Using AlexNet for Handwritten Digit Recognition
To perform handwritten digit recognition with an AlexNet model in PyCharm, a complete solution can be built by following the steps below. The details are as follows:
#### 1. Data Preparation
Handwritten digit recognition usually relies on the MNIST or Fashion-MNIST dataset, both of which can be loaded directly through `torchvision.datasets`[^2].
The data-loading code is as follows:
```python
import torch
from torchvision import datasets, transforms
transform = transforms.Compose([
    transforms.Resize((224, 224)),      # resize images to the 224x224 input expected by AlexNet
    transforms.ToTensor(),              # convert PIL images to tensors
    transforms.Normalize([0.5], [0.5])  # normalize the single grayscale channel
])
train_dataset = datasets.MNIST(root='./data', train=True, download=True, transform=transform)
test_dataset = datasets.MNIST(root='./data', train=False, download=True, transform=transform)
train_loader = torch.utils.data.DataLoader(dataset=train_dataset, batch_size=64, shuffle=True)
test_loader = torch.utils.data.DataLoader(dataset=test_dataset, batch_size=64, shuffle=False)
```
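As a quick sanity check (a small illustrative snippet, not part of the original workflow), you can pull a single batch from `train_loader` and confirm that the images have become single-channel 224x224 tensors:
```python
# One batch should contain 64 grayscale images of size 224x224 and 64 labels.
images, labels = next(iter(train_loader))
print(images.shape)  # torch.Size([64, 1, 224, 224])
print(labels.shape)  # torch.Size([64])
```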
#### 2. Defining the AlexNet Model
AlexNet is a classic convolutional neural network architecture well suited to classification tasks. Since MNIST has 10 classes (the digits 0 through 9), the output dimension of the final fully connected layer must be changed to 10[^1].
The model definition is as follows:
```python
import torch.nn as nn

class AlexNet(nn.Module):
    def __init__(self, num_classes=10):
        super(AlexNet, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 64, kernel_size=11, stride=4, padding=2),  # input channels set to 1 for grayscale MNIST
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Conv2d(64, 192, kernel_size=5, padding=2),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Conv2d(192, 384, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(384, 256, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(256, 256, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
        )
        self.avgpool = nn.AdaptiveAvgPool2d((6, 6))
        self.classifier = nn.Sequential(
            nn.Dropout(),
            nn.Linear(256 * 6 * 6, 4096),
            nn.ReLU(inplace=True),
            nn.Dropout(),
            nn.Linear(4096, 4096),
            nn.ReLU(inplace=True),
            nn.Linear(4096, num_classes),
        )

    def forward(self, x):
        x = self.features(x)
        x = self.avgpool(x)
        x = torch.flatten(x, 1)
        x = self.classifier(x)
        return x
```
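Before training, it can help to verify the modified architecture with a dummy forward pass (an illustrative check, assuming the class definition above): a single-channel 224x224 input should produce 10 logits.
```python
# Dummy forward pass: one grayscale 224x224 image in, 10 class logits out.
net = AlexNet(num_classes=10)          # throwaway instance just for the shape check
out = net(torch.randn(1, 1, 224, 224))
print(out.shape)                       # torch.Size([1, 10])
```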
#### 3. Training
Training requires a loss function, an optimizer, and evaluation metrics. Cross-entropy loss is the standard choice for multi-class classification.
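As a brief illustration of how `nn.CrossEntropyLoss` is used (a minimal example, not part of the original answer): it takes raw, unnormalized logits of shape `[batch, num_classes]` together with integer class labels of shape `[batch]`, and applies the softmax internally.
```python
# CrossEntropyLoss expects raw logits and integer class indices (no one-hot, no manual softmax).
loss_fn = nn.CrossEntropyLoss()
logits = torch.randn(4, 10)               # 4 samples, 10 classes
targets = torch.tensor([3, 0, 7, 9])      # ground-truth digit labels
print(loss_fn(logits, targets).item())    # a single scalar loss value
```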
The training code is as follows:
```python
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = AlexNet(num_classes=10).to(device)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
def train_model(model, criterion, optimizer, dataloader, epochs=10):
    model.train()
    for epoch in range(epochs):
        running_loss = 0.0
        correct_predictions = 0
        total_samples = 0
        for inputs, labels in dataloader:
            inputs, labels = inputs.to(device), labels.to(device)

            # Forward pass, loss computation, and parameter update
            optimizer.zero_grad()
            outputs = model(inputs)
            loss = criterion(outputs, labels)
            loss.backward()
            optimizer.step()

            # Accumulate loss and accuracy statistics for this epoch
            _, preds = torch.max(outputs, 1)
            running_loss += loss.item() * inputs.size(0)
            correct_predictions += torch.sum(preds == labels.data)
            total_samples += labels.size(0)

        epoch_loss = running_loss / total_samples
        accuracy = correct_predictions.double() / total_samples
        print(f'Epoch {epoch+1}/{epochs}, Loss: {epoch_loss:.4f}, Accuracy: {accuracy:.4f}')

train_model(model, criterion, optimizer, train_loader)
```
#### 4. Testing and Validation
The test phase evaluates model performance. Computing accuracy or other evaluation metrics helps to understand how well the model performs.
The test code is as follows:
```python
def test_model(model, dataloader):
    model.eval()
    correct_predictions = 0
    total_samples = 0
    with torch.no_grad():  # disable gradient tracking during evaluation
        for inputs, labels in dataloader:
            inputs, labels = inputs.to(device), labels.to(device)
            outputs = model(inputs)
            _, preds = torch.max(outputs, 1)
            correct_predictions += torch.sum(preds == labels.data)
            total_samples += labels.size(0)
    accuracy = correct_predictions.double() / total_samples
    print(f'Test Accuracy: {accuracy:.4f}')

test_model(model, test_loader)
```
---
### Notes
PyTorch's dynamic-graph mechanism can make debugging harder, especially for beginners who are not yet familiar with how tensor shapes change through the network. It is advisable to check the function and output of each part of the code step by step to make sure everything behaves as expected.
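One simple way to do this (a minimal sketch added for illustration, reusing the `model` and `device` defined above) is to trace the tensor shape through each stage of the network:
```python
# Trace how the tensor shape changes through each stage of the network.
x = torch.randn(1, 1, 224, 224).to(device)
x = model.features(x)
print('after features:  ', x.shape)  # torch.Size([1, 256, 6, 6])
x = model.avgpool(x)
print('after avgpool:   ', x.shape)  # torch.Size([1, 256, 6, 6])
x = torch.flatten(x, 1)
print('after flatten:   ', x.shape)  # torch.Size([1, 9216])
x = model.classifier(x)
print('after classifier:', x.shape)  # torch.Size([1, 10])
```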
---