BUPTOJ 407. BLOCKS

该博客主要介绍了BUPTOJ 407问题的解决思路,通过动态规划(dp)记录每行字符'#'到达的最大高度。接着,使用O(n^2)的时间复杂度遍历,检查每一行 '# '字符构成的矩形是否满足条件:各行高度相同且宽度一致,以此来判断有效矩形的存在。

摘要生成于 C知道 ,由 DeepSeek-R1 满血版支持, 前往体验 >

dp记录每行向上的#的高度

然后O(n^2)直接扫过去即可

判断矩形条件:每行#高度相等且宽度相等


/*
USER_ID: test#ggvalid
PROBLEM: 407
SUBMISSION_TIME: 2014-07-14 17:45:35
*/
#include <iostream>
#include <cstdio>
#include <cstring>
#include <cctype>
#include <cstdlib>
#include <cmath>
#include <algorithm>
 
using namespace std;
 
int t, n, m;
int dp[1005][1005];
char s[2][1005];
 
int judge()
{
    int ret=0;
    for(int i=1; i<=n; i++){
        for(int j=1; j<=m; j++){
            if(dp[i][j]){
                int tmp=j;
                int ans=dp[i][j], ans2=dp[i+1][j];
                while(j<=m){
                    if(dp[i][j]!=0&&dp[i][j]!=ans) return -1;
                    if(dp[i][j]!=0&&dp[i+1][j]!=ans2) return -1;
                    j++;
                    if(!dp[i][j]) break;
                }
                j--;
                if(!dp[i+1][j]) ret++;
            }
        }
    }
    return ret;
}
 
int main()
{
    while(~scanf("%d%d",&n,&m)){
        int ans=-1, mm=0;
        memset(dp,0,sizeof dp);
        for(int i=1; i<=n; i++){
            scanf("%s",s[mm]);
            mm^=1;
            for(int j=1; j<=m; j++){
                if(i!=1&&s[mm][j-1]=='#') dp[i][j]=(s[mm^1][j-1]=='#'?(dp[i-1][j]+1):0);
                else dp[i][j]=s[mm^1][j-1]=='#'?1:0;
            }
        }
        ans=judge();
        if(ans==-1) puts("So Sad");
        else printf("There are %d ships.\n",ans);
    }
    return 0;
}


RuntimeError: Error(s) in loading state_dict for ActionModel: Missing key(s) in state_dict: "net.blocks.12.attn.qkv.weight", "net.blocks.12.attn.qkv.bias", "net.blocks.12.attn.proj.weight", "net.blocks.12.attn.proj. bias", "net.blocks.12.mlp.fc1.weight", "net.blocks.12.mlp.fc1.bias", "net.blocks.12.mlp.fc2.weight", "net.blocks.12.mlp.fc2.bias", "net.blocks.13.attn.qkv.weight ", "net.blocks.13.attn.qkv.bias", "net.blocks.13.attn.proj.weight", "net.blocks.13.attn.proj.bias", "net.blocks.13.mlp.fc1.weight", "net.blocks.13.mlp.fc1.bias", "net.blocks.13.mlp.fc2.weight", "net.blocks.13.mlp.fc2.bias", "net.blocks.14.attn.qkv.weight", "net.blocks.14.attn.qkv.bias", "net.blocks.14.attn.proj.weight", "net.blocks.14.attn.proj.bias", "net.blocks.14.mlp.fc1.weight", "net.blocks.14.mlp.fc1.bias", "net.blocks.14.mlp.fc2.weight", "net.blocks.14.mlp.fc2.bias", "net. blocks.15.attn.qkv.weight", "net.blocks.15.attn.qkv.bias", "net.blocks.15.attn.proj.weight", "net.blocks.15.attn.proj.bias", "net.blocks.15.mlp.fc1.weight", "net .blocks.15.mlp.fc1.bias", "net.blocks.15.mlp.fc2.weight", "net.blocks.15.mlp.fc2.bias", "net.blocks.16.attn.qkv.weight", "net.blocks.16.attn.qkv.bias", "net.bloc ks.16.attn.proj.weight", "net.blocks.16.attn.proj.bias", "net.blocks.16.mlp.fc1.weight", "net.blocks.16.mlp.fc1.bias", "net.blocks.16.mlp.fc2.weight", "net.block s.16.mlp.fc2.bias", "net.blocks.17.attn.qkv.weight", "net.blocks.17.attn.qkv.bias", "net.blocks.17.attn.proj.weight", "net.blocks.17.attn.proj.bias", "net.blocks .17.mlp.fc1.weight", "net.blocks.17.mlp.fc1.bias", "net.blocks.17.mlp.fc2.weight", "net.blocks.17.mlp.fc2.bias", "net.blocks.18.attn.qkv.weight", "net.blocks.18. attn.qkv.bias", "net.blocks.18.attn.proj.weight", "net.blocks.18.attn.proj.bias", "net.blocks.18.mlp.fc1.weight", "net.blocks.18.mlp.fc1.bias", "net.blocks.18.ml p.fc2.weight", "net.blocks.18.mlp.fc2.bias", "net.blocks.19.attn.qkv.weight", "net.blocks.19.attn.qkv.bias", "net.blocks.19.attn.proj.weight", "net.blocks.19.att n.proj.bias", "net.blocks.19.mlp.fc1.weight", "net.blocks.19.mlp.fc1.bias", "net.blocks.19.mlp.fc2.weight", "net.blocks.19.mlp.fc2.bias", "net.blocks.20.attn.qkv .weight", "net.blocks.20.attn.qkv.bias", "net.blocks.20.attn.proj.weight", "net.blocks.20.attn.proj.bias", "net.blocks.20.mlp.fc1.weight", "net.blocks.20.mlp.fc1 .bias", "net.blocks.20.mlp.fc2.weight", "net.blocks.20.mlp.fc2.bias", "net.blocks.21.attn.qkv.weight", "net.blocks.21.attn.qkv.bias", "net.blocks.21.attn.proj.we ight", "net.blocks.21.attn.proj.bias", "net.blocks.21.mlp.fc1.weight", "net.blocks.21.mlp.fc1.bias", "net.blocks.21.mlp.fc2.weight", "net.blocks.21.mlp.fc2.bias" , "net.blocks.22.attn.qkv.weight", "net.blocks.22.attn.qkv.bias", "net.blocks.22.attn.proj.weight", "net.blocks.22.attn.proj.bias", "net.blocks.22.mlp.fc1.weight ", "net.blocks.22.mlp.fc1.bias", "net.blocks.22.mlp.fc2.weight", "net.blocks.22.mlp.fc2.bias", "net.blocks.23.attn.qkv.weight", "net.blocks.23.attn.qkv.bias", "n et.blocks.23.attn.proj.weight", "net.blocks.23.attn.proj.bias", "net.blocks.23.mlp.fc1.weight", "net.blocks.23.mlp.fc1.bias", "net.blocks.23.mlp.fc2.weight", "ne t.blocks.23.mlp.fc2.bias". size mismatch for net.positional_embedding: copying a param with shape torch.Size([17, 768]) from checkpoint, the shape in current model is torch.Size([1 7, 1024]). size mismatch for net.history_embedder.linear.weight: copying a param with shape torch.Size([768, 7]) from checkpoint, the shape in current model is torc h.Size([1024, 7]). size mismatch for net.history_embedder.linear.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Siz e([1024]). size mismatch for net.x_embedder.linear.weight: copying a param with shape torch.Size([768, 7]) from checkpoint, the shape in current model is torch.Size ([1024, 7]). size mismatch for net.x_embedder.linear.bias: copying a param with shape torch.Size([768]) from checkpoint, the shape in current model is torch.Size([102 4]). size mismatch for net.t_embedder.mlp.0.weight: copying a param with shape torch.Size([768, 256]) from checkpoint, the shape in current model is torch.Siz e([1024, 256]).怎么解决
最新发布
07-01
<think>我们正在解决加载模型时出现的两个常见问题:键缺失(missingkeys)和参数尺寸不匹配(size mismatch)。根据引用[1][2][3][4],我们可以总结以下解决方案:1.**键缺失问题**:通常是因为模型保存时的结构和加载时的结构不一致。可能的原因包括:-模型在保存时使用了多GPU训练(如`DataParallel`或`DistributedDataParallel`),导致参数名前带有`module.`前缀,而加载时没有使用多GPU,因此参数名不匹配。-自定义模型时,保存和加载的模型结构有差异(如层名称改变)。 2. **参数尺寸不匹配**:通常是因为模型结构发生变化,比如某一层的输入输出维度改变,导致保存的权重与当前模型对应层的尺寸不一致。 ###解决方案####1.处理键缺失问题- **方法一:去除参数名前缀**(适用于多GPU训练保存的模型在单卡加载)如果保存的模型是在多GPU环境下训练的,那么参数名会带有`module.`前缀。在单卡加载时,我们需要去掉这个前缀。可以通过以下代码实现:```python#加载保存的state_dictstate_dict= torch.load('model.pth')#去除`module.`前缀from collectionsimport OrderedDictnew_state_dict= OrderedDict()fork,v instate_dict.items():name= k[7:]if k.startswith('module.')else k#去掉`module.`new_state_dict[name] =v#然后加载model.load_state_dict(new_state_dict)```- **方法二:使用`strict=False`参数**(允许部分加载,忽略缺失的键)如果缺失的键不是关键层(例如,缺失的是分类层的权重,而你打算重新训练分类层),那么可以设置`strict=False`来忽略缺失的键:```pythonmodel.load_state_dict(state_dict, strict=False)```但注意,这样会忽略所有缺失的键,同时也会忽略尺寸不匹配的键(尺寸不匹配会报错,但`strict=False`会忽略这些错误,只加载匹配的键)。因此,使用这个方法需要确保缺失的键不会影响模型的主要功能。- **方法三:调整模型为多GPU模式**(如果加载环境也是多GPU)如果加载环境也是多GPU,那么可以先将模型用`DataParallel`包装,然后再加载:```pythonmodel= ActionModel(...)#你的模型iftorch.cuda.device_count() >1:model =torch.nn.DataParallel(model,device_ids=[0,1,2,...]).cuda()model.load_state_dict(torch.load('model.pth'))``` ####2.处理参数尺寸不匹配问题-**检查模型结构**:确认保存模型时的结构和当前加载模型的结构是否一致。特别是检查报错信息中提到的具体层,比较其输入输出维度。- **部分加载**:如果只是部分层的尺寸不匹配(比如分类层的输出类别数变化),那么可以使用`strict=False`来加载除了这些层以外的权重。然后重新初始化不匹配的层。- **自定义加载**:如果只有少数层不匹配,可以手动选择加载匹配的层:```pythonmodel_dict= model.state_dict()#过滤掉不匹配的键pretrained_dict ={k: vfor k, vin pretrained_dict.items()if kin model_dict andv.size() ==model_dict[k].size()}model_dict.update(pretrained_dict)model.load_state_dict(model_dict)```###综合建议1.首先,根据错误信息确定是键缺失还是尺寸不匹配,或者两者都有。2.如果是键缺失,判断是否是多GPU训练导致的前缀问题,如果是则使用方法一或三。3.如果缺失的键不是关键层,或者你允许部分加载,则使用`strict=False`(方法二)。4.对于尺寸不匹配,需要根据具体情况调整模型结构或进行部分加载。 ###示例代码假设你遇到以下错误: ```RuntimeError: Error(s)in loadingstate_dictfor ActionModel:Missing key(s)in state_dict:"layer_norm.weight","layer_norm.bias" Unexpected key(s)in state_dict:"module.layer_norm.weight", "module.layer_norm.bias" Sizemismatch forfc.weight:copyinga paramwith shapetorch.Size([100,256])from checkpoint, theshape incurrent modelis torch.Size([10,256]). ```我们可以这样处理:```python#加载原始state_dictcheckpoint =torch.load('model.pth', map_location='cpu')#去除多GPU前缀new_checkpoint ={} fork,v incheckpoint.items():if k.startswith('module.'):new_checkpoint[k[7:]]= velse:new_checkpoint[k]= v#加载模型model= ActionModel(...)#注意:这里构建的模型结构要和原始训练时一致(除了分类层可能改变)#获取当前模型的状态字典model_dict= model.state_dict() #处理尺寸不匹配:假设分类层(fc)的尺寸改变了,我们只加载匹配的键pretrained_dict= {k:v fork,v innew_checkpoint.items() ifkin model_dict andv.size() ==model_dict[k].size()} #更新当前模型的状态字典model_dict.update(pretrained_dict)model.load_state_dict(model_dict) #或者,如果允许部分加载,可以直接用strict=False#model.load_state_dict(new_checkpoint,strict=False) ```###注意事项-在加载模型之前,最好先打印出保存的state_dict的键和当前模型state_dict的键,以便对比差异。 -使用`torch.load`时,注意设置`map_location`参数,确保模型加载到正确的设备上(如CPU或GPU)。###相关问题1.如何检查PyTorch模型的state_dict?2.使用DataParallel训练模型时有哪些注意事项?3.如何自定义PyTorch模型的部分权重加载?</think>### 解决加载模型时的 `state_dict` 键缺失和尺寸不匹配问题 #### 问题分析 当出现 `RuntimeError: Error(s) in loading state_dict for ActionModel` 错误时,通常由以下原因导致: 1. **键缺失(Missing keys)**:模型权重文件中的键名与当前模型结构不一致 2. **尺寸不匹配(Size mismatch)**:权重参数的维度与模型层要求的维度不一致 3. **多GPU训练遗留问题**:使用 `DataParallel` 训练后保存的模型包含 `module.` 前缀 --- ### 解决方案 #### 方法1:处理多GPU训练前缀(最常见) 当模型使用 `DataParallel` 训练时,保存的权重键名会带有 `module.` 前缀。加载时需去除前缀: ```python import torch from collections import OrderedDict # 加载原始权重 checkpoint = torch.load('model.pth', map_location='cpu') # 去除module前缀 new_state_dict = OrderedDict() for k, v in checkpoint.items(): name = k.replace('module.', '') # 关键操作 new_state_dict[name] = v # 加载修正后的权重 model = ActionModel() # 初始化模型 model.load_state_dict(new_state_dict, strict=True) # strict=True 严格匹配 ``` #### 方法2:允许部分加载(忽略缺失键) 若某些层无需加载预训练权重(如新增层),可设置 `strict=False`: ```python model.load_state_dict(torch.load('model.pth'), strict=False) ``` 此时会: - 加载匹配的权重 - 忽略缺失的键 - **但不会解决尺寸不匹配问题** #### 方法3:手动对齐尺寸不匹配层 对尺寸不匹配的层进行特殊处理: ```python pretrained_dict = torch.load('model.pth') model_dict = model.state_dict() # 筛选可加载参数 pretrained_dict = { k: v for k, v in pretrained_dict.items() if k in model_dict and v.shape == model_dict[k].shape } # 更新模型参数 model_dict.update(pretrained_dict) model.load_state_dict(model_dict) # 打印缺失/不匹配信息 missing = [k for k in model_dict if k not in pretrained_dict] print(f"未加载层: {missing}") ``` #### 方法4:模型结构一致性检查 确保训练和推理时的模型结构完全一致: 1. 检查模型版本是否匹配 2. 确认自定义层命名一致性 3. 验证输入/输出维度是否变化 --- ### 关键注意事项 1. **设备一致性**:加载前用 `map_location` 指定设备 ```python torch.load('model.pth', map_location=torch.device('cuda:0')) ``` 2. **层归一化处理**:若报错缺失 `layer_norm.weight/bias`,检查模型是否包含归一化层 3. **版本控制**:使用相同框架版本保存和加载模型 4. **调试建议**:打印权重键名对比差异 ```python print("Loaded keys:", list(checkpoint.keys())) print("Model keys:", list(model.state_dict().keys())) ``` ### 典型错误场景解决方案 | 错误类型 | 现象 | 解决方案 | |---------|------|---------| | **多GPU前缀** | 键名含 `module.conv1.weight` | 使用 `k.replace('module.', '')` | | **新增层** | 缺失新增层权重 | 设置 `strict=False` | | **分类层维度变化** | `fc.weight` 尺寸不匹配 | 手动筛选匹配参数 | | **结构不一致** | 关键层缺失(如 `layer_norm`) | 检查模型定义代码 | > 通过上述方法可解决95%以上的 `state_dict` 加载错误。若问题仍存在,需检查模型构建代码与权重保存时的结构一致性[^1][^3][^4]。 --- ### 相关问题 1. 如何正确保存PyTorch模型以避免加载时的兼容性问题? 2. 使用 `DataParallel` 和 `DistributedDataParallel` 训练时保存模型有何区别? 3. 模型微调时如何处理预训练权重与新结构不匹配的问题? 4. PyTorch中 `strict=False` 参数会带来哪些潜在风险?
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值