csapp perflab

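This post covers optimizing the rotate and smooth image-processing routines from the CSAPP performance lab, with speedups of roughly 6.5x and 12x respectively. It walks through the changes made: making writes sequential instead of reads, blocking the loops to make good use of the cache, and storing reused partial results in a lookup table. It also describes two problems hit along the way, a misplaced type cast and a rounding error caused by splitting a division, and gives the improved code.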


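This is the fourth lab. I struggled with it for a long time and still did not improve it much, so criticism is welcome.

Approach:

rotate:

  1. My feeling is that a write miss is penalized more heavily than a read miss, so I changed the loop from reading sequentially to writing sequentially.

  2. To make full use of the L1 cache (32 KB), I process the image in blocks, each with a block size of 32.

  Speedup: about 6.5x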

/* 
 * rotate - Your current working version of rotate
 * IMPORTANT: This is the version you will be graded on
 * size of cache 1: 32K
 * size of pixel :  6B
 */
char rotate_descr[] = "rotate: Current working version";
void rotate(int dim, pixel *src, pixel *dst) 
{
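    int i, j, i1, j1, im, jm;
    int block = 32; /* tile edge; assumes dim is a multiple of 32 */
    for (i = 0; i < dim; i += block)
        for (j = 0; j < dim; j += block)
        {
            /* block x block tile: compute the loop bounds once per tile */
            im = i + block;
            jm = j + block;
            for (i1 = i; i1 < im; i1++)
                for (j1 = j; j1 < jm; j1++)
                    /* dst is written in row-major order; src is read with stride dim */
                    dst[RIDX(i1, j1, dim)] = src[RIDX(j1, dim-i1-1, dim)];
        }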
}
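
One way to gain confidence in a reordered loop like this is to compare it against the straightforward version on a test image. The sketch below is self-contained, so `pixel` and `RIDX` are redefined here as stand-ins for the lab's `defs.h`, and `naive_rotate`, `blocked_rotate`, and `rotate_versions_agree` are hypothetical names that are not part of the lab. It assumes `dim` is a multiple of 32.

```c
#include <stdlib.h>

/* Stand-ins for the lab's defs.h (assumed here so the sketch compiles alone). */
typedef struct { unsigned short red, green, blue; } pixel;
#define RIDX(i, j, n) ((i) * (n) + (j))

/* Reference version: reads src row by row, scatters its writes across dst. */
static void naive_rotate(int dim, const pixel *src, pixel *dst)
{
    for (int i = 0; i < dim; i++)
        for (int j = 0; j < dim; j++)
            dst[RIDX(dim - 1 - j, i, dim)] = src[RIDX(i, j, dim)];
}

/* Blocked version: writes dst row by row inside each 32x32 tile. */
static void blocked_rotate(int dim, const pixel *src, pixel *dst)
{
    const int block = 32; /* assumes dim is a multiple of 32 */
    for (int i = 0; i < dim; i += block)
        for (int j = 0; j < dim; j += block)
            for (int i1 = i; i1 < i + block; i1++)
                for (int j1 = j; j1 < j + block; j1++)
                    dst[RIDX(i1, j1, dim)] = src[RIDX(j1, dim - i1 - 1, dim)];
}

/* Returns 1 if both versions agree, 0 if they differ, -1 if allocation fails. */
int rotate_versions_agree(int dim)
{
    size_t n = (size_t)dim * (size_t)dim;
    pixel *src = malloc(n * sizeof *src);
    pixel *a = malloc(n * sizeof *a);
    pixel *b = malloc(n * sizeof *b);
    int ok = -1;
    if (src && a && b) {
        /* Fill the image with a deterministic pattern. */
        for (size_t k = 0; k < n; k++) {
            src[k].red = (unsigned short)k;
            src[k].green = (unsigned short)(k * 7);
            src[k].blue = (unsigned short)(k * 13);
        }
        naive_rotate(dim, src, a);
        blocked_rotate(dim, src, b);
        ok = 1;
        for (size_t k = 0; k < n; k++)
            if (a[k].red != b[k].red || a[k].green != b[k].green ||
                a[k].blue != b[k].blue)
                ok = 0;
    }
    free(src);
    free(a);
    free(b);
    return ok;
}
```

Calling `rotate_versions_agree(64)` from a small `main` should return 1. With 6-byte pixels a 32x32 tile is 6 KB, so a source tile and a destination tile together stay well under the 32 KB L1 size noted in the comment above.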

smooth:

  1. Store the intermediate results that get reused (a lookup table of per-row sums).

  Speedup: about 12x

char smooth_descr1[] = "smooth: Storing reused results.";
void smooth1(int dim, pixel *src, pixel *dst) 
{
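    /* About 4.5 MB, assuming the lab's four-int pixel_sum; static keeps it
     * off the stack. The table only covers dim <= 530. */
    static pixel_sum rowsum[530][530];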
    int i, j, snum;
    for(i=0; i<dim; i++)
    {
        rowsum[i][0].red = (src[RIDX(i, 0, dim)].red+src[RIDX(i, 1, dim)].red);
        rowsum[i][0].blue = (src[RIDX(i, 0, dim)].blue+src[RIDX(i, 1, dim)].blue);
        rowsum[i][0].green = (src[RIDX(i, 0, dim)].green+src[RIDX(i, 1, dim)].green);
        rowsum[i][0].num = 2;
        for(j=1; j<dim-1; j++)
        {
            rowsum[i][j].red = (src[RIDX(i, j-1, dim)].red+src[RIDX(i, j, dim)].red+src[RIDX(i, j+1, dim)].red);
            rowsum[i][j].blue = (src[RIDX(i, j-1, dim)].blue+src[RIDX(i, j, dim)].blue+src[RIDX(i, j+1, dim)].blue);
            rowsum[i][j].green = (src[RIDX(i, j-1, dim)].green+src[RIDX(i, j, dim)].green+src[RIDX(i, j+1, dim)].green);
            rowsum[i][j].num = 3;
        }
        rowsum[i][dim-1].red = (src[RIDX(i, dim-2, dim)].red+src[RIDX(i, dim-1, dim)].red);
        rowsum[i][dim-1].blue = (src[RIDX(i, dim-2, dim)].blue+src[RIDX(i, dim-1, dim)].blue);
        rowsum[i][dim-1].green = (src[RIDX(i, dim-2, dim)].green+src[RIDX(i, dim-1, dim)].green);
        rowsum[i][dim-1].num = 2;
    }
    for(j=0; j<dim; j++)
    {
        snum = rowsum[0][j].num+rowsum[1][j].num;
        dst[RIDX(0, j, dim)].red = (unsigned short)((rowsum[0][j].red+rowsum[1][j].red)/snum);
        dst[RIDX(0, j, dim)].blue = (unsigned short)((rowsum[0][j].blue+rowsum[1][j].blue)/snum);
        dst[RIDX(0, j, dim)].green = (unsigned short)((rowsum[0][j].green+rowsum[1][j].green)/snum);
        for(i=1; i<dim-1; i++)
        {
            snum = rowsum[i-1][j].num+rowsum[i][j].num+rowsum[i+1][j].num;
            dst[RIDX(i, j, dim)].red = (unsigned short)((rowsum[i-1][j].red+rowsum[i][j].red+rowsum[i+1][j].red)/snum);
            dst[RIDX(i, j, dim)].blue = (unsigned short)((rowsum[i-1][j].blue+rowsum[i][j].blue+rowsum[i+1][j].blue)/snum);
            dst[RIDX(i, j, dim)].green = (unsigned short)((rowsum[i-1][j].green+rowsum[i][j].green+rowsum[i+1][j].green)/snum);
        }
        snum = rowsum[dim-1][j].num+rowsum[dim-2][j].num;
        dst[RIDX(dim-1, j, dim)].red = (unsigned short)((rowsum[dim-2][j].red+rowsum[dim-1][j].red)/snum);
        dst[RIDX(dim-1, j, dim)].blue = (unsigned short)((rowsum[dim-2][j].blue+rowsum[dim-1][j].blue)/snum);
        dst[RIDX(dim-1, j, dim)].green = (unsigned short)((rowsum[dim-2][j].green+rowsum[dim-1][j].green)/snum);
    }
}
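
The idea behind the table can be sketched on a single colour channel: sum each pixel with its horizontal neighbours once, then build every output value from at most three of those row sums instead of up to nine source pixels. This is a simplified illustration and not the graded code; `naive_avg` and `two_pass_matches_naive` are hypothetical names, and the image is a flat `unsigned short` array rather than the lab's `pixel` struct.

```c
#include <stdlib.h>

/* Naive average: visits up to 9 neighbours for a single output cell. */
static int naive_avg(int dim, const unsigned short *src, int i, int j)
{
    int sum = 0, num = 0;
    int ilo = i > 0 ? i - 1 : 0, ihi = i < dim - 1 ? i + 1 : dim - 1;
    int jlo = j > 0 ? j - 1 : 0, jhi = j < dim - 1 ? j + 1 : dim - 1;
    for (int ii = ilo; ii <= ihi; ii++)
        for (int jj = jlo; jj <= jhi; jj++) {
            sum += src[ii * dim + jj];
            num++;
        }
    return sum / num;
}

/* Returns 1 if the two-pass result equals the naive one everywhere,
 * 0 if any cell differs, -1 if allocation fails. Requires dim >= 1. */
int two_pass_matches_naive(int dim)
{
    size_t n = (size_t)dim * (size_t)dim;
    unsigned short *src = malloc(n * sizeof *src);
    int *rowsum = malloc(n * sizeof *rowsum);
    int *rownum = malloc(n * sizeof *rownum);
    int ok = -1;
    if (src && rowsum && rownum) {
        for (size_t k = 0; k < n; k++)
            src[k] = (unsigned short)((k * 31u + 7u) % 65536u);

        /* Pass 1: horizontal sums, plus how many pixels went into each. */
        for (int i = 0; i < dim; i++)
            for (int j = 0; j < dim; j++) {
                int lo = j > 0 ? j - 1 : 0;
                int hi = j < dim - 1 ? j + 1 : dim - 1;
                int s = 0;
                for (int jj = lo; jj <= hi; jj++)
                    s += src[i * dim + jj];
                rowsum[i * dim + j] = s;
                rownum[i * dim + j] = hi - lo + 1;
            }

        /* Pass 2: combine up to three row sums, then divide exactly once. */
        ok = 1;
        for (int i = 0; i < dim; i++)
            for (int j = 0; j < dim; j++) {
                int lo = i > 0 ? i - 1 : 0;
                int hi = i < dim - 1 ? i + 1 : dim - 1;
                int s = 0, num = 0;
                for (int ii = lo; ii <= hi; ii++) {
                    s += rowsum[ii * dim + j];
                    num += rownum[ii * dim + j];
                }
                if (s / num != naive_avg(dim, src, i, j))
                    ok = 0;
            }
    }
    free(src);
    free(rowsum);
    free(rownum);
    return ok;
}
```

The sketch uses generic boundary tests inside the loops to stay short; the version above peels the first and last row and column out of the loops instead, which avoids those branches in the hot path.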

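Problems that tripped me up:

1. For smooth, I thought right away yesterday of saving the parts that get computed repeatedly, but once the algorithm was implemented it kept failing with a segmentation fault. There was no way to debug on that system and I really did not know how to fix it, so after a long struggle I had to give up for the time being. Today I suddenly noticed that my type cast did not parenthesize the whole expression that followed it, which may have left the result of the arithmetic as an int and caused the error on assignment...

2. After the segmentation fault in smooth was fixed, I found another problem: many of the computed results were wrong, each differing from the correct value by only 1. After thinking about it for a long time I finally understood. I had split the division into two parts that were computed separately (for example, a single /4 became two /2 steps), so error crept in at each rounding.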
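
To make the precedence issue concrete: a cast binds more tightly than `/`, so without the outer parentheses only the first operand is converted and the arithmetic result is an `int` again. The two functions below are hypothetical and only illustrate how the two spellings differ in value, assuming a 16-bit `unsigned short` and a 32-bit `int`.

```c
/* Cast applies to `sum` alone: the sum is truncated to 16 bits first,
 * then divided, and the quotient is an int. */
unsigned short avg_cast_operand_only(int sum, int num)
{
    return (unsigned short)sum / num;
}

/* Cast applies to the whole quotient: divide first, then narrow. */
unsigned short avg_cast_whole_expr(int sum, int num)
{
    return (unsigned short)(sum / num);
}
```

For nine pixels that are all 65535 the sum is 589815. The first form truncates it to 65527 and returns 7280, while the second returns the expected 65535.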
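
A small sketch of that off-by-one, using a four-pixel corner as the example. The function names are hypothetical; the point is that integer division truncates, so dividing each partial sum before adding them can lose a unit that a single division of the full sum keeps.

```c
/* Divide once, after all four pixels have been summed. */
int avg_single_division(int a, int b, int c, int d)
{
    return (a + b + c + d) / 4;
}

/* Divide each row's pair by 2 first, then average the two halves.
 * Each intermediate division truncates on its own. */
int avg_split_division(int a, int b, int c, int d)
{
    return ((a + b) / 2 + (c + d) / 2) / 2;
}
```

With the values 0, 1, 1, 2 the single division gives 4 / 4 = 1, while the split version gives (0 + 1) / 2 = 0. Keeping the sums as integers in the table and dividing only at the end, as the smooth code above does, avoids the discrepancy.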

