[Mask-RCNN Source Code Walkthrough] The Role of RPN_BBOX_STD_DEV and BBOX_STD_DEV

NaN轲

Published 2019-10-28 11:17:44

License: CC 4.0 BY-SA

Category: Source Code Walkthrough. Tags: MaskRCNN, source code

Article link: https://blog.csdn.net/hankexin1314/article/details/102777285

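This article examines the role of the BBOX_STD_DEV and RPN_BBOX_STD_DEV parameters when bbox deltas are computed in the Mask-RCNN source code. Both parameters relate to normalizing the regression targets so that the training targets have zero mean and unit standard deviation. After training, the predicted normalized values are converted back into actual offsets by multiplying by the standard deviation and adding the mean, and those offsets are then used to shift and scale the anchor boxes. The values originate from Faster RCNN, and they are used at different stages: to convert the RPN output, and for the bbox deltas in the loss computation.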

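I have recently been reading the source code of a Mask RCNN variant and ran into a question: whenever the deltas (the bbox offsets) are computed, the code multiplies or divides by the parameter RPN_BBOX_STD_DEV or BBOX_STD_DEV, and I could not work out what this was for. A search did not turn up a good explanation in Chinese, but I found a very good one in a GitHub issue, so I am reproducing it here.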

The GitHub issue: https://github.com/matterport/Mask_RCNN/issues/270
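
As a rough illustration of the pattern in question (divide when building targets, multiply when applying predictions), here is a minimal NumPy sketch. It is not the repository's code: the helper names `encode_deltas` and `decode_deltas` are hypothetical, boxes are assumed to be `[y1, x1, y2, x2]`, the mean is assumed to be zero, and `[0.1, 0.1, 0.2, 0.2]` is the default that the matterport config assigns to both parameters.

```python
import numpy as np

# Default value of both RPN_BBOX_STD_DEV and BBOX_STD_DEV in the matterport config.
BBOX_STD_DEV = np.array([0.1, 0.1, 0.2, 0.2])


def _center_size(box):
    """Convert [y1, x1, y2, x2] to (center_y, center_x, height, width)."""
    box = np.asarray(box, dtype=np.float64)
    h = box[2] - box[0]
    w = box[3] - box[1]
    return box[0] + 0.5 * h, box[1] + 0.5 * w, h, w


def encode_deltas(anchor, gt_box, std_dev=BBOX_STD_DEV):
    """Hypothetical helper: raw (dy, dx, log dh, log dw), then DIVIDE by std_dev.

    Assumes a zero mean, so normalization is a plain division.
    """
    a_cy, a_cx, a_h, a_w = _center_size(anchor)
    g_cy, g_cx, g_h, g_w = _center_size(gt_box)
    raw = np.array([
        (g_cy - a_cy) / a_h,
        (g_cx - a_cx) / a_w,
        np.log(g_h / a_h),
        np.log(g_w / a_w),
    ])
    return raw / std_dev


def decode_deltas(anchor, deltas, std_dev=BBOX_STD_DEV):
    """Hypothetical helper: MULTIPLY by std_dev, then shift and scale the anchor."""
    a_cy, a_cx, a_h, a_w = _center_size(anchor)
    dy, dx, dh, dw = np.asarray(deltas, dtype=np.float64) * std_dev
    cy = a_cy + dy * a_h
    cx = a_cx + dx * a_w
    h = a_h * np.exp(dh)
    w = a_w * np.exp(dw)
    return np.array([cy - 0.5 * h, cx - 0.5 * w, cy + 0.5 * h, cx + 0.5 * w])


anchor = [10.0, 10.0, 50.0, 90.0]
gt_box = [20.0, 5.0, 70.0, 100.0]
targets = encode_deltas(anchor, gt_box)      # normalized training targets
restored = decode_deltas(anchor, targets)    # recovers gt_box up to rounding
```

Under these assumptions the two steps are inverses of each other, so the division only rescales what the network has to regress; it does not change which box a given prediction describes.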

Problem description

This is really great project - love the level of comments, the working notebooks, and the simple shapes dataset…all makes it exceptionally easy to read.
One thing that is not clear - In the RPN proposal layer the deltas are multiplied by BBOX_STD_DEV. There was a previous issue asking about this #85. The answer given refers to fastrcnn paper section on normalising the regression targets for the loss function.
However this is applied after the RPN loss function; and before the bbox spec is normalised to 0-1 range. Also two of the delta numbers are log(delta). I can possibly envisage that