Paper link: [2405.11276] Visible and Clear: Finding Tiny Objects in Difference Map
The English is typed entirely by hand! It summarizes and paraphrases the original paper, so spelling and grammar mistakes are hard to avoid; if you spot any, corrections in the comments are welcome. This article leans toward personal notes, so read with discretion
1. Takeaways
(1)Couldn't sleep late at night, so a short paper makes for pleasant reading
(2)Now I remember: the idea is actually quite novel to me, though this is no longer my first time seeing it
(3)Which advisors exactly want the grandson to exist before the grandfather? A related-work section should really trace the field from the very beginning up to the present; advisors who dismiss classic papers wholesale as simply too old are hard to take seriously. They should read more papers and fewer WeChat public accounts
(4)The ECCV style is unmistakable: simple and easy to follow, yet genuinely novel! Great as bedtime reading
2. Close Reading of the Paper, Paragraph by Paragraph
2.1. Abstract
①Existing works tackle tiny object detection mainly through feature enhancement, whose limitations are spurious textures and artifacts
2.2. Introduction
①Definition of object:
| Dataset | very tiny | tiny | small |
| MS COCO | - | - | ≤ 32 × 32 |
| AI-TOD | 2~8 | 8~16 | 16~32 |
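The AI-TOD size bands above can be written as a tiny helper (a sketch for illustration; the function name and the use of √(w·h) as the pixel-size measure are my assumptions):

```python
import math

def aitod_size_category(width: float, height: float) -> str:
    """Classify an object into AI-TOD-style absolute-size bands (in pixels).
    Helper name and the sqrt(w*h) size measure are assumptions, not from the paper."""
    size = math.sqrt(width * height)
    if size <= 8:
        return "very tiny"
    if size <= 16:
        return "tiny"
    if size <= 32:
        return "small"
    return "normal"

print(aitod_size_category(7.9, 7.9))   # the average DroneSwarms drone size
print(aitod_size_category(30, 30))
```

The 7.9-pixel average drone in DroneSwarms (Section 2.4.4) lands squarely in the "very tiny" band.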
②Problem: downsampling swallows tiny objects
③Feature map example of "disappeared" tiny drone:
④The novelty is exactly the figure above: subtract the map where tiny objects have been swallowed by the background (they easily vanish under downsampling) from the map where they still exist, and what remains is the tiny objects themselves. They also released a new dataset, so the contribution is fairly substantial
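The "swallowed by downsampling" problem in ② is easy to reproduce numerically. A toy numpy sketch (not the paper's backbone): a 2×2 "drone" in a 16×16 image all but disappears after one aggressive average-pooling step.

```python
import numpy as np

# A 16x16 single-channel "image": dark background with a 2x2 bright drone.
img = np.zeros((16, 16))
img[6:8, 6:8] = 1.0

# Stride-8 average pooling, standing in for a backbone's aggressive downsampling:
# reshape into 8x8 blocks and take each block's mean.
pooled = img.reshape(2, 8, 2, 8).mean(axis=(1, 3))

print(img.max())     # the object is clearly visible at full resolution
print(pooled.max())  # 4 bright pixels averaged over 64: nearly gone
```

The peak response drops from 1.0 to 0.0625, which is why a difference against a resolution where the object still "exists" carries so much signal.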
2.3. Related Work
2.3.1. Object Detection
①Lists traditional two-stage and one-stage object detectors
②After introducing them there is no criticism; the authors simply say these detectors are mature, which makes them easy to integrate. Criticism isn't always necessary. Which advisors exactly insist on "shortcoming, shortcoming, shortcoming" in prior work so that the student can single-handedly break new ground? (The authors write it this way because they really do integrate these detectors; others shouldn't copy the phrasing blindly)
2.3.2. Tiny Object Detection
①Existing works: focus on data augmentation, scale awareness, context modeling, feature imitation, label assignment
2.3.3. Anti-UAV Dataset
①Current UAV datasets: MAV-VID, Drone-vs-Bird, and DUT Anti-UAV
2.4. Method
2.4.1. Overall Architecture
①Framework of their work:
2.4.2. Difference Map
①The up block consists of a ReLU activation, a convolution (kernel size given in the figure), and a Transpose Convolution(again, transcribing vision formulas is a bit dull; everything is in the figure anyway, so I just carry the names over. I won't write the formulas out later, only the shapes)
②The kernel size of the Conv inside RH is given in the figure
③The difference map is obtained by taking the mean of each feature map along the channel dimension and then the element-wise absolute value of their difference
④Reconstruction loss: MSE
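Putting ③ and ④ together, a minimal numpy sketch of the difference map and the MSE reconstruction loss (the array names are my shorthand, not the paper's notation):

```python
import numpy as np

rng = np.random.default_rng(0)

# Two (C, H, W) feature maps: one where the tiny object survives, and a
# reconstructed one where it was swallowed by the background.
feat_with_obj = rng.normal(size=(8, 4, 4))
feat_no_obj = feat_with_obj.copy()
feat_with_obj[:, 1, 2] += 3.0   # the tiny object's activation at position (1, 2)

# Difference map: channel-wise mean, then element-wise absolute difference.
diff_map = np.abs(feat_with_obj.mean(axis=0) - feat_no_obj.mean(axis=0))
print(np.unravel_index(diff_map.argmax(), diff_map.shape))  # peaks at the object

# Reconstruction loss: plain MSE between the two feature maps.
mse = np.mean((feat_with_obj - feat_no_obj) ** 2)
```

Everything except the object's location cancels in the subtraction, so the difference map is near-zero background with a spike exactly where the tiny object sits.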
2.4.3. Difference Map Guided Feature Enhancement
①The element-wise attention matrix in Difference Map Guided Feature Enhancement (DGFE) is computed from the difference map (formula shown in the figure)
②Filtration block: attention entries below a learnable threshold are filtered out using the Sign function (just sgn(x) ∈ {−1, 0, +1} by the sign of x, not some specific named signal-transform equation)
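If the filtration is a hard sign-based gate, it might look like the following numpy sketch. The exact formula is only in the paper's figure; the gating form `(sgn(x − θ) + 1)/2` and the function name are my guesses:

```python
import numpy as np

def filtration(attn: np.ndarray, theta: float) -> np.ndarray:
    """Suppress attention entries below a (learnable) threshold theta.
    Assumed form: gate = (sign(attn - theta) + 1) / 2, which is 1 where
    attn > theta, 0 where attn < theta (and 0.5 exactly at theta)."""
    gate = (np.sign(attn - theta) + 1.0) / 2.0
    return attn * gate

attn = np.array([0.1, 0.4, 0.8])
print(filtration(attn, theta=0.5))   # only the 0.8 entry survives
```

In training, `theta` would be a learnable scalar; the sign gate zeroes out weak (likely background) attention responses so enhancement concentrates on the difference-map peaks.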
2.4.4. DroneSwarms Dataset
①It includes 9,109 images and 242,218 annotated UAV instances, with 6,577 images used for training and 2,532 for testing. On average, each image contains 26.59 drone instances. All images are 1920 × 1080 and manually labeled with high precision.
②Environments: urban areas, mountainous terrain, and skies, among others
③The dataset contains 241,249 tiny objects with sizes of 32 pixels or below, accounting for approximately 99.60% of all instances; the average size is only about 7.9 pixels, and the drones are dispersed across the entire image.
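The quoted statistics are self-consistent, which a quick arithmetic check confirms:

```python
# Sanity check of the DroneSwarms statistics quoted above.
total_images = 6577 + 2532        # train + test splits
total_instances = 242218
tiny_instances = 241249

print(total_images)                                       # 9109
print(round(total_instances / total_images, 2))           # 26.59 drones per image
print(round(tiny_instances / total_instances * 100, 2))   # 99.6 percent tiny
```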
2.5. Experiment
2.5.1. Experimental Setting
①Datasets: DroneSwarms, VisDrone2019, and AI-TOD
(1)DroneSwarms
①Initial learning rate: 0.0025
②Optimizer: Stochastic Gradient Descent (SGD) with momentum 0.9 and weight decay 0.0001
③Epochs: 20(isn't the authors' phrasing here rather Chinese-style English... I've rarely seen papers written this way: it reads like the Chinese "两个batch size", but it should actually be "with a batch size of 2", since 2 isn't an adjective)
④Batch size: 2
⑤Anchor scale
(2)VisDrone2019 and AI-TOD
①Initial learning rate: 0.005
②Optimizer: Stochastic Gradient Descent (SGD) with momentum 0.9; the learning rate decays at the 8th and 11th epochs
2.5.2. Results on DroneSwarms
①Performance table:
2.5.3. Results on VisDrone2019 and AI-TOD
①Performance table on VisDrone2019:
②Performance on AI-TOD:
2.5.4. Ablation Study and Discussion
①Module ablation:
②Ablation study on threshold:
③Ablation of feature enhancement methods:
④Performance of different designs of difference map:
⑤Different types of difference map:
2.5.5. Visualization Analysis
①Visualization on DroneSwarms:
2.6. Conclusion
~