[Few-shot] A 2020 Survey of Few-Shot Detection

This post surveys the challenges and progress in few-shot object detection, introduces the episode training strategy, and collects several recent high-quality papers covering novel approaches to object detection in both zero-shot and few-shot settings. The papers involve techniques such as feature reweighting, meta-learning, and comparison networks, all aimed at improving a model's ability to generalize from a small number of samples.


1. Background

The goal of few-shot detection is, starting from a pre-trained model, to learn to detect objects of novel classes from only a small number of new samples.

1.1 The Episode Training Strategy [from Karpathy]

As the authors amusingly point out in the conclusion (and this is the duh of course part), “one-shot learning is much easier if you train the network to do one-shot learning”. Therefore, we want the test-time protocol (given N novel classes with only k examples each (e.g. k = 1 or 5), classify new instances into one of the N classes) to exactly match the training-time protocol.

To create each “episode” of training from a dataset of examples then:

  1. Sample a task T from the training data, e.g. select 5 labels, and up to 5 examples per label (i.e. 5-25 examples).
  2. To form one episode, sample a label set L (e.g. {cats, dogs}) and then use L to sample the support set S and a batch B of examples to evaluate the loss on.

The high-level idea is clear, but the write-up is a bit vague on the details of exactly how the sampling is done.
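The sampling procedure can be sketched in a few lines; the `dataset` interface (a dict mapping each label to its list of examples) and the function name are my own assumptions for illustration:

```python
import random

def sample_episode(dataset, n_way=5, k_shot=5, q_queries=5, rng=None):
    """Sample one N-way K-shot episode from a {label: [examples]} dataset.

    Returns the label set L, a support set S (k_shot examples per label),
    and a query batch B (q_queries examples per label) to evaluate loss on.
    """
    rng = rng or random.Random()
    # Sample the label set L for this episode's task.
    labels = rng.sample(sorted(dataset), n_way)
    support, queries = [], []
    for label in labels:
        # Draw disjoint support and query examples for this label.
        examples = rng.sample(dataset[label], k_shot + q_queries)
        support += [(x, label) for x in examples[:k_shot]]
        queries += [(x, label) for x in examples[k_shot:]]
    return labels, support, queries
```

Training then iterates this function, computing the loss on B conditioned on S, so that the training-time protocol matches the N-way k-shot test-time protocol.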

2. Selected Recent High-Quality Papers in Related Areas

[ICCV 2019] (paper) Transductive Learning for Zero-Shot Object Detection

  • Proposed generalized ZSD, where both seen and unseen objects co-occur during inference.
  • Transductive ZSD setting: unlabeled test examples are available during model training.
  • A self-learning mechanism built on a novel hybrid pseudo-labeling technique that keeps source-domain knowledge from being forgotten.
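The self-learning mechanism rests on confidence-based pseudo-labeling of the unlabeled test examples; the following is a minimal, generic sketch of that step (the paper's hybrid scheme additionally anchors seen-class knowledge, which is not shown here):

```python
import numpy as np

def select_pseudo_labels(probs, threshold=0.9):
    """Confidence-based pseudo-labeling, the core of a self-learning loop.

    probs : (n, k) class probabilities predicted on unlabeled test data.
    Keeps only predictions whose max probability exceeds the threshold and
    returns (kept indices, pseudo-labels); these are mixed back into the
    training set on the next self-learning iteration.
    """
    conf = probs.max(axis=1)
    keep = np.nonzero(conf >= threshold)[0]
    return keep, probs[keep].argmax(axis=1)
```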

[ICCV 2019] (paper code) Few-shot Object Detection via Feature Reweighting

  • “Feature reweighting”: class-specific reweighting vectors modulate shared features, paired with a carefully designed loss function
  • Meta feature learner
  • Reweighting module
  • One-stage detection
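The core idea is that the reweighting module maps the few support examples of each class to a per-class vector of channel weights, which then modulates the shared meta features channel-wise. A minimal numpy sketch under assumed shapes (names and interfaces are illustrative, not the paper's code):

```python
import numpy as np

def reweight_features(meta_features, class_weights):
    """Channel-wise reweighting of shared meta features.

    meta_features : (C, H, W) feature map from the meta feature learner.
    class_weights : (N, C) one reweighting vector per support class,
                    produced by the reweighting module from support images.
    Returns (N, C, H, W): one class-specific feature map per class, obtained
    by scaling each channel of the meta features by that class's weight.
    """
    return class_weights[:, :, None, None] * meta_features[None, :, :, :]
```

The detection head then runs once per class-specific feature map, which is how a single shared backbone serves all novel classes.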


[2019 Arxiv] (paper) Comparison Network for One-Shot Conditional Object Detection


[NIPS 2019] (paper code) One-Shot Object Detection with Co-Attention and Co-Excitation

  • Non-local operation & squeeze-and-excitation
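Squeeze-and-excitation is the channel-gating block that co-excitation generalizes to a pair of images. A minimal numpy sketch of the single-input SE building block (weight shapes and names are assumptions for illustration):

```python
import numpy as np

def squeeze_excite(x, w1, w2):
    """Squeeze-and-Excitation on a (C, H, W) feature map.

    Squeeze: global average pooling yields one statistic per channel.
    Excite: a two-layer bottleneck (w1: C -> C/r, w2: C/r -> C) followed
    by a sigmoid produces per-channel gates that rescale x.
    In co-excitation the gates are computed jointly from the query and
    target feature maps; only the single-input block is shown here.
    """
    z = x.mean(axis=(1, 2))                                     # squeeze: (C,)
    s = 1.0 / (1.0 + np.exp(-(w2 @ np.maximum(w1 @ z, 0.0))))   # excite: (C,)
    return s[:, None, None] * x                                 # rescale channels
```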

3. Related Papers

1. Co-Attention and Co-Excitation related

One-stage detection
  • [2] Tao Kong, Fuchun Sun, Huaping Liu, Yuning Jiang, and Jianbo Shi. Foveabox: Beyond
    anchor-based object detector. CoRR, abs/1904.03797, 2019.
  • [3] Hei Law and Jia Deng. Cornernet: Detecting objects as paired keypoints. In Computer
    Vision - ECCV 2018 - 15th European Conference, Munich, Germany, September 8-14, 2018,
    Proceedings, Part XIV, pages 765–781, 2018.
Two-stage detection
  • [10] Kaiming He, Georgia Gkioxari, Piotr Dollár, and Ross B. Girshick. Mask R-CNN. In IEEE
    International Conference on Computer Vision, ICCV 2017, Venice, Italy, October 22-29, 2017,
    pages 2980–2988, 2017.
Few-shot Classification
  • [16] Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. Neural machine translation by jointly
    learning to align and translate. In 3rd International Conference on Learning Representations,
    ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings, 2015.

Four popular networks for few-shot classification:

  • Siamese Network
    [14] Gregory R. Koch. Siamese neural networks for one-shot image recognition. 2015.
  • Matching Network
    [15] Oriol Vinyals, Charles Blundell, Tim Lillicrap, Koray Kavukcuoglu, and Daan Wierstra. Matching networks for one shot learning. In Advances in Neural Information Processing Systems 29:
    Annual Conference on Neural Information Processing Systems 2016, December 5-10, 2016,
    Barcelona, Spain, pages 3630–3638, 2016.
  • Prototype Network
    [18] Jake Snell, Kevin Swersky, and Richard S. Zemel. Prototypical networks for few-shot learning.
    In Advances in Neural Information Processing Systems 30: Annual Conference on Neural
    Information Processing Systems 2017, 4-9 December 2017, Long Beach, CA, USA, pages
    4080–4090, 2017.
  • Relation Network
    [19] Flood Sung, Yongxin Yang, Li Zhang, Tao Xiang, Philip H. S. Torr, and Timothy M. Hospedales.
    Learning to compare: Relation network for few-shot learning. In 2018 IEEE Conference on
    Computer Vision and Pattern Recognition, CVPR 2018, Salt Lake City, UT, USA, June 18-22,
    2018, pages 1199–1208, 2018.
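Of the four, the prototypical network is the easiest to sketch: each class prototype is the mean of its embedded support examples, and a query is assigned to the nearest prototype. A minimal numpy sketch that assumes the embeddings have already been computed:

```python
import numpy as np

def prototype_classify(support, labels, query):
    """Prototypical-network classification in embedding space.

    support : (n, d) embedded support examples
    labels  : (n,) integer class label of each support example
    query   : (d,) embedded query example
    Each class prototype is the mean of that class's support embeddings;
    the query gets the class of the nearest (Euclidean) prototype.
    """
    classes = np.unique(labels)
    protos = np.stack([support[labels == c].mean(axis=0) for c in classes])
    dists = np.linalg.norm(protos - query, axis=1)
    return classes[int(np.argmin(dists))]
```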
Few-shot Detection

[22] Claudio Michaelis, Ivan Ustyuzhaninov, Matthias Bethge, and Alexander S. Ecker. One-shot
instance segmentation. CoRR, abs/1811.11507, 2018.

  • Siamese Mask R-CNN
  • Provides a first strong baseline for one-shot instance segmentation.
  • For simplification, the authors preprocess the data so that every test image is guaranteed to contain at least one instance of the target object.

[23] Eli Schwartz, Leonid Karlinsky, Joseph Shtok, Sivan Harary, Mattias Marder, Sharathchandra
Pankanti, Rogério Schmidt Feris, Abhishek Kumar, Raja Giryes, and Alexander M. Bronstein.
Repmet: Representative-based metric learning for classification and one-shot object detection.
CoRR, abs/1806.04728, 2018.

Others

[25] Xiaolong Wang, Ross B. Girshick, Abhinav Gupta, and Kaiming He. Non-local neural networks.
In 2018 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2018, Salt Lake
City, UT, USA, June 18-22, 2018, pages 7794–7803, 2018.

[26] Jie Hu, Li Shen, and Gang Sun. Squeeze-and-excitation networks. In CVPR, 2018.

2. Feature Reweighting

Meta-Learning
  • [12] Chelsea Finn, Pieter Abbeel, and Sergey Levine. Model-agnostic meta-learning for fast adaptation of deep networks. ICML, 2017.
Few-shot Classification
  • [9] Matthijs Douze, Arthur Szlam, Bharath Hariharan, and Hervé Jégou. Low-shot learning with large-scale diffusion. In Computer Vision and Pattern Recognition (CVPR), 2018.
Few-shot in Meta-learning

Optimization for fast adaptation

  • [28] Sachin Ravi and Hugo Larochelle. Optimization as a model
    for few-shot learning. In ICLR, 2017.

  • [12] Chelsea Finn, Pieter Abbeel, and Sergey Levine. Model-agnostic meta-learning for fast adaptation of deep networks. ICML, 2017.

Parameter prediction

  • [2] Luca Bertinetto, João F. Henriques, Jack Valmadre, Philip Torr, and Andrea Vedaldi. Learning feed-forward one-shot learners. In Advances in Neural Information Processing Systems, pages 523–531, 2016.

Others
[16] Bharath Hariharan and Ross Girshick. Low-shot visual
recognition by shrinking and hallucinating features. In 2017 IEEE International Conference on Computer Vision (ICCV),
pages 3037–3046. IEEE, 2017.

[40] Yu-Xiong Wang, Ross Girshick, Martial Hebert, and Bharath
Hariharan. Low-shot learning from imaginary data. In
CVPR, 2018.

[26] Hang Qi, Matthew Brown, and David G. Lowe. Low-shot learning with imprinted weights. arXiv preprint arXiv:1712.07136, 2017.

[13] Spyros Gidaris and Nikos Komodakis. Dynamic few-shot
visual learning without forgetting. In Proceedings of the
IEEE Conference on Computer Vision and Pattern Recognition, pages 4367–4375, 2018.

Object detection with limited labels

weakly-supervised setting

  • [3] Hakan Bilen and Andrea Vedaldi. Weakly supervised deep
    detection networks. In Proceedings of the IEEE Conference
    on Computer Vision and Pattern Recognition, pages 2846–
    2854, 2016.

  • [7] Ali Diba, Vivek Sharma, Ali Mohammad Pazandeh, Hamed
    Pirsiavash, and Luc Van Gool. Weakly supervised cascaded
    convolutional networks. In CVPR, 2017.

  • [36] Hyun Oh Song, Yong Jae Lee, Stefanie Jegelka, and Trevor
    Darrell. Weakly-supervised discovery of visual pattern configurations. In Advances in Neural Information Processing
    Systems, pages 1637–1645, 2014.
