
such as clustering or retrieval, prohibitively expensive, as
each probe has to go through the network paired up with
every gallery image.
In this paper we show that, contrary to current opin-
ion, a plain CNN with a triplet loss can outperform current
state-of-the-art approaches on both the Market-1501 [39]
and MARS [38] datasets. The triplet loss allows us to per-
form end-to-end learning between the input image and the
desired embedding space. This means we directly optimize
the network for the final task, which renders an additional
metric learning step obsolete. Instead, we can simply com-
pare persons by computing the Euclidean distance of their
embeddings.
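To make this comparison step concrete, the following minimal NumPy sketch (illustrative only; the embedding dimension, random inputs, and function name are placeholders, not part of the proposed pipeline) ranks gallery images for a probe purely by the Euclidean distance between their embeddings:

```python
import numpy as np

def rank_gallery(probe_emb, gallery_embs):
    """Rank gallery entries by Euclidean distance to a probe embedding.

    probe_emb:    (D,)   embedding of the query person
    gallery_embs: (N, D) embeddings of the N gallery images
    Returns gallery indices sorted from most to least similar.
    """
    dists = np.linalg.norm(gallery_embs - probe_emb[None, :], axis=1)
    return np.argsort(dists)

# Toy usage with random 128-D embeddings (dimensions are illustrative only).
rng = np.random.default_rng(0)
probe = rng.normal(size=128)
gallery = rng.normal(size=(100, 128))
print(rank_gallery(probe, gallery)[:5])  # five closest gallery indices
```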
A possible reason for the unpopularity of the triplet loss
is that, when applied naïvely, it will indeed often produce
disappointing results. An essential part of learning using
the triplet loss is the mining of hard triplets, as otherwise
training will quickly stagnate. However, mining such hard
triplets is time consuming and it is unclear what defines
“good” hard triplets [23, 24]. Even worse, selecting overly
hard triplets too often makes the training unstable. We show
how this problem can be alleviated, resulting in both faster
training and better performance. We systematically analyze
the design space of triplet losses, and evaluate which one
works best for person ReID. While doing so, we place two
previously proposed variants [7, 25] into this design space
and discuss them in more detail in Section 2. Specifically,
we find that the best performing version has not been used
before. Furthermore, we show that a margin-less formu-
lation performs slightly better, while removing one hyper-
parameter.
Another clear trend seems to be the use of pretrained
models such as GoogleNet [28] or ResNet-50 [12]. In-
deed, pretrained models often obtain great scores for person
ReID [8, 41], while ever fewer top-performing approaches
use networks trained from scratch [18, 1, 6, 34, 24, 31, 3].
Some authors even argue that training from scratch is
bad [8]. However, using pretrained networks also leads to
a design lock-in, and does not allow for the exploration
of new deep learning advances or different architectures.
We show that, when following best practices in deep learn-
ing, networks trained from scratch can perform competi-
tively for person ReID. Furthermore, we do not rely on
network components specifically tailored towards person
ReID, but train a plain feed-forward CNN, unlike many other
approaches that train from scratch [1, 31, 18, 34, 27]. In-
deed, our networks using pretrained weights obtain the best
results, but our far smaller architecture obtains respectable
scores, providing a viable alternative for applications where
person ReID needs to be performed on resource-constrained
hardware, such as embedded devices.
In summary, our contribution is twofold: First, we introduce
variants of the classic triplet loss that render mining of hard
triplets unnecessary, and we systematically evaluate these
variants. Second, we show that, contrary to the prevailing
opinion, using a triplet loss and no special layers, we achieve
state-of-the-art results both with a pretrained CNN and with
a model trained from scratch.
2. Learning Metric Embeddings, the Triplet
Loss, and the Importance of Mining
The goal of metric embedding learning is to learn a function
$f_\theta(x) : \mathbb{R}^F \rightarrow \mathbb{R}^D$ which maps semantically similar points
from the data manifold in $\mathbb{R}^F$ onto metrically close points
in $\mathbb{R}^D$. Analogously, $f_\theta$ should map semantically different
points in $\mathbb{R}^F$ onto metrically distant points in $\mathbb{R}^D$. The
function $f_\theta$ is parametrized by $\theta$ and can be anything
ranging from a linear transform [33, 19, 36, 22] to complex
non-linear mappings usually represented by deep neural
networks [7, 6, 8]. Let $D(x, y) : \mathbb{R}^D \times \mathbb{R}^D \rightarrow \mathbb{R}$ be a metric
function measuring distances in the embedding space. For
clarity we use the shortcut notation $D_{i,j} = D(f_\theta(x_i), f_\theta(x_j))$,
where we omit the indirect dependence of $D_{i,j}$ on the
parameters $\theta$. As is common practice, all loss terms are
divided by the number of summands in a batch; we omit this
term in the following equations for conciseness.
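For concreteness, the following sketch (assuming the Euclidean metric for $D$ and operating on a batch of already-computed embeddings $f_\theta(x_i)$; this is illustrative code, not an implementation from the paper) evaluates all pairwise distances $D_{i,j}$ within a batch:

```python
import numpy as np

def pairwise_distances(embeddings, squared=False):
    """Compute D_{i,j} = D(f_theta(x_i), f_theta(x_j)) for a batch.

    embeddings: (B, D) array, the embeddings f_theta(x_i) of a batch of size B.
    Returns a (B, B) matrix of (squared) Euclidean distances.
    """
    # ||a - b||^2 = ||a||^2 - 2 a.b + ||b||^2, clipped at 0 for numerical safety.
    dot = embeddings @ embeddings.T
    sq_norms = np.diag(dot)
    d2 = np.maximum(sq_norms[:, None] - 2.0 * dot + sq_norms[None, :], 0.0)
    return d2 if squared else np.sqrt(d2)
```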
Weinberger and Saul [33] explore this topic with the explicit
goal of performing k-nearest neighbor classification in the
learned embedding space and propose the “Large Margin
Nearest Neighbor loss” for optimizing $f_\theta$:
$$\mathcal{L}_{\mathrm{LMNN}}(\theta) = (1 - \mu)\,\mathcal{L}_{\mathrm{pull}}(\theta) + \mu\,\mathcal{L}_{\mathrm{push}}(\theta), \qquad (1)$$
which consists of a pull-term, pulling data points $i$ towards
their target neighbor $T(i)$ from the same class, and a
push-term, pushing data points $n$ from a different class
further away:
$$\mathcal{L}_{\mathrm{pull}}(\theta) = \sum_{i,\, j \in T(i)} D_{i,j}, \qquad (2)$$
$$\mathcal{L}_{\mathrm{push}}(\theta) = \sum_{\substack{a,n \\ y_a \neq y_n}} \left[ m + D_{a,T(a)} - D_{a,n} \right]_+ . \qquad (3)$$
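To make Eqs. (2) and (3) concrete, the following sketch (our own illustrative NumPy code, not an implementation from [33], and assuming for simplicity a single fixed target neighbor $T(i)$ per point) evaluates both terms from precomputed distances and labels:

```python
import numpy as np

def lmnn_losses(dists, labels, targets, margin):
    """Pull and push terms of the LMNN loss, Eqs. (2) and (3).

    dists:   (B, B) matrix of distances D_{i,j} in the embedding space.
    labels:  (B,) class labels y_i.
    targets: (B,) index T(i) of each point's fixed same-class target neighbor.
    margin:  the margin m.
    """
    idx = np.arange(len(labels))
    d_target = dists[idx, targets]                   # D_{i, T(i)}
    pull = d_target.sum()                            # Eq. (2)

    diff_class = labels[:, None] != labels[None, :]  # pairs (a, n) with y_a != y_n
    hinge = np.maximum(margin + d_target[:, None] - dists, 0.0)
    push = hinge[diff_class].sum()                   # Eq. (3)
    return pull, push
```

The full loss of Eq. (1) is then `(1 - mu) * pull + mu * push`, up to the normalization by the number of summands mentioned above.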
Because the motivation was nearest-neighbor classification,
allowing disparate clusters of the same class was an explicit
goal, achieved by choosing fixed target neighbors at the on-
set of training. Since this property is harmful for retrieval
tasks such as face and person ReID, FaceNet [23] proposed
a modification of $\mathcal{L}_{\mathrm{LMNN}}(\theta)$ called the “Triplet loss”:
$$\mathcal{L}_{\mathrm{tri}}(\theta) = \sum_{\substack{a,p,n \\ y_a = y_p \neq y_n}} \left[ m + D_{a,p} - D_{a,n} \right]_+ . \qquad (4)$$
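A direct translation of Eq. (4) into code could look as follows; this is a minimal NumPy sketch that simply sums over all valid triplets in a batch, leaving out batch construction and any mining strategy:

```python
import numpy as np

def triplet_loss_all(dists, labels, margin):
    """Triplet loss of Eq. (4), summed over all valid (a, p, n) triplets.

    dists:  (B, B) matrix of distances D_{i,j} in the embedding space.
    labels: (B,) class labels y_i.
    margin: the margin m.
    """
    same = labels[:, None] == labels[None, :]
    B = len(labels)
    loss = 0.0
    for a in range(B):
        for p in range(B):
            if p == a or not same[a, p]:      # require y_a = y_p, p != a
                continue
            for n in range(B):
                if same[a, n]:                # require y_a != y_n
                    continue
                loss += max(margin + dists[a, p] - dists[a, n], 0.0)
    return loss
```

A sum over every valid triplet like this is quickly dominated by terms whose hinge is already zero, which is why the choice of triplets, i.e. mining, matters so much in practice.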
This loss makes sure that, given an anchor point $x_a$, the
projection of a positive point $x_p$ belonging to the same class