Geometric SMOTE: Effective oversampling for imbalanced learning through a geometric extension of SMOTE

Douzas, Georgios; Bacao, Fernando

arXiv:1709.07377 (cs)

[Submitted on 21 Sep 2017]

Abstract: Classification of imbalanced datasets is a challenging task for standard algorithms. Although many methods exist to address this problem in different ways, generating artificial data for the minority class is a more general approach than algorithmic modifications. The SMOTE algorithm and its variations generate synthetic samples along a line segment that joins minority class instances. In this paper we propose Geometric SMOTE (G-SMOTE) as a generalization of the SMOTE data generation mechanism. G-SMOTE generates synthetic samples in a geometric region of the input space, around each selected minority instance. While in the basic configuration this region is a hyper-sphere, G-SMOTE allows its deformation to a hyper-spheroid and finally to a line segment, emulating, in the last case, the SMOTE mechanism. The performance of G-SMOTE is compared against multiple standard oversampling algorithms. We present empirical results that show a significant improvement in the quality of the generated data when G-SMOTE is used as an oversampling algorithm.
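
The contrast the abstract draws between the two generation mechanisms can be illustrated with a minimal Python sketch. It is a simplified illustration under stated assumptions, not the procedure defined in the paper: the function names and the `deformation` and `toward_neighbor` parameters are hypothetical, and the region is built by sampling a point in a hyper-sphere around a minority instance and then shrinking the components perpendicular to the direction of a chosen neighbor.

```python
import math
import random


def smote_sample(center, neighbor, rng):
    """SMOTE-style step: a random point on the segment center -> neighbor."""
    t = rng.random()
    return [c + t * (n - c) for c, n in zip(center, neighbor)]


def geometric_sample(center, neighbor, deformation=0.0,
                     toward_neighbor=False, rng=None):
    """Simplified geometric generation step (illustrative assumption only).

    deformation = 0.0 keeps the full hyper-sphere around `center` whose
    radius is the distance to `neighbor`; values between 0 and 1 flatten it
    into a hyper-spheroid; 1.0 collapses it onto the line through both points.
    toward_neighbor=True keeps only the half of the region facing `neighbor`.
    """
    rng = rng or random.Random()
    dim = len(center)
    diff = [n - c for c, n in zip(center, neighbor)]
    radius = math.sqrt(sum(d * d for d in diff))
    if radius == 0.0:
        return list(center)  # identical points: nothing to generate around
    unit = [d / radius for d in diff]

    # Uniform point in the unit ball: Gaussian direction, radius u**(1/dim).
    norm = 0.0
    while norm == 0.0:
        gauss = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(g * g for g in gauss))
    scale = rng.random() ** (1.0 / dim) / norm
    point = [g * scale for g in gauss]

    # Split into components parallel and perpendicular to center -> neighbor.
    par = sum(p * u for p, u in zip(point, unit))
    perp = [p - par * u for p, u in zip(point, unit)]
    if toward_neighbor:
        par = abs(par)  # reflect into the half facing the neighbor

    keep = 1.0 - deformation  # fraction of the perpendicular part retained
    return [c + radius * (par * u + keep * q)
            for c, u, q in zip(center, unit, perp)]


if __name__ == "__main__":
    rng = random.Random(42)
    minority, neighbor = [1.0, 2.0], [2.0, 3.0]
    print(smote_sample(minority, neighbor, rng))
    print(geometric_sample(minority, neighbor, 0.0, False, rng))
    print(geometric_sample(minority, neighbor, 1.0, True, rng))
```

In this sketch, setting `deformation=1.0` together with `toward_neighbor=True` restricts the generated points to the segment between the two instances, which mirrors the abstract's remark that the line-segment case emulates SMOTE. The distribution along the segment is not claimed to match SMOTE's.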

Comments:	22 pages, 15 figures
Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:1709.07377 [cs.LG]
	(or arXiv:1709.07377v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1709.07377

Submission history

From: Fernando Bacao
[v1] Thu, 21 Sep 2017 15:33:33 UTC (1,381 KB)
