Solving imbalanced learning with outlier detection and features reduction

Article
Publication date:
2023
Abstract:
Class imbalance is a critical problem for several real-world applications. In contexts such as fraud detection or medical diagnostics, standard machine learning models fail because they are designed to handle balanced class distributions. Existing solutions typically increase the number of rare-class instances by generating synthetic records to achieve a balanced class distribution. However, these procedures generate implausible data and tend to introduce unnecessary noise. We propose a change of perspective: instead of relying on resampling techniques, we rely on unsupervised feature engineering approaches to represent records with a combination of features that helps the classifier capture the differences among classes, even in the presence of imbalanced data. Thus, we combine a large array of outlier detection, feature projection, and feature selection approaches to augment the expressiveness of the dataset population. We show the effectiveness of our proposal in a deep and wide set of benchmarking experiments as well as in real case studies.
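As a rough illustration of the feature-augmentation idea described in the abstract, the sketch below appends an unsupervised outlier score to each record as an extra feature instead of resampling the rare class. It is not the authors' implementation: the k-nearest-neighbour distance score, the toy data, and all function names are assumptions chosen for illustration, and the feature projection and feature selection steps mentioned in the abstract are not reproduced.

```python
import math
import random


def knn_outlier_scores(rows, k=3):
    """Score each row by its mean Euclidean distance to its k nearest neighbours."""
    scores = []
    for i, row in enumerate(rows):
        dists = sorted(
            math.dist(row, other) for j, other in enumerate(rows) if j != i
        )
        scores.append(sum(dists[:k]) / k)
    return scores


def augment_with_outlier_score(rows, k=3):
    """Return copies of the rows with the outlier score appended as a new feature."""
    scores = knn_outlier_scores(rows, k)
    return [list(row) + [score] for row, score in zip(rows, scores)]


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# Hypothetical toy data: 95 majority records in a tight cluster and
# 5 minority records spread around a shifted centre (no resampling applied).
rng = random.Random(0)
majority = [[rng.gauss(0.0, 0.5), rng.gauss(0.0, 0.5)] for _ in range(95)]
minority = [[rng.gauss(4.0, 1.0), rng.gauss(4.0, 1.0)] for _ in range(5)]
X = majority + minority
y = [0] * len(majority) + [1] * len(minority)

# Each record keeps its original features and gains one unsupervised feature.
X_aug = augment_with_outlier_score(X, k=3)

mean_major = mean(row[-1] for row, label in zip(X_aug, y) if label == 0)
mean_minor = mean(row[-1] for row, label in zip(X_aug, y) if label == 1)
print(f"mean outlier score, majority class: {mean_major:.3f}")
print(f"mean outlier score, minority class: {mean_minor:.3f}")
```

In this toy setting the sparse minority records receive larger scores than the dense majority records, so the added column gives a downstream classifier one more signal for separating the classes. The labels `y` are used only to summarise the scores, not to compute them.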
CRIS type:
01.01 Journal article
Keywords:
Imbalanced data learning; Outlier detection; Features reduction; Features selection; Classification framework
Author list:
Guidotti, Riccardo
Link to full record:
https://iris.cnr.it/handle/20.500.14243/452168
Link to full text:
https://iris.cnr.it//retrieve/handle/20.500.14243/452168/137786/prod_490298-doc_204277.pdf
Published in:
MACHINE LEARNING
Journal
General data

URL

https://link.springer.com/article/10.1007/s10994-023-06448-0