
Scalable k-NN based text clustering

Conference proceedings contribution
Publication date:
2015
Abstract:
Clustering items using textual features is an important problem with many applications, such as root-cause analysis of spam campaigns and the identification of common topics in social media. Due to the sheer size of such data, algorithmic scalability becomes a major concern. In this work, we present our approach to text clustering, which builds an approximate k-NN graph that is then used to compute connected components representing clusters. Our focus is to understand the scalability/accuracy trade-off that underlies our method: we do so through an extensive experimental campaign on real-life datasets, and show that even rough approximations of k-NN graphs are sufficient to identify valid clusters. Our method is scalable and can be easily tuned to meet requirements stemming from different application domains.
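The two-stage idea in the abstract (build an approximate k-NN graph, then read clusters off its connected components) can be illustrated with a minimal Python sketch. This record does not describe how the authors actually build or approximate the graph, so everything specific here is an assumption made for illustration: Jaccard similarity over whitespace tokens, random candidate sampling as the approximation, the `min_sim` pruning threshold, and the function names `approx_knn_graph` and `connected_components`.

```python
import random
from collections import defaultdict


def jaccard(a, b):
    """Jaccard similarity between two token sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def approx_knn_graph(docs, k=2, sample_size=10, min_sim=0.3, seed=0):
    """Return undirected edges linking each document to its top-k sampled neighbours.

    Assumption: the approximation is random candidate sampling, i.e. each
    document is compared with at most `sample_size` others instead of all of
    them. Edges weaker than `min_sim` are dropped so unrelated items stay apart.
    """
    rng = random.Random(seed)
    tokens = [set(d.lower().split()) for d in docs]
    n = len(tokens)
    edges = set()
    for i in range(n):
        others = [j for j in range(n) if j != i]
        candidates = rng.sample(others, min(sample_size, len(others)))
        scored = sorted(
            ((jaccard(tokens[i], tokens[j]), j) for j in candidates),
            reverse=True,
        )
        for sim, j in scored[:k]:
            if sim >= min_sim:
                edges.add((min(i, j), max(i, j)))
    return edges


def connected_components(n, edges):
    """Group node ids 0..n-1 into connected components using union-find."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb

    groups = defaultdict(list)
    for i in range(n):
        groups[find(i)].append(i)
    return sorted(groups.values())


# Toy data, invented for this example only.
docs = [
    "win a free prize now",
    "win a free prize today",
    "claim your free prize now",
    "meeting agenda for monday",
    "agenda for monday meeting notes",
]
clusters = connected_components(len(docs), approx_knn_graph(docs))
print(clusters)  # [[0, 1, 2], [3, 4]]
```

In this sketch, `sample_size` and `k` are the knobs that trade accuracy for work: fewer sampled candidates means fewer similarity computations but a rougher graph. On the toy data the sample covers every other document, so the graph is exact; a smaller `sample_size` would make it approximate.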
CRIS type:
04.01 Conference proceedings contribution
Keywords:
Clustering Algorithm; Distributed architectures
Authors:
Ricci, Laura; Lulli, Alessandro
Link to the full record:
https://iris.cnr.it/handle/20.500.14243/315394
Link to the full text:
https://iris.cnr.it//retrieve/handle/20.500.14243/315394/157427/prod_347340-doc_156947.pdf
General data

URL

http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=7363845