Publication Date:
2018
Abstract:
We propose an approach to clustering XML-based
corpora of healthcare documents by their latent topic similarity.
Our approach is a two-step process. First, the latent topic
distributions of the input healthcare documents are inferred by
performing collapsed Gibbs sampling and parameter estimation
under an XML topic model. Then, the inferred distributions
are grouped through established clustering techniques.
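The second step can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the per-document topic distributions have already been inferred, substitutes a hypothetical toy matrix `theta` for them, and uses plain k-means with Euclidean distance as one example of an established clustering technique. The function name `kmeans` and all values are illustrative assumptions.

```python
from math import dist


def kmeans(points, k, iterations=20):
    """Plain k-means on topic-distribution vectors (Euclidean distance)."""
    # Deterministic initialisation (an assumption made for reproducibility):
    # the first k points serve as the starting centroids.
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each document goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical per-document topic distributions (each row sums to 1),
# standing in for the output of the topic-inference step.
theta = [
    [0.80, 0.15, 0.05],
    [0.10, 0.10, 0.80],
    [0.75, 0.20, 0.05],
    [0.05, 0.15, 0.80],
]

labels, centroids = kmeans(theta, k=2)
print(labels)
```

In this toy input, documents dominated by the same topic end up in the same cluster. Other distance measures suited to probability distributions could be swapped in for `dist` without changing the overall structure.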
CRIS Type:
04.01 Contribution in conference proceedings
Keywords:
Topical Clusters; Semistructured Healthcare Data Analysis
Authors:
Ortale, Riccardo; Costa, Giovanni
Link to the full record: