
Hierarchical Bayesian text modeling for the unsupervised joint analysis of latent topics and semantic clusters

Academic Article
Publication Date:
2022
Abstract:
Topic modeling and document clustering can be unified so that each task benefits the other. In this manuscript, we propose two unsupervised approaches that model and carry out the two tasks jointly. Each approach relies on its own Bayesian generative model of topics, contents and clusters in textual corpora. These models treat topics and clusters as linked latent factors in document wording. In particular, under the generative model of the second approach, textual documents are characterized by topic distributions that are allowed to vary around the topic distributions of their membership clusters. Within the devised models, algorithms are designed to implement Rao-Blackwellized Gibbs sampling together with parameter estimation. These algorithms are derived mathematically so that topic modeling and document clustering are performed simultaneously and in an interrelated manner. A comparative empirical evaluation demonstrates the effectiveness of the presented approaches against different families of state-of-the-art competitors, both in clustering real-world benchmark text collections and in uncovering their underlying semantics. In addition, a case study provides a qualitative analysis of results on real-world text corpora.
Iris type:
01.01 Journal article
Keywords:
Bayesian text analysis; Topic modeling; Document clustering; Hierarchical priors
List of contributors:
Ortale, Riccardo; Costa, Giovanni
Authors of the University:
COSTA GIOVANNI
ORTALE RICCARDO
Handle:
https://iris.cnr.it/handle/20.500.14243/417075
Published in:
INTERNATIONAL JOURNAL OF APPROXIMATE REASONING
Journal
URL

http://www.scopus.com/record/display.url?eid=2-s2.0-85130191403&origin=inward