Publication Date:
2019
abstract:
We present what is, to our knowledge, the first work on automatic age identification for Italian texts. For this work we built a dataset of more than 2,400,000 posts extracted from publicly available forums and containing authorship attribution metadata, such as age and gender. We developed an age classifier and performed a set of experiments to evaluate whether the correct age of a user can be assigned and which information is useful for this task: lexical information, or linguistic information spanning different levels of linguistic description. The experiments show the importance of lexical information in age classification, but also that there exists a writing style that relates to the age of a user.
Iris type:
04.01 Contribution in conference proceedings
Keywords:
authorship profiling
List of contributors:
Cimino, Andrea; Dell'Orletta, Felice
Published in: