Publication Date:
2016
Abstract:
In this study we ask whether simplifying the data in dialectometrical studies by removing infrequent forms helps to uncover the geographical structure in dialect data. By investigating lexical variation in a large corpus of Tuscan dialect data via hierarchical bipartite spectral graph partitioning, we identify the main geographical areas together with their linguistic basis. To assess the influence of infrequent forms, we conduct two analyses: one that includes only lexical variants used by at least 0.5% of the informants, and another that includes all lexical variants in the data. We show that using all the data yields a geographical characterization with a more adequate linguistic basis than using the trimmed data.
Iris type:
02.01 Contribution in volume (Chapter or Essay)
Keywords:
dialectometrical studies; dialectology; dialect data; lexical variation; Tuscan
List of contributors:
Montemagni, Simonetta
Book title:
The Future of Dialects