Ain't that sweet. Reflections on scene level indexing and annotation in the House Corpus Project

Book chapter
Publication date:
2019
Abstract:
This paper outlines the strategies, rationale and potential uses motivating the construction of the House Corpus, a one-million-word corpus that can be accessed by authorised users through the MWSWeb site (Taibi et al. 2015a) at http://openmws.itd.cnr.it. Part 1 illustrates the tools and techniques used to index the corpus data - transcriptions of all 177 episodes in the House M.D. series (original US version). In particular, it describes the commercially available Elasticsearch (https://www.elastic.co), used as an indexing, annotational and search tool. Part 2 explains that this is a multimedia corpus allowing viewings of different types of scene. The 6000-plus scenes in the corpus have been annotated in terms of their typological features: Location type (e.g. patient's hospital room; medical lab etc.); Event type (handover; differential diagnosis; precipitating medical event; patient examination etc.) and Character Group type (doctor/doctor; doctor/patient; doctor/caregiver; patient/caregiver etc.). The project envisages the development of various retrieval interfaces, initially Words, Scenes and Dialogues. This will make it possible to carry out searches in terms of types of scene and their distribution across the corpus without necessarily involving any other form of searching. Part 3 suggests the value of multimedia corpora in encouraging students to advance their critical discourse analysis (CDA) skills. As an example, it shows how the corpus can illustrate the priority of (inter)textual over lexicogrammatical considerations when formulating tag questions in oral discourse. Finally, the Discussion section argues that a typology of scenes appears to be an essential prerequisite for the construction of other types of access to the corpus data in subsequent stages of the project.
CRIS type:
02.01 Contribution in volume (Chapter or Essay)
Keywords:
House Corpus; indexing; scene annotation; functionality planning; CDA
Author list:
Taibi, Davide
Institutional authors:
TAIBI DAVIDE
Link to full record:
https://iris.cnr.it/handle/20.500.14243/424414
General data

URL

http://siba-ese.unisalento.it/index.php/lispett/article/view/21417