Garbled-word embeddings for jumbled text

Conference proceedings contribution
Publication date:
2021
Abstract:
"Aoccdrnig to a reasrech at Cmabrigde Uinervtisy, it deosn't mttaer in waht oredr the ltteers in a wrod are, the olny itmopnrat tihng is taht the frist and lsat ltteer be at the rghit pclae. The rset can be a toatl mses and you can sitll raed it wouthit porbelm. Tihs is bcuseae the huamn mnid deos not raed ervey lteter by istlef, but the wrod as a wlohe". We investigate the extent to which this phenomenon applies to computers as well. Our hypothesis is that computers are able to learn distributed word representations that are resilient to character reshuffling, without incurring a significant loss in performance in tasks that use these representations. If our hypothesis is confirmed, this may form the basis for a new and more efficient way of encoding character-based representations of text in deep learning, and one that may prove especially robust to misspellings, or to corruption of text due to OCR. This paper discusses some fundamental psycho-linguistic aspects that lie at the basis of the phenomenon we investigate, and reports on a preliminary proof of concept of the above idea.
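The reshuffling described in the abstract (first and last letter kept in place, interior letters permuted) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `garble_word`, `garble_text`, and `shuffle_invariant_key` are hypothetical, tokenization is a naive whitespace split (so trailing punctuation counts as the last character), and the sorted-interior key is only one assumed way to obtain a string that does not change under this kind of garbling.

```python
import random


def garble_word(word: str, rng: random.Random) -> str:
    """Shuffle the interior letters of a word; keep first and last in place."""
    if len(word) <= 3:
        # Words with fewer than two interior letters cannot change.
        return word
    middle = list(word[1:-1])
    rng.shuffle(middle)
    return word[0] + "".join(middle) + word[-1]


def garble_text(text: str, seed: int = 0) -> str:
    """Garble every whitespace-separated token (assumption: naive tokenization)."""
    rng = random.Random(seed)
    return " ".join(garble_word(w, rng) for w in text.split())


def shuffle_invariant_key(word: str) -> str:
    """Hypothetical canonical form: first letter, sorted interior, last letter.

    Any two strings that differ only by a permutation of their interior
    letters map to the same key.
    """
    if len(word) <= 3:
        return word
    return word[0] + "".join(sorted(word[1:-1])) + word[-1]


if __name__ == "__main__":
    original = "the human mind does not read every letter by itself"
    garbled = garble_text(original, seed=42)
    print(garbled)
    # The keys of the original and garbled tokens coincide.
    same = [
        shuffle_invariant_key(a) == shuffle_invariant_key(b)
        for a, b in zip(original.split(), garbled.split())
    ]
    print(all(same))
```

Under these assumptions, the garbled form "reasrech" from the quotation and the correct spelling "research" share the same key, which is the kind of invariance a garbling-resilient word representation would need to capture.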
CRIS type:
04.01 Conference proceedings contribution
Keywords:
Garbled-Word Embeddings; Garbled Words; Misspellings; Distributional Semantic Models
Authors:
Sperduti, Gianluca; MOREO FERNANDEZ, ALEJANDRO DAVID; Sebastiani, Fabrizio
Institutional authors:
MOREO FERNANDEZ ALEJANDRO DAVID
SEBASTIANI FABRIZIO
Link to the full record:
https://iris.cnr.it/handle/20.500.14243/398920
Link to the full text:
https://iris.cnr.it//retrieve/handle/20.500.14243/398920/115688/prod_457946-doc_177824.pdf
Book title:
IIR 2021 - 11th Italian Information Retrieval Workshop
Published in:
CEUR WORKSHOP PROCEEDINGS
Series
General data

URL

http://ceur-ws.org/Vol-2947/