SILA: A spatial instance learning approach for deep webpages

Conference proceedings contribution
Publication date:
2011
Abstract:
Deep Web pages convey very relevant information for different application domains like e-government, e-commerce, social networking. For this reason there is a constant high interest in efficiently, effectively and automatically extracting data from Deep Web data sources. In this paper we present SILA, a novel Spatial Instance Learning Approach, that allows for extracting data records from Deep Web pages by exploiting both the spatial arrangement and the presentation features of data items/fields produced by layout engines of Web browsers in visualizing Deep Web pages on the screen. SILA is independent from the internal HTML encodings of Web pages, and allows for recognizing data records in pages having multiple data regions in which data items are arranged by many different presentation layouts. Experimental results show that SILA has very high precision and recall and that it works much better than MDR and ViNTs approaches. © 2011 ACM.
CRIS type:
04.01 Conference proceedings contribution
Keywords:
deep web; instance learning; web information extraction; web wrapping
Author list:
Ruffolo, Massimo; Oro, Ermelinda
Institution-affiliated authors:
ORO ERMELINDA
RUFFOLO MASSIMO
Link to the full record:
https://iris.cnr.it/handle/20.500.14243/253596
Book title:
CIKM '11 Proceedings of the 20th ACM international conference on Information and knowledge management

General data

URL

http://www.scopus.com/inward/record.url?eid=2-s2.0-83055161475&partnerID=q2rCbXpz