Skip to Main Content (Press Enter)

Logo CNR
  • ×
  • Home
  • Persone
  • Pubblicazioni
  • Strutture
  • Competenze

UNI-FIND
Logo CNR

|

UNI-FIND

cnr.it
  • ×
  • Home
  • Persone
  • Pubblicazioni
  • Strutture
  • Competenze
  1. Strutture

Learning robust web wrappers

Contributo in Atti di convegno
Data di Pubblicazione:
2005
Abstract:
A main challenge in wrapping web data is to make wrappers robust w.r.t. variations in HTML sources, reducing human effort as much as possible. In this paper we develop a new approach to speed up the specification of robust wrappers, allowing the wrapper designer to not care about detailed definition of extraction rules. The key-idea is to enable a schema-based wrapping system to automatically generalize an original wrapper w.r.t. a set of example HTML documents. To accomplish this objective, we propose to exploit the notions of extraction rule and wrapper subsumption for computing a most general wrapper which still shares the extraction schema with the original wrapper, while maximizes the generalization of extraction rules w.r.t. the set of example documents. © Springer-Verlag Berlin Heidelberg 2005.
Tipologia CRIS:
04.01 Contributo in Atti di convegno
Keywords:
Wrapping systems
Elenco autori:
Fazzinga, Bettina
Autori di Ateneo:
FAZZINGA BETTINA
Link alla scheda completa:
https://iris.cnr.it/handle/20.500.14243/306260
  • Dati Generali

Dati Generali

URL

http://www.scopus.com/inward/record.url?eid=2-s2.0-26844472769&partnerID=q2rCbXpz
  • Utilizzo dei cookie

Realizzato con VIVO | Designed by Cineca | 26.5.0.0 | Sorgente dati: PREPROD (Ribaltamento disabilitato)