Classifying Websites by industry sector: a study in feature design

Conference proceedings contribution
Publication date:
2015
Abstract:
Classifying companies by industry sector is an important task in finance, since it allows investors and research analysts to analyse specific subsectors of local and global markets for investment monitoring and planning purposes. Traditionally this classification activity has been performed manually, by dedicated specialists carrying out in-depth analysis of a company's public profile. However, this approach is increasingly unsuitable in today's globalised markets, in which new companies spring up, old companies cease to exist, and existing companies refocus their efforts on different sectors at an astounding pace. As a result, tools for performing this classification automatically are increasingly needed. We address the problem of classifying companies by industry sector via the automatic classification of their websites, since the latter provide rich information about the nature of the company and the market segment it targets. We have built a website classification system and tested its accuracy on a dataset of more than 20,000 company websites classified according to a 2-level taxonomy of 216 leaf classes explicitly designed for market research purposes. Our experimental study provides interesting insights as to which types of features are the most useful for this classification task.
CRIS type:
04.01 Conference proceedings contribution
Keywords:
Website classification
Author list:
Berardi, Giacomo; Esuli, Andrea; Fagni, Tiziano; Sebastiani, Fabrizio
Institution-affiliated authors:
ESULI ANDREA
FAGNI TIZIANO
SEBASTIANI FABRIZIO
Link to the full record:
https://iris.cnr.it/handle/20.500.14243/310006
Link to the full text:
https://iris.cnr.it//retrieve/handle/20.500.14243/310006/161018/prod_346021-doc_159204.pdf
General data

URL

http://dl.acm.org/citation.cfm?id=2695722&CFID=734381158&CFTOKEN=34893976