Publication Date:
2015
Abstract:
In the object recognition community, much effort has been spent on devising expressive object representations
and powerful learning strategies for designing effective classifiers, capable of achieving high accuracy
and generalization. In this scenario, the focus on training sets has historically been weak; by and
large, training sets have been generated with substantial human intervention, requiring considerable
time. In this paper, we present a strategy for automatic training set generation. The strategy uses semantic
knowledge from WordNet, coupled with the statistical power of Google Ngram, to
select a set of meaningful text strings related to the textual class label (e.g., "cat"), which are subsequently fed
into the Google Images search engine, producing sets of images with high training value. Focusing on the
classes of several object recognition benchmarks (PASCAL VOC 2012, Caltech-256, ImageNet, GRAZ and
OxfordPet), our approach collects training images that are novel with respect to those obtained by querying
Google Images with the plain class label alone. In particular, we show that the gathered images better
capture the different visual facets of a concept, thus encoding the intra-class variance
more successfully. As a consequence, training standard classifiers on these data yields performance
not far from that obtained with classical hand-crafted training sets. In addition, our datasets
generalize well and are stable, that is, they yield similar performance across diverse test datasets. The
whole process requires no manual intervention and completes in a few hours.
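The query-expansion step the abstract describes (semantically related strings from WordNet, ranked by Google Ngram frequency, then sent to Google Images) can be sketched as follows. This is a minimal illustration only: the related-term map and frequency counts below are toy stand-ins for WordNet and Google Ngram, and the function name `expand_label` is hypothetical, not from the paper.

```python
# Toy stand-in for WordNet: terms semantically related to the class label "cat".
RELATED_TERMS = {
    "cat": ["tabby cat", "siamese cat", "kitten", "wildcat", "mouser"],
}

# Toy stand-in for Google Ngram corpus frequencies (hypothetical counts).
NGRAM_FREQ = {
    "tabby cat": 120_000,
    "siamese cat": 95_000,
    "kitten": 800_000,
    "wildcat": 60_000,
    "mouser": 5_000,
}

def expand_label(label, k=3, min_freq=50_000):
    """Return the k most frequent related strings for a class label.

    Rare strings (below min_freq) are discarded, mimicking the use of
    n-gram statistics to keep only meaningful, common query strings.
    """
    candidates = [t for t in RELATED_TERMS.get(label, [])
                  if NGRAM_FREQ.get(t, 0) >= min_freq]
    candidates.sort(key=lambda t: NGRAM_FREQ[t], reverse=True)
    return candidates[:k]

# Each surviving string would then be fed to an image search engine,
# yielding images that cover distinct visual facets of the concept.
print(expand_label("cat"))  # ['kitten', 'tabby cat', 'siamese cat']
```

In the paper's pipeline, issuing several such expanded queries instead of the bare label "cat" is what captures the intra-class variance the abstract refers to.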
CRIS Type:
01.01 Journal article
Keywords:
Internet search; Object recognition; Semantics; Training dataset; WordNet
Author list:
Zeni, Nicola; Setti, Francesco; Ferrario, Roberta
Link to full record:
Published in: