Publication Date:
2018
Abstract:
Research in information science and scholarly communication relies heavily on openly accessible metadata datasets and, where possible, the corresponding payloads. To this end, CrossRef plays a pivotal role by providing free access to its entire metadata collection and by allowing other initiatives to link to and enrich its information. As a consequence, key pieces of information end up scattered across diverse datasets and resources freely available online. Because of this fragmentation, researchers in this domain struggle daily with integration problems and produce a plethora of ad-hoc datasets, wasting time and resources and violating open science best practices.
This software package provides the Spark scripts to generate DOIBoost, a metadata collection that enriches CrossRef with inputs from Microsoft Academic Graph, ORCID, and Unpaywall in order to support high-quality and robust research.
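The enrichment idea described above can be illustrated as a DOI-keyed left join. The following is a minimal sketch in plain Python, not the package's actual Spark scripts: the function names (`normalize_doi`, `index_by_doi`, `enrich`) and the record field names are hypothetical, and the real source schemas are not described in this record.

```python
from typing import Any, Dict, Iterable, List

Record = Dict[str, Any]

# Common resolver prefixes that may precede a bare DOI (assumed list).
_DOI_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalize_doi(doi: str) -> str:
    """Lowercase a DOI and strip a resolver prefix so sources can be matched."""
    doi = doi.strip().lower()
    for prefix in _DOI_PREFIXES:
        if doi.startswith(prefix):
            return doi[len(prefix):]
    return doi


def index_by_doi(records: Iterable[Record]) -> Dict[str, Record]:
    """Build a lookup table from normalized DOI to record, skipping records without a DOI."""
    index: Dict[str, Record] = {}
    for record in records:
        doi = record.get("doi")
        if doi:
            index[normalize_doi(doi)] = record
    return index


def enrich(crossref: Iterable[Record], **sources: Iterable[Record]) -> List[Record]:
    """Left-join each CrossRef record with every source that shares its DOI.

    Matching fields from a source are nested under that source's name,
    so the CrossRef fields are never overwritten.
    """
    indexes = {name: index_by_doi(records) for name, records in sources.items()}
    enriched: List[Record] = []
    for record in crossref:
        key = normalize_doi(record["doi"])
        merged = dict(record, doi=key)
        for name, index in indexes.items():
            match = index.get(key)
            if match is not None:
                merged[name] = {k: v for k, v in match.items() if k != "doi"}
        enriched.append(merged)
    return enriched
```

Calling `enrich(crossref_records, unpaywall=unpaywall_records, orcid=orcid_records)` keeps every CrossRef record and attaches data only from the sources that share its DOI. In Spark the same pattern would typically be expressed as left outer joins on a normalized DOI column.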
CRIS Type:
05.11 Software
Keywords:
Python; Spark; Dataset; CrossRef; Unpaywall; ORCID; Microsoft Academic Graph; Enrichment; Metadata; Aggregation
Authors:
LA BRUZZO, SANDRO FABRIZIO
Link to the full record: