Iterative Annotation of Biomedical NER Corpora with Deep Neural Networks and Knowledge Bases

Academic Article
Publication Date:
2022
abstract:
The wide availability of clinical natural language documents, such as clinical narratives and diagnoses, calls for smart automatic systems to process and analyze them. However, the lack of annotated corpora in the biomedical domain, especially in languages other than English, makes it difficult to exploit state-of-the-art machine-learning systems to extract information from such documents. As a result, healthcare professionals miss significant opportunities that could arise from the analysis of these data. In this paper, we propose a methodology to reduce the manual effort needed to annotate a biomedical named entity recognition (B-NER) corpus. It exploits both active learning and distant supervision, based respectively on deep learning models (e.g., Bi-LSTM, word2vec, FastText, ELMo and BERT) and biomedical knowledge bases, in order to speed up the annotation task and limit class imbalance issues. We assessed this approach by creating an Italian-language electronic health record corpus annotated with biomedical domain entities in a small fraction of the time required for fully manual annotation. The resulting corpus was used to train a B-NER deep neural network whose performance is comparable with the state of the art, with F1-scores of 0.9661 and 0.8875 on two test sets.
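As a rough illustration of the two ideas named in the abstract, the sketch below pairs a dictionary-based distant supervision step with a least-confidence active learning selection step. It is not the authors' implementation: the function names, the toy knowledge base, the entity labels (DISEASE, SYMPTOM) and the confidence values are all hypothetical, and a real system would take its confidences from a trained model rather than hard-coded numbers.

```python
from typing import Dict, List, Tuple


def distant_labels(tokens: List[str], kb: Dict[Tuple[str, ...], str]) -> List[str]:
    """Assign BIO tags by longest-match lookup in a (hypothetical) knowledge base.

    `kb` maps lowercased token tuples to an entity label.
    """
    tags = ["O"] * len(tokens)
    lowered = [t.lower() for t in tokens]
    max_len = max((len(key) for key in kb), default=0)
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate span first, then shrink it.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            label = kb.get(tuple(lowered[i:i + n]))
            if label is not None:
                tags[i] = "B-" + label
                for j in range(i + 1, i + n):
                    tags[j] = "I-" + label
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return tags


def select_for_annotation(token_confidences: List[List[float]], k: int) -> List[int]:
    """Pick the k sentences whose least confident token has the lowest score.

    Each inner list holds the model's top-class probability for every token
    of one sentence (assumed to come from some already trained tagger).
    """
    scores = [min(conf) if conf else 1.0 for conf in token_confidences]
    return sorted(range(len(scores)), key=lambda idx: scores[idx])[:k]


# Toy knowledge base and sentence (illustrative values only).
toy_kb = {("diabete", "mellito"): "DISEASE", ("febbre",): "SYMPTOM"}
sentence = ["Paziente", "con", "diabete", "mellito", "e", "febbre"]
auto_tags = distant_labels(sentence, toy_kb)

# Made-up per-token confidences for three unlabeled sentences.
confidences = [[0.9, 0.95], [0.4, 0.99], [0.7, 0.6]]
to_review = select_for_annotation(confidences, k=2)
```

In a loop of this kind, the automatically tagged sentences would pre-fill the annotation, while the selected low-confidence sentences would be routed to a human annotator before the model is retrained.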
Iris type:
01.01 Journal article
Keywords:
Biomedical NER; Corpus annotation; Distant supervision; Active learning; Deep Learning
List of contributors:
Ciampi, Mario; Gargiulo, Francesco; Silvestri, Stefano
Authors of the University:
CIAMPI MARIO
GARGIULO FRANCESCO
SILVESTRI STEFANO
Handle:
https://iris.cnr.it/handle/20.500.14243/446511
Published in:
APPLIED SCIENCES
Journal
Overview

URL

https://www.mdpi.com/2076-3417/12/12/5775