Novel Data Science Methodologies for Essential Genes Identification Based on Network Analysis
Book chapter
Publication date:
2023
Abstract:
Essential genes (EGs) are fundamental for the growth and survival of a cell
or an organism. Identifying EGs is an important issue in many areas of biomedical
research, such as synthetic and systems biology, drug development, and mechanistic and
therapeutic investigations. Essentiality is a context-dependent, dynamic attribute
of a gene that can vary across cells, tissues, or pathological conditions, and wet-lab
experimental procedures to identify EGs are costly and time-consuming. Commonly
explored computational approaches are based on machine learning techniques
applied to protein-protein interaction networks, but they are often unsuccessful, especially
in the case of human genes. From a biological point of view, identifying
node essentiality attributes is a challenging task; likewise, from a data
science perspective, suitable graph learning approaches remain an open problem.
Node classification in graph modeling/analysis is a machine learning task that
predicts an unknown node property based on defined node attributes. The model is
trained on both the relationship information and the node attributes. Here, we
propose the use of a context-specific integrated network enriched with biological
and topological attributes. To tackle the node classification task, we exploit different
machine learning and deep learning models. An extensive experimental phase demonstrates
the effectiveness of both the network structure and the attributes associated with the nodes
for EG identification.
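The node classification setting described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy graph, the gene names, the `expression` attribute, the labels, and the nearest-centroid classifier, which merely stands in for the machine and deep learning models the chapter evaluates. The sketch only shows how topological information (degree, neighborhood) and a node attribute can be combined into features for predicting an unknown node label.

```python
from collections import defaultdict

# Toy undirected network; names, edges, attributes and labels are all hypothetical.
edges = [("g1", "g2"), ("g1", "g3"), ("g1", "g4"), ("g2", "g3"),
         ("g4", "g5"), ("g5", "g6"), ("g3", "g7"), ("g1", "g7")]
expression = {"g1": 0.9, "g2": 0.8, "g3": 0.7, "g4": 0.3,
              "g5": 0.2, "g6": 0.1, "g7": 0.6}  # stand-in biological attribute
labels = {"g1": 1, "g2": 1, "g5": 0, "g6": 0}    # 1 = essential, 0 = non-essential


def build_adjacency(edge_list):
    """Return a node -> set-of-neighbors mapping for an undirected graph."""
    adj = defaultdict(set)
    for u, v in edge_list:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def node_features(adj, attr):
    """Per node: [normalized degree, own attribute, mean attribute of neighbors]."""
    max_deg = max(len(nbrs) for nbrs in adj.values())
    feats = {}
    for node, nbrs in adj.items():
        mean_nbr = sum(attr[n] for n in nbrs) / len(nbrs)
        feats[node] = [len(nbrs) / max_deg, attr[node], mean_nbr]
    return feats


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict_unlabeled(feats, known):
    """Assign each unlabeled node the class of its nearest class centroid."""
    centroids = {
        cls: centroid([feats[n] for n, c in known.items() if c == cls])
        for cls in set(known.values())
    }
    return {
        node: min(centroids, key=lambda cls: squared_distance(vec, centroids[cls]))
        for node, vec in feats.items() if node not in known
    }


adjacency = build_adjacency(edges)
features = node_features(adjacency, expression)
predictions = predict_unlabeled(features, labels)
print(predictions)
```

In this toy setting, the unlabeled nodes are classified from the labeled ones using both their position in the network and their own attribute; a real pipeline would replace the centroid rule with a trained classifier or a graph neural network.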
CRIS type:
02.01 Contribution in volume (Chapter or Essay)
Keywords:
Data science; Node classification; Essential genes identification; Integrated network
Author list:
Maddalena, Lucia; Giordano, Maurizio; Granata, Ilaria
Book title:
Data Science in Applications
Published in: