AUTHORS: Conrad Droste and Javier De Las Rivas

Bioinformatics and Functional Genomics Group

Cancer Research Center (CiC-IBMCC, CSIC/USAL/IBSAL)

37007 Salamanca, SPAIN.

CONTACT: jrivas(AT)usal(.)es - AND - conrad.droste(AT)usal(.)es

1. Introduction

Large-scale and small-scale experiments on the interactions of genes (as DNA and RNA units), proteins and other molecular components within cells are producing large amounts of interaction data, which are stored in many different biomolecular databases. The human interactome, for example, is composed of around 20,000 protein-coding genes, around 1,000 metabolites and a still undefined number of distinct proteins and functional RNA molecules (Vidal, Cusick & Barabasi, 2011). In total, more than 100,000 cellular components are expected to form the biomolecular interactome in human cells. These components are related to each other in different ways, and the number of relations and functional associations substantially exceeds the number of components, which makes the interactome a complex system that is difficult to depict and analyze. Despite this complexity, cellular interactomes can be rendered as biomolecular networks that integrate different layers of information to generate comprehensive relational spaces and provide a better representation and view of cellular systems. Moreover, these networks can be analyzed computationally to explore and quantify the centrality and weight of the different partners, and to find clusters or modules of highly related elements. Enabling this kind of analysis is the main objective of the bioinformatic tool presented here, called Path2enet.

Path2enet is a tool that generates biomolecular networks derived from biological pathways, integrated with protein-protein interaction (PPI) data and expression data in a cell-specific context (i.e. using data from specific tissues, cell types or experimental conditions).

Path2enet is developed as an open-source, open-access R package that allows the construction of pathway-expression-networks by reading and integrating information from biological pathways, protein interaction data and cell-specific expression data.

The user can easily search for interactions in the pathways that Path2enet translates into networks. It is also possible to combine several pathways into meta-pathways presented as unified networks, providing a global view of the biological interactions that can occur in a cell. Such global views avoid the separation between pathways, which is often drawn following arbitrary criteria.
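The idea of a meta-pathway can be illustrated with a minimal, language-agnostic sketch. It is written in plain Python and is not Path2enet's own R interface: the function name `merge_pathways`, the pathway names and the gene labels `G1` to `G4` are all hypothetical. The sketch assumes each pathway is a list of directed edges and builds the unified network as the union of those edges, remembering which pathways contributed each one.

```python
from collections import defaultdict

def merge_pathways(pathways):
    """Union the edges of several pathways into one network.

    Returns a dict mapping each (source, target) edge to the set of
    pathway names in which that edge appears.
    """
    merged = defaultdict(set)
    for name, edges in pathways.items():
        for source, target in edges:
            merged[(source, target)].add(name)
    return dict(merged)

# Hypothetical toy input: two pathways that share the edge G2 -> G3.
pathways = {
    "pathway_A": [("G1", "G2"), ("G2", "G3")],
    "pathway_B": [("G2", "G3"), ("G3", "G4")],
}

meta = merge_pathways(pathways)

# Edges present in more than one pathway are the points where the
# separate pathways connect inside the unified network.
shared = sorted(edge for edge, names in meta.items() if len(names) > 1)
print(len(meta), "edges in the meta-pathway; shared:", shared)
```

Keeping the source pathways attached to each edge means the unified view does not lose the original annotation, so a shared edge can still be traced back to every pathway that contains it.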

After generating these specific pathway-expression-networks, the tool provides several visualization and analysis features. Path2enet uses the R package igraph (Csardi & Nepusz, 2006) to generate networks that can be loaded into the Cytoscape software (Saito et al., 2012). In short, a clear goal of the tool is to help biological researchers create individual networks that match their specific cellular conditions and interpret their studies in a network-based context.

Translating biological pathways into protein networks

We have developed a bioinformatic tool called Path2enet that translates biological pathways into protein networks, integrating several layers of information about the biomolecular nodes in a multiplex view. Path2enet is an R package that reads the relations and links between proteins stored in a comprehensive database of biological pathways (KEGG, the Kyoto Encyclopedia of Genes and Genomes) and integrates them with expression data from various resources and with data on physical protein-protein interactions (from APID, the Agile Protein Interactomes DataServer). Path2enet uses the expression data to decide whether a given protein in a network (i.e. a node) is active (ON) or not (OFF) in a specific cellular context or sample type. In this way, Path2enet reduces the complexity of the networks and reveals the proteins that are active (expressed) under specific conditions.
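The ON/OFF rule described above can be sketched in a few lines. This is an illustration in plain Python under stated assumptions, not the package's R implementation. The function names, the fixed threshold of 5.0, the expression values and the edge list are all invented for the example (the gene symbols are real Notch-pathway members, but the numbers are not measured data). How Path2enet actually derives the ON/OFF call from expression data is not specified here; a simple cutoff is used only as a stand-in.

```python
# Hypothetical toy network: edges and expression values are invented.
pathway_edges = [
    ("DLL1", "NOTCH1"),
    ("JAG1", "NOTCH1"),
    ("NOTCH1", "RBPJ"),
    ("RBPJ", "HES1"),
]
expression = {"DLL1": 2.1, "JAG1": 7.4, "NOTCH1": 9.0, "RBPJ": 8.2, "HES1": 3.0}

def node_states(expression, threshold):
    """Label each node ON if its expression reaches the threshold, else OFF."""
    return {
        gene: ("ON" if value >= threshold else "OFF")
        for gene, value in expression.items()
    }

def active_subnetwork(edges, states):
    """Keep only the edges whose two endpoints are both ON."""
    return [
        (a, b)
        for a, b in edges
        if states.get(a) == "ON" and states.get(b) == "ON"
    ]

# Assumed cutoff; a real analysis would choose it from the data.
states = node_states(expression, threshold=5.0)
active_edges = active_subnetwork(pathway_edges, states)

print(states)
print(active_edges)
```

With these invented values, two of the five nodes fall below the cutoff and the four-edge pathway shrinks to the two edges that connect expressed proteins, which is the kind of complexity reduction the paragraph describes.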

As a proof of concept, this work presents a practical use case that generates the pathway-expression-networks corresponding to the Notch Signaling Pathway in human B and T lymphocytes. This case is produced by analyzing and integrating in Path2enet an experimental dataset of genome-wide expression microarrays obtained from these cell types (i.e. B cells and T cells):