Bioinformatics and Functional Genomics Research Group
Cancer Research Center (CiC-IBMCC, CSIC/USAL/IBSAL)
Campos-Laborie FJ, Risueño A, Ortiz-Estevez M, Roson-Burgo B, Droste C, Fontanillo C, Loos R, Sanchez-Santos JM, Trotter MW and De Las Rivas J
DECO: DEcomposing heterogeneous Cohorts using Omic data profiling
ABSTRACT: Patient diversity is one of the main challenges when dealing with large cohorts in clinical studies.
Here we propose a method to analyse and understand heterogeneous data without the classical normalization approaches that reduce
or remove variation. Our method, called DECO (DEcomposing heterogeneous Cohorts using Omic data profiling), finds and describes
the dependencies that exist between biological features (genes) and samples (individuals) by analysing large-scale omic data.
DECO identifies the biomarkers that best characterise specific phenotypic conditions and possible hidden factors.
The method is based on a recursive heuristic algorithm that assigns marker features (i.e. genes, miRNAs, proteins or other biomarkers identified by
the omic technique) to subsets of samples according to their signal patterns. In this way, it identifies closely related states or
subclasses within the studied cohorts.
The method performs a recursive exploration of differential signal changes between samples, finding variables assigned to:
(i) the main classes or groups of samples that are in the studied cohorts;
(ii) significant variation or alteration among certain individuals (whether or not related to a class known a priori);
(iii) possible errors in the class or the label given to certain samples;
(iv) sample outliers (i.e. individuals that behave differently from the main groups and have specific markers).
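As an illustration only, the idea of exploring differential signal across many sample subsets and then attaching each feature to the samples that drive it can be sketched in Python. This is not the DECO implementation, which is distributed as an R package. The function names `subsample_differential` and `assign_markers`, the mean-difference criterion, and the parameters `n_iter`, `subset_size`, `threshold` and `min_share` are all assumptions made for this toy example.

```python
import random
from statistics import mean


def subsample_differential(data, n_iter=2000, subset_size=2, threshold=4.0, seed=0):
    """Count how often each sample sits in the 'high' subset of a flagged feature.

    data: dict mapping feature name -> list of values, one value per sample.
    Hypothetical toy criterion: a feature is flagged in one iteration when the
    means of two random, disjoint sample subsets differ by at least `threshold`.
    """
    rng = random.Random(seed)
    n_samples = len(next(iter(data.values())))
    sample_ids = list(range(n_samples))
    counts = {feature: [0] * n_samples for feature in data}

    for _ in range(n_iter):
        # Draw two disjoint random subsets of samples.
        picked = rng.sample(sample_ids, 2 * subset_size)
        subset_a, subset_b = picked[:subset_size], picked[subset_size:]
        for feature, values in data.items():
            diff = mean(values[i] for i in subset_a) - mean(values[i] for i in subset_b)
            if abs(diff) >= threshold:
                # Credit every sample in the subset with the higher mean signal.
                high = subset_a if diff > 0 else subset_b
                for i in high:
                    counts[feature][i] += 1
    return counts


def assign_markers(counts, min_share=0.5):
    """Assign each flagged feature to the samples that hold at least
    `min_share` of that feature's highest per-sample count."""
    assignment = {}
    for feature, per_sample in counts.items():
        top = max(per_sample)
        if top == 0:
            continue  # never flagged: not a marker in this toy setting
        assignment[feature] = [i for i, c in enumerate(per_sample) if c >= min_share * top]
    return assignment


# Toy matrix with 8 samples: geneA is high in a subclass (samples 0-2),
# geneB is flat, geneC is high in a single outlier (sample 7).
toy = {
    "geneA": [10.0, 10.0, 10.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    "geneB": [5.0, 5.0, 5.0, 5.0, 5.0, 5.0, 5.0, 5.0],
    "geneC": [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 10.0],
}

if __name__ == "__main__":
    markers = assign_markers(subsample_differential(toy))
    for feature, samples in markers.items():
        print(feature, "->", samples)
```

On this toy matrix the subclass feature is expected to be attached to samples 0 to 2 and the outlier feature to sample 7, while the flat feature is never flagged. Counting per sample rather than per predefined class is the design choice that lets a subclass or a single outlier emerge without being labelled in advance.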
We demonstrate that DECO performs better than classical and current methods when applied to complex gene expression
datasets from several clinical cancer cohorts. DECO identifies the specific omic signature of individuals, making it especially
suited to deep and accurate patient stratification in large-scale clinical studies.
DECO algorithm: available as an R package in Bioconductor: bioconductor.org/packages/deco/
[ARTICLE submitted for publication in Bioinformatics - January 2019]