Bioinformatics and Functional Genomics Research Group
Cancer Research Center (CiC-IBMCC, CSIC/USAL/IBSAL)
Campos-Laborie FJ, Risueño A, Ortiz-Estevez M, Roson-Burgo B, Droste C, Fontanillo C, Loos R, Sanchez-Santos JM, Trotter MW and De Las Rivas J
DECO: DEcomposing heterogeneous Cohorts by Omic data profiling
ABSTRACT: Patient diversity is one of the main challenges when dealing with large cohorts in clinical studies.
Here we propose a method to analyse and understand heterogeneous data without resorting to classical normalization approaches
that reduce or remove variation. Our method, called DECO (DEcomposing heterogeneous Cohorts by Omic data profiling), finds and
describes existing dependent relationships among biological features (genes) and samples (individuals) by analysing large-scale omic data.
DECO identifies the best biomarkers related to specific phenotypic conditions and possible hidden factors. The method is based
on a recursive heuristic algorithm that assigns marker features (i.e. genes, miRNAs, proteins or other biomarkers identified by
the omic technique) to subsets of samples depending on their patterns. In this way, it identifies closely related states or
subclasses within the studied cohort. The method explores the differential signal to find: (i) variation among individuals,
whether or not it is related to a priori known phenotypic classes; (ii) possible errors in class label assignment; (iii) possible outliers.
We demonstrate that DECO performs better than classical and current methods when it is applied to complex gene expression
datasets from several cancer clinical cohorts. DECO identifies the specific omic signature of individuals, making it especially
suited for deep and accurate patient stratification.
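
As a rough illustration of what "assigning marker features to subsets of samples" can mean, the following minimal Python sketch flags, for each feature, the samples whose values deviate strongly from the cohort median, and then groups the features that single out the same samples. This is not the DECO algorithm and not code from the DECO R package: it is a single-pass simplification rather than the recursive heuristic, and the function name `assign_markers`, the cutoff `k`, the median/MAD rule and the toy expression values are all assumptions made for this example.

```python
from collections import defaultdict
from statistics import median

def assign_markers(expr, samples, k=5.0):
    """Group features by the subset of samples in which they stand out.

    Hypothetical toy rule (an assumption, not DECO's method): a sample is
    flagged for a feature when its value lies more than k median absolute
    deviations (MAD) away from that feature's median across the cohort.
    """
    groups = defaultdict(list)
    for feature, values in expr.items():
        med = median(values)
        mad = median([abs(v - med) for v in values])
        if mad == 0:
            # No robust spread estimate for this feature; skip it.
            continue
        flagged = tuple(
            s for s, v in zip(samples, values) if abs(v - med) > k * mad
        )
        if flagged:
            groups[flagged].append(feature)
    return dict(groups)

# Invented toy data: four features measured in six samples.
samples = ["s1", "s2", "s3", "s4", "s5", "s6"]
expr = {
    "geneA": [5.0, 5.2, 4.9, 9.8, 10.1, 5.1],
    "geneB": [2.0, 2.1, 1.9, 6.0, 6.2, 2.2],
    "geneC": [7.0, 7.1, 6.9, 7.2, 7.0, 1.0],
    "geneD": [3.0, 3.1, 2.9, 3.2, 3.0, 3.1],
}

markers = assign_markers(expr, samples)
for subset, features in markers.items():
    print(subset, features)
```

In this toy cohort, geneA and geneB mark the same pair of samples (a candidate subclass), geneC singles out one sample (a candidate outlier), and geneD marks none.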
Additional File 1 - DECO R package
Additional File 2 - DECO R vignette and tutorial to use the method
[ARTICLE submitted for publication - September 2017]