PROMISE: a tool to identify genomic features with specific biological associations with multiple endpoint variables

PROMISE: a tool to identify genomic features with a specific biologically interesting pattern of associations with multiple endpoint variables - Motivation: In some applications, prior biological knowledge can be used to define a specific pattern of association of multiple endpoint variables with a genomic variable that is biologically most interesting. However, to our knowledge, there is no statistical procedure designed to detect specific patterns of association with multiple endpoint variables.
Results: Projection onto the most interesting statistical evidence (PROMISE) is proposed as a general procedure to identify genomic variables that exhibit a specific biologically interesting pattern of association with multiple endpoint variables. Biological knowledge of the endpoint variables is used to define a vector that represents the biologically most interesting values for statistics that characterize the associations of the endpoint variables with a genomic variable. A test statistic is defined as the dot-product of the vector of the observed association statistics and the vector of the most interesting values of the association statistics. By definition, this test statistic is proportional to the length of the projection of the observed vector of correlations onto the vector of most interesting associations. Statistical significance is determined via permutation. In simulation studies and an example application, PROMISE shows greater statistical power to identify genes with the interesting pattern of associations than classical multivariate procedures, individual endpoint analyses or listing genes that have the pattern of interest and are significant in more than one individual endpoint analysis.
Availability: Documented R routines are freely available from www.stjuderesearch.org/depts/biostats and will soon be available as a Bioconductor package from www.bioconductor.org.
Supplementary information: Supplementary data are available at Bioinformatics online.