Identifying molecular markers associated with classification of genotypes by External Logistic Biplots

Identifying molecular markers associated with classification of genotypes by External Logistic Biplots - For characterization of genetic diversity in genotypes several molecular techniques, usually resulting in a binary data matrix, have been used. Despite the fact that in Cluster Analysis (CA) and Principal Coordinates Analysis (PCoA) the interpretation of the variables responsible for grouping is not straightforward, these methods are commonly used to classify genotypes using DNA molecular markers. In this article, we present a novel algorithm that uses a combination of PCoA, CA and Logistic Regression (LR), as a better way to interpret the variables (alleles or bands) associated to the classification of genotypes. The combination of three standard techniques with some new ideas about the geometry of the procedures, allows constructing an External Logistic Biplot (ELB) that helps in the interpretation of the variables responsible for the classification or ordination. An application of the method to study the genetic diversity of four populations from Africa, Asia and Europe, using the HapMap data is included.
Availability: The Matlab code for implementing the methods may be obtained from the web site: http://biplot.usal.es.
Supplementary information: Supplementary data are available at Bioinformatics online.