
Merged consensus clustering to assess and improve class discovery with microarray data.

Title: Merged consensus clustering to assess and improve class discovery with microarray data.
Publication Type: Journal Article
Year of Publication: 2010
Authors: Simpson, T Ian; Armstrong, J Douglas; Jarman, AP
Journal: BMC Bioinformatics
Volume: 11
Pagination: 590
Date Published: 2010
ISSN: 1471-2105
Keywords: Cluster Analysis; Databases, Genetic; Gene Expression Profiling; Oligonucleotide Array Sequence Analysis; Pattern Recognition, Automated
Abstract

BACKGROUND: One of the most commonly performed tasks when analysing high-throughput gene expression data is to use clustering methods to classify the data into groups. A large number of clustering methods are available, but it is often unclear which method is best suited to the data and how to quantify the quality of the classifications produced.

RESULTS: Here we describe an R package containing methods to analyse the consistency of clustering results from any number of different clustering methods using resampling statistics. These methods allow the identification of the best supported clusters and additionally rank cluster members by their fidelity within the cluster. These metrics allow us to compare the performance of different clustering algorithms under different experimental conditions and to select those that produce the most reliable clustering structures. We show the application of this method to simulated data, canonical gene expression experiments and our own novel analysis of genes involved in the specification of the peripheral nervous system in the fruit fly, Drosophila melanogaster.

CONCLUSIONS: Our package enables users to apply the merged consensus clustering methodology conveniently within the R programming environment, providing both analysis and graphical display functions for exploring clustering approaches. It extends the basic principle of consensus clustering by allowing the merging of results between different methods to provide an averaged clustering robustness. We show that this extension is useful in correcting for the tendency of clustering algorithms to treat outliers differently within datasets. The R package, clusterCons, is freely available at CRAN and SourceForge under the GNU public licence.
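
The idea summarised in the abstract can be illustrated with a minimal sketch. It is written in plain Python and is not the clusterCons API; every function name below is hypothetical. It assumes that a consensus matrix records, for each pair of items, the fraction of resamples in which the pair was clustered together, that merging is a plain element-wise average of the per-method matrices, and that cluster and member robustness are mean consensus values over the members of a cluster. The two toy 1-D clustering methods are stand-ins chosen only to keep the example self-contained.

```python
import random
from itertools import combinations


def kmeans_1d(values, k, rng, iters=20):
    """Tiny 1-D k-means (toy stand-in); returns one label per value."""
    centers = rng.sample(values, k)
    labels = [0] * len(values)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centers[c] = sum(members) / len(members)
    return labels


def gap_split_1d(values, k, rng=None):
    """Cut the sorted values at the k-1 widest gaps (single linkage in 1-D)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    cuts = set(sorted(range(1, len(order)),
                      key=lambda p: values[order[p]] - values[order[p - 1]],
                      reverse=True)[:k - 1])
    labels, current = [0] * len(values), 0
    for pos, idx in enumerate(order):
        if pos in cuts:
            current += 1
        labels[idx] = current
    return labels


def consensus_matrix(values, cluster_fn, k, reps=50, frac=0.8, seed=0):
    """Fraction of resamples in which each co-sampled pair shares a cluster."""
    rng = random.Random(seed)
    n = len(values)
    together = [[0] * n for _ in range(n)]
    sampled = [[0] * n for _ in range(n)]
    size = max(k, int(frac * n))
    for _ in range(reps):
        idx = rng.sample(range(n), size)          # resample without replacement
        labels = cluster_fn([values[i] for i in idx], k, rng)
        for a, b in combinations(range(size), 2):
            i, j = idx[a], idx[b]
            sampled[i][j] += 1
            sampled[j][i] += 1
            if labels[a] == labels[b]:
                together[i][j] += 1
                together[j][i] += 1
    return [[1.0 if i == j
             else (together[i][j] / sampled[i][j] if sampled[i][j] else 0.0)
             for j in range(n)] for i in range(n)]


def merge(matrices):
    """Assumed merge rule: element-wise mean of the consensus matrices."""
    n = len(matrices[0])
    return [[sum(m[i][j] for m in matrices) / len(matrices) for j in range(n)]
            for i in range(n)]


def cluster_robustness(cm, members):
    """Mean consensus over all pairs of members in one cluster."""
    pairs = list(combinations(members, 2))
    return sum(cm[i][j] for i, j in pairs) / len(pairs)


def member_robustness(cm, members):
    """Mean consensus between each member and the rest of its cluster."""
    return {i: sum(cm[i][j] for j in members if j != i) / (len(members) - 1)
            for i in members}


# Two tight groups plus one point (index 8) lying between them.
data = [0.0, 0.1, 0.2, 0.3, 5.0, 5.1, 5.2, 5.3, 2.4]
km_cm = consensus_matrix(data, kmeans_1d, k=2)
gap_cm = consensus_matrix(data, gap_split_1d, k=2)
merged_cm = merge([km_cm, gap_cm])
ranking = member_robustness(merged_cm, [0, 1, 2, 3, 8])
```

Sorting `ranking` by value orders the members of the candidate cluster by how consistently they stay with it across resamples and methods, which is the kind of member ranking the abstract describes. Averaging the matrices means a pair is rated as robust only to the extent that the methods agree on it.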

DOI: 10.1186/1471-2105-11-590
Alternate Journal: BMC Bioinformatics
PubMed ID: 21129181
PubMed Central ID: PMC3002369
Grant List: 077266 / / Wellcome Trust / United Kingdom
