GSA-PCA : gene set generation by principal component analysis of the Laplacian matrix of a metabolic network

dc.contributor.author: Jacobson, Dan
dc.contributor.author: Emerton, Guy
dc.date.accessioned: 2013-06-26T12:41:06Z
dc.date.available: 2013-06-26T12:41:06Z
dc.date.issued: 2012-08
dc.date.updated: 2013-04-15T11:08:53Z
dc.description: The original publication is available at http://www.biomedcentral.com/1471-2105/13/197 [en_ZA]
dc.description: Publication of this article was funded by the Stellenbosch University Open Access Fund.
dc.description.abstract:
Background: Gene Set Analysis (GSA) has proven to be a useful approach to microarray analysis. However, most of the method development for GSA has focused on the statistical tests to be used rather than on the generation of sets that will be tested. Existing methods of set generation are often overly simplistic. The creation of sets from individual pathways (in isolation) is a poor reflection of the complexity of the underlying metabolic network. We have developed a novel approach to set generation via the use of Principal Component Analysis of the Laplacian matrix of a metabolic network. We have analysed a relatively simple data set to show the difference in results between our method and the current state-of-the-art pathway-based sets.
Results: The sets generated with this method are semi-exhaustive and capture much of the topological complexity of the metabolic network. The semi-exhaustive nature of this method has also allowed us to design a hypergeometric enrichment test to determine which genes are likely responsible for set significance. We show that our method finds significant aspects of biology that would be missed (i.e. false negatives) and addresses the false positive rates found with the use of simple pathway-based sets.
Conclusions: The set generation step for GSA is often neglected but is a crucial part of the analysis as it defines the full context for the analysis. As such, set generation methods should be robust and yield as complete a representation of the extant biological knowledge as possible. The method reported here achieves this goal and is demonstrably superior to previous set analysis methods. [en_ZA]
dc.description.version: Publishers' Version [en_ZA]
dc.format.extent: 23 p. : ill.
dc.identifier.citation: Jacobson, D. & Emerton, G. 2012. GSA-PCA: gene set generation by principal component analysis of the Laplacian matrix of a metabolic network. BMC Bioinformatics, 13(1):197, doi.org/10.1186/1471-2105-13-197. [en_ZA]
dc.identifier.issn: 1471-2105 (online)
dc.identifier.issn: 1471-2105 (print)
dc.identifier.other: doi.org/10.1186/1471-2105-13-197
dc.identifier.uri: http://hdl.handle.net/10019.1/80938
dc.language.iso: en_ZA [en_ZA]
dc.language.rfc3066: en
dc.publisher: BioMed Central [en_ZA]
dc.rights.holder: Dan Jacobson et al.; licensee BioMed Central Ltd. [en_ZA]
dc.subject: Gene Set Analysis (GSA) [en_ZA]
dc.subject: Set analysis methods [en_ZA]
dc.subject: Microarray analysis [en_ZA]
dc.title: GSA-PCA : gene set generation by principal component analysis of the Laplacian matrix of a metabolic network [en_ZA]
dc.type: Article [en_ZA]
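
Illustrative note: the abstract names two computational ingredients, Principal Component Analysis of the Laplacian matrix of a network and a hypergeometric enrichment test. The Python sketch below shows what these could look like on a toy graph. It is not the authors' implementation: the six-node network, the rule of taking the `set_size` nodes with the largest absolute loadings per component, and all function names are assumptions made for illustration only, and the mapping from network nodes to genes is not modelled.

```python
import math
import numpy as np


def laplacian(n_nodes, edges):
    """Unnormalised Laplacian L = D - A of an undirected, unweighted graph."""
    adj = np.zeros((n_nodes, n_nodes))
    for i, j in edges:
        adj[i, j] = adj[j, i] = 1.0
    return np.diag(adj.sum(axis=1)) - adj


def pca_sets(lap, n_components=2, set_size=3):
    """Hypothetical set rule: per principal component, keep the set_size
    nodes with the largest absolute loadings."""
    centered = lap - lap.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return [
        sorted(np.argsort(-np.abs(vt[k]))[:set_size].tolist())
        for k in range(n_components)
    ]


def hypergeom_pvalue(population, successes, draws, hits):
    """Upper-tail probability P(X >= hits) for sampling without replacement."""
    total = math.comb(population, draws)
    tail = sum(
        math.comb(successes, k) * math.comb(population - successes, draws - k)
        for k in range(hits, min(successes, draws) + 1)
    )
    return tail / total


# Toy network (assumption): two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]
lap = laplacian(6, edges)
sets = pca_sets(lap)

# 10 items in total, 4 of them "significant", a set of 3, 2 of which are significant.
p = hypergeom_pvalue(10, 4, 3, 2)  # 40/120 = 1/3
```

In this sketch each candidate set is derived from the topology of the whole network rather than from a single pathway, which is the contrast the abstract draws with pathway-based sets.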
Files
Original bundle
Now showing 1 - 5 of 13

Name: 1471-2105-13-197.xml
Size: 74.42 KB
Format: Extensible Markup Language

Name: 1471-2105-13-197.pdf
Size: 1.66 MB
Format: Adobe Portable Document Format

Name: 1471-2105-13-197-S8.XLSX
Size: 18.58 KB
Format: Microsoft Excel XML

Name: 1471-2105-13-197-S4.PDF
Size: 211.99 KB
Format: Adobe Portable Document Format

Name: 1471-2105-13-197-S10.PDF
Size: 166.13 KB
Format: Adobe Portable Document Format

License bundle
Now showing 1 - 1 of 1

Name: license.txt
Size: 1.95 KB
Format: Item-specific license agreed upon to submission