Clustering of the SOM easily reveals distinct gene expression patterns
[Editor's note] A freshly published article from BMC; its application of the SOM method is worth studying.
Background A
method to evaluate and analyze the massive data generated by a series of
microarray experiments is of utmost importance to reveal the hidden
patterns of gene expression. Because of the complexity and the high
dimensionality of microarray gene expression profiles, the dimensional
reduction of raw expression data and the feature selection necessary for,
e.g., classification of disease samples remain a challenge. To solve the
problem, we propose a two-level analysis. First, a self-organizing map (SOM)
is used; the SOM is a vector quantization method that simplifies and reduces
the dimensionality of the original measurements and visualizes each individual
tumor sample in a SOM component plane. Hierarchical clustering and K-means
clustering are then used to identify patterns of gene expression
useful for classification of samples.
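The two-level idea described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the authors' implementation: the data are synthetic (60 toy "genes" measured in 6 toy "samples"), the map size (4 x 4), the linear decay schedules, the number of epochs and k = 2 are arbitrary choices, and only the K-means half of the second level is shown. All function names are hypothetical.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_unit(protos, x):
    """Index of the prototype closest to x (the best-matching unit)."""
    return min(range(len(protos)), key=lambda k: sq_dist(protos[k], x))


def train_som(data, rows, cols, epochs=30, lr0=0.5, seed=0):
    """Level 1: fit a rows x cols SOM and return its prototype vectors."""
    rng = random.Random(seed)
    grid = [(r, c) for r in range(rows) for c in range(cols)]
    # Start every prototype at a copy of a randomly chosen input vector.
    protos = [list(rng.choice(data)) for _ in grid]
    sigma0 = max(rows, cols) / 2.0
    total = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in rng.sample(data, len(data)):  # one shuffled pass over the data
            frac = step / total
            lr = lr0 * (1.0 - frac)                  # learning rate decays linearly
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # neighbourhood radius shrinks
            br, bc = grid[best_unit(protos, x)]
            for (r, c), w in zip(grid, protos):
                # Gaussian neighbourhood on the map grid around the winning unit
                h = math.exp(-((r - br) ** 2 + (c - bc) ** 2) / (2.0 * sigma ** 2))
                for j, xj in enumerate(x):
                    w[j] += lr * h * (xj - w[j])
            step += 1
    return protos


def kmeans(points, k, iters=50, seed=0):
    """Level 2: plain Lloyd's K-means; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [best_unit(centers, p) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster went empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Synthetic toy data (NOT the lymphoma data): two opposite expression patterns.
rng = random.Random(42)
pattern_a = [2.0, 2.0, 2.0, -2.0, -2.0, -2.0]
pattern_b = [-2.0, -2.0, -2.0, 2.0, 2.0, 2.0]
genes = [[v + rng.gauss(0.0, 0.3) for v in pattern_a] for _ in range(30)]
genes += [[v + rng.gauss(0.0, 0.3) for v in pattern_b] for _ in range(30)]

prototypes = train_som(genes, rows=4, cols=4)   # level 1: 60 genes -> 16 prototypes
proto_labels = kmeans(prototypes, k=2)          # level 2: cluster the prototypes
# Each gene inherits the cluster of its best-matching SOM unit.
gene_labels = [proto_labels[best_unit(prototypes, g)] for g in genes]
print(gene_labels)
```

The point of the design is that the second-level clustering runs on the 16 prototype vectors rather than on the 60 raw profiles. On this well-separated toy input, the first 30 genes should end up sharing one label and the last 30 the other.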
Results We
tested the two-level analysis on public data from diffuse large B-cell
lymphomas. The analysis easily distinguished major gene expression
patterns without the need for supervision: a germinal center-related, a
proliferation, an inflammatory and a plasma cell differentiation-related
gene expression pattern. The first three patterns matched the patterns
described in the original publication by supervised clustering analysis,
whereas the fourth one was novel.
Conclusions Our study shows that, when a SOM is used as an intermediate step in the analysis of genome-wide gene expression data, gene expression patterns can be revealed more easily. The summarized “expression display” allows the clinician to evaluate the classification options rather than giving a fixed decision, which reflects the real-world situation.
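Assuming the “expression display” refers to the SOM component planes mentioned in the Background, the following minimal sketch shows what such a plane contains: for one input dimension (here, one sample), the value of that component in every map unit, laid out on the map grid. The 2 x 2 map and its prototype values are made up for illustration, and `component_plane` is a hypothetical helper name.

```python
def component_plane(prototypes, rows, cols, component):
    """Return a rows x cols grid holding one component of every SOM prototype.

    `prototypes` is assumed to be stored in row-major order, one vector per unit.
    """
    return [
        [prototypes[r * cols + c][component] for c in range(cols)]
        for r in range(rows)
    ]


# Made-up prototypes of a 2 x 2 map trained on 2 samples (illustration only).
toy_prototypes = [
    [1.0, -1.0], [0.5, -0.5],
    [-0.5, 0.5], [-1.0, 1.0],
]

plane_sample_0 = component_plane(toy_prototypes, rows=2, cols=2, component=0)
plane_sample_1 = component_plane(toy_prototypes, rows=2, cols=2, component=1)
for row in plane_sample_0:
    print(row)
```

Rendering each plane as a heat map, one per sample, gives a side-by-side view in which samples with similar expression patterns show similar planes.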
1999-2005 Bioinformation Center, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences