
Computational comparison of two mouse draft genomes and the human golden path

 

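[Editor's note]

Since the publication of the human genome sequence, the scientific community has been eagerly awaiting the results of the mouse genome sequencing project. Recently, Nature published the landmark results of the Mouse Genome Sequencing Consortium. Many believe that the mouse genome may matter even more for humanity's future than the human genome itself, because the laboratory mouse is regarded as the experimental key to understanding the human genome. Working with mouse models lets us manipulate each gene to determine its function, which will give us a detailed, in-depth understanding of many aspects of human disease and of basic human biology. The article below is a related research paper from Professor Michael Q Zhang of Cold Spring Harbor.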

 

Background

The availability of both mouse and human draft genomes has marked the beginning of a new era of comparative mammalian genomics. The two available mouse genome assemblies, namely those from the public mouse genome sequencing consortium and Celera Genomics, were obtained by using different clone libraries, as well as different assembly methods.

Results

To take advantage of both public and private mouse genome sequencing efforts, we present a critical comparison of the two latest mouse genome assemblies. The utility of the combined mouse genomes is further demonstrated by comparing them with the human golden path and through a subsequent analysis of the resulting Conserved Sequence Element (CSE) database. In particular, the CSE information allows us to identify over 6,000 potential novel genes and to derive independent estimates of the number of human protein-coding genes.

Conclusions

Although the Celera and public mouse assemblies agree to a great extent, they differ in about 10% of the mouse genome. Each assembly has advantages over the other. The Celera assembly has higher base-pair accuracy and higher overall coverage of the genome. The public mouse assembly, however, has higher sequence quality in some newly finished BAC regions, and its data are freely accessible. Perhaps most importantly, by combining both assemblies, we can obtain a better annotation of the human genome. In particular, we can obtain the most complete set of CSEs. One third of those CSEs are related to known genes. Some CSEs are related to other functional regions in the genome. More than half of the CSEs are still functionally unknown. The CSEs allow us to estimate the total number of human protein-coding genes to be about 40,000. Since CSEs can shed additional light on the functional regions of the genome, as we have demonstrated, making this searchable CSEdb publicly available online will expedite new discoveries through comparative genomics.

 

For more details, see the original article.

 


1999-2005 Bioinformation Center, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences
All rights reserved.