Efficient Boolean implementation of universal sequence maps

 

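[Editor's note]

Some bioinformaticians have long been working to develop new representations of biological sequences that make sequence analysis and comparison easier. In 1990, Jeffrey published a method called Chaos Game Representation (CGR), which attracted wide attention. The work presented here is a new advance built on the CGR method.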

 

Recently, Almeida and Vinga offered a new approach for the representation of arbitrary discrete sequences, referred to as Universal Sequence Maps (USM), and discussed its applicability to genomic sequence analysis. Their work generalizes and extends Chaos Game Representation (CGR) of DNA for arbitrary discrete sequences.
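To make the CGR idea concrete, here is a minimal sketch of the classic iteration for DNA: each new point is the midpoint between the previous point and the corner of the unit square assigned to the current nucleotide. The corner assignment in `CORNERS`, the starting point, and the function name `cgr_points` are assumptions chosen for illustration, not taken from the paper.

```python
# Hypothetical corner assignment; other layouts are equally valid.
CORNERS = {"A": (0, 0), "C": (0, 1), "G": (1, 1), "T": (1, 0)}


def cgr_points(seq, start=(0.5, 0.5)):
    """Return the CGR point reached after each symbol of seq."""
    x, y = start
    points = []
    for base in seq:
        cx, cy = CORNERS[base]
        # Move halfway from the current point toward the symbol's corner.
        x = (x + cx) / 2
        y = (y + cy) / 2
        points.append((x, y))
    return points


if __name__ == "__main__":
    for base, point in zip("ACGT", cgr_points("ACGT")):
        print(base, point)
```

Because each step halves the distance to a corner, two positions whose preceding symbols agree land close together, and the longer the shared history, the closer they are. USM extends the same construction to alphabets of any size by giving each symbol a binary code and using one coordinate per bit.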

We have considered issues associated with the practical implementation of USMs and offer a variation on the algorithm that 1) eliminates the overestimation of similar segment lengths, 2) permits the identification of arbitrarily long similar segments in the context of finite word length coordinate representations, 3) uses more computationally efficient operations, and 4) provides a simple conversion for recovering the USM coordinates. Computational performance comparisons and examples are provided.
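The general flavour of a Boolean approach can be sketched as follows. This is an illustration under stated assumptions, not the authors' algorithm: each coordinate is held as a fixed-width integer, a new symbol is shifted in at the most significant bit, and the length of the shared history at two positions is read from the leading zeros of an XOR. The word length `WORD`, the 2-bit code in `BITS`, the all-zero starting state, and the function names are all hypothetical. The sketch only measures matches up to `WORD` symbols; handling longer segments would need extra steps not shown here.

```python
WORD = 32  # assumed word length in bits
# Hypothetical 2-bit code per nucleotide: (x bit, y bit).
BITS = {"A": (0, 0), "C": (0, 1), "G": (1, 1), "T": (1, 0)}


def encode(seq, word=WORD):
    """Return integer (x, y) coordinates per position, newest symbol in the top bit."""
    x = y = 0
    coords = []
    for base in seq:
        bx, by = BITS[base]
        # Shift the history right by one and insert the new bit at the top.
        x = (x >> 1) | (bx << (word - 1))
        y = (y >> 1) | (by << (word - 1))
        coords.append((x, y))
    return coords


def shared_suffix_len(coords_a, i, coords_b, j, word=WORD):
    """Count identical symbols ending at positions i and j (at most word)."""
    xa, ya = coords_a[i]
    xb, yb = coords_b[j]
    diff = (xa ^ xb) | (ya ^ yb)  # a set bit marks a mismatching symbol
    leading_equal = word - diff.bit_length()
    # Cap by the number of symbols actually read so far at each position.
    return min(leading_equal, i + 1, j + 1)


def to_unit_interval(coord, word=WORD):
    """Convert an integer coordinate to a value in [0, 1)."""
    return coord / (1 << word)


if __name__ == "__main__":
    a = encode("ACGTACGT")
    b = encode("TTGTACGT")
    print(shared_suffix_len(a, 7, b, 7))
```

With this layout, dividing an integer coordinate by 2 to the power `WORD` gives the position that the midpoint iteration would reach when started from the origin, which is one simple way to go back from the bit pattern to a unit-square coordinate.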

We have shown that the desirable properties of the USM encoding of nucleotide sequences can be retained in a practical implementation of the algorithm. In addition, the proposed implementation enables determination of local sequence identity at increased speed.

 

For more details, see the original paper.

 


1999-2005 Bioinformation Center, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences
All rights reserved.