Sequence format transformation converts peptide sequences among five formats: one-letter code, IUPAC condensed, amino acid chain, graph representation, and sequence graph. Details of the algorithm are available in the dfwlab/cyclicpepedia repository on GitHub.
Format | Example | Detail |
---|---|---|
One letter code | FGIKPPQR | The simplest representation of peptide sequences, ignoring all loops. |
IUPAC condensed | cyclo[DL-N(Me)Ala-DL-Leu-N(Me)Phe(a,b-dehydro)-Gly] | Developed by the International Union of Pure and Applied Chemistry (IUPAC). The prefix "cyclo" indicates head-to-tail cyclization. Amino acids are written with standard three-letter codes separated by '-'. Modifications are indicated within the sequence: for example, "D" and "L" give the chirality of the amino acid, and ring closure bonds are represented by "(num)". Multiple chains are separated by '.', for example: D-N(1)Ala-Arg(CONHMe)-N(Me)Phe-Asp(2)-OH.N(2)Asp(1)-OH. |
Amino acid chain | Gly(1)--Cys(2)--Asn--4OH-Pro--Ile--Trp(2)--Gly--Ile(1) | Defined by CyclicPepedia and largely consistent with IUPAC condensed. The separator is changed to "--" to handle amino acid units (monomers) whose names contain "-". |
Graph representation | aThr,Tyr,dhAbu,bOH-Gln,Gly,Gln,His,Dab,C13:2(t4.t6)-OH(2.3),Lyx,dhAbu @1,5 @6,10 @0,8 | Inspired by the NOR format from the Norine database. Monomers are separated by commas, and ring closure bonds are represented by '@idx,idy'. |
Sequence graph | G(nodes=[]; edges=[]) | The sequence graph is built with networkx from a list of nodes and a list of edges. For example: nodes = [(0, 'Gly'), (1, '4OH-Pro'), (2, 'Ala')], edges = [(0, 1), (1, 2), (0, 2)] |
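
As an illustration of how one format might map to another, here is a minimal Python sketch that turns an amino acid chain string into the node and edge lists used by the sequence graph. It is not the CyclicPepedia implementation (see the GitHub repository for that). The function name `chain_to_graph` is hypothetical, and the sketch assumes that consecutive monomers are bonded along the backbone, that a ring closure label appears as a trailing "(num)" on a monomer, and that each label occurs exactly twice.

```python
import re

# Assumption: a trailing "(n)" on a monomer marks one end of ring closure bond n.
RING_LABEL = re.compile(r"\((\d+)\)$")


def chain_to_graph(chain):
    """Parse an amino acid chain string into (nodes, edges) lists.

    Hypothetical helper for illustration only; monomers carrying more
    than one ring closure label are not handled.
    """
    nodes = []        # (index, monomer name) pairs
    backbone = []     # bonds between consecutive monomers (assumed)
    closures = []     # bonds implied by matching "(n)" labels
    open_labels = {}  # label -> index of the first monomer carrying it

    for idx, unit in enumerate(chain.split("--")):
        match = RING_LABEL.search(unit)
        name = unit[:match.start()] if match else unit
        nodes.append((idx, name))
        if idx > 0:
            backbone.append((idx - 1, idx))
        if match:
            label = match.group(1)
            if label in open_labels:
                # Second occurrence of the label closes the ring.
                closures.append((open_labels.pop(label), idx))
            else:
                open_labels[label] = idx

    return nodes, backbone + closures


nodes, edges = chain_to_graph("Gly(1)--4OH-Pro--Ala(1)")
print(nodes)  # [(0, 'Gly'), (1, '4OH-Pro'), (2, 'Ala')]
print(edges)  # [(0, 1), (1, 2), (0, 2)]
```

Splitting on "--" rather than "-" keeps monomer names such as 4OH-Pro intact, which is the reason the table gives for the changed separator. The resulting lists have the same shape as the sequence graph example in the table and could then be loaded into a networkx graph.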