Receiving an accession number for your manuscript
Most journals now expect that DNA and amino acid sequences
that appear in articles will be submitted to a sequence database before
publication. Soon after submission, you will receive an accession number
from the database which you will be able to use in your article to refer
to the sequence. Please be aware that it is only necessary to submit
the sequence to one database, whichever one is most convenient, without
regard for where the sequence may be published. Data exchange between
GenBank, EMBL and DDBJ occurs daily. Sequence data submitted in advance
of publication can be kept confidential if requested.
The sections below describe the ways of submitting DNA sequences to GenBank.
There are two principal tools, BankIt and Sequin. BankIt is
a Web submission tool and is recommended for simple submissions. With BankIt
you can indicate coding regions on an mRNA along with a product and gene
name. For more control over annotation, for segmented records, or for
very long entries, Sequin, a stand-alone submission tool, is suggested.
GenBank will provide you with an accession number to
identify your sequence, usually within two working days, if the submission
is received via electronic mail. This accession number serves as
confirmation that you have submitted your data, and allows the community
to retrieve the data upon reading the journal article.
The accession number should be included in your
manuscript, preferably in a footnote on the first page of the article, or
as required by individual journal procedures.
BankIt - submitting via the WWW
NCBI has developed a WWW form, called BankIt, for convenient
and quick submission of sequence data.
BankIt allows you to enter sequence information into
a form, edit as necessary, and add biological annotation (e.g., coding
regions, mRNA features). BankIt transforms your data into GenBank format
for your review; when your record is complete, it can be submitted
directly to GenBank. You have the option of adding information by using
text boxes to describe in your own words the source of the sequence
and its biological features. The GenBank annotation staff reviews the
submitted textual information, incorporates it into the appropriate
structured fields, and returns the record by e-mail for your review.
BankIt is compatible with Netscape clients for Unix,
Macs, and PCs. In addition, Internet Explorer for the PC and Mac has
been used successfully.
Sequin - stand-alone software for the Mac, PC/Windows, and UNIX
If you do not have access to the WWW, NCBI offers
a stand-alone submission program called Sequin.
Sequin is an interactive, graphically-oriented program
based on screen forms and controlled vocabularies that guides you through
the process of entering your sequence and providing biological and bibliographic
annotation. Sequin is designed to simplify the sequence submission process
and to provide graphical viewing and editing options. It incorporates
robust error checking and accommodates very long sequences and complex
annotations.
Special submissions - genomes, batch sequences, alignments
Sequin can be used for the submission of individual or small numbers of sequences.
However, it was also designed to facilitate special types of submissions,
and should be used instead of BankIt for the following types of submissions:
genomes and other very long sequences; multiple sequences such as batch
submissions and segmented sets; and population/phylogenetic/mutation studies.
When preparing the submission of a genome, you can import
the complete genome sequence into Sequin as well as a file containing
the amino acid translations in FASTA format, if available. Sequin will
automatically annotate the coding region intervals based on the translations,
and you can use Sequin to make further complex annotations. Sequin can
also accept feature annotations in tab-delimited tables. Since the final
submission file (*.sqn) will be quite large, please send it to the GenBank
staff via FTP rather than by e-mail. To request a temporary FTP directory,
please contact genomes@ncbi.nlm.nih.gov.
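As an illustration of the amino acid translation file mentioned above, the following is a minimal sketch of producing FASTA-formatted translations from coding sequences. It assumes the standard genetic code and coding sequences that are already in frame; the function names and the identifier "prot1" are hypothetical, and the definition-line conventions that Sequin expects are described in the Sequin documentation, not here.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
# (first base varies slowest, third base fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate an in-frame coding sequence, stopping at the first stop codon.

    Codons containing ambiguous bases are reported as 'X'.
    """
    cds = cds.upper().replace("U", "T")
    protein = []
    # Ignore any trailing partial codon.
    for i in range(0, len(cds) - len(cds) % 3, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def fasta_record(identifier, sequence, width=60):
    """Format one sequence as a FASTA record with fixed-width lines."""
    lines = [">" + identifier]
    lines.extend(sequence[i:i + width] for i in range(0, len(sequence), width))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical example: a nine-base coding sequence.
    print(fasta_record("prot1", translate("ATGGCCTAA")), end="")
```

Organisms or organelles that use a different genetic code would need a different table, so translations produced this way should be checked before they are imported.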
When preparing a submission that contains multiple sequences,
you can import a single file containing all the sequences in FASTA format,
or as alignments in FASTA+GAP, PHYLIP, or NEXUS format. In addition, for
population/phylogenetic/mutation studies, you can annotate one sequence
and propagate the features onto the other sequences. When you complete
the submission and select the 'prepare submission' option in the 'File'
menu, Sequin will prepare a single *.sqn file that contains all the sequences.
Send the *.sqn file by e-mail to:
gb-sub@ncbi.nlm.nih.gov
If you are submitting two or more Sequin files, each
of which contains multiple sequences, send each *.sqn file in a separate
e-mail message.
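The single FASTA file containing all the sequences, as described above, could be assembled with a short script before it is imported into Sequin. This is only a sketch: the function name, the identifiers, and the output file name are hypothetical, and the checks for unique, whitespace-free identifiers are assumptions made for the example, not stated Sequin requirements.

```python
def format_fasta(records, width=60):
    """Combine (identifier, sequence) pairs into one FASTA-formatted string."""
    lines = []
    seen = set()
    for identifier, sequence in records:
        # Assumption for this sketch: identifiers are non-empty,
        # contain no whitespace, and are unique within the file.
        if not identifier or any(ch.isspace() for ch in identifier):
            raise ValueError("invalid identifier: %r" % identifier)
        if identifier in seen:
            raise ValueError("duplicate identifier: %r" % identifier)
        seen.add(identifier)
        # Remove embedded whitespace and normalize case.
        sequence = "".join(sequence.split()).upper()
        lines.append(">" + identifier)
        lines.extend(sequence[i:i + width] for i in range(0, len(sequence), width))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical sequences for illustration only.
    batch = [("seq1", "acgtacgt"), ("seq2", "GGCC TTAA")]
    with open("batch.fasta", "w") as handle:
        handle.write(format_fasta(batch))
```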
Please refer to the Sequin Quick Guide and documentation
for additional information, both of which are accessible from the Sequin
Web page.
Sending the Data to GenBank
When using BankIt, the prepared sequence entries
are submitted directly to GenBank through the WWW.
When using Sequin, the output files for direct submission
should be sent to GenBank by electronic mail to:
gb-sub@ncbi.nlm.nih.gov
As an alternative, the submission file can be copied
to floppy disk and mailed to GenBank Submissions at:
GenBank Submissions
National Center for Biotechnology
Information
National Library of Medicine
Bldg. 38A, Room 8N-803
Bethesda, MD 20894
Please label the disk with your name and the file name,
and indicate whether it is a PC or Mac disk.
Updates
NCBI processes update requests as well as new submissions.
You can provide additional annotation, correct errors or omissions, or
request the release of a "hold-until-published" record. BankIt or Sequin
may be used for updates, or you can request changes in a narrative e-mail
message. Be sure to give the accession number of the sequence to be updated
along with all update, correction, or publication information. Send it
to:
update@ncbi.nlm.nih.gov
Submission of ESTs, STSs and GSSs
Batches of ESTs (expressed sequence tags), STSs (sequence tagged sites),
and GSSs (genome survey sequences) can be submitted via special
streamlined procedures.
Submission of HTGS Records
The NCBI has developed a protocol for high throughput
genome sequencing centers to use when they submit large genomic records
(usually Cosmids or BACs). Specialized tools, including fa2htgs and a
"genome center version" of Sequin, have been created to help such centers
produce these submission files in a convenient way. The HTG page
provides detailed submission instructions to genome centers and also
tells GenBank users how to access the HTG sequences.
Confidentiality
Some authors are concerned that the appearance of
their data in GenBank prior to publication will compromise their work.
GenBank will, upon request, withhold release of new submissions until
the paper is published.
We encourage authors to inform us when the data are
published; failure to do so could delay the release of
your data in GenBank. Please send the full publication data
(all authors, title, journal, volume, pages, and date) to the following
address:
update@ncbi.nlm.nih.gov
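For authors who prepare such notices by script, the following sketch assembles a message carrying the publication data listed above, using Python's standard email library. The subject line, the body layout, the function name, and the sample values (including the placeholder accession "XX000000") are assumptions for illustration; GenBank does not prescribe a format here, and the sketch builds the message without sending it.

```python
from email.message import EmailMessage

UPDATE_ADDRESS = "update@ncbi.nlm.nih.gov"


def build_publication_notice(sender, accession, authors, title,
                             journal, volume, pages, date):
    """Build (but do not send) a message reporting full publication data."""
    fields = {
        "Accession": accession,
        "Authors": ", ".join(authors),
        "Title": title,
        "Journal": journal,
        "Volume": volume,
        "Pages": pages,
        "Date": date,
    }
    # Every item is required so that the notice is complete.
    missing = [name for name, value in fields.items() if not value]
    if missing:
        raise ValueError("missing publication data: " + ", ".join(missing))

    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = UPDATE_ADDRESS
    msg["Subject"] = "Publication data for " + accession  # assumed wording
    msg.set_content(
        "\n".join("%s: %s" % (name, value) for name, value in fields.items())
    )
    return msg


if __name__ == "__main__":
    # All values below are placeholders, not real publication data.
    notice = build_publication_notice(
        sender="author@example.org",
        accession="XX000000",
        authors=["A. Author", "B. Author"],
        title="Placeholder title",
        journal="Placeholder Journal",
        volume="1",
        pages="1-10",
        date="1 January 2000",
    )
    print(notice)
```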
Submission of SNPs and other polymorphism data
Data on genetic variation in humans and other organisms
can be submitted to the NCBI Database of Single Nucleotide
Polymorphisms (dbSNP). Entries include single
nucleotide polymorphisms (SNPs), small-scale insertions/deletions, polymorphic
repetitive elements, and microsatellite variation. dbSNP is a separate
resource from the GenBank database, and submissions to it do not receive
the GenBank accession numbers described above. However, dbSNP entries do
receive dbSNP identifiers and contain links to associated GenBank records.
Further information about
submitting data is accessible from the sidebar of the dbSNP home page.