The key task of this campaign is populating the reference library of DNA barcodes for polar species. A reference barcode record is composed of three vital elements:
- A high-quality bidirectional sequence of at least 500 base pairs of the barcode fragment of COI, complete with sequence trace files;
- A collection voucher specimen preserved in an internationally recognized collection repository and available for re-examination by interested experts;
- A comprehensive data record linking the sequence to its source specimen via a unique collection number, containing vital collection information, taxonomy, and a digital photo of the source specimen.
Researchers interested in submitting polar samples for DNA barcoding are encouraged to contact the campaign coordinators to get the latest information on sampling protocols and negotiate transaction details.
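The three elements of a reference barcode record listed above can be read as a simple checklist. The Python sketch below is illustrative only: the `BarcodeRecord` class and its field names are hypothetical and do not reflect the actual BOLD data schema, and it checks only the criteria that can be expressed mechanically (it cannot tell whether a sequence is truly bidirectional or of high quality).

```python
from dataclasses import dataclass

MIN_BARCODE_LENGTH = 500  # base pairs, from the first requirement in the list


@dataclass
class BarcodeRecord:
    """Hypothetical record; field names are assumptions, not BOLD's schema."""
    sample_id: str         # unique collection number linking sequence and specimen
    sequence: str          # COI barcode sequence
    has_trace_files: bool  # sequence trace files supplied
    repository: str        # collection repository holding the voucher
    taxonomy: str          # taxonomic identification
    has_photo: bool        # digital photo of the source specimen


def missing_elements(record: BarcodeRecord) -> list[str]:
    """Return a list of the requirements this record does not yet meet."""
    problems = []
    if len(record.sequence) < MIN_BARCODE_LENGTH:
        problems.append("sequence shorter than 500 bp")
    if not record.has_trace_files:
        problems.append("sequence trace files missing")
    if not record.repository:
        problems.append("no voucher repository given")
    if not record.sample_id:
        problems.append("no unique collection number")
    if not record.taxonomy:
        problems.append("no taxonomy given")
    if not record.has_photo:
        problems.append("no specimen photo")
    return problems
```

An empty list means that none of the mechanical checks failed; it does not replace review by the campaign coordinators.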
In contrast to all other DNA depository databases, the Barcode of Life Data System (BOLD) holds detailed information for all specimens that are used for barcoding, such as collection date, locality, taxonomy, sex, geographic coordinates and elevation, as well as specimen images, which facilitate first-pass morphological identification. Additionally, the DNA sequence and the original sequence trace files (the raw electropherograms from which the sequence was read) are linked to the specimen record after a DNA barcode is generated. Naturally, this implies strict quality standards for both the voucher specimen and accompanying data. Special emphasis is placed on unambiguous association of DNA barcodes with individual specimens and corresponding data records. Thus the specimens that are used for DNA barcoding need to be preserved individually, each with a unique identifying number, or Sample ID.
Specimen Numbering Conventions
One of the key aspects of building the reference library of DNA barcodes is maintaining a solid link between the DNA barcode and the specimen from which it originates and avoiding unwanted duplication of barcode data. This becomes a particularly sensitive issue with respect to vertebrates, because tissue originating from the same individual of a rare species is not uncommonly stored at different locations (sometimes in different countries) and catalogued under different numbers. It is therefore important that standard numbering conventions are followed when assigning individual specimen identifiers (BOLD Sample IDs). It is preferred that a Sample ID refers directly to the catalogue number of a museum collection voucher and is prefixed by the standard acronym of the institution housing it. If no catalogue number is available, then a field number may be used, prefixed by the collector’s initials (usually a 3-4 letter abbreviation). Museum abbreviations should follow standard registers for biorepositories, e.g., http://www.biorepositories.org. This allows the Sample ID number to be self-explanatory if used in publications citing a given specimen. It is important that, whichever numbering convention is chosen, it is followed precisely in all data submission forms, including specimen data, plate records, and images, to allow cross-linkages between different pieces of data.
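As an illustration of the numbering convention described above, the following Python sketch builds a Sample ID from either an institution acronym plus catalogue number or collector initials plus field number, and then checks that the same IDs appear in all three kinds of submission forms. The function names, the hyphen separator, and the example acronym "MUS" and initials "JDS" are assumptions made for this example; the campaign does not prescribe them here.

```python
from typing import Iterable, Optional


def make_sample_id(institution: Optional[str] = None,
                   catalogue_number: Optional[str] = None,
                   collector: Optional[str] = None,
                   field_number: Optional[str] = None,
                   sep: str = "-") -> str:
    """Build a Sample ID, preferring the museum catalogue number.

    The separator is an assumption; use whatever the project has agreed on.
    """
    if institution and catalogue_number:
        return f"{institution}{sep}{catalogue_number}"
    if collector and field_number:
        return f"{collector}{sep}{field_number}"
    raise ValueError("need institution + catalogue number "
                     "or collector initials + field number")


def inconsistent_ids(specimen_ids: Iterable[str],
                     plate_ids: Iterable[str],
                     image_ids: Iterable[str]) -> set[str]:
    """Return IDs that do not appear identically in all three forms."""
    specimen, plate, image = set(specimen_ids), set(plate_ids), set(image_ids)
    return (specimen | plate | image) - (specimen & plate & image)


# Hypothetical examples
print(make_sample_id(institution="MUS", catalogue_number="12345"))
print(make_sample_id(collector="JDS", field_number="0042"))
```

Even a small spelling difference between forms (for example "MUS 2" instead of "MUS-2") is reported by `inconsistent_ids`, which is the kind of mismatch that breaks cross-linkage between specimen data, plate records, and images.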
To ensure quick and efficient turnaround in the lab, samples submitted to the Biodiversity Institute core sequencing facility should be compatible with the high-throughput analytical protocols used in the lab. As a key element, samples should be pre-formatted in arrays corresponding to the 96-well plates used in the lab (with one well left empty as a negative control). The BIO Core Analytical Facility has several standard sampling kits in place which aid external collaborators in formatting their samples in a compliant fashion. Please contact the campaign coordinator for more details.
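The plate format described above can be sketched in a few lines of Python. This is a minimal illustration, not the facility's actual procedure: it assumes the usual 8 x 12 well labelling (A1 to H12), fills wells row by row, and reserves H12 as the empty negative-control well. The position of the control well and the fill order are assumptions, so the facility's sampling kit instructions take precedence.

```python
ROWS = "ABCDEFGH"
COLUMNS = range(1, 13)
CONTROL_WELL = "H12"  # assumption: the last well is left empty as the control

# All 96 well labels in row-major order: A1..A12, B1..B12, ..., H1..H12
WELLS = [f"{row}{col}" for row in ROWS for col in COLUMNS]


def layout_plates(sample_ids: list[str]) -> list[dict[str, object]]:
    """Assign samples to plates of 95 wells plus one empty control well."""
    sample_wells = [well for well in WELLS if well != CONTROL_WELL]
    plates = []
    for start in range(0, len(sample_ids), len(sample_wells)):
        batch = sample_ids[start:start + len(sample_wells)]
        plate = dict(zip(sample_wells, batch))
        plate[CONTROL_WELL] = None  # negative control, no sample
        plates.append(plate)
    return plates
```

With this layout, 100 samples occupy two plates: 95 samples on the first and the remaining 5 on the second, each plate keeping its own empty control well.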
All submitted samples should be accompanied by label data containing information on the source specimen: Sample ID (voucher specimen identifier - field and/or museum number), taxonomic identification, collection date, collector’s name, geographic locality, and any relevant comments, in BOLD-compliant data submission format, as defined on the BOLD website.
Full specimen data should be submitted to BOLD prior to the beginning of the analyses. Collection data on the source specimens for DNA barcodes is a critical part of the Barcode of Life Data System. However, it does not replace the original specimen documentation. Each BOLD entry has a reference to the museum/institution housing the source specimen or tissue, and researchers using published BOLD data in their studies are expected to acknowledge these institutions in their resulting publications. If repositories have online catalogues of their collections, they are encouraged to provide links to data records for the source specimens of tissue samples, so that respective entries in their online catalogues can be cross-referenced directly from BOLD. External collaborators are encouraged to contact their corresponding project coordinators for details on data submission procedures and for blank data submission forms.
A digital photograph (or a set of photographs) of the source specimen is a valuable asset and is a requirement unless there are special circumstances, which must be discussed with the campaign coordinators. Specimen contributors are encouraged to submit specimen images to BOLD directly. Each specimen entry may be associated with multiple photographs taken from various aspects of the specimen.
Using BOLD to Generate your own DNA Barcode Data
The polar barcoding campaign welcomes contributions from researchers and institutions willing to conduct all or part of the molecular analyses using their own facilities. Standard data exchange protocols exist that facilitate seamless information transfer and ensure high quality standards of submitted data. Please contact the campaign coordinator for more details.
Co-authorship, Acknowledgments, and Copyright Issues
Any images or collateral specimen information supplied by External Collaborators remain the intellectual property (and copyright, if applicable) of their original submitters and/or the institutions they represent. It is understood that, once the data submission has been made to BOLD, specimen data and images become partially available to the public online through the BOLD Taxonomy Browser. This information is used to generate summary statistics and illustrative distribution maps. However, there is no disclosure of the contents of individual research projects. Upon the publication of each BOLD project, all specimen data and images become publicly available through the BOLD online interface.
Specimen provenance data and the results of analyses conducted within the framework of each project remain accessible only to contributor(s), donor(s), and BIO staff members directly affiliated with this particular project and are not to be communicated to third parties until project completion, without prior consent from principal project participants. Upon project completion, usually following a co-authored publication, DNA sequences become available to the public on BOLD, subject to approval by all project participants.
It should be understood by all parties that sequence data contained in BOLD projects with restricted access may be used by the BOLD identification engine to provide DNA-based taxonomic identifications to public users submitting DNA barcode sequences. The reports generated by the BOLD identification engine include probability scores and tree-based identification with branch labels containing taxonomic names and broad geographic localization (to province level). However, individual specimen identifiers and sequence data are not disclosed by the BOLD identification engine.
It is possible for both External Collaborators and their Project Coordinators to use unpublished project data in other simultaneously prepared and submitted publications (e.g., specialized taxonomic revisions or new species descriptions). However, this has to be negotiated on a case-by-case basis with all relevant project participants. It is desirable that this intent be stated clearly at the initial stages of prospective projects.
If the data contained in a closed project remain unchanged for a period of over one year and no manuscript is submitted for publication by participating external collaborator(s) for a period of over two years since the submission of DNA barcodes, the project is designated as “orphaned”. If funding for the analyses was provided through CCDB/BIO, and no response is received from the External Collaborator on the intended publication schedule, BIO/CCDB retains the right to make the online contents of an “orphaned” project publicly available through the BOLD web site, subject to approval by the CCDB/BIO and iBOL administration.
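The two time conditions in the "orphaned" rule above can be expressed as a short date check. The Python sketch below is only one reading of the rule, under stated assumptions: the function and parameter names are hypothetical, "over one year" and "over two years" are interpreted as calendar years, and the additional conditions (funding source, lack of response from the External Collaborator, administrative approval) are deliberately left out because they require human judgment.

```python
from datetime import date


def add_years(d: date, years: int) -> date:
    """Shift a date by whole calendar years (29 Feb falls back to 28 Feb)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def is_orphaned(last_data_change: date,
                barcodes_submitted: date,
                manuscript_submitted: bool,
                today: date) -> bool:
    """Check only the two time conditions of the 'orphaned' designation."""
    unchanged_over_one_year = today > add_years(last_data_change, 1)
    no_manuscript_over_two_years = (
        not manuscript_submitted
        and today > add_years(barcodes_submitted, 2)
    )
    return unchanged_over_one_year and no_manuscript_over_two_years
```

For example, a project whose barcodes were submitted on 1 January 2009 and whose data last changed on 1 January 2010 would meet both time conditions on 1 June 2011 if no manuscript had been submitted by then.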
A new data release policy for sequence data took effect on July 1st, 2009, in accordance with standard genomics project procedures. One week after sequence generation, the sequence, associated trace files, country of origin of the specimen, and taxonomic identification to the ordinal level are released to GenBank to permit data quality control and public accountability regarding productivity. Full release of data will follow publication, as is the case under current practice. Further information can be found in the Lab Procedures section.