Info • Branchiostoma floridae
Project Status

The Branchiostoma floridae genome is estimated at approximately 575 Mb, contained in 19 pairs of chromosomes, and is being sequenced to approximately 8.1X depth.

The genome assembly release v1.0 was annotated using the JGI annotation pipeline. Gene models and associated transcripts/proteins are predicted or mapped using a variety of tools based on cDNA, protein homology and ab initio methods. The current release contains approximately 50,817 gene models, built from known Branchiostoma floridae genes and supported by available Branchiostoma floridae EST and cDNA data.

Approximately 95% of Branchiostoma floridae full-length cDNAs mapped to the v1.0 assembly. The average gene length is 9.1 kb and the average transcript length is 1.4 kb, with the average protein containing 451 amino acids. There are approximately 7 exons per gene, averaging 204 bp each, with intron spacing of 1.3 kb. Gene functions have been automatically assigned based on homology to known genes. Manual curation of these annotations will start shortly.
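
As a rough plausibility check, the per-gene averages above can be combined arithmetically. The sketch below assumes that "intron spacing" means mean intron length and that averages can simply be multiplied together, which is only approximately true. The function and variable names are illustrative and not part of the annotation pipeline.

```python
# Rough consistency check of the reported per-gene averages.
# Assumption: "intron spacing" is read as the mean intron length.
EXONS_PER_GENE = 7
MEAN_EXON_BP = 204
MEAN_INTRON_BP = 1300
MEAN_PROTEIN_AA = 451


def implied_gene_structure(exons=EXONS_PER_GENE, exon_bp=MEAN_EXON_BP,
                           intron_bp=MEAN_INTRON_BP, protein_aa=MEAN_PROTEIN_AA):
    """Derive transcript, gene and coding lengths implied by the averages."""
    transcript_bp = exons * exon_bp                    # exons only
    gene_bp = transcript_bp + (exons - 1) * intron_bp  # n exons have n - 1 introns
    coding_bp = 3 * (protein_aa + 1)                   # codons plus one stop codon
    return {"transcript_bp": transcript_bp,
            "gene_bp": gene_bp,
            "coding_bp": coding_bp}


summary = implied_gene_structure()
print(summary)
```

Under these assumptions the implied transcript length is 1,428 bp (reported: 1.4 kb), the implied gene length is 9,228 bp (reported: 9.1 kb), and the implied coding length of 1,356 bp fits within the average transcript.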

Assembly Release

v2.0 (May 2008): We have created a non-redundant representation (v2.0) of the genome sequence, which is a mosaic of the two haplotypes found in assembly v1.0. Both assemblies can be downloaded from the Branchiostoma portal download page. The 1000 longest scaffolds of assembly version 1.5 were aligned to one another using MegaBLAST (Zhang, Schwartz et al. 2000) and manually curated into 398 connected sets of allelic scaffolds. In this process, 132 potential mis-joins were identified in version 1.5 scaffolds and broken. Each of the 398 sets of allelic scaffolds was merged into a non-redundant representative sequence, a mosaic of the two haplotypes created by concatenating segments of the scaffolds in the set. The mosaic was constructed to switch between haplotypes only between gene models and to minimize the number of transitions between haplotypes. Among the possible tilings with the minimum number of transitions, we selected the one that minimizes the total length of sequence gaps in the merged sequence. This method is similar in spirit to that applied to the Ciona savignyi genome (Small, Brudno et al. 2007). Assembly v2.0 spans 522 Mb, with scaffold N/L50 = 62 / 2.6 Mb and contig N/L50 = 4916 / 28 kb. The net assembly length is slightly longer than the estimated haploid genome size, which could be accounted for by contributions from internal assembly gaps, residual allelic redundancy and haplotype-unique sequences.
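
The tiling criterion described above (fewest haplotype transitions first, then least total gap length) can be illustrated with a small dynamic program. This is a minimal sketch under simplifying assumptions, not the procedure actually used for v2.0. It assumes each allelic set has already been cut into segments at gene-model boundaries and that each segment carries a gap length for each of the two haplotypes. The function name `choose_tiling` and the input layout are hypothetical.

```python
import math


def choose_tiling(segment_gaps):
    """Pick one haplotype per segment.

    segment_gaps: list of (gap_bp_hap0, gap_bp_hap1) tuples, one per segment
    between gene-model boundaries (hypothetical layout). None means that
    haplotype has no sequence for the segment.
    Returns (path, (switches, total_gap_bp)).
    """
    if not segment_gaps:
        return [], (0, 0)
    unreachable = (math.inf, math.inf)
    best = []  # best[i][h]: lowest (switches, gap_bp) for a tiling ending on h
    back = []  # back[i][h]: haplotype used for segment i - 1 on that tiling
    for i, gaps in enumerate(segment_gaps):
        row, ptr = [], []
        for h in (0, 1):
            if gaps[h] is None:
                row.append(unreachable)
                ptr.append(None)
            elif i == 0:
                row.append((0, gaps[h]))
                ptr.append(None)
            else:
                # Tuples compare lexicographically: switches, then gap length.
                cost, prev = min(
                    ((best[i - 1][p][0] + (p != h),
                      best[i - 1][p][1] + gaps[h]), p)
                    for p in (0, 1)
                )
                row.append(cost)
                ptr.append(prev)
        best.append(row)
        back.append(ptr)
    last = min((0, 1), key=lambda h: best[-1][h])
    if best[-1][last][0] == math.inf:
        raise ValueError("some segment has no sequence on either haplotype")
    # Walk the back-pointers from the last segment to the first.
    path = [last]
    for i in range(len(segment_gaps) - 1, 0, -1):
        path.append(back[i][path[-1]])
    path.reverse()
    return path, best[-1][last]


path, cost = choose_tiling([(0, None), (None, 2), (1, 4)])
print(path, cost)
```

In the usage example, the first segment exists only on haplotype 0 and the second only on haplotype 1, so one transition is unavoidable. The sketch then stays on haplotype 1 for the third segment, accepting the larger gap rather than switching a second time, because the number of transitions takes priority over gap length.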

Putnam, N. H., T. Butts, et al. (2008). "The amphioxus genome and the evolution of the chordate karyotype." Nature 453: 1064-1071 (doi:10.1038/nature06967).

Small, K. S., M. Brudno, et al. (2007). "A haplome alignment and reference sequence of the highly polymorphic Ciona savignyi genome." Genome Biol 8(3): R41.

Zhang, Z., S. Schwartz, et al. (2000). "A greedy algorithm for aligning DNA sequences." J Comput Biol 7(1-2): 203-14.

v1.0 (December 5, 2006): Approximately 6.5 million shotgun reads were initially assembled using JAZZ. A high allelic polymorphism rate of 5-10% allowed the two haplotypes to be assembled separately at approximately 75% of genomic loci. There are a total of 3,032 scaffolds, with a total length of 923 Mb, composed of 81,073 contigs. Half of the assembly is contained in 174 scaffolds, all at least 1.6 Mb in length. The length-weighted mean contig size (L50) is 26 kb.
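
The "half of the assembly" statistics quoted for each release can be computed from a list of scaffold or contig lengths. The sketch below follows the convention used on this page, where N50 is the number of sequences that together hold half of the assembly and L50 is the length of the shortest of them (other sources swap the two names). The function name `n_l50` and the example lengths are illustrative only.

```python
def n_l50(lengths):
    """Return (N50, L50) in the convention used on this page.

    N50: smallest number of sequences, taken longest first, whose combined
         length reaches half of the total.
    L50: length of the shortest sequence in that set.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return count, length
    raise ValueError("lengths must be non-empty")


# Toy example: total length 30, so half is 15; the two longest (10 + 8) reach it.
count, length = n_l50([2, 10, 4, 8, 6])
print(count, length)
```

Applied to the v1.0 scaffold lengths, such a calculation would correspond to the reported 174 scaffolds of at least 1.6 Mb.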

Nicholas H. Putnam, Thomas Butts, David E. K. Ferrier, Rebecca F. Furlong, Uffe Hellsten, Takeshi Kawashima, Marc Robinson-Rechavi, Eiichi Shoguchi, Astrid Terry, Jr-Kai Yu, Elia Benito-Gutiérrez, Inna Dubchak, Jordi Garcia-Fernàndez, Jeremy J. Gibson-Brown, Igor V. Grigoriev, Amy C. Horton, Pieter J. de Jong, Jerzy Jurka, Vladimir V. Kapitonov, Yuji Kohara, Yoko Kuroki, Erika Lindquist, Susan Lucas, Kazutoyo Osoegawa, Len A. Pennacchio, Asaf A. Salamov, Yutaka Satou, Tatjana Sauka-Spengler, Jeremy Schmutz, Tadasu Shin-I, Atsushi Toyoda, Marianne Bronner-Fraser, Asao Fujiyama, Linda Z. Holland, Peter W. H. Holland, Nori Satoh & Daniel S. Rokhsar. The amphioxus genome and the evolution of the chordate karyotype. Nature. 2008 June 19;453:1064-1071.


This work was performed under the auspices of the US Department of Energy's Office of Science, Biological and Environmental Research Program, and by the University of California, Lawrence Livermore National Laboratory under Contract No. W-7405-Eng-48, Lawrence Berkeley National Laboratory under Contract No. DE-AC03-76SF00098, and Los Alamos National Laboratory under Contract No. W-7405-ENG-36.