Assembly v3 (14 Dec 2009) is a 'hybrid' assembly of 454 and Sanger gDNA reads, improved by the JGI Finishing Pipeline. In total, 963 scaffolds spanning 36.3 Mbp were assembled:
Main genome scaffold total | 963 |
Main genome contig total | 1346 |
Main genome scaffold sequence total | 36.3 Mbp |
Main genome contig sequence total | 34.2 Mbp |
Estimated % sequence bases in gaps | 5.6% |
Main genome scaffold N50 / L50 | 8 / 1.8 Mbp |
Main genome contig N50 / L50 | 80 / 131.2 kbp |
Number of scaffolds >50 kbp | 28 |
% main genome in scaffolds >50 kbp | 93.0% |
Percent GC | 51% |
Coverage | 17.78X |
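The N50 / L50 rows above pair a count with a length: 8 scaffolds and 1.8 Mbp, or 80 contigs and 131.2 kbp. Read that way, N50 is the number of the longest sequences needed to cover half of the total assembled length, and L50 is the length of the shortest sequence in that set. Many other tools use the two names the other way round. The following is a minimal sketch of that calculation, not the JGI pipeline's code; the function name `n50_l50` and the toy lengths are hypothetical.

```python
def n50_l50(lengths):
    """Return (count, length) at the point where the longest sequences
    first cover half of the total length.

    count  -> reported as N50 in the table above
    length -> reported as L50 in the table above
    """
    ordered = sorted(lengths, reverse=True)  # longest first
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            return count, length
    raise ValueError("lengths must be non-empty")


# Hypothetical toy lengths, not data from this assembly.
toy_lengths = [50, 40, 30, 20, 10]
print(n50_l50(toy_lengths))  # (2, 40): 50 + 40 = 90 covers half of 150
```

Applied to the real scaffold lengths, the same procedure would be expected to give the 8 / 1.8 Mbp pair reported in the table.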
Annotation v3 (14 Dec 2009) is a consensus gene set predicted by the JGI Annotation Pipeline, using a variety of cDNA-based, protein-based, and ab initio gene modelers. After filtering for homology and expression support, a total of 11624 genes were structurally and functionally annotated:
# of genes | 11624 |
Gene density | 320.3 genes / Mbp scaffold |
Ave. gene length | 2241.3 nt |
Ave. protein length | 460.2 aa |
Ave. exon frequency | 3.5 exons / gene |
% genes with introns | 83% |
% models with start+stop codons | 94% |
% genes with NR hits | 95% |
% genes with Pfam domains | 64% |
% genes with TM domains | 21% |
% genes with ESTs | 91% |
% genes in multigene families | 63% |
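The gene density row can be checked against the other reported totals: it is the gene count divided by the total scaffold length in Mbp. The sketch below is illustrative only. The helper name `gene_density` is hypothetical, and the inputs are the rounded values from the tables, so the result can differ from the reported 320.3 in the last digit.

```python
def gene_density(n_genes, scaffold_mbp):
    """Genes per Mbp of scaffold sequence."""
    return n_genes / scaffold_mbp


# Rounded values taken from the tables above.
density = gene_density(11624, 36.3)
print(round(density, 1))  # 320.2, close to the reported 320.3
```

The small difference is presumably due to the scaffold total being rounded to 36.3 Mbp in the table.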
This work was performed under the auspices of the US Department of Energy's Office of Science, Biological and Environmental Research Program, and by the University of California, Lawrence Berkeley National Laboratory under contract No. DE-AC02-05CH11231, Lawrence Livermore National Laboratory under contract No. DE-AC52-07NA27344, and Los Alamos National Laboratory under contract No. DE-AC02-06NA25396.