Assembly v1 (Mar 2010) is a 'hybrid' assembly of 454 and Sanger gDNA reads. The reads were assembled with Newbler v2 into 660 scaffolds totaling 35.4 Mbp:
Nuclear Genome Assembly: | v1.0 |
Scaffold count: | 660 |
Contig count: | 851 |
Scaffold sequence bases total: | 35.4 Mbp |
Scaffolded (Large) Contig sequence bases total: | 35.2 Mbp |
Estimated % sequence bases in gaps: | 0.7% |
Scaffold N50/L50: | 8/1.6 Mbp |
Contig N50/L50: | 45/256.1 Kbp |
Number of scaffolds > 50.0 Kb: | 27 |
% in scaffolds > 50.0 Kb: | 96% |
Percent GC: | 50% |
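The N50/L50 rows above appear to list a sequence count first and a length second (for example, 8 scaffolds and 1.6 Mbp). As a minimal sketch of how such a pair can be derived from a list of sequence lengths, assuming that reading of the table: sort the lengths from longest to shortest and accumulate them until half of the total is covered. The function name and the toy lengths below are hypothetical and are not taken from this assembly.

```python
def count_and_length_at_half(lengths):
    """Return (count, length) at the point where the longest sequences
    together cover at least half of the total sequence length.

    count  = number of sequences needed to reach half the total
    length = length of the last (shortest) sequence in that set
    """
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return count, length
    raise ValueError("lengths must be non-empty")


# Hypothetical toy scaffold lengths (in bp), for illustration only.
toy_lengths = [100, 80, 60, 40, 20]
print(count_and_length_at_half(toy_lengths))  # (2, 80)
```

With the toy input, the total is 300, so the two longest sequences (100 + 80 = 180) are the first to cover half of it. Note that other sources attach the names N50 and L50 to these two values the other way around, so it is worth checking which convention a given report follows.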
Annotation v1.1 (12 Jan 2011) of the v1.0 assembly was produced by the JGI Annotation Pipeline, using a variety of cDNA-based, protein-based, and ab initio gene predictors. After filtering for protein similarity and cDNA support, a total of 10845 genes were structurally and functionally annotated.
Nuclear Genome Annotation: | v1.1 |
# gene models: | 10845 |
Gene density: | 306 genes/Mbp scaffold |
Ave. gene length: | 1863 nt |
Ave. protein length: | 490 aa |
Ave. exon frequency: | 3.2 exons/gene |
% genes with intron: | 82% |
% complete gene models (with start and stop codons): | 96% |
% genes with NR protein support: | 97% |
% genes with Pfam domains: | 66% |
% genes with signal peptide: | 22% |
% genes with multigene family: | 62% |
% genes with EST support: | 86% |
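The gene density figure can be checked against the other values reported on this page. The short sketch below assumes that density is simply the number of gene models divided by the total scaffold sequence in Mbp; the function name is hypothetical, and the rounding to a whole number is an assumption about how the table value was produced.

```python
def gene_density(gene_models, scaffold_mbp):
    """Genes per Mbp of scaffold sequence (assumed definition)."""
    return gene_models / scaffold_mbp


# Values taken from the tables above.
density = gene_density(gene_models=10845, scaffold_mbp=35.4)
print(round(density))  # 306
```

Under that assumption, the result agrees with the 306 genes/Mbp listed in the annotation table.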
This work was performed under the auspices of the US Department of Energy's Office of Science, Biological and Environmental Research Program, and by the University of California, Lawrence Berkeley National Laboratory under Contract No. DE-AC02-05CH11231, Lawrence Livermore National Laboratory under Contract No. DE-AC52-07NA27344, and Los Alamos National Laboratory under Contract No. DE-AC02-06NA25396.