
The ever-increasing number of sequenced genomes presents us with an exciting opportunity to discover highly conserved gene families of unknown function and then characterize them experimentally. We have detected these gene families across the kingdom Fungi (see Methods at the bottom of this page) and invite the international research community to functionally characterize their individual members and propagate their annotations across the Fungal Tree of Life. Investigators can register, log in, click on any cluster below, and add notes to any protein in the list, along with the methods used for functional characterization.

Conserved Gene Families of Unknown Function
Total Families: 142
Total Genes: 79,713
Total Unique Species: 1,282
Total Annotated Genes: 33
Total Unique PFAM Domains: 170
Total Unique Uniprot: 110
Total Unique PDB: 64
Updated: 2020-11-06
| # | # Genes | Unique Species | Proteins with PFAM Domains | Unique PFAM Domains Count | Protein PFAM Domains | Uniprot HMM Hint | PDB HMM Hint | Conserved In | Also Found In | User Curated Models | Avg. Protein Length |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1,298 | 343 | 774 | 1 | Uncharacterized alpha/beta hydrolase domain (DUF2235): 774 | | | Agaricomycotina | Bacteria | 0 | 504 |
| 2 | 1,240 | 1,101 | 1,238 | 1 | Protein of unknown function (DUF962): 1,238 | | | Fungi | Eukaryota | 2 | 197 |
| 3 | 1,222 | 1,146 | 703 | 2 | Protein of unknown function (DUF2408): 702; MIF4G domain: 1 | | | Fungi | Viridiplantae | 0 | 461 |
| 4 | 1,219 | 1,185 | 1,219 | 1 | Uncharacterised protein family (UPF0160): 1,219 | | | Fungi | Eukaryota | 0 | 352 |
| 5 | 1,154 | 1,138 | 509 | 7 | HEAT repeat: 494; HEAT-like repeat: 9; Adaptin N terminal region: 4; HEAT repeats: 1; Protein of unknown function DUF89: 1; Alpha-L-rhamnosidase N-terminal domain: 1; DNA/RNA non-specific endonuclease: 1 | | | Fungi | Viridiplantae | 0 | 2,039 |
| 6 | 1,127 | 1,110 | 1,127 | 3 | Domain of unknown function (DUF383): 1,126; Domain of unknown function (DUF384): 1,122; SecE/Sec61-gamma subunits of protein translocation complex: 1 | | | Fungi | Opisthokonta + Viridiplantae | 0 | 390 |
| 7 | 1,120 | 1,093 | 1,119 | 1 | Protein of unknown function (DUF775): 1,119 | | | Fungi | Opisthokonta + Viridiplantae | 0 | 214 |
| 8 | 1,116 | 1,051 | 1,116 | 1 | Uncharacterised protein family UPF0052: 1,116 | | | Fungi | Eukaryota | 0 | 496 |
| 9 | 1,072 | 1,045 | 1,053 | 1 | Eukaryotic protein of unknown function (DUF866): 1,053 | | | Fungi | Opisthokonta | 1 | 162 |
| 10 | 1,080 | 961 | 560 | 1 | Protein of unknown function (DUF4449): 560 | | | Fungi | Viridiplantae | 0 | 866 |
| 11 | 1,083 | 1,071 | 209 | 6 | HEAT repeat: 202; Cse1: 4; Ankyrin repeats (3 copies): 1; Ssl1-like: 1; Ankyrin repeat: 1; RNase P subunit Pop3: 1 | | | Fungi | Viridiplantae | 1 | 1,118 |
| 12 | 1,033 | 1,015 | 981 | 9 | TPR repeat: 481; Tetratricopeptide repeat: 454; Tetratricopeptide repeat: 336; Tetratricopeptide repeat: 37; Tetratricopeptide repeat: 10; Tetratricopeptide repeat: 7; NUDIX domain: 4; Tetratricopeptide repeat: 4; Tetratricopeptide repeat: 2 | | | Fungi | Eukaryota | 0 | 936 |
| 13 | 1,023 | 714 | 5 | 1 | Lipocalin / cytosolic fatty-acid binding protein family: 5 | | | Fungi | | 0 | 194 |
| 14 | 1,024 | 820 | 565 | 1 | Domain of unknown function (DUF4149): 565 | | | Fungi | Viridiplantae | 1 | 182 |
| 15 | 1,014 | 758 | 1,013 | 1 | Protein of unknown function (DUF1479): 1,013 | | | Fungi | Universal | 0 | 477 |
| 16 | 936 | 928 | 2 | 2 | L27 domain: 1; Ribosomal protein L7Ae/L30e/S12e/Gadd45 family: 1 | | | Fungi | Viridiplantae | 0 | 510 |
| 17 | 927 | 673 | 567 | 5 | Protein of unknown function (DUF3292): 566; Integral peroxisomal membrane peroxin: 1; Plant phosphoribosyltransferase C-terminal: 1; HIUase/Transthyretin family: 1; Profilin: 1 | | | Fungi | | 0 | 674 |
| 18 | 857 | 833 | 854 | 2 | NFACT protein RNA binding domain: 854; YacP-like NYN domain: 1 | | | Fungi | Eukaryota | 0 | 215 |
| 19 | 857 | 809 | 857 | 1 | Protein of unknown function (DUF1348): 857 | | | Fungi | Universal | 0 | 157 |
| 20 | 789 | 777 | 782 | 2 | Eukaryotic integral membrane protein (DUF1751): 782; Rhomboid family: 1 | | | Fungi | Viridiplantae | 0 | 370 |
| 21 | 794 | 633 | 485 | 1 | Questin oxidase-like: 485 | | | Pezizomycotina | Viridiplantae | 0 | 419 |
| 22 | 762 | 738 | 446 | 2 | Uncharacterized protein conserved in bacteria (DUF2264): 445; Transcription factor WhiB: 1 | | | Fungi | Prokaryotes | 0 | 649 |
| 23 | 746 | 736 | 709 | 6 | UBA/TS-N domain: 508; TPR repeat: 364; Tetratricopeptide repeat: 52; Tetratricopeptide repeat: 22; DnaJ domain: 4; Chaperonin 10 Kd subunit: 1 | | | Fungi | Opisthokonta + Viridiplantae | 1 | 913 |
| 24 | 751 | 733 | 0 | 0 | | | | Fungi | | 0 | 119 |
| 25 | 748 | 740 | 748 | 3 | Protein adenylyltransferase SelO: 748; Ankyrin repeats (3 copies): 1; Leucine Rich Repeat: 1 | | | Fungi | Universal | 0 | 633 |
| 26 | 745 | 701 | 726 | 1 | Protein of unknown function (DUF1769): 726 | | | Fungi | Viridiplantae | 0 | 323 |
| 27 | 725 | 715 | 716 | 1 | Domain of unknown function (DUF1741): 716 | | | Fungi | Opisthokonta + Viridiplantae | 0 | 644 |
| 28 | 720 | 713 | 14 | 1 | Fragile site-associated protein C-terminus: 14 | | | Fungi | Viridiplantae + Bacteria | 1 | 3,197 |
| 29 | 712 | 707 | 637 | 3 | Putative death-receptor fusion protein (DUF2428): 384; HEAT repeat: 255; HEAT-like repeat: 4 | | | Fungi | | 0 | 1,599 |
| 30 | 707 | 700 | 700 | 1 | Protein of unknown function (DUF1682): 700 | | | Fungi | Viridiplantae + Bacteria | 1 | 436 |
| 31 | 699 | 630 | 412 | 3 | Fungal protein of unknown function (DUF1752): 412; SGF29 tudor-like domain: 5; Protein of unknown function (DUF3295): 1 | | | Pezizomycotina | Viridiplantae | 0 | 559 |
| 32 | 703 | 701 | 270 | 6 | Ankyrin repeat: 162; Ankyrin repeats (many copies): 106; Ankyrin repeats (3 copies): 1; Zinc finger, C2H2 type: 1; Ubiquitin interaction motif: 1; Ankyrin repeat: 1 | | | Fungi | Viridiplantae | 1 | 646 |
| 33 | 683 | 674 | 387 | 2 | DENN domain-containing protein 11: 387; Domain of unknown function (DUF4484): 387 | | | Fungi | Viridiplantae | 0 | 606 |
| 34 | 677 | 669 | 255 | 2 | Protein of unknown function (DUF3712): 254; BTB/POZ domain: 1 | | | Fungi | Viridiplantae | 0 | 855 |
| 35 | 661 | 658 | 622 | 3 | WD domain, G-beta repeat: 622; Nup133 N terminal like: 1; Nucleoporin Nup120/160: 1 | | | Fungi | Viridiplantae | 0 | 510 |
| 36 | 634 | 600 | 33 | 2 | NAD dependent epimerase/dehydratase family: 25; NAD(P)H-binding: 8 | | | Pucciniomycotina | | 0 | 300 |
| 37 | 659 | 653 | 659 | 1 | Protein of unknown function (DUF1295): 659 | | | Fungi | Universal | 0 | 363 |
| 38 | 658 | 654 | 51 | 1 | Protein of unknown function (DUF4449): 51 | | | Fungi | Viridiplantae | 0 | 743 |
| 39 | 659 | 645 | 5 | 1 | Transmembrane alpha-helix domain: 5 | | | Fungi | Opisthokonta | 0 | 713 |
| 40 | 652 | 646 | 5 | 3 | Putative peptidoglycan binding domain: 2; Tim10/DDP family zinc finger: 2; Myosin tail: 1 | | | Fungi | Viridiplantae | 1 | 876 |
| 41 | 655 | 634 | 17 | 6 | Galactose oxidase, central domain: 6; Kelch motif: 6; Glycophorin A: 2; Herpesvirus glycoprotein D/GG/GX domain: 1; Kelch motif: 1; Galactose oxidase, central domain: 1 | | | Pezizomycotina | Viridiplantae | 0 | 751 |
| 42 | 649 | 643 | 0 | 0 | | | | Fungi | Viridiplantae | 1 | 966 |
| 43 | 651 | 637 | 0 | 0 | | | | Pezizomycotina | | 0 | 351 |
| 44 | 649 | 645 | 1 | 1 | SOS response associated peptidase (SRAP): 1 | | | Fungi | | 0 | 390 |
| 45 | 651 | 648 | 0 | 0 | | | | Fungi | Viridiplantae | 0 | 563 |
| 46 | 648 | 638 | 2 | 1 | S25 ribosomal protein: 2 | | | Pezizomycotina | Viridiplantae | 0 | 768 |
| 47 | 629 | 467 | 2 | 1 | Fungal protein of unknown function (DUF1774): 2 | | | Pucciniomycotina, Ustilaginomycotina, Mucoromycota, Agaricomycotina | | 0 | 273 |
| 48 | 646 | 644 | 0 | 0 | | | | Fungi | | 1 | 553 |
| 49 | 641 | 637 | 485 | 10 | Tetratricopeptide repeat: 247; Tetratricopeptide repeat: 111; TPR repeat: 61; Tetratricopeptide repeat: 51; Tetratricopeptide repeat: 10; Tetratricopeptide repeat: 9; Tetratricopeptide repeat: 5; Tetratricopeptide repeat: 3; Tetratricopeptide repeat: 2; Tetratricopeptide repeat: 2 | | | Pezizomycotina | Viridiplantae | 0 | 322 |
| 50 | 645 | 640 | 0 | 0 | | | | Pezizomycotina | | 0 | 274 |
| 51 | 645 | 641 | 0 | 0 | | | | Pezizomycotina | | 0 | 302 |
| 52 | 645 | 640 | 0 | 0 | | | | Pezizomycotina | Viridiplantae | 0 | 328 |
| 53 | 634 | 630 | 5 | 1 | ATPase family associated with various cellular activities (AAA): 5 | | | Pezizomycotina | Viridiplantae | 0 | 368 |
| 54 | 641 | 633 | 3 | 2 | Metallo-beta-lactamase superfamily: 2; Mitochondrial carrier protein: 1 | | | Pezizomycotina | Viridiplantae + Bacteria | 0 | 1,694 |
| 55 | 638 | 631 | 346 | 1 | WD domain, G-beta repeat: 346 | | | Pezizomycotina | Viridiplantae | 0 | 423 |
| 56 | 637 | 630 | 0 | 0 | | | | Pezizomycotina | Viridiplantae | 3 | 453 |
| 57 | 638 | 631 | 350 | 3 | Armadillo/beta-catenin-like repeat: 324; HEAT repeat: 56; HEAT repeats: 4 | | | Pezizomycotina | Opisthokonta + Viridiplantae | 0 | 1,020 |
| 58 | 634 | 629 | 0 | 0 | | | | Pezizomycotina | | 3 | 433 |
| 59 | 633 | 623 | 0 | 0 | | | | Pezizomycotina | Viridiplantae | 0 | 429 |
| 60 | 635 | 627 | 1 | 1 | Ubiquitin fusion degradation protein UFD1: 1 | | | Pezizomycotina | Viridiplantae + Cryptophyta | 1 | 841 |
| 61 | 636 | 619 | 0 | 0 | | | | Pezizomycotina | | 0 | 946 |
| 62 | 638 | 635 | 360 | 1 | Domain of unknown function (DUF1992): 360 | | | Pezizomycotina | Viridiplantae | 0 | 522 |
| 63 | 629 | 624 | 349 | 1 | PF08217: 349 | | | Pezizomycotina | Viridiplantae | 0 | 828 |
| 64 | 631 | 627 | 0 | 0 | | | | Pezizomycotina | | 0 | 455 |
| 65 | 629 | 626 | 2 | 1 | 30S ribosomal protein subunit S22 family: 2 | | | Pezizomycotina | Viridiplantae | 2 | 448 |
| 66 | 625 | 622 | 10 | 1 | SnoaL-like polyketide cyclase: 10 | | | Pezizomycotina | Viridiplantae | 0 | 551 |
| 67 | 624 | 621 | 357 | 1 | Domain of unknown function (DUF4078): 357 | | | Pezizomycotina | Viridiplantae | 0 | 358 |
| 68 | 624 | 612 | 1 | 1 | Protein of unknown function (DUF4030): 1 | | | Pezizomycotina | | 0 | 138 |
| 69 | 622 | 618 | 30 | 1 | Domain of unknown function (DUF4588): 30 | | | Pezizomycotina | | 0 | 264 |
| 70 | 620 | 617 | 0 | 0 | | | | Pezizomycotina | Viridiplantae | 0 | 643 |
| 71 | 619 | 616 | 0 | 0 | | | | Pezizomycotina | | 0 | 369 |
| 72 | 620 | 617 | 0 | 0 | | | | Pezizomycotina | Bacteria | 0 | 357 |
| 73 | 621 | 615 | 0 | 0 | | | | Pezizomycotina | | 0 | 441 |
| 74 | 621 | 614 | 1 | 1 | PQ loop repeat: 1 | | | Pezizomycotina | | 0 | 1,106 |
| 75 | 616 | 606 | 2 | 2 | Plethodontid receptivity factor PRF: 1; emp24/gp25L/p24 family/GOLD: 1 | | | Pezizomycotina | Viridiplantae | 0 | 1,298 |
| 76 | 618 | 615 | 0 | 0 | | | | Pezizomycotina | | 0 | 559 |
| 77 | 619 | 613 | 0 | 0 | | | | Pezizomycotina | Viridiplantae + Bacteria | 0 | 499 |
| 78 | 619 | 610 | 361 | 1 | Domain of unknown function (DUF4452): 361 | | | Pezizomycotina | Viridiplantae | 0 | 195 |
| 79 | 621 | 615 | 361 | 1 | Domain of unknown function (DUF4604): 361 | | | Pezizomycotina | | 0 | 171 |
| 80 | 616 | 610 | 6 | 2 | Bombesin-like peptide: 5; Serine incorporator (Serinc): 1 | | | Pezizomycotina | | 0 | 303 |
| 81 | 615 | 608 | 202 | 3 | GATA zinc finger: 195; AT hook motif: 11; Fungal Zn(2)-Cys(6) binuclear cluster domain: 6 | | | Pezizomycotina | Viridiplantae + Bacteria | 0 | 1,233 |
| 82 | 614 | 609 | 351 | 1 | Protein of unknown function (DUF2418): 351 | | | Pezizomycotina | Viridiplantae | 0 | 461 |
| 83 | 612 | 610 | 1 | 1 | AIR carboxylase: 1 | | | Pezizomycotina | | 0 | 598 |
| 84 | 614 | 608 | 130 | 5 | Tetratricopeptide repeat: 125; Coatomer epsilon subunit: 2; Tetratricopeptide repeat: 1; Tetratricopeptide repeat: 1; Tetratricopeptide repeat: 1 | | | Pezizomycotina | Viridiplantae | 0 | 440 |
| 85 | 611 | 607 | 0 | 0 | | | | Pezizomycotina | | 0 | 549 |
| 86 | 608 | 605 | 0 | 0 | | | | Pezizomycotina | | 0 | 908 |
| 87 | 610 | 607 | 3 | 2 | SAP domain: 2; Rho termination factor, N-terminal domain: 1 | | | Pezizomycotina | | 0 | 353 |
| 88 | 545 | 342 | 1 | 1 | Liver-expressed antimicrobial peptide 2 precursor (LEAP-2): 1 | | | Agaricomycotina | | 0 | 335 |
| 89 | 529 | 476 | 529 | 1 | Uncharacterized protein family UPF0016: 529 | | | Pucciniomycotina, Agaricomycotina | Eukaryota | 1 | 276 |
| 90 | 490 | 471 | 488 | 1 | Integral membrane protein DUF92: 488 | | | Pucciniomycotina, Mucoromycota | Eukaryota | 0 | 309 |
| 91 | 485 | 314 | 1 | 1 | COPI associated protein: 1 | | | Agaricomycotina | | 0 | 277 |
| 92 | 469 | 454 | 469 | 1 | Uncharacterised protein family UPF0047: 469 | | | Mucoromycota | Universal | 0 | 141 |
| 93 | 439 | 427 | 77 | 4 | Ankyrin repeat: 59; Ankyrin repeats (many copies): 11; Ankyrin repeat: 6; Ankyrin repeats (many copies): 1 | | | Pucciniomycotina, Agaricomycotina | Opisthokonta | 1 | 659 |
| 94 | 447 | 416 | 447 | 1 | Protein of unknown function (DUF1295): 447 | | | Agaricomycotina | Universal | 0 | 359 |
| 95 | 433 | 423 | 431 | 1 | Protein of unknown function (DUF1682): 431 | | | Mucoromycota, Agaricomycotina | | 1 | 373 |
| 96 | 418 | 408 | 222 | 1 | Uncharacterized conserved protein (DUF2340): 222 | | | Saccharomycotina | Eukaryota | 1 | 142 |
| 97 | 414 | 380 | 0 | 0 | | | | Agaricomycotina | | 1 | 263 |
| 98 | 409 | 350 | 22 | 2 | PhoD-like phosphatase: 21; Cytochrome c oxidase subunit Vb: 1 | | | Ustilaginomycotina | Opisthokonta + Viridiplantae + Bacteria | 0 | 700 |
| 99 | 401 | 374 | 64 | 2 | Armadillo/beta-catenin-like repeat: 63; HEAT repeat: 1 | | | Mucoromycota | | 0 | 676 |
| 100 | 395 | 387 | 332 | 4 | WD domain, G-beta repeat: 330; WD40-like Beta Propeller Repeat: 4; Eukaryotic translation initiation factor eIF2A: 1; 60s Acidic ribosomal protein: 1 | | | Ustilaginomycotina, Agaricomycotina | Eukaryota | 0 | 503 |
| 101 | 383 | 373 | 382 | 2 | Eukaryotic integral membrane protein (DUF1751): 382; Der1-like family: 1 | | | Pucciniomycotina, Agaricomycotina | | 2 | 356 |
| 102 | 385 | 374 | 197 | 2 | Putative TOS1-like glycosyl hydrolase (DUF2401): 197; Glycine-rich protein domain (DUF2403): 197 | | | Taphrinomycotina, Saccharomycotina | Viridiplantae + Archaea | 1 | 467 |
| 103 | 381 | 329 | 43 | 2 | GDSL-like Lipase/Acylhydrolase family: 36; GDSL-like Lipase/Acylhydrolase: 7 | | | Agaricomycotina | | 1 | 567 |
| 104 | 372 | 357 | 0 | 0 | | | | Ustilaginomycotina, Agaricomycotina | | 1 | 318 |
| 105 | 370 | 366 | 92 | 1 | Armadillo/beta-catenin-like repeat: 92 | | | Agaricomycotina | | 0 | 696 |
| 106 | 375 | 365 | 375 | 2 | Protein adenylyltransferase SelO: 375; Rpp14/Pop5 family: 3 | | | Agaricomycotina | Universal | 0 | 725 |
| 107 | 355 | 352 | 2 | 1 | SPT2 chromatin protein: 2 | | | Agaricomycotina | | 0 | 3,248 |
| 108 | 352 | 348 | 1 | 1 | AT hook motif: 1 | | | Agaricomycotina | | 0 | 993 |
| 109 | 341 | 335 | 0 | 0 | | | | Agaricomycotina | | 0 | 785 |
| 110 | 337 | 317 | 1 | 1 | Kelch motif: 1 | | | Agaricomycotina | | 0 | 388 |
| 111 | 338 | 318 | 0 | 0 | | | | Agaricomycotina | | 0 | 636 |
| 112 | 342 | 328 | 88 | 2 | BTB/POZ domain: 48; MATH domain: 41 | | | Pucciniomycotina, Ustilaginomycotina | | 0 | 610 |
| 113 | 333 | 309 | 0 | 0 | | | | Agaricomycotina | | 0 | 472 |
| 114 | 331 | 315 | 1 | 1 | UreD urease accessory protein: 1 | | | Agaricomycotina | | 0 | 873 |
| 115 | 330 | 319 | 0 | 0 | | | | Agaricomycotina | | 1 | 441 |
| 116 | 326 | 312 | 0 | 0 | | | | Agaricomycotina | | 0 | 237 |
| 117 | 321 | 319 | 186 | 1 | Protein of unknown function (DUF2418): 186 | | | Agaricomycotina | | 0 | 390 |
| 118 | 318 | 308 | 139 | 2 | Vacuolar ATP synthase subunit S1 (ATP6S1): 138; PF08319: 1 | | | Agaricomycotina | | 0 | 283 |
| 119 | 320 | 315 | 0 | 0 | | | | Agaricomycotina | | 0 | 375 |
| 120 | 313 | 309 | 12 | 2 | Arrestin (or S-antigen), C-terminal domain: 9; Eukaryotic protein of unknown function (DUF1764): 3 | | | Agaricomycotina | | 0 | 663 |
| 121 | 299 | 271 | 170 | 2 | Protein of unknown function (DUF3712): 170; Ribosomal RNA adenine dimethylase: 1 | | | Ustilaginomycotina | Bacteria | 0 | 2,401 |
| 122 | 305 | 290 | 152 | 2 | Ykl077w/Psg1 (Pma1 Stabilization in Golgi): 151; Amino acid permease: 1 | | | Ustilaginomycotina | | 0 | 545 |
| 123 | 190 | 56 | 22 | 1 | MAPEG family: 22 | | | Mucoromycota | | 0 | 173 |
| 124 | 178 | 155 | 177 | 1 | Protein of unknown function (DUF1640): 177 | | | Saccharomycotina | Opisthokonta | 1 | 204 |
| 125 | 155 | 100 | 155 | 1 | CYRIA/CYRIB Rac1 binding domain: 155 | | | Mucoromycota | Eukaryota | 0 | 328 |
| 126 | 107 | 76 | 81 | 5 | Protein of unknown function (DUF3684): 78; Histidine kinase-, DNA gyrase B-, and HSP90-like ATPase: 12; CUE domain: 4; Histidine kinase-, DNA gyrase B-, and HSP90-like ATPase: 2; TLD: 1 | | | Mucoromycota | Eukaryota | 0 | 1,680 |
| 127 | 83 | 55 | 0 | 0 | | | | Mucoromycota | | 0 | 262 |
| 128 | 92 | 84 | 0 | 0 | | | | Ustilaginomycotina | | 0 | 589 |
| 129 | 82 | 79 | 82 | 1 | Uncharacterised protein family (UPF0203): 82 | | | Mucoromycota | Universal | 0 | 83 |
| 130 | 66 | 62 | 0 | 0 | | | | Mucoromycota | | 0 | 281 |
| 131 | 66 | 61 | 2 | 1 | MIOREX complex component 7: 2 | | | Mucoromycota | | 0 | 70 |
| 132 | 68 | 55 | 18 | 1 | Uncharacterised protein (DUF2406): 18 | | | Saccharomycotina | | 0 | 190 |
| 133 | 59 | 27 | 1 | 1 | Cytochrome c oxidase assembly protein PET191: 1 | | | Ustilaginomycotina | | 0 | 537 |
| 134 | 44 | 30 | 0 | 0 | | | | Ustilaginomycotina | | 0 | 380 |
| 135 | 37 | 33 | 9 | 1 | Armadillo/beta-catenin-like repeat: 9 | | | Ustilaginomycotina | | 0 | 943 |
| 136 | 36 | 36 | 1 | 1 | YLP motif: 1 | | | Ustilaginomycotina | | 0 | 557 |
| 137 | 34 | 34 | 6 | 3 | Ankyrin repeats (many copies): 3; Ankyrin repeat: 2; Glycolipid 2-alpha-mannosyltransferase: 1 | | | Ustilaginomycotina | Viridiplantae | 0 | 818 |
| 138 | 28 | 26 | 6 | 1 | Perilipin family: 6 | | | Ustilaginomycotina | | 0 | 298 |
| 139 | 27 | 26 | 16 | 2 | Domain of unknown function (DUF1708): 12; RhoGAP domain: 4 | | | Ustilaginomycotina | | 0 | 1,410 |
| 140 | 27 | 25 | 4 | 3 | Leucine Rich Repeat: 2; Leucine Rich Repeat: 1; Leucine Rich repeat: 1 | | | Ustilaginomycotina | | 0 | 905 |
| 141 | 26 | 26 | 10 | 1 | PF13345: 10 | | | Ustilaginomycotina | | 0 | 652 |
| 142 | 25 | 25 | 0 | 0 | | | | Ustilaginomycotina | | 0 | 346 |

Methods

Over 18 million proteins encoded in 1,282 fungal genomes from MycoCosm were clustered into families using cascaded MMseqs2 clustering with default parameters (Steinegger and Söding, 2017). Our subset of 142 clusters has the following three properties. Each is:

  • conserved across large phylogenetic distances, i.e. present in either (i) >50% of all fungal genomes or (ii) >90% of genomes in the clade named in the 'Conserved In' column.
  • of unknown function, i.e. has few or no (i) known Pfam domains, other than domains without a specific function such as those whose names start with DUF or UPF, and few or no (ii) functionally annotated BLASTP hits against Swiss-Prot.
  • encoded by expressed genes, i.e. >20% of its genes are supported by transcriptomics data.
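
The three criteria above can be read as a simple filter over per-family summary counts. The following is a minimal Python sketch of that reading, not the pipeline actually used: the `FamilySummary` class, its field names, and the helper functions are all hypothetical, and the "unknown function" test is a deliberately strict simplification (every shared Pfam description must mention DUF or UPF, and no annotated Swiss-Prot hit is allowed), whereas the criterion as stated tolerates a few informative domains.

```python
from dataclasses import dataclass, field
from typing import List

TOTAL_GENOMES = 1282  # fungal genomes clustered, as stated in Methods


@dataclass
class FamilySummary:
    """Hypothetical per-family summary; all field names are illustrative."""
    genomes_with_member: int        # genomes (of all 1,282) with at least one member
    clade_genomes_with_member: int  # same count, restricted to the named clade
    clade_genomes_total: int        # number of genomes in the named clade
    genes_total: int                # genes in the family
    genes_with_transcript_support: int
    has_annotated_swissprot_hit: bool
    pfam_descriptions: List[str] = field(default_factory=list)


def is_conserved(fam: FamilySummary) -> bool:
    """Criterion 1: >50% of all fungal genomes OR >90% of the clade's genomes."""
    across_fungi = fam.genomes_with_member / TOTAL_GENOMES > 0.50
    within_clade = (
        fam.clade_genomes_total > 0
        and fam.clade_genomes_with_member / fam.clade_genomes_total > 0.90
    )
    return across_fungi or within_clade


def is_unknown_function(fam: FamilySummary) -> bool:
    """Criterion 2, simplified (assumption): only DUF/UPF domains, no annotated hit."""
    only_uninformative = all(
        ("DUF" in desc) or ("UPF" in desc) for desc in fam.pfam_descriptions
    )
    return only_uninformative and not fam.has_annotated_swissprot_hit


def is_expressed(fam: FamilySummary) -> bool:
    """Criterion 3: >20% of the family's genes have transcriptomics support."""
    return (
        fam.genes_total > 0
        and fam.genes_with_transcript_support / fam.genes_total > 0.20
    )


def passes_selection(fam: FamilySummary) -> bool:
    return is_conserved(fam) and is_unknown_function(fam) and is_expressed(fam)


# Gene and species counts are taken from family 2 in the table; the clade
# totals and the transcript-support count are made-up placeholders.
example = FamilySummary(
    genomes_with_member=1101,
    clade_genomes_with_member=1101,
    clade_genomes_total=1282,
    genes_total=1240,
    genes_with_transcript_support=400,
    has_annotated_swissprot_hit=False,
    pfam_descriptions=["Protein of unknown function (DUF962)"],
)
print(passes_selection(example))
```

Keeping each criterion in its own function makes it easy to swap the strict domain test for a tolerance-based one if "few or no" is given a concrete threshold.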

An individual family member may have manual curations retrieved from MycoCosm or functional domains not shared with the rest of its family. Families as a whole may also show similarity to distant protein families in UniProt or the Protein Data Bank (PDB), as found by pairwise HMM-based HHblits searches (Steinegger et al., 2019) against the non-redundant Uniprot20_2016 (defined by <20% sequence identity) and PDB70 (defined by <70% sequence identity) sets of protein sequences. Such distantly related proteins are presented in the list as "hints" (the 'Uniprot HMM Hint' and 'PDB HMM Hint' columns).

References

  1. Steinegger M, Söding J. MMseqs2 enables sensitive protein sequence searching for the analysis of massive data sets. Nat Biotechnol. 2017 Nov;35(11):1026-1028. doi: 10.1038/nbt.3988. Epub 2017 Oct 16. PMID: 29035372.
  2. Steinegger M, Meier M, Mirdita M, Vöhringer H, Haunsberger SJ, Söding J. HH-suite3 for fast remote homology detection and deep protein annotation. BMC Bioinformatics. 2019 Sep 14;20(1):473. doi: 10.1186/s12859-019-3019-7. PMID: 31521110; PMCID: PMC6744700.