Introduction to dbCAN-PUL

dbCAN-PUL is a data repository of prokaryotic CAZyme-containing gene clusters, also known as polysaccharide utilization loci (PULs), that have been experimentally validated to act on a carbohydrate substrate. In contrast to similar resources such as PULDB, this repository contains the largest number of experimentally verified PULs with a confirmed carbohydrate substrate; its PULs come from a range of phyla and cover different metabolic systems (as opposed to only the Bacteroidetes and the Starch utilization system (Sus) gene homologs). dbCAN-PUL has the following features:

PULs from 10 different phyla and 173 species, as well as loci from metagenomic sequences
Contains both degradative and synthetic CAZyme-containing loci
Metadata for PULs such as substrate and method of experimental verification
Users can retrieve protein sequence, enzyme commission number and dbCAN2 annotations for CAZymes and other proteins
View homologous loci in NCBI GenBank through the integration of the MultiGeneBlast tool
Users can query their own sequences against proteins in PULs in dbCAN-PUL using a BLASTX search
Batch download of data for PULs

dbCAN-PUL will be updated once a year to include new experimentally validated CAZyme-containing gene clusters.

Feb-2025 update: 633 experimentally verified PULs in dbCAN-PUL. 122 new PULs were curated and added by Feb 2025. Duplicate PULs and PULs for capsule polysaccharide synthesis were removed. PUL0460 was split into two loci: PUL0460 and PUL0796. For more details on the changes, please see change_log.xlsx on the Download page.

05-10-2023 update: 671 experimentally verified PULs in dbCAN-PUL.

03-02-2023 update: 654 experimentally verified PULs in dbCAN-PUL. PUL0185 and PUL0293 were removed because they were duplicates.

05-09-2021 update: 612 experimentally verified PULs in dbCAN-PUL.

Figure: Top 20 distribution of PUL according to substrate (barplot; number of PULs per substrate)
xylan 62; pectin 56; host glycan 44; starch 42; beta-glucan 38; alginate 29; human milk oligosaccharide 23; fructan 21; cellulose 19; arabinan 18; glycosaminoglycan 17; fucoidan 16; chitin 16; mucin 15; cellobiose 15; carrageenan 14; pectic polysaccharide 12; beta-mannan 12; xyloglucan 11; galactomannan 11
The characterized gene clusters in dbCAN-PUL have 127 different types of substrates. The most abundant substrate in the barplot is xylan, with 62 PULs. Displayed here are the top 20 most frequent substrates in dbCAN-PUL; an extended version of this barplot covering the substrates across all PULs can be found on the Statistics page. Clicking a bar in the barplot links to a table of all PULs that share that type of substrate.
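As an illustration of how a top-N ranking like this one can be derived, here is a minimal Python sketch that counts substrates across PUL records. The record layout (a PUL ID paired with a single substrate string), the sample IDs, and the function name `top_substrates` are hypothetical and chosen only for illustration; the actual dbCAN-PUL download format may differ.

```python
from collections import Counter

# Hypothetical records: (PUL ID, substrate). Illustrative values only,
# not real dbCAN-PUL entries.
records = [
    ("PUL_A", "xylan"),
    ("PUL_B", "pectin"),
    ("PUL_C", "xylan"),
    ("PUL_D", "starch"),
    ("PUL_E", "xylan"),
    ("PUL_F", "pectin"),
]


def top_substrates(rows, n=20):
    """Return the n most frequent substrates as (substrate, count) pairs."""
    counts = Counter(substrate for _pul_id, substrate in rows)
    return counts.most_common(n)


# Print a tab-separated ranking of the two most frequent substrates.
for substrate, count in top_substrates(records, n=2):
    print(f"{substrate}\t{count}")
```

The same function could be applied to genus or characterization-method columns, assuming each record carries one value per column.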
Figure: Top 20 distribution of PUL according to taxonomy rank (genus) (barplot; number of PULs per genus or taxonomic group)
Bacteroides 147; Bifidobacterium 45; uncultured 32; Prevotella 22; Lactobacillus 19; Streptococcus 19; Bacillus 16; Flavobacterium 15; Segatella 13; Pseudoalteromonas 13; Ruminiclostridium 12; Zobellia 11; Escherichia 11; Chitinophaga 10; Xanthomonas 10; Neorhodopirellula 9; Vibrio 9; Roseburia 9; Clostridium 8; Alteromonas 8
dbCAN-PUL features PULs from 97 prokaryotic genera as well as metagenomically derived organisms. Here we show the 20 most abundant genera/taxonomic groups, with Bacteroides being the most frequent. An extended version of this barplot that illustrates the genera and taxonomic groups across all PULs can be found on the Statistics page. Clicking a bar in the barplot links to a table of all PULs that belong to the same genus or taxonomic group.
Figure: Top 20 distribution of PUL according to characterization method (barplot; number of PULs per method)
enzyme activity assay 216; RNA-seq 140; microarray 84; recombinant protein expression 58; gene deletion mutant and growth assay 56; qRT-PCR 54; growth assay 53; mass spectrometry 53; thin-layer chromatography 52; qPCR 51; sequence homology analysis 48; clone and expression 47; RT-PCR 39; differential gene expression 38; SDS-PAGE 33; fosmid library screen 33; high-performance anion-exchange chromatography 32; reducing-sugar assay 25; RT-qPCR 23; NMR 23
All of the PULs in dbCAN-PUL have been experimentally characterized as either degrading or synthesizing glycan substrates. A total of 77 characterization methods have been used to verify PULs. The barplot displays the top 20 characterization methods among PULs, with enzyme activity assay being the most frequent. An extended version of this barplot that illustrates the characterization methods across all PULs can be found on the Statistics page. Clicking a bar in the barplot links to a table of all PULs that share that type of characterization method.
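The bars in the three barplots link to relative targets of the form `Get_Substrate_entries?substrate=...`, `Get_Organism_entries?organism=...` and `Query_Entry_By_Experimental?method=...`. The following Python sketch shows one way such relative links could be assembled with percent-encoding. The function name `bar_link` is hypothetical, the site's base URL is not stated here and is deliberately left out, and encoding spaces as `%20` is an assumption about what the server accepts.

```python
from urllib.parse import quote, urlencode

# Endpoint and parameter names as they appear in the barplot links on this page.
ENDPOINTS = {
    "substrate": "Get_Substrate_entries",
    "organism": "Get_Organism_entries",
    "method": "Query_Entry_By_Experimental",
}


def bar_link(kind, value):
    """Build the relative link for one bar, e.g. kind='substrate', value='xylan'."""
    if kind not in ENDPOINTS:
        raise ValueError(f"unknown link kind: {kind!r}")
    # quote_via=quote encodes spaces as %20 (an assumption about the server).
    query = urlencode({kind: value}, quote_via=quote)
    return f"{ENDPOINTS[kind]}?{query}"


print(bar_link("substrate", "host glycan"))
print(bar_link("method", "enzyme activity assay"))
```

Keeping the endpoint names in a single mapping makes it easy to adjust them if the link scheme on the site changes.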