HMM database of protein families found abundant in known anti-CRISPR operons

From the UHGG, RefSeq, MGV, IMG_VR, and GPD databases, we identified a total of 12582 short-gene operons containing at least one known Acr homolog and considered them to be known anti-CRISPR operons (AOs). From these AOs we constructed an AO database, dbAO. All proteins within the known AOs of dbAO were dereplicated at 95% identity with 90% target coverage, then clustered at 40% identity with 60% target coverage using usearch. Clusters with more than 5 proteins were considered AO-abundant protein families. For each AO-abundant protein family, all member proteins were aligned and the alignment was used to build an HMM. This resulted in a total of 2023 HMMs, which support both the characterization of known AOs and the discovery of new ones.
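
The cluster-size filter described above can be illustrated with a minimal sketch. It is not the pipeline used to build dbAO: the function name `abundant_families`, the `threshold` parameter, and the input shape (pairs of cluster ID and protein ID, as could be derived from any clustering output) are assumptions made for illustration.

```python
from collections import defaultdict


def abundant_families(assignments, threshold=5):
    """Group proteins by cluster and keep clusters with more than `threshold` members.

    `assignments` is an iterable of (cluster_id, protein_id) pairs. This input
    format is a hypothetical one chosen for the example.
    """
    clusters = defaultdict(list)
    for cluster_id, protein_id in assignments:
        clusters[cluster_id].append(protein_id)
    # "More than 5 proteins" means a strict comparison: size 5 is excluded.
    return {cid: members for cid, members in clusters.items()
            if len(members) > threshold}


# Toy data: cluster c1 has 6 proteins, cluster c2 has 5.
toy = [("c1", f"p{i}") for i in range(6)] + [("c2", f"q{i}") for i in range(5)]
families = abundant_families(toy)
print(sorted(families))  # ['c1']
```

In this toy input only `c1` passes, because a cluster with exactly 5 proteins does not meet the "more than 5" criterion. Each retained cluster would then be aligned and turned into one HMM.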

  • 2023 protein families found abundant in known AOs: download their HMMs here
  • 12582 AOs of dbAO: download their information table here