Pfam Database

Last updated February 13, 2023

The Pfam database is a large collection of protein families, each represented by multiple sequence alignments and hidden Markov models (HMMs). Proteins are generally composed of one or more functional regions, commonly termed domains. Different combinations of domains give rise to the diverse range of proteins found in nature. The identification of domains that occur within proteins can therefore provide insights into their function.

Pfam also generates higher-level groupings of related entries, known as clans. A clan is a collection of Pfam entries that are related by similarity of sequence, structure, or profile HMM.
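The entry-to-clan relationship can be illustrated with a minimal Python sketch. It assumes a tab-separated mapping in which the first column is the Pfam accession and the second is the clan accession (left empty when an entry belongs to no clan), similar in layout to the Pfam-A.clans.tsv file distributed with Pfam releases. Check the column order of the file you actually have. The function name and the accessions in the sample are hypothetical and made up for illustration.

```python
import csv
import io
from collections import defaultdict


def group_entries_by_clan(tsv_text):
    """Group Pfam accessions by clan accession.

    Assumes column 0 is the Pfam accession and column 1 is the clan
    accession; this layout is an assumption, not read from this page.
    """
    clans = defaultdict(list)
    # QUOTE_NONE keeps quote characters inside descriptions from confusing the parser.
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) < 2:
            continue  # skip blank or malformed lines
        pfam_acc, clan_acc = row[0], row[1]
        if clan_acc:  # entries with no clan have an empty second column
            clans[clan_acc].append(pfam_acc)
    return dict(clans)


# Made-up sample rows (not real Pfam entries), for illustration only.
SAMPLE = (
    "PF99991\tCL9991\tDemo_clan\tDemo_1\tHypothetical family one\n"
    "PF99992\tCL9991\tDemo_clan\tDemo_2\tHypothetical family two\n"
    "PF99993\t\t\tDemo_3\tHypothetical family with no clan\n"
)

if __name__ == "__main__":
    for clan, members in group_entries_by_clan(SAMPLE).items():
        print(clan, members)
```

For a real file, read its contents with `open(path).read()` and pass the text to the function.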

The data presented for each entry are based on the UniProt Reference Proteomes, but information on individual UniProtKB sequences can still be found by entering the protein accession. Pfam full alignments are built by searching several sequence databases, which provide either different accessions (e.g., all UniProt and NCBI GI) or different levels of redundancy.

Use the widget below to access the data in this database. Use the filter in the top right corner to narrow down the results. Once you have found the item you need, simply select it and the directory path will be copied to your clipboard. Paste the path in your submission scripts to use in your analysis. If you would like to see the path displayed in the widget, select the Detailed View option.
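As one possible way to use a copied path in a script, the Python sketch below assembles an `hmmscan` command line (from the HMMER package) that searches query proteins against a Pfam HMM library. Everything path-like here is an assumption: `/path/to/pfam` is a placeholder for the directory path copied from the widget, and `Pfam-A.hmm`, `proteins.fasta`, and `pfam_hits.domtblout` are illustrative file names. It also assumes the HMM library has already been prepared with `hmmpress` and that HMMER is available in your environment.

```python
import shlex
from pathlib import Path


def build_hmmscan_command(pfam_hmm, query_fasta, out_table, cpus=4):
    """Return the argument list for an hmmscan search against a Pfam HMM library."""
    return [
        "hmmscan",
        "--cut_ga",                     # use each model's curated gathering thresholds
        "--cpu", str(cpus),             # number of worker threads
        "--domtblout", str(out_table),  # per-domain hits as a parseable table
        str(pfam_hmm),                  # HMM library (assumed already pressed with hmmpress)
        str(query_fasta),               # query protein sequences
    ]


if __name__ == "__main__":
    # Placeholder: replace with the directory path copied from the widget.
    pfam_dir = Path("/path/to/pfam")
    cmd = build_hmmscan_command(
        pfam_dir / "Pfam-A.hmm",        # assumed file name inside that directory
        "proteins.fasta",               # hypothetical input file
        "pfam_hits.domtblout",          # hypothetical output file
    )
    # Print the command so it can be inspected or pasted into a submission script.
    print(shlex.join(cmd))
```

The command is only printed, not executed, so it can be reviewed before it goes into a job script. To run it from Python instead, pass the list to `subprocess.run(cmd, check=True)`.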