Chem Sci. 2014;5(1):446-461.
doi: 10.1039/C3SC52951G.

Inteins: Nature's Gift to Protein Chemists

Neel H Shah et al. Chem Sci. 2014.

Abstract

Inteins are auto-processing domains found in organisms from all domains of life. These proteins carry out a process known as protein splicing, a multi-step biochemical reaction comprising both the cleavage and formation of peptide bonds. While the endogenous substrates of protein splicing are specific essential proteins found in intein-containing host organisms, inteins are also functional in exogenous contexts and can be used to chemically manipulate virtually any polypeptide backbone. Given this, protein chemists have exploited various facets of intein reactivity to modify proteins in myriad ways for both basic biological research and potential therapeutic applications. Here, we review the intein field, first focusing on the biological context and phylogenetic diversity of inteins, followed by a description of intein structure and biochemical function. Finally, we discuss prevalent intein-based technologies, focusing on their applications in chemical biology, followed by persistent caveats of intein chemistry and approaches to alleviate these shortcomings. The findings summarized herein describe two and a half decades of research, leading from a biochemical curiosity to the development of powerful protein engineering tools.

Figures

Figure 1. Protein splicing in nature
a. Protein cis-splicing by the more prevalent contiguous inteins. b. Protein trans-splicing by the rarer split inteins. IntN refers to the N-intein and IntC refers to the C-intein. c. The phylogenetic distribution of intein-containing organisms. Roughly 300 organisms containing one or more inteins are shown in this phylogenetic tree. The bars at the periphery of the tree denote the number of inteins in each organism. The smallest bar indicates one intein, and the largest bar indicates 19 inteins. Black bars indicate inteins identified based on their gene sequence whose splicing capacity has not yet been determined experimentally. Red bars indicate inteins that have been shown experimentally to facilitate protein splicing. Data were extracted from the NEB InBase, which was last updated in 2010. Several inteins from the recent literature were added to this dataset, but we acknowledge that this tree may not reflect all discovered or characterized inteins. The phylogenetic tree was generated using the Interactive Tree of Life (iTOL) online tool. Phylogenetic relationships were automatically inferred by the iTOL software based on the NCBI taxonomic identifiers for each organism. Based on this classification system, in some cases different strains of the same organism were combined into one data entry.
Figure 2. Intein spreading and splitting through homing endonucleases
a. Homing endonuclease activity to convert an intein-free allele into an intein-containing allele. b. Proposed mechanism for intein splitting due to aberrant homing endonuclease invasion followed by chromosomal rearrangements.
Figure 3. The chemical mechanism of protein splicing and common side reactions
a. Conserved sequence motifs that facilitate protein splicing. b. The canonical mechanism of protein splicing. c. N-extein cleavage, also known as N-terminal cleavage, from the linear or branched (thio)ester intermediate. d. C-extein cleavage, also known as C-terminal cleavage, from the precursor protein or the N-extein cleaved protein.
Figure 4. Comparison of various Hint domain structures
a. Thermococcus kodakaraensis Pol-2 intein (PDB 2CW7). b. Mycobacterium xenopi GyrA intein (PDB 1AM2). c. Methanococcus jannaschii KlbA intein (PDB 2JNQ). d. Synechocystis sp. PCC6803 DnaE split intein (PDB 1ZD7). e. Nostoc punctiforme DnaE split intein (PDB 2KEQ). f. Clostridium thermocellum BIL domain 4 (PDB 2LWY). g. Drosophila melanogaster Hog domain (PDB 1AT0). In panels a to g, the Hint domain is shown as a green ribbon with the Block A nucleophile and Block G asparagine positions highlighted as orange spheres. In panel a, the homing endonuclease domain is shown in blue. For the split inteins in panels d and e (which are artificially fused in the solved structures), the C-intein region is light green. h. A close-up of the Mxe GyrA active site, highlighting key residues as sticks. The Block A Cys is mutated to Ala in this structure.
Figure 5. In vitro protein semi-synthesis
a. Expressed Protein Ligation (EPL). A C-terminal thioester is generated by cleavage from the intein followed by condensation with an N-terminal cysteine-containing peptide or protein. b. Semi-synthesis by protein trans-splicing (PTS) with a synthetically accessible C-intein. c. Semi-synthesis by protein trans-splicing (PTS) with a synthetically accessible N-intein. d. Affinity capture and protein modification using streamlined EPL.
Figure 6. Segmental isotopic labeling using split inteins in vivo
Figure 7. Protein and peptide cyclization
a. Cyclization of a protein using EPL. The rendering is based on the N-terminal SH3 domain of c-Crk-II (PDB 1M30), which has been head-to-tail cyclized using this method. b. Split Intein-mediated Circular Ligation Of Peptides and ProteinS (SICLOPPS).
Figure 8. Conditional protein splicing
a. Intein activation through conformational change induced by ligand binding to a fused ligand binding domain. b. Intein activation through deprotection of a photo-caged active site residue. c. Activation of an artificially split intein through chemically induced dimerization.
