Cell. 2012 Jun 8;149(6):1368-80.
doi: 10.1016/j.cell.2012.04.027. Epub 2012 May 17.

Base-resolution analysis of 5-hydroxymethylcytosine in the mammalian genome

Miao Yu et al. Cell. 2012.

Abstract

The study of 5-hydroxymethylcytosine (5hmC) has been hampered by the lack of a method to map it at single-base resolution on a genome-wide scale. Affinity purification-based methods can neither precisely locate 5hmC nor accurately determine its relative abundance at each modified site. Here we present a genome-wide approach, Tet-assisted bisulfite sequencing (TAB-Seq), which, when combined with traditional bisulfite sequencing, can be used to map 5hmC at base resolution and to quantify the relative abundance of both 5hmC and 5mC. Application of this method to embryonic stem cells not only confirms the widespread distribution of 5hmC in the mammalian genome but also reveals sequence bias and strand asymmetry at 5hmC sites. We observe high levels of 5hmC and reciprocally low levels of 5mC near, but not on, transcription factor-binding sites. Additionally, the relative abundance of 5hmC varies significantly among distinct functional sequence elements, suggesting different mechanisms for 5hmC deposition and maintenance.

Figures

Figure 1. TAB-Seq Strategy and Validation
(A) Schematic diagram of TAB-Seq. 5hmCs in genomic DNA are protected by glucosylation, and then 5mCs are converted to 5caCs by Tet-mediated oxidation. After bisulfite treatment, both 5caC (generated from 5mC) and C display as T, while 5gmC (generated from the original 5hmC) displays as C. (B) TAB-Seq of 76-mer dsDNA with 5mC or 5hmC. The 76-mer dsDNA with a 5mC (left) or 5hmC (right) modification was subjected to TAB-Seq as described in Figure 1A. Sanger sequencing results showed that 5mC was completely converted to T (left) and 5hmC still read as C (right). (C) Mass spectrometry characterization of the products from TAB-Seq with a model DNA. The dsDNA contains a 5mC (left) or 5hmC (right) on a 9-mer strand annealed to an 11-mer complementary strand. The DNA was subjected to βGT-mediated glucosylation and mTet1-mediated oxidation. The reactions were monitored by MALDI-TOF/TOF, with the calculated and observed molecular weights indicated. (D) Validation of 5mC and 5hmC conversion in genomic DNA (mouse ES) with western blotting. The untreated DNA, βGT-treated DNA, and βGT/mTet1-treated DNA were tested with dot blot assays using antibodies against 5mC, 5hmC, 5fC, and 5caC, respectively. No 5hmC could be observed after glucosylation. Almost all 5mCs were converted into 5caCs after the mTet1-mediated oxidation. See also Figure S1.
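The read-out logic of panel (A) can be summarized as a small lookup. The sketch below is only an illustration of that scheme for a single, idealized molecule with complete conversion; the names `BS_READOUT`, `TAB_READOUT`, and `infer_state` are hypothetical and do not come from the paper.

```python
# Idealized read-out of each cytosine state (assumes 100% conversion
# efficiency, which real experiments do not reach).

# Traditional bisulfite sequencing: unmodified C reads as T,
# while both 5mC and 5hmC read as C.
BS_READOUT = {"C": "T", "5mC": "C", "5hmC": "C"}

# TAB-Seq: 5hmC is glucosylated to 5gmC and still reads as C;
# 5mC is oxidized to 5caC and, like unmodified C, reads as T.
TAB_READOUT = {"C": "T", "5mC": "T", "5hmC": "C"}


def infer_state(bs_base: str, tab_base: str) -> str:
    """Return the cytosine state consistent with both read-outs."""
    for state in BS_READOUT:
        if BS_READOUT[state] == bs_base and TAB_READOUT[state] == tab_base:
            return state
    # T in bisulfite sequencing but C in TAB-Seq fits no state in this model.
    raise ValueError(f"inconsistent read-outs: BS={bs_base}, TAB={tab_base}")
```

In practice the two assays are run on populations of molecules, so each site yields a pair of rates rather than a single discrete call.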
Figure 2. Generation of Genome-wide Base-Resolution Maps of 5hmC
(A) Snapshot of base-resolution 5hmC maps (red) compared to affinity-based 5hmC maps (grey) in H1 cells near the POU5F1 gene. Also shown are base-resolution maps of traditional bisulfite sequencing in H1 cells (black). Positive values (darker shades) indicate cytosines on the Watson strand, whereas negative values indicate cytosines on the Crick strand. For 5hmC, the vertical axis limits are −50% to +50%. For traditional bisulfite sequencing, the limits are −100% to +100%. Only cytosines sequenced to depth ≥5 are shown. (B) Overlap of 5hmC with 82,221 genomic regions previously identified as enriched with 5hmC by affinity mapping (black), in comparison to randomly chosen 5mC (white) (see Extended Experimental Procedures). (C) Sequence context of 5hmC sites compared to the reference human genome. (D) Heatmap of estimated abundances of 5hmC and 5mC for modified cytosines significantly enriched with 5hmC. 5mC was estimated as the rate from traditional bisulfite sequencing (5hmC + 5mC) minus the measured 5hmC rate. (E) The distribution of estimated abundances of 5hmC (red) and 5mC (green) at 5hmC sites. m: median. See also Figure S2.
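The subtraction described in panel (D) can be written out as a short sketch. The function names, the clamping of negative differences to zero, and the reuse of the depth ≥5 display threshold as a calling threshold are all assumptions made for illustration, not details confirmed by the caption.

```python
from typing import Optional


def call_rate(c_reads: int, total_reads: int, min_depth: int = 5) -> Optional[float]:
    """Fraction of reads showing C at a site, or None if coverage is too low.

    min_depth=5 mirrors the display threshold in panel (A); using it as a
    calling threshold is an assumption of this sketch.
    """
    if total_reads < min_depth:
        return None
    return c_reads / total_reads


def estimate_5mc(bs_rate: float, tab_rate: float) -> float:
    """Estimate 5mC as the bisulfite rate (5mC + 5hmC) minus the TAB-Seq rate (5hmC).

    Sampling noise can make the difference negative; clamping at zero is an
    assumption of this sketch.
    """
    return max(0.0, bs_rate - tab_rate)


# Hypothetical site: 16 of 20 bisulfite reads and 5 of 25 TAB-Seq reads show C.
bs = call_rate(16, 20)    # combined 5mC + 5hmC rate
tab = call_rate(5, 25)    # 5hmC rate
five_mc = estimate_5mc(bs, tab)
```

The read counts in the usage lines are invented solely to exercise the functions.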
Figure 3. Genomic Distribution of 5hmC Sites
(A) Overlap of H1 5hmC with genomic elements. Genic features were extracted from the UCSC Known Genes database (Hsu et al., 2006). Promoter-distal regulatory elements (>5 kb from TSS) reflect those experimentally mapped in H1 cells from ChIP-Seq and DNase-Seq experiments. Each 5hmC base is counted once: the overlap of a genomic element excludes all previously overlapped cytosines counterclockwise to the arrow. Green: promoter-proximal; red: promoter-distal regulatory elements; grey: genic regions; white: intergenic regions. (B) The relative enrichment of H1 5hmC (black) and random sites (grey) at genomic elements, normalized to the total coverage of the element type. Random consists of 10 random samplings of 5mC (see Extended Experimental Procedures). (C) The levels of 5hmCG (left) and 5mCG (right) for several classes of genomic elements significantly enriched with 5hmCG in H1 (p = 0.01, binomial). The dotted line indicates the 5mC non-conversion rate. Colors as in (A). (D) The percentage of distal regulatory elements significantly enriched with 5hmCG in H1. (E) In mouse ES cells, the absolute level of 5hmCG for several classes of genomic elements significantly enriched with 5hmCG (p = 0.01, Fisher’s exact test). Colors as in (A). (F) For genomic elements significantly enriched with 5hmCG in H1 ES cells and conserved in mouse, the distribution of 5hmCG in mouse ES cells. Colors as in (A). In all panels, definitions of enhancers, p300, CTCF, and DNase I sites are promoter-distal (>5 kb from TSS). See also Figure S3.
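One plausible reading of the binomial criterion in panel (C) is a one-sided test of the observed C reads against the 5mC non-conversion rate. The sketch below illustrates that idea only; the exact test used by the authors is not given in the caption, the function names are hypothetical, and the 2% non-conversion rate in the usage lines is a placeholder, not a value from the paper.

```python
from math import comb


def binomial_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def is_enriched(c_reads: int, total_reads: int,
                non_conversion_rate: float, alpha: float = 0.01) -> bool:
    """True if the C reads exceed what non-conversion alone would explain.

    The null hypothesis is that every C read comes from 5mC that failed to
    convert; alpha=0.01 follows the threshold quoted in the caption.
    """
    return binomial_tail(c_reads, total_reads, non_conversion_rate) <= alpha


# Placeholder non-conversion rate of 2% (an assumption for illustration).
strong_site = is_enriched(5, 20, 0.02)   # 5 of 20 reads show C
weak_site = is_enriched(1, 20, 0.02)     # 1 of 20 reads shows C
```

Pooling reads over an element before testing, as the caption implies for classes of genomic elements, would use the same function with summed counts.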
Figure 4. Profiles of 5hmC at Distal Regulatory Elements
(A) Frequency of 5hmC around distal p300 binding sites. (B) Absolute levels of 5hmCG (red) and 5mCG+5hmCG (black) around the distal p300 binding sites containing an OCT4/SOX2/TCF4/NANOG motif (blue bar, center; consensus: ATTTGCATAACAATG). 5mC (green) was estimated as the rate from traditional bisulfite sequencing (5hmC + 5mC) minus the measured 5hmC rate. The top half indicates enrichment on the strand containing the motif, with the bottom half indicating the opposite strand. (C) Frequency of 5hmC around distal CTCF binding sites, relative to the CTCF motif (blue bar, bottom). The different lines represent different strands, oriented with respect to the CTCF motif (consensus: ATAGTGCCACCTGGTGGCCA). Opp, opposite. (D) Absolute levels of 5hmCG, 5mCG, and 5mCG+5hmCG around distal CTCF binding sites anchored at the CTCF motif (blue bar, center). Colors as in (B). See also Figure S4.
Figure 5. Asymmetry around 5hmCG
(A) A schematic of nomenclature. The cytosine with 5hmC (red) is designated as “called”, while the cytosine on the opposite strand (green) is designated as “opposite”. (B) The average 5hmC abundance of called 5hmCG residues (red) compared to the opposite cytosine residues (green). called: called cytosine; opp: opposite cytosine. (C) The average 5hmC (black) and 5mC (white) abundance at called and opposite cytosines, for called cytosines having 5hmC (left) or 5mC+5hmC (right). 5mC (white excluding black) was estimated as the rate from traditional bisulfite sequencing (5hmC + 5mC) minus the measured 5hmC rate. Grey line: 5mC non-conversion rate. (D) The distribution of differences in 5hmCG (red) between called and opposite cytosines, in comparison to differences observed from traditional bisulfite sequencing (green, 5mCG + 5hmCG). Called and opposite cytosines are each sequenced to a depth of at least 10. (E) For 5hmC-called sites, a heatmap of 5hmCG abundance at called and opposite cytosine pairs (left). For the 5mC-called sites from traditional bisulfite sequencing, a heatmap of 5mCG + 5hmCG abundance at called and opposite cytosine pairs (right). See also Figure S5.
Figure 6. Local Sequence Context around 5hmCG
(A) Sequence context ±150 bp around 5hmCG sites (left), compared to the same number of randomly chosen mCG sites (right). Shown sequences are on the same strand as 5hmC. Inset: sequence context ±10 bp around 5hmCG sites that are on the Watson or Crick strands. Positive coordinates indicate the 3’ direction. (B) For cytosines showing a significant difference in 5hmCG between Watson and Crick strands (p = 0.01, Fisher’s exact test), and for which the abundance of guanine ±50 bp around the site shows significant strand bias (p = 0.01, Fisher’s exact test), shown is the frequency at which these two events co-occur. See also Figure S6.
Figure 7. 5hmCG is Biased towards Low CpG Regions
Shown are heatmaps of percent 5hmCG (±250 bp from TSS or DHS) as a function of CpG density for (A) promoters in H1 ES cells, (B) promoters in mouse ES cells, (D) DHS sites lacking H3K4me1 and H3K27ac, (E) DHS sites with a poised enhancer chromatin signature, and (F) DHS sites with an active enhancer chromatin signature. (C) The GC content relative to the CpG content for promoters enriched with 5hmC versus promoters not enriched with 5hmC. See also Figure S7.

Comment in

  • The sixth base and counting.
    Rusk N. Nat Methods. 2012 Jul;9(7):646. doi: 10.1038/nmeth.2095. PMID: 22930833. No abstract available.
