Editor’s note: Normally, creating point mutations with CRISPR is inefficient, with typical mutation rates of 0.1% to 5% and, at best, 20%. Indels usually form more often than the intended correction. Here, Alexis Komor from David Liu’s lab at Harvard University details her latest findings on a base-editing approach, built on a CRISPR framework, that reaches efficiencies of up to 75%. Check out the wizard we co-developed with the Liu lab to design gRNA for base editors!
Base editing is a new genome editing technology that enables the direct, irreversible conversion of a specific DNA base into another at a targeted genomic locus. Importantly, this can be achieved without requiring double-stranded DNA breaks (DSBs). Since many genetic diseases arise from point mutations, this technology has important implications for the study of human health and disease [1].
Other genome editing techniques, including standard CRISPR, begin by introducing a DSB at a locus of interest [2-4]. Cellular DNA repair enzymes then mend the break, commonly resulting in random insertions or deletions (indels) of bases at the site of the DSB [5]. When the goal is to introduce or correct a point mutation at a target locus, rather than to stochastically disrupt the entire gene, these traditional techniques are unsuitable: correction rates are low (typically 0.1% to 5%), and the major genome editing products are indels.
To increase the efficiency of gene correction without simultaneously introducing random indels, we modified the CRISPR/Cas9 system to directly convert one DNA base into another without DSB formation (Figure 1). Thus far, we have seen correction efficiencies of 15% to 75% with our technique, alongside indel formation of only 0.1% to 5%.
First, we used a catalytically dead Cas9 (dCas9) that still binds DNA in a guide RNA-programmed manner via the formation of an R-loop [6,7] but does not cleave the DNA backbone. We then fused a cytidine deaminase enzyme (rAPOBEC1) to the N-terminus of dCas9 to convert the DNA base cytosine to uracil [8], generating the base editing enzyme BE1.
To manipulate the cellular DNA repair processes and increase the yield of our desired outcome (a T:A base pair), we added a uracil DNA glycosylase inhibitor (UGI) [9] to prevent the subsequent U:G mismatch from being repaired back to a C:G base pair (resulting in BE2, rAPOBEC1-XTEN-dCas9-UGI).
Finally, to improve base editing efficiency, we restored the catalytic His residue at position 840 in the Cas9 HNH domain of BE2, resulting in BE3 (APOBEC–XTEN–dCas9(A840H)–UGI). BE3 nicks only the non-edited strand, which makes that strand resemble newly synthesized DNA [10] and leads to the desired U:A product (Figure 1).
We recommend using BE2 for applications that require no indel formation. For applications that require the highest possible base editing efficiency and can tolerate low levels of indel formation (due to the nicking strategy of BE3), we recommend using BE3.
All current base editors contain the cytidine deaminase enzyme rAPOBEC1 and are therefore only capable of C to T or G to A DNA base transformations. Since rAPOBEC1 can only bind to single-stranded DNA, the local denaturation of the target DNA upon dCas9:sgRNA binding (R-loop formation) [11] is essential for effective base editing. As in standard CRISPR, the specific locus targeted with the sgRNA must be followed by a PAM (protospacer adjacent motif) to allow for efficient Cas9 binding [7]. In the structure of the Cas9 R-loop complex, the eleven nucleotides furthest from the PAM on the nontarget strand are disordered, suggesting they are unencumbered and accessible for base editing (Figure 2) [12].
Once the R-loop forms, the cytidine deaminase enzyme directly binds the target nucleotide (C) and chemically converts it to U (Figure 3). The resulting U:G mismatch is then processed by cellular DNA replication or repair, and resolves into a T:A base pair [13,14]. The overall DNA transformation is therefore C:G to T:A (Figure 3).
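The two-step transformation described above can be traced on a short duplex. The following is a minimal sketch in Python; the function names and the five-base example sequence are hypothetical, chosen only for illustration, and the model ignores everything except the base identities.

```python
# Watson-Crick partners for the four DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Return the base-by-base partner strand (same order, not reversed)."""
    return "".join(COMPLEMENT[base] for base in strand)


def base_edit(top: str, index: int):
    """Model C:G -> U:G -> T:A at one 0-based position of the top strand.

    Returns (intermediate_top, final_top, final_bottom).
    """
    if top[index] != "C":
        raise ValueError("target base must be a C")
    # Step 1: deamination turns the target C into U; its partner G is untouched,
    # which leaves a U:G mismatch.
    intermediate_top = top[:index] + "U" + top[index + 1:]
    # Step 2: replication or repair reads U as T and pairs it with A.
    final_top = top[:index] + "T" + top[index + 1:]
    return intermediate_top, final_top, complement(final_top)


# Hypothetical five-base example: the C at index 2 is the target.
print(base_edit("GACTA", 2))  # ('GAUTA', 'GATTA', 'CTAAT')
```

The partner strand is shown base by base in the same order as the top strand, so the A that ends up opposite the new T sits at the same index.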
Designing sgRNAs that enable efficient and specific base editing is essential when using base editing for your own research. Compared to the 0.5-20% point mutation efficiency achieved with standard CRISPR methods [15-17], base editing typically reaches efficiencies of 35% in mammalian cell culture, and can reach up to 75% depending on the sequence. The ratio of gene correction to indel formation is on average 23 for BE3, and over 1,000 for BE2. A side-by-side comparison with Cas9 and a donor template yields gene correction to indel formation ratios of 0.17 on average. Base editing efficiency depends on the sequence surrounding the target nucleotide C and on the location of the target C within the protospacer. The canonical substrate sequence of rAPOBEC1 is TC or CC [18], and we observe higher base editing efficiencies when editing target Cs in these motifs. Editing efficiencies of AC and GC sequences can also be high, but the editing window for these motifs is slightly narrower.
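For readers who want to compute the same correction-to-indel ratio from their own sequencing data, here is a minimal sketch. It assumes you have already counted reads carrying the intended edit and reads carrying an indel; the function name and the read counts in the example are made up for illustration and are not data from the study.

```python
import math


def correction_to_indel_ratio(corrected_reads: int, indel_reads: int) -> float:
    """Ratio of reads carrying the intended edit to reads carrying an indel."""
    if corrected_reads < 0 or indel_reads < 0:
        raise ValueError("read counts must be non-negative")
    if indel_reads == 0:
        # No indels observed: the ratio is unbounded for this sample.
        return math.inf
    return corrected_reads / indel_reads


# Made-up read counts for illustration only.
print(correction_to_indel_ratio(400, 20))  # 20.0
```

A ratio above 1 means the intended edit is more common than indels; a ratio below 1, such as the 0.17 quoted above for Cas9 with a donor template, means indels dominate.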
More specifically, we characterized BE1 in vitro using high-throughput sequencing (HTS) and found that the sequence context dependence of base editing is determined only by the base directly 5’ to the target C. For the four different NC motifs, we observe, in general, robust and efficient C to T editing when the target C is in positions 4 through 8 within the protospacer (Figure 4). This editing window can be larger (in the case of TC) or smaller (in the case of GC) depending on the identity of the NC motif. It is also important to note that when multiple Cs are present within this base editing window, all of them will be edited (Figure 4).
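To check a candidate protospacer against the window described above, it helps to list every C in positions 4 through 8 together with its NC motif. The sketch below assumes a protospacer written 5’ to 3’ with position 1 at the PAM-distal end; the function name and the example sequence are hypothetical, and the fixed 4 to 8 window does not model the motif-dependent widening or narrowing mentioned above.

```python
from typing import List, Tuple

# Protospacer positions 4 through 8 (1-based, position 1 is furthest from the PAM).
WINDOW = range(4, 9)


def editable_cs(protospacer: str, window=WINDOW) -> List[Tuple[int, str]]:
    """List (position, NC motif) for every C inside the editing window."""
    seq = protospacer.upper()
    hits = []
    for pos in window:
        if pos <= len(seq) and seq[pos - 1] == "C":
            # The base directly 5' of the C defines the NC motif.
            five_prime = seq[pos - 2] if pos >= 2 else "N"
            hits.append((pos, five_prime + "C"))
    return hits


# Hypothetical 20-nt protospacer, for illustration only.
hits = editable_cs("GATTCACCGTAGCTAGGATC")
print(hits)           # [(5, 'TC'), (7, 'AC'), (8, 'CC')]
print(len(hits) > 1)  # True: more than one C sits in the window
```

When the list has more than one entry, as here, the guide would be expected to edit the additional Cs along with the intended target, so it is worth checking whether those extra changes are acceptable.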
As with other CRISPR technologies, the current base editors are built on wild-type Streptococcus pyogenes dCas9 and thus require an NGG PAM for DNA binding and R-loop formation [19]. For maximum base editing, the target C or G should be positioned 12-16 bases upstream of the PAM, as depicted in Figure 5, when inspecting the coding strand of genomic DNA in the 5’ to 3’ direction.
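The PAM and window constraints can be combined into a simple search for candidate protospacers around a target C. The sketch below is an illustration, not the design wizard mentioned in the editor’s note: it assumes a 20-nt protospacer with the target at positions 4 through 8 (which leaves 12 to 16 bases between the target and the PAM) and an NGG PAM immediately 3’ of the protospacer. The function names and the example sequence are hypothetical. For a target G, the search is run on the reverse complement, where the G appears as a C.

```python
from typing import List, Tuple

PROTOSPACER_LEN = 20
WINDOW = (4, 8)  # 1-based protospacer positions, position 1 furthest from the PAM


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence, used when the target is a G."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def candidate_guides(seq: str, target_index: int) -> List[Tuple[str, str, int]]:
    """Find protospacers that place seq[target_index] (a C, 0-based) in the
    editing window and are followed by an NGG PAM on the same strand.

    Returns (protospacer, pam, target_position_in_protospacer) tuples.
    """
    seq = seq.upper()
    if seq[target_index] != "C":
        raise ValueError("target must be a C; for a G, search the reverse complement")
    guides = []
    for pos in range(WINDOW[0], WINDOW[1] + 1):
        start = target_index - (pos - 1)    # where the protospacer would begin
        pam_start = start + PROTOSPACER_LEN  # PAM sits directly 3' of it
        if start < 0 or pam_start + 3 > len(seq):
            continue  # protospacer or PAM would run off the sequence
        pam = seq[pam_start:pam_start + 3]
        if pam[1:] == "GG":  # NGG: any base followed by two Gs
            guides.append((seq[start:pam_start], pam, pos))
    return guides


# Hypothetical sequence for illustration; the target C is at index 4.
print(candidate_guides("GATTCACCGTAGCTAGGATCTGGAA", 4))
# [('GATTCACCGTAGCTAGGATC', 'TGG', 5)]
```

This only checks geometry. It does not score off-target binding or account for other Cs in the window, both of which matter when choosing among candidates.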
The above characteristics should be kept in mind when designing sgRNAs for base editing.
Here is a step-by-step workflow that you can use to carry out base editing in mammalian cells via plasmid-based delivery methods:
For a full explanation of how we determined these rules, please see our manuscript in Nature.