Construct Design
On this page we present some general guidelines for the design of transgenic constructs and gene targeting constructs.
Transgenes
Transgenes are linear pieces of DNA, usually cloned into a plasmid vector, that generally contain a promoter, a cDNA, an intron, and a polyadenylation signal. When injected into the pronuclei of fertilized mouse eggs, they are incorporated into random loci, usually as head-to-tail concatemers consisting of varying numbers of copies. Most of the time, the transgene integrates at a single locus and is present in all cells of the resulting mouse. In a small percentage of cases, mice may have multiple integration sites, or may be mosaic, carrying the transgene in only a portion of their cells.
When a cDNA is used, an artificial intron is added to stimulate the transport of mRNA out of the nucleus, since this process is coupled to the splicing process. The artificial intron should be placed at the 3-prime end of the cDNA.
Some variations on this theme include:
- BAC transgenes – these allow insertion of entire genes, including all introns and possibly unknown promoter and enhancer elements, as well as “insulator” sequences that allow expression of the gene even if the transgene is integrated into heterochromatin.
- Promoterless transgenes – these can act as gene-trap or promoter-trap vectors and generally contain cDNA for a reporter gene such as lacZ or GFP.
- IRES – Internal Ribosome Entry Sites are added, along with a second cDNA, to achieve expression of two proteins under the control of the same promoter. The second protein could be a reporter molecule such as beta-galactosidase or GFP.
- Fusion proteins – the protein of interest is fused to either a reporter protein or an affinity tag to allow easier detection of the protein product.
Promoters
The best promoter for any given transgene will depend on the exact aims of the research.
Most transgenes are designed to result in over-expression of a protein. Thus, strong constitutive promoters such as CMV or PGK are often used to drive expression in all tissues.
Tissue-specific promoters can be used to limit the spatial expression pattern, while inducible promoters are used to control the timing of expression. Developmentally regulated promoters can be used for control of timing as well.
The most commonly used inducible promoters are turned on or off by tetracycline or its analog, doxycycline. The so-called Tet-On and Tet-Off vector systems are commercially available from Clontech. These systems consist of two transgenes that are injected separately, and the complete system is reconstituted by mating the two lines of transgenic mice.
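The on/off logic of the two tetracycline-responsive systems can be summarized in a few lines of Python. This is only a minimal sketch of the idealized behavior: the function name and arguments are hypothetical, and real lines often show leaky or incomplete induction that this model ignores.

```python
def transgene_expressed(system: str, has_transactivator: bool,
                        has_responder: bool, dox_present: bool) -> bool:
    """Idealized on/off state of a tetracycline-responsive transgene.

    system is "Tet-On" or "Tet-Off". Both transgenes (the transactivator
    line and the responder line) must be present in the same mouse,
    which is why the two lines have to be mated.
    """
    if system not in ("Tet-On", "Tet-Off"):
        raise ValueError("system must be 'Tet-On' or 'Tet-Off'")
    if not (has_transactivator and has_responder):
        return False  # the system is incomplete in single-transgenic mice
    if system == "Tet-On":
        return dox_present      # doxycycline switches expression on
    return not dox_present      # Tet-Off: doxycycline switches expression off


# Double-transgenic mouse given doxycycline
print(transgene_expressed("Tet-On", True, True, dox_present=True))
print(transgene_expressed("Tet-Off", True, True, dox_present=True))
```

The point of the sketch is that a single-transgenic animal never expresses the gene, regardless of doxycycline, so both parental lines should be genotyped for their respective transgenes.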
Gene Targeting Constructs
Gene targeting constructs are designed to undergo homologous recombination into a specific locus chosen by the investigator, usually with the aim of disrupting the gene to prevent transcription of a functional mRNA (a knock-out), or mutating the gene (a knock-in).
The number of ways in which gene targeting constructs can be designed to produce knock-out or knock-in mice is almost limitless. Thus, we can only offer some general guidelines here, and urge people making their first construct to read the literature and consult with the TMF or another more experienced investigator before doing any cloning.
The simplest targeting construct consists of 2 long segments of genomic DNA (gDNA), called homology arms, flanking a selection cassette. The most commonly used selection cassette consists of the cDNA and control elements for the neomycin (G418) resistance gene (others include resistance genes for puromycin, hygromycin, and 6-thioguanine).
When introduced into embryonic stem (ES) cells, the gDNA homology arms will undergo recombination with their matching sequences on one chromosome, carrying the selection cassette with them. The gDNA between the regions of homology on the chromosome is thereby replaced by the selection cassette and any other sequences flanked by the homology arms of the targeting construct. Where a complete knockout is desired, the intervening sequence is usually positioned to replace the TATA box, the start codon, and one or more of the initial exons.
The targeting construct will also integrate into random loci. Any integration event, random or specific, can confer drug resistance to the cell. After growing the transfected cells under selection, the challenge is to screen enough clones to find the rare homologous recombination events in a background of frequent random integrants.
We recommend the use of both positive and negative selection cassettes in all targeting constructs. The most commonly used negative selection cassette contains the gene for thymidine kinase, or tk. The tk gene product allows growing cells to incorporate a toxic nucleotide analog into their DNA, so cells that carry tk are killed when the analog is added to the medium. The tk cassette is cloned into the targeting construct outside of the homology arms, so that it will not be incorporated during homologous recombination. It will usually be incorporated during random integration, which helps to select against those clones. Another negatively selectable marker is the gene for diphtheria toxin A. The A subunit inhibits protein synthesis but cannot be taken up by other cells. Its advantage over tk is that it works without having to add a second drug to the culture medium.
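The combined effect of positive and negative selection can be illustrated with a small Python sketch. The `Clone` class and `survives` function are hypothetical names used only for illustration, and the model is deliberately simplified: it assumes every random integrant keeps an intact tk cassette, which is not always true in practice, so surviving clones still need to be screened.

```python
from dataclasses import dataclass


@dataclass
class Clone:
    name: str
    has_neo: bool  # positive selection cassette (G418 resistance)
    has_tk: bool   # negative selection cassette, outside the homology arms


def survives(clone: Clone, g418: bool = True, negative_drug: bool = True) -> bool:
    """Return True if the clone is expected to grow under the given selection."""
    if g418 and not clone.has_neo:
        return False  # no resistance gene: killed by G418
    if negative_drug and clone.has_tk:
        return False  # tk converts the nucleotide analog into a toxic product
    return True


clones = [
    Clone("untransfected", has_neo=False, has_tk=False),
    Clone("random integrant", has_neo=True, has_tk=True),
    Clone("homologous recombinant", has_neo=True, has_tk=False),
]

survivors = [c.name for c in clones if survives(c)]
print(survivors)
```

With G418 alone, both the random integrant and the homologous recombinant would survive; adding the negative selection step is what enriches for the targeted clones.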
Homology Arms
The degree to which the homology arms match the same sequences in the locus of interest will help determine the frequency of homologous recombination. The 3 most important characteristics of homology arms are:
- Length – we recommend an overall length of about 7 kilobases, with one arm being 5-6 kb and the other being 1-2 kb. Longer is better, but one is usually limited by the capacity of the cloning vector and the need to maintain a unique restriction enzyme site that can be used to linearize the construct prior to transfection into ES cells.
- Sequence homology – whenever possible, clone the homology arms from the genome of the ES cells that will be targeted, or from the mouse strain they were derived from. Long-range PCR with a high-fidelity polymerase is an effective method for subcloning the homology arms.
- Limited repetitive sequences – we recommend using the online program RepeatMasker to search for repetitive sequences in the homology arms. Large regions of repetitive DNA should be avoided, because they will result in a lower frequency of homologous recombination.
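The length and repeat guidelines above can be turned into a quick sanity check before any cloning is done. The following Python sketch uses hypothetical function names. It assumes the arm sequence has been soft-masked (repeats in lowercase, unique sequence in uppercase), and it checks only the lower bounds of the recommended arm lengths, since longer arms are described above as better. No repeat threshold is applied, because none is given here.

```python
def arm_lengths_meet_guideline(long_arm_bp: int, short_arm_bp: int) -> bool:
    """Check arm lengths against the recommended minimums (5 kb and 1 kb)."""
    return long_arm_bp >= 5000 and short_arm_bp >= 1000


def repeat_fraction(soft_masked_seq: str) -> float:
    """Fraction of bases that are lowercase (i.e. masked as repetitive)."""
    bases = [c for c in soft_masked_seq if c.isalpha()]
    if not bases:
        return 0.0
    return sum(c.islower() for c in bases) / len(bases)


# Toy example: a 5.5 kb long arm, a 1.5 kb short arm, and a short
# soft-masked fragment in which half of the bases are repetitive.
print(arm_lengths_meet_guideline(5500, 1500))
print(repeat_fraction("ACGTACGTacgtacgt"))
```

A high repeat fraction in either arm would be a reason to shift the arm boundaries before ordering primers.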
Conditional Targeting
One problem with a simple, constitutive targeting scheme is that many genes have multiple functions, or are active in multiple tissues and/or at multiple stages of development. Many knockouts of genes with no known roles in development have resulted in embryonic lethality, preventing the study of the gene’s involvement in the phenotype of an adult animal. To get around this problem, techniques have been developed to allow the investigator to determine when and where the knockout occurs. Conditional targeting constructs employ recombinase recognition sequences, tissue-specific or developmentally-specific promoters, or inducible promoters (or a combination of these) to limit and control the spatial and temporal expression of the knockout or knock-in phenotype.
The following example illustrates one of the more common ways in which a conditional targeting construct is used.
The cartoon below illustrates the 5-prime end of a gene that has been targeted with a construct containing loxP sites in 3 positions. A loxP site is a 34 bp sequence that is recognized by the enzyme Cre recombinase (Cre). The reaction catalyzed by Cre brings two loxP sites together and removes the intervening DNA along with one loxP site, in the form of a circular molecule.
In this cartoon, the loxP sites are represented by arrowheads, the first and second exons by open boxes, introns by lines, and the neomycin resistance cassette by the shaded box.
When Cre is present in a cell with this arrangement of loxP sites, 3 separate recombinations can be catalyzed by Cre, depending on which pair of loxP sites is acted upon. (Eventually, all 3 sites would be reduced to a single site, with removal of the neomycin cassette and the first exon.)
Initially, Cre is not present and the cells can be selected for their neomycin resistance. Next, Cre is introduced by transient transfection with a Cre-expressing plasmid. Under the right conditions, partial recombination between the loxP sites will occur, and a variety of clones will be found with one, two, or three loxP sites remaining. PCR is then used to find clones that are lacking the neomycin resistance cassette but still have the first and second loxP sites, as illustrated by the next cartoon.
The reason for removing the neomycin resistance cassette is that, even though it resides in an intron, it can have unpredictable effects on the expression of the gene of interest, for example, via inappropriate mRNA splicing events.
The ES cells resulting from the partial recombination reaction illustrated above are said to have a floxed gene, meaning the gene of interest has one or more exons flanked by loxP sites (flox = flanked by loxP).
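The three possible Cre-mediated recombinations described above can be enumerated with a short Python sketch. The element order used here (loxP, exon 1, loxP, neomycin cassette, loxP, exon 2) is an assumption inferred from the description of the floxed allele, and the function name `cre_excise` is hypothetical.

```python
LOXP = "loxP"


def cre_excise(allele, first_site, second_site):
    """Delete the DNA between two loxP sites, leaving a single loxP behind.

    Sites are numbered 1, 2, 3, ... from the 5-prime end of the allele.
    """
    positions = [i for i, element in enumerate(allele) if element == LOXP]
    start = positions[first_site - 1]
    end = positions[second_site - 1]
    if not start < end:
        raise ValueError("first_site must lie 5-prime of second_site")
    # Keep everything up to and including the first site, then resume
    # after the second site; the excised circle is simply discarded.
    return allele[:start + 1] + allele[end + 1:]


# Assumed layout of the targeted allele
targeted = [LOXP, "exon1", LOXP, "neo", LOXP, "exon2"]

outcomes = {pair: cre_excise(targeted, *pair) for pair in [(1, 2), (2, 3), (1, 3)]}
for pair, allele in outcomes.items():
    print(pair, allele)

# The desired floxed clone has lost neo but keeps two loxP sites around exon 1
floxed = [a for a in outcomes.values() if "neo" not in a and a.count(LOXP) == 2]
print(floxed)
```

Under this assumed layout, only recombination between the second and third sites yields the floxed allele, which is why the clones have to be screened by PCR after transient Cre expression.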
When a mouse is made from these ES cells, it has an almost completely wildtype allele that can be knocked out by the expression of Cre inside its cells (assuming removal of the first exon results in a knockout). Such a mouse would be termed a heterozygous floxed mouse. Breeding a floxed mouse with a mouse that expresses Cre from a transgene will result in some offspring that inherit both the floxed allele and the transgene. These mice will lack the first exon of the gene of interest on one chromosome, due to the action of Cre on the remaining two loxP sites. Further crosses between floxed mice and flox/Cre mice will result in a homozygous knockout.
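The expected genotype frequencies from these crosses can be worked out with a small Punnett-square sketch in Python. It assumes simple Mendelian inheritance, a floxed locus and a Cre transgene that are unlinked, and a Cre parent carrying a single (hemizygous) copy of the transgene. The function names and genotype notation are hypothetical.

```python
from fractions import Fraction
from itertools import product


def gametes(parent):
    """All equally likely gametes; parent is a tuple of allele pairs, one per locus."""
    return list(product(*parent))


def offspring(parent_a, parent_b):
    """Map each offspring genotype to its expected frequency."""
    ga, gb = gametes(parent_a), gametes(parent_b)
    share = Fraction(1, len(ga) * len(gb))
    freqs = {}
    for a, b in product(ga, gb):
        # Sort the two alleles at each locus so that order does not matter
        genotype = tuple(tuple(sorted(pair)) for pair in zip(a, b))
        freqs[genotype] = freqs.get(genotype, 0) + share
    return freqs


# Locus 1: gene of interest ("flox" or wildtype "+"); locus 2: Cre transgene or none ("-")
floxed_het = (("flox", "+"), ("-", "-"))
cre_mouse = (("+", "+"), ("Cre", "-"))
flox_cre = (("flox", "+"), ("Cre", "-"))

first_cross = offspring(floxed_het, cre_mouse)
print(first_cross[(("+", "flox"), ("-", "Cre"))])      # flox/+ with Cre

second_cross = offspring(floxed_het, flox_cre)
print(second_cross[(("flox", "flox"), ("-", "Cre"))])  # flox/flox with Cre
```

Under these assumptions, one quarter of the first litter is expected to carry both the floxed allele and Cre, and one eighth of the offspring from the second cross are expected to be homozygous floxed with Cre, which is useful when estimating how many breeding pairs are needed.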
The real power of this system becomes apparent when one considers the fact that multiple lines of Cre-expressing mice are available, with different promoters driving expression of Cre. If the promoter is neuron-specific, for example, the knockout will only occur in neurons. Cre expression itself can be made conditional by using a tetracycline-responsive promoter, for example, allowing the exact time at which the gene is knocked out to be determined by the investigator.