Structure of U1 RNA bound to U1A RRM domain
Gene Expression Regulation
|Eubacterial systems||Eukaryotic systems|
|Translation is concurrent with transcription||Transcript must be processed: capping, splicing, polyadenylation, export|
|No barrier restricts movement of transcript to translational apparatus||mRNA is sequestered as RNP in the nucleus, and must be transported to the cytoplasm through the nuclear pore complex.|
|No barrier restricts access of polypeptides to transcriptional and translational machinery.||Both functional and regulatory factors controlling pre-mRNA production and processing must be imported into the nucleus.|
|1)||m7G-capping at 5' end|
|2)||splicing to remove introns|
|3)||polyadenylation at 3'-end|
|4)||sequestration as RNP|
Mammalian capping enzyme (CE) catalyzes two reactions:
pppN(pN)25 → ppN(pN)25 + Pi;
ppN(pN)25 + GTP → GpppN(pN)25 + PPi.
|The mammalian CE is a single bifunctional enzyme, whereas yeast CE exists as two separate subunits. Appropriate fragments of the mammalian enzyme can complement defects in the respective yeast subunits. G becomes linked to the triphosphate through its 5'-O, so it has the opposite orientation to the other nucleotides in the chain.|
The methyltransferase (MT) methylates the attached G on N7. Methylation in this position gives rise to a quaternary N+, changing G into its enolate form. In many cases, the initiating nucleotide is A, which can be N6 methylated as well.
|Removal of introns from pre-mRNA transcripts involves cleavage
at the 5'- end of the intron by attack of a specific 2'OH group, the
branch site. This forms a phosphodiester bond with the 5'-phosphate
of the intron, creating a lariat structure.|
The intron lariat is then removed, proceeding by attack of 3'-OH on Exon 1 to displace the intron from the 5'-phosphate of Exon 2.
|During the whole process, the number of phosphodiester bonds remains
constant, so this is not an endonuclease cleavage and ligation process, as
occurs in tRNA processing, but an ATP-independent transesterification.
At step 1, the phosphodiester bond between Exon1 and intron is converted into the 2'-branch site phosphodiester.
At step 2, the phosphodiester bond between intron and Exon 2 is converted into the Exon 1 - Exon 2 phosphodiester.
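The bond bookkeeping in the two steps above can be checked with a short calculation. This is only a minimal sketch: the function name `bond_counts` and the exon and intron lengths are illustrative assumptions, and the 2'-5' branch bond is counted as one phosphodiester bond alongside the normal 3'-5' backbone bonds.

```python
def bond_counts(exon1_len, intron_len, exon2_len):
    """Return phosphodiester bond totals (start, after step 1, after step 2)."""
    # A linear chain of n nucleotides has n - 1 backbone bonds.
    start = exon1_len + intron_len + exon2_len - 1

    # Step 1: the Exon 1-intron bond is lost and the 2'-5' branch bond is made.
    free_exon1 = exon1_len - 1
    lariat_intermediate = (intron_len + exon2_len - 1) + 1  # backbone + branch
    after_step1 = free_exon1 + lariat_intermediate

    # Step 2: the intron-Exon 2 bond is lost and the Exon 1-Exon 2 bond is made.
    spliced_exons = exon1_len + exon2_len - 1
    lariat_intron = (intron_len - 1) + 1  # backbone + branch
    after_step2 = spliced_exons + lariat_intron

    return start, after_step1, after_step2


print(bond_counts(10, 50, 20))
```

Each step breaks one bond and makes one bond, so the three totals are equal whatever lengths are chosen.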
Some single-celled eukaryotes, e.g. the ciliate Tetrahymena, produce
precursor RNAs (in Tetrahymena, the pre-rRNA) with self-splicing introns.
In these cases the intron forms a unique tertiary structure promoting
self-catalysis. The catalytic action is embodied in the RNA itself.
In some examples, catalysis involves attack by the 3'-O of a separately
bound guanosine nucleotide, and in other cases the
2'-O of an in-chain A produces the lariat structure.
Autocatalysis is mediated via a metal ion, Mg2+ or Ca2+, bound to a specific site formed by the tertiary structure of the intron. Self splicing introns have been used as the models for ribozymes, or catalytic RNA. For additional information, see Ribosomes and Ribozymes
In most eukaryotes, splicing is mediated by the spliceosome, a large ribonucleoprotein complex comparable in size to the ribosome or the polymerase II holoenzyme.
The spliceosome contains a specific set of U-rich small nuclear ribonucleoproteins or snRNPs.
|The common spliceosome recognizes introns starting with 5'-GU and ending in AG-3'. More recently, a subclass of spliceosome has been found to recognize introns with 5'-AU and AC-3' ends. Since we typically read sequences in DNA, these have been called AT-AC introns. Site recognition involves snRNPs U11 and U12 as analogs of U1 and U2, and the transesterase consists of variant snRNPs U4atac-U6atac bound by the conventional U5 (Tarn and Steitz, 1997).|
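As an illustration of the terminal dinucleotide rule, the sketch below sorts an intron sequence into the two classes described above. The function name, the returned labels and the test sequences are assumptions made for this example. Real splice site recognition depends on much more than the two terminal dinucleotides.

```python
def classify_intron(intron):
    """Classify an intron by its first and last two nucleotides."""
    # Accept DNA or RNA input by converting T to U.
    seq = intron.upper().replace("T", "U")
    ends = (seq[:2], seq[-2:])
    if ends == ("GU", "AG"):
        return "GU-AG"   # recognized by the common spliceosome
    if ends == ("AU", "AC"):
        return "AU-AC"   # the AT-AC class when read as DNA
    return "other"


print(classify_intron("GTAAGTCTAACAG"))   # written as DNA
print(classify_intron("AUAUCCUUUUAAC"))   # written as RNA
```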
|Yeast has few introns, with well conserved and consistent splice and branch site sequences.|
|Higher organisms often have multiple introns within a single pre-mRNA, poorly defined splice and branch site sequences, and a more complex regulatory system controlling their selection.|
|Splicing may be constitutive, meaning that the same introns are
always identified and spliced out from a pre-mRNA, resulting in
translation to yield a single protein product. However, higher organisms
make extensive use of alternative splicing to generate
functionally different isoforms of a protein, which are expressed
in particular states of differentiation or development. Regulatory
mechanisms must determine which splice sites are selected.
The number of genes mapped in the human genome (30-40 thousand) turned out to be significantly lower than prior estimates; however about 30% of genes appear to be expressed in multiple isoforms as a result of alternative splicing.
|More rarely, trans-splicing can also generate unique mRNAs by association and linking of exons from different pre-mRNA transcripts.|
In addition to the snRNPs (which consist of RNA and specific associated proteins) a number of accessory protein factors are involved in various stages of the splicing reaction.
|RNA recognition motif (RRM)
||(also known as RBD, RNA binding domain) consists of 4
antiparallel β-sheet strands interspersed with
two α-helices, in the pattern β1-α1-β2-β3-α2-β4.
This fold binds open loops of RNA, in which the core base stack is interrupted, allowing for many specific contacts between bases and the polypeptide. The conserved Phe or Tyr residues are positioned on the surface of the β-sheet structure, where they can stack with the bases in the loop.
(For more details, see RNA-protein interactions.) In addition to splicing factors, CBP20 also carries the RRM domain.
|RS domains||domains rich in the dipeptide repeat Arg-Ser (RS in single-letter code).
RS domains are of major importance in protein-protein contact among splicing factors. The RS domain is a target for phosphorylation, and phosphorylation controls entry of splicing factors into the splicing reaction cycle. Positive Arg will ion pair with negative Ser-phosphate, and vice-versa.
|DExD/H box||also known as the DEAD box (based on the amino acid motif Asp-Glu-x-Asp/His), this is the signature for RNA helicase or unwindase, which is required for several RNA base pairing rearrangements that take place in splicing reactions. Although the transesterification reactions of splicing are ATP independent, the helicase reactions do consume ATP.|
|SR proteins||are a family of proteins in metazoans recognised by a specific mAb raised against spliceosomes, and contain many RS domains: the family includes SRp20, SRp30, SRp40, SRp55, SRp75, ASF (Alternative Splicing Factor) and SC35. In many cases, the N-terminal end carries a typical RRM (RNA recognition motif) common to many RNP-associated proteins. The C-terminus contains multiple RS dipeptides, which may be highly phosphorylated. The enzyme SRPK-1 (SR-protein kinase) is a specific protein kinase, activated during mitosis, that causes redistribution of SR proteins and spliceosomes. The general role of SR proteins seems to be to bridge between other splicing components. This bridging process may then allow binding of factors to sites that are too weak for effective binding of the basal factors U1 and U2.|
|Mammalian U2AF||consists of a 65 kDa binding factor for the polypyrimidine tract, containing RS and RRM domains, plus a 35 kDa SR-type subunit involved in binding to other spliceosomal components. U2AF helps determine the 3' site by acting as a bridging factor between the exons. The 35 kDa subunit is highly conserved in higher organisms, suggesting that its importance is in selection of weaker splice sites. A single protein, Mud2p, carries out a similar role in yeast.|
|U1A, U1C, U1 70k||are structural components of U1 snRNP. U1A and U1 70k are classic RRM stem-loop binding proteins, and U1 70k also contains RS domains needed for protein-protein interaction.|
|Sm proteins||A set of seven proteins that form the common structural core of snRNPs and bind to a conserved sequence, RAUUUUUUGR, in U1, U2, U4 and U5.|
|SF proteins||Splicing factors SF1 and SF3a/b are associated with U2. SF2/ASF (alternative splicing factor) should really be classified as an SR protein. It plays a role in exon selection during alternative splicing, and binds to U1 70k.|
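The DExD/H signature listed in the table above (Asp-Glu-x-Asp/His) is simple enough to express as a pattern search. The following is a minimal sketch: the function name and the test peptide are invented for illustration, and a four-residue match on its own is not evidence of helicase activity, since real helicases are identified by several conserved motifs together.

```python
import re

# D, E, any residue, then D or H (single-letter amino acid code).
DEXDH = re.compile(r"DE.[DH]")


def find_dexdh(protein):
    """Return (position, matched residues) for each DExD/H-like match."""
    return [(m.start(), m.group()) for m in DEXDH.finditer(protein.upper())]


# Hypothetical peptide containing one DEAD and one DEAH match.
print(find_dexdh("MKLVDEADRMLAADEAHQ"))
```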
|1a) Formation of the commitment or E complex involves binding
of factor U1 snRNP (complex of U1 RNA, U1A RRM protein, U1C and U1 70K
protein) to the 5'-intron GU site. Recognition is by base pairing
of the 5' end of U1 snRNA with the consensus sequence at the 5' splice site.
SR accessory factors associate with the upstream exon, on the 5' side of the splice site, and facilitate binding of U1.
1b) This is generally followed by binding of U2 auxiliary factor U2AF (Mud2 in yeast) to the pyrimidine-rich tract between the branch site and the 3'-end of the intron. U2, plus the associated SF3a/b, can then base pair with the metazoan branch point sequence YNYURAY to give the A complex. An additional protein factor, BBP (branch binding protein), binds in the region of the branchpoint A. In yeast, the branch point is more conserved, UACUAAC.
|The branchpoint sequence (BPS) is identified by base pairing
with a section of the U2 snRNA bearing the sequence
5'-GUAGUA-3'. The BPS is
mismatched at a single A, which becomes looped out, exposing its
2'OH. This exposed ribose OH acts as the nucleophile attacking the
phosphodiester bond at the 5' splice site.
When the sequence at the branchpoint deviates from the consensus, associated protein factors such as U2AF are needed to promote complex formation.
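A degenerate consensus such as YNYURAY can be turned into a pattern and scanned along an intron to list candidate branch sites. This is a minimal sketch under stated assumptions: the helper names and the test intron are made up, Y, R and N follow the standard IUPAC codes, and the branch A is taken to be the fixed A of the consensus. As noted above, sites that deviate from the consensus are still used with the help of protein factors, so a sequence match is only a first guess.

```python
import re

# IUPAC nucleotide codes used in the consensus sequences.
IUPAC = {"A": "A", "C": "C", "G": "G", "U": "U",
         "R": "[AG]", "Y": "[CU]", "N": "[ACGU]"}


def consensus_to_regex(consensus):
    """Translate an IUPAC consensus string into a regular expression."""
    return "".join(IUPAC[c] for c in consensus)


def branch_candidates(intron, consensus="YNYURAY"):
    """Return (start, matched site, index of the candidate branch A)."""
    # A lookahead lets overlapping candidate sites all be reported.
    pattern = re.compile("(?=(" + consensus_to_regex(consensus) + "))")
    a_offset = consensus.rindex("A")  # the fixed A in the consensus
    return [(m.start(), m.group(1), m.start() + a_offset)
            for m in pattern.finditer(intron.upper())]


intron = "GUAUGUUACUAACAAAUUUCAG"   # hypothetical intron
print(branch_candidates(intron, "UACUAAC"))   # yeast-type site
print(branch_candidates(intron))              # metazoan consensus
```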
|2a) The U4/U6.U5 tri-snRNP is then recruited to give the B complex. There is some evidence that U4/U6 recruitment to the 5'-splice site can precede U2 assembly at the branchpoint. 2b) Finally, some radical ATP-dependent base pair rearrangements occur to organize the catalytically competent C complex. Two tri-snRNP factors, U5 100p and U5 200p, have been shown to contain DExD/H box domains.|
|U5 first base pairs to the upstream exon and the 5'-splice site, a
process that requires RNA unwindase activity to displace U1 from the
5'-splice site.
U6 base-pairs to U2, resulting in displacement of U4. Finally, U5 base pairs to Exon 2 near the 3'-splice site on the same stem loop that already holds Exon 1, bringing the 3'-OH of Exon 1 into close proximity to the 5'-phosphate of Exon 2.
Exons contain elements called exonic enhancers, which are targets for binding SR and related RRM-containing proteins. The organization that lays out the splicing pattern starts with the cap binding complex of CBP20 and CBP80, and possibly even with the CTD of RNA Pol II (Zeng and Berget, 2000). An array of protein factors, e.g. SC35, bind in a cooperative manner between the cap and the first splice site to define its location. Other SR proteins bridge the intron gap from U1 70k to facilitate U2AF binding, and establish the branch point and 3' splice site. Once the U2 complex is in place, SR proteins link up to the next 5' splice site to continue the process. Thus the pattern of splice sites is established progressively from the 5' cap towards the 3' end, and the spliceosome does not select intron targets for splicing at random.
In metazoans, certain members of the hnRNP (heterogeneous nuclear ribonucleoprotein) class bind to sites located particularly in the introns. These include hnRNP A1, which binds indiscriminately to pre-mRNA and has a negative effect on spliceosome assembly. The function of the enhancers and SR proteins seems to be to exclude hnRNP A1 from the exons, and a gap in the chain of enhancers and SR proteins allows hnRNP A1 to act as a splicing repressor.
Splice site specificity is reasonably conserved across species, allowing expression of transgenes. Occasionally splice sites may be misread: for example, when wild type Green Fluorescent Protein is expressed in higher plants, the polypeptide may be disrupted by misinterpretation of a coding sequence as a plant-specific splicing site.
A 5'-splice site that conforms to the ideal sequence binds U1 without ASF, and may be considered a strong site. If the sequence deviates from the ideal, the intrinsic affinity for U1 is weak, and the assistance of active ASF/SF2 is needed for U1 binding. ASF/SF2 acts as an antagonist of hnRNP A1, which is an indiscriminate splicing repressor and can cause 5' splice sites to be skipped (Eperon et al., 2000). This results in intron retention, which may further control polypeptide expression due to inclusion of premature stop codons in the mRNA.
In many cases, splicing regulation controls a pair of mutually exclusive exons,
e.g. in variants of the muscle proteins tropomyosin and α-actinin, where the different sequences confer different regulatory properties.
|Non-muscle tropomyosin skips exon 2 due to repression by the polypyrimidine tract binding protein (PTB). For α-actinin, however, PTB causes skipping of exon 3 (Southby et al., 1999). The different behaviours are due to different strengths of the branch point site. For tropomyosin, the presence of higher levels of PTB in non-muscle cells represses the branchpoint for exon 2, so exon 3 is selected.|
The same activity of PTB has the opposite effect on α-actinin. In this case, an intrinsically strong branchpoint site at exon 3 is repressed by PTB in non-muscle cells. In smooth muscle, where exon 3 is accessible, it appears to outcompete exon 2 for branch-site factors.
Polyadenylation involves two steps:
cleavage of the pre-mRNA at the cleavage site;
polyadenylate extension of the new 3'-end.
The polyA tail increases the efficiency of translation initiation.
Cleavage and Polyadenylation Specificity Factor (CPSF): a protein with 160 kDa and 30 kDa (zinc finger) RNA binding subunits, which binds to the AAUAAA signal upstream of the cleavage site. Additional 73 kDa and 100 kDa subunits do not contact RNA and their function is unknown.
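Locating the AAUAAA signal in a transcript is a plain substring search. The sketch below is illustrative only: the function name and test sequences are assumptions, and a real 3'-end prediction would also have to consider the cleavage site and the other factors involved, not just the hexamer.

```python
def find_polya_signals(rna, signal="AAUAAA"):
    """Return the 0-based start index of every occurrence of the signal."""
    # Accept DNA or RNA input by converting T to U.
    seq = rna.upper().replace("T", "U")
    hits = []
    i = seq.find(signal)
    while i != -1:
        hits.append(i)
        i = seq.find(signal, i + 1)  # step by one so overlaps are not missed
    return hits


print(find_polya_signals("GGCAAUAAAGCUUCGACUGCA"))
```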
|Yeast lacks a distinct AAUAAA sequence, and polyadenylation is
associated with less well defined A/U-rich and A-rich sites.
The CPSF equivalents are Cft1, Cft2 and Yth1. A factor comparable to CstF,
Rna15/Rna14/Pfs2, binds upstream of CPSF. No factors appear to be
involved in binding downstream of the cleavage site, as for mammalian CstF.
Cleavage factors PfsIA, IB are involved in cleavage, but functions do not correspond exactly to the mammalian counterparts.
|As splicing progresses from the cap site towards the polyadenylation
site, the exposed single-stranded RNA binds the protein factor hnRNP
A1. The absence of bound splicing factors appears to mark fully mature
pre-mRNA molecules which are ready to leave the nucleus.
Sequences on the bound polypeptides act as nuclear export signals (NES) or nuclear localization signals (NLS), which are required for transfer across the nuclear pore complex.
More to come about nuclear transport next week.
Eperon, I.C. et al. (2000). Selection of alternative 5' splice sites: Role of U1 snRNP and models for the antagonistic effects of SF2/ASF and hnRNP A1. Molecular and Cellular Biology 20: 8303-8318.
Hastings, M.L. and Krainer, A.R. (2001). Pre-mRNA splicing in the new millennium. Current Opinion in Cell Biology 13: 302-309.
Reed, R. (2000). Mechanisms of fidelity in pre-mRNA splicing. Current Opinion in Cell Biology 12: 340-345.
Shatkin, A.J. and Manley, J.L. (2000). The ends of the affair: Capping and polyadenylation. Nature Structural Biology 7: 838-842.
Southby, J., Gooding, C. and Smith, C.W.J. (1999). Polypyrimidine tract binding protein functions as a repressor to regulate alternative splicing of α-actinin mutually exclusive exons. Molecular and Cellular Biology 19: 2699-2700.
Tarn, W-Y. and Steitz, J. (1997). Pre-mRNA splicing: the discovery of a new spliceosome doubles the challenge. Trends in Biochemical Sciences 22: 132-137.
Will, C.L. and Lührmann, R. (2001). Spliceosomal UsnRNP biogenesis, structure and function. Current Opinion in Cell Biology 13: 290-301.
Zeng, C. and Berget, S.M. (2000). Participation of C-terminal domain of RNA Pol II in exon definition during pre-mRNA splicing. Molecular and Cellular Biology 20: 8290-8301.