Nucleic Acids Research

RSS feed of recent issues (covers the latest 3 issues, including the current issue)

Predictive biophysical modeling and understanding of the dynamics of mRNA translation and its evolution

Mon, 2016-10-31 07:42

mRNA translation is the fundamental process in which the ribosome decodes the information encoded in mRNA molecules to synthesize proteins. The centrality of this process in various biomedical disciplines, such as cell biology, evolution and biotechnology, has encouraged the development of dozens of mathematical and computational models of translation in recent years. These models aim to capture various biophysical aspects of the process. The objective of this review is to survey these models, focusing on those based and/or validated on real large-scale genomic data. We consider aspects such as the complexity of the models, the biophysical aspects they address and the predictions they may provide. Furthermore, we survey the central systems biology discoveries reported on their basis. This review demonstrates the fundamental advantages of employing computational biophysical translation models in general, and discusses the relative advantages of the different approaches and the challenges in the field.

Identifying proteins that bind to specific RNAs - focus on simple repeat expansion diseases

Mon, 2016-10-31 07:42

RNA–protein complexes play a central role in the regulation of fundamental cellular processes, such as mRNA splicing, localization, translation and degradation. The misregulation of these interactions can cause a variety of human diseases, including cancer and neurodegenerative disorders. Recently, many strategies have been developed to comprehensively analyze these complex and highly dynamic RNA–protein networks. Extensive efforts have been made to purify in vivo-assembled RNA–protein complexes. In this review, we focused on commonly used RNA-centric approaches that involve mass spectrometry, which are powerful tools for identifying proteins bound to a given RNA. We present various RNA capture strategies that primarily depend on whether the RNA of interest is modified. Moreover, we briefly discuss the advantages and limitations of in vitro and in vivo approaches. Furthermore, we describe recent advances in quantitative proteomics as well as the methods that are most commonly used to validate robust mass spectrometry data. Finally, we present approaches that have successfully identified expanded repeat-binding proteins, which present abnormal RNA–protein interactions that result in the development of many neurological diseases.

Incorporating a guanidine-modified cytosine base into triplex-forming PNAs for the recognition of a C-G pyrimidine-purine inversion site of an RNA duplex

Mon, 2016-10-31 07:42

RNA duplex regions are often involved in tertiary interactions and protein binding and thus there is great potential in developing ligands that sequence-specifically bind to RNA duplexes. We have developed a convenient synthesis method for a modified peptide nucleic acid (PNA) monomer with a guanidine-modified 5-methyl cytosine base. We demonstrated by gel electrophoresis, fluorescence and thermal melting experiments that short PNAs incorporating the modified residue show high binding affinity and sequence specificity in the recognition of an RNA duplex containing an internal inverted Watson-Crick C-G base pair. Remarkably, the relatively short PNAs show no appreciable binding to DNA duplexes or single-stranded RNAs. The attached guanidine group stabilizes the base triple through hydrogen bonding with the G base in a C-G pair. Selective binding towards an RNA duplex over a single-stranded RNA can be rationalized by the fact that alkylation of the amine of a 5-methyl C base blocks the Watson–Crick edge. PNAs incorporating multiple guanidine-modified cytosine residues are able to enter HeLa cells without any transfection agent.

G-quadruplex prediction in E. coli genome reveals a conserved putative G-quadruplex-Hairpin-Duplex switch

Mon, 2016-10-31 07:42

Many studies show that short non-coding sequences are widely conserved among regulatory elements. More and more conserved sequences have been discovered since the development of next-generation sequencing technology. A common approach to identifying conserved sequences with regulatory roles relies on topological changes such as hairpin formation at the DNA or RNA level. G-quadruplexes, non-canonical nucleic acid topologies with few established biological roles, are increasingly considered for conserved regulatory element discovery. Since the tertiary structure of G-quadruplexes is strongly dependent on the loop sequence, which is disregarded by the generally accepted algorithm, we hypothesized that G-quadruplexes with similar topology and, indirectly, similar interaction patterns, can be determined using phylogenetic clustering based on differences in the loop sequences. Phylogenetic analysis of 52 G-quadruplex-forming sequences in the Escherichia coli genome revealed two conserved G-quadruplex motifs with a potential regulatory role. Further analysis revealed that both motifs tend to form hairpins and G-quadruplexes, as supported by circular dichroism studies. The phylogenetic analysis as described in this work can greatly improve the discovery of functional G-quadruplex structures and may explain unknown regulatory patterns.
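The abstract does not name the "generally accepted algorithm". A minimal sketch, assuming it refers to the widely used pattern search for four runs of at least three guanines separated by loops of 1 to 7 nucleotides, shows how the three loop sequences of each hit could be pulled out for downstream comparison. The function name `find_g4_loops` and the example sequence are illustrative only and are not taken from the paper.

```python
import re

# Assumed canonical motif: four G-runs (>= 3 G each) separated by three loops of 1-7 nt.
# Loops are matched lazily so that each loop is as short as the pattern allows.
G4_PATTERN = re.compile(
    r"(G{3,})([ACGT]{1,7}?)(G{3,})([ACGT]{1,7}?)(G{3,})([ACGT]{1,7}?)(G{3,})"
)


def find_g4_loops(seq):
    """Return (start, end, loops) for each non-overlapping motif match on the given strand."""
    seq = seq.upper()
    hits = []
    for match in G4_PATTERN.finditer(seq):
        loops = (match.group(2), match.group(4), match.group(6))
        hits.append((match.start(), match.end(), loops))
    return hits


if __name__ == "__main__":
    # Hypothetical test sequence, not a sequence from the E. coli genome.
    for start, end, loops in find_g4_loops("ACGGGTTAGGGTTAGGGTTAGGGAC"):
        print(start, end, loops)
```

The sketch scans only the strand it is given and reports non-overlapping matches. The extracted loop tuples are the kind of input that a loop-based clustering step, as hypothesized in the abstract, would need.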

DNA-binding protects p53 from interactions with cofactors involved in transcription-independent functions

Mon, 2016-10-31 07:42

Binding-induced conformational changes of a protein at regions distant from the binding site may play crucial roles in protein function and regulation. The p53 tumour suppressor is an example of such an allosterically regulated protein. Little is known, however, about how DNA binding can affect distal sites for transcription factors. Furthermore, the molecular details of how a local perturbation is transmitted through a protein structure are generally elusive and occur on timescales hard to explore by simulations. Thus, we employed state-of-the-art enhanced sampling atomistic simulations to unveil DNA-induced effects on p53 structure and dynamics that modulate the recruitment of cofactors and the impact of phosphorylation at Ser215. We show that DNA interaction promotes a conformational change in a region 3 nm away from the DNA binding site. Specifically, binding to DNA increases the population of an occluded minor state at this distal site by more than 4-fold, whereas phosphorylation traps the protein in its major state. In the minor conformation, the interface of p53 that binds biological partners related to p53 transcription-independent functions is not accessible. Significantly, our study reveals a mechanism of DNA-mediated protection of p53 from interactions with partners involved in the p53 transcription-independent signalling. This also suggests that conformational dynamics is tightly related to p53 signalling.

Maps of context-dependent putative regulatory regions and genomic signal interactions

Mon, 2016-10-31 07:42

Gene transcription is regulated mainly by transcription factors (TFs). ENCODE and Roadmap Epigenomics provide global binding profiles of TFs, which can be used to identify regulatory regions. To this end we implemented a method to systematically construct cell-type and species-specific maps of regulatory regions and TF–TF interactions. We illustrated the approach by developing maps for five human cell-lines and two other species. We detected ~144k putative regulatory regions among the human cell-lines, with the majority of them being ~300 bp. We found ~20k putative regulatory elements in the ENCODE heterochromatic domains suggesting a large regulatory potential in the regions presumed transcriptionally silent. Among the most significant TF interactions identified in the heterochromatic regions were CTCF and the cohesin complex, which is in agreement with previous reports. Finally, we investigated the enrichment of the obtained putative regulatory regions in the 3D chromatin domains. More than 90% of the regions were discovered in the 3D contacting domains. We found a significant enrichment of GWAS SNPs in the putative regulatory regions. These significant enrichments provide evidence that the regulatory regions play a crucial role in the genomic structural stability. Additionally, we generated maps of putative regulatory regions for prostate and colorectal cancer human cell-lines.

Long-range correlations in the mechanics of small DNA circles under topological stress revealed by multi-scale simulation

Mon, 2016-10-31 07:42

It is well established that gene regulation can be achieved through activator and repressor proteins that bind to DNA and switch particular genes on or off, and that complex metabolic networks determine the levels of transcription of a given gene at a given time. Using three complementary computational techniques to study the sequence-dependence of DNA denaturation within DNA minicircles, we have observed that whenever the ends of the DNA are constrained, information can be transferred over long distances directly by the transmission of mechanical stress through the DNA itself, without any requirement for external signalling factors. Our models combine atomistic molecular dynamics (MD) with coarse-grained simulations and statistical mechanical calculations to span three distinct spatial resolutions and timescale regimes. While they give a consensus view of the non-locality of sequence-dependent denaturation in highly bent and supercoiled DNA loops, each also reveals a unique aspect of long-range informational transfer that occurs as a result of restraining the DNA within the closed loop of the minicircles.

Chromatin structure-dependent conformations of the H1 CTD

Mon, 2016-10-31 07:42

Linker histones are an integral component of chromatin but how these proteins promote assembly of chromatin fibers and higher order structures and regulate gene expression remains an open question. Using Förster resonance energy transfer (FRET) approaches we find that association of a linker histone with oligonucleosomal arrays induces condensation of the intrinsically disordered H1 CTD in a manner consistent with adoption of a defined fold or ensemble of folds in the bound state. However, H1 CTD structure when bound to nucleosomes in arrays is distinct from that induced upon H1 association with mononucleosomes or bare double stranded DNA. Moreover, the H1 CTD becomes more condensed upon condensation of extended nucleosome arrays to the contacting zig-zag form found in moderate salts, but does not detectably change during folding to fully compacted chromatin fibers. We provide evidence that linker DNA conformation is a key determinant of H1 CTD structure and that constraints imposed by neighboring nucleosomes cause linker DNAs to adopt distinct trajectories in oligonucleosomes compared to H1-bound mononucleosomes. Finally, inter-molecular FRET between H1s within fully condensed nucleosome arrays suggests a regular spatial arrangement for the H1 CTD within the 30 nm chromatin fiber.

Optimizing sgRNA position markedly improves the efficiency of CRISPR/dCas9-mediated transcriptional repression

Wed, 2016-10-12 15:43

CRISPR interference (CRISPRi) represents a newly developed tool for targeted gene repression. It has great application potential for studying gene function and mapping gene regulatory elements. However, the optimal parameters for efficient single guide RNA (sgRNA) design for CRISPRi are not fully defined. In this study, we systematically assessed how sgRNA position affects the efficiency of CRISPRi in human cells. We analyzed 155 sgRNAs targeting 41 genes and found that CRISPRi efficiency relies heavily on the precise recruitment of the effector complex to the target gene transcription start site (TSS). Importantly, we demonstrate that the FANTOM5/CAGE promoter atlas represents the most reliable source of TSS annotations for this purpose. We also show that the proximity to the FANTOM5/CAGE-defined TSS predicts sgRNA functionality on a genome-wide scale. Moreover, we found that once the correct TSS is identified, CRISPRi efficiency can be further improved by considering sgRNA sequence preferences. Lastly, we demonstrate that CRISPRi sgRNA functionality largely depends on the chromatin accessibility of a target site, with high efficiency focused in the regions of open chromatin. In summary, our work provides a framework for efficient CRISPRi assay design based on functionally defined TSSs and features of the target site chromatin.
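As an illustration of the position-based design idea, the following sketch ranks candidate sgRNAs by their strand-aware distance to an annotated TSS. It is a simplification under stated assumptions: the `Guide` type, the function names and the coordinates are hypothetical, and the abstract gives no distance thresholds, so none are applied here.

```python
from typing import NamedTuple


class Guide(NamedTuple):
    name: str      # hypothetical sgRNA identifier
    position: int  # genomic coordinate of the target site (same chromosome as the TSS)


def signed_distance_to_tss(position, tss, strand):
    """Distance in bp from the TSS; positive means downstream in the direction of transcription."""
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    offset = position - tss
    return offset if strand == "+" else -offset


def rank_by_tss_proximity(guides, tss, strand):
    """Sort guides by absolute distance to the TSS, closest first."""
    return sorted(
        guides,
        key=lambda g: abs(signed_distance_to_tss(g.position, tss, strand)),
    )


if __name__ == "__main__":
    # Made-up coordinates for a gene on the minus strand.
    candidates = [Guide("g1", 1500), Guide("g2", 1040), Guide("g3", 900)]
    for guide in rank_by_tss_proximity(candidates, tss=1000, strand="-"):
        print(guide.name, signed_distance_to_tss(guide.position, 1000, "-"))
```

In practice the TSS coordinate would come from the chosen annotation source, and sequence preferences and chromatin accessibility, which the abstract also reports as relevant, would be scored separately.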

Absolute quantitative measurement of transcriptional kinetic parameters in vivo

Wed, 2016-10-12 15:43

mRNA expression involves transcription initiation, elongation and degradation. In cells, these dynamic processes are highly regulated. However, experimental characterization of the dynamic processes in vivo is difficult due to the paucity of methods capable of direct measurements. We present a highly sensitive and versatile method enabling direct characterization of the dynamic processes. Our method is based on single-molecule fluorescence in situ hybridization (smFISH) and quantitative analyses of hybridization signals. We hybridized multiple probes labelled with spectrally distinct fluorophores to multiple sub-regions of single mRNAs, and visualized the kinetics of synthesis and degradation of the sub-regions. Quantitative analyses of the data lead to absolute quantification of the lag time of mRNA induction (the time it takes for external signals to activate transcription initiation), transcription initiation rate, transcription elongation speed (i.e. mRNA chain-growth speed), the rate of premature termination of transcripts and degradation rates. Applying our method to three different biological problems, we demonstrated how it can reveal dynamics of mRNA expression that were previously difficult to study. We expect such absolute quantification can greatly facilitate understanding of gene expression and its regulation working at the levels of transcriptional initiation, elongation and degradation.

Investigating essential gene function in Mycobacterium tuberculosis using an efficient CRISPR interference system

Wed, 2016-10-12 15:43

Despite many methodological advances that have facilitated investigation of Mycobacterium tuberculosis pathogenesis, analysis of essential gene function in this slow-growing pathogen remains difficult. Here, we describe an optimized CRISPR-based method to inhibit expression of essential genes based on the inducible expression of an enzymatically inactive Cas9 protein together with gene-specific guide RNAs (CRISPR interference). Using this system to target several essential genes of M. tuberculosis, we achieved marked inhibition of gene expression resulting in growth inhibition, changes in susceptibility to small molecule inhibitors and disruption of normal cell morphology. Analysis of expression of genes containing sequences similar to those targeted by individual guide RNAs did not reveal significant off-target effects. Advantages of this approach include the ability to compare inhibited gene expression to native levels of expression, lack of the need to alter the M. tuberculosis chromosome, the potential to titrate the extent of transcription inhibition, and the ability to avoid off-target effects. Based on the consistent inhibition of transcription and the simple cloning strategy described in this work, CRISPR interference provides an efficient approach to investigate essential gene function that may be particularly useful in characterizing genes of unknown function and potential targets for novel small molecule inhibitors.

Joint Bayesian inference of risk variants and tissue-specific epigenomic enrichments across multiple complex human diseases

Wed, 2016-10-12 15:43

Genome-wide association studies (GWAS) provide a powerful approach for uncovering disease-associated variants in humans, but fine-mapping the causal variants remains a challenge. This is partly remedied by prioritization of disease-associated variants that overlap GWAS-enriched epigenomic annotations. Here, we introduce a new Bayesian model RiVIERA (Risk Variant Inference using Epigenomic Reference Annotations) for inference of driver variants from summary statistics across multiple traits using hundreds of epigenomic annotations. In simulation, RiVIERA showed promising power in detecting causal variants and causal annotations, and the multi-trait joint inference further improved the detection power. We applied RiVIERA to model the existing GWAS summary statistics of 9 autoimmune diseases and Schizophrenia by jointly harnessing the potential causal enrichments among 848 tissue-specific epigenomics annotations from ENCODE/Roadmap consortium covering 127 cell/tissue types and 8 major epigenomic marks. RiVIERA identified meaningful tissue-specific enrichments for enhancer regions defined by H3K4me1 and H3K27ac for Blood T-Cell specifically in the nine autoimmune diseases and Brain-specific enhancer activities exclusively in Schizophrenia. Moreover, the variants from the 95% credible sets exhibited high conservation and enrichments for GTEx whole-blood eQTLs located within transcription-factor-binding-sites and DNA-hypersensitive-sites. Furthermore, jointly modeling the nine immune traits by simultaneously inferring and exploiting the underlying epigenomic correlation between traits further improved the functional enrichments compared to single-trait models.
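For readers unfamiliar with the term, a 95% credible set is commonly built by ranking the variants in a locus by posterior probability and keeping the top ones until their cumulative probability reaches 0.95. The sketch below illustrates that common construction only. It is not RiVIERA's implementation, and the function name, variant labels and probabilities are hypothetical.

```python
def credible_set(posteriors, level=0.95):
    """Smallest set of variants whose cumulative posterior probability reaches `level`.

    `posteriors` maps variant identifiers to posterior probabilities within one locus;
    values are renormalized so that they sum to 1 before accumulation.
    """
    total = sum(posteriors.values())
    if total <= 0:
        raise ValueError("posterior probabilities must sum to a positive value")
    ranked = sorted(posteriors.items(), key=lambda kv: kv[1], reverse=True)
    chosen, cumulative = [], 0.0
    for variant, prob in ranked:
        chosen.append(variant)
        cumulative += prob / total
        if cumulative >= level:
            break
    return chosen


if __name__ == "__main__":
    # Made-up posterior probabilities for four variants in one locus.
    locus = {"v1": 0.60, "v2": 0.30, "v3": 0.06, "v4": 0.04}
    print(credible_set(locus))
```

Taking variants in descending order of probability yields the fewest variants needed to reach the requested level, which is why the set shrinks as the posterior concentrates on a few candidates.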

Global transcript structure resolution of high gene density genomes through multi-platform data integration

Wed, 2016-10-12 15:43

Annotation of herpesvirus genomes has traditionally been undertaken through the detection of open reading frames and other genomic motifs, supplemented with sequencing of individual cDNAs. Second generation sequencing and high-density microarray studies have revealed vastly greater herpesvirus transcriptome complexity than is captured by existing annotation. The pervasive nature of overlapping transcription throughout herpesvirus genomes, however, poses substantial problems in resolving transcript structures using these methods alone. We present an approach that combines the unique attributes of Pacific Biosciences Iso-Seq long-read, Illumina short-read and deepCAGE (Cap Analysis of Gene Expression) sequencing to globally resolve polyadenylated isoform structures in replicating Epstein-Barr virus (EBV). Our method, Transcriptome Resolution through Integration of Multi-platform Data (TRIMD), identifies nearly 300 novel EBV transcripts, quadrupling the size of the annotated viral transcriptome. These findings illustrate an array of mechanisms through which EBV achieves functional diversity in its relatively small, compact genome including programmed alternative splicing (e.g. across the IR1 repeats), alternative promoter usage by LMP2 and other latency-associated transcripts, intergenic splicing at the BZLF2 locus, and antisense transcription and pervasive readthrough transcription throughout the genome.

Intersection of calorie restriction and magnesium in the suppression of genome-destabilizing RNA-DNA hybrids

Wed, 2016-10-12 15:43

Dietary calorie restriction is a broadly acting intervention that extends the lifespan of various organisms from yeast to mammals. On another front, magnesium (Mg2+) is an essential biological metal critical to fundamental cellular processes and is commonly used as both a dietary supplement and treatment for some clinical conditions. Whether connections exist between calorie restriction and Mg2+ is unknown. Here, we show that Mg2+, acting alone or in response to dietary calorie restriction, allows eukaryotic cells to combat genome-destabilizing and lifespan-shortening accumulations of RNA–DNA hybrids, or R-loops. In an R-loop accumulation model of Pbp1-deficient Saccharomyces cerevisiae, magnesium ions guided by cell membrane Mg2+ transporters Alr1/2 act via Mg2+-sensitive R-loop suppressors Rnh1/201 and Pif1 to restore R-loop suppression, ribosomal DNA stability and cellular lifespan. Similarly, human cells deficient in ATXN2, the human ortholog of Pbp1, exhibit nuclear R-loop accumulations repressible by Mg2+ in a process that is dependent on the TRPM7 Mg2+ transporter and the RNaseH1 R-loop suppressor. Thus, we identify Mg2+ as a biochemical signal of beneficial calorie restriction, reveal an R-loop suppressing function for human ATXN2 and propose that practical magnesium supplementation regimens can be used to combat R-loop accumulation linked to the dysfunction of disease-linked human genes.

Force and twist dependence of RepC nicking activity on torsionally-constrained DNA molecules

Wed, 2016-10-12 15:43

Many bacterial plasmids replicate by an asymmetric rolling-circle mechanism that requires sequence-specific recognition for initiation, nicking of one of the template DNA strands and unwinding of the duplex prior to subsequent leading strand DNA synthesis. Nicking is performed by a replication-initiation protein (Rep) that directly binds to the plasmid double-stranded origin and remains covalently bound to its substrate 5'-end via a phosphotyrosine linkage. It has been proposed that the inverted DNA sequences at the nick site form a cruciform structure that facilitates DNA cleavage. However, the role of Rep proteins in the formation of this cruciform and the implications for their nicking and religation functions are unclear. Here, we have used magnetic tweezers to directly measure the DNA nicking and religation activities of RepC, the replication initiator protein of plasmid pT181, in plasmid-sized and torsionally-constrained linear DNA molecules. Nicking by RepC occurred only in negatively supercoiled DNA and was force- and twist-dependent. Comparison with a type IB topoisomerase in similar experiments highlighted a relatively inefficient religation activity of RepC. Based on the structural modeling of RepC and on our experimental evidence, we propose a model where RepC nicking activity is passive and dependent upon the supercoiling degree of the DNA substrate.

Antibiotic resistance evolved via inactivation of a ribosomal RNA methylating enzyme

Wed, 2016-10-12 15:43

Modifications of the bacterial ribosome regulate the function of the ribosome and modulate its susceptibility to antibiotics. By modifying a highly conserved adenosine A2503 in 23S rRNA, methylating enzyme Cfr confers resistance to a range of ribosome-targeting antibiotics. The same adenosine is also methylated by RlmN, an enzyme widely distributed among bacteria. While RlmN modifies C2, Cfr modifies the C8 position of A2503. The shared nucleotide substrate and phylogenetic relationship between RlmN and Cfr prompted us to investigate the evolutionary origin of antibiotic resistance in this enzyme family. Using directed evolution of RlmN under antibiotic selection, we obtained RlmN variants that mediate low-level resistance. Surprisingly, these variants confer resistance not through the Cfr-like C8 methylation, but via inhibition of the endogenous RlmN C2 methylation of A2503. Detection of RlmN inactivating mutations in clinical resistance isolates suggests that the mechanism used by the in vitro evolved variants is also relevant in a clinical setting. Additionally, as indicated by a phylogenetic analysis, it appears that Cfr did not diverge from the RlmN family but from another distinct family of predicted radical SAM methylating enzymes whose function remains unknown.

Mammalian PNLDC1 is a novel poly(A) specific exonuclease with discrete expression during early development

Wed, 2016-10-12 15:43

PNLDC1 is a homologue of poly(A) specific ribonuclease (PARN), a known deadenylase with an additional role in processing of non-coding RNAs. Both enzymes were reported recently to participate in piRNA biogenesis in silkworm and C. elegans, respectively. To gain insight into the role of mammalian PNLDC1, we characterized the human and mouse enzymes. PNLDC1 shows limited conservation compared to PARN and represents an evolutionarily related but distinct group of enzymes. It is expressed specifically in mouse embryonic stem cells, human and mouse testes and during early mouse embryo development, while it fades during differentiation. Its expression in differentiated cells is suppressed through methylation of its promoter by the de novo methyltransferase DNMT3B. Both enzymes are localized mainly in the ER and exhibit in vitro specificity restricted solely to 3' RNA or DNA polyadenylates. Knockdown of Pnldc1 in mESCs and subsequent NGS analysis showed that although the expression of the remaining deadenylases remains unaffected, it affects genes involved mainly in reprogramming, cell cycle and translational regulation. Mammalian PNLDC1 is a novel deadenylase expressed specifically in cell types which share regulatory mechanisms required for multipotency maintenance. Moreover, it could be involved both in posttranscriptional regulation through deadenylation and genome surveillance during early development.

Crossover-site sequence and DNA torsional stress control strand interchanges by the Bxb1 site-specific serine recombinase

Wed, 2016-10-12 15:43

DNA segment exchange by site-specific serine recombinases (SRs) is thought to proceed by rigid-body rotation of the two halves of the synaptic complex, following the cleavages that create the two pairs of exchangeable ends. It remains unresolved how the amount of rotation occurring between cleavage and religation is controlled. We report single-DNA experiments for Bxb1 integrase, a model SR, where dynamics of individual synapses were observed, using relaxation of supercoiling to report on cleavage and rotation events. Relaxation events often consist of multiple rotations, with the number of rotations per relaxation event and rotation velocity sensitive to DNA sequence at the center of the recombination crossover site, torsional stress and salt concentration. Bulk and single-DNA experiments indicate that the thermodynamic stability of the annealed, but cleaved, crossover sites controls ligation efficiency of recombinant and parental synaptic complexes, regulating the number of rotations during a breakage-religation cycle. The outcome is consistent with a ‘controlled rotation’ model analogous to that observed for type IB topoisomerases, with religation probability varying in accord with DNA base-pairing free energies at the crossover site. Significantly, we find no evidence for a special regulatory mechanism favoring ligation and product release after a single 180° rotation.

The alternative splicing program of differentiated smooth muscle cells involves concerted non-productive splicing of post-transcriptional regulators

Wed, 2016-10-12 15:43

Alternative splicing (AS) is a key component of gene expression programs that drive cellular differentiation. Smooth muscle cells (SMCs) are important in the function of a number of physiological systems; however, investigation of SMC AS has been restricted to a handful of events. We profiled transcriptome changes in mouse de-differentiating SMCs and observed changes in hundreds of AS events. Exons included in differentiated cells were characterized by particularly weak splice sites and by upstream binding sites for Polypyrimidine Tract Binding protein (PTBP1). Consistent with this, knockdown experiments showed that PTBP1 represses many smooth muscle specific exons. We also observed coordinated splicing changes predicted to downregulate the expression of core components of U1 and U2 snRNPs, splicing regulators and other post-transcriptional factors in differentiated cells. The levels of cognate proteins were lower or similar in differentiated compared to undifferentiated cells. However, levels of snRNAs did not follow the expression of splicing proteins, and in the case of U1 snRNP we saw reciprocal changes in the levels of U1 snRNA and U1 snRNP proteins. Our results suggest that the AS program in differentiated SMCs is orchestrated by the combined influence of auxiliary RNA binding proteins, such as PTBP1, along with altered activity and stoichiometry of the core splicing machinery.

The complete chemical structure of Saccharomyces cerevisiae rRNA: partial pseudouridylation of U2345 in 25S rRNA by snoRNA snR9

Wed, 2016-10-12 15:43

We present the complete chemical structures of the rRNAs from the eukaryotic model organism, Saccharomyces cerevisiae. The final structures, as determined with mass spectrometry-based methodology that includes a stable isotope-labelled, non-modified reference RNA, contain 112 sites with 12 different post-transcriptional modifications, including a previously unidentified pseudouridine at position 2345 in 25S rRNA. Quantitative mass spectrometry-based stoichiometric analysis of the different modifications at each site indicated that 94 sites were almost fully modified, whereas the remaining 18 sites were modified to a lesser extent. Superimposed three-dimensional modification maps for S. cerevisiae and Schizosaccharomyces pombe rRNAs confirmed that most of the modified nucleotides are located in functionally important interior regions of the ribosomes. We identified snR9 as the snoRNA responsible for pseudouridylation of U2345 and showed that this pseudouridylation occurs co-transcriptionally and competitively with 2'-O-methylation of U2345. This study ends the uncertainty concerning whether all modified nucleotides in S. cerevisiae rRNAs have been identified and provides a resource for future structural, functional and biogenesis studies of the eukaryotic ribosome.