Friday, April 2, 2010

What's special about the GpU dinucleotide platform?

Recently, I (together with Drs. Wilma Olson and Harmen Bussemaker – a team with a unique combination of complementary expertise) published a new article in Nucleic Acids Research (NAR): "The RNA backbone plays a crucial role in mediating the intrinsic stability of the GpU dinucleotide platform and the GpUpA/GpA miniduplex". The key findings of this work are summarized in the abstract:
The side-by-side interactions of nucleobases contribute to the organization of RNA, forming the planar building blocks of helices and mediating chain folding. Dinucleotide platforms, formed by side-by-side pairing of adjacent bases, frequently anchor helices against loops. Surprisingly, GpU steps account for over half of the dinucleotide platforms observed in RNA-containing structures. Why GpU should stand out from other dinucleotides in this respect is not clear from the single well-characterized H-bond found between the guanine N2 and the uracil O4 groups. Here, we describe how an RNA-specific H-bond between O2'(G) and O2P(U) adds to the stability of the GpU platform. Moreover, we show how this pair of oxygen atoms forms an out-of-plane backbone ‘edge’ that is specifically recognized by a non-adjacent guanine in over 90% of the cases, leading to the formation of an asymmetric miniduplex consisting of ‘complementary’ GpUpA and GpA subunits. Together, these five nucleotides constitute the conserved core of the well-known loop-E motif. The backbone-mediated intrinsic stabilities of the GpU dinucleotide platform and the GpUpA/GpA miniduplex plausibly underlie observed evolutionary constraints on base identity. We propose that they may also provide a reason for the extreme conservation of GpU observed at most 5'-splice sites.

As a nice surprise, this publication was selected by NAR as a featured article! According to the NAR website:
Featured Articles highlight the best papers published in NAR. These articles are chosen by the Executive Editors on the recommendation of Editorial Board Members and Referees. They represent the top 5% of papers in terms of originality, significance and scientific excellence.

I feel very gratified by the "extra" recognition. From my own perspective, I can easily rank this paper as the top one in my publication list: from the very beginning, I have been struck by the simplicity and elegance of the GpU story. Hopefully, time will verify the validity of this scientific contribution.

Under the hood, though, there is a long, complex (sometimes perplexing), yet interesting story associated with this work. Here is how it got started. While writing the 3DNA 2008 Nature Protocols (NP) paper, I selected the (previously undocumented) "-p" option of "find_pair" to showcase its capability to identify higher-order base associations, using the large ribosomal subunit (1JJ2) as an example. I noticed the unexpected O2'(G)⋅⋅⋅O2P(U) H-bond within the GpU dinucleotide platform in the pentaplet shown left in Figure A below. I was well aware of Leontis and Westhof's pioneering work on "Geometric nomenclature and classification of RNA base pairs", which involves three distinct edges – the Watson-Crick edge, the Hoogsteen edge, and the Sugar edge – yet does not take into consideration possible sugar-phosphate backbone interactions (Figure B below). So I decided to double-check, just to be sure that the H-bond was not a spurious result of defects in the H-bond detection scheme of "find_pair", and the results were very surprising.

The following section was re-added into the 3DNA NP paper in the very last revision:
It is also worth noting that the G1971–U1972 platform is stabilized not only by the well-characterized G(N2)⋅⋅⋅U(O4) H-bond interaction, but also by a little-noticed G(O2’)⋅⋅⋅U(O2P) sugar-phosphate backbone interaction (Fig. 6a). Examination of the 50S large ribosomal unit (1JJ2) alone reveals ten such double H-bonded G–U platforms, far more occurrences than those registered by any other dinucleotide platform (including A–A) in this structure. Apparently, the G–U platform is more stable than other platforms with only a single base–base H-bond interaction. We are currently investigating this overrepresented G–U dinucleotide platform in other RNA structures. (p.1226)
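The double H-bond pattern described in the excerpt above can be checked by hand with a simple distance test. The following is only a minimal sketch under stated assumptions, not the H-bond detection scheme used by "find_pair": the function names `parse_atoms` and `platform_hbonds` are hypothetical, and the 3.5 Å donor-acceptor cutoff is an illustrative choice. It reads standard fixed-column PDB ATOM records and reports which of the two contacts, G(N2)⋅⋅⋅U(O4) and G(O2')⋅⋅⋅U(O2P), fall within the cutoff.

```python
import math

# Older PDB files name these atoms O2* / O2P; remediated files use O2' / OP2.
# Map the old names onto the new ones so both file styles work.
ALIASES = {"O2*": "O2'", "O2P": "OP2", "O1P": "OP1"}


def parse_atoms(pdb_lines):
    """Return {(chain, resseq, atom_name): (x, y, z)} from ATOM/HETATM records."""
    atoms = {}
    for line in pdb_lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        name = line[12:16].strip()          # atom name, columns 13-16
        name = ALIASES.get(name, name)
        chain = line[21]                    # chain identifier, column 22
        resseq = int(line[22:26])           # residue number, columns 23-26
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        atoms[(chain, resseq, name)] = xyz
    return atoms


def platform_hbonds(atoms, chain, g_res, u_res, cutoff=3.5):
    """List the G...U contacts within `cutoff` angstroms (assumed threshold).

    Only the two contacts discussed in the text are tested:
    the base-base N2...O4 and the backbone O2'...OP2 (a.k.a. O2P).
    """
    candidates = [("N2", "O4"), ("O2'", "OP2")]
    found = []
    for g_atom, u_atom in candidates:
        a = atoms.get((chain, g_res, g_atom))
        b = atoms.get((chain, u_res, u_atom))
        if a is None or b is None:
            continue                        # atom missing from the model
        d = math.dist(a, b)
        if d <= cutoff:
            found.append((g_atom, u_atom, round(d, 2)))
    return found
```

To try it on the platform mentioned above, one could open a local copy of the 1JJ2 coordinate file, pass the file object to `parse_atoms`, and call `platform_hbonds(atoms, chain, 1971, 1972)` with whatever chain identifier the 23S rRNA carries in that file. A pure distance test ignores H-bond angles and donor/acceptor roles, so it should be read as a quick sanity check rather than a definitive assignment.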


  1. Dear Dr. Lu,
    I looked at your NAR paper on GU platforms and found it quite interesting. I wish to mention that I also recently worked on platform interactions in RNA using QC methods in one of our papers entitled "On the Role of the cis Hoogsteen:Sugar-Edge Family of Base Pairs in Platforms and Triplets: Quantum Chemical Insights into RNA Structural Biology" (J. Phys. Chem. B 2010, 114, 3307–3320).
    In this paper, we have analyzed different combinations of available platform interactions. In fact, while working on these platforms, we also observed the greater frequency of the GU platform compared to other combinations, and even mentioned the role of the phosphate interaction (Fig. 7(d) of the paper and Table S17 of the supporting information).

    Adding to it, one of the conclusions from our paper was "weakly H-bonded and are intrinsically nonplanar. The rationale for the remarkable planarity of Dinucleotide platforms, as observed in experimental structures, lies in their crystal contexts. Given their high propensity to participate in base triples, the interactions provided by the third base could be the most important structural context which ensures the planarity. However, identification of several isolated DNPs, which are nevertheless planar in their experimental contexts, strongly suggests possible roles for "other" environmental factors."
    I hope you will find it interesting.
    Best regards.
    Purshotam Sharma.
    PhD student,
    CCNSB, IIIT, Hyderabad, India.

  2. Dear Dr. Lu,
    I wish to mention that Fig. 7(d) from our paper refers to the interaction of N2 with the phosphate oxygen of the base forming the triplet, unlike the O2'-O2P interaction which you talked about. I think I was not able to make this clear in my earlier comment.
    Actually our paper touched the other side of the story, "Most of the dinucleotide platforms need some external support for their planarity, as observed in RNA 3D structures."

  3. Dear Purshotam,

    Thanks for your comments and the reference to your paper. I guess we are taking different approaches: Our NAR GpU paper is based on purely geometrical and empirical analysis, and yours is via high-level theoretical QM calculations. I am glad that we noticed something similar.


    I am actually well aware of your recent paper. I communicated with Drs. Bhattacharyya and Mitra on Wed, Mar 17, 2010 about "Missing pairs from X3DNA?" Here is the quote (verbatim):


[quota]Recently, I came across your article titled "On the Role of the cis Hoogsteen:Sugar-Edge Family of Base Pairs in Platforms and Triplets: Quantum Chemical Insights into RNA Structural Biology" (J. Phys. Chem. B 2010, 114, 3307–3320). I am pleased that my 3DNA package was used in your comparison of three automatic base-pair identifying algorithms.

    On page 3313, you wrote: "Out of all 23 pairs, only two are detected by 3DNA44 (Table S1, Supporting Information)." I was surprised by that assessment, so I downloaded the supplementary PDF and checked carefully Table S1. It turned out that if you had run "find_pair" with "-p" option (as should have been done, since the default is for bps in double-helical regions), you should have been able to detect all of the pairs listed in Table S1 (please let me know if otherwise). I verified 1GID; A225(A):A226(A), and attached please find file '1jj2.bps' which contains all the pairs listed in Table S1. Note that '1jj2.bps' was generated by the following command using 3DNA default settings:
find_pair -p 1jj2.pdb 1jj2.bps

    FYI, the NDB bp detection tool is based on an earlier version of 3DNA, so is the RNAView program (

    By design, 'find_pair' should be able to detect any pair that fulfills a set of intuitive geometric criteria. I would like to hear counterexamples, and will take them as bugs in 'find_pair' that should be fixed.[/quota]

  4. Purshotam Sharma, June 24, 2010 at 5:00 PM

    Dear Dr. Lu,
    Thanks a lot for your reply. I forgot to mention earlier the mail you sent to my PhD advisor, Prof. Mitra, and our collaborator, Prof. Bhattacharyya. In fact, I wish to thank you for pointing out the mistake in the paper. This has also prompted us to write a correction document, which we are in the process of submitting to JPCB.
    I wish to mention however, that before doing this work, I did not have much prior experience with 3DNA, since I have mostly used BPFind in our earlier works on RNA base pairs.
    While using find_pair, I found the description of -p option as:
    -p find all base-pairs and higher base associations.

    At first look, what I understood from it was that using -p option, one can find higher order structures, in addition to base pairs. Since my intention was to find only base pairs, I did not use this option. That's how the error crept in.
    I think if you specify a little more clearly that this option should be used while dealing with RNA base pairs or RNA structures, it may be more beneficial for inexperienced users like me.
    I wish to admit that I did not read the manual of 3DNA, and rather read the description which comes when one runs find_pair without giving the input pdb file. Maybe in the manual you have given greater details about this option.
    In fact, now I am trying to get more used to 3DNA. Anyways thanks a lot for developing a wonderful suite of packages for the community.

  5. Dear Purshotam,

    It is a wonderful experience to communicate with you and your PhD advisor Prof. Mitra, and collaborator, Prof. Bhattacharyya. I appreciate your courage to face the error and the effort to make a correction.

    I fully agree with you that 3DNA documentation is not complete/thorough; it is not even very user-friendly. When I get a chance, I will put more effort into improving 3DNA's documentation.

    Regarding the "-p" option of find_pair, it has been available since v1.x. However, I did *not* mention it explicitly before the 2008 3DNA Nature Protocols paper for two reasons: one was to allow for further refinements, and the other was to publish something about it first. Up to now, I have not published a thorough description of the underlying algorithm of find_pair. While at Rutgers, though, I did share the source code with the NDB and the Olson lab, and guided others unreservedly to take full advantage of it. Later on, two papers came out that are directly based on find_pair; I am not a coauthor on either of them.

    Thanks for using 3DNA. As always, I welcome users' comments, and I will be quick to respond.

  6. Dear Dr. Lu,
    Thanks a lot for your warm words. I wish to mention that since both of us work on nucleic acid structures (albeit with different approaches), I feel we have complementary sets of experiences. I work mainly in quantum chemistry (and also to some extent on structural bioinformatics) of nucleic acid building blocks, whereas your expertise seems to be more in scientific programming and structural biology. If possible, I hope we can try to merge our experiences to come up with a joint research work in future.
    Anyways, thanks a lot again.

  7. Dear Dr. Lu,
    I noticed one problem with your "find_pair": when we use it to detect base triples, etc. using the -p option, it does not create any information that can be used as input to the "analyze" module of the 3DNA package. Hence one cannot analyze parameters of tertiary interactions. Is there any way to calculate base pair parameters of the base triples and other tertiary interactions?

    Dhananjay Bhattacharyya

  8. Dear Dr. Bhattacharyya,

    Could you please repost your question in the 3DNA forum? I feel this question is of general interest. Please be specific (using an example) when posting.



