Saturday, March 13, 2010

Hoogsteen base-pair

The A·U (or A·T) Hoogsteen pair is a well-known base pair (bp), named after the scientist who discovered it. As shown in the Figure below (left), in the Hoogsteen bp scheme, adenine uses its N7 and N6 atoms (at the major groove edge) to form two H-bonds with the N3 and O4 atoms from uracil, respectively. Interestingly, if the uracil base ring is flipped around the N7(A)...N3(U) H-bond by 180 degrees, N6(A) can also form an H-bond with O2(U), i.e., N6(A)...O2(U): this pairing scheme is called the reverse Hoogsteen bp (right).
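
For readers who prefer to see this as data, the two pairing schemes described above can be written down as a small lookup table. This is only a minimal sketch for illustration: the names `HBONDS`, `classify_au_pair`, and `shared_hbonds` are made up here and are not taken from 3DNA or any other package, and the "classification" is a plain comparison of atom-name pairs, not a geometric analysis.

```python
# H-bonded atom pairs, written as (adenine atom, uracil atom),
# exactly as listed in the paragraph above. Hypothetical names throughout.
HBONDS = {
    "Hoogsteen": frozenset({("N7", "N3"), ("N6", "O4")}),
    "reverse Hoogsteen": frozenset({("N7", "N3"), ("N6", "O2")}),
}


def classify_au_pair(hbonds):
    """Return the scheme whose H-bond set matches exactly, or None."""
    observed = frozenset(hbonds)
    for name, expected in HBONDS.items():
        if observed == expected:
            return name
    return None


def shared_hbonds(scheme_a, scheme_b):
    """Return the H-bonds that two schemes have in common."""
    return HBONDS[scheme_a] & HBONDS[scheme_b]


if __name__ == "__main__":
    print(classify_au_pair([("N6", "O4"), ("N7", "N3")]))
    print(sorted(shared_hbonds("Hoogsteen", "reverse Hoogsteen")))
```

The shared N7(A)...N3(U) entry is the H-bond around which the uracil ring is flipped; only the partner of N6(A) changes, from O4(U) to O2(U).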

I first came to know about the Hoogsteen bp from Saenger's book ("Principles of Nucleic Acid Structure"). Over the years, I have read many articles mentioning the Hoogsteen bp and touched on this topic myself in the 2003 3DNA NAR publication. However, I had never read Hoogsteen's two original publications on this topic until recently:
  1. The two-page preliminary report, titled "The structure of crystals containing a hydrogen-bonded complex of 1-methylthymine and 9-methyladenine", was published in Acta Cryst. (1959) 12, 822-3. The paper contained only a single reference, to the Watson-Crick DNA structure paper published in Nature in 1953. I found it very revealing to learn why Hoogsteen used the methylated derivatives of thymine and adenine, and how the failed initial interpretation of the experimental "vector-density map" using the Watson-Crick A-T bp led to the discovery of the new base-pairing scheme:
    The fact that the first trial structure could not be refined led to a more critical scrutiny of the generalized projection and a greater emphasis on the significance of certain spurious peaks and on relatively large variations in the heights of peaks that were assumed to represent atoms. The correct structure was finally discovered by changing the positions of a few atoms in the 9-methyladenine portion of the asymmetric unit.
  2. The more extensive account of the Hoogsteen bp story, titled "The Crystal and Molecular Structure of a Hydrogen-Bonded Complex Between 1-Methylthymine and 9-Methyladenine", was published in Acta Cryst. (1963) 16, 907-16.
I like these two papers, and more generally focused articles in which the authors get directly to the point and address it thoroughly and clearly. Many publications nowadays are very ambitious, trying to solve "big problems": the papers are generally far more complicated and often have "reproducibility" problems.

As a side note, the term Hoogsteen "edge" appears quite frequently in today's publications on RNA structures: in the Leontis-Westhof bp classification scheme, the term simply means the major groove edge of a base in what would be a Watson-Crick bp geometry.