Recently, I (re-)read a few articles related to the RNA backbone by Jane Richardson et al., including:
- "RNA backbone is rotameric" (PNAS 2003)
- "RNA backbone: consensus all-angle conformers and modular string nomenclature" (RNA 2008)
- "MolProbity: all-atom structure validation for macromolecular crystallography" (Acta Crystallogr D Biol Crystallogr. 2010)
- "PHENIX: a comprehensive Python-based system for macromolecular structure solution" (Acta Crystallogr D Biol Crystallogr. 2010)
C3'-endo and C2'-endo sugar puckers are highly correlated with the perpendicular distance between the C1'–N1/9 glycosidic bond vector and the following (3') phosphate: > 2.9 Å for C3'-endo and < 2.9 Å for C2'-endo (p. 16 of the MolProbity paper).
Out of curiosity, and to get a better understanding of this correlation, I played around with some sample cases both visually in RasMol and numerically. Overall, this is a simple geometric problem: finding the shortest distance from a point to a line in three-dimensional space. Given below is the Octave/Matlab script for calculating the distances for G175 and U176 of PDB entry 1JJ2 (the large ribosomal subunit of Haloarcula marismortui):
```octave
function d = get_p3_nc_dist(P3, C1, N)
    N_C1 = N - C1;                   % vector from C1' to N
    nv_N_C1 = N_C1 / norm(N_C1);     % normalized (unit) vector along the glycosidic bond
    C1_P3 = P3 - C1;                 % vector from C1' to P3
    proj = dot(C1_P3, nv_N_C1);      % length of C1_P3 along the bond direction
    d = norm(C1_P3 - proj * nv_N_C1);  % perpendicular component = distance to the line
end

%% G175 (1jj2)
P3 = [70.104 112.366 44.586];
C1 = [73.017 109.666 45.304];
N  = [74.445 109.380 45.288];
d1 = get_p3_nc_dist(P3, C1, N)   % 2.2 Å -- C2'-endo

%% U176 (1jj2)
P3 = [66.871 116.402 46.804];
C1 = [68.213 112.454 49.279];
N  = [69.678 112.480 49.438];
d2 = get_p3_nc_dist(P3, C1, N)   % 4.6 Å -- C3'-endo
```

The GpU used in the above example forms a dinucleotide platform, where the sugar of G175 adopts a C2'-endo conformation and that of U176 adopts C3'-endo. Indeed, the distance for the G175 nucleotide is 2.2 Å, less than 2.9 Å, whilst the value for U176 is 4.6 Å, greater than 2.9 Å.
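For readers who prefer the formula to the code, the projection performed in get_p3_nc_dist can be written out as follows. This is only a sketch of the standard point-to-line distance; the symbols c, n, p, u and v are labels introduced here for C1', N1/9, the 3' phosphorus, the unit bond vector and the C1'-to-P vector, and they do not appear in the cited papers.

```latex
\[
\mathbf{u} = \frac{\mathbf{n} - \mathbf{c}}{\lVert \mathbf{n} - \mathbf{c} \rVert},
\qquad
\mathbf{v} = \mathbf{p} - \mathbf{c},
\qquad
d = \bigl\lVert \mathbf{v} - (\mathbf{v}\cdot\mathbf{u})\,\mathbf{u} \bigr\rVert
  = \sqrt{\lVert \mathbf{v} \rVert^{2} - (\mathbf{v}\cdot\mathbf{u})^{2}}
  = \lVert \mathbf{v} \times \mathbf{u} \rVert .
\]
% Worked check for G175, using the coordinates listed in the script:
\[
\mathbf{v}\cdot\mathbf{u} \approx -3.38, \qquad
\lVert \mathbf{v} \rVert^{2} \approx 16.29, \qquad
d \approx \sqrt{16.29 - 11.41} \approx 2.2\ \text{\AA}.
\]
```

The three expressions for d are equivalent because u has unit length, so whichever is most convenient in a given environment can be used.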
It is worth noting that the above-mentioned articles by Richardson et al. focus on the RNA backbone, without considering base (pair) geometry. The Zp parameter, which quantifies the z-coordinate of the phosphorus atom in the mean reference frame (see "A-form conformational motifs in ligand-bound DNA structures", JMB 2000), can be easily adapted to the analysis of single-stranded RNA structures. For example, the vertical distances of the 3' phosphorus atoms to the G175 and U176 base planes are 1.9 Å and 4.4 Å, respectively.
Since base planes and phosphorus atoms are the most accurately located entities in a given nucleic acid structure, the nucleotide-based Zp variant would presumably have some advantage over the distance from the phosphorus to the glycosidic bond. Naturally, this Zp parameter will be added to future releases of 3DNA.
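As a rough sketch of what such a nucleotide-based variant amounts to geometrically, the vertical distance is a projection onto the base normal. The symbols are assumptions made for illustration: o is taken as the origin of the base reference frame, z-hat as its unit z-axis (the base normal), and p as the position of the 3' phosphorus. How the frame is actually fitted to the base atoms is not spelled out here.

```latex
\[
Z_p = (\mathbf{p} - \mathbf{o}) \cdot \hat{\mathbf{z}},
\qquad
\text{vertical distance to the base plane} = \lvert Z_p \rvert .
\]
```

The sign of Z_p depends on the direction chosen for the z-axis, whereas the absolute value does not.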