Saturday, July 24, 2010

3DNA for the analysis of molecular dynamics simulations

To my surprise and delight, 3DNA has been increasingly used in the analysis of trajectories of molecular dynamics (MD) simulations of nucleic acid structures. While I have heard of CHARM/AMBER for a long time and know of the basic principles of MD simulations, I have never performed any simulations (yet). Over the years, I have been mostly interested in methods development and algorithms implementation for better understanding of nucleic acids structures, and have always used NDB/PDB entries for my tests/applications. Thus unlike Curves+, 3DNA lacks direct support for the analysis of MD trajectories.

From the MD-related questions I have received over emails and on the 3DNA forum, I sense that users do not have that much of a problem or inconvenience in applying 3DNA for the task. There is one issue that does stand out, however, i.e., the use of find_pair for each and every structure in the trajectories to prepare an input file that contains base-pairing information for analyze to calculate various parameters. As an example, a recent article by Ricci et al., "Molecular Dynamics of DNA: Comparison of Force Fields and Terminal Nucleotide Definitions" (J Phys Chem B. 2010 Jul 8. [Epub ahead of print]), says that:
It is noteworthy that, because the DNA structure endured strong deformations in these simulations, 3DNA could not perform the analysis of the angles for all the 5000 snapshots. Actually, for system 3 the calculation of the backbone parameters could be performed in only 59% of the snapshots, whereas in system 4 only 18% of the configurations allowed this calculation.
Dr. Netz, the corresponding author, confirmed to me that:
I don't think this is a 3DNA problem, but is rather an effect of the distortion caused by the problems in parameterization of the GROMOS forcefield.
Essentially, though, the problem lies in the coupled use of find_pair/analyze; for strongly deformed DNA structures along the simulation trajectories, some base pairs melt out, and are thus beyond the recognition of find_pair. On the other hand, one does not need to run find_pair for each structure; simply prepare a template input file for analyze, with the help of find_pair and probably some manual editing, then all the structures can be analyzed, no matter how distorted they are. Furthermore, saving the find_pair step should speed up the analysis of thousands of structures, typical of MD simulations.

In fact, 3DNA (from v1.5) has a Perl script named nmr_strs that implements the above idea in a preliminary form. The script is intended to serve as a starting point for the analysis of multiple similar structures, in an NMR ensemble or from MD simulations. See 3DNA wiki page on nmr_strs for details.

In the long run, a better integration between 3DNA and AMBER or other MD packages would solve this problem from the upstream, thus making users' life (downstream) much easier. Until that happens, I hope this post could help clarify the issue somewhat in this important application area of 3DNA. As always, I greatly value your comments.