Saturday, July 11, 2009

On maintaining the 3DNA forum

Over the past few years, maintaining the 3DNA forum (i.e., answering questions and performing administrative tasks) has taken up a significant amount of my spare time. Sometimes it can be quite demanding, especially because I need to pay great attention to details. Overall, though, it has been a valuable experience, and I feel that the time is well spent: 3DNA has been continuously refined and more widely used; my knowledge of nucleic acid structures (especially RNA) has been significantly sharpened; I have stayed aware of progress in related research fields and seen more of the world; and I feel great pleasure in being of help to the community.

Some basic facts/statistics:
  • I tell everyone who emails me a 3DNA-related question to register and re-post it in the 3DNA forum, but fewer than 50% actually do so. However, if you want to be helped, you have to follow the rules.

  • Most of the forum registrations (over 90% at times) are spam. To forestall this, I must continually update phpBB3 to its latest version.

  • Over 50% of legitimate registrations end up being deleted instead of activated because the users fail to send me an email for activation, as required. Among those who do send me an email, most use a subject line of "Re: 3DNA forum registration 'user-id'", as suggested. Only a few volunteer to share with me their real name, address, etc. (e.g., via a signature). Furthermore, some do not post back in the forum after their account is activated.

  • With very few exceptions, questions posted in the forum are normally addressed within a couple of days, or even sooner.

  • Except for one case, communicating with users has mostly been a pleasant experience. Some (though not many) users even posted back a summary and/or a thank-you note.

  • I would like to compliment esguerra, yrxin and tgaillar for sharing their tips and tricks in the section Users' contributions. Apart from me, ghzheng has contributed the most in the forum.

  • Overall, the forum is low volume (which is just fine) and spam-free (which is very important).

To make the forum policy upfront and explicit, in order to avoid misunderstandings or surprises, I have been enclosing the following note in each new registration confirmation message:
Your 3DNA forum registration has been activated — welcome aboard! See
http://xiang-jun.blogspot.com/2009/07/on-maintaining-3dna-forum.html

---------------------------------------------------------------------
I am so pleased that you have come thus far! To make the 3DNA forum a
more pleasant virtual community for all of us to learn from and
contribute to, please be considerate and practice good netiquette
(http://www.albion.com/netiquette/). More specifically, I would like
to reemphasize the following:

0. Do your homework; read the FAQ and browse the forum.

1. Ask your questions in the 3DNA forum instead of sending me emails.

2. Be specific with your questions; provide a minimal, reproducible
example if possible; use attachments where appropriate.

3. Do not ask for or expect immediate responses to your questions.
Lower your expectations and you will more likely end up feeling
happier.

4. Respond to requests for clarifications.

5. Summarize the solution to your problem(s) from a user's
perspective by providing details, for the benefit of other users.

6+ Contribute back to 3DNA if you can:
o Report bugs — including typos
o Make constructive suggestions — anything to make 3DNA better
o Answer other users' questions
o Share your use cases in the "Users' contributions" section

In a nutshell, you are welcome to participate and should not hesitate
to ask questions, but remember to play nice and preferably share what
you've learned!
---------------------------------------------------------------------
Thus, intentionally or otherwise, the forum has also acted as a filter to make my life easier. Whenever possible, though, I have tried my best to reward those who follow the simple, common sense rules. After all, nothing should be taken for granted, and no one likes to be taken advantage of. I am glad that through my contributions and user involvement, the forum has survived and 3DNA has thrived (evident from citations, numerous web links to its homepage, other services/tools — including NDB and PDB — taking advantage of parts of its functionality, and more recently, two dedicated web-interfaces), serving as a valuable resource to the community.



PS: Two related posts in the 3DNA forum:

Friday, July 10, 2009

Does 3DNA work for RNA?

At the C2B2 party this afternoon, I was asked the question: "Does 3DNA work for RNA?" A good question, indeed. The short answer is, definitely, YES. However, a detailed explanation is needed to address the underlying intuitive assumption that 3DNA is only for DNA.
  1. The name 3DNA was due to Dr. Olson, after we had struggled for quite a while. Initially, we played with NuStar (which was actually cited once by Richard Dickerson et al.), Carnival, etc. I still remember the day when Dr. Olson asked me "How about 3DNA?" We immediately reached an agreement: that's it -- what a cute name! Another advantage (as became clear later): since 3DNA starts with '3', it (mostly) shows up right at the top of many on-line lists of bioinformatics tools.
  2. Interpreted literally, 3DNA could mean 3-DNA, i.e., the three most common types of DNA: A-, B- and Z-form. That may be one source of the misconception that 3DNA is only for DNA. Another reason could be that structural work on DNA is what the Olson lab is best known for.
  3. The number '3' in 3DNA should also be associated with its three key components: analysis, rebuilding and visualization. In a sense, this is my favorite.
  4. Of course, 3DNA stands for 3D-NA, 3-Dimensional Nucleic Acids, as expressed explicitly in the titles of our two 3DNA papers (2003 NAR and 2008 NP).
The applications of 3DNA to RNA structures can be broadly categorized as follows:
  1. Automatically detect all existing base-pairs, canonical (Watson-Crick A-U and G-C, wobble G-U) or non-canonical, using a set of simple geometric criteria. Furthermore, 3DNA has a unique base-pair classification system based on the six numerical structural parameters, suitable for database storage and search.
  2. Automatically detect all triplets or higher-order base-associations.
  3. Automatically detect double helical regions, regardless of backbone connection, thus ideal for finding pseudo-continuous coaxial stacking.
  4. The above three features are seamlessly integrated with the visualization component to allow for easy generation of publication-quality images. See the 3DNA 2008 NP paper for detailed examples.
As further examples, the following two RNA publications take advantage of find_pair from 3DNA:
It is well worth noting that the base-pair detecting algorithm in RNAView is based on an earlier version of find_pair, a basic fact ignored in the RNAView publication.

In summary, 3DNA works for RNA as well as for DNA, and more.

Sunday, July 5, 2009

Errors in PDB entries

In the June 24, 2009 issue of Nature (v459, pp.1038-1039), there is a news item titled "New protein structures replace the old" by Katharine Sanderson, on a 'Dutch software to weed out errors in Protein Data Bank'.

In my experience with software development and with using the PDB/NDB, it is certainly no surprise that there are errors of various types in the macromolecular databases: whenever I apply an algorithm consistently to all the entries in the NDB (which is part of the PDB, consisting of only nucleic-acid-containing structures), I always notice some inconsistencies. As a more concrete example, blocview, a visualization tool initially developed as a by-product of another project while I was still at Rutgers (and partially involved with the NDB), was once used for correcting errors in the NDB as well.

Pure 're-refinement' of existing structures with software is surely helpful in catching obvious, systematic errors. However, it is impossible to catch all problems, no matter how sophisticated the software may be. Moreover, as put in the comment by yet another phd: "i have hard enough time getting the RCSB to change four atoms in a structure for me." and "scientists must remember that when they click the re-refine button, you read the paper where the structure was reported."

Errors will always be there -- that's just a basic fact of life. It is thus crucial for those who perform structural analysis to base their conclusions not on one or just a few purposely selected structures, but on a broader, more objective set.

Two web-interfaces to 3DNA, and more

As a nice surprise, I found two back-to-back articles on web interfaces to 3DNA in the 2009 web-server issue of NAR, published on July 1:
  1. "3D-DART: a DNA structure modelling server" by van Dijk and Bonvin from Utrecht University, The Netherlands. Excerpt from the abstract:
    As a response to the demand for 3D-structural models reflecting the intrinsic plasticity of DNA we present the 3D-DART server (3DNA-Driven DNA Analysis and Rebuilding Tool). The server provides an easy interface to a powerful collection of tools for the generation of DNA-structural models in custom conformations. The computational engine beyond the server makes use of the 3DNA software suite together with a collection of home-written python scripts.
  2. "Web 3DNA—a web server for the analysis, reconstruction, and visualization of three-dimensional nucleic-acid structures" by Zheng, me and Olson. Excerpt from the abstract:
    The w3DNA (web 3DNA) server is a user-friendly web-based interface to the 3DNA suite of programs for the analysis, reconstruction, and visualization of three-dimensional (3D) nucleic-acid-containing structures, including their complexes with proteins and other ligands.
While I was aware of 3D-DART prior to its publication, I certainly did not expect it to appear in the same issue as w3DNA, of which I am a co-author. There is nothing more compelling to illustrate 3DNA's value to the community than a third-party web-interface to it! Together, these two web-servers make 3DNA much more accessible to an even wider audience. Specifically, they could well serve educational purposes, e.g., to conveniently build a DNA model of A-, B- or C-form with user-supplied sequences. Users would be glad to have a choice that better fits their needs. In the long run, the one that provides the best user support will survive.

It is also worth noting that in the same 2009 NAR web-server issue, another paper titled "SARA: a server for function annotation of RNA structures" by Capriotti and Marti-Renom from Spain also makes use of 3DNA. This serves to emphasize the point that 3DNA is not just for DNA, but for RNA as well -- 3DNA has unique features for RNA that are not found in other currently available software tools that I am aware of.