Sunday, September 19, 2010

New NDB entries have IDs starting with the prefix NA?

Recently, while reading the article titled "Designing Triple Helical Fragments: The Crystal Structure of the Undecamer d(TGGCCTTAAGG) Mimicking T.AT Base Triplets" by Van Hecke (Crystal Growth & Design), I noticed the following:
"The atomic coordinates and structure factors have been deposited in the Protein Data Bank and Nucleic Acid Database (PDB and NDB entry codes 3L1Q and NA0392, respectively)."
The NDB id NA0392 reminded me of an email exchange I had with Dr. Olson earlier this year, in which she told me about the id change for new NDB entries. Occupied with other issues, I did not pay much attention to this point until recently.

Over the years, NDB has established itself prominently as "a repository of three-dimensional structural information about nucleic acids". Traditionally, an NDB id carried some associated "meaning", with notable exceptions, as discussed a year ago in my blog post "PDB id vs NDB id". Thus, it is not surprising that NDB decided to change how new ids are named. However, I could not find any announcement of the id change on the NDB website; a Google search for "NDB id" did not uncover anything new either. Luckily, NDB provides a search by release date ("Released Since"). After a few tries, I found that the new id policy appears to have taken effect around March 2010.

The new NDB id convention appears to be NA (presumably standing for Nucleic Acid) followed by 4 digits. As more entries become available, it is not hard to imagine that the number of digits will have to be expanded to 5, 6, or even more. Naturally, the PDB id, with a fixed 4-character length (up to now), is (far) more consistent than the NDB id.
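
For scripts that need to tell the new-style ids apart from other identifiers, the apparent convention can be captured with a small regular expression. This is only a sketch based on my reading of the new entries, not an official NDB specification: the pattern (NA plus four or more digits), the function names, and the id NA12345 are my own assumptions for illustration.

```python
import re

# Assumed pattern (not an official NDB rule): "NA" followed by four digits,
# allowing more digits in case the numbering is ever expanded.
NEW_NDB_ID = re.compile(r"NA(\d{4,})")


def is_new_style_ndb_id(identifier: str) -> bool:
    """Return True if the identifier looks like a new-style NDB id."""
    return NEW_NDB_ID.fullmatch(identifier.strip().upper()) is not None


def ndb_serial(identifier: str) -> int:
    """Return the numeric part of a new-style NDB id, e.g. 392 for NA0392."""
    match = NEW_NDB_ID.fullmatch(identifier.strip().upper())
    if match is None:
        raise ValueError(f"not a new-style NDB id: {identifier!r}")
    return int(match.group(1))


if __name__ == "__main__":
    # NA0392 and 3L1Q come from the article quoted above;
    # NA12345 is a made-up id showing a possible 5-digit expansion.
    for candidate in ("NA0392", "3L1Q", "NA12345"):
        print(candidate, is_new_style_ndb_id(candidate))
```

Allowing "four or more" digits keeps the check usable if the serial number grows, at the cost of not pinning down a fixed id length the way the 4-character PDB id does.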