Saturday, May 9, 2009

Running Windows on Ubuntu Linux with VirtualBox

Ever since I switched from MS DOS to Unix and then Linux, I have never felt the desire to move back to the Windows world. Yet in the real world, MS Windows is an operating system (OS) that you simply can't avoid. For example, I compiled 3DNA v2.0 in Cygwin and MinGW, both on Windows. For such a one-time process, my dual-boot Ubuntu and Vista setup has worked just fine. However, when writing an article in a collaborative setting, access to Word becomes a must, and switching back and forth between Linux and Windows is really a pain: it is not just time-consuming, it also makes transferring files between the two OSes inconvenient. OpenOffice is fine for personal, occasional use, but I have experienced some (subtle) compatibility issues with Word, especially with EndNote support for citations and bibliography.

In the past couple of days, I installed Sun VirtualBox 2.2.2 onto my Ubuntu 9.04 with the following settings:
  • Windows XP (MS Office 2003, EndNote X2)
  • 512 MB base memory
  • 12 MB video memory
  • 10 GB hard disk
  • With Guest Additions installed
I have played around with it for a little while, and everything seems to work just fine. Now I have full access to MS Office (Word, PowerPoint, etc.) and EndNote. I really like the "Shared Folders" feature: it allows me to directly access files on my host Linux box from within the guest Windows XP.
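For readers who prefer the command line, the settings listed above can be sketched as a handful of `VBoxManage` calls. This is only an illustration under several assumptions: the option spellings follow recent VirtualBox releases and may differ in 2.2.2, the VM name, disk file name, share name, and host path are hypothetical placeholders, and the steps for attaching the disk and installing the Guest Additions are left out. The snippet only builds and prints the commands; it does not run them.

```python
from typing import List


def vbox_commands(vm_name: str, disk_file: str,
                  share_name: str, host_path: str) -> List[List[str]]:
    """Build VBoxManage argument lists mirroring the settings in this post.

    All names passed in are placeholders chosen by the caller; nothing here
    is executed, so the commands can be inspected before use.
    """
    disk_size_mb = 10 * 1024  # 10 GB hard disk, expressed in MB
    return [
        # Create and register a Windows XP guest.
        ["VBoxManage", "createvm", "--name", vm_name,
         "--ostype", "WindowsXP", "--register"],
        # 512 MB base memory and 12 MB video memory.
        ["VBoxManage", "modifyvm", vm_name,
         "--memory", "512", "--vram", "12"],
        # Create the virtual hard disk (attaching it is not shown).
        ["VBoxManage", "createmedium", "disk",
         "--filename", disk_file, "--size", str(disk_size_mb)],
        # Expose a host directory to the guest as a shared folder.
        ["VBoxManage", "sharedfolder", "add", vm_name,
         "--name", share_name, "--hostpath", host_path],
    ]


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for cmd in vbox_commands("winxp", "winxp.vdi", "linuxhome", "/home/user"):
        print(" ".join(cmd))
```

Printing the commands first makes it easy to compare them against the user manual for the installed VirtualBox version before running anything.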

The VirtualBox UserManual.pdf is pretty well-written: after carefully reading the first four chapters, I did not have much trouble getting the system up and running. I also referenced the following two blog posts:

Wednesday, May 6, 2009

PhD pages on DNA base-stacking interactions

I was recently approached by Dr. Victor Zhurkin, a well-known computational biologist at the NIH, asking for some details about the sigma/pi charges used in my JMB publication on DNA base-stacking interactions. It was a paper based on my PhD thesis with Dr. Chris Hunter at Sheffield University. Ever since its publication a decade ago, this is the first time I have been asked questions about it (Dr. Hunter was the corresponding author).

This request was both a nice surprise and, at the same time, a challenge, since I do not have the original files at hand. Over the years, I have never been a (big) fan of energy calculations. So unlike my SCHNAaP/SCHNArP papers, for which I do keep the original source code and data files to reproduce the results published back-to-back in JMB, I do not have direct access to the original files for the del Re (sigma) and Huckel (pi) charges used in my base-stacking interactions paper. I did find the HP 60 meter Data Cartridge I took from Sheffield, which should contain all the details of my PhD work, but it is hard these days to find a machine to read it. In the meantime, I found a hard copy of my PhD thesis, which contains further details. So I used my digital camera, took pictures of the 11 relevant pages, and made them available online in a tarball file.

The key feature of my DNA-stacking interaction work is the distributed charge model for the aromatic bases. This charge model was based on the seminal work "The Nature of Pi-Pi Interactions" by Hunter and Sanders, published in JACS in 1990. This simple distributed charge model captures the aromaticity concept nicely, making it easily understandable to bench chemists and biologists. As I have just checked, this 1990 JACS paper has been cited 2280+ times, according to Web of Science. My 1997 JMB DNA-stacking interaction paper has received 78 citations; that is certainly not comparable to the JACS paper, but overall not bad at all. (Google Scholar gives much smaller citation numbers, for unknown reasons.)

Looking back, the DNA base-stacking project was the main theme of my PhD work. It was the close collaboration with El Hassan in Dr. Chris Calladine's group at Cambridge University that introduced me to the elegant CEHS scheme, on which I based the SCHNAaP/SCHNArP software programs. The mathematical rigor of the CEHS scheme, i.e., its complete reversibility between the analysis and rebuilding of dinucleotides, put my theoretical energy calculations on a solid basis for comparison with oligonucleotide X-ray crystal structures. The "detour" to SCHNAaP/SCHNArP reflected my personal interest, and was the starting point that later led to the resolution of the long-standing discrepancies among nucleic acid conformational analyses, the establishment of a standard base reference frame, and the creation of the now popular 3DNA software package.

Sunday, May 3, 2009

Two 3DNA figures made into a textbook on structural biology



Last week, while browsing amazon.com, I came across the "Textbook Of Structural Biology" by Anders Liljas. It reminded me of the request we received from Dr. Liljas a while ago, asking for permission to use two figures from the 3DNA (v1.5) website, which, of course, we happily granted.

The two figures as they appear in the textbook are shown on the left and right, respectively. Historically, they both first appeared on the 3DNA v1.5 website and were later adapted into our 3DNA 2003 NAR paper, which is now taken as the "official" source for citation. The figure illustrating nucleic acid structural parameters (left) was also included in our 3DNA 2008 Nature Protocols publication. The figure showing the roll-slide effects on regular DNA structures (right) was based on the classic work of Calladine & Drew, and can be traced back to my SCHNArP paper published in JMB.

In retrospect, it took me two weeks to produce the figure illustrating various DNA structural parameters (left). In ~20% of the time, I got over 80% right, and it was the remaining 20% that took most of the time. It is the details that count!

Over the past twenty years, there have been numerous versions of such illustrations, with nearly every publication/author having its own. While they look similar, they are all subtly different due to a lack of precise control over the value, orientation and relative scale of the illustrated parameters. To get such a figure right consistently and rigorously (reproducibly), a pragmatic approach must be taken, and that is where 3DNA shines.