Tuesday, October 25, 2011

mutate_bases, a new 3DNA tool for in silico base mutations of 3D nucleic acid structures

In response to repeated requests from 3DNA users over the years, I recently (well, a few months ago) added a new utility program, mutate_bases, to the 3DNA software suite. As its name implies, mutate_bases mutates bases in a nucleic acid structure, with two key and unique features: (1) the sugar-phosphate backbone conformation is untouched; (2) the base reference frame (position and orientation) is preserved, i.e., the mutated structure shares the same base-pair/step parameters as the original one.

mutate_bases is a standalone ANSI C program, on a par with the other major 3DNA programs (e.g., find_pair, analyze, rebuild, and fiber). As seen from the help message below, it can be used for any nucleic-acid-containing structure (DNA, RNA, or their complexes) in PDB format:
Usage: mutate_bases mutinfo pdbfile outfile
    'mutinfo' can contain upto 5 fields for each mutation:
        [name=residue_name] [icode=insertion_code]
        chain=chain_id seqnum=residue_number
    Alternatively, 'mutinfo' can be specified with the '-l' option
        followed by a mutation file: '-l mutations.dat'
    o The five fields per mutation can be in any order or CaSe
    o Each field can be abbreviated to its first character
    o Multiple mutations are separated by ';'
    o Fields in [] (i.e., name and icode) are optional
    o Mutation info should be QUOTED to be taken as one entry

    mutate_bases 'c=a s=2 m=DA' 355d.pdb 355d_G2A.pdb
        # mutate G2 in chain A of B-DNA 355d to A
    mutate_bases 'c=a s=2 m=DA; c=B s=23 m=DT' 355d.pdb 355d_G2A_C23T.pdb
        # mutate base-pair G-C (C23 in chain B) to A-T
        # also create file 'mutations.dat', see below
    mutate_bases -l mutations.dat 355d.pdb 355d_mutfile.pdb
        # 355d_G2A_C23T.pdb and 355d_mutfile.pdb would be identical
    mutate_bases 'c=A s=74 m=U' 1evv.pdb 1evv_C74U.pdb
        # mutate C74 in chain A of tRNA 1evv to U
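
To make the 'mutinfo' rules above concrete, here is a minimal Python sketch of how such a string could be split into individual mutations. It is only an illustration of the rules in the help message, not the parser used inside mutate_bases (which is written in C). The function name parse_mutinfo is hypothetical. Fields are keyed by their first letter, since the help message says each field can be abbreviated that way; the 'm' field is taken from the examples above. Treating 'c', 's' and 'm' as required is my assumption, based on only name and icode being marked optional.

```python
def parse_mutinfo(mutinfo):
    """Split a mutinfo string into one dict per mutation.

    Keys are the lower-cased first letter of each field name:
    n(ame), i(code), c(hain), s(eqnum), and m (the target residue,
    as used in the help-message examples).
    """
    mutations = []
    for entry in mutinfo.split(";"):        # mutations are separated by ';'
        if not entry.strip():
            continue                        # tolerate a trailing ';'
        fields = {}
        for token in entry.split():         # fields are whitespace-separated
            key, sep, value = token.partition("=")
            if not sep or not key or not value:
                raise ValueError(f"malformed field: {token!r}")
            letter = key[0].lower()         # any CaSe, first character is enough
            if letter not in "nicsm":
                raise ValueError(f"unknown field: {token!r}")
            fields[letter] = value          # values are kept exactly as typed
        # Assumption: chain, seqnum and the target residue must be present.
        missing = [k for k in "csm" if k not in fields]
        if missing:
            raise ValueError(f"missing field(s) {missing} in {entry.strip()!r}")
        mutations.append(fields)
    return mutations


if __name__ == "__main__":
    for mutation in parse_mutinfo("c=a s=2 m=DA; c=B s=23 m=DT"):
        print(mutation)
```

Note that the sketch leaves values such as the chain id untouched; how mutate_bases itself normalizes case (e.g., 'c=a' selecting chain A in the first example) is not modeled here.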

mutate_bases is designed to solve the base mutation problem in a practical sense: robust and efficient, getting its job done and then out of the way. The program has many possible applications: in addition to performing base-pair mutations in DNA-protein complexes, it should also prove handy in providing initial structures for QM/MM/MD energy calculations and in DNA/RNA modeling studies.
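
As a rough illustration of the "initial structures" use case, the following Python sketch builds one mutate_bases command per target base at a single position, following the 'mutate_bases mutinfo pdbfile outfile' usage shown above. The helper name build_commands and the output file naming scheme are hypothetical; DA and DT appear in the examples above, while DC and DG are the corresponding standard PDB residue names for the other two DNA bases. The sketch only prints the commands and does not run them.

```python
import os
import shlex


def build_commands(pdbfile, chain, seqnum, targets=("DA", "DC", "DG", "DT")):
    """Return one argv list per target base for a single residue position."""
    stem = os.path.splitext(pdbfile)[0]
    commands = []
    for target in targets:
        # Abbreviated fields, as allowed by the help message.
        mutinfo = f"c={chain} s={seqnum} m={target}"
        # Hypothetical naming scheme for the output files.
        outfile = f"{stem}_{chain}{seqnum}_{target}.pdb"
        commands.append(["mutate_bases", mutinfo, pdbfile, outfile])
    return commands


if __name__ == "__main__":
    for argv in build_commands("355d.pdb", "A", 2):
        # shlex.join quotes the mutinfo argument so it stays one shell entry.
        print(shlex.join(argv))
```

Each argv list can also be handed directly to subprocess.run, in which case no shell quoting is needed because mutinfo is already a single argument.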