Sunday, January 9, 2011

IUPAC nucleotide symbols and their complements

Recently, I wanted to know the complements of all the IUPAC nucleotide symbols. As blogged previously, I am quite familiar with the "meaning of nucleotide IUPAC codes" (namely A/C/G/T, and R/Y/N etc.). However, when I first checked the Gene Infinity website on nucleotide symbols, it still took me a while to make sense of the DNA alphabet with complements, as excerpted below:
A  C  G  T    M  R  W  S  Y  K    B  D  H  V    N
|  |  |  |    |  |  |  |  |  |    |  |  |  |    |
T  G  C  A    K  Y  W  S  R  M    V  H  D  B    N
For example, some degenerate IUPAC symbols are their own complements (e.g., W–W and S–S), while others seem harder to grasp (e.g., B–V and D–H).

After thinking about it for a bit, things became clear. The pairs follow from the complementarity of Watson-Crick base pairs (A–T and G–C) and the meaning of each degenerate IUPAC nucleotide symbol. For example,
  • W represents A/T, meaning weak (the A–T pair has only two hydrogen bonds). The complements of A/T are T/A respectively, which is W again.
  • B (not A) represents C/G/T, and their complements are G/C/A respectively, which is V (not T/U).
It is easy to verify that all other complementary pairs follow exactly the same basic principle.
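
To check the principle mechanically, here is a minimal Python sketch. It assumes the standard meaning of each IUPAC symbol as a set of bases; the names IUPAC, BASE_COMPLEMENT, SYMBOL_OF and complement are my own choices for illustration, not from any library. Each symbol is expanded into its bases, every base is complemented, and the resulting set is looked up again.

```python
# Each IUPAC symbol and the bases it stands for (DNA alphabet, no U).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "M": "AC", "R": "AG", "W": "AT", "S": "CG", "Y": "CT", "K": "GT",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}

# Watson-Crick complements of the four plain bases.
BASE_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# Reverse lookup: set of bases -> IUPAC symbol.
SYMBOL_OF = {frozenset(bases): symbol for symbol, bases in IUPAC.items()}


def complement(symbol):
    """Complement an IUPAC symbol by complementing each base it represents."""
    bases = IUPAC[symbol.upper()]
    return SYMBOL_OF[frozenset(BASE_COMPLEMENT[b] for b in bases)]


if __name__ == "__main__":
    for symbol in IUPAC:
        print(symbol, "-", complement(symbol))
```

Running it prints one pair per symbol, which can be compared against the table above. Because the lookup works on sets, the order of the bases inside a symbol does not matter.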
