Over the past decade, the Human Genome Project has helped revise the estimated number of human genes from the previously assumed ~100,000 to the "true" number of only ~21,000. This dramatically reduced gene count illustrates the crucial importance to biology of non-coding DNA (once called "junk" DNA), yet what non-coding DNA actually does remains befuddling. Nowadays, we face a deluge of data from sequencing, gene expression, protein (transcription factor) binding, and other new technologies. The apparent complexity of biology has grown significantly rather than diminished, even for p53, the most extensively studied protein. As put by Jennifer Doudna, “The more we know, the more we realize there is to know.”
The community has gradually realized that gathering information does not always bring a corresponding increase in meaningful biological insight. We are facing an age of “drowning in information, starved for knowledge.” For example, systems biology, a new discipline “supposed to help scientists make sense of the complexity”, has not always delivered on that promise: “In many cases, the models themselves quickly become so complex that they are unlikely to reveal insights about the system, degenerating instead into mazes of interactions that are simply exercises in cataloguing.” I could not agree more with the comment by Leonid Kruglyak: it is naive to think that “you can simply take very large amounts of data and run a data-mining program and understand what is going on in a generic way.”
Are we lost in the sea of biological data? Not necessarily. Eric Davidson's work is an excellent example of “taking smarter systems approaches” to reveal overarching biological rules. Instead of a “machine learning” type top-down approach, the “insights have come when scientists systematically analyse the components of processes that are easily manipulated in the laboratory — largely in model organisms. They’re still using a systems approach, but focusing it through a more traditional, bottom–up lens.” Through this systematic bottom-up approach, Davidson's group has deciphered how regulatory interactions control gene expression and specify the construction of the sea urchin’s skeleton.
Interestingly, Eric Davidson gave a seminar at C2B2 on Thursday, April 22, titled "Causal Systems Biology: the Sea Urchin Embryo Gene Regulatory Network". I was very impressed by both his talk and the points he made in the Nature News article.