This reminds me of a recent paper "Repeatability of published microarray gene expression analyses" by Ioannidis et al. [Nat Genet. 2009, 41(2):149-55]. In the abstract, the authors summarized their findings:
Here we evaluated the replication of data analyses in 18 articles on microarray-based gene expression profiling published in Nature Genetics in 2005-2006. One table or figure from each article was independently evaluated by two teams of analysts. We reproduced two analyses in principle and six partially or with some discrepancies; ten could not be reproduced. The main reason for failure to reproduce was data unavailability, and discrepancies were mostly due to incomplete data annotation or specification of data processing and analysis.
Specifically, please note that:
- The authors are experts on microarray analysis, not occasional application software users.
- The 18 articles surveyed were published in Nature Genetics, one of the top journals in its field.
- Not a single analysis could be reproduced exactly: two were reproduced in principle, six only partially, and the other ten not at all.
In my experience and understanding, the methods section in journal articles is not, and should not aim to be, detailed enough for exact duplication by a qualified reader. Instead, most such reproducibility issues would go away if journals required authors to provide the raw data, the detailed procedures used to process the data, and the software versions and options used to generate the figures and tables reported in the publication. Such information could be made available on journal or author websites. This is an effective way to solve the problem, especially for computational, informatics-related articles. Over the years, for papers on which I am the first author or to which I have made major contributions, I have always kept a folder for each article containing every detail (data files, scripts, etc.) so that the published tables and figures can be reproduced precisely. This has turned out to be extremely helpful when I want to refer back to earlier publications, or when readers ask me for further details.
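As a rough illustration of the per-article folder idea, here is a minimal Python sketch that records a checksum for every file in such a folder, together with the interpreter version, platform, and analysis options. The file name `MANIFEST.json`, the function names, and the example option are my own assumptions for illustration, not a standard or anything the journals prescribe.

```python
import hashlib
import json
import platform
from pathlib import Path

MANIFEST_NAME = "MANIFEST.json"  # hypothetical file name, chosen for this sketch


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder: Path, options: dict) -> dict:
    """Describe every file under `folder` plus the software environment."""
    files = {
        p.relative_to(folder).as_posix(): sha256_of(p)
        for p in sorted(folder.rglob("*"))
        if p.is_file() and p.name != MANIFEST_NAME  # skip the manifest itself
    }
    return {
        "python_version": platform.python_version(),
        "platform": platform.platform(),
        "options": options,  # e.g. the settings used to produce a figure
        "files": files,
    }


def write_manifest(folder: Path, options: dict) -> Path:
    """Write the manifest as JSON inside `folder` and return its path."""
    manifest_path = folder / MANIFEST_NAME
    manifest_path.write_text(
        json.dumps(build_manifest(folder, options), indent=2, sort_keys=True)
    )
    return manifest_path
```

Calling something like `write_manifest(Path("my_article_folder"), {"normalization": "quantile"})` (both arguments are made-up examples) before submission leaves a record that can later be compared against the folder to confirm that the data and scripts are the ones behind the published tables and figures. It does not capture the versions of third-party packages, which would need to be recorded separately.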
As noted by Osterweil et al., "repeatability, reproducibility, and transparency are the hallmarks of the scientific enterprise." To really achieve this goal, every scientist needs to pay more attention to details and be responsive. Do not be fooled by the impressive introduction or extensive discussion (which are important, of course) in a paper: to get to the bottom of something, it is usually the details that count.