Roger D. Peng, Sandrah P. Eckel, "Distributed Reproducible Research Using Cached Computations", IEEE Computing in Science and Engineering, 11(1), January 2009, pp. 28-34.
Full Paper:
http://www.biostat.jhsph.edu/~rpeng/pap ... distRR.pdf
BibTeX:
NA
Copyright Notice:
Copyright holders include the journal/conference publisher.
Complementary URL:
http://www.biostat.jhsph.edu/~rpeng/RR/index.html
Abstract:
The ability to make scientific findings reproducible is increasingly important in areas where substantive results are the product of complex statistical computations. Reproducibility can allow others to verify the published findings and conduct alternate analyses of the same data. A question that arises naturally is how to conduct and distribute reproducible research. The authors describe a simple framework in which reproducible research can be conducted and distributed via cached computations and tools for both authors and readers. As a prototype implementation they also describe a software package written in the R language. The “cacher” package provides tools for caching computational results in a key-value style database, which can be published to a public repository for readers to download. As a case study, they demonstrate the use of the package on a study of ambient air pollution exposure and mortality in the US.





