

Reproducible Research: Conferences and Workshops

February 8th, 2011

Verifiable, Reproducible Research, and Computational Science

Date: March 4
Location: Reno, Nevada.
Organisers: Part of the SIAM Conference on Computational Science and Engineering (CS11).

Reproducible Science and Open-Source Software in the Geosciences

Date: March 22-23
Location: Long Beach, CA, USA.
Organisers: Part of the SIAM Conference on Mathematical & Computational Issues in the Geosciences (GS11).

Reproducible Research: Tools and Strategies for Scientific Computing

Date: July 13-16
Location: Vancouver, BC, Canada.
Part of the ICIAM 2011 Satellite Meetings
Organisers: Randall J. LeVeque (Applied Math, U. Washington); Ian M. Mitchell (Computer Science, U. British Columbia); Cleve Moler (The Mathworks Inc.); Victoria Stodden (Statistics, Columbia University).

Reproducible Research Tools

January 24th, 2011

Here is an updated list of existing tools, software and methods that can be used to create or enhance a Reproducible Research Compendium.

  • Sweave - Create dynamic reports (based on LaTeX and R)
  • Babel extension - Babel is the Emacs Org-mode facility for executing source code (R, Octave, Matlab, Python, etc.) within the same document
  • SCons - A part of the Madagascar software package, based on SCons, for managing data processing flows and reproducible computational experiments
  • AMRITA - A system for communicating software-based ideas and information (does not run under Windows)
  • CDEpack - A tool that automatically packages up everything required to execute a Linux command on another computer without any installation or configuration
  • StatDocs - Create interactive statistical documents
  • DynDoc - A set of functions to create and interact with dynamic documents and vignettes in R
  • Cacher and CacheSweave - R packages for caching statistical analyses and Sweave computations
  • Python Tools for RR - Python tools for reproducible research on hyperbolic problems
  • MATLAB Report Generator - Automatically generate reports from MATLAB in a wide variety of formats
  • Emacs Speaks Statistics - Supports editing of scripts and interaction with various statistical analysis programs such as S-Plus, R, SAS and Stata

[updates]

  • Sumatra - A tool for managing and tracking projects based on numerical simulation or analysis
  • VisTrails - Create and modify executable workflows and papers

Are you aware of other RR tools? Contact us


OSA Interactive Science Publishing (ISP) and MIDAS

May 29th, 2010

Interactive Science Publishing (ISP), launched by the Optical Society of America (OSA), along with MIDAS, the repository associated with articles published in OSA journals, could be promising platforms for advancing the goals of reproducible research.

Here is an excerpt from the announcement:

With support from the NIH National Library of Medicine, ISP allows authors to publish large 2D and 3D datasets with original source data that can be viewed and analyzed interactively by readers. ISP provides the software for authors to organize and publish source data while offering readers the viewing and analysis tools.

MIDAS, the repository: here are the terms of use:

… You may use the datasets for research purposes, provided that Author(s) are given proper credit as the source of the data, in a manner consistent with generally accepted scientific principles. …

OSA-ISP, the software: it is available only for Windows and Mac OS and doesn’t seem to be open source. Moreover, full OSA ISP authoring functionality is freely available, following activation, for only 30 days; after that, the software reverts to reader mode. Here are the ISP FAQs.

You can give them your feedback by taking their survey.


Thoughtful Comments on Public Access to Research Publications

January 5th, 2010

The Office of Science and Technology Policy (OSTP) has recently launched a public forum to discuss options for improving public access to the results of federally funded research.

Here is an excerpt from the announcement:

The Office of Science and Technology Policy in the Executive Office of the President and the White House Open Government Initiative is launching a “Public Access Policy Forum” to invite public participation in thinking through what the Federal government’s policy should be with regard to public access to published federally-funded research results. To that end, OSTP will conduct an interactive, online discussion beginning Thursday, December 10. We will focus on three major areas of interest: Implementation, Features, and Management.

In the implementation section we read:

One of our nation’s most important assets is the trove of data produced by federally funded scientists and published in scholarly journals. The question that this Forum will address is: To what extent and under what circumstances should such research articles—funded by taxpayers but with value added by scholarly publishers—be made freely available on the Internet?

Many interesting comments have been made on this subject encouraging open access to the published papers as well as any supporting data and code.

Here are some short excerpts from the numerous comments:

[…] It is imperative to provide public access to tax-payer funded scientific output, not only the final published paper but also the supporting data and code necessary for the reproducibility and skepticism fundamental to scientific communication and progress.

[…] It is most feasible to open access to the data after the data-taking has completed, the data are understood, and a simple format can be provided to the public.

[…] Any embargo should be as short as possible (preferably none!), but all articles must be deposited in an institutional repository right away, upon date of acceptance, not just after the embargo elapses: there will be 2 kinds of OA documents in the archive: immediate OA (at least 63% of journals endorse immediate OA) and delayed OA (embargoed).

[…] What version of the paper should be made public under a public access policy (e.g., the author’s peer-reviewed manuscript or the final published version)?
Both— the author’s peer-reviewed manuscript prior to publication and the final published version after publication. The heart of science is the testability of hypothesis. To this end all raw data should be included with the manuscript. The goals of open government should support the objective of keeping the science honest and testable.

[…] Some suggest that every portion of a research effort should be made public— all collected or raw data, notebooks, calculations, etc. There is a case to be made for publishing data sets. However, requiring everything recorded during a research effort be prepared for public access may have the perverse effect of slowing the publication process, or discouraging publication of some research all together. Making the published article accessible is a more reasonable and achievable first goal, followed by publishing pertinent datasets when it is decided how best to do so.

Many other interesting suggestions can be found on the OSTP blog.


JEL: a linguistic journal of reproducible research

July 26th, 2009

The LSA eLanguage initiative has recently launched a new journal: the Journal of Experimental Linguistics.

Here is an excerpt from the announcement:

The Journal of Experimental Linguistics is part of the Linguistic Society of America’s eLanguage initiative. Like the rest of eLanguage, JEL is an Open Access online journal.

JEL is a linguistic “journal of reproducible research”, that is, a journal of reproducible computational experiments on topics related to speech and language. These experiments may involve the analysis of previously published corpus data, or of experiment-specific data that is published for the occasion. Other relevant categories include computational simulations, implementations of diagnostic techniques or task scoring methods, methodological tutorials, and reviews of relevant new publications (including new data and software).

Mark Liberman is the editor in chief.


Open Database License is Released

July 2nd, 2009

Open Data Commons has finally released the Open Database License (ODbL) v1.0, a major step forward for open data.

Here is an excerpt from the announcement:

The Open Database License (ODbL) is an open license for data and databases which includes explicit attribution and share-alike requirements.

This license, the first of its kind, is a major step forward for open data. There are currently very few licenses available suited to data and databases and none which provide for share-alike (existing share-alike licenses such as the GPL, GFDL and CC By-SA are all unsuitable for data).

The development of the ODbL has been a major effort extending over more than one and a half years, with an intensive consultation and review period for the last 6 months.


Enhance the Visibility of Your Reproducible Research Compendia

June 4th, 2009

… Researchers provide their research compendia on their personal or institutional websites all around the world. This method of providing access to research follows a distributed scheme and brings up some issues about the worldwide visibility of the research compendia.

In such a distributed system, good visibility and retrieval of information are essential for the successful delivery of services. Fortunately, many different systems have been established to improve the information retrieval, notably search engines.

While search engines are vital for the retrieval of information on the Web, they do not index websites equally and may not index new pages for months. This usually delays information retrieval, and delayed indexing of scientific research is not desirable …

|Read The Full Article|


A Guide to Making Your Data Open

May 18th, 2009

Open Data Commons has released an ultra simple guide to making research data Open: Making Your Data Open: A Guide (Beta), [PDF].

Here is an excerpt:

What is Open Data?

Open data is data that anyone is free to use, reuse and redistribute without restriction (except, perhaps the requirements to attribute and sharealike). For precise details see http://opendefinition.org/.

Why Does Openness and Licensing Matter?

[…] open data is crucial because open data is so much easier to break-up and recombine, to use and reuse. Licensing is important because it removes uncertainty. […]

  • So How Can I Make My Data Open?
  • How Do I License My Data?

A Guide to Including Research Data in Repositories

May 15th, 2009

A guide to including research data in repositories has been released: Policy-making for Research Data in Repositories: A Guide.

Here is an excerpt from the introduction:

The Policy-making for Research Data in Repositories: A Guide is intended to be used as a decision-making and planning tool for institutions with digital repositories in existence or in development that are considering adding research data to their digital collections.

The guide is a public deliverable of the JISC-funded DISC-UK DataShare project (2007-2009), http://www.disc-uk.org/datashare.html, which established institutional data repositories and related services at the partner institutions: the Universities of Edinburgh, Oxford and Southampton. It is a distilled result of the experience of the partners, together with Digital Life Cycle Research & Consulting. The guide is one way of sharing our experience with the wider community, as more institutions expand their digital repository services into the realm of research data to meet the demands of researchers who are themselves facing increasing requirements of funders to make their data available for continuing access.


Reproducible Research Librum (beta) is going live!

May 13th, 2009

Reproducible Research Librum is an open directory for reproducible research where you can find many reproducible research compendia and simply add yours.

Reproducible Research Librum is an in-depth effort to provide a comprehensive list of reproducible research websites and compendia.

In brief, Librum aims to increase the visibility of reproducible research compendia on the Internet and to make it easier for end-users to search for and find them.

|Go to Librum| |Learn more about Librum|




Reproducible Research Planet!