Highlighted Selections from:

Ten Simple Rules for Reproducible Computational Research


DOI: 10.1371/journal.pcbi.1003285

Sandve GK, Nekrutenko A, Taylor J, Hovig E (2013) Ten Simple Rules for Reproducible Computational Research. PLoS Comput Biol 9(10): e1003285.

p.1: This has led to discussions on how individual researchers, institutions, funding bodies, and journals can establish routines that increase transparency and reproducibility. In order to foster such aspects, it has been suggested that the scientific community needs to develop a "culture of reproducibility" for computational science, and to require it for published claims. -- Highlighted mar 12, 2014

Rule 1: For Every Result, Keep Track of How It Was Produced

p.2: We refer to such a sequence of steps, whether it is automated or performed manually, as an analysis workflow. While the essential part of an analysis is often represented by only one of the steps, the full sequence of pre- and post-processing steps are often critical in order to reach the achieved result. For every involved step, you should ensure that every detail that may influence the execution of the step is recorded -- Highlighted mar 12, 2014
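
One possible way to act on this highlight is to write a small provenance record for each step of the workflow. The sketch below is only an illustration and is not from the paper: the function name `provenance_record`, the field names, and the step, command, parameter, and file values are all hypothetical.

```python
import json
import platform
import sys
from datetime import datetime, timezone


def provenance_record(step_name, command, parameters, input_files):
    """Collect details that may influence the execution of one analysis step."""
    return {
        "step": step_name,
        "command": command,
        "parameters": parameters,
        "input_files": list(input_files),
        # Environment details that can change results between machines.
        "python_version": sys.version.split()[0],
        "platform": platform.platform(),
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }


# All values below are hypothetical placeholders for a real analysis step.
record = provenance_record(
    step_name="normalize_counts",
    command="python normalize.py",
    parameters={"method": "quantile"},
    input_files=["counts_raw.tsv"],
)
print(json.dumps(record, indent=2))
```

Storing one such record next to each result file would be one way to keep the "how it was produced" information close to the result itself.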

Rule 2: Avoid Manual Data Manipulation Steps

p.2: Whenever possible, rely on the execution of programs instead of manual procedures to modify data. Such manual procedures are not only inefficient and error-prone, they are also difficult to reproduce. -- Highlighted mar 12, 2014

p.2: Other manual operations like the use of copy and paste between documents should also be avoided. If manual operations cannot be avoided, you should as a minimum note down which data files were modified or moved, and for what purpose. -- Highlighted mar 12, 2014
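
When a manual operation really cannot be avoided, the minimum suggested above (which files, and for what purpose) could be kept in a simple append-only log. This is a rough sketch under my own assumptions: the helper `log_manual_step`, the JSON Lines layout, and the example file name are hypothetical and not prescribed by the paper.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def log_manual_step(log_path, files, purpose):
    """Append one entry describing a manual change to data files."""
    entry = {
        "when": datetime.now(timezone.utc).isoformat(),
        "files": list(files),
        "purpose": purpose,
    }
    # Append mode keeps earlier entries, so the log grows into a history.
    with open(log_path, "a") as handle:
        handle.write(json.dumps(entry) + "\n")
    return entry


# A temporary directory stands in for the project folder (assumption).
log_path = Path(tempfile.mkdtemp()) / "manual_steps.jsonl"
log_manual_step(log_path, ["samples.xlsx"], "removed duplicated header row by hand")

entries = [json.loads(line) for line in log_path.read_text().splitlines()]
print(entries)
```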

Rule 3: Archive the Exact Versions of All External Programs Used
Rule 4: Version Control All Custom Scripts
Rule 5: Record All Intermediate Results, When Possible in Standardized Formats
Rule 6: For Analyses That Include Randomness, Note Underlying Random Seeds

p.3: As a minimum, you should note which analysis steps involve randomness, so that a certain level of discrepancy can be anticipated when reproducing the results. -- Highlighted mar 12, 2014
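
Rule 6 goes one step further than this minimum: note the seed itself. A minimal sketch of what that can look like, assuming Python's standard `random` module; the function `run_simulation` and the seed value are hypothetical.

```python
import random


def run_simulation(seed, n_draws=5):
    """Draw pseudo-random numbers from a generator seeded explicitly."""
    rng = random.Random(seed)  # local generator, no hidden global state
    return [rng.random() for _ in range(n_draws)]


SEED = 20140312  # hypothetical seed; record it alongside the results

first = run_simulation(SEED)
second = run_simulation(SEED)

print("seed:", SEED)
print("identical reruns:", first == second)
```

Using a local `random.Random(seed)` instance rather than the module-level functions is a design choice here: it keeps the seeded stream from being disturbed by other code that also draws random numbers.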

Rule 7: Always Store Raw Data behind Plots

p.3: As a minimum, one should note which data formed the basis of a given plot and how this data could be reconstructed. -- Highlighted mar 12, 2014
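
One simple way to follow Rule 7 is to write the plotted values to a plain file next to the figure. The sketch below assumes a CSV file is acceptable and leaves the plotting call itself out; the helpers `save_plot_data` and `load_plot_data`, the file name, and the numbers are hypothetical.

```python
import csv
import tempfile
from pathlib import Path


def save_plot_data(path, x_values, y_values, x_name="x", y_name="y"):
    """Write the values behind a plot to a CSV file with a header row."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow([x_name, y_name])
        writer.writerows(zip(x_values, y_values))


def load_plot_data(path):
    """Read the values back so the plot can be redrawn or modified later."""
    with open(path, newline="") as handle:
        rows = list(csv.reader(handle))
    header, body = rows[0], rows[1:]
    return header, [(float(x), float(y)) for x, y in body]


# A temporary directory stands in for the figure output folder (assumption).
data_path = Path(tempfile.mkdtemp()) / "figure1_data.csv"
save_plot_data(data_path, [1, 2, 3], [0.5, 0.25, 0.125], "iteration", "error")

header, points = load_plot_data(data_path)
print(header, points)
```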

Rule 8: Generate Hierarchical Analysis Output, Allowing Layers of Increasing Detail to Be Inspected
Rule 9: Connect Textual Statements to Underlying Results

p.3: Although the results of analyses and their corresponding textual interpretations are clearly interconnected at the conceptual level, they tend to live quite separate lives in their representations: results usually live on a data area on a server or personal computer, while interpretations live in text documents in the form of personal notes or emails to collaborators. Such textual interpretations are not generally mere shadows of the results— they often involve viewing the results in light of other theories and results. As such, they carry extra information, while at the same time having their necessary support in a given result. -- Highlighted mar 12, 2014

Rule 10: Provide Public Access to Scripts, Runs, and Results

p.3: Making reproducibility of your work by peers a realistic possibility sends a strong signal of quality, trustworthiness, and transparency. This could increase the quality and speed of the reviewing process on your work, the chances of your work getting published, and the chances of your work being taken further and cited by other researchers after publication [25]. -- Highlighted mar 12, 2014