Documenting Scientific Code and Data

EDS 214: Analytical Workflows and Scientific Reproducibility


Day 2 Afternoon | August 26th, 2025

This afternoon, you’ll learn:


  • The importance of documentation for reproducibility
  • The target audiences for different types of documentation
  • How to use READMEs, code comments, and code style to document your workflow

LEGO Blueprints Activity


Step 1: Build (in trios)

  • Form trios and receive LEGOs
  • Each trio builds a LEGO structure. Get weird with it!
  • Take a picture of the finished product
  • Destroy the LEGO structure

LEGO Blueprints Activity


Step 2: Copy (in pairs)

  • Switch LEGOs and pictures with another trio
  • Try to replicate the structure using only the picture
  • Take notes about challenges encountered

LEGO Blueprints Activity


Step 3: Share (in groups)

  • Trios compare original and finished structures
  • Take notes about what was different
  • Describe 2-3 pieces of documentation that would have helped

LEGO Blueprints Activity


Step 4: Debrief (as a class)

Types of Documentation


README

A file that (succinctly) describes the project and how it works.

Comments

Annotations within the code that explain why it does what it does.

Code style

A consistent format for writing code, including naming conventions and structure.

README Documentation


Typical README contents

  1. A short, but descriptive, title
  2. A brief explanation of the repository’s purpose
  3. A concise description of what’s housed in the repository
  4. Details regarding data access
  5. A list of authors or current contributors (for collaborative work)
  6. References
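
As a rough sketch of how this list could become a starting template, the R snippet below writes a README skeleton with one heading per item. The Markdown format, the heading wording, the placeholder text, and the helper name `make_readme_skeleton()` are all assumptions made for illustration, not a prescribed layout.

```r
# Section headings adapted from the list above; the exact wording
# and the "TODO" placeholders are assumptions
readme_sections <- c(
  "Purpose",
  "Repository contents",
  "Data access",
  "Authors and contributors",
  "References"
)

# Build a Markdown skeleton: a title line, then one heading per section
# (make_readme_skeleton is a hypothetical helper, not a library function)
make_readme_skeleton <- function(title, sections) {
  body <- unlist(lapply(sections, function(section) {
    c(paste("##", section), "", "TODO", "")
  }))
  c(paste("#", title), "", body)
}

readme_lines <- make_readme_skeleton("Project title goes here", readme_sections)

# Written to a temporary directory here; in a real project the file
# would usually sit at the top level of the repository
readme_path <- file.path(tempdir(), "README.md")
writeLines(readme_lines, readme_path)
```

Each "TODO" would then be replaced with a sentence or two, keeping the whole file short.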

Code Comments


Flavors of code comments

Headers

Describe the file’s purpose, author, and dates created and modified

Inline

Explain complex logic, clarify assumptions, warn about edge cases

Function

Purpose, parameters, and return values of functions
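
To see the three flavors side by side, here is a minimal sketch of a single R script that uses all of them. The file name, author and date placeholders, example values, and the `standardize()` helper are hypothetical and exist only for illustration.

```r
# ------------------------------------------------------------------
# clean_temperature.R (hypothetical file name)
# Purpose:  standardize daily stream temperatures before plotting
# Author:   Your Name
# Created:  YYYY-MM-DD
# Modified: YYYY-MM-DD
# ------------------------------------------------------------------

# Standardize a numeric vector (function comment)
#
# Purpose: rescale values to mean 0 and standard deviation 1
# Parameters:
#   x: numeric vector; may contain NA
# Returns: numeric vector the same length as x
standardize <- function(x) {
  # Ignore NAs when computing summary statistics so one missing
  # reading doesn't turn every output into NA (inline comment)
  center <- mean(x, na.rm = TRUE)
  spread <- sd(x, na.rm = TRUE)

  # Edge case: constant or single-value input has no usable spread,
  # so only center the values instead of dividing by zero or NA
  if (is.na(spread) || spread == 0) {
    return(x - center)
  }

  (x - center) / spread
}

# Made-up example values (degrees C), including one missing reading
temp_c <- c(10, 12, NA, 14)
temp_z <- standardize(temp_c)
```

Note that the inline comments say why each step exists (handling missing readings, guarding an edge case) rather than restating what the code does.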

Code Style


Readability → Comprehension → Reproducibility

Consistency is key! (Just like workflow folder organization)

Key considerations:

  • Object naming conventions
  • Spacing and layout
  • Writing and calling functions
  • Comments!

Tidyverse Style Guide
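
The considerations above can be sketched with a small before-and-after example. The data and object names are invented for illustration. The conventions shown (snake_case names, `<-` for assignment, spaces around operators and after commas, two-space indentation) follow common tidyverse-style habits.

```r
# Harder to read: inconsistent names, cramped spacing, no comments
# D=data.frame(Site=c("a","b"),t=c(10.5,12.1));D$F=D$t*9/5+32

# Easier to read: descriptive snake_case names, consistent spacing,
# and a small, well-named function

# Convert temperatures from degrees Celsius to degrees Fahrenheit
celsius_to_fahrenheit <- function(temp_c) {
  temp_c * 9 / 5 + 32
}

# Made-up example data
stream_temps <- data.frame(
  site = c("a", "b"),
  temp_c = c(10.5, 12.1)
)

stream_temps$temp_f <- celsius_to_fahrenheit(stream_temps$temp_c)
```

Both versions compute the same values; only the second tells a reader what the columns mean and where the conversion happens.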

Documentation Scavenger Hunt


Instructions

  1. Form groups of 3
  2. Pick a published repository from yesterday
  3. Find examples of READMEs, comments (header, inline, and function), and coding style in each repo

Note: not every repo will have all of these!

  4. I will call on three groups to share

Roxygen


Roxygen is a special commenting system for R functions.

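#' Norm of the jerk vector
#'
#' Uses Savitzky-Golay filtering to de-noise before differentiating acceleration
#'
#' @param A Tri-axial acceleration (n x 3 matrix)
#' @param fs Sampling rate (Hz)
#' @param p Order of Savitzky-Golay filter (3 by default)
#' @param n Window size of Savitzky-Golay filter (11 by default)
#'
#' @return Norm of the jerk vector of A (numeric vector of length n)
#' @export
jerk <- function(A, fs, p = 3, n = 11) {
  # Savitzky-Golay windows must be odd, so bump an even n up by one
  n <- n + 1 - n %% 2
  # Smoothed first derivative (m = 1) of each axis, per sample
  dA <- apply(A, 2, function(axis) signal::sgolayfilt(axis, p, n, m = 1))
  # Rescale to per second using fs, then take the row-wise Euclidean norm
  sqrt(rowSums((dA * fs)^2))
}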
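
The same tags can be tried without any package dependencies. The sketch below documents a hypothetical helper, `row_norm()`, which is not part of the course materials, and adds an `@examples` tag. Inside an R package, running `roxygen2::roxygenise()` or `devtools::document()` turns comments like these into help files.

```r
#' Euclidean norm of each row
#'
#' @param X Numeric matrix (n x k)
#'
#' @return Numeric vector of length n
#'
#' @examples
#' row_norm(matrix(c(3, 4), nrow = 1))
#' @export
row_norm <- function(X) {
  sqrt(rowSums(X^2))
}

# Rows (3, 4) and (6, 8) have norms 5 and 10
norms <- row_norm(matrix(c(3, 4, 6, 8), nrow = 2, byrow = TRUE))
```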

Documentation Recap


Why document?

  • Facilitates collaboration
  • Supports knowledge transfer
  • Prevents nasty, horrible, awful headaches

Key documentation types:

  • READMEs
  • Code comments
  • Consistent style