Reproducible Workflows

EDS 214: Analytical Workflows and Scientific Reproducibility


Day 1 Afternoon | August 25th, 2025

Workflow organization


Let’s look at some examples of workflows on GitHub

All focus on nearby marine ecosystems, but the principles apply to any subject

Keep your eyes peeled for these four components:

  • Raw data
  • Code
  • Outputs
  • Documentation

Kelp me kelp you


Global patterns of kelp forest change over the past half-century (Krumhansl et al. 2016)

MPAs protect against marine heatwaves


Marine protected areas promote stability of reef fish communities under climate warming (Benedetti-Cecchi et al. 2024)

MPAs don’t protect against marine heatwaves


A marine protected area network does not confer community structure resilience to a marine heatwave across coastal ecosystems (Smith et al. 2023)

Workflow organization jigsaw pt 1


Try to find the following four components in the workflow:

  • Raw data
  • Code
  • Outputs
  • Documentation

For each component, make a note of:

  • The name of the folder(s) containing it
  • How you found that component (intuition? documentation? parsing code?)

Workflow organization jigsaw pt 2


  • Now, form groups with one student from each workflow
  • Each person shares what they found in their workflow
  • As a group, write down 2 traits of a workflow that make it easy to find things and 2 traits that make it challenging (4 traits total)
  • Provide examples for each
  • I will randomly call on a group to share

What are the goals for workflow organization?


Reproducibility

Another scientist (including Future You) should be able to repeat your analysis.

Maintainability

You should be able to jump back into editing your analysis, even if you haven’t looked at it in a while.

Collaboration

You should be able to share your analysis methods and results with others.

Tools that help with workflow organization


GitHub

  • Version control - track your (and your collaborators’) changes
  • Issues - a built-in to-do list directly connected to code and conversations
  • Branches and pull requests - work in parallel and merge your work as seamlessly as possible
  • GitHub Pages - a website for sharing your analysis with collaborators

Tools that help with workflow organization


Folder organization

  • It’s less important which system you pick than it is to be consistent
  • Free up cognitive load!
    • Brainpower you would have spent figuring out where to put or find files gets reallocated to your actual science
  • Follow conventions
    • By playing nice with others you get to use their tools

Introducing: The FlukeAndFeather Workflow Organization System™


  • You have before you a shuffled collection of folders and files
  • You have no context other than their names
  • With a partner, try to:
    • Organize the folders and files hierarchically
    • Briefly describe what you think the purpose of each folder/file is
    • Use the two blank cards to add two files to the project
  • We will discuss your results in groups

Introducing: The FlukeAndFeather Workflow Organization System™


SNOWBALL

5 minutes

  • Combine pairs
  • Choose a notetaker
  • Share and combine into a single solution
  • Notetaker writes down:
    • One example of differing solutions
    • How you decided which solution to use
    • One remaining question about a file/folder

5 minutes

  • Combine groups again to make groups of EIGHT(ish)
  • Choose a new notetaker
  • Repeat the exercise

I will randomly call on two notetakers to share

Works cited

Benedetti-Cecchi, Lisandro, Amanda E. Bates, Giovanni Strona, Fabio Bulleri, Barbara Horta e Costa, Graham J. Edgar, Bernat Hereu, et al. 2024. “Marine Protected Areas Promote Stability of Reef Fish Communities Under Climate Warming.” Nature Communications 15 (1). https://doi.org/10.1038/s41467-024-44976-y.
Krumhansl, Kira A., Daniel K. Okamoto, Andrew Rassweiler, Mark Novak, John J. Bolton, Kyle C. Cavanaugh, Sean D. Connell, et al. 2016. “Global Patterns of Kelp Forest Change over the Past Half-Century.” Proceedings of the National Academy of Sciences 113 (48): 13785–90. https://doi.org/10.1073/pnas.1606102113.
Smith, Joshua G., Christopher M. Free, Cori Lopazanski, Julien Brun, Clarissa R. Anderson, Mark H. Carr, Joachim Claudet, et al. 2023. “A Marine Protected Area Network Does Not Confer Community Structure Resilience to a Marine Heatwave Across Coastal Ecosystems.” Global Change Biology 29 (19): 5634–51. https://doi.org/10.1111/gcb.16862.