Building a Data Pipeline

Session Description

In this session, we’ll explore the basic workflow that we’ll use throughout the semester to package and share analysis. We’ll build familiarity with Quarto and with basic operations in GitHub so that you can share your code and analysis with others.

Lab 1 Link

Before Class

Review today’s lab guide.

Ensure that your computer has the latest stable versions of R and RStudio installed.

Accept the GitHub invitation to our Lab 1 repository and download the repository to your local computer (we will set up more advanced tools for interacting with GitHub in our next lab session).

Reflect

Workflows

Which common tasks in your workflows do you think would benefit from a data pipeline?
How do we hold ourselves accountable for our analysis?

Readings

Whose interests and goals do you seek to represent through your work?
What missing datasets (akin to the Library of Missing Datasets) have you observed?¹

Slides

Resources for Further Exploration

Footnotes

¹ At the beginning of our session, we’ll catalog some of these datasets; it may help to write down some of your thoughts to share.