Software Carpentries for Scientific Programming


Course Description

An introduction to foundational programming techniques, data organization, and computational analysis approaches commonly used across the biomedical and informational sciences. This course uses open-source lessons and curricula developed by The Carpentries organization (https://carpentries.org/). Students will gain an understanding of the command-line interface and how it can be used to manipulate files and perform data analysis tasks on their computers.


Athena Title

Software Carpentries


Equivalent Courses

Not open to students with credit in BINF 7960E


Non-Traditional Format

Credit hours and lecture hours per week will vary by semester, depending on the course length and format. The course may be run as a traditional semester-long course or as an intensive, immersive course (e.g., a weeklong course during the summer).


Semester Course Offered

Offered fall, spring and summer


Grading System

S/U (Satisfactory/Unsatisfactory)


Student Learning Outcomes

  • Students will be able to explain how to use command-line computing interfaces and demonstrate the use of common Unix commands and shell scripts.
  • Students will be able to compare and contrast the benefits and drawbacks of command-line versus graphical interfaces.
  • Students will understand the basic principles of automated version control systems (Git) and explain how to apply these systems to common computational tasks.
  • Students will develop a working knowledge of Python programming fundamentals and common programming applications of this language.
  • Students will develop a working knowledge of R programming and the use of the RStudio graphical interface, with an emphasis on using R for data cleaning, organization, and visualization.
  • Students will understand how specific data science tools and workflows are commonly applied across diverse scientific disciplines such as Ecology, Genomics, Social Science, and Library Science.

Topical Outline

  • Topic 1: The Unix Shell
      • Introducing the Shell
      • Navigating and Working with Files and Directories
      • Pipes and Filters
      • Loops
      • Shell Scripts
      • Finding Files on the Command Line
  • Topic 2: Version Control with Git
      • Introduction to Automated Version Control Systems
      • Setting up Git and Creating a Repository
      • Tracking Changes and Exploring History
      • Ignoring Files and Customizing Tracking
      • Collaboration and Using Remotes in GitHub
      • Identifying and Correcting Conflicts
      • Open Science, Licensing, Citation, and Hosting
      • Integrating Git and RStudio
  • Topic 3: Programming with Python
      • Python Fundamentals
      • Lists, Loops, and Conditionals
      • Analyzing Data from Multiple Files
      • Creating Functions
      • Errors and Exceptions
      • Debugging and Defensive Programming
      • Plotting and Visualizing Data Using Python
      • Best Practices for Code Formatting and Commenting
  • Topic 4: Programming with R and RStudio
      • Fundamentals of R and RStudio
      • Data Types and Data Structures
      • Importing and Organizing Common Types of Data Files (CSV, TSV)
      • Creating Functions and Using Statements
      • Loops and the Call Stack
      • Categorical Data and Factors
      • Dynamic Reports with knitr
      • Data Frame Manipulation with dplyr and tidyr
      • Visualizing Data Using Base R and Common Commands
      • Creating Publication-Quality Graphics with ggplot2
      • Best Practices for Writing R Code
      • Making Packages in R
  • Topic 5: Introduction to Data Analysis Workflows in Ecology, Genomics, Geospatial Data, Social Sciences, and Library Sciences