
Introduction to Coding in R, Data Science and Simulation for Public Health and the Life Sciences


Course Description

Introduction to programming in R with a focus on data analysis, visualization, and simulation. Students will learn to examine data and think algorithmically, as well as develop intuition for key study design and statistical concepts.


Athena Title

Intro Coding in R for Pub Hlth


Prerequisite

BIOS 7010 or BIOS 7010E or permission of department


Grading System

A - F (Traditional)


Course Objectives

Scientific computing has become a powerful tool for research in public health and other life sciences. This course introduces students to some of the skills necessary for scientific computing using the programming language R, with a focus on example applications in public health. While the primary course goal is to learn how to program efficiently, coding exercises will be tailored to help develop students’ intuitions in ways that will aid their research and cement concepts from other courses in data analysis and statistics. Course time will be divided into short lectures and interactive individual and group coding exercises. The course will focus as much on specific programming concepts as on how to learn R and programming efficiently, so that students can continue to improve their programming skills on their own after the course. While this course is tailored to R, many of the skills gained will be readily applicable to other programming languages.
By the end of the course, students will be able to:
• Understand the basics of interacting with a programming language
• Understand and apply algorithmic thinking
• Create R scripts to reproducibly import, analyze, and visualize data
• Apply debugging and R-help skills to read and understand others’ computer code
• Evaluate an R script’s efficiency through profiling
• Create scripts that use random number generation to simulate data and use these simulations to understand statistical concepts
• Search R-help and the internet efficiently for self-directed learning during and after the course


Topical Outline

Specific topics include using R to:
• write and comment readable code
• import and export data
• load R packages
• understand R objects: vectors, factors, matrices, data frames, lists
• reproducibly clean, manage, and visualize data
• use for loops and conditional statements
• write functions
• debug code
• automatically handle and analyze textual data
• vectorize operations
• use random number generators
• create a workflow for automating data processing and analysis
• perform simple statistical analyses
Additional advanced concepts that may be covered include:
• using Git version control to manage and collaborate on coding projects
• simulating data to understand statistical concepts (e.g., P-values, confidence intervals, and likelihood)
• understanding random variables from diverse probability distributions
• permutation tests
• working with large data sets
• managing multi-script workflows
• statistical power simulations for simple study designs


Syllabus