Data Science Capstone Course


Course Description

Provides exposure to advanced methods and technologies in data science, including data acquisition, data quality, big data management and analytics, data mining, and data security and privacy, and gives students data science experience with a real-world problem. In addition, effective oral and written communication of technologies, methods, and results is emphasized.


Athena Title

Data Science Capstone Course


Prerequisite

(CSCI 4360/6360 and CSCI 4370/6370) or (STAT 4220 and STAT 4230/6230)


Semester Course Offered

Offered spring


Grading System

A - F (Traditional)


Course Objectives

Students graduating with a Data Science major are expected to have experience beyond the textbook, that is, experience managing parts of, or even complete, data life cycles in real-world settings. The data life cycle includes data acquisition, data management, data analysis, data mining, and security and privacy protection of data. Data life cycles in organizations often involve many complex issues: data comes from multiple distinct sources; data sets are heterogeneous along multiple dimensions, including size, quality, type/structure, and velocity; and organizations have very different data analytics needs. Furthermore, the tradeoffs among these issues tend to be intricate. Traditional textbook-based courses rarely expose students to these issues and the tradeoffs among them. A data scientist should also be able to interact with various stakeholders in an organization to understand their problems and requirements, a skill that can only truly be acquired through experience. This capstone course will provide students with the opportunity to learn about advanced data science methods and techniques and apply them in a real-world setting. This course will (a) round out their education; (b) prepare them for data scientist positions in the real world; and (c) make them more attractive to prospective employers.


Topical Outline

A wide variety of topics can be covered in this capstone course, depending on the preferences of the instructors and the students. Since all students will have taken courses in which they learned the basic methods of data analysis, the instructors can concentrate on other techniques, according to their own preferences, the preferences of the class, and the particular projects the students will be working on. Possibilities include data acquisition and ingestion, data integration, data analytics, streaming data management, big data systems, data security, and data privacy. The idea is not to cover any one of these topics in depth and exclusively; rather, we propose covering many different topics over the course of the semester, spending three to four lectures on each. In addition, the course will include units on effective written and oral communication and on how to make a poster presentation. The remaining class periods will be used for student presentations of their projects.

