Privacy-Preserving Data Analysis


Course Description

An introduction to privacy-preservation problems in modern data analysis, such as machine learning and data mining, and to algorithmic and statistical techniques for data privacy. Approaches include randomized algorithms, synthetic data generation, and stability analysis.


Athena Title

Privacy-Preserving Data Analys


Prerequisite

CSCI 4380/6380 or permission of department


Semester Course Offered

Not offered on a regular basis.


Grading System

A - F (Traditional)


Course Objectives

The goal of this course is to introduce students to privacy-preservation problems in modern data analysis, such as machine learning and data mining. The course explores mathematically rigorous algorithmic and statistical tools that enable analysis of sensitive data while protecting the privacy of individuals. It is appropriate for students preparing to do research in machine learning and data mining, as well as for science and engineering students who want to learn how to handle privacy issues related to their research.


Topical Outline

Review of linear algebra and probability theory
Privacy notions:
  - k-anonymity
  - l-diversity
  - differential privacy (overview)
Differential privacy:
  - Laplace mechanisms
  - exponential mechanism
Randomized algorithms
Linear query answering mechanisms:
  - interactive vs. non-interactive mechanisms
Synthetic database generation:
  - histogram-based approaches
  - Bayesian network-based approaches
Algorithmic stability analysis
Private convex optimization:
  - objective perturbation
  - posterior sampling


Syllabus