UGA Bulletin

Course Description

Reinforcement learning studies methods for learning to act optimally from rewards or punishments received over time. Such machine learning is useful when we wish to learn high-quality behavior under uncertainty and the only data are reward signals. Introduces classical and modern methods in single- and multi-agent settings.

Additional Requirements for Graduate Students:
All graduate students will complete additional course work and will be evaluated in a separate pool commensurate with earning graduate credit hours for taking the course. Specifically, graduate students will be required to solve additional questions involving greater academic rigor on both the paper-and-pencil and programming parts of the assignments. Answering these questions will require the graduate students to research additional course material and read more research papers. The exam will contain additional required questions of both short-answer and long-answer formats for the graduate students. These questions will require a more in-depth analysis and understanding of the course material. Finally, all graduate students will be evaluated in a pool, separated from undergraduate students in the course, and the grade cutoffs will be stricter, which implies more stringent expectations of accomplishment from the graduate students.

Athena Title

Reinforcement Learning

Prerequisite

CSCI(PHIL) 4550/6550

Semester Course Offered

Not offered on a regular basis.

Grading System

A - F (Traditional)

Course Objectives

1. Situate and understand a key area of artificial intelligence, specifically within the field of machine learning. Understand the corresponding class of problems.
2. Study the challenges and algorithms for reinforcement learning by agents situated in uncertain single-agent and multi-agent environments.
3. Gain proficiency in the use of computing tools related to reinforcement learning, designing and giving effective research presentations, and working in a team.

Topical Outline

I. Introduction
   a. Requirements for reinforcement learning (RL) and its limitations, exploration vs. exploitation
   b. Probability theory background
   c. History of RL
II. Model-based RL
   a. Markov decision processes (MDP)
   b. Planning using dynamic programming
   c. Model learning (CE, Dyna, prioritized sweeping, RTDP*)
III. Model-free RL
   a. Value-based learning
      - On-policy methods (Sarsa, TD, eligibility traces)
      - Off-policy methods (Q-learning, Deep Q networks)
   b. Policy-based learning
      - Policy gradient methods (Monte Carlo, Trust-region, Proximal policy)
      - Actor-Critic Schema (A2C, A3C)
IV. Advanced Concepts in RL
   a. Inverse RL
   b. Multi-agent RL (multi-agent AC, MADDPG, LOLA)
   c. Human RL (time permitting)
