WEEK 1
Review of ML fundamentals – Classification, Regression. Review of probability theory and optimization concepts.
WEEK 2
RL Framework; Supervised learning vs. RL; Explore-Exploit Dilemma; Examples.
WEEK 3
Multi-Armed Bandits (MAB): Definition, Uses, Algorithms, Contextual Bandits; Transition to the full RL problem.
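
For reference, a minimal sketch of one bandit algorithm from this week, ε-greedy on Bernoulli arms; the arm probabilities, step count, and ε below are illustrative choices, not values from the course.

```python
import random

def eps_greedy_bandit(arm_probs, steps=10_000, eps=0.1):
    """epsilon-greedy on a Bernoulli bandit with incremental mean estimates."""
    n_arms = len(arm_probs)
    counts = [0] * n_arms      # pulls per arm
    values = [0.0] * n_arms    # running mean reward per arm
    total_reward = 0.0
    for _ in range(steps):
        if random.random() < eps:                       # explore
            a = random.randrange(n_arms)
        else:                                           # exploit current estimates
            a = max(range(n_arms), key=values.__getitem__)
        r = 1.0 if random.random() < arm_probs[a] else 0.0
        counts[a] += 1
        values[a] += (r - values[a]) / counts[a]        # incremental mean update
        total_reward += r
    return values, total_reward

# Illustrative run: three arms whose success probabilities the agent must learn.
print(eps_greedy_bandit([0.3, 0.5, 0.7]))
```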
WEEK 4
Intro to MDPs: Definitions, Returns, Value function, Q-function.
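
The week's core definitions, written out in standard notation: the discounted return, the state-value function, and the action-value function, with discount factor gamma in [0, 1).

```latex
G_t = \sum_{k=0}^{\infty} \gamma^k R_{t+k+1}, \qquad
V^{\pi}(s) = \mathbb{E}_{\pi}\left[\, G_t \mid S_t = s \,\right], \qquad
Q^{\pi}(s,a) = \mathbb{E}_{\pi}\left[\, G_t \mid S_t = s,\ A_t = a \,\right]
```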
WEEK 5
Bellman Equation, Dynamic Programming (DP), Value Iteration, Policy Iteration, Generalized Policy Iteration.
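
A compact value-iteration sketch, assuming the MDP is handed over as explicit tables; the dictionary layout (P[s][a] as a list of (prob, next_state) pairs, R[s][a] as expected reward) is an assumption for illustration.

```python
def value_iteration(states, actions, P, R, gamma=0.9, tol=1e-6):
    """P[s][a]: list of (prob, next_state); R[s][a]: expected immediate reward."""
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            # Bellman optimality backup: best one-step lookahead over actions.
            best = max(
                R[s][a] + gamma * sum(p * V[s2] for p, s2 in P[s][a])
                for a in actions
            )
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:        # stop once no state changes by more than tol
            return V
```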
WEEK 6
Evaluation and Control: TD learning, SARSA, Q-learning, Monte Carlo, TD(λ), Eligibility Traces.
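
The two tabular control updates at the heart of this week, side by side; the dict-of-dicts Q-table layout and the step sizes are assumptions for illustration.

```python
def sarsa_update(Q, s, a, r, s2, a2, alpha=0.1, gamma=0.99):
    # On-policy TD control: bootstrap from the action a2 actually taken next.
    Q[s][a] += alpha * (r + gamma * Q[s2][a2] - Q[s][a])

def q_learning_update(Q, s, a, r, s2, alpha=0.1, gamma=0.99):
    # Off-policy TD control: bootstrap from the greedy action in s2.
    Q[s][a] += alpha * (r + gamma * max(Q[s2].values()) - Q[s][a])
```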
WEEK 7
Maximization Bias & Representations: Double Q-learning, Tabular vs. Parameterized representations, Q-learning with Neural Networks.
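
A sketch of the Double Q-learning update, which tackles maximization bias by letting one table select the greedy action while the other evaluates it (same assumed Q-table layout as above).

```python
import random

def double_q_update(QA, QB, s, a, r, s2, alpha=0.1, gamma=0.99):
    # Flip a coin over which table to update; the other table scores the
    # argmax, so the same noisy estimate never both selects and evaluates.
    if random.random() < 0.5:
        a_star = max(QA[s2], key=QA[s2].get)                          # select with QA
        QA[s][a] += alpha * (r + gamma * QB[s2][a_star] - QA[s][a])   # evaluate with QB
    else:
        a_star = max(QB[s2], key=QB[s2].get)                          # select with QB
        QB[s][a] += alpha * (r + gamma * QA[s2][a_star] - QB[s][a])   # evaluate with QA
```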
WEEK 8
Function approximation: Semi-gradient methods, SGD, DQNs, Replay Buffer.
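
A minimal experience replay buffer of the kind DQN samples minibatches from; fixed capacity with uniform sampling is the standard scheme, and the sizes below are illustrative.

```python
import random
from collections import deque

class ReplayBuffer:
    """Stores (s, a, r, s2, done) transitions and samples uniform minibatches,
    breaking the temporal correlation of consecutive experience."""

    def __init__(self, capacity=100_000):
        self.buffer = deque(maxlen=capacity)    # oldest transitions are evicted

    def push(self, s, a, r, s2, done):
        self.buffer.append((s, a, r, s2, done))

    def sample(self, batch_size=32):
        batch = random.sample(self.buffer, batch_size)
        return [list(column) for column in zip(*batch)]   # columns: s, a, r, s2, done

    def __len__(self):
        return len(self.buffer)
```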
WEEK 9
Policy Gradients: Introduction, Motivation, REINFORCE, the Policy Gradient theorem, Introduction to Actor-Critic (AC) methods.
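
The policy gradient theorem and the REINFORCE estimator it justifies, in standard form (G_t is the Week 4 return, sampled along trajectories):

```latex
\nabla_\theta J(\theta)
  = \mathbb{E}_{\pi_\theta}\!\left[\, \nabla_\theta \log \pi_\theta(A_t \mid S_t)\, Q^{\pi_\theta}(S_t, A_t) \,\right]
  \;\approx\; \frac{1}{N} \sum_{i=1}^{N} \sum_{t} \nabla_\theta \log \pi_\theta(a^i_t \mid s^i_t)\, G^i_t
```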
WEEK 10
Actor-Critic Methods: Baselines, Advantage Actor-Critic (A2C), A3C.
Advanced Value-Based Methods: Double DQN, Prioritized Experience Replay, Dueling Architectures, Expected SARSA.
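
A sketch of the Double DQN target, which carries the Week 7 decoupling idea over to DQN: the online network selects the next action, the target network evaluates it. The names q_online and q_target and the tensor shapes are assumptions; this is not a full training loop.

```python
import torch

def double_dqn_targets(q_online, q_target, r, s2, done, gamma=0.99):
    """r, done: float tensors of shape [B]; s2: batch of next states.
    q_online / q_target map a state batch to Q-values of shape [B, n_actions]."""
    with torch.no_grad():
        a_star = q_online(s2).argmax(dim=1, keepdim=True)     # online net selects
        q_next = q_target(s2).gather(1, a_star).squeeze(1)    # target net evaluates
        return r + gamma * (1.0 - done) * q_next              # zero bootstrap at terminals
```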
WEEK 11
Advanced PG/Actor-Critic methods: Deterministic Policy Gradients and DDPG, Soft Actor-Critic (SAC).
Hierarchical RL (HRL): Introduction to hierarchies, types of optimality, SMDPs, Options, HRL algorithms.
POMDPs: Intro, Definitions, Belief states, Solution Methods; History-based methods, LSTMs, Q-MDPs, Direct Solutions, Predictive State Representations (PSRs).
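
A sketch of the discrete belief-state update: after taking action a and observing o, the belief is pushed through the transition model and reweighted by the observation likelihood. The nested-dict layout of T and O is assumed for illustration.

```python
def belief_update(b, a, o, T, O, states):
    """b: dict state -> prob; T[s][a][s2]: transition prob; O[s2][a][o]: obs prob."""
    b2 = {}
    for s2 in states:
        # Predict with the transition model, then correct with the
        # observation likelihood (one step of a discrete Bayes filter).
        b2[s2] = O[s2][a][o] * sum(T[s][a][s2] * b[s] for s in states)
    z = sum(b2.values())                 # normalizer = Pr(o | b, a)
    return {s2: p / z for s2, p in b2.items()}
```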
WEEK 12
Model-Based RL: Introduction, Motivation, Connections to Planning, Types of MBRL, Benefits, RL with a Learnt Model, Dyna-style models, Latent variable models, Examples, Implicit MBRL.
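
A Dyna-Q style step, illustrating "RL with a learnt model": each real transition updates Q directly, updates a one-step model, and then fuels n imagined planning updates (a deterministic tabular model is assumed for brevity).

```python
import random

def dyna_q_step(Q, model, s, a, r, s2, n_planning=10, alpha=0.1, gamma=0.99):
    # Direct RL: ordinary Q-learning update from the real transition.
    Q[s][a] += alpha * (r + gamma * max(Q[s2].values()) - Q[s][a])
    # Model learning: remember the last observed outcome of (s, a).
    model[(s, a)] = (r, s2)
    # Planning: replay imagined transitions drawn from the learnt model.
    for _ in range(n_planning):
        (ps, pa), (pr, ps2) = random.choice(list(model.items()))
        Q[ps][pa] += alpha * (pr + gamma * max(Q[ps2].values()) - Q[ps][pa])
```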
Case study on the design of RL solutions for real-world problems.