Course Page - IIT Madras Degree Program

Degree Level Course

Deep Learning for Computer Vision

- Knowledge of basics of image processing and computer vision
- Knowledge of building blocks of deep learning including feedforward networks, convolutional neural networks, recurrent neural networks and transformers
- Knowledge of generative AI models in computer vision
- Knowledge of recent trends including explainability/zero-shot learning, few-shot learning, self-supervised learning, etc.
- Hands-on experience on implementation of basic image processing tasks
- Hands-on experience on implementation of deep learning models for computer vision tasks
- Hands-on experience on implementation of advanced computer vision tasks such as explainability, self-supervised learning, etc.

by Prof. Vineeth N B

Course ID: BSCS5003

Course Credits: 4

Course Type: Elective

Pre-requisites: None

Course structure & Assessments

For details of the standard course structure and assessments, visit the Academics page.

WEEK 1	Introduction and Overview: Course Overview and Motivation; Introduction to Image Formation, Capture and Representation; Linear Filtering, Correlation, Convolution
WEEK 2	Visual Features and Representations: Edge, Blob, Corner Detection; Scale Space and Scale Selection; SIFT, SURF; HoG, LBP, etc.
WEEK 3	Visual Matching: Bag-of-words, VLAD; RANSAC, Hough transform; Pyramid Matching; Optical Flow
WEEK 4	Deep Learning Review: Review of Deep Learning, Multi-layer Perceptrons, Backpropagation
WEEK 5	Convolutional Neural Networks (CNNs): Introduction to CNNs; Evolution of CNN Architectures: AlexNet, ZFNet, VGG, InceptionNets, ResNets, DenseNets
WEEK 6	Visualization and Understanding CNNs: Visualization of Kernels; Backprop-to-image/Deconvolution Methods; Deep Dream, Hallucination, Neural Style Transfer; CAM, Grad-CAM, Grad-CAM++; Recent Methods (IG, Segment-IG, SmoothGrad)
WEEK 7	CNNs for Recognition, Verification, Detection, Segmentation: CNNs for Recognition and Verification (Siamese Networks, Triplet Loss, Contrastive Loss, Ranking Loss); CNNs for Detection: Background of Object Detection, R-CNN, Fast R-CNN, Faster R-CNN, YOLO, SSD, RetinaNet; CNNs for Segmentation: FCN, SegNet, U-Net, Mask R-CNN
WEEK 8	Recurrent Neural Networks (RNNs): Review of RNNs; CNN + RNN Models for Video Understanding: Spatio-temporal Models, Action/Activity Recognition
WEEK 9	Attention Models: Introduction to Attention Models in Vision; Vision and Language: Image Captioning, Visual QA, Visual Dialog; Spatial Transformers; Transformer Networks
WEEK 10	Deep Generative Models: Review of (Popular) Deep Generative Models: GANs, VAEs; Other Generative Models: PixelRNNs, NADE, Normalizing Flows, etc.
WEEK 11	Variants and Applications of Generative Models in Vision: Applications: Image Editing, Inpainting, Super-resolution, 3D Object Generation, Security; Variants: CycleGANs, Progressive GANs, StackGANs, Pix2Pix, etc.
WEEK 12	Recent Trends: Zero-shot, One-shot, Few-shot Learning; Self-supervised Learning; Reinforcement Learning in Vision; Other Recent Topics and Applications


Prescribed Books

The following are the suggested books for the course:

Ian Goodfellow, Yoshua Bengio, Aaron Courville, Deep Learning, 2016

Michael Nielsen, Neural Networks and Deep Learning, 2016

Yoshua Bengio, Learning Deep Architectures for AI, 2009

Richard Szeliski, Computer Vision: Algorithms and Applications, 2010

Simon Prince, Computer Vision: Models, Learning, and Inference, 2012

David Forsyth, Jean Ponce, Computer Vision: A Modern Approach, 2002

About the Instructors

Prof. Vineeth N B

Professor, Computer Science and Engineering, IIT Hyderabad

