Interested in joining our January 2025 batch? Applications open on September 30, 2024.

Degree Level Course

Deep Learning for Computer Vision

- Knowledge of the basics of image processing and computer vision
- Knowledge of the building blocks of deep learning, including feedforward networks, convolutional neural networks, recurrent neural networks, and transformers
- Knowledge of generative AI models in computer vision
- Knowledge of recent trends, including explainability, zero-shot learning, few-shot learning, self-supervised learning, etc.
- Hands-on experience implementing basic image processing tasks
- Hands-on experience implementing deep learning models for computer vision tasks
- Hands-on experience implementing advanced computer vision tasks such as explainability, self-supervised learning, etc.

by Prof. Vineeth N B

Course ID: BSCS5003

Course Credits: 4

Course Type: Elective

Pre-requisites: None

Course structure & Assessments

For details of the standard course structure and assessments, visit the Academics page.

WEEK 1 Introduction and Overview: Course Overview and Motivation; Introduction to Image Formation, Capture and Representation; Linear Filtering, Correlation, Convolution
WEEK 2 Visual Features and Representations: Edge, Blob, and Corner Detection; Scale Space and Scale Selection; SIFT, SURF; HoG, LBP, etc.
WEEK 3 Visual Matching: Bag-of-words, VLAD; RANSAC, Hough transform; Pyramid Matching; Optical Flow
WEEK 4 Deep Learning Review: Review of Deep Learning, Multi-layer Perceptrons, Backpropagation
WEEK 5 Convolutional Neural Networks (CNNs): Introduction to CNNs; Evolution of CNN Architectures: AlexNet, ZFNet, VGG, InceptionNets, ResNets, DenseNets
WEEK 6 Visualization and Understanding CNNs: Visualization of Kernels; Backprop-to-image/Deconvolution Methods; Deep Dream, Hallucination, Neural Style Transfer; CAM, Grad-CAM, Grad-CAM++; Recent Methods (IG, Segment-IG, SmoothGrad)
WEEK 7 CNNs for Recognition, Verification, Detection, Segmentation: CNNs for Recognition and Verification (Siamese Networks, Triplet Loss, Contrastive Loss, Ranking Loss); CNNs for Detection: Background of Object Detection, R-CNN, Fast R-CNN, Faster R-CNN, YOLO, SSD, RetinaNet; CNNs for Segmentation: FCN, SegNet, U-Net, Mask R-CNN
WEEK 8 Recurrent Neural Networks (RNNs): Review of RNNs; CNN + RNN Models for Video Understanding: Spatio-temporal Models, Action/Activity Recognition
WEEK 9 Attention Models: Introduction to Attention Models in Vision; Vision and Language: Image Captioning, Visual QA, Visual Dialog; Spatial Transformers; Transformer Networks
WEEK 10 Deep Generative Models: Review of (Popular) Deep Generative Models: GANs, VAEs; Other Generative Models: PixelRNNs, NADE, Normalizing Flows, etc.
WEEK 11 Variants and Applications of Generative Models in Vision: Applications: Image Editing, Inpainting, Super-resolution, 3D Object Generation, Security; Variants: CycleGANs, Progressive GANs, StackGANs, Pix2Pix, etc.
WEEK 12 Recent Trends: Zero-shot, One-shot, Few-shot Learning; Self-supervised Learning; Reinforcement Learning in Vision; Other Recent Topics and Applications

Prescribed Books

The following are the suggested books for the course:

Ian Goodfellow, Yoshua Bengio, Aaron Courville, Deep Learning, 2016

Michael Nielsen, Neural Networks and Deep Learning, 2016

Yoshua Bengio, Learning Deep Architectures for AI, 2009

Richard Szeliski, Computer Vision: Algorithms and Applications, 2010

Simon Prince, Computer Vision: Models, Learning, and Inference, 2012

David Forsyth, Jean Ponce, Computer Vision: A Modern Approach, 2002

About the Instructors

Prof. Vineeth N B
Professor, Computer Science and Engineering, IIT Hyderabad