| Program start date | Application deadline |
| --- | --- |
| 2026-02-09 | - |
| 2027-02-09 | - |
Program Overview
The Deep Learning for Computer Vision course is offered by the Stanford School of Engineering. This course explores how deep learning is driving modern computer vision systems, enabling machines to interpret the visual world.
Course Details
- Format: 100% Online, on-demand, live
- Time to Complete: 10-15 hours/week
- Tuition: $1,950.00
- Schedule: February 9 - April 19, 2026
- Units: 10 CEU-equivalent
- Course Access: Course materials are available for 90 days after the course ends
- Credentials: Certificate of Achievement
Course Material
The course covers the evolution of computer vision models, from early recurrent neural networks to cutting-edge diffusion methods, alongside their real-world applications. Students will implement essential components like CNNs, language models, CLIP, and diffusion models, gaining hands-on experience in training, fine-tuning, and evaluating these models.
Competency Areas
- Deep Learning Fundamentals (Neural networks and model training)
- Convolutional Neural Networks
- Attention and Transformers
- Image Classification
- Object Detection and Image Segmentation
- Generative Models (GANs, VAEs, and Diffusion)
- Robot Learning and Deep Reinforcement Learning
What You Need to Get Started
Prior to enrolling in the course, students must demonstrate:
- Proficiency in Python: Coding assignments will be in Python. Some assignments will require familiarity with basic Linux command line workflows.
- College Calculus and Linear Algebra: Students should be comfortable taking (multivariable) derivatives and understand matrix/vector notation and operations.
- Probability Theory: Students should be familiar with basic probability distributions (Continuous, Gaussian, Bernoulli, etc.) and be able to define the following concepts for both continuous and discrete random variables: expectation, independence, probability distribution functions, and cumulative distribution functions.
- Note: Familiarity with PyTorch, along with prior knowledge of basic machine learning concepts, neural networks, optimization, and backpropagation, is recommended.
Teaching Team
- Ehsan Adeli: Assistant Professor, Psychiatry and Computer Science
- Fei-Fei Li: Sequoia Capital Professor, Computer Science
Program Structure
The course is part of the Artificial Intelligence Professional Program. Students can enroll individually or as part of a group. The program offers flexible enrollment options, including special pricing for groups of five or more.
Additional Information
- The course is designed to give students a well-rounded grasp of the field, covering a range of visual tasks: classification, detection, segmentation, captioning, and image synthesis.
- Students will engage with a variety of visual tasks, allowing them to develop practical proficiency in deep learning frameworks for building, training, and fine-tuning large-scale vision models.
