Video Description
In this video, we take a look at DINO, or self-DIstillation with NO labels. What is it? Why do we need it? What does it look like?
ABOUT ME
⭕ Subscribe: https://www.youtube.com/c/CodeEmporium?sub_confirmation=1
📚 Medium Blog: https://medium.com/@dataemporium
💻 Github: https://github.com/ajhalthor
👔 LinkedIn: https://www.linkedin.com/in/ajay-halthor-477974bb/
RESOURCES
[1 📚] Main Paper: https://arxiv.org/pdf/2104.14294
[2 📚] Slides: https://link.excalidraw.com/p/readonly/ccVu9FUIwD5miDWgdK3s
[3 📚] Vision Transformers paper: https://arxiv.org/pdf/2010.11929
[4 📚] BERT paper: https://arxiv.org/pdf/1810.04805
PLAYLISTS FROM MY CHANNEL
⭕ Reinforcement Learning: https://youtube.com/playlist?list=PLTl9hO2Oobd9kS--NgVz0EPNyEmygV1Ha&si=AuThDZJwG19cgTA8
⭕ Natural Language Processing: https://youtube.com/playlist?list=PLTl9hO2Oobd_bzXUpzKMKA3liq2kj6LfE&si=LsVy8RDPu8jeO-cc
⭕ Transformers from Scratch: https://youtube.com/playlist?list=PLTl9hO2Oobd_bzXUpzKMKA3liq2kj6LfE
⭕ ChatGPT Playlist: https://youtube.com/playlist?list=PLTl9hO2Oobd9coYT6XsTraTBo4pL1j4HJ
⭕ Convolutional Neural Networks: https://youtube.com/playlist?list=PLTl9hO2Oobd9U0XHz62Lw6EgIMkQpfz74
⭕ The Math You Should Know: https://youtube.com/playlist?list=PLTl9hO2Oobd-_5sGLnbgE8Poer1Xjzz4h
⭕ Probability Theory for Machine Learning: https://youtube.com/playlist?list=PLTl9hO2Oobd9bPcq0fj91Jgk_-h1H_W3V
⭕ Coding Machine Learning: https://youtube.com/playlist?list=PLTl9hO2Oobd82vcsOnvCNzxrZOlrz3RiD
MATH COURSES (7 day free trial)
📕 Mathematics for Machine Learning: https://imp.i384100.net/MathML
📕 Calculus: https://imp.i384100.net/Calculus
📕 Statistics for Data Science: https://imp.i384100.net/AdvancedStatistics
📕 Bayesian Statistics: https://imp.i384100.net/BayesianStatistics
📕 Linear Algebra: https://imp.i384100.net/LinearAlgebra
📕 Probability: https://imp.i384100.net/Probability
OTHER RELATED COURSES (7 day free trial)
📕 ⭐ Deep Learning Specialization: https://imp.i384100.net/Deep-Learning
📕 Python for Everybody: https://imp.i384100.net/python
📕 MLOps Course: https://imp.i384100.net/MLOps
📕 Natural Language Processing (NLP): https://imp.i384100.net/NLP
📕 Machine Learning in Production: https://imp.i384100.net/MLProduction
📕 Data Science Specialization: https://imp.i384100.net/DataScience
📕 Tensorflow: https://imp.i384100.net/Tensorflow
CHAPTERS
00:00 What is DINO?
00:24 Historical context: Vision Transformers Recap
02:40 Self-supervised learning
04:51 Student-teacher architecture, as in knowledge distillation
05:12 Training DINO: forward pass
09:43 Why is the number of output neurons so large?
10:27 Temperature softmax in the teacher and student
11:43 Mode collapse and the reason for centering teacher activations
13:10 How the student and teacher update weights
15:22 Inference
17:50 Interesting Findings
18:30 Visualizing segmentation masks that emerge in ViT
20:59 Understanding rich image embeddings of ViT
22:06 Quiz Time
22:57 Summary
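The chapters on temperature softmax, centering, and mode collapse describe the core of DINO's training objective. As a minimal NumPy sketch (all names, shapes, and default temperatures here are illustrative assumptions, not the paper's reference code), the idea looks roughly like this:

```python
import numpy as np

def tempered_softmax(logits, temperature):
    # Lower temperature -> sharper (more peaked) distribution.
    z = (logits - logits.max()) / temperature
    e = np.exp(z)
    return e / e.sum()

def dino_loss(student_logits, teacher_logits, center,
              t_student=0.1, t_teacher=0.04):
    # Teacher outputs are centered (to help avoid mode collapse) and
    # sharpened with a low temperature; the student is trained to match
    # that distribution via cross-entropy. Temperatures are illustrative.
    teacher_probs = tempered_softmax(teacher_logits - center, t_teacher)
    student_log_probs = np.log(tempered_softmax(student_logits, t_student))
    return -(teacher_probs * student_log_probs).sum()

def update_center(center, teacher_logits, momentum=0.9):
    # Exponential moving average of teacher outputs, used for centering.
    return momentum * center + (1 - momentum) * teacher_logits
```

In the full method, the teacher's weights are not trained by backpropagation at all: they are an exponential moving average of the student's weights, which is the asymmetry the "How the student and teacher update weights" chapter covers.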