Advanced AI: LLMs Explained with Math (Transformers, Attention Mechanisms & More)
Introduction
Advanced AI: LLMs Explained with Math (3:00)
Exercise: Meet Your Classmates and Instructor
Introduction to Tokenization and Encodings
Creating Our Optional Experiment Notebook - Part 1 (3:21)
Creating Our Optional Experiment Notebook - Part 2 (4:01)
Encoding Categorical Labels to Numeric Values (13:24)
Understanding the Tokenization Vocabulary (15:05)
Encoding Tokens (10:56)
Practical Example of Tokenization and Encoding (12:48)
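For readers skimming this syllabus, here is a minimal sketch of what the tokenization and encoding lessons cover, assuming the Hugging Face transformers and scikit-learn libraries; the model name, example text, and labels are illustrative, not taken from the course:

```python
# Sketch: tokenization/encoding with Hugging Face + scikit-learn.
# Assumes `pip install transformers scikit-learn` and internet access
# to download the (illustrative) DistilBERT checkpoint.
from transformers import AutoTokenizer
from sklearn.preprocessing import LabelEncoder

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

# The tokenization vocabulary maps subword strings to integer IDs.
print(tokenizer.vocab_size)            # ~30k entries for DistilBERT

# Encoding tokens: text -> subword tokens -> integer IDs.
text = "The river bank was flooded."
tokens = tokenizer.tokenize(text)      # e.g. ['the', 'river', 'bank', ...]
ids = tokenizer.encode(text)           # adds special tokens like [CLS]/[SEP]
print(tokens, ids)

# Encoding categorical labels to numeric values.
labels = ["positive", "negative", "positive"]
encoder = LabelEncoder()
y = encoder.fit_transform(labels)      # e.g. array([1, 0, 1])
print(y, encoder.classes_)
```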
Embeddings and Positional Encodings
DistilBERT vs. BERT Differences (4:46)
Embeddings In A Continuous Vector Space (7:40)
Introduction To Positional Encodings (5:13)
Positional Encodings - Part 1 (4:14)
Positional Encodings - Part 2 (Even and Odd Indices) (10:10)
Why Use Sine and Cosine Functions (5:08)
Understanding the Nature of Sine and Cosine Functions (9:52)
Visualizing Positional Encodings in Sine and Cosine Graphs (9:24)
Solving the Equations to Get the Values for Positional Encodings (18:07)
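A minimal NumPy sketch of the sinusoidal positional encodings this section solves for, using the standard formulas PE(pos, 2i) = sin(pos / 10000^(2i/d_model)) for even indices and PE(pos, 2i+1) = cos(pos / 10000^(2i/d_model)) for odd indices; the sequence length and model dimension below are illustrative:

```python
# Sketch: sinusoidal positional encodings (sine on even indices, cosine on odd).
import numpy as np

def positional_encoding(seq_len: int, d_model: int) -> np.ndarray:
    positions = np.arange(seq_len)[:, None]           # (seq_len, 1)
    i = np.arange(d_model // 2)[None, :]              # (1, d_model/2)
    angle_rates = 1.0 / np.power(10000, (2 * i) / d_model)
    angles = positions * angle_rates                  # (seq_len, d_model/2)

    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)                      # even indices: sine
    pe[:, 1::2] = np.cos(angles)                      # odd indices: cosine
    return pe

pe = positional_encoding(seq_len=10, d_model=8)
print(pe.shape)        # (10, 8)
print(pe[3])           # the encoding added to the embedding at position 3
```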
Attention Mechanism, Multi-Head Attention, Masked Language Modeling and More
Introduction to Attention Mechanism (3:02)
Query, Key and Value Matrix (18:10)
Getting Started with Our Step-by-Step Attention Calculation (6:53)
Calculating Key Vectors (20:05)
Query Matrix Introduction (10:20)
Calculating Raw Attention Scores (21:24)
Understanding the Mathematics Behind Dot Products and Vector Alignment (13:32)
Visualizing Raw Attention Scores in 2D (5:42)
Converting Raw Attention Scores to Probability Distributions with Softmax (9:16)
Normalization (3:19)
Understanding the Value Matrix and Value Vector (9:07)
Calculating the Final Context-Aware Rich Representation for the Word "River" (10:45)
Understanding the Output (1:58)
Understanding Multi-Head Attention (11:55)
Multi-Head Attention Example and Subsequent Layers (9:51)
Masked Language Modeling (2:29)
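A minimal NumPy sketch of the step-by-step attention calculation these lessons walk through: project the embeddings into query, key, and value vectors, take dot products for raw attention scores, scale and softmax them into a probability distribution, then use that distribution to weight the value vectors. The matrix sizes and random weights are illustrative stand-ins, not the course's actual numbers:

```python
# Sketch: single-head scaled dot-product attention, step by step.
import numpy as np

rng = np.random.default_rng(0)
seq_len, d_model, d_k = 4, 8, 8              # toy sizes, illustrative only

X = rng.normal(size=(seq_len, d_model))      # token embeddings (+ positional encodings)

# Learned projection matrices (random stand-ins here).
W_q = rng.normal(size=(d_model, d_k))
W_k = rng.normal(size=(d_model, d_k))
W_v = rng.normal(size=(d_model, d_k))

Q = X @ W_q                                  # query vectors
K = X @ W_k                                  # key vectors
V = X @ W_v                                  # value vectors

# Raw attention scores: dot products measure query/key alignment.
scores = Q @ K.T                             # (seq_len, seq_len)

# Scale by sqrt(d_k), then softmax each row into a probability distribution.
scaled = scores / np.sqrt(d_k)
weights = np.exp(scaled - scaled.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)   # each row sums to 1

# Context-aware representation: attention-weighted sum of value vectors.
output = weights @ V                         # (seq_len, d_k)
print(weights.round(2))
print(output.shape)
```

Multi-head attention repeats this computation several times in parallel with separate learned projections (typically of size d_model / num_heads each) and concatenates the head outputs before a final linear layer.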
Where To Go From Here?