⚡ Building Transformers from Scratch with PyTorch
Hand-write every component of a GPT-style decoder — tokenizer, attention, blocks, training, KV-cache generation — and train your own tiny GPT.
10-day mini-course — Hand-write every component of a GPT-style decoder — tokenizer, attention, blocks, training, KV-cache generation — and train your own tiny GPT.
Like every course on this site, each day is a deep, code-first lesson: an intuition-first explainer, a staged code walkthrough explaining the methodology line by line, visuals, and a 🧪 Your task exercise with a hidden solution. Theory lives in the AI & ML Encyclopedia; here you build.
▶ Start Day 1 📚 All mini-courses