<?xml version="1.0" encoding="UTF-8"?>
<rss  xmlns:atom="http://www.w3.org/2005/Atom" 
      xmlns:media="http://search.yahoo.com/mrss/" 
      xmlns:content="http://purl.org/rss/1.0/modules/content/" 
      xmlns:dc="http://purl.org/dc/elements/1.1/" 
      version="2.0">
<channel>
<title>Kader Mohideen</title>
<link>https://kader-xai.github.io/kader-library.html</link>
<atom:link href="https://kader-xai.github.io/kader-library.xml" rel="self" type="application/rss+xml"/>
<description>Illustrated, from-scratch explainers — one topic at a time, each built to actually make the idea click. A growing collection.</description>
<image>
<url>https://kader-xai.github.io/images/card.png</url>
<title>Kader Mohideen</title>
<link>https://kader-xai.github.io/kader-library.html</link>
</image>
<generator>quarto-1.9.37</generator>
<lastBuildDate>Fri, 05 Jun 2026 00:00:00 GMT</lastBuildDate>
<item>
  <title>What is an Embedding?</title>
  <dc:creator>Kader Mohideen</dc:creator>
  <link>https://kader-xai.github.io/explainers/2026-06-05-what-is-an-embedding/</link>
  <description><![CDATA[ 





<div class="explainer-body">
<p><img src="https://kader-xai.github.io/explainers/banners/what-is-an-embedding.png" class="xpl-fig img-fluid"></p>
<div class="xpl-lead">
<p>A computer can’t compare two words the way you do. It has no sense of “king and queen feel related.” So we give it one — we turn every word into a list of numbers, place it as a point in space, and arrange that space so things that mean similar things land near each other. That list of numbers is an embedding.</p>
</div>
<section id="the-short-version" class="level2">
<h2 class="anchored" data-anchor-id="the-short-version">The short version</h2>
<p>An <strong>embedding</strong> is a vector — a fixed-length list of numbers — that represents a piece of data (a word, a sentence, an image, a user) as a <em>point in space</em>. The whole trick is that the space is arranged so <strong>distance means similarity</strong>: close points are alike, far points are not.</p>
<div class="xpl-key">
<p><strong>Key idea:</strong> An embedding turns “what does this mean?” into “where does this sit?” — and once meaning is a location, similarity is just distance.</p>
</div>
</section>
<section id="why-we-cant-just-use-the-words" class="level2">
<h2 class="anchored" data-anchor-id="why-we-cant-just-use-the-words">Why we can’t just use the words</h2>
<p>Computers store the word <em>cat</em> as a number (an ID), and <em>dog</em> as another. But those IDs are arbitrary — ID 4821 isn’t “closer” to ID 4822 in any meaningful way. There’s no math you can do on raw word-IDs that respects meaning.</p>
<p>Embeddings fix this. Instead of one arbitrary ID, each word gets a few hundred numbers, <em>learned</em> so that the geometry carries meaning. Now <code>cat</code> and <code>dog</code> end up near each other, while <code>cat</code> and <code>bulldozer</code> end up far apart — and that’s something a machine can actually compute with.</p>
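<p>A minimal sketch of that contrast. The IDs and the three-number vectors below are made up for illustration (real embeddings are learned and have hundreds of numbers); the point is only that the gap between IDs says nothing, while the distance between vectors does.</p>

```python
import math

# Hypothetical vocabulary: IDs are arbitrary labels, not measurements.
vocab = {"cat": 4821, "bulldozer": 4822, "dog": 9310}

# Hypothetical 3-number embeddings, hand-picked for this example (not learned).
embedding_table = {
    4821: [0.9, 0.8, 0.1],  # cat
    9310: [0.8, 0.9, 0.2],  # dog
    4822: [0.1, 0.2, 0.9],  # bulldozer
}

def embed(word):
    """Look up a word's ID, then fetch the vector stored for that ID."""
    return embedding_table[vocab[word]]

def euclidean(a, b):
    """Straight-line distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# ID gaps are meaningless: cat sits right next to bulldozer, far from dog.
id_gap_cat_bulldozer = abs(vocab["cat"] - vocab["bulldozer"])
id_gap_cat_dog = abs(vocab["cat"] - vocab["dog"])

# Vector distances carry the meaning: cat is near dog, far from bulldozer.
dist_cat_dog = euclidean(embed("cat"), embed("dog"))
dist_cat_bulldozer = euclidean(embed("cat"), embed("bulldozer"))

print("ID gaps:", id_gap_cat_bulldozer, id_gap_cat_dog)
print("Distances:", round(dist_cat_dog, 3), round(dist_cat_bulldozer, 3))
```

<p>In a real model the lookup works the same way; only the table is learned instead of typed in.</p>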
</section>
<section id="distance-is-the-whole-point" class="level2">
<h2 class="anchored" data-anchor-id="distance-is-the-whole-point">Distance is the whole point</h2>
<p>Once words are points, you measure similarity with the <strong>dot product</strong> or <strong>cosine similarity</strong>, the same vector operations you already know. Cosine similarity looks only at the angle between two vectors, while the dot product also grows with their lengths. Try it: drag the two vectors and watch how “aligned” they are. Two embeddings pointing the same way = similar meaning.</p>
<div class="xpl-try">
<div class="ml-viz" data-viz="dot" data-params="{}">

</div>
</div>
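<p>For reference, both measures fit in a few lines. This is a plain-Python sketch with made-up two-number vectors, not how a production library would do it (those use optimized array code).</p>

```python
import math

def dot(a, b):
    """Multiply matching entries and add them up."""
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    """Dot product divided by both lengths, so only the angle matters."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

# Same direction, different lengths: cosine is (up to rounding) 1.
same_direction = cosine_similarity([1.0, 2.0], [2.0, 4.0])

# Perpendicular vectors: cosine is 0.
perpendicular = cosine_similarity([1.0, 0.0], [0.0, 3.0])

# Opposite directions: cosine is (up to rounding) -1.
opposite = cosine_similarity([1.0, 2.0], [-1.0, -2.0])

print(same_direction, perpendicular, opposite)
```

<p>Note that <code>dot([1.0, 2.0], [2.0, 4.0])</code> is 10 while the cosine stays at 1: the dot product rewards length, the cosine ignores it.</p>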
<p>The famous example: take the embedding for <em>king</em>, subtract <em>man</em>, add <em>woman</em>, and the nearest word vector to the result (leaving out the three input words) is typically <em>queen</em>. Meaning became arithmetic.</p>
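<p>Here is that arithmetic on toy vectors. The two axes (roughly “royalty” and “femaleness”) and every number are invented for this sketch; learned embeddings do not come with labeled axes, and the result there is approximate rather than exact.</p>

```python
import math

# Hypothetical 2-number embeddings: [royalty, femaleness]. Hand-picked, not learned.
vectors = {
    "king":   [0.90, 0.10],
    "queen":  [0.85, 0.95],
    "man":    [0.10, 0.10],
    "woman":  [0.10, 0.90],
    "prince": [0.80, 0.15],
    "apple":  [0.05, 0.50],
}

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def analogy(a, b, c):
    """Return the word nearest to vec(a) - vec(b) + vec(c), skipping a, b and c."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine_similarity(target, vectors[w]))

nearest = analogy("king", "man", "woman")
print(nearest)
```

<p>The input words are excluded from the search because the result often lies closest to one of them; the interesting answer is the nearest <em>other</em> word.</p>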
</section>
<section id="where-embeddings-come-from" class="level2">
<h2 class="anchored" data-anchor-id="where-embeddings-come-from">Where embeddings come from</h2>
<p>Nobody hand-writes these numbers. A model <strong>learns</strong> them by reading enormous amounts of text and nudging each word’s vector based on the company it keeps — words that appear in similar contexts get pulled together. Word2Vec and GloVe did this for single words; modern transformer models produce <em>contextual</em> embeddings, where the same word gets a different vector depending on the sentence around it.</p>
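<p>To see “the company it keeps” in action without training anything, here is a count-based stand-in. It is <em>not</em> Word2Vec or GloVe: it simply counts which words appear near each other in a tiny made-up corpus and uses those counts as vectors. The corpus, the window size, and the one-word stopword list are all assumptions for this sketch.</p>

```python
import math
from collections import Counter, defaultdict

# Tiny made-up corpus; a real model reads billions of words.
corpus = [
    "the cat chased the mouse",
    "the dog chased the ball",
    "the cat ate the food",
    "the dog ate the food",
    "the bulldozer moved the dirt",
    "the bulldozer crushed the rock",
]
STOPWORDS = {"the"}  # dropped because it appears next to everything
WINDOW = 2           # how many neighbors on each side count as context

def cooccurrence_vectors(sentences, window=WINDOW):
    """Map each word to a Counter of the words seen within `window` of it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = [t for t in sentence.split() if t not in STOPWORDS]
        for i, word in enumerate(tokens):
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)

vecs = cooccurrence_vectors(corpus)
cat_dog = cosine(vecs["cat"], vecs["dog"])
cat_bulldozer = cosine(vecs["cat"], vecs["bulldozer"])
print("cat vs dog:", cat_dog)
print("cat vs bulldozer:", cat_bulldozer)
```

<p><code>cat</code> and <code>dog</code> share the contexts <em>chased</em>, <em>ate</em>, and <em>food</em>, so their vectors line up; <code>bulldozer</code> shares none of them. Learned embeddings compress the same kind of signal into a few hundred dense numbers.</p>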
<div class="callout callout-style-default callout-tip callout-titled">
<div class="callout-header d-flex align-content-center">
<div class="callout-icon-container">
<i class="callout-icon"></i>
</div>
<div class="callout-title-container flex-fill">
Tip
</div>
</div>
<div class="callout-body-container callout-body">
<p>The dimension count (e.g.&nbsp;384, 768, 1536) is just how many numbers each point has. More dimensions = more room to separate fine distinctions, at the cost of compute and memory.</p>
</div>
</div>
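<p>The memory side of that trade-off is simple multiplication. The sketch below assumes each number is stored as a 4-byte float and uses a hypothetical collection of one million vectors; real systems often add index overhead or compress the vectors.</p>

```python
BYTES_PER_FLOAT32 = 4  # assumption: vectors stored as 32-bit floats

def index_size_bytes(num_vectors, dimensions, bytes_per_number=BYTES_PER_FLOAT32):
    """Raw storage for the vectors alone, ignoring any index overhead."""
    return num_vectors * dimensions * bytes_per_number

NUM_VECTORS = 1_000_000  # hypothetical collection size

for dims in (384, 768, 1536):
    size_gb = index_size_bytes(NUM_VECTORS, dims) / 1e9
    print(f"{dims} dimensions: {size_gb:.2f} GB")
```

<p>Doubling the dimension count doubles the storage, and the cost of every distance calculation grows with it too.</p>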
</section>
<section id="why-it-matters" class="level2">
<h2 class="anchored" data-anchor-id="why-it-matters">Why it matters</h2>
<p>Embeddings are the quiet engine under most of modern AI:</p>
<ul>
<li><strong>Search &amp; RAG</strong> — find documents whose embedding is closest to your question’s embedding.</li>
<li><strong>Recommenders</strong> — users and items as nearby points; recommend what’s close.</li>
<li><strong>Clustering &amp; dedup</strong> — group by proximity in embedding space.</li>
<li><strong>LLMs</strong> — the very first thing a transformer does is embed your tokens before any reasoning happens.</li>
</ul>
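<p>The search case from the list above can be sketched as a brute-force nearest-neighbor lookup. The document titles and all the vectors are invented for this example; in practice an embedding model would produce them, and a vector index would replace the full sort once the collection gets large.</p>

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical document embeddings (3 numbers each, hand-picked).
documents = {
    "How to train a puppy":        [0.9, 0.1, 0.0],
    "Kitten care basics":          [0.8, 0.3, 0.1],
    "Excavator maintenance guide": [0.0, 0.2, 0.9],
}

# Hypothetical embedding of a question such as "pet training tips".
query_embedding = [0.9, 0.05, 0.0]

def top_k(query, docs, k=2):
    """Rank every document by cosine similarity to the query; return the best k titles."""
    ranked = sorted(
        docs.items(),
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,
    )
    return [title for title, _ in ranked[:k]]

results = top_k(query_embedding, documents)
print(results)
```

<p>Recommenders and deduplication follow the same pattern: embed everything once, then answer questions by looking for nearby points.</p>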
<p>If you understand embeddings, you understand the layer where raw data becomes something a model can think about.</p>
</section>
<section id="going-deeper" class="level2">
<h2 class="anchored" data-anchor-id="going-deeper">Going deeper</h2>
<ul>
<li><a href="https://arxiv.org/abs/1301.3781">Word2Vec (Mikolov et al., 2013)</a></li>
<li><a href="https://jalammar.github.io/illustrated-word2vec/">The Illustrated Word2Vec — Jay Alammar</a></li>
</ul>
</section>
</div>



 ]]></description>
  <category>AI</category>
  <category>Machine Learning</category>
  <category>Foundations</category>
  <guid>https://kader-xai.github.io/explainers/2026-06-05-what-is-an-embedding/</guid>
  <pubDate>Fri, 05 Jun 2026 00:00:00 GMT</pubDate>
  <media:content url="https://kader-xai.github.io/explainers/banners/what-is-an-embedding.png" medium="image" type="image/png" height="76" width="144"/>
</item>
</channel>
</rss>
