<?xml version="1.0" encoding="UTF-8"?>
<rss  xmlns:atom="http://www.w3.org/2005/Atom" 
      xmlns:media="http://search.yahoo.com/mrss/" 
      xmlns:content="http://purl.org/rss/1.0/modules/content/" 
      xmlns:dc="http://purl.org/dc/elements/1.1/" 
      version="2.0">
<channel>
<title>Kader Mohideen</title>
<link>https://kader-xai.github.io/blog.html</link>
<atom:link href="https://kader-xai.github.io/blog.xml" rel="self" type="application/rss+xml"/>
<description>Posts on cybersecurity, AI in security, and emerging defensive tech.</description>
<image>
<url>https://kader-xai.github.io/images/card.png</url>
<title>Kader Mohideen</title>
<link>https://kader-xai.github.io/blog.html</link>
</image>
<generator>quarto-1.9.37</generator>
<lastBuildDate>Thu, 07 May 2026 00:00:00 GMT</lastBuildDate>
<item>
  <title>Data Science Roadmap — From print('hello') to Production LLMs</title>
  <dc:creator>Kader Mohideen</dc:creator>
  <link>https://kader-xai.github.io/blog/2026-05-07-data-science-roadmap/</link>
  <description><![CDATA[ 





<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><a href="../banners/data-science-roadmap.jpg" class="lightbox" data-gallery="quarto-lightbox-gallery-1"><img src="https://kader-xai.github.io/blog/banners/data-science-roadmap.jpg" class="img-fluid quarto-figure quarto-figure-center figure-img" style="width:100.0%"></a></p>
</figure>
</div>
<section id="from-printhello-to-production-llms" class="level1">
<h1>From <code>print('hello')</code> to Production LLMs</h1>
<p><em>A 31-module open-source data-science course you can finish in a weekend or stretch over a month.</em></p>
<hr>
<section id="why-i-built-this" class="level2">
<h2 class="anchored" data-anchor-id="why-i-built-this">Why I built this</h2>
<p>Most “learn data science” courses do one of two things badly:</p>
<ul>
<li>They lock the good stuff behind a subscription and split it across three separate courses that never talk to each other.</li>
<li>Or they jump from <code>print("hello world")</code> straight to a Kaggle notebook, with no in-between.</li>
</ul>
<p>This repo is the in-between. <strong>31 deep-dive notebooks</strong>, every one runnable in Google Colab in one click, every one paired with a colour-coded explanation document. From your first variable to reading the inference code of a 671-billion-parameter LLM.</p>
</section>
<section id="who-its-for" class="level2">
<h2 class="anchored" data-anchor-id="who-its-for">Who it’s for</h2>
<ul>
<li><strong>Beginners</strong> who want a structured path that doesn’t skip steps.</li>
<li><strong>Self-taught coders</strong> who can write Python but want to fill the gaps in Pandas, NumPy, scikit-learn — and beyond.</li>
<li><strong>Career-switchers</strong> building a portfolio. The capstone notebook (Module 16) and the production modules (Modules 29–31) are portfolio-ready as-is.</li>
</ul>
</section>
<section id="whats-inside-six-parts" class="level2">
<h2 class="anchored" data-anchor-id="whats-inside-six-parts">What’s inside — six parts</h2>
<section id="part-1-python-for-data-science-modules-15" class="level3">
<h3 class="anchored" data-anchor-id="part-1-python-for-data-science-modules-15">Part 1 · Python for Data Science (Modules 1–5)</h3>
<p>Variables, data structures, OOP, file I/O, NumPy, Pandas, APIs, web scraping. The alphabet of every later module.</p>
</section>
<section id="part-2-data-visualization-modules-610" class="level3">
<h3 class="anchored" data-anchor-id="part-2-data-visualization-modules-610">Part 2 · Data Visualization (Modules 6–10)</h3>
<p>Matplotlib’s object-oriented API; the seven core chart types; specialised tools (waffle, word cloud, Folium maps); animation and Plotly; building dashboards that tell one cohesive story.</p>
</section>
<section id="part-3-data-analysis-ml-foundations-modules-1116" class="level3">
<h3 class="anchored" data-anchor-id="part-3-data-analysis-ml-foundations-modules-1116">Part 3 · Data Analysis &amp; ML Foundations (Modules 11–16)</h3>
<p>The universal workflow: import → wrangle → explore → model → evaluate → communicate. Built around a shared dataset (auto-mpg) so each step builds on the last, then validated end-to-end on California Housing.</p>
</section>
<section id="part-4-machine-learning-ai-modules-1722" class="level3">
<h3 class="anchored" data-anchor-id="part-4-machine-learning-ai-modules-1722">Part 4 · Machine Learning &amp; AI (Modules 17–22)</h3>
<p>PyTorch fundamentals; the six core model archetypes (Linear, Logistic, K-Means, MLP, CNN, Transformer LM); self-attention from scratch with <code>d_model = 2</code> so every matrix is hand-checkable; multi-head + causal attention; diffusion models on a 2D toy; time-series forecasting with ARIMA, Prophet, and LSTM.</p>
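<p>The <code>d_model = 2</code> trick is worth seeing: a minimal single-head self-attention in NumPy (random illustrative weights, not the course's) is small enough to verify every matrix by hand.</p>
```python
import numpy as np

# Toy single-head self-attention with d_model = 2: every matrix fits on paper.
np.random.seed(0)
X = np.array([[1.0, 0.0],          # token 1
              [0.0, 1.0],          # token 2
              [1.0, 1.0]])         # token 3 -> shape (seq_len=3, d_model=2)

W_q = np.random.randn(2, 2)        # illustrative random projections
W_k = np.random.randn(2, 2)
W_v = np.random.randn(2, 2)

Q, K, V = X @ W_q, X @ W_k, X @ W_v
scores = Q @ K.T / np.sqrt(2)      # scaled dot-product, shape (3, 3)
weights = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)  # row softmax
out = weights @ V                  # (3, 2) contextualised tokens

assert np.allclose(weights.sum(axis=1), 1.0)  # each row is a distribution
```
<p>Each row of <code>weights</code> sums to 1, so every output token is a convex mix of the value vectors.</p>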
</section>
<section id="part-5-ai-research-foundations-modules-2325" class="level3">
<h3 class="anchored" data-anchor-id="part-5-ai-research-foundations-modules-2325">Part 5 · AI-Research Foundations (Modules 23–25)</h3>
<p>The math under every neural network (functions, derivatives, gradients, matrices, probability) plus a deep PyTorch primer. A guided tour of <strong>DeepSeek-V3’s actual inference code</strong> (RMSNorm, RoPE, Multi-Latent Attention, Mixture-of-Experts). Fine-tuning examples — full fine-tuning, <strong>LoRA</strong>, <strong>QLoRA</strong>, and SFT with TRL.</p>
</section>
<section id="part-6-practitioner-skills-modules-2631" class="level3">
<h3 class="anchored" data-anchor-id="part-6-practitioner-skills-modules-2631">Part 6 · Practitioner Skills (Modules 26–31)</h3>
<p>The day-to-day skills a working data scientist or ML engineer uses but most courses skip:</p>
<ul>
<li><strong>SQL</strong> — JOINs, CTEs, window functions, the SQL ↔︎ Pandas bridge</li>
<li><strong>Tree-based models</strong> — Random Forest, <strong>XGBoost</strong>, <strong>LightGBM</strong>, <strong>SHAP</strong> for interpretation</li>
<li><strong>A/B testing</strong> — proportion z-test, sample-size calc, Bonferroni / BH correction, the peeking trap</li>
<li><strong>MLOps</strong> — FastAPI, Docker, <strong>MLflow</strong>, drift monitoring with KS + PSI</li>
<li><strong>RAG &amp; vector search</strong> — embeddings, Chroma, hybrid BM25 + vector, reranker, grounded answers</li>
<li><strong>Prompt engineering &amp; LLM eval</strong> — few-shot, chain-of-thought, ReAct, structured outputs, LLM-as-judge</li>
</ul>
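<p>As a taste of the A/B-testing module: a two-proportion z-test needs nothing beyond the standard library. A sketch with made-up conversion counts:</p>
```python
import math

def two_proportion_ztest(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for the difference between two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)          # pooled rate under H0
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # two-sided p-value from the standard normal CDF (via erf)
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

# Hypothetical experiment: 120/2400 control vs 156/2400 variant conversions.
z, p = two_proportion_ztest(120, 2400, 156, 2400)
```
<p>Run it once per experiment, decided in advance — checking it daily until it dips under 0.05 is exactly the peeking trap Module 28 covers.</p>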
</section>
</section>
<section id="what-makes-it-different" class="level2">
<h2 class="anchored" data-anchor-id="what-makes-it-different">What makes it different</h2>
<table class="caption-top table">
<colgroup>
<col style="width: 33%">
<col style="width: 33%">
<col style="width: 33%">
</colgroup>
<thead>
<tr class="header">
<th></th>
<th>This course</th>
<th>Typical course</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td>Production architecture depth</td>
<td>DeepSeek-V3 dissection</td>
<td>“Transformers exist”</td>
</tr>
<tr class="even">
<td>Math integrated with code</td>
<td>Yes (Module 23)</td>
<td>Usually skipped</td>
</tr>
<tr class="odd">
<td>Practical skills (SQL, A/B, MLOps)</td>
<td>Modules 26-29</td>
<td>Rarely covered</td>
</tr>
<tr class="even">
<td>Companion docs</td>
<td>Line-by-line, colour-coded, ~30 pages each</td>
<td>None</td>
</tr>
<tr class="odd">
<td>Cost</td>
<td>Free, MIT-licensed</td>
<td>$40-300/month</td>
</tr>
</tbody>
</table>
</section>
<section id="how-to-use-it" class="level2">
<h2 class="anchored" data-anchor-id="how-to-use-it">How to use it</h2>
<p><strong>Option A — Colab.</strong> Click any badge in the README, hit <code>Save a copy in Drive</code>, run the cells. Zero install.</p>
<p><strong>Option B — Locally.</strong></p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode bash code-with-copy"><code class="sourceCode bash"><span id="cb1-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">git</span> clone https://github.com/kader-xai/data-science-roadmap.git</span>
<span id="cb1-2"><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">cd</span> data-science-roadmap</span>
<span id="cb1-3"><span class="ex" style="color: null;
background-color: null;
font-style: inherit;">pip</span> install jupyter numpy pandas scikit-learn torch transformers</span>
<span id="cb1-4"><span class="ex" style="color: null;
background-color: null;
font-style: inherit;">jupyter</span> notebook</span></code></pre></div></div>
</section>
<section id="what-you-walk-away-with" class="level2">
<h2 class="anchored" data-anchor-id="what-you-walk-away-with">What you walk away with</h2>
<p>After the 31 modules you can:</p>
<ul>
<li>Write fluent Python and load data from the most common sources (files, APIs, databases, the web).</li>
<li>Build classical ML models (regression, gradient boosting) AND modern AI models (transformers, diffusion).</li>
<li>Read production LLM source code (DeepSeek, Llama, Mistral, Qwen).</li>
<li>Fine-tune any open-weight model on your own data with LoRA.</li>
<li>Ship a model behind FastAPI + Docker with MLflow tracking.</li>
<li>Build a working RAG pipeline with vector search.</li>
<li>A/B-test prompts and evaluate LLMs scientifically.</li>
</ul>
<p>That’s effectively the 2026 ML-engineer skill set, built up from <code>print('hello')</code>.</p>
<hr>
<p><strong>Repo:</strong> <a href="https://github.com/kader-xai/data-science-roadmap">github.com/kader-xai/data-science-roadmap</a> <strong>Live site:</strong> <a href="https://kader-xai.github.io/data-science-roadmap/">kader-xai.github.io/data-science-roadmap</a> <strong>License:</strong> MIT</p>
<hr>
</section>
<section id="module-index" class="level2">
<h2 class="anchored" data-anchor-id="module-index">Module index</h2>
<p>For anyone scanning to find a specific topic — here’s the full module list with a one-liner each.</p>
<section id="part-1-python-for-data-science" class="level3">
<h3 class="anchored" data-anchor-id="part-1-python-for-data-science">Part 1 · Python for Data Science</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 33%">
<col style="width: 33%">
<col style="width: 33%">
</colgroup>
<thead>
<tr class="header">
<th>#</th>
<th>Module</th>
<th>Topic</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td>01</td>
<td>Python Basics</td>
<td>variables, types, strings, format strings, debugging</td>
</tr>
<tr class="even">
<td>02</td>
<td>Data Structures</td>
<td>lists, tuples, dicts, sets, comprehensions</td>
</tr>
<tr class="odd">
<td>03</td>
<td>Programming Fundamentals</td>
<td>conditionals, loops, functions, exceptions, OOP</td>
</tr>
<tr class="even">
<td>04</td>
<td>Working with Data</td>
<td>files, CSV/JSON, NumPy arrays, Pandas DataFrames</td>
</tr>
<tr class="odd">
<td>05</td>
<td>APIs &amp; Web Scraping</td>
<td><code>requests</code>, BeautifulSoup, <code>pd.read_html</code>, <code>yfinance</code></td>
</tr>
</tbody>
</table>
</section>
<section id="part-2-data-visualization" class="level3">
<h3 class="anchored" data-anchor-id="part-2-data-visualization">Part 2 · Data Visualization</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 33%">
<col style="width: 33%">
<col style="width: 33%">
</colgroup>
<thead>
<tr class="header">
<th>#</th>
<th>Module</th>
<th>Topic</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td>06</td>
<td>Intro to Visualization</td>
<td>Matplotlib OO API, line plots, styling</td>
</tr>
<tr class="even">
<td>07</td>
<td>Basic Charts</td>
<td>bar, hist, pie, box, scatter, bubble, area</td>
</tr>
<tr class="odd">
<td>08</td>
<td>Specialized Viz</td>
<td>waffle, word cloud, regression plot, Folium</td>
</tr>
<tr class="even">
<td>09</td>
<td>Advanced Viz</td>
<td>subplots, time-series patterns, animation, Plotly</td>
</tr>
<tr class="odd">
<td>10</td>
<td>Dashboards &amp; Storytelling</td>
<td>composing charts to answer one question</td>
</tr>
</tbody>
</table>
</section>
<section id="part-3-data-analysis-ml-foundations" class="level3">
<h3 class="anchored" data-anchor-id="part-3-data-analysis-ml-foundations">Part 3 · Data Analysis &amp; ML Foundations</h3>
<table class="caption-top table">
<colgroup>
<col style="width: 33%">
<col style="width: 33%">
<col style="width: 33%">
</colgroup>
<thead>
<tr class="header">
<th>#</th>
<th>Module</th>
<th>Topic</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td>11</td>
<td>Importing Data</td>
<td>CSV, Excel, JSON, SQL, web; the 5-line inspection ritual</td>
</tr>
<tr class="even">
<td>12</td>
<td>Data Wrangling</td>
<td>missing values, scaling, binning, encoding, outliers</td>
</tr>
<tr class="odd">
<td>13</td>
<td>Exploratory Data Analysis</td>
<td>distributions, correlations, group-bys, pivot tables</td>
</tr>
<tr class="even">
<td>14</td>
<td>Model Development</td>
<td>linear / multiple / polynomial regression with Pipelines</td>
</tr>
<tr class="odd">
<td>15</td>
<td>Model Evaluation</td>
<td>MSE/RMSE/MAE/R², CV, Ridge &amp; Lasso, GridSearch</td>
</tr>
<tr class="even">
<td>16</td>
<td>Capstone</td>
<td>California Housing end-to-end with Random Forest</td>
</tr>
</tbody>
</table>
</section>
<section id="part-4-machine-learning-ai-deeper-dive" class="level3">
<h3 class="anchored" data-anchor-id="part-4-machine-learning-ai-deeper-dive">Part 4 · Machine Learning &amp; AI (deeper dive)</h3>
<p>PyTorch fundamentals; the six core archetypes (Linear, Logistic, K-Means, MLP, CNN, Transformer LM); self-attention from scratch; multi-head + causal attention; diffusion models on a 2D toy; time-series with ARIMA, Prophet, and LSTM.</p>
</section>
<section id="part-5-ai-research-foundations" class="level3">
<h3 class="anchored" data-anchor-id="part-5-ai-research-foundations">Part 5 · AI-Research Foundations</h3>
<p>Math foundations integrated with code, a deep PyTorch primer, a guided dissection of <strong>DeepSeek-V3’s actual inference code</strong> (RMSNorm, RoPE, Multi-Latent Attention, Mixture-of-Experts), and worked fine-tuning examples — full fine-tuning, LoRA, QLoRA, and SFT with TRL.</p>
</section>
<section id="part-6-practitioner-skills" class="level3">
<h3 class="anchored" data-anchor-id="part-6-practitioner-skills">Part 6 · Practitioner Skills</h3>
<p>The day-to-day skills most courses skip — SQL · tree-based models with SHAP · A/B testing · MLOps with FastAPI/Docker/MLflow · RAG with vector search · prompt engineering and LLM eval.</p>
</section>
</section>
<section id="try-it" class="level2">
<h2 class="anchored" data-anchor-id="try-it">Try it</h2>
<p>The fastest path in is:</p>
<ol type="1">
<li>Open <a href="https://colab.research.google.com/github/kader-xai/data-science-roadmap/blob/main/module_01_python_basics.ipynb">Module 1 in Colab</a></li>
<li>Click <strong>File → Save a copy in Drive</strong></li>
<li>Run cells with <code>Shift+Enter</code></li>
</ol>
<p>If you only want the <em>retrieval</em> part of the AI track without training a model, jump to Modules 30–31 — the RAG and prompt-engineering notebooks stand on their own.</p>
</section>
<section id="related" class="level2">
<h2 class="anchored" data-anchor-id="related">Related</h2>
<ul>
<li>🔗 Project page on this site: <a href="../../projects/2025-data-science-roadmap/">Data Science Roadmap project</a></li>
<li>📦 Related project: <a href="../../projects/2025-employee-recall/">Employee Recall — LoRA + RAG</a> — a worked example of the LoRA + RAG techniques covered in Modules 25 and 30</li>
</ul>


</section>
</section>

 ]]></description>
  <category>Data Science</category>
  <category>Machine Learning</category>
  <category>Education</category>
  <category>Python</category>
  <category>LLM</category>
  <guid>https://kader-xai.github.io/blog/2026-05-07-data-science-roadmap/</guid>
  <pubDate>Thu, 07 May 2026 00:00:00 GMT</pubDate>
  <media:content url="https://kader-xai.github.io/blog/banners/data-science-roadmap.jpg" medium="image" type="image/jpeg"/>
</item>
<item>
  <title>Employee Recall — Capturing a Departing Employee’s Writing Style and Memory in an AI Successor</title>
  <dc:creator>Kader Mohideen</dc:creator>
  <link>https://kader-xai.github.io/blog/2026-05-01-employee-recall-lora-rag/</link>
  <description><![CDATA[ 





<p>When a senior employee leaves a company, two things go with them:</p>
<ol type="1">
<li><strong>Writing style.</strong> How they wrote to customers, peers, executives. Their tone, their hedging, their decision register, their opening and closing patterns — the things that make a reply <em>sound like them</em>.</li>
<li><strong>Historical knowledge.</strong> <em>Why</em> did we pick Postgres in 2023? <em>Why</em> did Acme get a $4,200 credit? Who is Mike Reyes and how should I handle him?</li>
</ol>
<p>The successor inherits an inbox and a Confluence dump. Neither captures <strong>why</strong>.</p>
<p>Onboarding documents tell you what the role does. They do not tell you why six months ago we agreed to give a customer an account credit, what tone the previous CSM used to push back on a procurement team, or which old engineering decisions are settled vs ripe to revisit.</p>
<p>That is the gap <strong>Employee Recall</strong> addresses — an open-source methodology and reference implementation for capturing a departing employee’s writing style and memory as a small, locally-runnable AI model.</p>
<blockquote class="blockquote">
<p>Repo: <a href="https://github.com/kader-xai/EmployeeRecall">github.com/kader-xai/EmployeeRecall</a></p>
</blockquote>
<section id="the-thesis-writing-style-and-knowledge-need-different-machinery" class="level2">
<h2 class="anchored" data-anchor-id="the-thesis-writing-style-and-knowledge-need-different-machinery">The thesis: writing style and knowledge need different machinery</h2>
<p>The single most important architectural decision in this project is to <strong>separate the two</strong>:</p>
<ul>
<li><strong>Writing style</strong> is <em>parametric</em>. It lives in the model’s weights. Bake it in via LoRA fine-tuning on the persona’s reply pairs.</li>
<li><strong>Knowledge</strong> is <em>retrieval</em>. Don’t try to memorise it; embed every document into a vector index and look it up at inference time.</li>
</ul>
<p>People often try to fine-tune for both style and facts at once. It is a bad idea. It bloats the model, it makes facts hard to update, and it costs more compute. Worse, you can’t tell after the fact whether a given answer was in the training data or hallucinated.</p>
<p>By contrast, style is genuinely a low-rank perturbation of the base model — that is exactly what LoRA is for. Facts belong in a vector index that you can rebuild every night. Two cheap pieces, glued together at inference time.</p>
<blockquote class="blockquote">
<p><strong>LoRA gives you the style. RAG gives you the receipts.</strong></p>
</blockquote>
</section>
<section id="architecture" class="level2">
<h2 class="anchored" data-anchor-id="architecture">Architecture</h2>
<p>Four ingredients, recombined at query time:</p>
<pre><code>Base model (Qwen2.5-7B, frozen)
  + LoRA adapter (~150 MB, trained on ~1,300 reply pairs)
  + FAISS index (~50 MB, ~17,000 chunks, BGE-base embeddings)
  + System prompt (a short text fingerprint of the persona)
  = persona-continuity model</code></pre>
<p>A useful analogy: think of the persona as a person.</p>
<table class="caption-top table">
<colgroup>
<col style="width: 50%">
<col style="width: 50%">
</colgroup>
<thead>
<tr class="header">
<th>Layer</th>
<th>Person</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td>Base model</td>
<td>The brain — language, reasoning, general knowledge</td>
</tr>
<tr class="even">
<td>LoRA adapter</td>
<td>The personality — tone, default mood, mannerisms</td>
</tr>
<tr class="odd">
<td>System prompt</td>
<td>Self-awareness — “I am Priya. I am at work. Here are my rules.”</td>
</tr>
<tr class="even">
<td>RAG index</td>
<td>The notes they brought to this meeting</td>
</tr>
</tbody>
</table>
<p>Pull any one of the four out and the model breaks differently:</p>
<ul>
<li>Without the LoRA: a generic AI flavour with the persona’s notes.</li>
<li>Without the RAG: the persona’s writing style with no specific knowledge — confident hallucinations.</li>
<li>Without the system prompt: the model writes in the right style but doesn’t know it’s the persona; introduces itself as “an AI assistant”.</li>
<li>Without the base model: nothing to fine-tune in the first place.</li>
</ul>
</section>
<section id="the-synthetic-dataset" class="level2">
<h2 class="anchored" data-anchor-id="the-synthetic-dataset">The synthetic dataset</h2>
<p>To make the methodology reproducible and shareable without privacy risk, the repo ships with <strong>a fully synthetic corpus</strong>: 18,978 documents across 4 simulated years, generated deterministically (<code>random.seed(42)</code> — same output on every run).</p>
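<p>The determinism is just disciplined seeding. A minimal sketch of the pattern (senders and subjects here are invented, not the repo's):</p>
```python
import random

def generate_corpus(n_docs=5, seed=42):
    """Same seed -> identical synthetic documents on every run."""
    rng = random.Random(seed)        # local RNG: no global-state surprises
    senders = ["priya@northwind.example", "rohan@northwind.example"]
    subjects = ["QBR follow-up", "Renewal timeline", "Postmortem draft"]
    return [
        {"id": i,
         "from": rng.choice(senders),
         "subject": rng.choice(subjects),
         "year": 2022 + rng.randrange(4)}   # 4 simulated years
        for i in range(n_docs)
    ]

assert generate_corpus() == generate_corpus()   # reproducible by construction
```
<p>Using a local <code>random.Random(seed)</code> rather than the module-level functions keeps the generator deterministic even if other code touches the global RNG.</p>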
<p>Two demo personas:</p>
<table class="caption-top table">
<colgroup>
<col style="width: 33%">
<col style="width: 33%">
<col style="width: 33%">
</colgroup>
<thead>
<tr class="header">
<th>Persona</th>
<th>Role</th>
<th>What’s in the corpus</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Priya Sharma</strong></td>
<td>Senior CSM at Northwind SaaS, 40 customer accounts, $4.2M ARR</td>
<td>Emails, meeting notes (QBRs, 1:1s), customer storylines</td>
</tr>
<tr class="even">
<td><strong>Rohan Iyer</strong></td>
<td>Staff Engineer on Platform team</td>
<td>Emails, meeting notes, RFCs, ADRs, postmortems</td>
</tr>
</tbody>
</table>
<p>The corpus has three tiers of content:</p>
<table class="caption-top table">
<colgroup>
<col style="width: 33%">
<col style="width: 33%">
<col style="width: 33%">
</colgroup>
<thead>
<tr class="header">
<th>Layer</th>
<th>Purpose</th>
<th>Volume</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Hand-written storylines</strong></td>
<td>Demo material — the questions that need to land cleanly</td>
<td>8 threads, ~50 docs</td>
</tr>
<tr class="even">
<td><strong>Dense per-account / per-project</strong></td>
<td>Realism — frequent cadence with named entities</td>
<td>~150 docs/year</td>
</tr>
<tr class="odd">
<td><strong>Bulk routine</strong></td>
<td>Ambient volume — generic emails, weekly syncs</td>
<td>~16,000 total</td>
</tr>
</tbody>
</table>
<p>The mistake first-time builders make is generating only bulk content. The model trains fine but the demo falls flat — every answer is generic. The fix is to invest scarce hand-authoring time in 5–8 specific narratives that the demo will actually walk through. The bulk corpus then provides realistic background volume.</p>
<p>We measured this directly: the eval scores 1.0 on hand-written storyline questions and ~0.0 on the same-topic questions whose answers exist only in templated bulk content.</p>
<blockquote class="blockquote">
<p><strong>Hand-write the demo. Generate the rest.</strong></p>
</blockquote>
<p>The corpus is also available as <strong>multi-format extraction</strong> — the same content rendered as <code>.eml</code> / <code>.html</code> / <code>.ics</code> / <code>.vtt</code> / <code>.md</code> / <code>.txt</code> (54,927 files in total). This lets a video demo show real <code>.eml</code> files in Mail.app and real <code>.ics</code> files in Calendar.app — proving the methodology applies to a production extraction pipeline, not just a custom JSON format.</p>
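<p>The <code>.eml</code> rendering itself needs only Python's standard library; a sketch with illustrative field names (not the repo's exact schema):</p>
```python
from email.message import EmailMessage

def to_eml(doc: dict) -> bytes:
    """Render one synthetic email record as an RFC 5322 .eml file."""
    msg = EmailMessage()
    msg["From"] = doc["from"]
    msg["To"] = doc["to"]
    msg["Subject"] = doc["subject"]
    msg["Date"] = doc["date"]
    msg.set_content(doc["body"])
    return bytes(msg)                # ready to write with open(path, "wb")

eml = to_eml({
    "from": "priya@northwind.example",
    "to": "mike.reyes@acme.example",
    "subject": "Re: Q3 credit",
    "date": "Fri, 01 May 2026 09:00:00 +0000",
    "body": "Confirming the $4,200 credit discussed in the QBR.\n",
})
```
<p>Files written this way open natively in Mail.app, which is what makes the video demo convincing.</p>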
</section>
<section id="the-training-pipeline" class="level2">
<h2 class="anchored" data-anchor-id="the-training-pipeline">The training pipeline</h2>
<p>Five scripts run in order:</p>
<pre><code>prep_training_data.py  →  build_rag_index.py  →  train_lora.py  →  inference.py
                                                                        ↓
                                                                     eval.py</code></pre>
<section id="prep" class="level3">
<h3 class="anchored" data-anchor-id="prep">1. Prep</h3>
<p><code>prep_training_data.py</code> takes the JSONL corpus and produces two things:</p>
<ul>
<li><strong>SFT pairs</strong> — every email thread is walked, and any case where the persona replied to a prior message becomes an <code>(incoming → reply)</code> chat-format pair. ~1,287 pairs for Priya, 95/5 train/eval split.</li>
<li><strong>RAG chunks</strong> — every document is broken into retrievable text chunks with metadata. Type-aware: emails kept whole, meetings split by section (decisions and action_items get a retrieval boost), RFCs split by markdown heading.</li>
</ul>
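<p>The thread walk behind the SFT pairs can be sketched in a few lines (the message schema is assumed, not the repo's exact format):</p>
```python
def thread_to_sft_pairs(thread, persona):
    """Each (incoming message -> persona reply) adjacency becomes one chat pair."""
    pairs = []
    for prev, msg in zip(thread, thread[1:]):
        if msg["sender"] == persona and prev["sender"] != persona:
            pairs.append({"messages": [
                {"role": "user", "content": prev["body"]},
                {"role": "assistant", "content": msg["body"]},
            ]})
    return pairs

thread = [
    {"sender": "mike",  "body": "Can we revisit the renewal terms?"},
    {"sender": "priya", "body": "Happy to - let's hold the discount at 8%."},
    {"sender": "mike",  "body": "Works for me."},
]
pairs = thread_to_sft_pairs(thread, "priya")   # one training pair: mike -> priya
```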
</section>
<section id="index" class="level3">
<h3 class="anchored" data-anchor-id="index">2. Index</h3>
<p><code>build_rag_index.py</code> embeds every chunk with <strong>BGE-base-en-v1.5</strong> (768-dim, L2-normalised) and writes a FAISS <code>IndexFlatIP</code>. Exact cosine search. Sub-5 ms per query for ~17k chunks.</p>
<p>The same embedder must be used at query time. This is the single biggest footgun with RAG: a different embedder produces vectors in a different space and similarity scores become meaningless.</p>
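<p>The normalisation detail is why this works: <code>IndexFlatIP</code> is exact inner-product search, and inner product equals cosine similarity only when every vector is L2-normalised. A NumPy-only sketch of the equivalence, with random vectors standing in for BGE embeddings:</p>
```python
import numpy as np

rng = np.random.default_rng(0)
chunks = rng.standard_normal((1000, 768)).astype("float32")  # stand-in embeddings
query = rng.standard_normal(768).astype("float32")

# L2-normalise -> inner product == cosine similarity
chunks /= np.linalg.norm(chunks, axis=1, keepdims=True)
query /= np.linalg.norm(query)

scores = chunks @ query                 # what IndexFlatIP computes
top8 = np.argsort(-scores)[:8]          # k = 8, as in inference.py

# cosine similarity of unit vectors is bounded by [-1, 1]
assert np.all(scores <= 1.0 + 1e-6) and np.all(scores >= -1.0 - 1e-6)
```
<p>Swap in a different embedder at query time and the vectors live in a different space: the arithmetic still runs, but the scores stop meaning anything.</p>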
</section>
<section id="train" class="level3">
<h3 class="anchored" data-anchor-id="train">3. Train</h3>
<p><code>train_lora.py</code> fine-tunes Qwen2.5-7B with LoRA via Unsloth + PEFT + TRL.</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb3" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb3-1">model <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> FastLanguageModel.get_peft_model(</span>
<span id="cb3-2">    model,</span>
<span id="cb3-3">    r<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">16</span>,</span>
<span id="cb3-4">    lora_alpha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">32</span>,</span>
<span id="cb3-5">    lora_dropout<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.05</span>,</span>
<span id="cb3-6">    target_modules<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"q_proj"</span>,<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"k_proj"</span>,<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"v_proj"</span>,<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"o_proj"</span>,</span>
<span id="cb3-7">                    <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"gate_proj"</span>,<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"up_proj"</span>,<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"down_proj"</span>],</span>
<span id="cb3-8">)</span></code></pre></div></div>
<p>Rank-16 LoRA on <strong>all attention and MLP linear layers</strong>. Targeting the attention projections alone would be enough for a narrow task; for <em>writing style</em> you need the MLP layers too. With 4-bit base loading via bitsandbytes, the whole thing fits in ~14 GB VRAM.</p>
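<p>The adapter-size claim is easy to sanity-check: a rank-<em>r</em> LoRA on a <code>d_in × d_out</code> linear layer adds <code>r · (d_in + d_out)</code> parameters. With approximate Qwen2.5-7B dimensions (verify against the model's <code>config.json</code> before relying on them):</p>
```python
# Back-of-envelope LoRA adapter size for rank-16 on all linear layers.
# Dimensions are approximate Qwen2.5-7B config values, not authoritative.
r = 16
hidden, inter, layers = 3584, 18944, 28
kv_dim = 512                      # 4 KV heads x 128 head_dim (grouped-query attn)

per_layer = (
    r * (hidden + hidden) * 2     # q_proj, o_proj
    + r * (hidden + kv_dim) * 2   # k_proj, v_proj
    + r * (hidden + inter) * 3    # gate_proj, up_proj, down_proj
)
total = per_layer * layers
mb_fp32 = total * 4 / 2**20       # with these dims: ~40.4M params, ~154 MB
print(f"{total/1e6:.1f}M LoRA params, ~{mb_fp32:.0f} MB at fp32")
```
<p>At fp32 that lands right around the ~150 MB adapter size quoted above.</p>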
<p>Three epochs is the sweet spot. One epoch underfits the writing style; five overfits to specific phrasings. Cosine LR with a 3% warmup. Boring, reliable.</p>
<p>Cost: ~$0.25 and ~30 minutes on an A100 spot instance.</p>
</section>
<section id="inference" class="level3">
<h3 class="anchored" data-anchor-id="inference">4. Inference</h3>
<p><code>inference.py</code> does three steps per query:</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb4" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb4-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># 1. retrieve</span></span>
<span id="cb4-2">q_emb <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> embedder.encode([q], normalize_embeddings<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">True</span>).astype(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"float32"</span>)</span>
<span id="cb4-3">scores, ids <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> index.search(q_emb, k<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">8</span>)</span>
<span id="cb4-4"></span>
<span id="cb4-5"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># 2. compose prompt with [Source N] labels</span></span>
<span id="cb4-6">sources <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span><span class="ch" style="color: #20794D;
background-color: null;
font-style: inherit;">\n\n</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>.join(<span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"[Source </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>i<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">] </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>chunk[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'text'</span>]<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> i, chunk <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> retrieved)</span>
<span id="cb4-7">messages <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [</span>
<span id="cb4-8">    {<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"role"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"system"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"content"</span>: persona_system_prompt},</span>
<span id="cb4-9">    {<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"role"</span>: <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"user"</span>,   <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"content"</span>: <span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"...QUESTION: </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>q<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ch" style="color: #20794D;
background-color: null;
font-style: inherit;">\n\n</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">SOURCES:</span><span class="ch" style="color: #20794D;
background-color: null;
font-style: inherit;">\n</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>sources<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>},</span>
<span id="cb4-10">]</span>
<span id="cb4-11"></span>
<span id="cb4-12"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># 3. generate</span></span>
<span id="cb4-13">out <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> model.generate(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">**</span>inputs, max_new_tokens<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">800</span>, temperature<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.4</span>)</span></code></pre></div></div>
<p>Citations come from the prompt instruction. The model is told to cite <code>[Source N]</code> inline. Temperature 0.4 keeps the writing style consistent without making it stiff.</p>
</section>
<section id="eval" class="level3">
<h3 class="anchored" data-anchor-id="eval">5. Eval</h3>
<p><code>eval.py</code> runs two automated metrics:</p>
<ul>
<li><strong>History recall</strong> — keyword overlap with gold answers from <code>eval_questions.json</code>.</li>
<li><strong>Style cosine</strong> — cosine similarity of the model’s reply versus the persona’s actual reply on held-out incoming emails.</li>
</ul>
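<p>Both metrics are simple enough to sketch in a few lines. This is an illustrative reimplementation, not the repo's <code>eval.py</code>: the function names are invented, and in the real pipeline the cosine runs over BGE embeddings rather than the toy vectors shown here.</p>

```python
import math

def history_keyword_score(answer: str, gold_keywords: list) -> float:
    """Fraction of gold-answer keywords that appear in the model's reply."""
    answer_lower = answer.lower()
    hits = sum(1 for kw in gold_keywords if kw.lower() in answer_lower)
    return hits / len(gold_keywords)

def cosine(u: list, v: list) -> float:
    """Cosine similarity between two embedding vectors (BGE vectors in the real eval)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Half the gold keywords present -> 0.5
score = history_keyword_score(
    "We landed on a $4.2k credit after the March 4 session.",
    ["$4.2k", "credit", "renewal", "seat-deactivation"],
)
```

<p>Averaged over every question in <code>eval_questions.json</code>, these two numbers are what the run report prints.</p>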
<p>Live numbers from a real Priya run:</p>
<pre><code>History keyword score: 0.30   strong on hand-written storylines, weak elsewhere
Style cosine:          0.735  solid style match</code></pre>
<p>The 0.30 history score reflects the corpus shape: questions whose answers exist only in templated bulk content can’t be answered correctly because the bulk content doesn’t actually contain those facts.</p>
</section>
</section>
<section id="sample-output" class="level2">
<h2 class="anchored" data-anchor-id="sample-output">Sample output</h2>
<p>This is the model answering <em>“What happened with Acme Corp?”</em>:</p>
<blockquote class="blockquote">
<p>Thanks for the patience while I dug into this. The seat-count on Acme’s Q1 invoice came in 31 seats over (388 vs 357), about $14k. <strong>Mike Reyes (CFO)</strong> pushed on this in a working session on March 4 [Source 1]. We landed on a partial credit (<strong>$4.2k</strong>) and Sarah Lin’s team agreed to run quarterly seat-deactivation reviews going forward [Source 2]. The renewal is mid-April; Mike was clear that the credit conversation and the renewal conversation are linked from his side.</p>
</blockquote>
<pre><code>SOURCES:
  [1] meeting-acme-001     2025-03-04   score=0.74
  [2] email-acme-003       2025-03-03   score=0.71
  [3] meeting-acme-1on1    2025-10-16   score=0.70</code></pre>
<p>Three things to notice:</p>
<ol type="1">
<li><strong>Writing style</strong> — <em>“Thanks for the patience while I dug into this”</em> is a Priya opener. Softly linking the credit to the renewal is exactly her register.</li>
<li><strong>Facts</strong> — specific dollar amount (<code>$4.2k</code>), specific seat-count delta (<code>31 seats over</code>), named people from the cast file.</li>
<li><strong>Citations</strong> — the source IDs are real corpus filenames you can <code>cat</code> to verify. Nothing was hallucinated.</li>
</ol>
<p>The killer demo moment: ask <strong>both personas</strong> the same cross-cutting question.</p>
<pre><code>/ask-priya What was the May 2025 Hooli incident from the customer side?
/ask-rohan What was the May 2025 Hooli incident? Walk me through the root cause.</code></pre>
<p>Priya answers from the customer-comms angle: SLA credit, exec sponsor, advocate-program protection. Rohan answers from the engineering angle: misconfigured per-tenant limit, circuit breaker, hardening work in the postmortem.</p>
<p>Same event. Two grounded perspectives. No off-the-shelf model can do that without per-person training data — but a small LoRA + RAG stack can, for roughly twenty-five cents of GPU time.</p>
</section>
<section id="local-deployment" class="level2">
<h2 class="anchored" data-anchor-id="local-deployment">Local deployment</h2>
<p>The whole stack runs on a Mac:</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb8" style="background: #f1f3f5;"><pre class="sourceCode bash code-with-copy"><code class="sourceCode bash"><span id="cb8-1"><span class="ex" style="color: null;
background-color: null;
font-style: inherit;">brew</span> install ollama</span>
<span id="cb8-2"><span class="ex" style="color: null;
background-color: null;
font-style: inherit;">ollama</span> serve <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">&amp;</span></span>
<span id="cb8-3"></span>
<span id="cb8-4"><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">cd</span> local_inference</span>
<span id="cb8-5"><span class="ex" style="color: null;
background-color: null;
font-style: inherit;">ollama</span> create priya <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">-f</span> Modelfile.priya</span>
<span id="cb8-6"><span class="ex" style="color: null;
background-color: null;
font-style: inherit;">ollama</span> run priya</span></code></pre></div></div>
<p>For the cited-answer experience, three options ship in the repo:</p>
<ul>
<li><strong>Jupyter notebook</strong> (<code>local_inference/ask.ipynb</code>) — load the embedder + FAISS index once, then <code>ask("...")</code> per question.</li>
<li><strong>REST API</strong> (<code>local_inference/api.py</code>) — FastAPI on port 8000 with auto-generated Swagger docs at <code>/docs</code>.</li>
<li><strong>Telegram bot</strong> (<code>local_inference/telegram_bot.py</code>) — one self-contained script, no tunnel needed.</li>
</ul>
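<p>All three front-ends share the same core loop: format the retrieved chunks into the QUESTION/SOURCES layout, then hit the local Ollama server. A minimal sketch, assuming each retrieved chunk is a dict with <code>id</code>, <code>date</code>, <code>text</code>, and <code>score</code> keys; the endpoint and payload follow Ollama's standard <code>/api/generate</code> API, and everything else is illustrative rather than the repo's exact code:</p>

```python
import json
import urllib.request

def build_prompt(question: str, retrieved: list) -> str:
    """Format retrieved chunks into the QUESTION/SOURCES layout the model is told to cite."""
    sources = "\n".join(
        f"[Source {i + 1}] ({c['id']}, {c['date']}) {c['text']}"
        for i, c in enumerate(retrieved)
    )
    return f"QUESTION: {question}\n\nSOURCES:\n{sources}"

def ask_ollama(prompt: str, model: str = "priya") -> str:
    """Send the prompt to the local Ollama server (default port 11434)."""
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=json.dumps({"model": model, "prompt": prompt, "stream": False}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

prompt = build_prompt(
    "What happened with Acme Corp?",
    [{"id": "meeting-acme-001", "date": "2025-03-04",
      "text": "Seat-count dispute, partial credit agreed.", "score": 0.74}],
)
```

<p>The notebook, the FastAPI app, and the Telegram bot are all thin wrappers around this same retrieve-format-generate cycle.</p>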
<p>For a Slack demo, an importable n8n workflow + Cloudflare tunnel setup is documented in <a href="https://github.com/kader-xai/EmployeeRecall/blob/main/local_inference/SLACK_N8N_SETUP.md">SLACK_N8N_SETUP.md</a>. The full pipeline:</p>
<pre><code>Slack → cloudflared → n8n :5678 → api.py :8000 → Ollama :11434 → cited reply in Slack</code></pre>
</section>
<section id="cost-and-time" class="level2">
<h2 class="anchored" data-anchor-id="cost-and-time">Cost and time</h2>
<table class="caption-top table">
<thead>
<tr class="header">
<th>Step</th>
<th>Where</th>
<th>Time</th>
<th>Cost</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td>Generate corpus</td>
<td>local laptop</td>
<td>~30 sec</td>
<td>$0</td>
</tr>
<tr class="even">
<td>Prep training data</td>
<td>local laptop</td>
<td>~5 sec</td>
<td>$0</td>
</tr>
<tr class="odd">
<td>Build FAISS index</td>
<td>local laptop</td>
<td>~30 sec</td>
<td>$0</td>
</tr>
<tr class="even">
<td>LoRA fine-tune</td>
<td>Colab A100</td>
<td>~30 min</td>
<td>~$0.25</td>
</tr>
<tr class="odd">
<td>Merge + GGUF + quantise</td>
<td>Colab A100</td>
<td>~10 min</td>
<td>~$0.10</td>
</tr>
<tr class="even">
<td>Daily inference</td>
<td>Mac</td>
<td>—</td>
<td>$0</td>
</tr>
</tbody>
</table>
<p><strong>Total per persona: under $1.</strong></p>
<p>Total compute budget for both demo personas (Priya + Rohan): about $0.70. The dominant cost is the human time spent hand-authoring the storylines, and that is exactly where the cost should sit.</p>
</section>
<section id="beyond-the-demo-the-real-use-case-spectrum" class="level2">
<h2 class="anchored" data-anchor-id="beyond-the-demo-the-real-use-case-spectrum">Beyond the demo: the real use-case spectrum</h2>
<p>The exact same pipeline supports a range of deployments. Pick by how personal the training data is:</p>
<table class="caption-top table">
<colgroup>
<col style="width: 25%">
<col style="width: 25%">
<col style="width: 25%">
<col style="width: 25%">
</colgroup>
<thead>
<tr class="header">
<th>Pattern</th>
<th>LoRA on</th>
<th>RAG on</th>
<th>Risk</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td>Pure company RAG</td>
<td>nothing (use base model)</td>
<td>all internal docs</td>
<td>low — safest first deployment</td>
</tr>
<tr class="even">
<td>Onboarding tutor</td>
<td>company brand style</td>
<td>onboarding handbook</td>
<td>low</td>
</tr>
<tr class="odd">
<td>Role persona</td>
<td>aggregate of all CSMs</td>
<td>new hire’s accounts</td>
<td>medium — depersonalised</td>
</tr>
<tr class="even">
<td><strong>Departing employee twin</strong> <em>(this demo)</em></td>
<td>one specific person</td>
<td>their corpus</td>
<td><strong>high — needs full consent</strong></td>
</tr>
<tr class="odd">
<td>Public digital twin</td>
<td>one public figure</td>
<td>their published work</td>
<td>very high — heavy legal review</td>
</tr>
</tbody>
</table>
<p>The technique is the same across all five rows. What changes along the spectrum is the <em>governance, consent, and audit burden</em>. A “pure company RAG” can ship in a week with low risk. A “departing employee twin” needs a privacy programme around it before it ships at all.</p>
</section>
<section id="privacy-the-part-that-actually-matters" class="level2">
<h2 class="anchored" data-anchor-id="privacy-the-part-that-actually-matters">Privacy: the part that actually matters</h2>
<p>The synthetic corpus in the repo is safe because nothing about it is real. For real deployment with real employees, the technical pipeline is the easy part. The governance is the work:</p>
<ul>
<li><strong>Explicit, written consent</strong> from the persona, scoped to specific corpora and successor users.</li>
<li><strong>Sunset clause</strong> — model retires on a date or on the persona’s request. Re-training is the only true erasure for parametric memorisation.</li>
<li><strong>PII redaction at ingest</strong> — Microsoft Presidio or similar, applied at chunk-write time. Don’t put email addresses, phone numbers, customer IDs into FAISS in the clear.</li>
<li><strong>Access tiers</strong> — tag every doc with a clearance level; filter retrieval per asker. The model should not see what the asker can’t legitimately read.</li>
<li><strong>Audit log</strong> — every query, every retrieval, every output, retained per regulatory requirement.</li>
<li><strong>Memorisation audit</strong> — sample 100 outputs, n-gram-check against training. Refuse to ship if leakage rate exceeds a threshold.</li>
<li><strong>Citation enforcement</strong> — refuse to answer if no source crosses a similarity threshold. <em>“I don’t have a source for that”</em> beats a confident guess.</li>
<li><strong>Mandatory disclaimer</strong> on every output: <em>“Drafted in the writing style of X by an AI; not authored by X.”</em></li>
<li><strong>Memorisation versus retrieval</strong> — RAG is recoverable (delete a doc, re-index, fact gone). LoRA-baked content is harder to remove. Plan accordingly.</li>
</ul>
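<p>The memorisation audit can be as crude as an n-gram overlap check and still catch verbatim leakage. A sketch, where the 8-token window and whitespace tokenisation are illustrative defaults, not numbers from this project:</p>

```python
def ngrams(text: str, n: int = 8) -> set:
    """All n-token windows of a whitespace-tokenised, lowercased text."""
    toks = text.lower().split()
    return {tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)}

def leaks_training_text(output: str, training_docs: list, n: int = 8) -> bool:
    """True if any n consecutive tokens of the output appear verbatim in training data."""
    out_grams = ngrams(output, n)
    return any(out_grams & ngrams(doc, n) for doc in training_docs)

def leakage_rate(outputs: list, training_docs: list, n: int = 8) -> float:
    """Fraction of sampled outputs that reproduce a verbatim training n-gram."""
    return sum(leaks_training_text(o, training_docs, n) for o in outputs) / len(outputs)
```

<p>Run it over the ~100 sampled outputs, compare the rate against your shipping threshold, and block the release if it comes in high.</p>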
<p>The technique is real. The risks are real. Synthetic-data demos are safe; real-data deployment is a privacy programme, not a code project.</p>
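<p>Of those controls, the access-tier filter is the simplest to implement: drop retrieved chunks above the asker's clearance before the model ever sees them. A sketch; the integer tiers and the <code>clearance</code> key are invented for illustration, not the repo's schema:</p>

```python
def filter_by_clearance(chunks: list, asker_clearance: int) -> list:
    """Per-asker retrieval filtering: the model only sees what the asker may read."""
    return [c for c in chunks if c["clearance"] <= asker_clearance]

# An asker cleared to tier 1 never sees the tier-3 board memo.
visible = filter_by_clearance(
    [{"id": "email-001", "clearance": 1}, {"id": "board-memo", "clearance": 3}],
    asker_clearance=1,
)
```

<p>Because the filter runs at retrieval time, revoking access is immediate: no re-training, no re-indexing.</p>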
</section>
<section id="stack" class="level2">
<h2 class="anchored" data-anchor-id="stack">Stack</h2>
<p>The whole project is built on open tools:</p>
<table class="caption-top table">
<thead>
<tr class="header">
<th>Layer</th>
<th>Tool</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td>Base model</td>
<td><a href="https://huggingface.co/Qwen/Qwen2.5-7B-Instruct">Qwen2.5-7B-Instruct</a></td>
</tr>
<tr class="even">
<td>LoRA training</td>
<td><a href="https://github.com/unslothai/unsloth">Unsloth</a> + <a href="https://github.com/huggingface/peft">PEFT</a> + <a href="https://github.com/huggingface/trl">TRL</a></td>
</tr>
<tr class="odd">
<td>4-bit base loading</td>
<td><a href="https://github.com/bitsandbytes-foundation/bitsandbytes">bitsandbytes</a></td>
</tr>
<tr class="even">
<td>Embeddings</td>
<td><a href="https://huggingface.co/BAAI/bge-base-en-v1.5">BGE-base-en-v1.5</a></td>
</tr>
<tr class="odd">
<td>Vector index</td>
<td><a href="https://github.com/facebookresearch/faiss">FAISS</a></td>
</tr>
<tr class="even">
<td>GGUF conversion</td>
<td><a href="https://github.com/ggerganov/llama.cpp">llama.cpp</a></td>
</tr>
<tr class="odd">
<td>Local inference</td>
<td><a href="https://ollama.com">Ollama</a></td>
</tr>
<tr class="even">
<td>API wrapper</td>
<td><a href="https://fastapi.tiangolo.com">FastAPI</a></td>
</tr>
<tr class="odd">
<td>Workflow / Slack</td>
<td><a href="https://n8n.io">n8n</a></td>
</tr>
</tbody>
</table>
<p>Apache or MIT-licensed throughout. No proprietary tooling needed at any step.</p>
</section>
<section id="try-it" class="level2">
<h2 class="anchored" data-anchor-id="try-it">Try it</h2>
<p>Three paths into the repo, ranked by effort:</p>
<section id="run-the-demo-personas" class="level3">
<h3 class="anchored" data-anchor-id="run-the-demo-personas">1. Run the demo personas</h3>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb10" style="background: #f1f3f5;"><pre class="sourceCode bash code-with-copy"><code class="sourceCode bash"><span id="cb10-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">git</span> clone https://github.com/kader-xai/EmployeeRecall.git</span></code></pre></div></div>
<p>Open <code>training/Persona_Continuity_Colab.ipynb</code> in Google Colab. Set runtime to A100. Run All. Thirty minutes later you have a working LoRA + RAG system answering questions about Priya’s accounts.</p>
</section>
<section id="train-on-your-own-persona" class="level3">
<h3 class="anchored" data-anchor-id="train-on-your-own-persona">2. Train on your own persona</h3>
<p>The system is fully parameterised. Copy <code>personas/priya.json</code> to <code>personas/yourname.json</code>, edit the fingerprint fields (<code>tone_profile</code>, <code>vocab_fingerprint</code>, etc.), drop your corpus into <code>corpus/yourname/</code> as JSONL, and re-run the same four scripts.</p>
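<p>For orientation, this is roughly the shape of a minimal persona file. Only <code>tone_profile</code> and <code>vocab_fingerprint</code> are field names mentioned above; the values, and the idea that the fingerprint is a phrase list, are invented placeholders rather than the repo's actual schema:</p>

```python
import json

persona = {
    "tone_profile": "warm, direct, opens with thanks, closes with concrete next steps",
    "vocab_fingerprint": ["landed on", "going forward", "from his side"],
}

# Serialise in the same shape you would save to personas/yourname.json
blob = json.dumps(persona, indent=2)
```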
</section>
<section id="pure-rag-only-skip-the-lora" class="level3">
<h3 class="anchored" data-anchor-id="pure-rag-only-skip-the-lora">3. Pure RAG only — skip the LoRA</h3>
<p>If you only need the <em>memory</em> part — citations, document Q&amp;A — and don’t want to deal with style cloning at all, skip <code>train_lora.py</code> entirely. The inference script will retrieve and cite using the base model. This is the <strong>safest deployment pattern</strong> for sensitive corpora since there’s no parametric memorisation risk.</p>
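<p>The retrieval-only pattern pairs naturally with a refusal gate: if no chunk clears a similarity threshold, don't answer at all. A sketch; the 0.60 cutoff and the <code>generate</code> callable are placeholders for whatever threshold and model call you wire in:</p>

```python
REFUSAL = "I don't have a source for that."

def answer_or_refuse(question: str, retrieved: list, generate, min_score: float = 0.60):
    """Citation enforcement: answer only when retrieval actually supports it."""
    supported = [c for c in retrieved if c["score"] >= min_score]
    if not supported:
        return REFUSAL, []
    return generate(question, supported), supported

# With only a weak 0.31 match, the gate refuses instead of guessing.
reply, cited = answer_or_refuse(
    "What happened with Acme Corp?",
    [{"id": "email-acme-003", "score": 0.31}],
    generate=lambda q, chunks: "(model answer)",
)
```

<p>A refusal is cheap; a confidently wrong answer attributed to a real colleague is not.</p>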
</section>
</section>
<section id="whats-open-sourced" class="level2">
<h2 class="anchored" data-anchor-id="whats-open-sourced">What’s open-sourced</h2>
<p>Everything:</p>
<ul>
<li>The code (MIT)</li>
<li>The 18,978-document synthetic corpus and persona JSON (CC0 — public domain)</li>
<li>The full training pipeline + Colab notebook</li>
<li>The local-inference stack (notebook, FastAPI, Telegram bot)</li>
<li>The n8n workflow for Slack</li>
<li>Methodology docs, lecture deck, technical detail walkthrough</li>
</ul>
<blockquote class="blockquote">
<p><strong>Repo:</strong> <a href="https://github.com/kader-xai/EmployeeRecall">github.com/kader-xai/EmployeeRecall</a></p>
</blockquote>
<p>If you build something on top of this — especially with real (consented) employee data — please open an issue with what you learned. The hard parts of this project are not in the code; they are in the deployment governance, and we are all figuring that out together.</p>
</section>
<section id="tldr" class="level2">
<h2 class="anchored" data-anchor-id="tldr">TL;DR</h2>
<ul>
<li>A senior employee’s writing style and historical knowledge are the most valuable things they take when they leave.</li>
<li>The architecture is simple: <strong>LoRA for writing style</strong> (parametric, distilled), <strong>RAG for knowledge</strong> (retrieval, updateable), <strong>system prompt for identity</strong> (text, swappable).</li>
<li>A complete reproduction recipe — including a fully synthetic 19k-document corpus with two demo personas — is open-source under <a href="https://github.com/kader-xai/EmployeeRecall">github.com/kader-xai/EmployeeRecall</a>.</li>
<li>Trains in 30 minutes on an A100 for ~$0.25. Runs on a Mac via Ollama for free.</li>
<li>Same pipeline supports a spectrum of deployments from “pure company RAG” through “public digital twin.” The technique scales; the governance work is what changes.</li>
</ul>
<p>If you want to talk about this — building it, deploying it, or the privacy programme around it — find me on <a href="https://linkedin.com/in/kader-m-1a6023a6">LinkedIn</a> or open an issue on the repo.</p>


</section>

 ]]></description>
  <category>AI</category>
  <category>LLM</category>
  <category>LoRA</category>
  <category>RAG</category>
  <category>Fine-tuning</category>
  <category>Knowledge Management</category>
  <guid>https://kader-xai.github.io/blog/2026-05-01-employee-recall-lora-rag/</guid>
  <pubDate>Fri, 01 May 2026 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Exploring the 2024 MAD (Machine Learning, AI &amp; Data) Landscape</title>
  <dc:creator>Kader Mohideen</dc:creator>
  <link>https://kader-xai.github.io/blog/2024-07-03-2024-mad-landscape/</link>
  <description><![CDATA[ 





<p>The convergence of machine learning, artificial intelligence (AI), and data science has heralded a transformative era. The FirstMark-curated <strong>2024 MAD Landscape</strong> examines the current ecosystem and the major innovations shaping progress across these interconnected fields.</p>
<section id="infrastructure" class="level2">
<h2 class="anchored" data-anchor-id="infrastructure">Infrastructure 🏗️</h2>
<p>The foundation supporting AI systems includes:</p>
<ul>
<li><strong>Data Storage</strong> — Snowflake, Databricks, and Amazon S3 manage large-scale data efficiently, with Snowflake excelling at multi-cloud environments.</li>
<li><strong>Data Integration &amp; ETL</strong> — Fivetran, Stitch, and Talend automate data unification across sources.</li>
<li><strong>Data Governance &amp; Security</strong> — Collibra, Alation, and Immuta provide compliance and privacy frameworks.</li>
<li><strong>Compute &amp; Infrastructure</strong> — AWS, Google Cloud, and Microsoft Azure deliver essential cloud computing capabilities.</li>
</ul>
</section>
<section id="analytics" class="level2">
<h2 class="anchored" data-anchor-id="analytics">Analytics 📊</h2>
<ul>
<li><strong>Business Intelligence</strong> — Tableau, Looker, and Power BI enable data visualization and actionable insights.</li>
<li><strong>Data Science Platforms</strong> — DataRobot, H2O.ai, and Dataiku simplify ML model development and deployment.</li>
<li><strong>Data Engineering</strong> — dbt Labs, Matillion, and Astronomer build and manage data pipelines.</li>
</ul>
</section>
<section id="machine-learning-ai" class="level2">
<h2 class="anchored" data-anchor-id="machine-learning-ai">Machine Learning &amp; AI 🤖</h2>
<ul>
<li><strong>ML &amp; AI Platforms</strong> — IBM Watson, Google AI, and Microsoft Azure ML provide comprehensive development tools.</li>
<li><strong>MLOps</strong> — Domino Data Lab, Algorithmia, and Tecton ensure production model reliability.</li>
<li><strong>NLP</strong> — OpenAI, Hugging Face, and Cohere advance language understanding technology.</li>
</ul>
</section>
<section id="applications" class="level2">
<h2 class="anchored" data-anchor-id="applications">Applications 🌐</h2>
<ul>
<li><strong>Enterprise</strong> — Salesforce and HubSpot leverage AI for customer engagement.</li>
<li><strong>Healthcare</strong> — Tempus and PathAI revolutionize diagnostics and treatment.</li>
<li><strong>Finance</strong> — Zest AI and Kensho provide predictive analytics and risk assessment.</li>
</ul>
</section>
<section id="data-sources-apis" class="level2">
<h2 class="anchored" data-anchor-id="data-sources-apis">Data Sources &amp; APIs 📡</h2>
<ul>
<li><strong>Public Marketplaces</strong> — AWS Data Exchange, Datarade, and Snowflake Data Marketplace offer extensive datasets.</li>
<li><strong>Integration APIs</strong> — Twilio, Stripe, and Plaid enable seamless data and functionality integration.</li>
</ul>
</section>
<section id="open-source-infrastructure" class="level2">
<h2 class="anchored" data-anchor-id="open-source-infrastructure">Open Source Infrastructure 🔓</h2>
<ul>
<li><strong>Frameworks</strong> — TensorFlow, PyTorch, and Scikit-learn offer flexibility for model building.</li>
<li><strong>Data Tools</strong> — Apache Kafka, Apache Spark, and Druid manage large-scale data processing.</li>
</ul>
</section>
<section id="consulting-strategy" class="level2">
<h2 class="anchored" data-anchor-id="consulting-strategy">Consulting &amp; Strategy 🧠</h2>
<ul>
<li><strong>Major Firms</strong> — Deloitte, McKinsey, and BCG offer specialized AI and data science consulting.</li>
<li><strong>Specialists</strong> — Element AI and Cognizant provide targeted implementation expertise.</li>
</ul>
<div class="callout callout-style-default callout-note callout-titled">
<div class="callout-header d-flex align-content-center">
<div class="callout-icon-container">
<i class="callout-icon"></i>
</div>
<div class="callout-title-container flex-fill">
Note
</div>
</div>
<div class="callout-body-container callout-body">
<p>Originally published on riddlesphere.com on July 3, 2024.</p>
</div>
</div>


</section>

 ]]></description>
  <category>AI</category>
  <category>Machine Learning</category>
  <category>Data</category>
  <guid>https://kader-xai.github.io/blog/2024-07-03-2024-mad-landscape/</guid>
  <pubDate>Wed, 03 Jul 2024 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Attack-Centric Framework — AC1</title>
  <dc:creator>Kader Mohideen</dc:creator>
  <link>https://kader-xai.github.io/blog/2024-06-13-acf-ac1/</link>
  <description><![CDATA[ 





<div class="callout callout-style-default callout-warning callout-titled">
<div class="callout-header d-flex align-content-center">
<div class="callout-icon-container">
<i class="callout-icon"></i>
</div>
<div class="callout-title-container flex-fill">
Warning
</div>
</div>
<div class="callout-body-container callout-body">
<p>This is a stub — the original post on riddlesphere.com is currently unreachable for migration. The full content will be restored here when recovered.</p>
</div>
</div>
<section id="introduction" class="level2">
<h2 class="anchored" data-anchor-id="introduction">Introduction</h2>
<p>The <strong>Attack-Centric Framework (ACF)</strong> is introduced as a response to the extensive array of compliance standards that organizations must navigate in cybersecurity today.</p>
<p>Where traditional compliance-driven approaches focus on satisfying audit requirements, the attack-centric perspective re-orients defense around the realities of how attackers operate — what techniques they use, what assets they target, and how to disrupt them at each stage.</p>
<p>Continue with <a href="../2024-06-13-acf-ac2/">Attack-Centric Framework — AC2</a> for the eight key components of the framework.</p>
<div class="callout callout-style-default callout-note callout-titled">
<div class="callout-header d-flex align-content-center">
<div class="callout-icon-container">
<i class="callout-icon"></i>
</div>
<div class="callout-title-container flex-fill">
Note
</div>
</div>
<div class="callout-body-container callout-body">
<p>Originally published on riddlesphere.com on June 13, 2024.</p>
</div>
</div>


</section>

 ]]></description>
  <category>Cybersecurity</category>
  <category>Attack-Centric Framework</category>
  <category>Compliance</category>
  <guid>https://kader-xai.github.io/blog/2024-06-13-acf-ac1/</guid>
  <pubDate>Thu, 13 Jun 2024 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Attack-Centric Framework — AC2</title>
  <dc:creator>Kader Mohideen</dc:creator>
  <link>https://kader-xai.github.io/blog/2024-06-13-acf-ac2/</link>
  <description><![CDATA[ 





<section id="overview" class="level2">
<h2 class="anchored" data-anchor-id="overview">Overview</h2>
<p>The Attack-Centric Framework presents a comprehensive cybersecurity approach structured around eight key components.</p>
</section>
<section id="key-components" class="level2">
<h2 class="anchored" data-anchor-id="key-components">Key Components</h2>
<p>The framework emphasizes <strong>unified risk assessment</strong> that consolidates existing methodologies for evaluating vulnerabilities and threats. It incorporates proactive security measures drawing from <strong>zero trust principles and threat intelligence integration</strong>.</p>
<p>The approach includes <strong>dynamic defense protocols</strong> designed to respond to threats in real-time, alongside integration of compliance standards like <strong>GDPR and ISO/IEC 27001</strong>. The framework also highlights <strong>threat-centric analytics</strong> focused on detection and incident response.</p>
<p>Additionally, the strategy addresses <strong>industry-specific customization, continuous improvement cycles, and organizational culture</strong>. Fostering a security-conscious culture is a cornerstone of effective cybersecurity.</p>
</section>
<section id="conclusion" class="level2">
<h2 class="anchored" data-anchor-id="conclusion">Conclusion</h2>
<p>The Attack-Centric Framework offers organizations a structured approach to cybersecurity defense, combining technical controls with strategic thinking and organizational awareness.</p>
<p>See also: <a href="../2024-06-13-acf-ac1/">Attack-Centric Framework — AC1</a> for the introduction to ACF.</p>
<div class="callout callout-style-default callout-note callout-titled">
<div class="callout-header d-flex align-content-center">
<div class="callout-icon-container">
<i class="callout-icon"></i>
</div>
<div class="callout-title-container flex-fill">
Note
</div>
</div>
<div class="callout-body-container callout-body">
<p>Originally published on riddlesphere.com on June 13, 2024.</p>
</div>
</div>


</section>

 ]]></description>
  <category>Cybersecurity</category>
  <category>Attack-Centric Framework</category>
  <category>Zero Trust</category>
  <guid>https://kader-xai.github.io/blog/2024-06-13-acf-ac2/</guid>
  <pubDate>Thu, 13 Jun 2024 00:00:00 GMT</pubDate>
</item>
</channel>
</rss>
