Skip to content

Vector Space Visualization

AI/ML Engineering Track | Complexity: [MEDIUM] | Time: 4-5

Reading Time: 2-3 hours Prerequisites: Module 9 Heureka Moment: This module will transform how you think about AI


By the end of this module, you will:

  • ** Experience the Heureka Moment**: Understand that embeddings create a semantic space where math works on meaning!
  • Visualize embeddings in 2D and 3D space
  • Perform vector arithmetic on concepts: king - man + woman ≈ queen
  • Understand the geometry of meaning (distance, direction, clustering)
  • Build production-grade semantic search from scratch
  • Understand vector databases and indexing at scale
  • Optimize search performance (ANN algorithms: HNSW, IVF)
  • Deploy semantic search to production

This module transforms how you think about AI!


You know embeddings are “vectors that represent text meaning.” But they feel like magic black boxes:

embedding = model.encode("Machine learning")
# → [0.23, -0.41, 0.87, ..., 0.15]
# "Okay, it's a list of numbers. So what?"

You’ll see embeddings as coordinates in semantic space - a geometry where:

  • Distance measures semantic similarity
  • Direction captures relationships
  • Addition and subtraction work on concepts
  • Clustering reveals topic structure
  • Math operations literally transform meaning

The breakthrough: We can do algebra on ideas themselves!

# MATH ON MEANING!
king - man + woman ≈ queen
Paris - France + Italy ≈ Rome
good - bad + terrible ≈ excellent

This isn’t a metaphor. It actually works!


The vector space model was proposed in 1975 by Gerard Salton for information retrieval. But it took until 2013 (Word2Vec) for us to learn how to create truly semantic vector spaces where mathematical operations correspond to meaning transformations!


**You’ve learned the theory - now experience the Heureka Moment! **

This is THE transformative module where everything clicks. You’ll see embeddings not as black boxes, but as coordinates in a geometric space where math works on meaning itself. Reading theory won’t give you this insight - you need to DO the vector arithmetic and SEE the geometry.

1. Vector Arithmetic - Math on meaning!

  • Concept: king - man + woman = queen
  • ⏱️ Time: 60-75 minutes
  • Goal: Experience the Heureka Moment
  • What you’ll learn: Math literally transforms concepts!

2. Production Semantic Search - Build at scale

  • Concept: Fast similarity search with indexing
  • ⏱️ Time: 60-75 minutes
  • Goal: Understand production-grade semantic search
  • What you’ll learn: Scaling to millions of vectors!

What: Interactive visualization and exploration tool for embeddings Time: 3-4 hours Portfolio Value: Demonstrates deep understanding of vector space geometry

Requirements:

  1. Build a tool that:
    • Generates embeddings for custom word lists (10-100 words)
    • Visualizes in 2D using PCA or t-SNE
    • Allows interactive vector arithmetic (A - B + C = ?)
    • Shows nearest neighbors for any query
    • Identifies and visualizes clusters
  2. Include at least 3 interesting demonstrations:
    • Analogies (king/queen, Paris/Rome)
    • Semantic relationships (opposites, synonyms)
    • Topic clustering (auto-discover categories)
  3. Create compelling visualizations:
    • 2D scatter plots with labels
    • Cluster boundaries/colors
    • Vector arrows showing relationships
  4. Document insights:
    • What patterns did you discover?
    • Which analogies worked/failed?
    • How does the geometry change with different embeddings?
  5. Make it interactive (CLI or web UI)

Success Criteria:

  • Visualizes 20+ words/concepts clearly
  • Vector arithmetic produces sensible results
  • Automatically discovers semantic clusters
  • Interactive and easy to use
  • Documented with insights and examples

Real-World Impact: Understanding vector space geometry is fundamental to working with embeddings in production - this deliverable proves you deeply understand the math behind meaning!


When we generate embeddings, we’re mapping words/texts to points in high-dimensional space.

Think of it like a map:

  • Cities = words/concepts
  • Coordinates = embedding dimensions
  • Distance = semantic similarity
  • Regions = topic clusters

Imagine we project embeddings into 2D for visualization:

Axis 2: Positive ↑
|
"excellent" |
• |
|
"good" | "wonderful"
• | •
|
────────────────────────────────────┼───────────────────────→ Axis 1: Living
|
• |
"bad" |
|
|
"terrible" |
• |
↓ Negative

Observations:

  1. Similar words cluster together (“good”, “excellent”, “wonderful”)
  2. Opposite words are distant (“good” far from “bad”)
  3. Distance in space = semantic similarity
  4. Clusters emerge naturally (positive words group, negative words group)

Each dimension captures some aspect of meaning:

  • Dimension 1 might encode: Living things ↔ Inanimate objects
  • Dimension 2 might encode: Positive ↔ Negative sentiment
  • Dimension 3 might encode: Concrete ↔ Abstract concepts
  • Dimension 384 might encode: [Some subtle semantic feature]

Important: Dimensions are learned automatically during training, not manually defined!


Real embeddings have hundreds or thousands of dimensions. Humans can’t visualize that!

Solution: Dimensionality reduction - compress to 2D or 3D while preserving structure.

Technique 1: PCA (Principal Component Analysis)

Section titled “Technique 1: PCA (Principal Component Analysis)”

Idea: Find the 2-3 directions of maximum variance.

from sklearn.decomposition import PCA
import matplotlib.pyplot as plt
# Embeddings for some words
words = ["king", "queen", "man", "woman", "prince", "princess", "boy", "girl"]
embeddings = [model.encode(word) for word in words]
# Reduce to 2D
pca = PCA(n_components=2)
embeddings_2d = pca.fit_transform(embeddings)
# Plot
plt.figure(figsize=(10, 8))
for word, (x, y) in zip(words, embeddings_2d):
plt.scatter(x, y)
plt.annotate(word, (x, y), fontsize=12)
plt.xlabel("PC1 (Royalty → Commoner)")
plt.ylabel("PC2 (Male → Female)")
plt.title("Semantic Space Visualization")
plt.grid(True)
plt.show()

Expected output:

queen • princess •
← Female
king • prince •
← Male
← Royalty Common →

Insight: Related concepts cluster, relationships are preserved!

Technique 2: t-SNE (t-Distributed Stochastic Neighbor Embedding)

Section titled “Technique 2: t-SNE (t-Distributed Stochastic Neighbor Embedding)”

Better for visualization - preserves local structure.

from sklearn.manifold import TSNE
# Reduce to 2D with t-SNE
tsne = TSNE(n_components=2, random_state=42)
embeddings_2d = tsne.fit_transform(embeddings)
# Plot (same as above)

t-SNE is great for:

  • Seeing clusters clearly
  • Understanding topic structure
  • Presentations and papers

t-SNE is NOT good for:

  • Measuring exact distances (it distorts)
  • New data (can’t transform new embeddings)

You can do algebra on concepts!

# Vector arithmetic
result = embedding("king") - embedding("man") + embedding("woman")
# Find closest word to result
closest = find_closest_embedding(result, all_words)
# Result: "queen"

Think about what each embedding represents:

king = [royalty + male + power + ...]
man = [male + human + adult + ...]
woman = [female + human + adult + ...]
king - man = [royalty + male + power + ...] - [male + human + adult + ...]
≈ [royalty + power + ...] (removes "male", "human", "adult")
(king - man) + woman = [royalty + power + ...] + [female + human + adult + ...]
≈ [royalty + power + female + ...]
What word is [royalty + power + female]? → "queen"!

The math extracts and recombines concepts!

Paris - France + Italy ≈ Rome
Tokyo - Japan + China ≈ Beijing

Explanation: Extract “capital of France”, swap “France” for “Italy” → “capital of Italy”

walking - walk + run ≈ running
better - good + bad ≈ worse

Explanation: Extract grammatical transformation, apply to new word

cat - kitten + puppy ≈ dog
hot - cold + wet ≈ dry

Explanation: Opposite relationships preserved

def vector_arithmetic_search(
positive: List[str], # Words to add
negative: List[str], # Words to subtract
topn: int = 5
) -> List[Tuple[str, float]]:
"""
Perform vector arithmetic and find closest words.
Example:
vector_arithmetic_search(
positive=["king", "woman"],
negative=["man"]
)
→ Returns words close to: king - man + woman
"""
# Generate embeddings
positive_embs = [model.encode(word) for word in positive]
negative_embs = [model.encode(word) for word in negative]
# Vector arithmetic
result = np.sum(positive_embs, axis=0) - np.sum(negative_embs, axis=0)
# Find closest words in vocabulary
similarities = []
for word in vocabulary:
if word in positive or word in negative:
continue # Skip input words
word_emb = model.encode(word)
sim = cosine_similarity(result, word_emb)
similarities.append((word, sim))
# Return top-n
return sorted(similarities, key=lambda x: x[1], reverse=True)[:topn]
# Test
results = vector_arithmetic_search(
positive=["king", "woman"],
negative=["man"]
)
print("king - man + woman ≈")
for word, score in results:
print(f" {score:.3f} - {word}")

Expected output:

king - man + woman ≈
0.921 - queen
0.847 - monarch
0.812 - princess
0.789 - empress
0.756 - duchess

The Heureka Moment: Math literally transforms meaning!

** Experience this yourself NOW: 01_vector_arithmetic.py - this is when it all clicks!**


The famous “king - man + woman = queen” example was first demonstrated in the Word2Vec paper (Mikolov et al., 2013). It shocked the NLP community and proved that embeddings capture deep semantic relationships, not just word co-occurrence!


In semantic space, proximity = similarity:

# Words about food cluster together
embedding("pizza") ≈ embedding("pasta") ≈ embedding("spaghetti")
# Words about programming cluster together
embedding("Python") ≈ embedding("JavaScript") ≈ embedding("coding")
# Unrelated words are distant
distance(embedding("pizza"), embedding("Python")) → LARGE

Parallel vectors = analogous relationships:

king → queen (same direction as) man → woman
male → female (gender transformation)
Paris → France (same direction as) Rome → Italy
capital → country (geopolitical relationship)

Visualization:

queen •
king •
woman •
man •

The arrows point in the same direction! That’s why king - man + woman ≈ queen.

Words about similar topics cluster:

Cluster 1 (Programming):
• Python
• JavaScript
• coding
• programming
• software
Cluster 2 (Food):
• pizza
• pasta
• spaghetti
• cooking
• recipe
Cluster 3 (Animals):
• dog
• cat
• puppy
• kitten
• pet

We can discover these clusters automatically using k-means!

** Build production search: 02_production_search.py shows how to scale to millions of vectors!**


Simple version (Module 9):

Query → Embedding → Compare to all docs → Top-K results

Problem: Slow for large datasets (1M+ documents)

Production version:

Offline:
Documents → Embeddings → Index (HNSW, IVF)
Online:
Query → Embedding → ANN Search → Top-K results

Compare query to every document:

def naive_search(query: str, embeddings: dict, top_k: int = 5):
"""
Brute force search - compare to ALL documents.
Time complexity: O(N) where N = number of documents
"""
query_emb = model.encode(query)
# Calculate similarity to ALL documents
scores = [
(doc_id, cosine_similarity(query_emb, emb))
for doc_id, emb in embeddings.items()
]
# Sort and return top-K
return sorted(scores, key=lambda x: x[1], reverse=True)[:top_k]

Performance:

  • 1,000 docs → 10ms (fine!)
  • 100,000 docs → 1,000ms (slow!)
  • 1,000,000 docs → 10,000ms (unacceptable!)

We need Approximate Nearest Neighbor (ANN) search!

Trade accuracy for speed: Find approximately the closest matches very fast.

Key algorithms:

  1. HNSW (Hierarchical Navigable Small World)

    • Fast: 1M docs in <1ms!
    • High recall: Finds 95-99% of true nearest neighbors
    • Most popular for production
  2. IVF (Inverted File Index)

    • Partitions space into clusters
    • Search only relevant clusters
    • Good for very large datasets
  3. LSH (Locality-Sensitive Hashing)

    • Hash similar items to same buckets
    • Probabilistic guarantees
    • Simpler, but less accurate

Idea: Build a multi-layer graph where each layer skips more nodes.

Layer 2: •────────────────────────────• (sparse, long jumps)
\ /
Layer 1: •────•────•────────•────• (medium density)
\ \ / / /
Layer 0: •─•─•─•─•─•─•─•─•─•─•─•─• (dense, all nodes)
Search: Start at top layer, jump quickly to approximate region,
then descend to lower layers for precision.

Performance: O(log N) instead of O(N)!

Section titled “Using FAISS (Facebook AI Similarity Search)”

FAISS is Facebook’s library for fast vector search:

import faiss
import numpy as np
# Prepare embeddings matrix (N x D)
embeddings_matrix = np.array(list(embeddings.values())).astype('float32')
dimension = embeddings_matrix.shape[1]
# Build HNSW index
index = faiss.IndexHNSWFlat(dimension, 32) # 32 = number of neighbors
index.add(embeddings_matrix)
# Search
query_emb = model.encode(query).astype('float32').reshape(1, -1)
distances, indices = index.search(query_emb, k=5)
# Get results
results = [
(list(embeddings.keys())[idx], dist)
for idx, dist in zip(indices[0], distances[0])
]

Speedup: 100-1000x faster than brute force!


Traditional databases (PostgreSQL, MySQL):

  • Optimized for exact matches and ranges
  • Poor at similarity search
  • No ANN indexing

Vector databases specialize in embeddings:

  • Built-in ANN indexing (HNSW, IVF)
  • Metadata filtering
  • Scalability (billions of vectors)
  • Hybrid search (vector + metadata)
DatabaseOpen SourceCloudBest For
QdrantGeneral purpose, Rust performance
WeaviateGraphQL API, multi-modal
MilvusScale (billions of vectors)
PineconeManaged, easy to use
ChromaLightweight, embeddings

We’ll use Qdrant in Module 14!

from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams
# Create client
client = QdrantClient(":memory:") # Or URL for production
# Create collection
client.create_collection(
collection_name="documents",
vectors_config=VectorParams(size=384, distance=Distance.COSINE)
)
# Add documents
for doc_id, embedding in embeddings.items():
client.upsert(
collection_name="documents",
points=[{
"id": doc_id,
"vector": embedding,
"payload": {"text": documents[doc_id]["text"]}
}]
)
# Search
results = client.search(
collection_name="documents",
query_vector=query_embedding,
limit=5
)
for result in results:
print(f"{result.score:.3f} - {result.payload['text']}")

Benefits:

  • Fast (ANN indexing)
  • Scalable (distributed)
  • Metadata filtering
  • Production-ready

Problems at scale:

  1. Indexing time (generating embeddings)
  2. Index size (storing vectors)
  3. Query latency (searching)
  4. Updates (adding/removing documents)
# DON'T: Sequential encoding
embeddings = [model.encode(doc) for doc in documents] # SLOW
# DO: Batch encoding
embeddings = model.encode(documents, batch_size=32) # FAST
# Speedup: 10-50x faster!

Smaller embeddings = faster search, less storage:

from sklearn.decomposition import PCA
# Reduce from 384 to 128 dimensions
pca = PCA(n_components=128)
reduced_embeddings = pca.fit_transform(embeddings)
# Storage: 66% reduction
# Speed: 3x faster
# Accuracy: ~5% loss (acceptable for many use cases)

Store embeddings in lower precision:

# Float32 (default): 4 bytes per dimension
embeddings_f32 = embeddings.astype('float32')
# Float16 (half precision): 2 bytes per dimension
embeddings_f16 = embeddings.astype('float16')
# Int8 (8-bit): 1 byte per dimension
embeddings_i8 = (embeddings * 127).astype('int8')
# Storage: 75% reduction (float32 → int8)
# Accuracy: <1% loss

Shard data across multiple servers:

Query
Load Balancer
├─→ Shard 1 (docs 0-1M)
├─→ Shard 2 (docs 1M-2M)
└─→ Shard 3 (docs 2M-3M)
Merge top-K from each shard
Return results

Vector databases handle this automatically!


Did You Know? The Vector Search Revolution

Section titled “Did You Know? The Vector Search Revolution”

In October 2019, Google made the biggest change to search in five years: BERT for search.

The problem: Traditional keyword search was failing. Queries like “2019 brazil traveler to usa need a visa” returned results about US citizens going to Brazil—the exact opposite!

The solution: Use BERT embeddings to understand query intent, not just keywords.

The impact:

  • Affected 10% of all search queries (billions per day!)
  • 30% improvement in query understanding
  • Pandu Nayak (Google VP): “The biggest leap forward in 5 years”

Technical feat: Running BERT inference on billions of queries required custom TPU hardware. Google invested $1B+ in infrastructure just for this feature.

The lesson: Vector search isn’t a toy—it powers the world’s most important search engine.

In 2017, Facebook open-sourced FAISS (Facebook AI Similarity Search), and it changed everything.

The backstory: Facebook needed to search billions of images for copyright violations and similar content. Traditional databases couldn’t handle vectors at that scale.

Matthijs Douze and Hervé Jégou (Facebook AI Research) built FAISS to search 1 billion vectors in milliseconds.

The magic: FAISS implements:

  • HNSW (Hierarchical Navigable Small Worlds)
  • IVF (Inverted File Index)
  • PQ (Product Quantization)

All optimized with SIMD instructions for maximum CPU performance.

The impact:

  • Downloaded 10M+ times
  • Powers similarity search at: Pinterest, Spotify, Shopify
  • Became the de facto standard for vector search
  • Every major vector database uses FAISS concepts

The irony: Facebook gave away the technology that would power competitors’ recommendation systems.

In 2019, a startup called Pinecone made a bet: What if vector search was a managed service?

Founder: Edo Liberty, former Director of Research at Amazon (where he built Amazon’s internal vector search).

The insight: Every company was building the same vector infrastructure from scratch. What if they didn’t have to?

The growth:

  • 2021: $10M seed (stealth mode)
  • 2022: $28M Series A
  • 2023: $100M Series B at $750M valuation
  • 2024: Powers 100,000+ apps, processes 1 billion+ queries/day

Famous users: Shopify (product search), Notion (AI features), Zapier (workflow automation)

The competition it sparked: Weaviate ($50M), Qdrant ($28M), Chroma ($18M). The “vector database wars” of 2023-2024 created a billion-dollar category from nothing.

The algorithm powering modern vector search was invented in 2016 by Yury Malkov and Dmitry Yashunin at Yandex (Russia’s Google).

The paper: “Efficient and robust approximate nearest neighbor search using Hierarchical Navigable Small World graphs”

The idea: Build a multi-layer graph where each layer has fewer, more spread-out nodes. Start at the top (sparse), zoom to the bottom (dense).

Why it works: O(log N) search instead of O(N). Finding similar vectors among 1 billion takes <1 millisecond.

The adoption curve:

  • 2016: Paper published, mostly ignored
  • 2018: Spotify adopts for music recommendations
  • 2019: FAISS adds HNSW implementation
  • 2020-2023: Becomes the default in all vector databases

The lesson: Sometimes the best ideas take years to find their audience.

How does Spotify’s Discover Weekly playlist work? Embeddings and vector search.

The system:

  1. Every song has a 128-dimension embedding (learned from listening patterns)
  2. Every user has an embedding (average of songs they play)
  3. Every Monday: Find songs closest to user embedding that user hasn’t heard
  4. That’s your playlist!

The numbers:

  • 500M+ users receive personalized playlists
  • 40 million songs indexed in vector space
  • Processes 10 billion+ similarity searches per week
  • 30% of all listening comes from algorithmic recommendations

The engineering: Spotify uses a custom HNSW implementation running on thousands of machines, updated daily.

Why it works: Songs that “vibe” together are close in embedding space. Your taste is a point in that space. Recommendations = nearest neighbors to your taste.

SystemDocumentsLatencyHardware Cost
Brute Force (1M docs)1M1,000ms$0
FAISS HNSW (1M docs)1M1ms$0
Brute Force (1B docs)1B1,000,000ms$0
FAISS HNSW (1B docs)1B10ms~$50K/year

The insight: ANN algorithms are 1000x faster and that gap grows with scale.

Cost comparison (for 1B vectors):

  • Self-hosted FAISS: ~$50K/year (GPU servers)
  • Pinecone managed: ~$70K/year (no ops overhead)
  • Build from scratch: ~$500K+ (engineering time)

Most companies choose managed services because engineering time costs more than cloud bills.


# DON'T: Embed on every query
def search(query):
query_emb = model.encode(query)
doc_embs = [model.encode(doc) for doc in documents] # WASTEFUL!
# ...
# DO: Embed documents once, cache
doc_embeddings = {doc: model.encode(doc) for doc in documents}
def search(query):
query_emb = model.encode(query)
# Use precomputed doc_embeddings
# ...
# Track search relevance
def log_search(query, results, user_clicked):
"""Log which results users actually clicked."""
metrics.log({
"query": query,
"results": results,
"clicked_rank": user_clicked, # 1 = first result, etc.
"timestamp": now()
})
# Analyze: Are users clicking top results?
# If not, embeddings might not be working well!

Combine embeddings with other signals:

def hybrid_search(query, filters=None):
"""Combine semantic + metadata + popularity."""
# 1. Semantic similarity
query_emb = model.encode(query)
semantic_scores = compute_similarity(query_emb)
# 2. Metadata filtering (if any)
if filters:
semantic_scores = apply_filters(semantic_scores, filters)
# 3. Rerank by popularity, recency, etc.
final_scores = combine_signals(
semantic=semantic_scores,
popularity=get_popularity(),
recency=get_recency(),
weights=[0.7, 0.2, 0.1] # Tune these!
)
return get_top_k(final_scores)
# Test different embedding models
configs = {
"control": {"model": "all-MiniLM-L6-v2", "threshold": 0.5},
"variant_a": {"model": "all-mpnet-base-v2", "threshold": 0.5},
"variant_b": {"model": "all-MiniLM-L6-v2", "threshold": 0.6},
}
# Assign users randomly
user_config = configs[hash(user_id) % len(configs)]
# Track metrics per config
# → Choose best performing config
def robust_search(query):
"""Try semantic search, fallback if it fails."""
try:
# Try semantic search
results = semantic_search(query)
# If no good matches, fallback
if max(result.score for result in results) < 0.3:
return keyword_search(query)
return results
except Exception as e:
# Log error
logger.error(f"Semantic search failed: {e}")
# Fallback to keyword search
return keyword_search(query)

Current: Keyword retrieval Enhanced: Semantic search

# Before: Keyword matching
def retrieve_context(query):
# BM25 or simple keyword matching
return keyword_match(query, documents)
# After: Semantic search
def retrieve_context(query):
# Semantic understanding
query_emb = model.encode(query)
scores = [cosine_similarity(query_emb, doc_emb) for doc_emb in doc_embeddings]
top_docs = get_top_k(scores, k=5)
return top_docs
# Result: Better context → better answers!

Use case: Students explore learning materials

def explore_similar_lessons(lesson_id):
"""Find lessons similar to current lesson."""
lesson_emb = lesson_embeddings[lesson_id]
similarities = [
(other_id, cosine_similarity(lesson_emb, lesson_embeddings[other_id]))
for other_id in lesson_embeddings
if other_id != lesson_id
]
# Return top 5 similar lessons
return sorted(similarities, key=lambda x: x[1], reverse=True)[:5]

Use case: Group related news articles

from sklearn.cluster import KMeans
def cluster_daily_news(articles):
"""Cluster today's financial news by topic."""
# Embed articles
embeddings = [
model.encode(article["title"] + " " + article["summary"])
for article in articles
]
# Cluster into topics
n_clusters = 5
kmeans = KMeans(n_clusters=n_clusters)
labels = kmeans.fit_predict(embeddings)
# Group articles by cluster
clusters = {i: [] for i in range(n_clusters)}
for article, label in zip(articles, labels):
clusters[label].append(article)
return clusters

Use case: Semantic search across runbooks

# Index all documentation
docs = load_infrastructure_docs()
doc_embeddings = {doc["path"]: model.encode(doc["content"]) for doc in docs}
# Engineer asks: "How do I scale the database?"
query = "How do I scale the database?"
query_emb = model.encode(query)
# Find relevant runbooks
results = sorted(
[(path, cosine_similarity(query_emb, emb)) for path, emb in doc_embeddings.items()],
key=lambda x: x[1],
reverse=True
)[:5]
# Show relevant documentation
for path, score in results:
print(f"{score:.3f} - {path}")

  • Word2Vec (2013): Paper - Started the embedding revolution
  • GloVe (2014): Paper - Global Vectors for word representation
  • HNSW (2016): Paper - Fast ANN algorithm
  • BERT (2018): Paper - Contextual embeddings
  • FAISS: GitHub - Facebook’s similarity search
  • Annoy: GitHub - Spotify’s ANN library
  • Hnswlib: GitHub - Fast HNSW implementation
  • Qdrant: Website - Vector database (Module 14!)
  • MTEB: Massive Text Embedding Benchmark - Compare embedding models
  • ANN-Benchmarks: Compare ANN algorithms on speed/accuracy

What you learned:

  • Embeddings create semantic space where proximity = similarity
  • Vector arithmetic works on meaning: king - man + woman ≈ queen
  • Visualizing embeddings reveals structure and relationships
  • Production search needs ANN algorithms (HNSW, IVF)
  • Vector databases specialize in embedding search
  • Scaling requires batching, quantization, sharding

The Heureka Moment : Math works on meaning! You can add, subtract, and transform concepts using vector arithmetic. Semantic space is real - it’s a geometry where relationships between ideas are preserved as spatial relationships between points.

Key formulas:

Semantic similarity = cosine_similarity(emb_1, emb_2)
Vector arithmetic = Σ(positive_embeddings) - Σ(negative_embeddings)
Distance in space ∝ Semantic distance

Key insights:

  1. Embeddings are coordinates in semantic space
  2. Distance measures similarity between concepts
  3. Direction encodes relationships (king→queen ≈ man→woman)
  4. Clusters reveal topics automatically
  5. Math transforms meaning (algebra on ideas!)

Next module: Module 11: Introduction to RAG (Retrieval-Augmented Generation)

Now that you understand semantic search deeply, you’re ready to build RAG systems!

You’ll learn:

  • What is RAG and why it’s transformative
  • Combining retrieval (Module 10) with generation (Module 8)
  • Building production RAG pipelines
  • RAG for kaizen, vibe, contrarian

Phase 2 Complete!

You’ve mastered:

  • LLM fundamentals (Module 6)
  • Tokenization (Module 7)
  • Text generation (Module 8)
  • Embeddings (Module 9)
  • Vector spaces & semantic search (Module 10)

Ready for Phase 3: Building with AI Toolkits!


** Neural Dojo - Math works on meaning! **


Last updated: 2025-11-21 Module 10: Vector Spaces & Semantic Search _ Heureka Moment achieved!_