Embeddings

Overview

An embedding is a dense numerical vector that represents meaning. Similar concepts land close together in vector space: "king" and "queen" are nearer to each other than "king" and "car". Models such as OpenAI text-embedding-3-small, Cohere Embed, or open-source sentence-transformers convert text (words, sentences, documents) into fixed-length arrays of floats.

Embeddings power semantic search, recommendations, clustering, and RAG retrieval. They compress language into coordinates you can compare with math instead of keyword matching.

Syntax / Usage

Core operations:

1. Embed query and documents with the same model → vectors (e.g. 1536 dimensions)
2. Compare with cosine similarity or dot product (the two agree when vectors are unit length)
3. Return top-k nearest neighbors
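
The three steps above can be sketched as a brute-force search. This is only an illustration: the 3-dimensional vectors are made-up stand-ins for real model output, and top_k and the document IDs are hypothetical names, not part of any library. The scoring uses the same cosine formula given in the next snippet.

```python
import heapq
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


def top_k(query: list[float], docs: dict[str, list[float]], k: int = 2) -> list[tuple[str, float]]:
    """Score every document against the query and keep the k best."""
    scored = ((cosine_similarity(query, vec), doc_id) for doc_id, vec in docs.items())
    best = heapq.nlargest(k, scored)  # sorted from highest score to lowest
    return [(doc_id, score) for score, doc_id in best]


# Toy vectors (assumption): a real model would return hundreds of dimensions.
docs = {
    "reset-password": [0.9, 0.1, 0.0],
    "sso-login": [0.7, 0.6, 0.1],
    "billing": [0.0, 0.1, 0.9],
}
query = [1.0, 0.2, 0.0]

results = top_k(query, docs, k=2)
for doc_id, score in results:
    print(f"{doc_id}: {score:.3f}")
```

Scanning every vector like this is fine for a few thousand documents; larger collections typically rely on the approximate nearest-neighbor indexes that vector databases provide.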

Cosine similarity (values −1 to 1, higher = more similar):

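import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    # zip() silently truncates, so reject vectors of different lengths up front.
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    # A zero vector has no direction; return 0 instead of dividing by zero.
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)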
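
Step 2 lists cosine similarity and dot product as alternatives. A small sketch of why they are interchangeable once vectors are scaled to unit length follows; normalize and dot are illustrative helper names, and the two 2-dimensional vectors are chosen only to keep the arithmetic easy to check by hand.

```python
import math


def normalize(v: list[float]) -> list[float]:
    """Scale a vector to unit length (L2 norm of 1)."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def dot(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


a = [3.0, 4.0]  # length 5
b = [4.0, 3.0]  # length 5

# Cosine computed the long way: dot / (|a| * |b|) = 24 / (5 * 5)
cosine = dot(a, b) / (math.hypot(*a) * math.hypot(*b))

# Same value from a plain dot product once both vectors are unit length.
unit_dot = dot(normalize(a), normalize(b))

print(cosine, unit_dot)
```

Normalizing once at indexing time means each query costs only a dot product per document, which is one reason many pipelines store unit-length vectors.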

API call pattern:

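// Assumes apiKey holds a valid OpenAI API key, e.g. read from an environment variable.
const response = await fetch("https://api.openai.com/v1/embeddings", {
  method: "POST",
  headers: { Authorization: `Bearer ${apiKey}`, "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "text-embedding-3-small",
    input: ["How do I reset my password?", "Password reset steps for SSO"],
  }),
});
// Fail loudly on HTTP errors instead of trying to parse an error body as embeddings.
if (!response.ok) {
  throw new Error(`Embeddings request failed: ${response.status}`);
}
// data holds one entry per input string.
const { data } = await response.json();
const vectors = data.map((d: { embedding: number[] }) => d.embedding);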

Store vectors in pgvector (Supabase/Postgres), Pinecone, Weaviate, or Qdrant for scalable search.

Examples

Semantic FAQ lookup:

User query: "can't log in after changing email"
Top match:  "Updating account email breaks SSO session" (score 0.89)
Weak match: "Billing FAQ" (score 0.31)

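Chunking long docs before embedding (500–1000 tokens per chunk with overlap) usually improves retrieval precision, because each vector then represents one focused passage rather than a blend of many topics. Store metadata (source URL, title) with each vector so RAG answers can cite their sources.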
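
A minimal sketch of overlapping chunks with attached metadata is shown here. It assumes whitespace-separated words are a good enough stand-in for model tokens (a real pipeline would count tokens with the model's tokenizer); Chunk, chunk_document, and the example URL are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    source_url: str
    title: str
    start: int  # index of the chunk's first token in the document


def chunk_document(text: str, source_url: str, title: str,
                   size: int = 500, overlap: int = 50) -> list[Chunk]:
    """Split text into windows of `size` tokens that share `overlap` tokens."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    tokens = text.split()  # assumption: whitespace words approximate tokens
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        window = tokens[start:start + size]
        chunks.append(Chunk(" ".join(window), source_url, title, start))
        if start + size >= len(tokens):
            break  # this window already reached the end of the document
    return chunks


# A synthetic 1200-word document: w0 w1 ... w1199
doc = " ".join(f"w{i}" for i in range(1200))
chunks = chunk_document(doc, "https://example.com/docs/sso", "SSO guide")

for c in chunks:
    print(c.start, len(c.text.split()), c.title)
```

The overlap keeps a sentence that straddles a boundary fully present in at least one chunk, and the stored start offset plus URL make it possible to point a citation back at the original passage.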

Common Mistakes

Embedding entire books as one vector: queries match poorly, so chunk instead
Mixing embedding models in one index (dimensions and geometry differ between models)
Using Euclidean distance on unnormalized vectors when cosine similarity is the appropriate metric
Re-embedding the corpus on every query: embed documents once and cache the vectors
Assuming high similarity always means factual equivalence (related or even contradictory statements can score high)
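
One way to avoid re-embedding the corpus on every query is to memoize vectors by a hash of the text. The sketch below is illustrative only: EmbeddingCache is a hypothetical class, and fake_embed is a made-up stand-in for a real model call so the example runs offline.

```python
import hashlib
from typing import Callable

Vector = list[float]


class EmbeddingCache:
    """Memoize embeddings by content hash so unchanged text is never re-embedded."""

    def __init__(self, embed_fn: Callable[[str], Vector]) -> None:
        self._embed_fn = embed_fn
        self._store: dict[str, Vector] = {}
        self.calls = 0  # number of times the underlying model was invoked

    def get(self, text: str) -> Vector:
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key not in self._store:
            self.calls += 1
            self._store[key] = self._embed_fn(text)
        return self._store[key]


def fake_embed(text: str) -> Vector:
    """Hypothetical stand-in for a real embedding model call."""
    return [float(len(text)), float(text.count(" "))]


cache = EmbeddingCache(fake_embed)
corpus = ["Password reset steps for SSO", "Billing FAQ"]

for _ in range(3):  # three "queries" arrive; the corpus is embedded only once
    doc_vectors = [cache.get(doc) for doc in corpus]

print(cache.calls)
```

In practice the cache would usually live in the vector store itself, and switching to a different embedding model should invalidate it, since vectors from different models are not comparable.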
