Semantic Search

Find past decisions effectively

Semantic Search: Finding What You Need

Duration: 5 minutes | Difficulty: Beginner

The Problem with Keyword Search

You logged a decision about "choosing PostgreSQL for ACID transactions."

Six months later, you search for "database reliability" and find... nothing.

Keyword search fails because the query shares no words with the logged decision.

How Continuity Search Works

Continuity uses hybrid search: keyword matching + semantic similarity.

┌──────────────────────────────────────────────────────────────┐
│                        HYBRID SEARCH                         │
├──────────────────────────────────────────────────────────────┤
│                                                              │
│  Your Query: "database reliability"                          │
│                                                              │
│  ┌─────────────────┐    ┌──────────────────────┐             │
│  │ Keyword Match   │    │ Semantic Match       │             │
│  │ (40% weight)    │    │ (60% weight)         │             │
│  │                 │    │                      │             │
│  │ Finds:          │    │ Finds:               │             │
│  │ - "database"    │    │ - PostgreSQL ACID    │             │
│  │ - "reliability" │    │ - data consistency   │             │
│  │                 │    │ - transaction safety │             │
│  └─────────────────┘    └──────────────────────┘             │
│                                                              │
│  Combined Results: PostgreSQL decision found!                │
│                                                              │
└──────────────────────────────────────────────────────────────┘
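
The weighting in the diagram can be sketched as a simple linear blend. This is only an illustration: the 40/60 weights come from the diagram, but the function name `hybrid_score`, the sample scores, the assumption that both inputs are already normalized to the range 0 to 1, and the linear formula itself are assumptions rather than Continuity's confirmed implementation.

```python
KEYWORD_WEIGHT = 0.4   # weight shown in the diagram
SEMANTIC_WEIGHT = 0.6  # weight shown in the diagram


def hybrid_score(keyword_score: float, semantic_score: float) -> float:
    """Blend two scores (each assumed to lie in [0, 1]) into one ranking score."""
    return KEYWORD_WEIGHT * keyword_score + SEMANTIC_WEIGHT * semantic_score


# Hypothetical (keyword, semantic) scores for the query "database reliability".
candidates = {
    "Why PostgreSQL for user data?": (0.0, 0.9),         # no shared words, close meaning
    "Why add database connection pooling?": (0.5, 0.4),  # shares the word "database"
    "Why bcrypt for passwords?": (0.0, 0.1),             # unrelated
}

# Sort questions by blended score, best match first.
ranked = sorted(candidates, key=lambda q: hybrid_score(*candidates[q]), reverse=True)
for question in ranked:
    print(f"{hybrid_score(*candidates[question]):.2f}  {question}")
```

Under these made-up numbers the PostgreSQL decision ranks first even though its keyword score is zero, which is the behavior the diagram describes.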

Basic Search

@continuity search_decisions query="authentication"

Returns decisions matching "authentication" by:

  • Exact keyword match
  • Tag match
  • Semantic similarity (MiniLM embeddings)

Search with Filters

Limit results

@continuity search_decisions query="database" limit=5

Minimum similarity score

@continuity search_decisions query="performance" minScore=0.5

Score ranges:

  • 0.8+ - Exact or near-exact match
  • 0.5-0.8 - Strongly related
  • 0.3-0.5 - Loosely related
  • <0.3 - Probably not relevant
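
If you post-process results yourself, the bands above can be turned into a small helper. This is a sketch: the function name `relevance_band` is made up for illustration, and treating each lower bound as inclusive is an assumption, since the ranges above do not say which band a score of exactly 0.5 or 0.3 belongs to.

```python
def relevance_band(score: float) -> str:
    """Map a similarity score to one of the bands listed above."""
    if score >= 0.8:
        return "exact or near-exact match"
    if score >= 0.5:
        return "strongly related"
    if score >= 0.3:
        return "loosely related"
    return "probably not relevant"


for s in (0.82, 0.67, 0.3, 0.12):
    print(s, "->", relevance_band(s))
```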

Understanding Search Results

{
  "decisions": [
    {
      "id": "decision-123",
      "question": "Why PostgreSQL for user data?",
      "answer": "ACID transactions required for payment reliability...",
      "score": 0.82,
      "matchType": "semantic"
    },
    {
      "id": "decision-89",
      "question": "Why add database connection pooling?",
      "answer": "PostgreSQL max_connections limit...",
      "score": 0.67,
      "matchType": "hybrid"
    }
  ],
  "query": "database reliability",
  "totalResults": 2
}
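
A response of this shape is easy to filter in a script. The sketch below assumes you have the JSON as a string; the `summarize` helper is hypothetical, and the sample is a trimmed copy of the response above with the `answer` fields left out.

```python
import json

# Trimmed copy of the sample response ("answer" fields omitted).
raw = """
{
  "decisions": [
    {"id": "decision-123", "question": "Why PostgreSQL for user data?",
     "score": 0.82, "matchType": "semantic"},
    {"id": "decision-89", "question": "Why add database connection pooling?",
     "score": 0.67, "matchType": "hybrid"}
  ],
  "query": "database reliability",
  "totalResults": 2
}
"""


def summarize(response: dict, min_score: float = 0.0) -> list[str]:
    """Return one summary line per decision at or above min_score, best first."""
    kept = [d for d in response["decisions"] if d["score"] >= min_score]
    kept.sort(key=lambda d: d["score"], reverse=True)
    return [f'{d["score"]:.2f} [{d["matchType"]}] {d["id"]}: {d["question"]}' for d in kept]


response = json.loads(raw)
for line in summarize(response, min_score=0.7):
    print(line)
```

With `min_score=0.7` only the PostgreSQL decision survives; dropping the threshold returns both entries in score order.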

Match Types

Type        Meaning
keyword     Found by exact word match
semantic    Found by meaning similarity
tag         Found by tag match
hybrid      Found by multiple methods

Semantic Search Examples

Find conceptually similar decisions

Query: "caching strategy"
Finds:

  • "Why use Redis?" (mentions caching)
  • "Why add CDN?" (conceptually related to caching)
  • "Why implement memoization?" (type of caching)

Find decisions by problem, not solution

Query: "slow page loads"
Finds:

  • "Why add Redis caching?" (solves slow loads)
  • "Why optimize database queries?" (solves slow loads)
  • "Why use lazy loading?" (solves slow loads)

Find related but different concepts

Query: "security best practices"
Finds:

  • "Why bcrypt for passwords?" (security)
  • "Why JWT over sessions?" (authentication security)
  • "Why rate limiting?" (API security)

How Embeddings Work

Continuity uses MiniLM (all-MiniLM-L6-v2) for semantic embeddings:

  1. Text → Vector: Your decision text becomes a 384-dimensional vector
  2. Query → Vector: Your search query becomes a vector
  3. Cosine Similarity: Measures the cosine of the angle between the two vectors (closer to 1 means more similar)
  4. Ranking: Higher similarity = better match

All processing runs locally: embeddings are computed on your machine with no external API calls, so decision text never leaves it.
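
The similarity and ranking steps can be illustrated without any model at all. The sketch below uses toy 3-dimensional vectors as stand-ins for real 384-dimensional embeddings; the vector values are invented, and returning 0.0 for a zero vector is a convention chosen here, not something the tutorial specifies.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


# Toy vectors standing in for real embeddings (values are made up).
query = [0.9, 0.1, 0.0]
decisions = {
    "Why PostgreSQL for user data?": [0.8, 0.2, 0.1],
    "Why bcrypt for passwords?": [0.0, 0.1, 0.9],
}

# Rank decisions by similarity to the query, most similar first.
ranked = sorted(decisions, key=lambda q: cosine_similarity(query, decisions[q]), reverse=True)
for question in ranked:
    print(f"{cosine_similarity(query, decisions[question]):.3f}  {question}")
```

Because cosine similarity depends only on direction, two texts of very different length can still score close to 1 if their vectors point the same way.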

Optimizing Your Searches

Be specific

❌ "database"
✅ "database performance optimization"

Use natural language

❌ "auth jwt session"
✅ "why did we choose JWT over sessions"

Search by problem, not solution

❌ "Redis"
✅ "caching for high traffic"

Checking Embedding Coverage

Semantic search can only match decisions that have embeddings. To check coverage, run:

@continuity get_quick_context

If semantic search isn't finding related decisions:

  1. Check embedding coverage - Older decisions might lack embeddings
  2. Regenerate embeddings - Run the embedding generation script
  3. Verify MiniLM is loaded - Check MCP logs for transformer initialization
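
If you export decisions and want to measure coverage yourself, the check is a simple count. This is a sketch under stated assumptions: the record shape, the `embedding` field name, and the `embedding_coverage` helper are all hypothetical and may not match how Continuity stores decisions.

```python
def embedding_coverage(decisions: list[dict]) -> float:
    """Fraction of decisions with a non-empty embedding (0.0 for an empty list)."""
    if not decisions:
        return 0.0
    with_embedding = sum(1 for d in decisions if d.get("embedding"))
    return with_embedding / len(decisions)


# Hypothetical records; the "embedding" field name is an assumption.
decisions = [
    {"id": "decision-89", "embedding": [0.1, 0.2]},
    {"id": "decision-123", "embedding": [0.3, 0.4]},
    {"id": "decision-7", "embedding": None},  # older decision, never embedded
]

# Collect the ids that semantic search cannot match yet.
missing = [d["id"] for d in decisions if not d.get("embedding")]
print(f"coverage: {embedding_coverage(decisions):.0%}, missing: {missing}")
```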

Search vs Browse

When to Search

  • Looking for a specific topic
  • Need to find a past decision quickly
  • Checking whether a decision already exists

When to Browse

  • Exploring decision history
  • Understanding project evolution
  • Finding patterns across decisions

To browse, use:
# Browse all decisions
@continuity list_all_decisions limit=20

# Browse by subject
@continuity get_decisions_by_subject subject="authentication"

# Browse timeline
Cmd+Shift+P → "Continuity: Show Timeline"

Real Search Scenarios

Scenario 1: Before adding a dependency

@continuity search_decisions query="state management library"

Find out whether you already decided on Redux, Zustand, or the Context API.

Scenario 2: Understanding a legacy decision

@continuity search_decisions query="why deprecated old auth"

Find the reasoning behind past changes.

Scenario 3: Checking for conflicts

@continuity search_decisions query="microservices vs monolith"

See whether an architecture philosophy has already been recorded.

Key Takeaways

  1. Hybrid search combines keywords + semantics
  2. MiniLM embeddings enable meaning-based matching
  3. 100% local - no API calls, full privacy
  4. Score thresholds help filter relevance
  5. Natural language queries work best
  6. Search by problem to find solutions

Troubleshooting

Search returns nothing?

  • Try different keywords
  • Lower minScore threshold
  • Check if decisions exist: list_all_decisions

Wrong results?

  • Be more specific in query
  • Add context words
  • Use quotes for exact phrases

Slow search?

  • MiniLM initialization takes ~2-3 seconds on the first search
  • Subsequent searches are fast (<100ms)
