SimpleProvider RAG Implementation
Overview
The SimpleProvider is a lightweight, high-performance RAG implementation that stores documents as JSON and ranks them with multi-factor text search. It delivers strong keyword-search quality without the complexity of vector embeddings, making it well suited to small and medium-sized knowledge bases.
Current Architecture
Key Features
✅ Advanced Text Search - Multi-factor relevance scoring with term frequency, phrase matching, and coverage analysis
✅ Optimal Performance - O(n log n) sorting and efficient document processing
✅ VectorProvider Interface - Clean abstraction compatible with the provider registry
✅ LangChain Integration - Uses LangChain Go for PDF processing and text splitting
✅ Production Ready - Comprehensive error handling and resource management
✅ Zero Dependencies - No external vector databases required
✅ High Performance - Suitable for knowledge bases of up to roughly 10,000 documents
Current Implementation
1. VectorProvider Interface
Located in internal/rag/provider_interface.go:
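The interface itself isn't reproduced on this page, so the sketch below only illustrates the general shape: a provider that can ingest documents and answer relevance-ranked queries. The type and method names (Document, SearchResult, AddDocuments, Search, Close) are illustrative assumptions, not the exact signatures in provider_interface.go.

```go
package rag

import "context"

// Document is a hypothetical unit of stored content with its metadata.
type Document struct {
	ID       string            `json:"id"`
	Content  string            `json:"content"`
	Metadata map[string]string `json:"metadata,omitempty"`
}

// SearchResult pairs a document with its relevance score.
type SearchResult struct {
	Document Document
	Score    float64
}

// VectorProvider sketches the provider abstraction: any backend
// (SimpleProvider, OpenAI vector store, etc.) that can ingest documents
// and answer relevance-ranked queries.
type VectorProvider interface {
	AddDocuments(ctx context.Context, docs []Document) error
	Search(ctx context.Context, query string, limit int) ([]SearchResult, error)
	Close() error
}
```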
2. SimpleProvider Implementation
Located in internal/rag/simple_provider.go (435 lines):
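As a rough sketch of what a JSON-backed provider of this kind can look like (reusing the Document and SearchResult types from the interface sketch above): documents live in a slice loaded from a single JSON file, writes persist the whole slice back, and every query scores all documents and sorts the hits. Everything below, including the placeholder scoring loop, is an assumption for illustration rather than the actual 435-line implementation.

```go
package rag

import (
	"context"
	"encoding/json"
	"os"
	"sort"
	"strings"
	"sync"
)

// SimpleProvider is a sketch of a JSON-backed, keyword-search provider.
// All field and method names here are illustrative, not the actual source.
type SimpleProvider struct {
	path string       // JSON file backing the knowledge base
	mu   sync.RWMutex // guards docs for concurrent reads and writes
	docs []Document
}

// NewSimpleProvider loads any existing documents from the JSON file.
func NewSimpleProvider(path string) (*SimpleProvider, error) {
	p := &SimpleProvider{path: path}
	data, err := os.ReadFile(path)
	if os.IsNotExist(err) {
		return p, nil // no file yet: start with an empty knowledge base
	}
	if err != nil {
		return nil, err
	}
	if err := json.Unmarshal(data, &p.docs); err != nil {
		return nil, err
	}
	return p, nil
}

// AddDocuments appends documents and persists the whole set back to disk.
func (p *SimpleProvider) AddDocuments(ctx context.Context, docs []Document) error {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.docs = append(p.docs, docs...)
	data, err := json.MarshalIndent(p.docs, "", "  ")
	if err != nil {
		return err
	}
	return os.WriteFile(p.path, data, 0o644)
}

// Search scores every document against the query and returns the top matches,
// sorted by descending score; the sort dominates at O(n log n).
func (p *SimpleProvider) Search(ctx context.Context, query string, limit int) ([]SearchResult, error) {
	p.mu.RLock()
	defer p.mu.RUnlock()

	var results []SearchResult
	for _, d := range p.docs {
		// Placeholder score: count of query terms present in the document.
		// The real provider uses multi-factor scoring (sketched further below).
		score := 0.0
		lower := strings.ToLower(d.Content)
		for _, term := range strings.Fields(strings.ToLower(query)) {
			if strings.Contains(lower, term) {
				score++
			}
		}
		if score > 0 {
			results = append(results, SearchResult{Document: d, Score: score})
		}
	}
	sort.Slice(results, func(i, j int) bool { return results[i].Score > results[j].Score })
	if limit > 0 && len(results) > limit {
		results = results[:limit]
	}
	return results, nil
}

// Close satisfies the VectorProvider interface; nothing to release here.
func (p *SimpleProvider) Close() error { return nil }
```

The placeholder score above just counts matching terms; the multi-factor scoring the provider actually relies on is sketched in the Search Quality Features section further down.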
3. Provider Registration
SimpleProvider automatically registers itself in the provider factory:
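A minimal sketch of what that self-registration can look like, assuming a hypothetical registry with RegisterProvider and NewProvider functions and the provider name "simple"; the real factory may use different names and signatures.

```go
package rag

import "fmt"

// providerFactories is a hypothetical registry mapping provider names to constructors.
var providerFactories = map[string]func(path string) (VectorProvider, error){}

// RegisterProvider makes a provider constructor available to the factory.
func RegisterProvider(name string, factory func(path string) (VectorProvider, error)) {
	providerFactories[name] = factory
}

// NewProvider builds a registered provider by name.
func NewProvider(name, path string) (VectorProvider, error) {
	factory, ok := providerFactories[name]
	if !ok {
		return nil, fmt.Errorf("unknown rag provider %q", name)
	}
	return factory(path)
}

// init runs at package load time, so importing the package is enough
// for "simple" to become a selectable provider.
func init() {
	RegisterProvider("simple", func(path string) (VectorProvider, error) {
		return NewSimpleProvider(path)
	})
}
```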
4. MCP Client Integration
Located in internal/rag/client.go (238 lines):
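The client code isn't shown here; as a hedged sketch, the MCP-facing layer can be a thin wrapper that runs a search through the VectorProvider and formats the hits as text for a tool response. The Client, NewClient, and Query names below are assumptions, not the structure of the actual 238-line file.

```go
package rag

import (
	"context"
	"fmt"
	"strings"
)

// Client is a sketch of the thin layer an MCP tool handler could call:
// it wraps a VectorProvider and renders search results as plain text.
type Client struct {
	provider VectorProvider
}

func NewClient(provider VectorProvider) *Client {
	return &Client{provider: provider}
}

// Query runs a search and renders the top results as a single string,
// suitable for returning from a tool call.
func (c *Client) Query(ctx context.Context, query string, limit int) (string, error) {
	results, err := c.provider.Search(ctx, query, limit)
	if err != nil {
		return "", fmt.Errorf("rag search failed: %w", err)
	}
	if len(results) == 0 {
		return "No matching documents found.", nil
	}
	var b strings.Builder
	for i, r := range results {
		fmt.Fprintf(&b, "%d. (score %.2f) %s\n", i+1, r.Score, r.Document.Content)
	}
	return b.String(), nil
}
```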
Usage Examples
CLI Usage
Via Slack MCP Tool
Provider Factory Usage
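Assuming the hypothetical registry sketched earlier, calling code might obtain and use a provider like this; the import path, the "simple" provider name, and the knowledge.json path are placeholders, not values from the repository.

```go
package main

import (
	"context"
	"fmt"
	"log"

	"yourmodule/internal/rag" // import path is a placeholder for this project's module
)

func main() {
	// Build a provider by name through the registry; "simple" and the
	// knowledge.json path are example values, not the project's defaults.
	provider, err := rag.NewProvider("simple", "knowledge.json")
	if err != nil {
		log.Fatal(err)
	}
	defer provider.Close()

	// Query the knowledge base and print the top three hits.
	results, err := provider.Search(context.Background(), "how do I reset my password", 3)
	if err != nil {
		log.Fatal(err)
	}
	for _, r := range results {
		fmt.Printf("%.2f  %s\n", r.Score, r.Document.Content)
	}
}
```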
Configuration
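The project's actual configuration keys aren't listed on this page. One plausible shape, with every field name and tag a guess rather than the real schema, is a small struct covering the provider choice, the JSON storage path, and the LangChain text-splitting parameters:

```go
package rag

// Config is a hypothetical configuration shape for the RAG subsystem;
// the actual field names and defaults in this project may differ.
type Config struct {
	Provider     string `yaml:"provider"`      // e.g. "simple" or "openai"
	StoragePath  string `yaml:"storage_path"`  // JSON file used by SimpleProvider
	ChunkSize    int    `yaml:"chunk_size"`    // characters per chunk when splitting PDFs
	ChunkOverlap int    `yaml:"chunk_overlap"` // overlap between adjacent chunks
	MaxResults   int    `yaml:"max_results"`   // default number of search results
}
```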
Performance Characteristics
SimpleProvider Strengths
Search Algorithm: O(n log n) - built-in Go sort for optimal performance
Memory Usage: Low - JSON documents loaded into memory once
Startup Time: Fast - no vector index building required
Storage: Minimal - simple JSON file format
Dependencies: Zero - no external databases or services
Search Quality Features
Multi-Factor Relevance Scoring combines several signals into a single score (a sketch follows this list):
Phrase matching: Exact occurrences of the full query get the highest scores
Term frequency: Repeated query terms are weighted appropriately
Coverage analysis: Documents matching more of the query terms are rewarded
Partial matching: Related terms and substrings still contribute
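A compact sketch of how these four factors could be folded into one score; the weights, the log dampening, and the substring heuristic below are illustrative choices, not the values used in simple_provider.go.

```go
package rag

import (
	"math"
	"strings"
)

// scoreDocument is a sketch of multi-factor relevance scoring; all weights
// and heuristics here are illustrative assumptions.
func scoreDocument(content, query string) float64 {
	c := strings.ToLower(content)
	q := strings.ToLower(query)
	terms := strings.Fields(q)
	if len(terms) == 0 {
		return 0
	}

	var score float64

	// 1. Phrase matching: an exact occurrence of the full query scores highest.
	if strings.Contains(c, q) {
		score += 10
	}

	// 2. Term frequency: repeated terms add weight, dampened with log scaling.
	matched := 0
	for _, t := range terms {
		n := strings.Count(c, t)
		if n > 0 {
			matched++
			score += 1 + math.Log1p(float64(n))
		}
	}

	// 3. Coverage: reward documents that match a larger share of the query terms.
	score += 3 * float64(matched) / float64(len(terms))

	// 4. Partial matching: small credit for a near-miss prefix overlap
	//    when the full term is absent (e.g. "configur" vs "configuration").
	for _, t := range terms {
		if !strings.Contains(c, t) && len(t) > 4 && strings.Contains(c, t[:len(t)-2]) {
			score += 0.25
		}
	}

	return score
}
```

Because every factor only adds to the score, a document that matches the exact phrase, repeats the terms, and covers the whole query naturally outranks one with a single partial hit.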
Benefits of SimpleProvider
Current Advantages
Zero Setup - No external databases or services required
High Performance - O(n log n) search with advanced scoring algorithms
Production Ready - Comprehensive error handling and resource management
VectorProvider Compatible - Works with the unified provider interface
LangChain Integration - Uses LangChain Go for document processing
Memory Efficient - Documents loaded once, efficient search operations
Portable - Single JSON file, easy backup and migration
Fast Startup - No index building or initialization delays
When to Use SimpleProvider
Ideal for:
Small to medium knowledge bases (roughly 10,000 documents or fewer)
Development and testing environments
Single-instance deployments without clustering needs
Quick prototyping and proof-of-concept projects
Cost-sensitive scenarios where external services aren't viable
Consider alternatives when:
Knowledge base exceeds roughly 50,000 documents
Semantic similarity is more important than keyword matching
Multi-language support is required
Distributed/clustered deployment is needed
Migration Path
Current State:
SimpleProvider fully implemented and production-ready
Clean VectorProvider interface enables easy provider switching
Provider registry supports multiple implementations
Future Options:
OpenAI Vector Store: Already implemented for semantic search
Local Vector Databases: ChromaDB, FAISS, Qdrant (when needed)
Cloud Vector Stores: Pinecone, Weaviate (for scale)
Hybrid Solutions: Multiple providers with intelligent routing
Migration Process: