A comprehensive, self-hosted knowledge management system built with modern Python microservices architecture
Transform scattered information into an intelligent, searchable, and interconnected knowledge base
Features • Quick Start • Architecture • Roadmap • Contributing
Second Brain Stack is a self-hosted knowledge management platform that combines modern AI techniques with complete data ownership. Built around a microservices architecture, it targets intelligent content ingestion, semantic search, knowledge graph construction, and conversational interfaces; the feature table below tracks which of these are complete, in progress, or planned.
- Self-Hosted: Complete control over your data, with no reliance on third-party hosted services
- AI-Powered: Semantic search, entity extraction, and intelligent relationships
- Production Ready: Docker containers, monitoring, and scalable architecture
- Multi-Modal Search: Full-text, semantic, and hybrid search capabilities
- Knowledge Graph: Automatic entity and relationship discovery
- Conversational: Chat with your knowledge base using RAG
- Beautiful Interfaces: Rich CLI, modern web UI, and terminal interfaces
- Cloud Native: Kubernetes ready with comprehensive monitoring
| Feature | Status | Description |
|---|---|---|
| Document Ingestion | Complete | Multi-source ingestion (filesystem, web, git, APIs) |
| Content Processing | Complete | Automatic deduplication, metadata extraction |
| Full-Text Search | Complete | SQLite FTS5 powered search |
| Vector Search | In Progress | Semantic search with embeddings |
| Knowledge Graph | Planned | Entity extraction and relationship mapping |
| Chat Interface | Planned | RAG-powered conversational AI |
| Web Interface | Planned | Modern FastAPI + HTMX frontend |
| API Gateway | Planned | Unified API with authentication |
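
The Content Processing row mentions automatic deduplication. One common way to do this is content hashing; the sketch below assumes SHA-256 over whitespace-normalized text. The names `content_hash` and `deduplicate` are hypothetical and are not taken from the project's code.

```python
import hashlib


def content_hash(text: str) -> str:
    """Return a SHA-256 hex digest of whitespace-normalized text."""
    # Collapsing whitespace is an assumption: it treats files that differ
    # only in spacing or trailing newlines as the same document.
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def deduplicate(documents: dict[str, str]) -> dict[str, str]:
    """Keep the first document seen for each distinct content hash.

    `documents` maps a source path to its text; the result has the same shape.
    """
    seen: set[str] = set()
    unique: dict[str, str] = {}
    for path, text in documents.items():
        digest = content_hash(text)
        if digest not in seen:
            seen.add(digest)
            unique[path] = text
    return unique
```

Storing the digest alongside each document would let a later ingestion run skip unchanged files without re-reading the stored content.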
- Rich CLI: Beautiful terminal interface with progress bars and tables
- Terminal UI: Full-featured TUI with Textual framework
- Web Interface: Modern responsive web app
- REST API: Complete RESTful API for integrations
- Chat Interface: Interactive conversational experience
- Filesystem: Documents, code, notes, PDFs
- Web Scraping: Websites, blogs, articles
- Git Repositories: Code analysis with commit history
- External APIs: Third-party integrations
- Databases: SQL and NoSQL database connectors
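
As a rough illustration of the filesystem source, the sketch below walks a directory and yields files that pass an extension and size filter. The extension list and the 50 MB limit mirror the `connectors` values in the sample configuration; reading "50MB" as 50 MiB is an assumption, and `iter_ingestable_files` is a hypothetical helper, not the project's scanner.

```python
from pathlib import Path
from typing import Iterator

# Values copied from the sample configuration's `connectors` section.
SUPPORTED_SUFFIXES = {".txt", ".md", ".pdf", ".py"}
MAX_FILE_SIZE_BYTES = 50 * 1024 * 1024  # assumption: "50MB" means 50 MiB


def iter_ingestable_files(root: Path, recursive: bool = True) -> Iterator[Path]:
    """Yield files under `root` whose suffix and size pass the filters."""
    pattern = "**/*" if recursive else "*"
    # Sorting gives a deterministic order, which makes runs reproducible.
    for path in sorted(root.glob(pattern)):
        if not path.is_file():
            continue
        if path.suffix.lower() not in SUPPORTED_SUFFIXES:
            continue
        if path.stat().st_size > MAX_FILE_SIZE_BYTES:
            continue
        yield path
```

Passing `recursive=False` restricts the scan to the top-level directory, loosely matching the idea behind the CLI's `--recursive` flag.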
- Python 3.9+
- Docker & Docker Compose (optional)
- Git
# Clone the repository
git clone https://github.com/your-username/second-brain-stack.git
cd second-brain-stack
# Set up development environment
make setup-dev
source .venv/bin/activate
# Create configuration
make create-sample-config
# Initialize database
python -m interfaces.cli db init
# Ingest documents from filesystem
python -m interfaces.cli ingest add --source filesystem --path ~/documents --recursive
# Search your knowledge base
python -m interfaces.cli search query "machine learning algorithms"
# Start interactive chat
python -m interfaces.cli chat
# View database statistics
python -m interfaces.cli db stats
# Configuration management
python -m interfaces.cli config show
# Development environment
make run-dev
# Production deployment
make deploy-prod
# View logs
make logs
Second Brain Stack follows a modern microservices architecture designed for scalability and maintainability.
graph TB
    subgraph "User Interfaces"
        CLI[Rich CLI]
        WEB[Web Interface]
        TUI[Terminal UI]
    end
    subgraph "API Gateway"
        GW[FastAPI Gateway]
        AUTH[Authentication]
    end
    subgraph "Core Services"
        INGEST[Ingestion Service]
        SEARCH[Search Service]
        KNOWLEDGE[Knowledge Service]
        CHAT[Chat Service]
    end
    subgraph "Data Layer"
        DB[(SQLite Database)]
        VECTORS[(Vector Storage)]
        CACHE[(Redis Cache)]
    end
    subgraph "External Sources"
        FS[Filesystem]
        WS[Web Sources]
        GIT[Git Repos]
        API[External APIs]
    end
    CLI --> GW
    WEB --> GW
    TUI --> GW
    GW --> INGEST
    GW --> SEARCH
    GW --> KNOWLEDGE
    GW --> CHAT
    INGEST --> DB
    SEARCH --> DB
    SEARCH --> VECTORS
    KNOWLEDGE --> DB
    CHAT --> DB
    CHAT --> CACHE
    INGEST --> FS
    INGEST --> WS
    INGEST --> GIT
    INGEST --> API
| Component | Technology | Purpose |
|---|---|---|
| Backend | FastAPI + Python | High-performance async APIs |
| Database | SQLite + FTS5 | Primary storage with full-text search |
| ORM | SQLModel | Type-safe database operations |
| Vector Storage | sqlite-vec | Semantic search capabilities |
| Caching | Redis | Session and query caching |
| CLI | Click + Rich | Beautiful command-line interface |
| Web UI | FastAPI + HTMX | Modern, responsive frontend |
| Containerization | Docker + Compose | Easy deployment and scaling |
| Monitoring | Prometheus + Grafana | Comprehensive observability |
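
To make the Database row concrete, here is a minimal sketch of full-text search with SQLite FTS5 using only Python's standard `sqlite3` module. It assumes the local SQLite build includes the FTS5 extension (most recent Python distributions do). The table name `documents_fts`, its columns, and the `search` helper are illustrative and are not the project's schema.

```python
import sqlite3

# An in-memory database keeps the example self-contained.
conn = sqlite3.connect(":memory:")

# FTS5 virtual table; the table and column names are hypothetical.
conn.execute("CREATE VIRTUAL TABLE documents_fts USING fts5(title, body)")
conn.executemany(
    "INSERT INTO documents_fts (title, body) VALUES (?, ?)",
    [
        ("Gradient descent", "An optimization method used by machine learning algorithms."),
        ("Sourdough notes", "Hydration ratios and proofing times for bread."),
    ],
)


def search(query: str, limit: int = 10) -> list[str]:
    """Return titles of matching documents, best match first."""
    # `rank` is FTS5's built-in relevance column (BM25 by default).
    rows = conn.execute(
        "SELECT title FROM documents_fts "
        "WHERE documents_fts MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    ).fetchall()
    return [title for (title,) in rows]
```

Calling `search("machine learning algorithms")` matches documents containing all three terms, since FTS5 treats space-separated terms as an implicit AND.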
- Core Foundation: Database models, configuration system, logging
- Document Ingestion: Filesystem scanner with content processing
- CLI Interface: Rich terminal interface with full command set
- Database Operations: Async SQLite operations with FTS5
- Configuration Management: YAML-based configuration system
- Development Environment: Complete Docker setup with monitoring
- Content Processing: File type detection, deduplication, metadata extraction
- Vector Embeddings: Sentence transformer integration
- Semantic Search: Vector similarity search implementation
- Full-Text Search: FTS5 query optimization
- Ingestion Service: FastAPI microservice for document processing
- Knowledge Graph: Entity extraction and relationship mapping
- Chat Interface: RAG-powered conversational AI
- Web Interface: Modern FastAPI + HTMX frontend
- Additional Connectors: Web scraping, Git analysis, API integrations
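
The semantic search item above relies on vector similarity. As a sketch of the idea only, the code below ranks stored vectors by cosine similarity to a query vector in plain Python. In practice the vectors would come from an embedding model and be stored in a vector index; the toy vectors and the names `cosine_similarity` and `top_k` are assumptions made for illustration.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction; treat it as unrelated
    return dot / (norm_a * norm_b)


def top_k(
    query: list[float], index: dict[str, list[float]], k: int = 3
) -> list[tuple[str, float]]:
    """Return the k document ids most similar to `query`, best first."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

A brute-force scan like this is linear in the number of documents, which is fine for a sketch but is the part a dedicated vector store would replace.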
- Core database models and operations
- Configuration and logging systems
- CLI interface with Rich styling
- Filesystem ingestion pipeline
- Docker containerization
- Development environment setup
- Vector embeddings with sentence-transformers
- Semantic search implementation
- Knowledge graph construction
- Entity and relationship extraction
- Search result ranking and relevance
- Complete microservices implementation
- API Gateway with authentication
- Search service with hybrid capabilities
- Knowledge service for graph operations
- Service discovery and health monitoring
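
The roadmap does not say how the search service will combine full-text and semantic results. One widely used option is reciprocal rank fusion (RRF), sketched below as an assumption rather than the project's chosen method. The function name and the constant `k = 60` (a commonly used default) are illustrative.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of document ids into one ranking.

    Each document scores 1 / (k + rank) per list it appears in, so items
    ranked highly by more than one retriever rise to the top.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by id for deterministic output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
```

RRF only needs ranks, not raw scores, which avoids having to normalize FTS5 relevance values against cosine similarities.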
- Modern web interface with FastAPI + HTMX
- Full-featured Terminal UI with Textual
- Mobile-responsive design
- Real-time collaboration features
- Advanced search filters and faceting
- Conversational AI with RAG
- Multi-modal content support (images, audio)
- Advanced analytics and insights
- Plugin architecture for extensibility
- Enterprise features and SSO
- Kubernetes deployment manifests
- Advanced monitoring and alerting
- Performance optimizations
- Mobile applications
- Community marketplace for plugins
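
The "Conversational AI with RAG" item can be pictured as two steps: retrieve relevant documents, then place them in the prompt sent to a language model. The sketch below shows only that shape. Word overlap stands in for real retrieval, no model is called, and `retrieve` and `build_prompt` are hypothetical names.

```python
def retrieve(question: str, documents: dict[str, str], k: int = 2) -> list[str]:
    """Rank documents by shared words with the question (a toy retriever)."""
    terms = set(question.lower().split())
    scored = []
    for doc_id, text in documents.items():
        overlap = len(terms & set(text.lower().split()))
        if overlap:
            scored.append((overlap, doc_id))
    # Highest overlap first; ties broken by id for deterministic output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:k]]


def build_prompt(question: str, documents: dict[str, str], k: int = 2) -> str:
    """Assemble a prompt that grounds the answer in retrieved context."""
    context = "\n\n".join(
        f"[{doc_id}]\n{documents[doc_id]}" for doc_id in retrieve(question, documents, k)
    )
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
```

Tagging each context chunk with its document id makes it possible for an answer to cite its sources.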
Second Brain Stack uses a flexible YAML-based configuration system:
database:
  path: "storage/brain.db"
  wal_mode: true
  fts_enabled: true
services:
  ingestion:
    port: 8001
    workers: 4
  search:
    port: 8002
    embedding_model: "sentence-transformers/all-MiniLM-L6-v2"
  knowledge:
    port: 8003
    entity_model: "en_core_web_sm"
embeddings:
  model_path: "storage/models/"
  cache_size: 1000
  batch_size: 32
connectors:
  supported_file_types: [".txt", ".md", ".pdf", ".py"]
  max_file_size: "50MB"
- Document Processing: 1000+ documents/minute
- Search Performance: Sub-100ms query response times
- Storage Efficiency: Deduplication and compression
- Memory Usage: Configurable caching and batch processing
- Concurrent Users: Supports multiple simultaneous operations
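
The batch processing mentioned under Memory Usage can be sketched as a generator that groups any iterable into fixed-size chunks, so only one batch is held in memory at a time. The default of 32 mirrors `batch_size` in the sample configuration; the helper name `batched` is an assumption.

```python
from itertools import islice
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def batched(items: Iterable[T], batch_size: int = 32) -> Iterator[list[T]]:
    """Yield successive lists of at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        # islice pulls lazily, so the full input is never materialized.
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch
```

Feeding an embedding model one batch at a time bounds peak memory by the batch size rather than by the size of the corpus.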
- Data Ownership: Complete control over your information
- Local Processing: No data sent to external services
- Access Control: Authentication and authorization ready
- Encryption: At-rest and in-transit encryption support
- Audit Logging: Comprehensive activity tracking
We welcome contributions! Here's how to get started:
- Fork the repository
- Set up development environment: `make setup-dev`
- Create a feature branch: `git checkout -b feature/amazing-feature`
- Make your changes and add tests
- Run tests: `make test`
- Submit a pull request
# Setup development environment
make setup-dev
source .venv/bin/activate
# Run tests
make test
# Code formatting and linting
make format
make lint
# Start development services
make run-dev
# View logs
make logs
- Bug Reports: GitHub Issues
- Feature Requests: Discussions
- Documentation: Wiki
This project is licensed under the MIT License - see the LICENSE file for details.
- FastAPI: For the amazing async web framework
- SQLModel: For type-safe database operations
- Rich: For beautiful terminal interfaces
- Sentence Transformers: For powerful embeddings
- SQLite: For reliable local storage
If you find Second Brain Stack helpful, please consider:
- Starring the repository
- Forking and contributing
- Sharing with your network
- Joining our community discussions
Built with ❤️ for knowledge workers, researchers, and lifelong learners
Get Started • Documentation • Community