🧠 Ultimate self-hosted knowledge management system with AI-powered search, knowledge graphs, and conversational interfaces

Trafexofive/second-brain-stack

🧠 Second Brain Stack

Python FastAPI SQLModel License: MIT Docker

A comprehensive, self-hosted knowledge management system built with modern Python microservices architecture

Transform scattered information into an intelligent, searchable, and interconnected knowledge base

Features • Quick Start • Architecture • Roadmap • Contributing


🌟 Overview

Second Brain Stack is a self-hosted knowledge management platform that pairs AI-assisted retrieval with full data ownership. It is built on a microservices architecture and targets intelligent content ingestion, semantic search, knowledge graph construction, and conversational interfaces. The feature table under Core Capabilities shows which of these are implemented today and which are still planned.

Why Second Brain Stack?

  • 🏠 Self-Hosted: Complete control over your data - no third-party dependencies
  • 🧠 AI-Powered: Semantic search, entity extraction, and intelligent relationships
  • ⚡ Production Ready: Docker containers, monitoring, and scalable architecture
  • 🔍 Multi-Modal Search: Full-text, semantic, and hybrid search capabilities
  • 🕸️ Knowledge Graph: Automatic entity and relationship discovery
  • 💬 Conversational: Chat with your knowledge base using RAG
  • 🎨 Beautiful Interfaces: Rich CLI, modern web UI, and terminal interfaces
  • 🐳 Cloud Native: Kubernetes ready with comprehensive monitoring

✨ Features

Core Capabilities

| Feature | Status | Description |
|---|---|---|
| Document Ingestion | ✅ Complete | Multi-source ingestion (filesystem, web, git, APIs) |
| Content Processing | ✅ Complete | Automatic deduplication, metadata extraction |
| Full-Text Search | ✅ Complete | SQLite FTS5 powered search |
| Vector Search | 🔄 In Progress | Semantic search with embeddings |
| Knowledge Graph | 📋 Planned | Entity extraction and relationship mapping |
| Chat Interface | 📋 Planned | RAG-powered conversational AI |
| Web Interface | 📋 Planned | Modern FastAPI + HTMX frontend |
| API Gateway | 📋 Planned | Unified API with authentication |

Interface Options

  • 🖥️ Rich CLI: Beautiful terminal interface with progress bars and tables
  • 📱 Terminal UI: Full-featured TUI with Textual framework
  • 🌐 Web Interface: Modern responsive web app
  • 🔌 REST API: Complete RESTful API for integrations
  • 💬 Chat Interface: Interactive conversational experience

Data Sources

  • 📁 Filesystem: Documents, code, notes, PDFs
  • 🌐 Web Scraping: Websites, blogs, articles
  • 📚 Git Repositories: Code analysis with commit history
  • 🔗 External APIs: Third-party integrations
  • 📊 Databases: SQL and NoSQL database connectors

🚀 Quick Start

Prerequisites

  • Python 3.9+
  • Docker & Docker Compose (optional)
  • Git

Installation

# Clone the repository
git clone https://github.com/your-username/second-brain-stack.git
cd second-brain-stack

# Set up development environment
make setup-dev
source .venv/bin/activate

# Create configuration
make create-sample-config

# Initialize database
python -m interfaces.cli db init

Basic Usage

# Ingest documents from filesystem
python -m interfaces.cli ingest add --source filesystem --path ~/documents --recursive

# Search your knowledge base
python -m interfaces.cli search query "machine learning algorithms"

# Start interactive chat
python -m interfaces.cli chat

# View database statistics
python -m interfaces.cli db stats

# Configuration management
python -m interfaces.cli config show

Docker Deployment

# Development environment
make run-dev

# Production deployment
make deploy-prod

# View logs
make logs

🏗️ Architecture

Second Brain Stack follows a modern microservices architecture designed for scalability and maintainability.

graph TB
    subgraph "User Interfaces"
        CLI[Rich CLI]
        WEB[Web Interface]
        TUI[Terminal UI]
    end
    
    subgraph "API Gateway"
        GW[FastAPI Gateway]
        AUTH[Authentication]
    end
    
    subgraph "Core Services"
        INGEST[Ingestion Service]
        SEARCH[Search Service]
        KNOWLEDGE[Knowledge Service]
        CHAT[Chat Service]
    end
    
    subgraph "Data Layer"
        DB[(SQLite Database)]
        VECTORS[(Vector Storage)]
        CACHE[(Redis Cache)]
    end
    
    subgraph "External Sources"
        FS[Filesystem]
        WS[Web Sources]
        GIT[Git Repos]
        API[External APIs]
    end
    
    CLI --> GW
    WEB --> GW
    TUI --> GW
    
    GW --> INGEST
    GW --> SEARCH
    GW --> KNOWLEDGE
    GW --> CHAT
    
    INGEST --> DB
    SEARCH --> DB
    SEARCH --> VECTORS
    KNOWLEDGE --> DB
    CHAT --> DB
    CHAT --> CACHE
    
    INGEST --> FS
    INGEST --> WS
    INGEST --> GIT
    INGEST --> API

Technology Stack

| Component | Technology | Purpose |
|---|---|---|
| Backend | FastAPI + Python | High-performance async APIs |
| Database | SQLite + FTS5 | Primary storage with full-text search |
| ORM | SQLModel | Type-safe database operations |
| Vector Storage | sqlite-vec | Semantic search capabilities |
| Caching | Redis | Session and query caching |
| CLI | Click + Rich | Beautiful command-line interface |
| Web UI | FastAPI + HTMX | Modern, responsive frontend |
| Containerization | Docker + Compose | Easy deployment and scaling |
| Monitoring | Prometheus + Grafana | Comprehensive observability |
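
The "SQLite + FTS5" row can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The table name `docs_fts` and the helper functions are hypothetical and are not taken from this project's schema; the sketch also assumes the local SQLite build includes the FTS5 extension.

```python
import sqlite3


def build_index(docs):
    """Create an in-memory FTS5 index from (title, body) pairs.

    The table name docs_fts is an illustrative assumption.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE VIRTUAL TABLE docs_fts USING fts5(title, body)")
    conn.executemany("INSERT INTO docs_fts (title, body) VALUES (?, ?)", docs)
    conn.commit()
    return conn


def search(conn, query, limit=5):
    """Return matching titles, best match first (lower bm25 means more relevant)."""
    rows = conn.execute(
        "SELECT title FROM docs_fts WHERE docs_fts MATCH ? "
        "ORDER BY bm25(docs_fts) LIMIT ?",
        (query, limit),
    ).fetchall()
    return [title for (title,) in rows]


conn = build_index([
    ("Gradient descent", "An optimization algorithm used in machine learning."),
    ("Sourdough notes", "Hydration ratios and proofing times for bread."),
])
results = search(conn, "machine learning")
print(results)
```

An FTS5 query with two bare terms requires both terms to be present, so only the first document matches here.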

📊 Current Status

✅ Completed Features

  • Core Foundation: Database models, configuration system, logging
  • Document Ingestion: Filesystem scanner with content processing
  • CLI Interface: Rich terminal interface with full command set
  • Database Operations: Async SQLite operations with FTS5
  • Configuration Management: YAML-based configuration system
  • Development Environment: Complete Docker setup with monitoring
  • Content Processing: File type detection, deduplication, metadata extraction
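
As an illustration of the deduplication step listed above, here is a minimal sketch based on exact content hashing. The README does not say how duplicates are detected, so the use of SHA-256 and the function names are assumptions.

```python
import hashlib


def content_hash(data: bytes) -> str:
    """Return a hex digest that identifies the exact byte content."""
    return hashlib.sha256(data).hexdigest()


def deduplicate(documents):
    """Keep the first document seen for each distinct content hash.

    documents is an iterable of (path, data) pairs; this shape is an
    assumption made for the example.
    """
    seen = set()
    unique = []
    for path, data in documents:
        digest = content_hash(data)
        if digest in seen:
            continue  # identical bytes were already ingested under another path
        seen.add(digest)
        unique.append((path, data))
    return unique


docs = [("a.md", b"same text"), ("copy-of-a.md", b"same text"), ("b.md", b"other")]
kept = [path for path, _ in deduplicate(docs)]
print(kept)
```

Exact hashing only catches byte-identical copies; near-duplicate detection would need a different technique.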

🔄 In Progress

  • Vector Embeddings: Sentence transformer integration
  • Semantic Search: Vector similarity search implementation
  • Full-Text Search: FTS5 query optimization
  • Ingestion Service: FastAPI microservice for document processing
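
Vector similarity search, listed above as in progress, can be sketched in plain Python as a brute-force cosine ranking. The three-dimensional vectors below are toy stand-ins for real sentence embeddings, and the function names are hypothetical; the project itself names sqlite-vec for vector storage, which this sketch does not use.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, indexed, k=3):
    """Rank every stored vector against the query and return the k best.

    indexed maps a document id to its embedding vector.
    """
    scored = [(cosine_similarity(query_vec, vec), doc_id)
              for doc_id, vec in indexed.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(doc_id, score) for score, doc_id in scored[:k]]


index = {
    "ml": [0.9, 0.1, 0.0],
    "baking": [0.0, 0.2, 0.9],
    "stats": [0.7, 0.6, 0.1],
}
hits = top_k([1.0, 0.0, 0.0], index, k=2)
print([doc_id for doc_id, _ in hits])
```

A linear scan like this is fine for small collections; larger ones usually call for an index.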

📋 Upcoming

  • Knowledge Graph: Entity extraction and relationship mapping
  • Chat Interface: RAG-powered conversational AI
  • Web Interface: Modern FastAPI + HTMX frontend
  • Additional Connectors: Web scraping, Git analysis, API integrations
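
For the planned RAG chat interface, the step between retrieval and generation might look like the sketch below: retrieved passages are packed into a prompt under a size budget. The prompt wording, the `(source, text)` passage shape, and the function name are all assumptions; the project does not yet document this component.

```python
def build_rag_prompt(question, passages, max_chars=2000):
    """Assemble a prompt from retrieved passages, stopping at a character budget.

    passages is a list of (source, text) pairs, already ordered by relevance.
    """
    context_parts = []
    used = 0
    for number, (source, text) in enumerate(passages, start=1):
        snippet = f"[{number}] {source}: {text}"
        if used + len(snippet) > max_chars:
            break  # drop lower-ranked passages once the budget is spent
        context_parts.append(snippet)
        used += len(snippet)
    context = "\n".join(context_parts)
    return (
        "Answer the question using only the numbered context below. "
        "Cite sources by number.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )


prompt = build_rag_prompt(
    "What is gradient descent?",
    [("notes/ml.md", "Gradient descent minimizes a loss function iteratively."),
     ("notes/bread.md", "Sourdough needs a long proof.")],
)
print(prompt)
```

The resulting string would then be sent to whichever language model the chat service uses.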

🛣️ Roadmap

Phase 1: Foundation (Q4 2024) ✅

  • Core database models and operations
  • Configuration and logging systems
  • CLI interface with Rich styling
  • Filesystem ingestion pipeline
  • Docker containerization
  • Development environment setup

Phase 2: Intelligence Layer (Q1 2025) 🔄

  • Vector embeddings with sentence-transformers
  • Semantic search implementation
  • Knowledge graph construction
  • Entity and relationship extraction
  • Search result ranking and relevance

Phase 3: Services Architecture (Q2 2025) 📋

  • Complete microservices implementation
  • API Gateway with authentication
  • Search service with hybrid capabilities
  • Knowledge service for graph operations
  • Service discovery and health monitoring
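
One common way to build the hybrid search mentioned in this phase is to merge the full-text and semantic result lists with reciprocal rank fusion (RRF). The roadmap does not say which fusion method will be used, so RRF and the constant `k=60` are assumptions chosen for illustration.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document ids into one list.

    Each document scores 1 / (k + rank) for every list it appears in,
    and the scores are summed across lists.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


full_text_hits = ["a", "b", "c"]   # e.g. from an FTS5 query
semantic_hits = ["b", "d", "a"]    # e.g. from a vector similarity query
fused = reciprocal_rank_fusion([full_text_hits, semantic_hits])
print(fused)
```

RRF needs only rank positions, so it avoids having to normalize BM25 scores against cosine similarities.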

Phase 4: User Interfaces (Q3 2025) 📋

  • Modern web interface with FastAPI + HTMX
  • Full-featured Terminal UI with Textual
  • Mobile-responsive design
  • Real-time collaboration features
  • Advanced search filters and faceting

Phase 5: Advanced Features (Q4 2025) 📋

  • Conversational AI with RAG
  • Multi-modal content support (images, audio)
  • Advanced analytics and insights
  • Plugin architecture for extensibility
  • Enterprise features and SSO

Phase 6: Scale & Polish (2026) 📋

  • Kubernetes deployment manifests
  • Advanced monitoring and alerting
  • Performance optimizations
  • Mobile applications
  • Community marketplace for plugins

🔧 Configuration

Second Brain Stack uses a flexible YAML-based configuration system:

database:
  path: "storage/brain.db"
  wal_mode: true
  fts_enabled: true

services:
  ingestion:
    port: 8001
    workers: 4
  search:
    port: 8002
    embedding_model: "sentence-transformers/all-MiniLM-L6-v2"
  knowledge:
    port: 8003
    entity_model: "en_core_web_sm"

embeddings:
  model_path: "storage/models/"
  cache_size: 1000
  batch_size: 32

connectors:
  supported_file_types: [".txt", ".md", ".pdf", ".py"]
  max_file_size: "50MB"
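
As a sketch of how the `connectors` settings above might be applied during ingestion, the helper below turns a size string such as "50MB" into bytes and checks a file against both limits. The function names and the use of 1024-based units are assumptions; the project's actual parsing rules are not documented here.

```python
import re
from pathlib import Path

# Assumed binary multiples; the README does not say whether MB means 10**6 or 1024**2.
_UNITS = {"B": 1, "KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}


def parse_size(text: str) -> int:
    """Convert a string such as '50MB' into a number of bytes."""
    match = re.fullmatch(r"\s*(\d+)\s*(B|KB|MB|GB)\s*", text, flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    number, unit = match.groups()
    return int(number) * _UNITS[unit.upper()]


def should_ingest(path, size_bytes, supported_types, max_size):
    """Accept a file only if its extension is listed and it is within the size limit."""
    suffix = Path(path).suffix.lower()
    return suffix in supported_types and size_bytes <= parse_size(max_size)


supported = [".txt", ".md", ".pdf", ".py"]
print(parse_size("50MB"))
print(should_ingest("notes/todo.md", 1024, supported, "50MB"))
print(should_ingest("photos/cat.png", 1024, supported, "50MB"))
```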

📈 Performance & Scale

  • Document Processing: 1000+ documents/minute
  • Search Performance: Sub-100ms query response times
  • Storage Efficiency: Deduplication and compression
  • Memory Usage: Configurable caching and batch processing
  • Concurrent Users: Supports multiple simultaneous operations

🛡️ Security & Privacy

  • Data Ownership: Complete control over your information
  • Local Processing: No data sent to external services
  • Access Control: Authentication and authorization ready
  • Encryption: At-rest and in-transit encryption support
  • Audit Logging: Comprehensive activity tracking

🤝 Contributing

We welcome contributions! Here's how to get started:

  1. Fork the repository
  2. Set up development environment: make setup-dev
  3. Create a feature branch: git checkout -b feature/amazing-feature
  4. Make your changes and add tests
  5. Run tests: make test
  6. Submit a pull request

Development Workflow

# Setup development environment
make setup-dev
source .venv/bin/activate

# Run tests
make test

# Code formatting and linting
make format
make lint

# Start development services
make run-dev

# View logs
make logs

📚 Documentation

🐛 Issues & Support

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

  • FastAPI: For the amazing async web framework
  • SQLModel: For type-safe database operations
  • Rich: For beautiful terminal interfaces
  • Sentence Transformers: For powerful embeddings
  • SQLite: For reliable local storage

⭐ Show Your Support

If you find Second Brain Stack helpful, please consider:

  • ⭐ Starring the repository
  • 🍴 Forking and contributing
  • 📢 Sharing with your network
  • 💬 Joining our community discussions

Built with ❤️ for knowledge workers, researchers, and lifelong learners

Get Started • Documentation • Community
