RepoMentor 🤖

Your AI GitHub Assistant – Ask questions about GitHub repositories and get intelligent answers powered by advanced AI.

📋 Overview

RepoMentor is an AI-powered assistant that helps developers understand and interact with GitHub repositories. It combines a language model with hybrid search to answer questions about repository content, explain code, and assist with development workflows.

Key Features

🤖 AI-Powered Q&A: Get intelligent answers about repository content using Google Gemini AI.
🔍 Hybrid Search: Combines text-based and semantic search for accurate results.
💬 Interactive Chat: User-friendly Streamlit web interface.
📊 Repository Analysis: Indexes and analyzes GitHub repository data.
🔧 CLI & Web Interfaces: Choose between command-line or graphical interface.
📝 Comprehensive Logging: Track interactions and performance metrics.
🛡️ Error Handling: Robust error handling with user-friendly messages.
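
The hybrid search feature can be illustrated with a minimal sketch. This is not RepoMentor's actual implementation: the toy documents, the term-count vectors standing in for real embeddings, and the `hybrid_search` function with its `alpha` weight are all assumptions made for illustration. Each document is scored by a weighted mix of keyword overlap and cosine similarity.

```python
import math
from collections import Counter

# Hypothetical toy corpus; a real index would hold repository files.
DOCS = [
    {"id": "readme", "text": "install dependencies and run the streamlit app"},
    {"id": "kafka", "text": "set up a kafka producer in python"},
    {"id": "tests", "text": "run the test suite with pytest"},
]


def tokenize(text):
    return text.lower().split()


def keyword_score(query, doc_text):
    """Fraction of query tokens that appear in the document."""
    q = set(tokenize(query))
    d = set(tokenize(doc_text))
    return len(q & d) / len(q) if q else 0.0


def cosine_score(query, doc_text):
    """Cosine similarity of term-count vectors (a stand-in for embeddings)."""
    q, d = Counter(tokenize(query)), Counter(tokenize(doc_text))
    dot = sum(q[t] * d[t] for t in q)
    q_norm = math.sqrt(sum(v * v for v in q.values()))
    d_norm = math.sqrt(sum(v * v for v in d.values()))
    return dot / (q_norm * d_norm) if q_norm and d_norm else 0.0


def hybrid_search(query, docs, alpha=0.5, top_k=2):
    """Rank documents by alpha * keyword score + (1 - alpha) * cosine score."""
    scored = []
    for doc in docs:
        score = (alpha * keyword_score(query, doc["text"])
                 + (1 - alpha) * cosine_score(query, doc["text"]))
        scored.append((score, doc["id"]))
    scored.sort(reverse=True)
    # Drop documents that share nothing with the query.
    return [doc_id for score, doc_id in scored[:top_k] if score > 0]


results = hybrid_search("kafka producer python", DOCS)
print(results)
```

Raising `alpha` favors exact keyword matches, while lowering it favors the vector similarity term.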

Target Audience

Developers looking to understand large codebases quickly.
Open Source Contributors exploring new projects.
Students learning from real-world code examples.
Teams needing quick answers about their repositories.

🚀 Installation

Prerequisites

Python 3.13 or higher
API key for the language model (stored in the OPENAI_API_KEY environment variable)

Step-by-Step Setup

Clone the repository

```
git clone https://github.com/itsmuriuki/aihero.git
cd aihero/projects
```

Install dependencies

```
# Using poetry (recommended)
poetry install

# Or using pip
pip install -r requirements.txt
```

Set up environment variables

Create a .env file in the project root:
```
OPENAI_API_KEY=your_api_key_here
```
Run the application

Web Interface (Recommended):
```
streamlit run app.py
```
Open http://localhost:8501 in your browser.

Command Line Interface:
```
python main.py
```
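
Before launching either interface, it can help to confirm that the key from the .env step is actually readable. The sketch below is a stdlib-only illustration: the README does not show how the project loads its .env file, so `load_env_file` and `require_key` are hypothetical helpers, and the demo writes a temporary file rather than touching a real one.

```python
import os
import tempfile
from pathlib import Path


def load_env_file(path):
    """Parse KEY=VALUE lines from a .env file, skipping blanks and comments."""
    values = {}
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def require_key(values, name="OPENAI_API_KEY"):
    """Return the key from the parsed file or the environment, or raise."""
    key = values.get(name) or os.environ.get(name)
    if not key:
        raise RuntimeError(f"{name} is not set; add it to your .env file")
    return key


# Demo with a temporary file so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    env_path = Path(tmp) / ".env"
    env_path.write_text("# local secrets\nOPENAI_API_KEY=example-key\n")
    parsed = load_env_file(env_path)

print(require_key(parsed))
```

Failing early with a clear message is usually easier to debug than an authentication error raised on the first request.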

🎮 Usage

Web Interface

The Streamlit app provides an intuitive chat interface where you can:

Ask questions about GitHub repositories.
Get AI-powered explanations and code insights.
View conversation history.
See clear, user-friendly messages when an error occurs.

Example Questions:

"How do I set up a Kafka producer in Python?"
"What are the best practices for error handling?"
"Explain the repository structure and main components."

Command Line Interface

For terminal or headless usage:

```
python main.py
```

Type your questions interactively. Type stop to exit.
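
The interactive loop described above can be sketched as follows. This is an assumption about the general shape of such a loop, not the contents of main.py: `run_cli`, `answer_fn`, and the injected `read_line` and `write` callables are hypothetical names, chosen so the loop can be exercised without a real agent or keyboard input.

```python
def run_cli(answer_fn, read_line=input, write=print):
    """Read questions until the user types 'stop'; return how many were answered."""
    answered = 0
    while True:
        question = read_line("You: ").strip()
        if question.lower() == "stop":
            write("Goodbye!")
            break
        if not question:
            continue  # ignore empty lines instead of querying the agent
        write(answer_fn(question))
        answered += 1
    return answered


# Scripted demo: a stub answer function and canned input replace the real agent.
scripted = iter(["What does ingest.py do?", "", "stop"])
outputs = []
count = run_cli(
    answer_fn=lambda q: f"(stub answer to: {q})",
    read_line=lambda prompt: next(scripted),
    write=outputs.append,
)
print(count, outputs)
```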

Programmatic Usage

```
import asyncio
from ingest import index_data
from search_agent import init_agent

# Initialize the system
index = index_data("DataTalksClub", "faq")  # Replace with your repo
agent = init_agent(index, "DataTalksClub", "faq")

async def ask_question(question):
    response = await agent.run(user_prompt=question)
    return response.output

answer = asyncio.run(ask_question("How do I run Kafka with Python?"))
print(answer)
```
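
If you need to ask several questions in one run, the same async pattern extends naturally to `asyncio.gather`. The sketch below uses a hypothetical `StubAgent` that mimics only the `run(user_prompt=...)` call and `.output` attribute shown above, so it runs without an API key; whether the real agent tolerates concurrent calls is an assumption you should verify.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class StubResult:
    output: str


class StubAgent:
    """Hypothetical stand-in with the same run(user_prompt=...) shape as above."""

    async def run(self, user_prompt):
        await asyncio.sleep(0)  # yield control, as a real network call would
        return StubResult(output=f"stub answer: {user_prompt}")


async def ask_many(agent, questions):
    """Run all questions concurrently and return answers in input order."""
    results = await asyncio.gather(
        *(agent.run(user_prompt=q) for q in questions)
    )
    return [r.output for r in results]


answers = asyncio.run(
    ask_many(StubAgent(), ["What is indexed?", "How do I run tests?"])
)
print(answers)
```

`asyncio.gather` returns results in the order the awaitables were passed, so answers line up with their questions.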

📁 Project Structure

```
aihero/projects
├── app.py               # Streamlit web application
├── main.py              # Command-line interface
├── ingest.py            # Data ingestion and indexing
├── search_agent.py      # Pydantic AI agent configuration
├── search_tools.py      # Search functionality and tools
├── logs.py              # Logging utilities
├── pyproject.toml       # Project configuration and dependencies
├── requirements.txt     # Alternative dependency management
├── .env                 # Environment variables (create this)
├── src/                 # Core modules
│   ├── __init__.py
│   ├── agent_logic.py
│   ├── data_processing.py
│   ├── evaluation.py
│   ├── logging_utils.py
│   └── search_engine.py
├── tests/               # Test suite
│   ├── __init__.py
│   ├── run_tests.py
│   └── test_*.py
├── logs/                # Generated log files
└── __pycache__/         # Python cache (generated)
```

🛠️ Technologies Used

Core Dependencies

Pydantic AI – AI agent framework with tool calling.
Google Gemini AI – Large language model for responses.
Streamlit – Web application framework.
minsearch – Lightweight text search engine.

Development Tools

Python 3.13+ – Core programming language.
uv – Fast Python package manager.
pytest – Testing framework.
Jupyter – Interactive development notebooks.

Infrastructure

GitHub API – Repository data access.
Environment Variables – Configuration management.
JSON Logging – Structured interaction logging.
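
As an illustration of structured interaction logging, here is a minimal sketch that writes one JSON file per interaction. The field names, the file naming scheme, and the `log_interaction` helper are assumptions for this example and may differ from what logs.py actually records.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def log_interaction(log_dir, question, answer, latency_s):
    """Write one interaction as a JSON file and return its path."""
    log_dir = Path(log_dir)
    log_dir.mkdir(parents=True, exist_ok=True)
    now = datetime.now(timezone.utc)
    entry = {
        "timestamp": now.isoformat(),
        "question": question,
        "answer": answer,
        "latency_s": round(latency_s, 3),
    }
    # Microsecond precision keeps file names unique for back-to-back calls.
    path = log_dir / f"interaction_{now.strftime('%Y%m%d_%H%M%S_%f')}.json"
    path.write_text(json.dumps(entry, indent=2), encoding="utf-8")
    return path


# Demo in a temporary directory so nothing is left on disk.
with tempfile.TemporaryDirectory() as tmp:
    saved = log_interaction(
        Path(tmp) / "logs", "What does app.py do?", "It runs the Streamlit UI.", 0.4217
    )
    record = json.loads(saved.read_text(encoding="utf-8"))

print(record["question"], record["latency_s"])
```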

🧪 Evaluations

We evaluate the agent using the following criteria:

instructions_follow: The agent followed the user's instructions
instructions_avoid: The agent avoided doing things it was told not to do
answer_relevant: The response directly addresses the user's question
answer_clear: The answer is clear and correct
answer_citations: The response includes proper citations or sources when required
completeness: The response is complete and covers all key aspects of the request
tool_call_search: The agent invoked the search tool

We do this in two steps:

First, we generate synthetic questions (see eval/data-gen.ipynb)
Next, we run our agent on the generated questions and check the criteria (see eval/evaluations.ipynb)

Current evaluation metrics

Criterion              Score (%)
instructions_follow    79.3
instructions_avoid     96.3
answer_relevant        100.0
answer_clear           100.0
answer_citations       74.1
completeness           100.0
tool_call_search       88.9

The most important metric for this project is answer_relevant, which measures whether the system's answer addresses the user's question. It is currently 100%, meaning every answer in the evaluation set was judged relevant.
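
Percentages like these can be derived from per-question pass/fail checks. The sketch below shows one way to do that aggregation; the `RESULTS` rows are made-up illustration data, not the project's evaluation output, and `pass_rates` is a hypothetical helper rather than code from the evaluation notebooks.

```python
from collections import defaultdict

# Hypothetical per-question check results, one dict per evaluated question.
RESULTS = [
    {"answer_relevant": True, "answer_citations": True, "tool_call_search": True},
    {"answer_relevant": True, "answer_citations": False, "tool_call_search": True},
    {"answer_relevant": True, "answer_citations": True, "tool_call_search": False},
    {"answer_relevant": True, "answer_citations": True, "tool_call_search": True},
]


def pass_rates(results):
    """Return the percentage of questions passing each criterion (one decimal)."""
    passed = defaultdict(int)
    total = defaultdict(int)
    for row in results:
        for criterion, ok in row.items():
            total[criterion] += 1
            passed[criterion] += bool(ok)
    return {c: round(100 * passed[c] / total[c], 1) for c in total}


rates = pass_rates(RESULTS)
for name, pct in rates.items():
    print(f"{name:<20}{pct}")
```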

🤝 Contributing

We welcome contributions!

Development Setup

Fork the repository.
Create a feature branch:
```
git checkout -b feature/your-feature
```
Install development dependencies:
```
poetry install
```
Run tests:
```
python tests/run_tests.py
```

Guidelines

Code Style: Follow PEP 8 standards.
Testing: Add tests for new features.
Documentation: Update README for significant changes.
Commits: Use clear, descriptive commit messages.

Pull Request Process

Ensure all tests pass.
Update documentation if needed.
Create a pull request with a clear description.
Wait for review and address feedback.

📄 License

This project is licensed under the MIT License – see the LICENSE file for details.

The MIT License allows for:

✅ Commercial use
✅ Modification
✅ Distribution
✅ Private use
⚠️ No liability or warranty

🙏 Acknowledgments

Core Technologies

Google Gemini AI – Powerful language model.
Pydantic AI – Agent framework.
Streamlit – Web interface framework.
minsearch – Efficient text search functionality.

Inspiration & Resources

DataTalksClub – FAQ repository used in demos.
Open Source Community – Amazing tools and libraries.
AI Research Community – Advancing AI assistants.

Special Thanks

Built with ❤️ for developers who want to understand codebases faster.
Special acknowledgment to the AI agent development community.

Support

GitHub Issues: Report bugs or request features
GitHub Discussions: Ask questions and join discussions
Documentation: Check this README and inline code documentation.

Author

Gerald Muriuki

GitHub: @itsmuriuki
Repository: RepoMentor

Transform how you understand and interact with GitHub repositories 🚀
