Your AI GitHub Assistant: ask questions about GitHub repositories and get intelligent answers powered by advanced AI.
RepoMentor is an AI-powered assistant that helps developers understand and interact with GitHub repositories. Using natural language processing and hybrid search, RepoMentor can answer questions about repository content, explain code, and assist with development workflows.
- AI-Powered Q&A: Get intelligent answers about repository content using Google Gemini AI.
- Hybrid Search: Combines text-based and semantic search for accurate results.
- Interactive Chat: User-friendly Streamlit web interface.
- Repository Analysis: Indexes and analyzes GitHub repository data.
- CLI & Web Interfaces: Choose between command-line or graphical interface.
- Comprehensive Logging: Track interactions and performance metrics.
- Error Handling: Robust error handling with user-friendly messages.
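To make the hybrid search idea concrete, here is a minimal, self-contained sketch that blends a keyword-overlap score with a vector-similarity score. It is illustrative only: the function names, the character-trigram "embedding" (a stand-in for a real embedding model), and the `alpha` weight are all assumptions, not RepoMentor's actual implementation.

```python
import math
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase whitespace tokenizer."""
    return text.lower().split()


def keyword_score(query: str, doc: str) -> float:
    """Fraction of query tokens that also appear in the document."""
    q_tokens = set(tokenize(query))
    d_tokens = set(tokenize(doc))
    return len(q_tokens & d_tokens) / len(q_tokens) if q_tokens else 0.0


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: character-trigram counts."""
    s = text.lower()
    return Counter(s[i:i + 3] for i in range(len(s) - 2))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def hybrid_search(query: str, docs: list[str], alpha: float = 0.5, top_k: int = 3) -> list[str]:
    """Rank docs by alpha * keyword score + (1 - alpha) * vector similarity."""
    q_vec = embed(query)
    scored = [
        (alpha * keyword_score(query, d) + (1 - alpha) * cosine(q_vec, embed(d)), d)
        for d in docs
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in scored[:top_k]]


# Made-up documents for illustration.
docs = [
    "How to set up a Kafka producer in Python",
    "Installing Docker on Windows",
    "Submitting homework for the course",
]
best = hybrid_search("kafka producer python", docs, top_k=1)
```

Raising `alpha` favors exact keyword matches, while lowering it favors the similarity score; a real system would replace `embed` with an actual embedding model.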
- Developers looking to understand large codebases quickly.
- Open Source Contributors exploring new projects.
- Students learning from real-world code examples.
- Teams needing quick answers about their repositories.
- Python 3.13 or higher
- OPENAI API key
- Clone the repository:
    git clone https://github.com/itsmuriuki/aihero.git
    cd aihero/projects
- Install dependencies:
    # Using poetry (recommended)
    poetry install
    # Or using pip
    pip install -r requirements.txt
- Set up environment variables. Create a .env file in the project root:
    OPENAI_API_KEY=your_gemini_api_key_here
- Run the application.
  Web Interface (Recommended):
    streamlit run app.py
  Open http://localhost:8501 in your browser.
  Command Line Interface:
    python main.py
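Before starting the app, it can help to fail early if the API key is missing. The sketch below uses only the standard library; `load_env_file` and `require_api_key` are hypothetical helpers written for this example, and the project itself may load `.env` differently.

```python
import os
from pathlib import Path


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def require_api_key(name: str = "OPENAI_API_KEY", env_path: str = ".env") -> str:
    """Return the key from the process environment or the .env file, else raise."""
    value = os.environ.get(name) or load_env_file(env_path).get(name)
    if not value:
        raise RuntimeError(f"{name} is not set; add it to {env_path} or export it.")
    return value
```

Calling `require_api_key()` at startup turns a missing key into one clear error message instead of a failure deep inside the first request.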
The Streamlit app provides an intuitive chat interface where you can:
- Ask questions about GitHub repositories.
- Get AI-powered explanations and code insights.
- View conversation history.
- Access comprehensive error handling.
Example Questions:
- "How do I set up a Kafka producer in Python?"
- "What are the best practices for error handling?"
- "Explain the repository structure and main components."
For programmatic or headless usage:
    python main.py
Type your questions interactively. Type stop to exit.
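The interactive loop described above might look roughly like the following sketch. `StubAgent` is a hypothetical stand-in that only mimics the `run(user_prompt=...)` call and `.output` attribute; the real `main.py` may be structured differently.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class StubResult:
    output: str


class StubAgent:
    """Hypothetical stand-in for the real agent; it just echoes the question."""

    async def run(self, user_prompt: str) -> StubResult:
        return StubResult(output=f"(stub answer to: {user_prompt})")


async def chat_loop(agent, read=input, write=print) -> int:
    """Answer questions until the user types 'stop'; return how many were answered."""
    answered = 0
    while True:
        question = read("Your question: ").strip()
        if question.lower() == "stop":
            break
        if not question:
            continue  # ignore empty input
        result = await agent.run(user_prompt=question)
        write(result.output)
        answered += 1
    return answered
```

To try it from a terminal, call `asyncio.run(chat_loop(StubAgent()))`. Passing `read` and `write` as parameters keeps the loop testable without a real terminal.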
To call the agent from your own Python code:

    import asyncio

    from ingest import index_data
    from search_agent import init_agent

    # Initialize the system
    index = index_data("DataTalksClub", "faq")  # Replace with your repo
    agent = init_agent(index, "DataTalksClub", "faq")


    async def ask_question(question):
        # Run the agent on a single question and return its text output
        response = await agent.run(user_prompt=question)
        return response.output


    answer = asyncio.run(ask_question("How do I run Kafka with Python?"))
    print(answer)

Project structure:

    aihero/projects
    ├── app.py              # Streamlit web application
    ├── main.py             # Command-line interface
    ├── ingest.py           # Data ingestion and indexing
    ├── search_agent.py     # Pydantic AI agent configuration
    ├── search_tools.py     # Search functionality and tools
    ├── logs.py             # Logging utilities
    ├── pyproject.toml      # Project configuration and dependencies
    ├── requirements.txt    # Alternative dependency management
    ├── .env                # Environment variables (create this)
    ├── src/                # Core modules
    │   ├── __init__.py
    │   ├── agent_logic.py
    │   ├── data_processing.py
    │   ├── evaluation.py
    │   ├── logging_utils.py
    │   └── search_engine.py
    ├── tests/              # Test suite
    │   ├── __init__.py
    │   ├── run_tests.py
    │   └── test_*.py
    ├── logs/               # Generated log files
    └── __pycache__/        # Python cache (generated)
- Pydantic AI: AI agent framework with tool calling.
- Google Gemini AI: Large language model for responses.
- Streamlit: Web application framework.
- minsearch: Lightweight text search engine.
- Python 3.13+: Core programming language.
- uv: Fast Python package manager.
- pytest: Testing framework.
- Jupyter: Interactive development notebooks.
- GitHub API: Repository data access.
- Environment Variables: Configuration management.
- JSON Logging: Structured interaction logging.
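As an illustration of structured interaction logging, the sketch below appends one JSON object per interaction to a file in a `logs/` directory. The `log_interaction` function, the `interactions.jsonl` filename, and the record fields are assumptions made for this example, not the format produced by the project's own `logs.py`.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def log_interaction(question: str, answer: str, log_dir: str = "logs", **extra) -> Path:
    """Append one interaction as a JSON line and return the log file path."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "question": question,
        "answer": answer,
        **extra,  # e.g. latency or model name, supplied by the caller
    }
    directory = Path(log_dir)
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / "interactions.jsonl"
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")
    return path
```

One JSON object per line keeps the log append-only and easy to load later, for example when building evaluation sets from past interactions.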
We evaluate the agent using the following criteria:
- instructions_follow: The agent followed the user's instructions
- instructions_avoid: The agent avoided doing things it was told not to do
- answer_relevant: The response directly addresses the user's question
- answer_clear: The answer is clear and correct
- answer_citations: The response includes proper citations or sources when required
- completeness: The response is complete and covers all key aspects of the request
- tool_call_search: The search tool is invoked
We do this in two steps:
- First, we generate synthetic questions (see eval/data-gen.ipynb).
- Next, we run our agent on the generated questions and check the criteria (see eval/evaluations.ipynb).
Criterion              Pass rate (%)
instructions_follow    79.3
instructions_avoid     96.3
answer_relevant        100.0
answer_clear           100.0
answer_citations       74.1
completeness           100.0
tool_call_search       88.9
The most important metric for this project is answer_relevant, which measures whether the system's answer is relevant to the user's question. It is currently 100%, meaning every answer in the evaluation set was judged relevant.
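Per-criterion percentages like these can be computed from a list of boolean check results, one dictionary per evaluated question. The sketch below shows one way to do that; the `pass_rates` function and the sample data are made up for illustration and are not taken from the evaluation notebooks.

```python
from collections import defaultdict


def pass_rates(results: list[dict[str, bool]]) -> dict[str, float]:
    """Percentage of evaluated answers that passed each criterion."""
    passed: dict[str, int] = defaultdict(int)
    total: dict[str, int] = defaultdict(int)
    for checks in results:
        for criterion, ok in checks.items():
            total[criterion] += 1
            passed[criterion] += int(ok)
    return {c: round(100 * passed[c] / total[c], 1) for c in total}


# Made-up check results for three questions (not the project's data).
sample = [
    {"answer_relevant": True, "answer_citations": True},
    {"answer_relevant": True, "answer_citations": False},
    {"answer_relevant": True, "answer_citations": True},
]
rates = pass_rates(sample)
```

Because each criterion keeps its own total, a question that is missing a check for some criterion does not distort the rates for the others.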
We welcome contributions!
- Fork the repository.
- Create a feature branch:
    git checkout -b feature/your-feature
- Install development dependencies:
    poetry install
- Run tests:
    python tests/run_tests.py
- Code Style: Follow PEP 8 standards.
- Testing: Add tests for new features.
- Documentation: Update README for significant changes.
- Commits: Use clear, descriptive commit messages.
- Ensure all tests pass.
- Update documentation if needed.
- Create a pull request with a clear description.
- Wait for review and address feedback.
This project is licensed under the MIT License; see the LICENSE file for details.
The MIT License allows for:
- Commercial use
- Modification
- Distribution
- Private use
Note: the license provides no liability or warranty.
- Google Gemini AI: Powerful language model.
- Pydantic AI: Agent framework.
- Streamlit: Web interface framework.
- minsearch: Efficient text search functionality.
- DataTalksClub: FAQ repository used in demos.
- Open Source Community: Amazing tools and libraries.
- AI Research Community: Advancing AI assistants.
- Built with ❤️ for developers who want to understand codebases faster.
- Special acknowledgment to the AI agent development community.
- GitHub Issues: Report bugs or request features
- GitHub Discussions: Ask questions and join discussions
- Documentation: Check this README and inline code documentation.
Gerald Muriuki
- GitHub: @itsmuriuki
- Repository: RepoMentor
Transform how you understand and interact with GitHub repositories.