Stars
AgentKernelArena provides an end-to-end siloed-benchmarking environment where different LLM-powered agents—such as Cursor Agent, Claude Code, Codex, SWE-agent, and GEAK—can be evaluated side-by-sid…
A lightweight, general-purpose framework for evaluating GPU kernel correctness and performance.
Banishing LLM Hallucinations Requires Rethinking Generalization
Personal project on evaluating deep generative models (inspired by deep image prior)
Distributed Asynchronous Hyperparameter Optimization in Python
Conda-autoenv: Environments by Directory for Conda. My PyPI package!
PHP code that analyzes Latin and Greek words' parts of speech, tenses, genders, moods, etc.
Look up Latin/Greek vocabulary from the command line using Python/Beautiful Soup.