[Agentic & GenAI Evaluation KDD2025] Benchmark Dataset Generation and Evaluation for Excel Formula Repair with LLMs
This repository contains the benchmark files for the paper "Benchmark Dataset Generation and Evaluation for Excel Formula Repair with LLMs" by Ananya Singha, Harshita Sahijwani, Emmanuel Aboah Boateng, Nicholas Hausman, Vu Le, Tianwei Chen, Sulaiman Vesal, and Sadid A. Hasan.
We have released two datasets: the one we used as the seed and the one generated by our pipeline.
dataset\seed_data.json contains manually annotated data scraped from open-source forums.
dataset\FoRepBenchmarks.json is generated by our pipeline and contains 618 benchmark points.
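To inspect the two files before the codebase is released, a minimal Python sketch such as the following may help. It only parses each file and reports the top-level container and its size. The schema is not documented here, so the sketch assumes nothing about field names beyond the top level being a JSON list or object. The helper names `load_json` and `summarize` are hypothetical and not part of this repository.

```python
import json
from pathlib import Path


def load_json(path):
    """Parse a JSON file and return its top-level object."""
    with Path(path).open(encoding="utf-8") as f:
        return json.load(f)


def summarize(data):
    """Return (container type name, number of top-level entries).

    Assumes the top level is a list or a dict; the real schema
    is not documented in this README.
    """
    return type(data).__name__, len(data)


if __name__ == "__main__":
    # File names are taken from this README; run from the repository root.
    for name in ("seed_data.json", "FoRepBenchmarks.json"):
        path = Path("dataset") / name
        if not path.exists():
            print(f"{path}: not found")
            continue
        kind, count = summarize(load_json(path))
        print(f"{path}: top-level {kind} with {count} entries")
```

If the benchmark file is stored as a flat list of benchmark points, the reported count for FoRepBenchmarks.json should match the 618 stated above; a different number would suggest the points are nested under another key.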
We will soon release the full codebase along with demo scripts. Stay tuned!