
AdaPatcher

Paper Title: Adaptive Program Repair with Bug Localization and Preference Learning


Overview of AdaPatcher: (a) illustration of the Self-Debug Learning process; (b) illustration of the Hybrid Training for Selective Reference process; (c) illustration of the Adaptive Preference Learning process.

Installation

  1. Clone this repository
  2. Create the conda environment:
conda create -n LLMenv python=3.10
conda activate LLMenv
conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia
pip install sympy==1.13.1
pip install --upgrade transformers
conda install psutil
pip install peft
pip install sentencepiece
pip install deepspeed

git clone https://github.com/huggingface/alignment-handbook.git
cd ./alignment-handbook/
python -m pip install .

Dataset download

You can download the dataset from https://huggingface.co/datasets/ZhenlongDai/ACPR.

1. Test cases

Download them and place them in the directory ./merged_test_cases.

2. Programming problem file

The problem file is located at "repairDataset/Program_Question_Data/English_Program_Question_StringVersion.json".

3. Train/dev/test files

The split files are in the directory "repairDataset/RepairData-PythonLevel/CRFLPDataset/".

Code execution environment construction

See "codeTool/use.md".

LLM weight download

Please download the pre-trained CodeLlama-7b-Instruct-hf weights from Hugging Face and place them in the directory "./CodeLlama-7b-Instruct-hf".

Stage I Training

Step 1. First, train a base Program Modifier (Location-Aware Repair Learning), which is later used to evaluate the bug locator.

DATA_FILE="CRFLPDataset"

bash script/pipeline/process_fixbycrflp.sh

Step 2. Train the bug locator

Specify the parameter values based on the results of Step 1 before running the script. DATA_FILE="CRFLPDataset"

bash script/pipeline/process_trace_CRFLP.sh

Stage II Training

Step 1. First, train a Program Modifier with Hybrid Training for Selective Reference.

Specify the parameter values before running the script. DATA_FILE="mixCRFLPDataset"

bash script/pipeline/process_SecondFix.sh

Step 2. Generate preference data with the Program Modifier

Step 2.1. Merge the PEFT weights with the base LLM weights
bash script/merge_sft.sh
Step 2.2. Adaptive Preference Learning

Specify the parameter values before running the script.

bash script/Prefer/generate_prefer_data.sh

Then construct the generation results into preference training data; "./utils/ConstructPerferDataset.py" can serve as a reference. We kept only the data pairs whose highest and lowest consistency scores differ by more than 0.1. Alternatively, you can directly use the prepared preference data in "dataNew/FixPerferdataset".
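The screening rule above can be sketched as follows. This is an illustrative sketch, not the repository's implementation (see "./utils/ConstructPerferDataset.py" for that); the field names `code` and `consistency` are assumptions.

```python
# Hypothetical sketch of the screening rule: for each problem, keep a
# (chosen, rejected) pair only when the best and worst candidates'
# consistency scores differ by more than 0.1. Field names are assumed.

def build_preference_pairs(candidates_by_problem, min_gap=0.1):
    """candidates_by_problem: {problem_id: [{"code": str, "consistency": float}, ...]}"""
    pairs = []
    for pid, cands in candidates_by_problem.items():
        if len(cands) < 2:
            continue
        ranked = sorted(cands, key=lambda c: c["consistency"])
        worst, best = ranked[0], ranked[-1]
        if best["consistency"] - worst["consistency"] > min_gap:
            pairs.append({"problem_id": pid,
                          "chosen": best["code"],
                          "rejected": worst["code"]})
    return pairs
```

Each retained pair supplies the chosen/rejected responses that preference-learning objectives such as DPO expect.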

Step 3. Train the Program Modifier with Adaptive Preference Learning

PerferDATA_FILE="dataNew/FixPerferdataset"

bash script/pipeline/Prefer/process_Prefer_FixModel.sh

Evaluate generated results

1. Test the generated code using test cases

Specify the parameter values before running the script.

bash script/Execution/Execution_Eval.sh
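Conceptually, the evaluation script runs each generated program against the downloaded test cases. The following is only an illustrative sketch of such a harness, not the repository's actual implementation; the test-case format (stdin/expected-stdout pairs) is an assumption.

```python
import subprocess
import sys

# Illustrative sketch (NOT the repository's actual harness): run a candidate
# Python program on each test case's stdin and compare its stdout with the
# expected output.

def run_test_cases(program_path, test_cases, timeout=5):
    """test_cases: list of (input_text, expected_output) string pairs."""
    passed = 0
    for stdin_text, expected in test_cases:
        try:
            result = subprocess.run(
                [sys.executable, program_path],
                input=stdin_text,
                capture_output=True,
                text=True,
                timeout=timeout,
            )
        except subprocess.TimeoutExpired:
            continue  # a timeout counts as a failed test case
        if result.returncode == 0 and result.stdout.strip() == expected.strip():
            passed += 1
    return passed, len(test_cases)
```

Running each candidate in a subprocess with a timeout keeps crashing or non-terminating generations from taking down the evaluation loop.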

2. Evaluate the generated results (ACC/Improve/Consistency)

Specify the parameter values before running the script.

bash script/Postprocessing.sh
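For intuition, the three metrics can be sketched as below. These are ASSUMED definitions for illustration only (ACC = fraction of repaired programs passing all test cases; Improve = fraction passing more test cases than before repair; Consistency approximated here as textual similarity between buggy and repaired code via `difflib`); the official formulas are those computed by the repository's scripts and paper.

```python
import difflib

# Illustrative metric sketches under ASSUMED definitions; the repository's
# postprocessing scripts define the official metrics.

def acc(results):
    """results: list of (passed, total) pairs for repaired programs."""
    return sum(p == t for p, t in results) / len(results)

def improve(before, after):
    """before/after: passed-test counts per program, before and after repair."""
    return sum(a > b for b, a in zip(before, after)) / len(before)

def consistency(buggy_code, fixed_code):
    # Approximation: similarity between original and repaired code
    # (higher = fewer unnecessary edits).
    return difflib.SequenceMatcher(None, buggy_code, fixed_code).ratio()
```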

Program Modifier and Bug Locator weight download

You can download the trained weights from https://huggingface.co/ZhenlongDai/AdaPatcher.

  1. Base Program Modifier
    weights: "output_dir/loraWeight/fixbycrflp/checkpoint-8000"
    prediction results: "predict_dir/loraWeight/fixbycrflp/test-checkpoint-8000.json"
  2. Bug locator
    weights: "output_dir/loraWeight/trace_CRFLP/checkpoint-14000"
    prediction results: "predict_dir/loraWeight/trace_CRFLP/test-checkpoint-14000.json"
  3. Program Modifier with Hybrid Training for Selective Reference
    weights: "output_dir/loraWeight/fixbycrflp2/checkpoint-12000"
    prediction results: "predict_dir/loraWeight/fixbycrflp2/test-checkpoint-12000.json"
  4. merge_sft
    weights: "output_dir/fix_codeLlama"
  5. Program Modifier with Adaptive Preference Learning
    weights: "output_dir/DpoWeight/DPOP_Fix_ND3V1/checkpoint-1300"
    prediction results: "predict_dir/DpoWeight/DPOP_Fix_ND3V1-GEN/test-checkpoint-1300.json"

Citation

BibTeX:

@article{dai2025less,
  title={Less is More: Adaptive Program Repair with Bug Localization and Preference Learning},
  author={Dai, Zhenlong and Chen, Bingrui and Zhao, Zhuoluo and Tang, Xiu and Wu, Sai and Yao, Chang and Gao, Zhipeng and Chen, Jingyuan},
  journal={arXiv preprint arXiv:2503.06510},
  year={2025}
}
