🚨 Data Duplication Alert
- A recent study has identified a data duplication issue in this benchmark dataset, specifically between the training and testing sets. We thank Anurag Swarnim Yadav and Joseph N. Wilson for bringing this issue to our attention. For further details, please refer to their ACSAC 2024 paper.
Vision Transformer-Inspired Automated Vulnerability Repair
First, clone this repository to your local machine and change into its root directory:
git clone https://github.com/awsm-research/VQM.git
cd VQM
Then install the Python dependencies:
pip install -r requirements.txt
cd VQM/transformers
pip install .
cd ../..
- We highly recommend checking the installation guide for the "torch" library so that you install the version appropriate for your device.
- To use a GPU (optional), you also need to install the CUDA library; see its installation guide.
- We recommend Python 3.9.7, which has been fully tested without issues.
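Before installing, it can help to confirm which interpreter is active and, once "torch" is installed, whether it can see a GPU. The following is a minimal sketch, not part of the repository: the helper name `is_recommended_python` is hypothetical, and the interpreter is assumed to be available as `python3`.

```shell
# Hypothetical helper: succeeds when the version string belongs to the Python 3.9 series
is_recommended_python() {
    case "$1" in
        3.9|3.9.*) return 0 ;;
        *) return 1 ;;
    esac
}

if command -v python3 >/dev/null 2>&1; then
    # platform.python_version() returns a string such as "3.9.7"
    version="$(python3 -c 'import platform; print(platform.python_version())')"
    if is_recommended_python "$version"; then
        echo "Python $version is in the recommended 3.9 series"
    else
        echo "Python $version found; the tested version is 3.9.7"
    fi
    # torch.cuda.is_available() reports True only when torch can use a CUDA device
    python3 -c 'import torch; print("CUDA available:", torch.cuda.is_available())' 2>/dev/null \
        || echo "torch is not importable yet"
else
    echo "python3 not found on PATH"
fi
```

A different Python version may still work; the check only flags a departure from the tested setup.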
Download and unzip the necessary data:
cd data
sh download_data.sh
cd ..
- VQM (proposed approach)
  - Inference
    cd VQM/saved_models/checkpoint-best-loss
    sh download_models.sh
    cd ../..
    sh run_test.sh
    cd ..
  - Retrain Localization Model
    cd VQM
    sh run_pretrain_loc.sh
    sh run_train_loc.sh
    cd ..
  - Retrain Repair Model
    cd VQM
    sh run_pretrain.sh
    sh run_train.sh
    sh run_test.sh
    cd ..
- VulRepair
  - Inference
    cd baselines/VulRepair/saved_models/checkpoint-best-loss
    sh download_models.sh
    cd ../..
    sh run_test.sh
    cd ../..
  - Retrain
    cd baselines/VulRepair
    sh run_pretrain.sh
    sh run_train.sh
    sh run_test.sh
    cd ../..
- TFix
  - Inference
    cd baselines/TFix/saved_models/checkpoint-best-loss
    sh download_models.sh
    cd ../..
    sh run_test.sh
    cd ../..
  - Retrain
    cd baselines/TFix
    sh run_pretrain.sh
    sh run_train.sh
    sh run_test.sh
    cd ../..
- GraphCodeBERT
  - Inference
    cd baselines/GraphCodeBERT/saved_models/checkpoint-best-loss
    sh download_models.sh
    cd ../..
    sh run_test.sh
    cd ../..
  - Retrain
    cd baselines/GraphCodeBERT
    sh run_pretrain.sh
    sh run_train.sh
    sh run_test.sh
    cd ../..
- CodeBERT
  - Inference
    cd baselines/CodeBERT/saved_models/checkpoint-best-loss
    sh download_models.sh
    cd ../..
    sh run_test.sh
    cd ../..
  - Retrain
    cd baselines/CodeBERT
    sh run_pretrain.sh
    sh run_train.sh
    sh run_test.sh
    cd ../..
- VRepair
  - Inference
    cd baselines/VRepair/saved_models/checkpoint-best-loss
    sh download_models.sh
    cd ../..
    sh run_test.sh
    cd ../..
  - Retrain
    cd baselines/VRepair
    sh run_pretrain.sh
    sh run_train.sh
    sh run_test.sh
    cd ../..
- SequenceR
  - Inference
    cd baselines/SequenceR/saved_models/checkpoint-best-loss
    sh download_models.sh
    cd ../..
    sh run_test.sh
    cd ../..
  - Retrain
    cd baselines/SequenceR
    sh run_pretrain.sh
    sh run_train.sh
    sh run_test.sh
    cd ../..
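Because every baseline above follows the same inference pattern, the steps could be wrapped in a small loop. The following is a minimal sketch, not part of the repository: `baseline_dir`, `run_inference`, and the `DRY_RUN` switch are hypothetical names, and the directory layout is assumed from the commands listed above. Run it from the repository root.

```shell
# Baseline names as listed above
BASELINES="VulRepair TFix GraphCodeBERT CodeBERT VRepair SequenceR"

# Print the directory that holds a baseline's scripts
baseline_dir() {
    echo "baselines/$1"
}

# Run the inference steps for one baseline, or only print them when DRY_RUN=1
run_inference() {
    dir="$(baseline_dir "$1")"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "cd $dir/saved_models/checkpoint-best-loss && sh download_models.sh"
        echo "cd $dir && sh run_test.sh"
        return 0
    fi
    # Subshells keep the caller's working directory unchanged
    ( cd "$dir/saved_models/checkpoint-best-loss" && sh download_models.sh ) || return 1
    ( cd "$dir" && sh run_test.sh )
}

# Preview the commands for every baseline without executing anything
DRY_RUN=1
for name in $BASELINES; do
    run_inference "$name"
done
```

Setting `DRY_RUN=0` before the loop would execute the steps instead of printing them; the subshells mean no trailing `cd ../..` is needed.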
- Vul Mask Encoder + Vul Mask Decoder (proposed approach - VQM)
  - Inference
    cd VQM/saved_models/checkpoint-best-loss
    sh download_models.sh
    cd ../..
    sh run_test_no_bug.sh
    cd ..
  - Retrain
    cd VQM
    sh run_pretrain.sh
    sh run_train.sh
    sh run_test.sh
    cd ..
- Vul Mask Encoder
  - Inference
    cd ablation_mask/Vul_mask_enc_only/saved_models/checkpoint-best-loss
    sh download_models.sh
    cd ../..
    sh run_test.sh
    cd ../..
  - Retrain
    cd ablation_mask/Vul_mask_enc_only/saved_models/checkpoint-best-loss
    sh download_models.sh
    cd ../..
    sh run_pretrain.sh
    sh run_train.sh
    sh run_test.sh
    cd ../..
- Vul Mask Decoder
  - Inference
    cd ablation_mask/Vul_mask_cross_only/saved_models/checkpoint-best-loss
    sh download_models.sh
    cd ../..
    sh run_test.sh
    cd ../..
  - Retrain
    cd ablation_mask/Vul_mask_cross_only/saved_models/checkpoint-best-loss
    sh download_models.sh
    cd ../..
    sh run_pretrain.sh
    sh run_train.sh
    sh run_test.sh
    cd ../..
- Perfect Mask Encoder + Perfect Mask Decoder
  - Inference
    cd ablation_mask/Vul_perfect_mask/saved_models/checkpoint-best-loss
    sh download_models.sh
    cd ../..
    sh run_test.sh
    cd ../..
  - Retrain
    cd ablation_mask/Vul_perfect_mask/saved_models/checkpoint-best-loss
    sh download_models.sh
    cd ../..
    sh run_pretrain.sh
    sh run_train.sh
    sh run_test.sh
    cd ../..
- VQM (proposed approach)
  - Inference
    cd VQM/saved_models/checkpoint-best-loss
    sh download_models.sh
    cd ../..
    sh run_test_no_bug.sh
    cd ..
  - Retrain
    cd VQM
    sh run_train_no_bug.sh
    sh run_test_no_bug.sh
    cd ..
- VulRepair
  - Inference
    cd baselines/VulRepair/saved_models/checkpoint-best-loss
    sh download_models.sh
    cd ../..
    sh run_test_no_bug.sh
    cd ../..
  - Retrain
    cd baselines/VulRepair
    sh run_train_no_bug.sh
    sh run_test_no_bug.sh
    cd ../..
- TFix
  - Inference
    cd baselines/TFix/saved_models/checkpoint-best-loss
    sh download_models.sh
    cd ../..
    sh run_test_no_bug.sh
    cd ../..
  - Retrain
    cd baselines/TFix
    sh run_train_no_bug.sh
    sh run_test_no_bug.sh
    cd ../..
- GraphCodeBERT
  - Inference
    cd baselines/GraphCodeBERT/saved_models/checkpoint-best-loss
    sh download_models.sh
    cd ../..
    sh run_test_no_bug.sh
    cd ../..
  - Retrain
    cd baselines/GraphCodeBERT
    sh run_train_no_bug.sh
    sh run_test_no_bug.sh
    cd ../..
- CodeBERT
  - Inference
    cd baselines/CodeBERT/saved_models/checkpoint-best-loss
    sh download_models.sh
    cd ../..
    sh run_test_no_bug.sh
    cd ../..
  - Retrain
    cd baselines/CodeBERT
    sh run_train_no_bug.sh
    sh run_test_no_bug.sh
    cd ../..
- VRepair
  - Inference
    cd baselines/VRepair/saved_models/checkpoint-best-loss
    sh download_models.sh
    cd ../..
    sh run_test_no_bug.sh
    cd ../..
  - Retrain
    cd baselines/VRepair
    sh run_train_no_bug.sh
    sh run_test_no_bug.sh
    cd ../..
- SequenceR
  - Inference
    cd baselines/SequenceR/saved_models/checkpoint-best-loss
    sh download_models.sh
    cd ../..
    sh run_test_no_bug.sh
    cd ../..
  - Retrain
    cd baselines/SequenceR
    sh run_train_no_bug.sh
    sh run_test_no_bug.sh
    cd ../..
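Retraining followed by testing is only meaningful if the training script succeeds, so it may be worth stopping at the first failure. The following is a minimal sketch, not part of the repository: `run_scripts_in` is a hypothetical helper, and the example paths in the comment are assumed from the commands listed above.

```shell
# Hypothetical helper: run the given scripts in order inside a directory,
# stopping at the first missing or failing script
run_scripts_in() {
    dir="$1"
    shift
    for script in "$@"; do
        if [ ! -f "$dir/$script" ]; then
            echo "missing: $dir/$script" >&2
            return 1
        fi
        # The subshell keeps the caller's working directory unchanged
        ( cd "$dir" && sh "$script" ) || return 1
    done
}

# Example (run from the repository root):
# run_scripts_in baselines/VulRepair run_train_no_bug.sh run_test_no_bug.sh
```

The existence check gives an early, readable error when the helper is called from the wrong directory.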
@article{fu2023vision,
title={Vision Transformer-Inspired Automated Vulnerability Repair},
author={Fu, Michael and Nguyen, Van and Tantithamthavorn, Chakkrit and Phung, Dinh and Le, Trung},
journal={ACM Transactions on Software Engineering and Methodology},
year={2023},
publisher={ACM New York, NY}
}