Name	Name	Last commit message	Last commit date
parent directory ..
code	code
data/cloze-maxmin	data/cloze-maxmin
evaluator	evaluator
README.md	README.md

CodeXGLUE -- ClozeTest-maxmin

We present two kinds of cloze testing: ClozeTest-maxmin and ClozeTest-all. Here is the ClozeTest-maxmin task.

Task Description

Cloze tests are widely adopted in Natural Languages Processing to evaluate the performance of the trained language models. The task is aimed to predict the answers for the blank with the context of the blank, which can be formulated as a multi-choice classification problem.

Here we present the two cloze testing datasets in code domain with six different programming languages: ClozeTest-maxmin and ClozeTest-all. Each instance in the dataset contains a masked code function, its docstring and the target word.

The only difference between ClozeTest-maxmin and ClozeTest-all is their selected words sets, where ClozeTest-maxmin only contains two words while ClozeTest-all contains 930 words.

Dependency

python 3.6 or 3.7
torch==1.5.0
transformers>=2.5.0

Data

The data for cloze testing are collected from the the validation and test sets of CodeSearchNet, which contains six different programming languages including RUBY, JAVASCRIPT, GO, PYTHON, JAVA and PHP. Each instance contains a masked code function, its docstring and the target word. An example in PYTHON is shown below:

We present the preprocessed data in data/cloze-maxmin directory.

Data statistics of ClozeTest-maxmin are shown in the below table:

RUBY	JAVASCRIPT	GO	PYTHON	JAVA	PHP	ALL
38	272	152	1264	482	407	2615

Run ClozeTest

You can run ClozeTest-maxmin by the following command. It will automatically generate predictions to --output_dir.

python code/run_cloze.py \
			--model microsoft/codebert-base-mlm \
			--cloze_mode maxmin \
			--output_dir evaluator/predictions/

Evaluator

We provide a script to evaluate predictions for ClozeTest-maxmin, and report accuracy for the task. You can run by the following command:

python evaluator/evaluator.py \
			--answers evaluator/answers \
			--predictions evaluator/predictions

Result

The results on ClozeTest-maxmin are shown as below:

	RUBY	JAVASCRIPT	GO	PYTHON	JAVA	PHP	ALL
RoBERTa-base	73.68	64.71	71.71	59.18	59.75	69.78	62.45
CodeBERT(MLM)	86.84	86.40	90.79	82.20	90.46	88.21	85.66

Cite

ClozeTest-maxmin is built upon CodeSearchNet dataset. If you use this code or our ClozeTest-maxmin dataset, please considering citing CodeXGLUE, CodeBERT and CodeSearchNet:

@article{CodeXGLUE,
  title={CodeXGLUE: An Open Challenge for Code Intelligence},
  journal={arXiv},
  year={2020},
}

@article{feng2020codebert,
  title={CodeBERT: A Pre-Trained Model for Programming and Natural Languages},
  author={Feng, Zhangyin and Guo, Daya and Tang, Duyu and Duan, Nan and Feng, Xiaocheng and Gong, Ming and Shou, Linjun and Qin, Bing and Liu, Ting and Jiang, Daxin and others},
  journal={arXiv preprint arXiv:2002.08155},
  year={2020}
}

@article{husain2019codesearchnet,
  title={CodeSearchNet Challenge: Evaluating the State of Semantic Code Search},
  author={Husain, Hamel and Wu, Ho-Hsiang and Gazit, Tiferet and Allamanis, Miltiadis and Brockschmidt, Marc},
  journal={arXiv preprint arXiv:1909.09436},
  year={2019}
}

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

CodeXGLUE -- ClozeTest-maxmin

Task Description

Dependency

Data

Run ClozeTest

Evaluator

Result

Cite

FilesExpand file tree

ClozeTesting-maxmin

Directory actions

More options

Directory actions

More options

Latest commit

History

ClozeTesting-maxmin

Folders and files

parent directory

README.md

CodeXGLUE -- ClozeTest-maxmin

Task Description

Dependency

Data

Run ClozeTest

Evaluator

Result

Cite