CodeFusion Data and Benchmarks

This repository contains the benchmarks, including the conditional formatting (CF) benchmarks, and the pretraining data used in CodeFusion ("CodeFusion: A Pre-trained Diffusion Model for Code Generation").

Folder Structure

Benchmarks
- Training and Testing data for Python, Bash and CF Rules
Pretraining Data
- Unannotated code used for pretraining for Python, Bash and CF Rules
- The CF Rules data contains only a subset of the full data due to compliance requirements.

File Structure

For each language, the Benchmarks folder contains a JSON file with the following structure:

[
    {
        "ID": <Benchmark ID>,
        "NL Utterance": <Natural Language instruction for generating the code>,
        "Code": <The target code corresponding to the utterance in the language>
    },
    {
        ...
    }
]

Note: Conditional Formatting benchmarks additionally include Data and Labels fields, which are used for execution match.
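
As a rough illustration of the structure above, a benchmark file could be read along the following lines. This is a minimal sketch, not part of the repository: the helper names `load_benchmark` and `has_execution_data` are hypothetical, and the only assumptions about the data are the field names listed in this README.

```python
import json
from pathlib import Path

# Fields documented above for every benchmark entry.
REQUIRED_FIELDS = ("ID", "NL Utterance", "Code")
# Extra fields that, per the note above, appear only in Conditional Formatting benchmarks.
CF_ONLY_FIELDS = ("Data", "Labels")


def load_benchmark(path):
    """Load one benchmark JSON file and check that each entry has the documented fields."""
    with Path(path).open(encoding="utf-8") as f:
        entries = json.load(f)
    if not isinstance(entries, list):
        raise ValueError("expected a top-level JSON list of benchmark entries")
    for i, entry in enumerate(entries):
        missing = [key for key in REQUIRED_FIELDS if key not in entry]
        if missing:
            raise ValueError(f"entry {i} is missing fields: {missing}")
    return entries


def has_execution_data(entry):
    """Return True if the entry carries the extra fields used for execution match."""
    return all(key in entry for key in CF_ONLY_FIELDS)
```

Calling `load_benchmark` with the path to one of the JSON files in the Benchmarks folder returns the list of entries. The exact file names are not listed in this README, so substitute the actual path from the repository.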
