cedar/codex/framework at main · prompt-learning/cedar

Folders:
codex_api
dataset
demonstration
embedding-generation
embedding
evaluation
prompt_codex
prompts
template
util

Files:
__init__.py
README.md
atlas_generate_embedding.py
embedding-prereq.txt
main_atlas.py
main_atlas_bm25_part_1.py
main_atlas_bm25_part_1.sh
main_atlas_bm25_part_2.py
main_atlas_bm25_part_2.sh
main_atlas_generate_stats.py
main_tfix.py
models.py
tfix_generate_embedding.py
timer.py

Setup environment for framework

# The following steps create an isolated environment codex-env and install required dependencies.

conda create --name codex-env python=3.9 # creates the codex-env environment
conda activate codex-env                 # activates the codex-env environment
conda install openai                     # installs dependency - openai

pip install backoff
pip install edit_distance
# difflib is part of the Python standard library; no installation is needed
conda install matplotlib
conda install plotly
conda install scipy
conda install scikit-learn               # the package is named scikit-learn; it is imported as sklearn

pip install tenacity
pip install suffix-trees
pip install lizard
conda install gensim
pip install rank_bm25

Setup environment for embedding search

We use sentence-transformers to generate embeddings for the code snippets.

conda create -n semantic-embedding --file embedding-prereq.txt
conda activate semantic-embedding  # activate the new environment before installing into it
conda install -c pytorch faiss-cpu # install faiss-cpu

To install sentence-transformers, follow its official installation instructions. On Linux, as those instructions state, it can be installed with the following command: pip install -U sentence-transformers.

On a Mac with an M1 chip, however, we had to run the following commands to install it:

conda list openmp
conda uninstall intel-openmp
conda install -c conda-forge sentence-transformers

Install vector database

We use Vector Database Lite (vdblite) for vector search. Details are available in the vdblite library documentation.

pip install vdblite
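As a rough illustration of what vector search does with the generated embeddings, the sketch below ranks stored vectors by cosine similarity to a query vector using brute force. It uses only the Python standard library and does not use or mirror the vdblite API; the function names and the toy three-dimensional vectors are hypothetical stand-ins for real code-snippet embeddings.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction; treat it as dissimilar
    return dot / (norm_a * norm_b)


def top_k(query: Sequence[float],
          vectors: Sequence[Sequence[float]],
          k: int = 2) -> List[Tuple[int, float]]:
    """Return (index, similarity) pairs for the k vectors most similar to query."""
    scored = [(i, cosine_similarity(query, v)) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy "embeddings"; real ones would come from the embedding scripts.
stored = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
]
query = [1.0, 0.0, 0.0]

nearest = top_k(query, stored, k=2)
for index, score in nearest:
    print(f"snippet {index}: similarity {score:.4f}")
```

A dedicated vector store avoids re-scanning every vector in application code and handles persistence, but the ranking idea is the same.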

Generate embeddings for the code snippets

Run the following command to generate embeddings for ATLAS.

python atlas_generate_embedding.py

Run the following command to generate embeddings for TFix.

python tfix_generate_embedding.py

Running evaluation

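Results from the experiments are saved in the folder ./codex/framework/result-analysis/final-results/. To compute match statistics for a results file, pass it to the analysis script; the command is followed by sample output.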

python evaluation/result_analysis_atlas.py ./results.csv
exact_match_count: 9021 match_count: 10368
exact match_count (%): 47.946, match_count(%): 55.105
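The percentages in the sample output are the counts divided by the total number of evaluated examples. The sketch below reproduces that arithmetic. The total of 18815 is an assumption: it is not stated in the output and is inferred here as the value consistent with both reported percentages. The helper name is hypothetical and is not taken from result_analysis_atlas.py.

```python
def percentage(count: int, total: int) -> float:
    """Return count as a percentage of total, rounded to three decimals."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * count / total, 3)


# Assumption: inferred from the reported percentages, not stated in the source.
TOTAL_EXAMPLES = 18815

exact_pct = percentage(9021, TOTAL_EXAMPLES)
match_pct = percentage(10368, TOTAL_EXAMPLES)
print(f"exact match_count (%): {exact_pct}, match_count(%): {match_pct}")
```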

Dataset acknowledgements

We use the ATLAS and TFix datasets for our experiments. For simplicity, we have included the datasets in the repository, and we thank their authors for making them publicly available.
