References
- Crowdsourcing Multiple Choice Science Questions by Johannes Welbl, Nelson F. Liu, Matt Gardner: http://arxiv.org/abs/1707.06209.
- MiniLMv2: Multi-Head Self-Attention Relation Distillation for Compressing Pretrained Transformers by Wenhui Wang, Hangbo Bao, Shaohan Huang, Li Dong, Furu Wei: https://arxiv.org/abs/2012.15828.
- Hugging Face Llama model documentation: https://huggingface.co/docs/transformers/main/en/model_doc/llama.
- ONNX Runtime: https://onnxruntime.ai/.