This repository contains scripts for processing and generating datasets for animal advocacy AI development. The tools create, clean, and validate data used to train AI systems aligned with animal advocacy values. A key component is the synthetic feedback generator, which was used to create part of the Animal Alignment Feedback Dataset.
Note: The datasets processed by these scripts have since been incorporated into a larger, more comprehensive dataset: Continued Pre-Training.