SA-FARI Dataset
License CC-BY-NC 4.0
SA-FARI is a wildlife camera dataset collected through a collaboration between Meta and CXL.
All videos and pre-processed JPEGImages can be found in cxl-public-camera-trap, which is organized as follows:
sa_fari/
├── sa_fari_test_tars/
│   ├── JPEGImages_6fps/
│   └── videos/
├── sa_fari_test/
│   ├── JPEGImages_6fps/
│   └── videos/
├── sa_fari_train_tars/
│   ├── JPEGImages_6fps/
│   └── videos/
└── sa_fari_train/
    ├── JPEGImages_6fps/
    └── videos/
videos: The original full-fps videos.
JPEGImages_6fps: For annotation, the videos were downsampled to 6 fps. This folder contains the downsampled frames, compatible with the annotation json files below.
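Assuming a uniform downsampling to 6 fps, a frame index in JPEGImages_6fps can be mapped back to an approximate frame in the original full-fps video. The helper below is a hypothetical sketch, not part of the dataset tooling; video_fps is the per-video frame rate exposed in the *_ext.json metadata:

```python
# Sketch: map a 6 fps annotation frame index back to the original video.
# The exact downsampling scheme is an assumption; verify against the data.
ANNOTATION_FPS = 6

def original_frame_index(frame_idx_6fps, video_fps):
    """Approximate index of the corresponding frame in the full-fps video."""
    t = frame_idx_6fps / ANNOTATION_FPS  # timestamp in seconds
    return round(t * video_fps)

print(original_frame_index(12, 30))  # frame at t=2s in a 30 fps video -> 60
```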
This Hugging Face dataset repo contains the annotations:
datasets/facebook/SA-FARI/tree/main/
└── annotation/
├── sa_fari_test.json
├── sa_fari_test_ext.json
├── sa_fari_train.json
└── sa_fari_train_ext.json
sa_fari_test.json and sa_fari_train.json - Follow the same format as SA-Co/VEval
sa_fari_test_ext.json and sa_fari_train_ext.json - In addition to the [SA-Co/VEval] format, we added additional metadata to the following fields:
videos: video_num_frames, video_fps, video_creation_datetime and location_id have been added as additional metadata to the videos field.
categories: Kingdom, Phylum, Class, Order, Family, Genus and Species have been added, when applicable, as additional metadata to the categories field.
All SA-FARI annotation files are compatible with the visualization notebook and offline evaluator provided in the SAM 3 GitHub repository.
Annotation Format
Below is a format breakdown for sa_fari_test.json and sa_fari_train.json; the format is similar to the YTVIS format.
Each annotation json, e.g. sa_fari_test.json, contains 5 fields:
- info:
- A dict containing the dataset info
- E.g. {'version': 'v1', 'date': '2025-09-24', 'description': 'SA-FARI Test'}
- videos
- A list of videos that are used in the current annotation json
- It contains {id, video_name, file_names, height, width, length}
- annotations
- A list of positive masklets and their related info
- It contains {id, segmentations, bboxes, areas, iscrowd, video_id, height, width, category_id, noun_phrase}
- video_id should match the videos id field above
- category_id should match the categories id field below
- segmentations is a list of RLEs
- categories
- A globally used noun phrase id map, consistent across all 3 domains.
- It contains {id, name}
- name is the noun phrase
- video_np_pairs
- A list of video-np pairs, both positive and negative, used in the current annotation json
- It contains {id, video_id, category_id, noun_phrase, num_masklets}
- video_id should match the videos id field above
- category_id should match the categories id field above
- when num_masklets > 0, it is a positive video-np pair, and the corresponding masklets can be found in the annotations field
- when num_masklets = 0, it is a negative video-np pair, meaning no masklet is present at all
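The positive/negative split described above can be sketched as a simple filter over the video_np_pairs field. The sample data below is illustrative (made-up ids and noun phrases), not taken from the real annotation files:

```python
# Minimal sketch: splitting video-np pairs into positive and negative
# prompts based on num_masklets, following the field layout described above.
# The sample dict is hypothetical example data.
sample = {
    "video_np_pairs": [
        {"id": 0, "video_id": "video_0001", "category_id": 3,
         "noun_phrase": "elephant", "num_masklets": 2},
        {"id": 1, "video_id": "video_0001", "category_id": 7,
         "noun_phrase": "zebra", "num_masklets": 0},
    ]
}

def split_pairs(data):
    """Return (positive, negative) video-np pairs based on num_masklets."""
    positive = [p for p in data["video_np_pairs"] if p["num_masklets"] > 0]
    negative = [p for p in data["video_np_pairs"] if p["num_masklets"] == 0]
    return positive, negative

pos, neg = split_pairs(sample)
print(len(pos), len(neg))  # 1 positive pair, 1 negative pair
```

The same function can be pointed at a loaded sa_fari_test.json, since it only relies on the fields listed above.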
data {
"info": info
"videos": [video]
"annotations": [annotation]
"categories": [category]
"video_np_pairs": [video_np_pair]
}
video {
"id": int
"video_name": str # e.g. sav_000000
"file_names": List[str]
"height": int
"width": width
"length": length
}
annotation {
"id": int
"segmentations": List[RLE]
"bboxes": List[List[int, int, int, int]]
"areas": List[int]
"iscrowd": int
"video_id": str
"height": int
"width": int
"category_id": int
"noun_phrase": str
}
category {
"id": int
"name": str
}
video_np_pair {
"id": int
"video_id": str
"category_id": int
"noun_phrase": str
"num_masklets" int
}
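Since segmentations is a list of RLE masks, it helps to see how an RLE decodes. The sketch below handles only the uncompressed COCO-style form (alternating run lengths over a column-major flattening, starting with background); the actual files may store compressed RLE, for which pycocotools' mask utilities are the usual decoder:

```python
def decode_uncompressed_rle(rle):
    """Decode a COCO-style uncompressed RLE dict {"size": [h, w],
    "counts": [...]} into a binary mask as a list of rows.

    Runs alternate starting with background (0), and the flattened
    order is column-major, as in the COCO mask format.
    """
    h, w = rle["size"]
    flat, val = [], 0
    for run in rle["counts"]:
        flat.extend([val] * run)
        val = 1 - val
    assert len(flat) == h * w, "run lengths must cover the full mask"
    # un-flatten column-major: pixel (r, c) lives at index c * h + r
    return [[flat[c * h + r] for c in range(w)] for r in range(h)]

# tiny 2x2 example: one background pixel, two foreground, one background
mask = decode_uncompressed_rle({"size": [2, 2], "counts": [1, 2, 1]})
print(mask)  # [[0, 1], [1, 0]]
```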