Reinforced Token Optimization
Team members: 1
models: 10
RTO-RL/Llama3-8B-RTO_RPP • 8B • Updated Apr 10 • 7 • 1
RTO-RL/Llama3-8B-RPP • 8B • Updated Apr 10 • 7 • 1
RTO-RL/Llama3-8B-TDPO • 8B • Updated Feb 11 • 4 • 1
RTO-RL/Llama3-8B-SimPO • 8B • Updated Feb 11 • 4
RTO-RL/Llama3-8B-RDPO • 8B • Updated Feb 11 • 4 • 1
RTO-RL/Llama3-8B-PPO • 8B • Updated Feb 11 • 13 • 1
RTO-RL/Llama3-8B-RTO • 8B • Updated Feb 11 • 7 • 1
RTO-RL/Llama3.2-1B-RewardModel • 1B • Updated Feb 11 • 969
RTO-RL/Llama3-8B-RewardModel • 8B • Updated Feb 11 • 6
RTO-RL/Llama3-8B-DPO • 8B • Updated Feb 11 • 1.42k
datasets: 0 (none public yet)