PaLM + RLHF - Pytorch

PaLM-rlhf-pytorch is a PyTorch implementation of the Pathways Language Model (PaLM) architecture combined with Reinforcement Learning from Human Feedback (RLHF). It is designed for fine-tuning large language models toward human preferences, similar to the approach OpenAI used to train models like ChatGPT.

Features

Implements RLHF for fine-tuning large-scale language models
Uses Proximal Policy Optimization (PPO) to keep reinforcement learning updates stable
Optimized for training on distributed hardware like GPUs and TPUs
Supports both pretraining and reward model fine-tuning
Built on PyTorch with modular and extensible components
Designed for experimenting with human-aligned AI training
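
To make the PPO and reward model features above more concrete, here is a minimal sketch of the two losses usually associated with RLHF: the PPO clipped surrogate objective and a pairwise preference loss for a reward model. It uses only the Python standard library. The function names, the default clip range of 0.2, and the scalar inputs are assumptions chosen for illustration; this is not the project's actual API or implementation.

```python
import math


def ppo_clipped_objective(logp_new, logp_old, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (a quantity to maximize).

    logp_new / logp_old: log-probabilities of the sampled actions under the
    current and the previous policy. All names here are hypothetical.
    """
    total = 0.0
    for ln, lo, adv in zip(logp_new, logp_old, advantages):
        ratio = math.exp(ln - lo)  # probability ratio pi_new / pi_old
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


def pairwise_preference_loss(reward_chosen, reward_rejected):
    """-log(sigmoid(r_chosen - r_rejected)), computed as a stable softplus."""
    x = reward_rejected - reward_chosen
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


if __name__ == "__main__":
    # The ratio is 2.0 here, so a positive advantage is capped at 1 + clip_eps.
    print(ppo_clipped_objective([math.log(2.0)], [0.0], [1.0]))
    # The loss shrinks as the preferred response is scored higher.
    print(pairwise_preference_loss(1.0, 0.0), pairwise_preference_loss(0.0, 1.0))
```

Clipping the probability ratio limits how far a single update can move the policy away from the one that generated the samples, which is the stability property the feature list refers to. The pairwise loss is one common way to fit a reward model to human comparisons between two responses.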

License

MIT License

Additional Project Details

Programming Language

Python

Related Categories

Python Reinforcement Learning Frameworks

Registered

2025-03-13
