PKU-Alignment

Loves Sharing and Open-Source, Making AI Safer.

275 followers
China
yaodong.yang@outlook.com

PKU-Alignment

Large language models (LLMs) have immense potential for general intelligence but also carry significant risks. As a research team at Peking University, we focus on alignment techniques for LLMs, such as safety alignment, to make models safer and less toxic.

You are welcome to follow our AI safety projects:

safe-rlhf
omnisafe
safepo
safety-gymnasium

Pinned

omnisafe Public

OmniSafe is an infrastructural framework for accelerating SafeRL research.

Python · 867 stars · 126 forks

safety-gymnasium Public

NeurIPS 2023: Safety-Gymnasium: A Unified Safe Reinforcement Learning Benchmark

Python · 336 stars · 47 forks

safe-rlhf Public

Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback

Python · 1.2k stars · 106 forks

Safe-Policy-Optimization Public

NeurIPS 2023: Safe Policy Optimization: A benchmark repository for safe reinforcement learning algorithms

Python · 299 stars · 41 forks

Repositories

Showing 10 of 11 repositories

.github Public
0 stars · 0 forks · 0 open issues · 0 open pull requests · Updated Jun 12, 2024

llms-resist-alignment Public
Repo for paper "Language Models Resist Alignment"
Python · 2 stars · 0 forks · 0 open issues · 0 open pull requests · Updated Jun 9, 2024

omnisafe Public
OmniSafe is an infrastructural framework for accelerating SafeRL research.
Python · 867 stars · Apache-2.0 · 126 forks · 12 open issues · 6 open pull requests · Updated May 16, 2024

safety-gymnasium Public
NeurIPS 2023: Safety-Gymnasium: A Unified Safe Reinforcement Learning Benchmark
Python · 336 stars · Apache-2.0 · 47 forks · 1 open issue · 0 open pull requests · Updated May 14, 2024

safe-rlhf Public
Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback
Python · 1,202 stars · Apache-2.0 · 106 forks · 14 open issues · 0 open pull requests · Updated Apr 20, 2024

ProAgent Public
ProAgent: Building Proactive Cooperative Agents with Large Language Models
JavaScript · 38 stars · MIT · 3 forks · 1 open issue · 0 open pull requests · Updated Apr 8, 2024

SafeDreamer Public
ICLR 2024: SafeDreamer: Safe Reinforcement Learning with World Models
Python · 31 stars · Apache-2.0 · 2 forks · 0 open issues · 0 open pull requests · Updated Apr 8, 2024

Safe-Policy-Optimization Public
NeurIPS 2023: Safe Policy Optimization: A benchmark repository for safe reinforcement learning algorithms
Python · 299 stars · Apache-2.0 · 41 forks · 0 open issues · 0 open pull requests · Updated Mar 20, 2024

AlignmentSurvey Public
AI Alignment: A Comprehensive Survey
117 stars · 0 forks · 0 open issues · 0 open pull requests · Updated Nov 2, 2023

beavertails Public
BeaverTails is a collection of datasets designed to facilitate research on safety alignment in large language models (LLMs).
Makefile · 87 stars · Apache-2.0 · 3 forks · 1 open issue · 1 open pull request · Updated Oct 27, 2023

Most used topics

reinforcement-learning safe-reinforcement-learning llms rlhf safe-rlhf
