AI Research Wiki

Tag: rlhf

4 items with this tag.

  • Reinforcement Learning from Human Feedback (Apr 11, 2026)
    Tags: rlhf, alignment, reinforcement-learning
  • Constitutional AI: Harmlessness from AI Feedback (Apr 11, 2026)
    Tags: constitutional-ai, alignment, rlhf, ai-feedback
  • Direct Preference Optimization: Your Language Model is Secretly a Reward Model (Apr 11, 2026)
    Tags: direct-preference-optimization, alignment, rlhf, preference-learning
  • InstructGPT: Training Language Models to Follow Instructions with Human Feedback (Apr 11, 2026)
    Tags: rlhf, alignment, instruction-following, language-model
