AI Research Wiki
Tag: rlhf
4 items with this tag.
Apr 11, 2026
Reinforcement Learning from Human Feedback
Tags: rlhf, alignment, reinforcement-learning

Apr 11, 2026
Constitutional AI: Harmlessness from AI Feedback
Tags: constitutional-ai, alignment, rlhf, ai-feedback

Apr 11, 2026
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Tags: direct-preference-optimization, alignment, rlhf, preference-learning

Apr 11, 2026
InstructGPT: Training Language Models to Follow Instructions with Human Feedback
Tags: rlhf, alignment, instruction-following, language-model