AI Research Wiki
Tag: alignment
9 items carry this tag.
- Constitutional AI (Apr 11, 2026) [tags: alignment, safety, rlaif]
- Direct Preference Optimization (Apr 11, 2026) [tags: alignment, preference-learning, optimization]
- Reinforcement Learning from Human Feedback (Apr 11, 2026) [tags: rlhf, alignment, reinforcement-learning]
- Anthropic (Apr 11, 2026) [tags: organization, ai-lab, alignment, ai-safety]
- OpenAI (Apr 11, 2026) [tags: organization, ai-lab, alignment]
- Constitutional AI: Harmlessness from AI Feedback (Apr 11, 2026) [tags: constitutional-ai, alignment, rlhf, ai-feedback]
- Direct Preference Optimization: Your Language Model is Secretly a Reward Model (Apr 11, 2026) [tags: direct-preference-optimization, alignment, rlhf, preference-learning]
- GPT-4 Technical Report (Apr 11, 2026) [tags: multimodal, frontier-model, scaling, alignment]
- InstructGPT: Training Language Models to Follow Instructions with Human Feedback (Apr 11, 2026) [tags: rlhf, alignment, instruction-following, language-model]