AI Research Wiki
Tag: preference-learning
2 items with this tag.
Apr 11, 2026
Direct Preference Optimization
alignment
preference-learning
optimization
Apr 11, 2026
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
direct-preference-optimization
alignment
rlhf
preference-learning