AI Research Wiki

Tag: preference-learning

2 items with this tag.

  • Apr 11, 2026

    Direct Preference Optimization

    • alignment
    • preference-learning
    • optimization
  • Apr 11, 2026

    Direct Preference Optimization: Your Language Model is Secretly a Reward Model

    • direct-preference-optimization
    • alignment
    • rlhf
    • preference-learning

Created with Quartz v4.5.2 © 2026
