Mudit Bachhawat

Tag: human labeling

On Preference Optimization and DPO

May 10, 2024

Reading Time: 6 minutes

Introduction

Training with preference data has allowed large language models (LLMs) to be optimized for specific qualities such as trust, safety, and harmlessness. Preference optimization is the process of using this data to improve LLMs. This method is particularly useful for tuning the model to emphasize certain features or for training scenarios where relative feedback…