Events

Learning Dynamics of LLM Finetuning

LLM seminar event on the paper "Learning Dynamics of LLM Finetuning" by researchers at the University of British Columbia.

Title: Learning Dynamics of LLM Finetuning

Presenter: Linli Zhang

Abstract: Learning dynamics, which describes how the learning of specific training examples influences the model's predictions on other examples, gives us a powerful tool for understanding the behavior of deep learning systems. The authors study the learning dynamics of large language models during different types of finetuning by analyzing the step-wise decomposition of how influence accumulates among different potential responses. Their framework allows a uniform interpretation of many interesting observations about the training of popular algorithms for both instruction tuning and preference tuning. In particular, they propose a hypothetical explanation of why specific types of hallucination are strengthened after finetuning: for example, the model might use phrases or facts from the response to question B to answer question A, or it might keep repeating similar simple phrases when generating responses. They also extend their framework and highlight a unique "squeezing effect" to explain a previously observed phenomenon in off-policy direct preference optimization (DPO), where running DPO for too long makes even the desired outputs less likely. The framework also provides insights into where the benefits of on-policy DPO and other variants come from. The analysis not only provides a novel perspective on understanding LLM finetuning but also inspires a simple, effective method to improve alignment performance.
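Background note (not part of the talk materials): one common way to formalize "learning dynamics" is a first-order expansion of a single gradient step. In the illustrative notation below, which assumes plain SGD with learning rate \eta rather than the paper's exact setup, updating on a training example (x_u, y_u) changes the model's log-probability of a response y to some other prompt x_o roughly as

\Delta \log \pi_\theta(y \mid x_o) \approx -\eta \, \nabla_\theta \log \pi_\theta(y \mid x_o)^{\top} \, \nabla_\theta \mathcal{L}(x_u, y_u)

The gradient inner product (an empirical neural-tangent-kernel term) measures how much an update on one example moves the prediction on another, which is the kind of cross-example influence the abstract refers to.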
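For reference, the off-policy DPO objective mentioned in the abstract is the standard one from Rafailov et al. (2023). A minimal PyTorch sketch follows; the function and argument names are ours, not the paper's:

    import torch
    import torch.nn.functional as F

    def dpo_loss(pi_chosen_logp, pi_rejected_logp,
                 ref_chosen_logp, ref_rejected_logp, beta=0.1):
        """Standard DPO loss. Each argument is a tensor of summed
        log-probabilities of a full response under the trainable
        policy or the frozen reference model."""
        # Implicit rewards: beta-scaled log-ratios of policy to reference.
        chosen_reward = beta * (pi_chosen_logp - ref_chosen_logp)
        rejected_reward = beta * (pi_rejected_logp - ref_rejected_logp)
        # Logistic loss on the reward margin; minimizing widens the margin.
        return -F.logsigmoid(chosen_reward - rejected_reward).mean()

The "squeezing effect" in the talk concerns what happens when this objective is minimized for too long, namely the phenomenon described above where even the desired (chosen) outputs become less likely.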

Paper link:

Disclaimer: The presenter is not one of the paper's authors.

LLM seminar

Seminar on Large Language Models in the CS Department
