I've been fine-tuning the model on a specific domain and tweaking various hyperparameters, including the learning rate, via a YAML file. However, after training completed, I noticed that the toolkit defaults to the learning rate stored in the last pre-trained checkpoint instead of the value specified in the YAML file. Is there a way to make it prioritize the learning rate I pass in the YAML file? A minimal sketch of the kind of override I have in mind is shown below.
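For reference, this is a generic PyTorch sketch of forcing the YAML learning rate back onto the optimizer after restoring a checkpoint, assuming the checkpoint stores the optimizer state (which is usually what carries the old learning rate along). The paths, checkpoint keys, and the `learning_rate` config key are placeholders, not the toolkit's actual names:

```python
import torch
import yaml

# Placeholder paths; the real checkpoint layout and YAML schema will differ.
CKPT_PATH = "pretrained/last_checkpoint.pt"
CONFIG_PATH = "finetune.yaml"

# Load the fine-tuning hyperparameters from the YAML file.
with open(CONFIG_PATH) as f:
    cfg = yaml.safe_load(f)

model = torch.nn.Linear(10, 10)  # stand-in for the real model
optimizer = torch.optim.Adam(model.parameters(), lr=cfg["learning_rate"])

# Restoring the optimizer state from the checkpoint also restores the old
# learning rate, which would explain why the YAML value appears ignored.
ckpt = torch.load(CKPT_PATH, map_location="cpu")
model.load_state_dict(ckpt["model"])
optimizer.load_state_dict(ckpt["optimizer"])

# Re-apply the YAML learning rate to every parameter group *after* the
# optimizer state has been restored, so the config value wins.
for group in optimizer.param_groups:
    group["lr"] = cfg["learning_rate"]
```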
Additionally, after fine-tuning, the model does not appear to learn effectively. Despite a significant increase in training data (from roughly 1,200 hours in the baseline to roughly 4,000 hours for fine-tuning) and the use of noise augmentation during fine-tuning (the baseline was trained on clean data only), the results are unchanged. Would it be reasonable to conclude that fine-tuning isn't yielding meaningful improvements, or is it more likely due to the learning rate being inherited from the last pre-trained checkpoint?
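One way to separate the two explanations would be to log the learning rate the optimizer actually applies at each step and compare it against the YAML value. Again a generic PyTorch sketch; the function name is a placeholder:

```python
from torch.optim import Optimizer

def log_effective_lr(optimizer: Optimizer, step: int) -> None:
    # Read the learning rate(s) the optimizer will actually use this step;
    # if these match the checkpoint's value rather than the YAML value,
    # the stale learning rate is the likelier culprit.
    lrs = [group["lr"] for group in optimizer.param_groups]
    print(f"step {step}: effective lr(s) = {lrs}")
```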