
Optimizing Learning Rate Parameters in Model Fine-tuning #9215

Open
csetanmayjain opened this issue May 16, 2024 · 0 comments
Labels
bug Something isn't working

Comments

@csetanmayjain

I've been fine-tuning the model on a specific domain and tweaking various hyperparameters, including the learning rate, via a YAML file. However, after training completed, I noticed that the toolkit had defaulted to the learning rate stored in the last pre-trained model instead of the value specified in the YAML file. Is there a way to ensure that it prioritizes the learning rate parameter I've passed in the YAML file?
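
The kind of override I have in mind is something along these lines, assuming a standard PyTorch optimizer underneath (`finetune.yaml`, `pretrained.ckpt`, the checkpoint keys, and the stand-in model are placeholders for my actual setup, not the toolkit's API):

```python
import torch
import yaml

# Placeholder config and checkpoint paths; substitute the real ones.
with open("finetune.yaml") as f:
    config = yaml.safe_load(f)

model = torch.nn.Linear(80, 256)  # stand-in for the actual model
optimizer = torch.optim.Adam(model.parameters(), lr=config["lr"])

checkpoint = torch.load("pretrained.ckpt", map_location="cpu")
model.load_state_dict(checkpoint["model"])
# Restoring the optimizer state also restores the old learning rate...
optimizer.load_state_dict(checkpoint["optimizer"])

# ...so force the learning rate back to the YAML value afterwards.
for group in optimizer.param_groups:
    group["lr"] = config["lr"]
```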

Additionally, after fine-tuning, I've observed that the model fails to learn effectively. Despite a significant increase in training data (from around 1,200 hours for the baseline to approximately 4,000 hours for the fine-tuned model), and the use of noise augmentation during fine-tuning compared to the clean data used for the baseline, the results remain unchanged. Would it be reasonable to conclude that fine-tuning isn't yielding significant improvements, or is it more likely because the learning rate defaults to the value taken from the last pre-trained model?
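
To check which learning rate is actually in effect during fine-tuning, I've been logging it like this (again assuming a plain PyTorch optimizer/scheduler under the hood):

```python
# Log the learning rate actually applied at each epoch (or step):
for group in optimizer.param_groups:
    print("effective lr:", group["lr"])

# If an LR scheduler is attached, get_last_lr() reports the value
# used for the most recent step:
# print(scheduler.get_last_lr())
```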

@csetanmayjain added the bug label on May 16, 2024