I've been fine-tuning the model on a specific domain and tweaking various hyperparameters, including the learning rate, via a YAML file. However, after training completed, I noticed that the toolkit defaults to the learning rate stored in the last pre-trained checkpoint instead of the value specified in the YAML file. Is there a way to make it prioritize the learning rate I pass in the YAML file? A minimal sketch of the kind of override I have in mind is shown below.
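For reference, this is a generic PyTorch sketch of forcing the YAML learning rate back onto the optimizer after restoring a checkpoint, assuming the checkpoint stores the optimizer state (which is usually what carries the old learning rate along). The paths, checkpoint keys, and the `learning_rate` config key are placeholders, not the toolkit's actual names:

```python
import torch
import yaml

# Placeholder paths; the real checkpoint layout and YAML schema will differ.
CKPT_PATH = "pretrained/last_checkpoint.pt"
CONFIG_PATH = "finetune.yaml"

# Load the fine-tuning hyperparameters from the YAML file.
with open(CONFIG_PATH) as f:
    cfg = yaml.safe_load(f)

model = torch.nn.Linear(10, 10)  # stand-in for the real model
optimizer = torch.optim.Adam(model.parameters(), lr=cfg["learning_rate"])

# Restoring the optimizer state from the checkpoint also restores the old
# learning rate, which would explain why the YAML value appears ignored.
ckpt = torch.load(CKPT_PATH, map_location="cpu")
model.load_state_dict(ckpt["model"])
optimizer.load_state_dict(ckpt["optimizer"])

# Re-apply the YAML learning rate to every parameter group *after* the
# optimizer state has been restored, so the config value wins.
for group in optimizer.param_groups:
    group["lr"] = cfg["learning_rate"]
```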
Additionally, after fine-tuning, the model does not appear to learn effectively. Despite a significant increase in training data (from roughly 1,200 hours in the baseline to roughly 4,000 hours for fine-tuning) and the use of noise augmentation during fine-tuning (the baseline was trained on clean data only), the results are unchanged. Would it be reasonable to conclude that fine-tuning isn't yielding meaningful improvements, or is it more likely due to the learning rate being inherited from the last pre-trained checkpoint?
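One way to separate the two explanations would be to log the learning rate the optimizer actually applies at each step and compare it against the YAML value. Again a generic PyTorch sketch; the function name is a placeholder:

```python
from torch.optim import Optimizer

def log_effective_lr(optimizer: Optimizer, step: int) -> None:
    # Read the learning rate(s) the optimizer will actually use this step;
    # if these match the checkpoint's value rather than the YAML value,
    # the stale learning rate is the likelier culprit.
    lrs = [group["lr"] for group in optimizer.param_groups]
    print(f"step {step}: effective lr(s) = {lrs}")
```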