
[RFC] Deprecate the unfreeze_milestones finetuning strategy? #1261

Open
ethanwharris opened this issue Mar 29, 2022 · 6 comments
Labels
documentation (Improvements or additions to documentation) · good first issue (Good for newcomers) · Refactor (Functional)

Comments

@ethanwharris
Collaborator

Motivation

The unfreeze_milestones finetuning strategy is confusing:

  • how is the layer number interpreted? Does it include e.g. batch norm and non-linearity layers? Not documented (see the sketch below)
  • what's the use case? Not aware of any case where this would be recommended (also not documented)
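
For context, here is how I read the current behaviour, written out as a minimal plain-PyTorch sketch (not Flash's actual implementation; the milestone and layer-counting semantics below are my assumptions, which is exactly what the docs should pin down):

```python
import torch.nn as nn

def apply_unfreeze_milestones(backbone: nn.Module, epoch: int,
                              milestones=(5, 10), num_layers=5):
    # Assumed semantics: the backbone starts frozen, the last `num_layers`
    # "layers" are unfrozen at the first milestone epoch, and the whole
    # backbone is unfrozen at the second milestone epoch.
    children = list(backbone.children())  # a "layer" here is a direct child module,
                                          # which would count BatchNorm and
                                          # activation modules too
    if epoch < milestones[0]:
        to_unfreeze = []
    elif epoch < milestones[1]:
        to_unfreeze = children[-num_layers:]
    else:
        to_unfreeze = children

    for module in children:
        unfrozen = any(module is m for m in to_unfreeze)
        for p in module.parameters():
            p.requires_grad = unfrozen
```

If the strategy really does count direct child modules like this, then batch norm and activation modules count towards the layer number, which is the ambiguity raised above.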

Alternatives

At a minimum, document the answers to the above questions (if there are any).

@karthikrangasai
Contributor

Hello @ethanwharris ,

I think the UnfreezeMilestones strategy is somewhat close to the gradual unfreezing idea from the paper Universal Language Model Fine-tuning for Text Classification by Jeremy Howard and Sebastian Ruder.

We might need to change how it works or add more arguments for users to customize it, and it definitely needs more documentation.

We could also rename it to GradualUnfreezing, if that is okay.
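
For reference, a minimal sketch of the gradual unfreezing schedule described in that paper (one layer group unfrozen per epoch, starting from the last group); the layer_groups split and the function name are placeholders, not anything from Flash:

```python
import torch.nn as nn

def gradual_unfreeze(layer_groups: list[nn.Module], epoch: int) -> None:
    # Unfreeze the last `epoch + 1` groups; everything earlier stays frozen.
    # After len(layer_groups) epochs the whole model is trainable.
    num_unfrozen = min(epoch + 1, len(layer_groups))
    for i, group in enumerate(layer_groups):
        trainable = i >= len(layer_groups) - num_unfrozen
        for p in group.parameters():
            p.requires_grad = trainable
```

The main difference from unfreeze_milestones is that the schedule is per layer group per epoch rather than two fixed milestones with a single layer count, so any extra arguments would mostly be about how the groups are defined.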

@ethanwharris
Collaborator Author

ethanwharris commented Mar 30, 2022

Hey @karthikrangasai, yes that could work. If we re-implement it, then we can change the name and cite their paper in the code / docs so that people can read about why they may want to use it 😃

... plus documenting the usage of course

@ethanwharris ethanwharris added the documentation Improvements or additions to documentation label Mar 30, 2022
@karthikrangasai
Contributor

Great. Sounds good.
I will try to get some work done on this then.

@krshrimali
Contributor

Hi, my 2 cents on this:

  1. As part of this issue, let's only document this further, as @ethanwharris rightly pointed out. We definitely need to show the usage and explain a little about how it works. If an exact reference exists, it would be great to cite it.
  2. I would prefer not to modify/add/edit the current strategy, as long as it works and does the job (of unfreezing layers). But if there is some motivation to change it, please share a reference implementation here (if any library already implements it), and we can take a quick look and go ahead if it all sounds good.

Hope it sounds good! :)

@stale

stale bot commented Jun 5, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the won't fix This will not be worked on label Jun 5, 2022
@stale stale bot closed this as completed Jun 15, 2022
@ethanwharris ethanwharris removed the won't fix This will not be worked on label Jun 30, 2022
@ethanwharris ethanwharris reopened this Jun 30, 2022
@krshrimali krshrimali removed this from the 0.8.0 milestone Jul 29, 2022
@stale stale bot added the won't fix This will not be worked on label Dec 24, 2022
@Borda
Member

Borda commented Jan 5, 2023

@ethanwharris let's do it... 🦦

@stale stale bot removed the won't fix This will not be worked on label Jan 5, 2023
@Borda Borda added the good first issue Good for newcomers label Jan 5, 2023
@Lightning-Universe Lightning-Universe deleted a comment from stale bot Jan 5, 2023