Added train by epoch for Trainer and added support for texts #12

Open · wants to merge 122 commits into base: main

Conversation

MarcusLoppe
Contributor

Trainer

  • Added a train function that trains by epochs instead of steps (a rough sketch of the loop is shown after this list)
    - Added an option to display a loss graph (maybe remove, since it requires matplotlib?)
  • Added checkpoint_every_epoch to init so checkpoints can be saved every X epochs.
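For reference, a hedged sketch of what the epoch-based loop looks like, with a toy model, toy data and a generic checkpoint path (the real Trainer uses accelerate, the repo's dataloader and its own checkpoint naming; only checkpoint_every_epoch and the display-graph option come from the PR description):

import torch

model = torch.nn.Linear(3, 1)                         # toy stand-in for the autoencoder
optimizer = torch.optim.Adam(model.parameters(), lr = 1e-3)
dataloader = [torch.randn(4, 3) for _ in range(10)]   # toy batches

num_epochs = 5
checkpoint_every_epoch = 2                            # save every X epochs
display_graph = False                                 # needs matplotlib if enabled
epoch_losses = []

for epoch in range(num_epochs):
    total = 0.
    for batch in dataloader:                          # one full pass over the data per epoch
        loss = model(batch).pow(2).mean()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        total += loss.item()
    epoch_losses.append(total / len(dataloader))

    if (epoch + 1) % checkpoint_every_epoch == 0:
        torch.save(model.state_dict(), f'checkpoint.epoch_{epoch + 1}.pt')

if display_graph:
    import matplotlib.pyplot as plt                   # optional dependency
    plt.plot(epoch_losses)
    plt.xlabel('epoch')
    plt.ylabel('average loss')
    plt.show()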

Data.py

  • Modified custom_collate so it won't pad if the data isn't a tensor (see the sketch after this list)
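A sketch of the intent of that change (assumed shape, not necessarily the repo's exact code): tensor fields are still padded to the longest item in the batch, while anything else, such as a list of text strings, is passed through untouched.

import torch
from torch.nn.utils.rnn import pad_sequence

def custom_collate(batch, pad_id = -1):
    is_dict = isinstance(batch[0], dict)
    if is_dict:
        keys = batch[0].keys()
        batch = [[d[k] for k in keys] for d in batch]

    output = []
    for datum in zip(*batch):
        if torch.is_tensor(datum[0]):
            # variable-length tensors are padded to the longest in the batch
            datum = pad_sequence(datum, batch_first = True, padding_value = pad_id)
        else:
            # non-tensor data (e.g. texts) is simply collected into a list
            datum = list(datum)
        output.append(datum)

    output = tuple(output)
    if is_dict:
        output = dict(zip(keys, output))
    return output

With this, an item like {'vertices': ..., 'faces': ..., 'texts': 'a chair'} batches cleanly, since the collate never tries to pad the string.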

MeshAutoencoder

  • It was crashing if texts was in the args; added a dummy parameter to prevent this.

Setup.py

  • The setup file was missing a comma

@lucidrains
Owner

this is awesome Marcus! will take a look at it tomorrow morning and get it merged!

@adeerAI

adeerAI commented Dec 14, 2023

Awesome work, thanks for contributing your suggestions to main, this makes things easier to understand on the user's side.

@@ -741,6 +741,7 @@ def forward(
vertices: TensorType['b', 'nv', 3, float],
faces: TensorType['b', 'nf', 3, int],
face_edges: Optional[TensorType['b', 'e', 2, int]] = None,
texts: Optional[List[str]] = None,
Owner


ah, so the text is actually only conditioned on at the transformer stage, through cross attention

basically the autoencoder is given the job of only compressing meshes

Contributor Author


Yes, I know :) But if you pass it a dict with texts, it will give an error since the arg doesn't exist.
So then you would need two dataset classes.

Either replace the model(**forward_args) call so it uses the parameters directly:
model(vertices = data["vertices"], faces = data["faces"])

Or just implement a dummy texts arg :) There is probably a better solution (both options are sketched below)
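A minimal sketch of the two options (ToyAutoencoder is a stand-in, not the repo's MeshAutoencoder):

from typing import List, Optional
import torch

class ToyAutoencoder(torch.nn.Module):
    # option 2: accept (and ignore) a dummy texts kwarg, so the same dataset
    # dict can be splatted into both the autoencoder and the transformer
    def forward(self, vertices, faces, face_edges = None, texts: Optional[List[str]] = None):
        return vertices.float().mean()                # placeholder "loss"

data = {
    'vertices': torch.randn(1, 8, 3),
    'faces': torch.randint(0, 8, (1, 6, 3)),
    'texts': ['a chair'],
}

model = ToyAutoencoder()

# option 1: pass only the parameters the autoencoder actually needs
loss = model(vertices = data['vertices'], faces = data['faces'])

# option 2: splat the whole dict; the dummy texts arg absorbs the extra key
loss = model(**data)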

Owner


oh got it! yea, i can take care of that within the trainer class (just scrub out the text and text_embed keys)

Contributor Author


I don't think that will work; I'm not 100% sure, since the dataloader passes the data along and maybe copies it(?).

But it won't work if you access it without copying it, because the dataset returns the data without copying/cloning: when you do del on a key, it removes it completely from the dataset.
So if you train the encoder and then want to train a transformer, you'll need to recreate the dataset since the texts key has been removed.
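A minimal sketch of the concern (toy dataset, not the repo's class): if __getitem__ returns the stored dict by reference, a del inside the trainer mutates the dataset itself, so filtering into a new dict is the safer move.

from torch.utils.data import Dataset

class ToyMeshDataset(Dataset):
    def __init__(self, entries):
        self.entries = entries                        # list of dicts: vertices, faces, texts

    def __len__(self):
        return len(self.entries)

    def __getitem__(self, idx):
        return self.entries[idx]                      # returned by reference, not copied

dataset = ToyMeshDataset([{'vertices': ..., 'faces': ..., 'texts': 'a chair'}])

item = dataset[0]
del item['texts']                                     # also deletes 'texts' from dataset.entries[0]

# safer: build a filtered copy instead of deleting in place
item = dataset[0]
forward_kwargs = {k: v for k, v in item.items() if k not in ('texts', 'text_embeds')}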


I'd prefer it if the dataset returns the text along with each set of vertices and faces.

@@ -367,7 +370,63 @@ def forward(self):
self.wait()

self.print('training complete')
def train(self, num_epochs, diplay_graph = False):
Owner


small typo here diplay



self.print('Training complete')
if diplay_graph:
Owner


so i haven't documented this, but you can already use wandb.ai experiment tracker

you just have to do

trainer = Trainer(..., use_wandb_tracking = True)

with trainer.trackers('meshgpt', 'one-experiment-name'):
  trainer.train()

@MarcusLoppe
Contributor Author

Btw, since I don't really think grad_accum_every is very useful, I removed it from the train function, what is your opinion?

I forgot and left grad_accum_every in the loss calculation, so if it won't be used in the train function it should be removed from:

self.accelerator.backward(loss / self.grad_accum_every)

@lucidrains
Owner

Btw, since I don't really think grad_accum_every is very useful, I removed it from the train function, what is your opinion?

I forgot and left grad_accum_every in the loss calculation, so if it won't be used in the train function it should be removed from:

self.accelerator.backward(loss / self.grad_accum_every)

i'm sure researchers will want to stretch to the next level if this approach pans out (multiple meshes, scenes etc)

probably good to keep it for the gpu poor
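For context, a minimal sketch of gradient accumulation with toy pieces (not the repo's Trainer), showing why the loss is divided by grad_accum_every before backward: the optimizer only steps once every grad_accum_every batches, and the division turns the accumulated gradient into an average, so the update matches one larger batch.

import torch

model = torch.nn.Linear(3, 1)                         # toy stand-ins
optimizer = torch.optim.Adam(model.parameters(), lr = 1e-3)
dataloader = [torch.randn(2, 3) for _ in range(8)]

grad_accum_every = 4                                  # effective batch size = 4x the loader's

for step, batch in enumerate(dataloader):
    loss = model(batch).pow(2).mean()
    (loss / grad_accum_every).backward()              # scale so accumulated grads average out

    if (step + 1) % grad_accum_every == 0:
        optimizer.step()                              # one update per grad_accum_every batches
        optimizer.zero_grad()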

@MarcusLoppe
Contributor Author

Another thing :) I'm not very experienced with GitHub forks, but it seems like the pull request picked up commits made later than when I opened the request.

I made a bit of an error and replaced the entire meshgpt_pytorch.py, since there was some weird stash thing and I wanted to ensure it was up to date. I reverted, but it seems like that stash thing messed it up a bit; please double-check whether this is the case.
