
What's the difference between Consistent Self-Attention and Cross-Frame Attention in Text2Video-Zero? #99

Open
Yushuyang1994 opened this issue May 17, 2024 · 2 comments

Comments

@Yushuyang1994

They seem quite similar; could you please describe the difference between them? Thank you!

@brentjohnston

brentjohnston commented May 17, 2024

I searched the entire repo and code for "Text2Video-Zero"; the video weights have still not been released, and I don't see any code related to text-to-video yet. The dev said in another comment that it's just for comic generation for now. Not sure where you are seeing this?

@Z-YuPeng
Collaborator

Thank you for your attention. Both Consistent Self-Attention and Cross-Frame Attention make use of the key and value from self-attention, an approach that was also introduced in Imagen. However, the subjects and purposes of their self-attention operations differ. Cross-Frame Attention is applied to video generation models and uses the first frame as a reference image, while Consistent Self-Attention is built on image generation models and leverages tokens sampled from the various character images to let character features interact, thus ensuring character consistency. We will update our paper to make this distinction clearer to readers.
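
For readers who find code easier to compare, here is a minimal sketch contrasting the two mechanisms as described above. This is not the repository's actual implementation; the tensor shapes, the sampling ratio, and the function names are illustrative assumptions.

```python
import torch
import torch.nn.functional as F


def cross_frame_attention(q, k, v):
    """Cross-Frame Attention (Text2Video-Zero style): every frame's queries
    attend to the key/value of the FIRST frame, which serves as a reference.

    q, k, v: (num_frames, num_tokens, dim)
    """
    f, n, d = k.shape
    k_ref = k[:1].expand(f, n, d).contiguous()  # reuse frame 0's keys for all frames
    v_ref = v[:1].expand(f, n, d).contiguous()  # reuse frame 0's values for all frames
    return F.scaled_dot_product_attention(q, k_ref, v_ref)


def consistent_self_attention(q, k, v, sample_ratio=0.5):
    """Consistent Self-Attention (StoryDiffusion style, simplified): each
    image's queries attend to its own tokens PLUS tokens randomly sampled
    from the other character images in the batch, so character features
    interact across images.

    q, k, v: (batch_of_images, num_tokens, dim)
    sample_ratio: illustrative fraction of cross-image tokens to sample.
    """
    b, n, d = k.shape
    num_sampled = int(n * sample_ratio)
    outputs = []
    for i in range(b):
        # pool the tokens of all other images in the batch
        others_k = torch.cat([k[j] for j in range(b) if j != i], dim=0)
        others_v = torch.cat([v[j] for j in range(b) if j != i], dim=0)
        idx = torch.randperm(others_k.shape[0])[:num_sampled]
        # augment this image's own key/value with the sampled tokens
        k_aug = torch.cat([k[i], others_k[idx]], dim=0).unsqueeze(0)
        v_aug = torch.cat([v[i], others_v[idx]], dim=0).unsqueeze(0)
        outputs.append(F.scaled_dot_product_attention(q[i:i + 1], k_aug, v_aug))
    return torch.cat(outputs, dim=0)


if __name__ == "__main__":
    q = torch.randn(4, 16, 64)  # 4 frames/images, 16 tokens each, dim 64
    k = torch.randn(4, 16, 64)
    v = torch.randn(4, 16, 64)
    print(cross_frame_attention(q, k, v).shape)      # torch.Size([4, 16, 64])
    print(consistent_self_attention(q, k, v).shape)  # torch.Size([4, 16, 64])
```

The key contrast: the cross-frame variant replaces every frame's key/value with those of a single reference frame, while the consistent variant keeps each image's own key/value and augments them with tokens drawn from the other images in the batch.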
