consider a default behavior of reading mkv tags after the first cluster if indexed in the seekhead #252

Open
dericed opened this issue Apr 11, 2024 · 3 comments

Comments


dericed commented Apr 11, 2024

I understand that, perhaps for speed, exiftool stops parsing at the first Cluster; however, since mkv is a fairly editable format, chapters and tags are often found at the end of the file.

Here are two samples:
tags_after_clusters.mkv.zip
tags_before_clusters.mkv.zip

With exiftool tags_before_clusters.mkv I can see the embedded metadata:

Tag 1                           : value1
Tag 2                           : value2
Willtest                        : in_exiftool

but not with exiftool tags_after_clusters.mkv. I've been using -ee as a workaround, but I think the Tags element is pretty critical to a Matroska file. If its position is given by the SeekHead, so that a parser doesn't need a long search to find it, I'd suggest including it in the preliminary parse by default.
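To make the suggestion concrete, here is a minimal Python sketch of how a parser could look up the Tags position in a SeekHead payload. It is not how exiftool is implemented; the function names (read_id, read_size, children, seek_positions) are made up for illustration, the element IDs are the ones defined by the Matroska specification, and unknown-size elements and error recovery are left out.

```python
import io

# Element IDs from the Matroska specification.
SEEK = 0x4DBB
SEEK_ID = 0x53AB
SEEK_POSITION = 0x53AC
TAGS = 0x1254C367


def read_id(buf):
    """Read an EBML element ID (1-4 bytes, length marker bits kept)."""
    first = buf.read(1)
    if not first:
        return None  # end of data
    length, mask = 1, 0x80
    while length <= 4 and not (first[0] & mask):
        mask >>= 1
        length += 1
    if length > 4:
        raise ValueError("invalid element ID")
    return int.from_bytes(first + buf.read(length - 1), "big")


def read_size(buf):
    """Read an EBML data size (1-8 bytes, length marker bit stripped)."""
    first = buf.read(1)
    if not first:
        raise ValueError("truncated element")
    length, mask = 1, 0x80
    while length <= 8 and not (first[0] & mask):
        mask >>= 1
        length += 1
    if length > 8:
        raise ValueError("invalid data size")
    value = first[0] & (mask - 1)
    for byte in buf.read(length - 1):
        value = (value << 8) | byte
    return value


def children(data):
    """Yield (element_id, payload) for each child element in data."""
    buf = io.BytesIO(data)
    while True:
        element_id = read_id(buf)
        if element_id is None:
            return
        size = read_size(buf)
        yield element_id, buf.read(size)


def seek_positions(seek_head_payload):
    """Map each indexed element ID to its SeekPosition."""
    index = {}
    for element_id, payload in children(seek_head_payload):
        if element_id != SEEK:
            continue
        target = position = None
        for child_id, child in children(payload):
            if child_id == SEEK_ID:
                target = int.from_bytes(child, "big")
            elif child_id == SEEK_POSITION:
                position = int.from_bytes(child, "big")
        if target is not None and position is not None:
            index[target] = position
    return index
```

A SeekPosition is relative to the first byte of the Segment's data, so the absolute file offset of the Tags element would be that start offset plus seek_positions(payload)[TAGS]. A parser could seek straight there instead of scanning the Clusters.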

Thanks


MasterInQuestion commented Apr 18, 2024

    A similar problem can occur with the MP4 "moov" atom.
    For archival purposes, the Matroska Cues and similar elements should be placed at the beginning of the file.
    This can be done with the "-cues_to_front 1" option of `ffmpeg` ("-movflags faststart" for MP4/MOV).
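    As a rough illustration of the two options above, the Python sketch below builds the ffmpeg argument list for a stream-copy remux. The helper name remux_command is hypothetical, "-c copy" is an assumption added here to avoid re-encoding, and the sketch only constructs the command; it does not run ffmpeg. Note that "-cues_to_front" concerns the Cues index, so it should not be assumed to relocate Tags.

```python
from pathlib import Path


def remux_command(src, dst):
    """Build an ffmpeg command that copies streams and front-loads the index."""
    cmd = ["ffmpeg", "-i", str(src), "-c", "copy"]
    suffix = Path(dst).suffix.lower()
    if suffix == ".mkv":
        # Matroska muxer option: write the Cues before the Clusters.
        cmd += ["-cues_to_front", "1"]
    elif suffix in {".mp4", ".mov"}:
        # MOV/MP4 muxer option: move the moov atom to the start of the file.
        cmd += ["-movflags", "faststart"]
    else:
        raise ValueError(f"unsupported output container: {suffix}")
    cmd.append(str(dst))
    return cmd
```

    The resulting list can be passed to subprocess.run(cmd, check=True) on a system where ffmpeg is installed.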

    See also:
    https://github.com/MasterInQuestion/talk/discussions/3
    https://bugzilla.mozilla.org/show_bug.cgi?id=1892185#c0


dericed commented Apr 19, 2024

Hi @MasterInQuestion, I agree that storing the tags at the front has advantages. I've been working with a lot of large files whose metadata gets edited. When the tags become larger than the space available, editors void the Tags element at the beginning and rewrite it at the end (that's a lot faster than rewriting the entire file). So most Matroska files that have been edited will have metadata (or chapters, attachments, etc.) at the end. To partly work around this, I may reserve more header space at the beginning of the mkv by default, to reduce the need to move edited tags to the end.
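For readers unfamiliar with "voiding", here is a small Python sketch of the idea, under the assumption that the editor overwrites the old element with an EBML Void element (ID 0xEC) of exactly the same total length. The helper name void_element is made up for illustration and is not taken from any particular editor; the same kind of element can serve as reserved header space.

```python
VOID_ID = b"\xEC"  # EBML Void element ID


def void_element(total_len):
    """Return a Void element occupying exactly total_len bytes."""
    if total_len < 2:
        raise ValueError("a Void element needs at least 2 bytes")
    if total_len < 9:
        # 1-byte ID + 1-byte size + payload
        payload_len = total_len - 2
        size_field = bytes([0x80 | payload_len])
    else:
        # 1-byte ID + 8-byte size + payload
        payload_len = total_len - 9
        size_field = ((1 << 56) | payload_len).to_bytes(8, "big")
    return VOID_ID + size_field + bytes(payload_len)
```

Overwriting an 800-byte Tags element in place would then use void_element(800), leaving every later offset in the file unchanged.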

@MasterInQuestion

    At the low level of the storage itself, there is perhaps not much difference (implementation-dependent). [1]

    [1] Notably for transistor-based storage (i.e. SSDs and the like), the logical data layout may differ greatly from the actual one.

    Personally, I tend to avoid metadata altogether (I consider it hardly interoperable, and often merely a side channel for privacy leaks).
    What kind of use do you have for such metadata, if I may ask?
