{"payload":{"feedbackUrl":"https://github.com/orgs/community/discussions/53140","repo":{"id":499830255,"defaultBranch":"main","name":"open-metric-learning","ownerLogin":"OML-Team","currentUserCanPush":false,"isFork":false,"isEmpty":false,"createdAt":"2022-06-04T13:12:25.000Z","ownerAvatar":"https://avatars.githubusercontent.com/u/104944039?v=4","public":true,"private":false,"isOrgOwned":true},"refInfo":{"name":"","listCacheKey":"v0:1718122707.0","currentOid":""},"activityList":{"items":[{"before":"ae372886ffb8d18cdd6a754acd526c56797a24b9","after":"f280613b875bb74e60845f6f3d81fe1104d7f83a","ref":"refs/heads/main","pushedAt":"2024-06-12T02:21:36.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"AlekseySh","name":"Aleksei Shabanov","path":"/AlekseySh","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/20143575?s=80&v=4"},"commit":{"message":"updated link in readme","shortMessageHtmlLink":"updated link in readme"}},{"before":"0f3e49d73370257760a232616d3c837865f2954a","after":"ae372886ffb8d18cdd6a754acd526c56797a24b9","ref":"refs/heads/main","pushedAt":"2024-06-11T16:24:24.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"AlekseySh","name":"Aleksei Shabanov","path":"/AlekseySh","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/20143575?s=80&v=4"},"commit":{"message":"fixed linters, minor","shortMessageHtmlLink":"fixed linters, minor"}},{"before":"b79617f44579188b02d827ee09e9f25641af8e94","after":"0f3e49d73370257760a232616d3c837865f2954a","ref":"refs/heads/main","pushedAt":"2024-06-11T16:20:58.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"AlekseySh","name":"Aleksei Shabanov","path":"/AlekseySh","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/20143575?s=80&v=4"},"commit":{"message":"Add ONNX info to the FAQ\n\nAdd ONNX info to the FAQ","shortMessageHtmlLink":"Add ONNX info to the FAQ"}},{"before":"b500081b7ec0bb68ef2388fa6a29381babe9290a","after":null,"ref":"refs/heads/docs","pushedAt":"2024-06-11T16:18:27.000Z","pushType":"branch_deletion","commitsCount":0,"pusher":{"login":"AlekseySh","name":"Aleksei Shabanov","path":"/AlekseySh","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/20143575?s=80&v=4"}},{"before":"05bbd7b9e0e05483db8b19855b3f66f992cb77fa","after":"b79617f44579188b02d827ee09e9f25641af8e94","ref":"refs/heads/main","pushedAt":"2024-06-11T16:18:23.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"AlekseySh","name":"Aleksei Shabanov","path":"/AlekseySh","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/20143575?s=80&v=4"},"commit":{"message":"Minor docs updated","shortMessageHtmlLink":"Minor docs updated"}},{"before":null,"after":"b500081b7ec0bb68ef2388fa6a29381babe9290a","ref":"refs/heads/docs","pushedAt":"2024-06-11T16:12:30.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"AlekseySh","name":"Aleksei Shabanov","path":"/AlekseySh","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/20143575?s=80&v=4"},"commit":{"message":"updated_docs","shortMessageHtmlLink":"updated_docs"}},{"before":"0334705affd486faec6a9d0e06a6f4993c5e0c66","after":"05bbd7b9e0e05483db8b19855b3f66f992cb77fa","ref":"refs/heads/main","pushedAt":"2024-06-09T21:22:28.000Z","pushType":"push","commitsCount":2,"pusher":{"login":"AlekseySh","name":"Aleksei Shabanov","path":"/AlekseySh","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/20143575?s=80&v=4"},"commit":{"message":"release 3.1.0: supported separated queries and galleries in RetrievalResults","shortMessageHtmlLink":"release 3.1.0: supported separated queries and galleries in Retrieval…"}},{"before":"51f87923e0102d981e6f49f8335a311cd23a244b","after":null,"ref":"refs/heads/separated_qg","pushedAt":"2024-06-09T20:44:05.000Z","pushType":"branch_deletion","commitsCount":0,"pusher":{"login":"AlekseySh","name":"Aleksei Shabanov","path":"/AlekseySh","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/20143575?s=80&v=4"}},{"before":"c8f3f923584c18b03e9451c4bd166be1cba24f87","after":"0334705affd486faec6a9d0e06a6f4993c5e0c66","ref":"refs/heads/main","pushedAt":"2024-06-09T20:43:59.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"AlekseySh","name":"Aleksei Shabanov","path":"/AlekseySh","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/20143575?s=80&v=4"},"commit":{"message":"Supporting separated queries and galleries\n\nChangelog:\r\n* Added `batched_knn_qg`. So `batched_knn` is just an adapter for `batched_knn_qg`.\r\n* Added a few methods to `RetrievalResults`: `from_embeddings_qg`, `visuaze_qg`, `visualize_with_functions`. Also added these methods to docs.\r\n* Added an example to Readme and docs where new methods are shown.\r\n* Updated tests: added text modality to `test_retrieval_results`, added new `test_retrieval_results_separated_qg`","shortMessageHtmlLink":"Supporting separated queries and galleries"}},{"before":"6b46ae846b2de377846921a17a7ff3c76b194b9e","after":"51f87923e0102d981e6f49f8335a311cd23a244b","ref":"refs/heads/separated_qg","pushedAt":"2024-06-09T20:42:54.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"AlekseySh","name":"Aleksei Shabanov","path":"/AlekseySh","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/20143575?s=80&v=4"},"commit":{"message":"minor","shortMessageHtmlLink":"minor"}},{"before":null,"after":"8309c5a62ea77c21135eeeced983f6462957905e","ref":"refs/heads/add_dataset_to_config_api","pushedAt":"2024-06-09T15:59:23.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"leoromanovich","name":"Verkhovtsev Leonid","path":"/leoromanovich","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/15211391?s=80&v=4"},"commit":{"message":"simple example (tmp, just approach testing)","shortMessageHtmlLink":"simple example (tmp, just approach testing)"}},{"before":"6fadd4d639ec4318034511cab1e804a85b658289","after":"6b46ae846b2de377846921a17a7ff3c76b194b9e","ref":"refs/heads/separated_qg","pushedAt":"2024-06-09T04:20:31.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"AlekseySh","name":"Aleksei Shabanov","path":"/AlekseySh","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/20143575?s=80&v=4"},"commit":{"message":"upd","shortMessageHtmlLink":"upd"}},{"before":"795b36370d0af43cb86fcbfdd1208bd30797ea9b","after":"6fadd4d639ec4318034511cab1e804a85b658289","ref":"refs/heads/separated_qg","pushedAt":"2024-06-09T04:10:45.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"AlekseySh","name":"Aleksei Shabanov","path":"/AlekseySh","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/20143575?s=80&v=4"},"commit":{"message":"upd","shortMessageHtmlLink":"upd"}},{"before":"037bcced8e35ae6d546fbb260fd268ad4227df69","after":"795b36370d0af43cb86fcbfdd1208bd30797ea9b","ref":"refs/heads/separated_qg","pushedAt":"2024-06-09T04:01:17.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"AlekseySh","name":"Aleksei Shabanov","path":"/AlekseySh","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/20143575?s=80&v=4"},"commit":{"message":"upd","shortMessageHtmlLink":"upd"}},{"before":"2f181060e6be250c3a4427dab1a7a6cdf8322da1","after":"037bcced8e35ae6d546fbb260fd268ad4227df69","ref":"refs/heads/separated_qg","pushedAt":"2024-06-09T03:54:33.000Z","pushType":"push","commitsCount":2,"pusher":{"login":"AlekseySh","name":"Aleksei Shabanov","path":"/AlekseySh","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/20143575?s=80&v=4"},"commit":{"message":"upd","shortMessageHtmlLink":"upd"}},{"before":"c1e746fc337fe01174d093f80da6e3f64d34f07c","after":"2f181060e6be250c3a4427dab1a7a6cdf8322da1","ref":"refs/heads/separated_qg","pushedAt":"2024-06-09T03:50:40.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"AlekseySh","name":"Aleksei Shabanov","path":"/AlekseySh","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/20143575?s=80&v=4"},"commit":{"message":"upd","shortMessageHtmlLink":"upd"}},{"before":null,"after":"c1e746fc337fe01174d093f80da6e3f64d34f07c","ref":"refs/heads/separated_qg","pushedAt":"2024-06-09T02:26:18.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"AlekseySh","name":"Aleksei Shabanov","path":"/AlekseySh","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/20143575?s=80&v=4"},"commit":{"message":"upd","shortMessageHtmlLink":"upd"}},{"before":"3f9177a14d00ea796e6375d90399ff5b9f3c3276","after":null,"ref":"refs/heads/oml_3.0_release","pushedAt":"2024-06-07T12:29:53.000Z","pushType":"branch_deletion","commitsCount":0,"pusher":{"login":"AlekseySh","name":"Aleksei Shabanov","path":"/AlekseySh","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/20143575?s=80&v=4"}},{"before":"40249f267f771a79ffbbbd39e3b59d9cdc92318b","after":null,"ref":"refs/heads/polishing","pushedAt":"2024-06-07T12:29:52.000Z","pushType":"branch_deletion","commitsCount":0,"pusher":{"login":"AlekseySh","name":"Aleksei Shabanov","path":"/AlekseySh","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/20143575?s=80&v=4"}},{"before":"cd11f9c68b9f901a117cf99bc9bc83a907fd39ba","after":null,"ref":"refs/heads/readme","pushedAt":"2024-06-07T12:29:51.000Z","pushType":"branch_deletion","commitsCount":0,"pusher":{"login":"AlekseySh","name":"Aleksei Shabanov","path":"/AlekseySh","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/20143575?s=80&v=4"}},{"before":"f76e710e8ea4a1a1ed4a8134d9e44dababebb346","after":null,"ref":"refs/heads/ci","pushedAt":"2024-06-07T12:29:49.000Z","pushType":"branch_deletion","commitsCount":0,"pusher":{"login":"AlekseySh","name":"Aleksei Shabanov","path":"/AlekseySh","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/20143575?s=80&v=4"}},{"before":"1a53c4899738872e2e1219db79efb42ae9a19993","after":"c8f3f923584c18b03e9451c4bd166be1cba24f87","ref":"refs/heads/main","pushedAt":"2024-06-07T12:29:11.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"AlekseySh","name":"Aleksei Shabanov","path":"/AlekseySh","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/20143575?s=80&v=4"},"commit":{"message":"release 3.0: Texts support, modality agnostic classes and functions, RetrievalResults, Docs and others\n\n=================================\r\nIntroducing container for storing Retrieval Results and memory optimized kNN\r\n\r\n=================================\r\nRework retrieval metrics\r\nChangelog:\r\n\r\n- Separated map, precision, cmc and fnmr and pfc metrics (there is no more single function that computes all of them)\r\n- Changed signature of the calc_retrieval_metrics (it works with the top k closest items instead of the full gallery)\r\n- Removed repeated calculations in EmbeddingMetrics when slicing over categories\r\n\r\n==========================================\r\n\r\n==============================================================================\r\n> Made inference modality agnostic\r\n\r\n**Changelog**\r\n\r\nAll the functions and classes on the right side are modality agnostic:\r\n* `EmbeddingPairsDataset`, `ImagePairsDataset` -> `PairDataset`\r\n* `pairwise_inference_on_images`,  `pairwise_inference_on_embeddings` -> `pairwise_inference`\r\n* `IDistancesPostprocessor` ->  (mostly renamed) -> `IRetrievalPostprocessor`\r\n* `PairwisePostprocessor`, `PairwiseEmbeddingsPostprocessor`, `PairwiseImagesPostprocessor` ->  `PairwiseReranker`\r\n* `inference_on_images` -> `inference`\r\n* `inference_on_dataframe` -> `inference_cached`\r\n\r\nAlso: \r\n* `EmbeddingMetrics` takes optional `dataset` argument in order to perform postprocessing. \r\n* Made postprocessing tests a bit more informative via making dummy models a bit less trivial (added bias to their outputs)\r\n\r\nExamples changed:\r\n* `train + val` and `prediction` for postprocessor\r\n* `retrieval usage`\r\n* added `global_paths` parameter to `download_mock_dataset` so it looks nicer\r\n\r\n==============================================================\r\n> Integrated the previous changes with RetrievalResults class. Removed keys. Changed signature of EmbeddingMetrics.\r\n\r\n**CHANGELOG**\r\n\r\n* removed keys: `IS_QUERY_KEY`, `IS_GALLERY_KEY`, `CATEGORIES_KEY`, `PATHS_KEY`, `X1_KEY`, `X2_KEY`, `Y1_KEY`, `Y2_KEY`, `SEQUENCE_KEY`. Categories and sequences are passed through `extra_data` instead. The rest is incapsulated in Dataset.\r\n* Removed `IMetricDDP`, `EmbeddingMetricsDDP`. Reason: having `EmbeddingMetrics` is enough, because we do accumulator sync there anyway.\r\n* Changed signatures of `EmbeddingMetrics`: keys replaces by providing dataset, removed `.sync()` and `.visualisation()` methods and so on.\r\n* Updated `.md` examples and `.rst` docs\r\n\r\nMinor:\r\n* removed: `calc_distance_matrix`, `validate_dataset`  -- this logic happens in `RetrievalResults`, also removed `find_first_occurrences` -- we have `unique_by_ids` instead.\r\n* removed `DummyDataset` in tests (used `EmbeddingsQueryGalleryLabeledDataset` instead).\r\n\r\n==============================================================\r\n> Removed MetricValCallbackDDP and samples_in_getitem \r\n\r\n* `MetricValCallback` is enough to handle DDP\r\n* `samples_in_getitem` is not used\r\n\r\n* Removed visualization.ipynb\r\n\r\n==============================================================\r\n> RetrievalResults as sequence of tensors\r\n\r\n**CHANGELOG**\r\n\r\n* `RetrievalResults` uses Sequence of Tensors which may have different size. In other words, it allows us to support the case when queries have different number of retrieved items.\r\n* Consequently, changed `batched_knn`, `retrieval_metrics` and `PairwiseReranker` to support new input type.\r\n* Added assert that distances arrive sorted to `RetrievalResults`, retrieved ids are unique and other checks. \r\n\r\nNew tests:\r\n* Added tests on corner cases for `RetrievalResults` creation.\r\n* Added tests on visualization when queries in `RetrievalResults` have different number of retrieved items.\r\n* Added new test with predefined values for `batched_knn` to make debugging easier.\r\n* Changed existing postprocessor tests: used `sequence` in datasets so queries have different number of retrieved items and we actually test new functionality.\r\n\r\n@leoromanovich and I also checked that using Sequence of Tensors doesn't lead to poor performance on validation.\r\n\r\n==============================================================\r\n> Metrics: categories, empty predictions, tests, fnmr@fmr\r\n\r\n**CHANGELOG**\r\n\r\n* Added support of empty predictions to the retrieval metrics. For example, it may be useful when we cut retrieval results by distance threshold).\r\n* Moved categories handling to functional metrics from `EmbeddingMetrics` class, also updated `.md` example to show how to deal with categories.\r\n* Added `calc_fnmr_at_fmr_rr`, removed `extract_pos_neg_dists` and `calc_fnmr_at_fmr_from_matrices`. Returned `fnmr@fmr` to `EmbeddingMetrics` (there was a todo).\r\n\r\nTESTS\r\n* Moved tests that use old formats of retrieval metrics to a separate folder: `...test_metrics/test_outdated/...`.\r\n* Added a few new tests on retrieval metrics: test handling categories and empty predictions.\r\n* Added test on `calc_fnmr_at_fmr_rr`.\r\n* Added test that `EmbeddingMetrics` calculate all expected metrics.\r\n\r\n==============================================================\r\n> Moved outdated matrix functions to tests (mask_gt, mask_to_ignore and so on)\r\n\r\n==============================================================\r\n> Misc small changes for OML 3.0\r\n\r\n**CHANGLELOG**\r\n\r\n* Simplified handling nans in bboxes\r\n* Improved typings in training pipelines \r\n* Added show argument to `RetrievalResults.visualise()`\r\n* Specified reason for skipping cloud logging tests\r\n* Polished md examples\r\n* Added `mode_for_checkpointing` to Pipeline config\r\n* Added verbose parameter.\r\n* changed examples optimizer to Adam\r\n\r\n==============================================================\r\n> Added texst support\r\n\r\n* Added `TextBaseDataset`, `TextLabeledDataset`, `TextQueryGalleryLabeledDataset`, `TextQueryGalleryDataset`, `get_mock_texts_dataset`, `visualise_text`\r\n* Added `HFWrapper` to wrap models from HuggingFace library\r\n* `download_mock_dataset` -> `download_mock_dataset` (the original name is also kept for back compatibility)\r\n\r\n==============================================================\r\n> Docs, Readme and examples for OML 3.0 (#570)\r\n\r\nGeneral\r\n* Made imports shorter in all examples (updated the corresponding `__init__.py` files and `__all__` variables)\r\n* `train.md`, `val_md` -> `train_val_img_txt.md`\r\n* Joined example of using pre-trained image models, pre-trained HF text models (just added), and zoo table (moved) into one file.\r\n* Removed links to colab notebooks for all examples except for the `train_val_img_txt.md` code snippet.\r\n* Updated dataset format description: added info about `text` column.\r\n* Updated mock dataset of texts so we have more data in train.\r\n* Added handling categories example (train + val)\r\n\r\nRenaming:\r\n* `download_mock_dataset` got a short link `get_mock_images_dataset` so it looks similar to `get_mock_texts_dataset`\r\n* `RetrievalResults.compute_from_embeddings(embeddings, dataset, n_items_to_retrieve=5)` -> `RetrievalResults.from_embeddings(embeddings, dataset, n_items=5)`\r\n\r\nREADME:\r\n* Added release notes for OML 3.0\r\n* Added side-by-side example of training and validation text and image models\r\n* Added `OML Features` section\r\n* Zoo section is updated\r\n* Updated FAQ, moved to Documentation section\r\n\r\nReadTheDocs:\r\n* Added new text Datasets to docs\r\n* Added `OML Features` section to the home page \r\n* Moved getters for mock datasets from utils to Datasets page\r\n* Python examples: hide most of the examples under details. Added an example of handling categories.\r\n* Updated the page about logging.\r\n* Split post-processing section into re-ranking by model (the old content of post-processing) and algo post-processing (just a page holder for the moment).\r\n* Removed `zoo` section from post-processing by model.\r\n\r\n==============================================================\r\n> Misc improvements for OML 3.0\r\n\r\n* Updated `check_retrieval_format` so it works with text dataset.\r\n* Added code example of usage `check_retrieval_format` (+ the corresponding test)\r\n* Made `last_logs` property and added docs for it (for triplet & arcface losses, bank miner)\r\n* Made `distances`, `retrieved_ids`, `gt_ids` documented properties of `RetrievalResults`.\r\n\r\n==============================================================\r\n> Adaptive and constant thresholding as postprocessors \r\n\r\nALGO POST-PROCESSING\r\n\r\n* Added `AdaptiveThresholding`, `ConstantThresholding`:\r\n    * classes implementation\r\n    * updated registry and configs\r\n    * readthedocs: contents and algo postprocessing page with a new example\r\n    * pytests and pipelines test\r\n    * added thresholding to a few existing python examples\r\n\r\n==============================================================\r\n> MISC\r\n\r\n* Added `is_empty` and `deepcopy` methods to `RetrievalResults`, also updated readthedocs\r\n* Removed `top_n` from `IRetrievalPostprocessor` interface\r\n* Updated code so it can work with the old NN postprocessing and new algorithmic ones. Added todo so we refactor it in future after we have more postprocessors.\r\n* Added drawing test for empty `RetrievalResults`\r\n* Used mock text dataset in categories example instead of the image one (because it has bigger categories)\r\n* Fixed categories handling in pcf metric\r\n* Added docs to calculating metrics by rr\r\n* Added verbose argument\r\n* fixed text visualisation and linters; temprorary turned on full CI on branch\r\n* made sorting check more robust\r\n* updated tolerance when concat distances\r\n\r\n===================================\r\nLast small PRS:\r\n\r\n> Updated links in docs\r\n> Updated main examples cover notes\r\n> Restored normal CI logic","shortMessageHtmlLink":"release 3.0: Texts support, modality agnostic classes and functions, …"}},{"before":"49c04c19ddba4b0c42d39248e547cf45bdd74599","after":"1a53c4899738872e2e1219db79efb42ae9a19993","ref":"refs/heads/main","pushedAt":"2024-06-07T12:22:55.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"AlekseySh","name":"Aleksei Shabanov","path":"/AlekseySh","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/20143575?s=80&v=4"},"commit":{"message":"Restored normal CI logic\n\nRestored normal CI logic","shortMessageHtmlLink":"Restored normal CI logic"}},{"before":null,"after":"f76e710e8ea4a1a1ed4a8134d9e44dababebb346","ref":"refs/heads/ci","pushedAt":"2024-06-07T12:22:15.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"AlekseySh","name":"Aleksei Shabanov","path":"/AlekseySh","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/20143575?s=80&v=4"},"commit":{"message":"upd","shortMessageHtmlLink":"upd"}},{"before":"abd3c814f1743549c4ec42ef309609b620171d57","after":"49c04c19ddba4b0c42d39248e547cf45bdd74599","ref":"refs/heads/main","pushedAt":"2024-06-07T12:20:17.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"AlekseySh","name":"Aleksei Shabanov","path":"/AlekseySh","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/20143575?s=80&v=4"},"commit":{"message":"Updated main examples cover notes\n\nUpdated main examples cover notes","shortMessageHtmlLink":"Updated main examples cover notes"}},{"before":null,"after":"cd11f9c68b9f901a117cf99bc9bc83a907fd39ba","ref":"refs/heads/readme","pushedAt":"2024-06-07T12:19:09.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"AlekseySh","name":"Aleksei Shabanov","path":"/AlekseySh","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/20143575?s=80&v=4"},"commit":{"message":"upd","shortMessageHtmlLink":"upd"}},{"before":"ca114d6164cf87f4ae68a39665e5e964bf10f793","after":"abd3c814f1743549c4ec42ef309609b620171d57","ref":"refs/heads/main","pushedAt":"2024-06-07T11:55:20.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"AlekseySh","name":"Aleksei Shabanov","path":"/AlekseySh","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/20143575?s=80&v=4"},"commit":{"message":"Updated links in docs \n\nUpdated links in docs","shortMessageHtmlLink":"Updated links in docs"}},{"before":null,"after":"40249f267f771a79ffbbbd39e3b59d9cdc92318b","ref":"refs/heads/polishing","pushedAt":"2024-06-07T11:52:48.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"AlekseySh","name":"Aleksei Shabanov","path":"/AlekseySh","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/20143575?s=80&v=4"},"commit":{"message":"updated links","shortMessageHtmlLink":"updated links"}},{"before":"70230cad2ed48ea920d136b19f07d88b1517c23e","after":"ca114d6164cf87f4ae68a39665e5e964bf10f793","ref":"refs/heads/main","pushedAt":"2024-06-07T11:08:28.000Z","pushType":"pr_merge","commitsCount":1,"pusher":{"login":"AlekseySh","name":"Aleksei Shabanov","path":"/AlekseySh","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/20143575?s=80&v=4"},"commit":{"message":"Changes for OML 3.0: modality agnostic, RetrievalResults, TextSupport, reoworked docs and others\n\n==============================================================================\r\n> Made inference modality agnostic\r\n\r\n**Changelog**\r\n\r\nAll the functions and classes on the right side are modality agnostic:\r\n* `EmbeddingPairsDataset`, `ImagePairsDataset` -> `PairDataset`\r\n* `pairwise_inference_on_images`,  `pairwise_inference_on_embeddings` -> `pairwise_inference`\r\n* `IDistancesPostprocessor` ->  (mostly renamed) -> `IRetrievalPostprocessor`\r\n* `PairwisePostprocessor`, `PairwiseEmbeddingsPostprocessor`, `PairwiseImagesPostprocessor` ->  `PairwiseReranker`\r\n* `inference_on_images` -> `inference`\r\n* `inference_on_dataframe` -> `inference_cached`\r\n\r\nAlso: \r\n* `EmbeddingMetrics` takes optional `dataset` argument in order to perform postprocessing. \r\n* Made postprocessing tests a bit more informative via making dummy models a bit less trivial (added bias to their outputs)\r\n\r\nExamples changed:\r\n* `train + val` and `prediction` for postprocessor\r\n* `retrieval usage`\r\n* added `global_paths` parameter to `download_mock_dataset` so it looks nicer\r\n\r\n==============================================================\r\n> Integrated the previous changes with RetrievalResults class. Removed keys. Changed signature of EmbeddingMetrics.\r\n\r\n**CHANGELOG**\r\n\r\n* removed keys: `IS_QUERY_KEY`, `IS_GALLERY_KEY`, `CATEGORIES_KEY`, `PATHS_KEY`, `X1_KEY`, `X2_KEY`, `Y1_KEY`, `Y2_KEY`, `SEQUENCE_KEY`. Categories and sequences are passed through `extra_data` instead. The rest is incapsulated in Dataset.\r\n* Removed `IMetricDDP`, `EmbeddingMetricsDDP`. Reason: having `EmbeddingMetrics` is enough, because we do accumulator sync there anyway.\r\n* Changed signatures of `EmbeddingMetrics`: keys replaces by providing dataset, removed `.sync()` and `.visualisation()` methods and so on.\r\n* Updated `.md` examples and `.rst` docs\r\n\r\nMinor:\r\n* removed: `calc_distance_matrix`, `validate_dataset`  -- this logic happens in `RetrievalResults`, also removed `find_first_occurrences` -- we have `unique_by_ids` instead.\r\n* removed `DummyDataset` in tests (used `EmbeddingsQueryGalleryLabeledDataset` instead).\r\n\r\n==============================================================\r\n> Removed MetricValCallbackDDP and samples_in_getitem \r\n\r\n* `MetricValCallback` is enough to handle DDP\r\n* `samples_in_getitem` is not used\r\n\r\n* Removed visualization.ipynb\r\n\r\n==============================================================\r\n> RetrievalResults as sequence of tensors\r\n\r\n**CHANGELOG**\r\n\r\n* `RetrievalResults` uses Sequence of Tensors which may have different size. In other words, it allows us to support the case when queries have different number of retrieved items.\r\n* Consequently, changed `batched_knn`, `retrieval_metrics` and `PairwiseReranker` to support new input type.\r\n* Added assert that distances arrive sorted to `RetrievalResults`, retrieved ids are unique and other checks. \r\n\r\nNew tests:\r\n* Added tests on corner cases for `RetrievalResults` creation.\r\n* Added tests on visualization when queries in `RetrievalResults` have different number of retrieved items.\r\n* Added new test with predefined values for `batched_knn` to make debugging easier.\r\n* Changed existing postprocessor tests: used `sequence` in datasets so queries have different number of retrieved items and we actually test new functionality.\r\n\r\n@leoromanovich and I also checked that using Sequence of Tensors doesn't lead to poor performance on validation.\r\n\r\n==============================================================\r\n> Metrics: categories, empty predictions, tests, fnmr@fmr\r\n\r\n**CHANGELOG**\r\n\r\n* Added support of empty predictions to the retrieval metrics. For example, it may be useful when we cut retrieval results by distance threshold).\r\n* Moved categories handling to functional metrics from `EmbeddingMetrics` class, also updated `.md` example to show how to deal with categories.\r\n* Added `calc_fnmr_at_fmr_rr`, removed `extract_pos_neg_dists` and `calc_fnmr_at_fmr_from_matrices`. Returned `fnmr@fmr` to `EmbeddingMetrics` (there was a todo).\r\n\r\nTESTS\r\n* Moved tests that use old formats of retrieval metrics to a separate folder: `...test_metrics/test_outdated/...`.\r\n* Added a few new tests on retrieval metrics: test handling categories and empty predictions.\r\n* Added test on `calc_fnmr_at_fmr_rr`.\r\n* Added test that `EmbeddingMetrics` calculate all expected metrics.\r\n\r\n==============================================================\r\n> Moved outdated matrix functions to tests (mask_gt, mask_to_ignore and so on)\r\n\r\n==============================================================\r\n> Misc small changes for OML 3.0\r\n\r\n**CHANGLELOG**\r\n\r\n* Simplified handling nans in bboxes\r\n* Improved typings in training pipelines \r\n* Added show argument to `RetrievalResults.visualise()`\r\n* Specified reason for skipping cloud logging tests\r\n* Polished md examples\r\n* Added `mode_for_checkpointing` to Pipeline config\r\n* Added verbose parameter.\r\n* changed examples optimizer to Adam\r\n\r\n==============================================================\r\n> Added texst support\r\n\r\n* Added `TextBaseDataset`, `TextLabeledDataset`, `TextQueryGalleryLabeledDataset`, `TextQueryGalleryDataset`, `get_mock_texts_dataset`, `visualise_text`\r\n* Added `HFWrapper` to wrap models from HuggingFace library\r\n* `download_mock_dataset` -> `download_mock_dataset` (the original name is also kept for back compatibility)\r\n\r\n==============================================================\r\n> Docs, Readme and examples for OML 3.0 (#570)\r\n\r\nGeneral\r\n* Made imports shorter in all examples (updated the corresponding `__init__.py` files and `__all__` variables)\r\n* `train.md`, `val_md` -> `train_val_img_txt.md`\r\n* Joined example of using pre-trained image models, pre-trained HF text models (just added), and zoo table (moved) into one file.\r\n* Removed links to colab notebooks for all examples except for the `train_val_img_txt.md` code snippet.\r\n* Updated dataset format description: added info about `text` column.\r\n* Updated mock dataset of texts so we have more data in train.\r\n* Added handling categories example (train + val)\r\n\r\nRenaming:\r\n* `download_mock_dataset` got a short link `get_mock_images_dataset` so it looks similar to `get_mock_texts_dataset`\r\n* `RetrievalResults.compute_from_embeddings(embeddings, dataset, n_items_to_retrieve=5)` -> `RetrievalResults.from_embeddings(embeddings, dataset, n_items=5)`\r\n\r\nREADME:\r\n* Added release notes for OML 3.0\r\n* Added side-by-side example of training and validation text and image models\r\n* Added `OML Features` section\r\n* Zoo section is updated\r\n* Updated FAQ, moved to Documentation section\r\n\r\nReadTheDocs:\r\n* Added new text Datasets to docs\r\n* Added `OML Features` section to the home page \r\n* Moved getters for mock datasets from utils to Datasets page\r\n* Python examples: hide most of the examples under details. Added an example of handling categories.\r\n* Updated the page about logging.\r\n* Split post-processing section into re-ranking by model (the old content of post-processing) and algo post-processing (just a page holder for the moment).\r\n* Removed `zoo` section from post-processing by model.\r\n\r\n==============================================================\r\n> Misc improvements for OML 3.0\r\n\r\n* Updated `check_retrieval_format` so it works with text dataset.\r\n* Added code example of usage `check_retrieval_format` (+ the corresponding test)\r\n* Made `last_logs` property and added docs for it (for triplet & arcface losses, bank miner)\r\n* Made `distances`, `retrieved_ids`, `gt_ids` documented properties of `RetrievalResults`.\r\n\r\n==============================================================\r\n> Adaptive and constant thresholding as postprocessors \r\n\r\nALGO POST-PROCESSING\r\n\r\n* Added `AdaptiveThresholding`, `ConstantThresholding`:\r\n    * classes implementation\r\n    * updated registry and configs\r\n    * readthedocs: contents and algo postprocessing page with a new example\r\n    * pytests and pipelines test\r\n    * added thresholding to a few existing python examples\r\n\r\n==============================================================\r\n> MISC\r\n\r\n* Added `is_empty` and `deepcopy` methods to `RetrievalResults`, also updated readthedocs\r\n* Removed `top_n` from `IRetrievalPostprocessor` interface\r\n* Updated code so it can work with the old NN postprocessing and new algorithmic ones. Added todo so we refactor it in future after we have more postprocessors.\r\n* Added drawing test for empty `RetrievalResults`\r\n* Used mock text dataset in categories example instead of the image one (because it has bigger categories)\r\n* Fixed categories handling in pcf metric\r\n* Added docs to calculating metrics by rr\r\n* Added verbose argument\r\n* fixed text visualisation and linters; temprorary turned on full CI on branch\r\n* made sorting check more robust\r\n* updated tolerance when concat distances","shortMessageHtmlLink":"Changes for OML 3.0: modality agnostic, RetrievalResults, TextSupport…"}},{"before":"16d5839594b498d687ce175d703e6eadcb51cc44","after":"3f9177a14d00ea796e6375d90399ff5b9f3c3276","ref":"refs/heads/oml_3.0_release","pushedAt":"2024-06-07T11:00:34.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"AlekseySh","name":"Aleksei Shabanov","path":"/AlekseySh","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/20143575?s=80&v=4"},"commit":{"message":"updated tolerance when concat distances","shortMessageHtmlLink":"updated tolerance when concat distances"}}],"hasNextPage":true,"hasPreviousPage":false,"activityType":"all","actor":null,"timePeriod":"all","sort":"DESC","perPage":30,"cursor":"djE6ks8AAAAEYtKeeAA","startCursor":null,"endCursor":null}},"title":"Activity · OML-Team/open-metric-learning"}