Update _read_bytes function in images.py support for URL #548

stoensin · 2024-04-26T07:44:31Z

Add support for fetching images from URLs in _read_bytes_image

Previously, the function only supported reading image files from local paths. This commit extends its functionality to also fetch images from URLs if the path starts with 'http://' or 'https://'. The function now uses the 'requests' library to download the image content when a URL is provided.

If the URL fetch is successful, the image content is returned as bytes. If the status code is not 200, a ValueError is raised with a message indicating the failure.

The existing file reading functionality remains unchanged for non-URL paths, ensuring backward compatibility.

Changes:

- Modified _read_bytes_image to check if the path is a URL
- Added conditional logic to handle URL paths using requests.get
- Included error handling for failed URL fetch attempts
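
A minimal sketch of the behaviour described above, for illustration only. The standalone names `is_url` and `read_bytes_image` and the `timeout` parameter are assumptions made for this example; they are not taken from the actual diff, which modifies an existing function in oml/datasets/images.py.

```python
from pathlib import Path
from typing import Union


def is_url(path: Union[str, Path]) -> bool:
    # Only plain strings with an http(s) scheme are treated as URLs.
    return isinstance(path, str) and path.startswith(("http://", "https://"))


def read_bytes_image(path: Union[str, Path], timeout: float = 10.0) -> bytes:
    # Hypothetical standalone version of the function described in this PR.
    if is_url(path):
        # Imported lazily so that local-only usage does not require `requests`.
        import requests

        response = requests.get(path, timeout=timeout)
        if response.status_code != 200:
            raise ValueError(
                f"Failed to fetch image from {path}: status code {response.status_code}"
            )
        return response.content

    # Non-URL paths keep the original behaviour: read the file from disk.
    with open(str(path), "rb") as fin:
        return fin.read()
```

The `timeout` argument is an addition of this sketch: `requests.get` has no default timeout, so a stalled server could otherwise block data loading indefinitely.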

AlekseySh · 2024-04-27T02:35:30Z

Hey, @stoensin! Thank you for your interest in OML.

I think we need some tests for this. As input, you can use this URL: https://i.ibb.co/wsmD5r4/photo-2022-06-06-17-40-52.jpg. You can put the tests here: https://github.com/OML-Team/open-metric-learning/tree/main/tests/test_oml/test_datasets

DaloroAT · 2024-05-10T14:28:28Z

oml/datasets/images.py

-            return fin.read()
+        if isinstance(path, str) and path.startswith(('http://', 'https://')):
+            # If the path is a URL, use requests to fetch the content
+            response = requests.get(path)


From a performance point of view, it would be more efficient to create a session instance once, e.g. self.session = self.session or requests.Session(), and make requests using self.session.get(...). Otherwise, every call of __getitem__ has to establish a new connection to the remote server (DNS resolution, certificate exchange, etc.). Establishing a connection can take a significant amount of time.

I'm not sure about the exact implementation, but calling requests.get directly is not a good idea in a dataset scenario, because we could reuse a session instead.
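
One possible shape for the session reuse suggested above, as a rough sketch and not a proposal for the final implementation. The class name `UrlImageReader`, its `session` and `timeout` parameters, and the lazy creation of the session are all assumptions made for this example; only `requests.Session()` and `session.get(...)` come from the comment itself.

```python
from typing import Any, Optional


class UrlImageReader:
    """Hypothetical helper that fetches image bytes over one reusable HTTP session."""

    def __init__(self, session: Optional[Any] = None, timeout: float = 10.0):
        # A session can be injected (e.g. in tests); otherwise it is created on first use.
        self._session = session
        self.timeout = timeout

    @property
    def session(self) -> Any:
        if self._session is None:
            # Imported lazily so that the class can be constructed without `requests`.
            import requests

            self._session = requests.Session()
        return self._session

    def read(self, url: str) -> bytes:
        # Reusing the session lets repeated calls share pooled connections
        # instead of opening a new one for every image.
        response = self.session.get(url, timeout=self.timeout)
        if response.status_code != 200:
            raise ValueError(
                f"Failed to fetch image from {url}: status code {response.status_code}"
            )
        return response.content
```

A dataset could hold one such reader and call `reader.read(url)` inside `__getitem__`, so that repeated requests to the same host can reuse an open connection.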

AlekseySh · 2024-06-07T15:58:40Z

Hey, @stoensin! Let's move on with this PR. Please see the comment above: #548 (comment)

AlekseySh assigned stoensin Apr 27, 2024

AlekseySh added the rework label Apr 27, 2024

AlekseySh added this to In progress in OML-planning via automation Apr 27, 2024

AlekseySh moved this from In progress to Review in progress in OML-planning Apr 27, 2024

DaloroAT reviewed May 10, 2024

AlekseySh removed this from Review in progress in OML-planning Jun 7, 2024
