Use largest available size for images in Wikipedia articles #42

Open
danburzo opened this issue Oct 15, 2018 · 3 comments
Labels: Available (this issue is up for grabs), Feature (new feature or enhancement)

Comments

@danburzo
Owner

The idea of the imagesAtFullSize enhancement is to get the largest available image from blogs using Blogspot, WordPress, and the like:

function imagesAtFullSize(doc) {
  /*
    Replace:
      <a href='original-size.png'>
        <img src='small-size.png'/>
      </a>
    With:
      <img src='original-size.png'/>
  */
  Array.from(doc.querySelectorAll('a > img:only-child')).forEach(img => {
    let anchor = img.parentNode;
    let original = anchor.href;
    // only replace if the HREF matches an image file
    if (original.match(/\.(png|jpg|jpeg|gif|svg)$/)) {
      // point the image at the linked file, then unwrap it from the anchor
      img.setAttribute('src', original);
      anchor.parentNode.replaceChild(img, anchor);
    }
  });
}

However, Wikipedia images are an exception:

<a href="/wiki/File:Perkulator.jpg" class="image">
  <img alt="" src="//upload.wikimedia.org/wikipedia/commons/thumb/3/3a/Perkulator.jpg/250px-Perkulator.jpg" class="thumbimage" srcset="//upload.wikimedia.org/wikipedia/commons/thumb/3/3a/Perkulator.jpg/375px-Perkulator.jpg 1.5x, //upload.wikimedia.org/wikipedia/commons/thumb/3/3a/Perkulator.jpg/500px-Perkulator.jpg 2x" data-file-width="1944" data-file-height="2592" width="250" height="333">
</a>

They link to what looks like an image file, but is in fact an HTML page for that image. How can we handle this situation gracefully?

@danburzo danburzo added the Bug Something isn't working label Oct 15, 2018
@danburzo danburzo self-assigned this Oct 15, 2018
@bekicot

bekicot commented Oct 15, 2018

Here we go.
https://upload.wikimedia.org/wikipedia/commons/3/3a/Perkulator.jpg
Remove the /thumb segment (and the trailing size-prefixed file name) from the URL :)

May I help with it?
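The suggestion above could be sketched as a small string transform. This is only a sketch: `originalFromThumb` is a hypothetical helper name, and the `/thumb/<h>/<hh>/<file>/<size>px-<file>` layout is inferred from the single example in this thread rather than from a documented guarantee, so anything that does not match is returned unchanged.

```javascript
// Hypothetical helper: derive the original upload URL from a Wikimedia
// thumbnail URL by dropping the "thumb/" segment and the last path segment.
// Assumes the layout seen in this thread:
//   .../thumb/3/3a/Perkulator.jpg/250px-Perkulator.jpg
function originalFromThumb(src) {
  const match = src.match(
    /^(.*?\/)thumb\/([0-9a-f]\/[0-9a-f]{2}\/[^/]+)\/[^/]+$/
  );
  // Leave the URL untouched when it does not look like a thumbnail.
  return match ? match[1] + match[2] : src;
}

// originalFromThumb('//upload.wikimedia.org/wikipedia/commons/thumb/3/3a/Perkulator.jpg/250px-Perkulator.jpg')
// → '//upload.wikimedia.org/wikipedia/commons/3/3a/Perkulator.jpg'
```

Because the transform works on the `src` attribute rather than the anchor's `href`, it would sit alongside the existing extension check rather than replace it.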

@danburzo
Owner Author

@bekicot sure thing! I looked into it a bit and apparently the "canonical" way to get the image's original URL is to make a query to the Wikipedia API:

https://en.wikipedia.org/w/api.php?action=query&titles=File:Albert_Einstein_(Nobel).png&prop=imageinfo&iiprop=url&format=json
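For illustration, the query above could be built and its result read roughly like this. The helper names `imageInfoApiUrl` and `originalUrlFromResponse` are hypothetical, the default host is an assumption, and the response shape (`query.pages.<id>.imageinfo[0].url`) is what the `format=json` output is expected to look like; treat it as a sketch, not a tested integration.

```javascript
// Hypothetical helper: build the imageinfo query for a "File:..." title.
// The host defaults to English Wikipedia, as in the URL quoted above.
function imageInfoApiUrl(fileTitle, host = 'en.wikipedia.org') {
  const params = new URLSearchParams({
    action: 'query',
    titles: fileTitle,
    prop: 'imageinfo',
    iiprop: 'url',
    format: 'json'
  });
  return `https://${host}/w/api.php?${params}`;
}

// Hypothetical helper: pull the first original-file URL out of a parsed
// JSON response, or return null when no imageinfo entry is present.
function originalUrlFromResponse(json) {
  const pages = (json && json.query && json.query.pages) || {};
  for (const page of Object.values(pages)) {
    if (page.imageinfo && page.imageinfo.length > 0) {
      return page.imageinfo[0].url;
    }
  }
  return null;
}
```

A caller would fetch the URL from `imageInfoApiUrl(...)`, parse the body as JSON, and pass the result to `originalUrlFromResponse`. Since this costs one network request per image, it makes `imagesAtFullSize` asynchronous, which is a design decision the thread leaves open.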

Maybe a good first step is just making imagesAtFullSize ignore wiki image files?

danburzo added a commit that referenced this issue Oct 22, 2018
@danburzo
Owner Author

> Maybe a good first step is just making imagesAtFullSize ignore wiki image files?

I added this in the commit above.
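The commit itself is not reproduced in this thread, so the following is not its actual code, only one way such a check might look. `shouldReplaceWithHref` and `WIKI_FILE_PAGE` are hypothetical names, and the pattern covers only the `/wiki/File:` form seen in the example above (localized wikis may use other namespace prefixes).

```javascript
// Same extension test that imagesAtFullSize already applies to the HREF.
const IMAGE_EXTENSION = /\.(png|jpg|jpeg|gif|svg)$/;

// Assumption: wiki file pages are recognizable by a "/wiki/File:" path,
// as in href="/wiki/File:Perkulator.jpg".
const WIKI_FILE_PAGE = /\/wiki\/File:/;

// Hypothetical predicate: true only when the HREF looks like an image file
// and does not look like a wiki page describing that file.
function shouldReplaceWithHref(href) {
  return IMAGE_EXTENSION.test(href) && !WIKI_FILE_PAGE.test(href);
}
```

Inside `imagesAtFullSize`, the condition `original.match(...)` would then become `shouldReplaceWithHref(original)`, leaving Wikipedia thumbnails as they are.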

@danburzo danburzo changed the title imagesAtFullSize breaks images from Wikipedia articles Use largest available size for images in Wikipedia articles Oct 22, 2018
@danburzo danburzo added Feature New feature or enhancement Available This issue is up for grabs and removed Bug Something isn't working labels Oct 22, 2018