TheScoreGroup: Cover Images #1831

Open
Ronnie711 opened this issue May 11, 2024 · 5 comments
@Ronnie711
Contributor

Scraper name: TheScoreGroup

Currently the scraper is grabbing the cover image from the scene page, but this is essentially just a carousel of screenshots. The correct scene cover can be seen on the home/category/model pages but not on the scene page.

However, thanks to a bit of poking, it would seem that the cover image URLs follow a consistent pattern.

Example links

Scene: https://www.scoreland.com/big-boob-videos/Danniella-Levy/50022/
Performer: https://www.scoreland.com/big-boob-models/Danniella-Levy/7951/
Cover Image: (In best quality) https://cdn77.scoreuniverse.com/modeldir/data/posting/50/022/posting_50022_1920.jpg

Note that the URL doesn't include the performer ID; it just splits the studio code (the number at the end of the scene URL) across two directories and then includes it in the file name.

2nd Example

Scene: https://www.18eighteen.com/xxx-teen-videos/Emma-Bugg/71841/
Performer: https://www.18eighteen.com/teen-babes/Emma-Bugg/9417/
Cover Image: https://cdn77.scoreuniverse.com/modeldir/data/posting/71/841/posting_71841_1920.jpg

(Shoutout to randomuser2022 for pointing me in the right direction)
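
As a rough sketch of the pattern described above, the cover URL could be derived from the scene URL like this. The function names `scene_id_from_url` and `cover_url` are hypothetical, and the rule that the last three digits form the second directory is an assumption that happens to match both examples; it is not confirmed for every ID length.

```python
import re

CDN_BASE = "https://cdn77.scoreuniverse.com/modeldir/data/posting"


def scene_id_from_url(scene_url: str) -> str:
    """Return the trailing numeric path segment of a scene URL."""
    match = re.search(r"/(\d+)/?$", scene_url)
    if match is None:
        raise ValueError(f"no numeric ID at the end of {scene_url!r}")
    return match.group(1)


def cover_url(scene_url: str, suffix: str = "_1920") -> str:
    """Build the CDN cover URL for a scene (hypothetical helper)."""
    scene_id = scene_id_from_url(scene_url)
    if len(scene_id) < 4:
        # The directory layout for shorter IDs is not known from this thread.
        raise ValueError(f"unsupported ID length: {scene_id!r}")
    # Assumption: the last three digits form the second directory,
    # e.g. 50022 -> 50/022.
    head, tail = scene_id[:-3], scene_id[-3:]
    return f"{CDN_BASE}/{head}/{tail}/posting_{scene_id}{suffix}.jpg"


print(cover_url("https://www.scoreland.com/big-boob-videos/Danniella-Levy/50022/"))
```

This only builds the URL; it does not check that an image of that size actually exists.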

@Ronnie711 Ronnie711 added the bug Something isn't working label May 11, 2024
@Ronnie711
Contributor Author

Been checking back on older scenes & for obvious reasons 1080 images aren't available for everything.

The sliding scale for sizes:

1920x: _1920.jpg
1600x: _1600.jpg
1280x: _1280.jpg
800x: _800.jpg
600x: _xl.jpg
450x: _lg.jpg
225x: _med.jpg
100x: .jpg (No size info = a tiny image!)

Whilst width is consistent, height is variable depending on source as we're dealing with a highly consistent organisation!
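
The sliding scale above could be kept as a small lookup table, for example to decide which suffixes to try when a scene is known to top out at a given width. This is only a sketch; `SIZE_SUFFIXES` and `suffixes_up_to` are hypothetical names, and the widths are the ones listed in this comment.

```python
# (width in pixels, filename suffix), widest first, as listed above.
SIZE_SUFFIXES = [
    (1920, "_1920"),
    (1600, "_1600"),
    (1280, "_1280"),
    (800, "_800"),
    (600, "_xl"),
    (450, "_lg"),
    (225, "_med"),
    (100, ""),  # no size info in the file name = the tiny image
]


def suffixes_up_to(max_width: int) -> list[str]:
    """Return the suffixes whose width does not exceed max_width, widest first."""
    return [suffix for width, suffix in SIZE_SUFFIXES if width <= max_width]


print(suffixes_up_to(800))  # ['_800', '_xl', '_lg', '_med', '']
```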

@Maista6969
Collaborator

This is great research! Am I understanding you right in that the largest size that will be available for all scenes is 800x?

@Ronnie711
Contributor Author

> This is great research! Am I understanding you right in that the largest size that will be available for all scenes is 800x?

Just checked the oldest scene on 18eighteen (https://www.18eighteen.com/xxx-teen-videos/Julissa-Delor/11628/) and the image is available up to 800. However, XL Girls' oldest scene (https://www.xlgirls.com/bbw-videos/China/6889/) is only available up to _xl!

Also worth noting: as this is only a 4-digit studio code, the directory split is 1 & 3 characters, not 2 & 3 as previously seen (https://cdn77.scoreuniverse.com/modeldir/data/posting/6/889/posting_6889_xl.jpg)

@Ronnie711
Contributor Author

Got bored, deep searched 18eighteen.com ...

Currently the scraper is using "Poster" for the selector and, after searching through 39 pages of scenes, this covers everything back to January 2009. Pre-2009 scenes instead use <img src="https://cdn77.scoreuniverse.com/modeldir/data/posting/12/003/posting_12003_x_med.jpg" srcset="https://cdn77.scoreuniverse.com/modeldir/data/posting/12/003/posting_12003_x_med.jpg 169w" (Scene URL: https://www.18eighteen.com/xxx-teen-videos/Alyssa-Star/12003/)

For this scene an 800x image is available: https://cdn77.scoreuniverse.com/modeldir/data/posting/12/003/posting_12003_x_800.jpg so that's annoying.

However, anything from 2009 onwards that uses the "Poster" selector already shows the largest available image in the selector, so we'd only need to modify it for 1280 images to grab the 1920s instead ... Suddenly it's become a lot easier!
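
One way to sketch the "swap the size suffix" idea in Python: given an image URL found on the page, list the same URL with every larger suffix, widest first. The name `larger_candidates` is hypothetical, and the regex assumes file names of the form `posting_<id>[_x][_<size>].jpg` as seen in this thread. Whether each candidate actually exists would still need checking, since the `_x` variants in particular are only confirmed here at `_med` and `_800`.

```python
import re

# Size suffixes from the sliding scale in this thread, widest first.
SUFFIXES = ["_1920", "_1600", "_1280", "_800", "_xl", "_lg", "_med", ""]

# Assumed file name shape: posting_<id>, optional "_x", optional size suffix.
IMAGE_RE = re.compile(
    r"^(?P<stem>.*/posting_\d+(?:_x)?)"
    r"(?P<suffix>_1920|_1600|_1280|_800|_xl|_lg|_med)?\.jpg$"
)


def larger_candidates(image_url: str) -> list[str]:
    """Return URLs for every size larger than the one given, widest first."""
    match = IMAGE_RE.match(image_url)
    if match is None:
        return []  # not a URL shape this sketch understands
    current = match.group("suffix") or ""
    larger = SUFFIXES[: SUFFIXES.index(current)]
    return [f"{match.group('stem')}{suffix}.jpg" for suffix in larger]


base = "https://cdn77.scoreuniverse.com/modeldir/data/posting"
for url in larger_candidates(f"{base}/71/841/posting_71841_1280.jpg"):
    print(url)
```

For a 1280 image this yields the `_1920` and `_1600` candidates; the caller would fall back to the original URL if neither responds.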

@feederbox826
Collaborator

I implemented this in Python only to realize that sceneScraper is in XPath 😞

Here's the code; it's pretty good so far:

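import requests

# Reuse one session so repeated HEAD requests share a connection.
client = requests.Session()

def test_url(url, quality):
    # HEAD avoids downloading the image; a 200 means this size exists.
    return client.head(url + quality + ".jpg", timeout=10).status_code == 200

def get_best_image(scene_id):
    # The ID is split across two directories: 6889 -> 6/889, 50022 -> 50/022.
    if len(scene_id) == 4:
        idpath = f"{scene_id[0]}/{scene_id[1:]}"
    elif len(scene_id) == 5:
        idpath = f"{scene_id[0:2]}/{scene_id[2:]}"
    else:
        # Only 4- and 5-digit IDs have been checked so far.
        raise ValueError(f"unsupported ID length: {scene_id}")
    noQualPath = f"https://cdn77.scoreuniverse.com/modeldir/data/posting/{idpath}/posting_{scene_id}"
    # Size suffixes, largest first:
    # https://github.com/stashapp/CommunityScrapers/issues/1831#issuecomment-2106027395
    for quality in ["_1920", "_1600", "_1280", "_800", "_xl", "_lg", "_med", ""]:
        if test_url(noQualPath, quality):
            print(f"✅ Found {quality} for {scene_id}")
            return noQualPath + quality + ".jpg"
    # None of the known sizes exist for this ID.
    return None

print(get_best_image("50022"))
print(get_best_image("11628"))
print(get_best_image("6889"))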

output

❯ python .\testimg.py
✅ Found _1920 for 50022
https://cdn77.scoreuniverse.com/modeldir/data/posting/50/022/posting_50022_1920.jpg
✅ Found _800 for 11628
https://cdn77.scoreuniverse.com/modeldir/data/posting/11/628/posting_11628_800.jpg
✅ Found _xl for 6889
https://cdn77.scoreuniverse.com/modeldir/data/posting/6/889/posting_6889_xl.jpg
