subreddit.stream.submissions() stops after an hour or so #16

Closed
AltF02 opened this issue Nov 18, 2020 · 9 comments · Fixed by #18
Labels
Bug Something isn't working

Comments

@AltF02

AltF02 commented Nov 18, 2020

Describe the bug
subreddit.stream.submissions() stops after an hour or so

To Reproduce
Steps to reproduce the behavior:

  1. Create a submissions stream
  2. Run for a couple of hours
  3. Observe

Expected behavior
subreddit.stream.submissions() never stops

Code/Logs

        subreddit = await reddit.subreddit("dankmemes+memes+okbuddyretard+specialsnowflake+pewdiepiesubmissions")
        
        async for submission in subreddit.stream.submissions(skip_existing=True):
            keywords = DataBase.get_keywords()
            matching = [s for s in keywords if s[0].lower() in submission.title.lower()]
            if matching:
                await self.send_notification(submission, matching[0])

The loop body runs to completion before the stream dies, so I don't think a blocking call is causing this.

System Info

  • OS: Ubuntu 18.04.5 LTS x86_64
  • Python: 3.8.3
  • Async PRAW Version: 7.1.0
@AltF02
Author

AltF02 commented Nov 18, 2020

My current workaround is to break out of the loop after 100 posts or so and start it over again.
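
A minimal sketch of that restart workaround, assuming the stream can be re-created by calling a factory function. The names `restart_every`, `make_stream`, `max_items`, and `max_restarts` are hypothetical and not part of Async PRAW; `max_restarts` exists only so the loop can be bounded when experimenting.

```python
import asyncio


async def restart_every(make_stream, max_items=100, max_restarts=None):
    """Yield items from make_stream(), re-creating the stream after max_items.

    make_stream is a hypothetical zero-argument callable that returns a fresh
    async iterator, for example:
        lambda: subreddit.stream.submissions(skip_existing=True)
    """
    restarts = 0
    while max_restarts is None or restarts <= max_restarts:
        stream = make_stream()
        count = 0
        try:
            async for item in stream:
                yield item
                count += 1
                if count >= max_items:
                    break  # abandon this stream and build a new one
        finally:
            # Async generators expose aclose(); close the old stream if possible.
            aclose = getattr(stream, "aclose", None)
            if aclose is not None:
                await aclose()
        restarts += 1
```

Note that this only helps if the stream delivers `max_items` items before it stalls; a stream that goes silent earlier would still hang inside the `async for`.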

@PythonCoderAS
Contributor

This makes sense in part: Reddit invalidates the access token after an hour, so PRAW needs to request a new one. What doesn't make sense is that the stream stops, since that refresh should happen in the background. Is any exception being printed? In asyncio, an unhandled exception inside a task does not terminate the program; it only gets reported on STDERR.
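
One way to check for a silently failing task is to attach a done-callback that inspects the task's exception. The sketch below uses only standard `asyncio` calls; the helper names `report_failure` and `run_watched` and the `failures` list are hypothetical, introduced for illustration.

```python
import asyncio


def report_failure(task, failures):
    """Done-callback: record and print the exception, if any, that ended task."""
    if task.cancelled():
        return
    exc = task.exception()  # retrieving it also marks the exception as handled
    if exc is not None:
        failures.append(exc)
        print(f"{task.get_name()} stopped with {exc!r}")


async def run_watched(coro, failures):
    """Run coro as a task and report its failure as soon as it finishes."""
    task = asyncio.ensure_future(coro)
    task.add_done_callback(lambda t: report_failure(t, failures))
    await asyncio.wait([task])  # returns once the task is done; does not re-raise
    return task
```

For a long-running stream, the same callback could be attached to the task returned by `loop.create_task(...)`, so a crash shows up immediately instead of only when the task object is garbage-collected.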

@AltF02
Author

AltF02 commented Nov 19, 2020

No, I'm not aware of any exceptions sadly

@cmays90
Contributor

cmays90 commented Nov 26, 2020

I ran across this issue. If the last item grabbed from the stream is deleted, asyncPRAW never grabs the next item. It has something to do with the "before" variable, but I don't fully understand why that is. There are no exceptions in the background.

To Reproduce
This might be a tad verbose, but it uses a mod account to both make a post and remove it, so appropriate mod permissions must be set on the account and environment variables set. This will run forever, so Ctrl+C to stop it when done.

import asyncio, asyncpraw
import os

loop = asyncio.get_event_loop()

reddit = asyncpraw.Reddit(client_id=os.getenv("REDDIT_CLIENT_ID"),
                  client_secret=os.getenv("REDDIT_CLIENT_SECRET"),
                  user_agent=os.getenv("REDDIT_USERAGENT"),
                  username=os.getenv("REDDIT_USERNAME"),
                  password=os.getenv("REDDIT_PASSWORD"),
                  loop = loop)
reddit.validate_on_submit = True

async def reddit_submissions():
    subreddit = await reddit.subreddit(os.getenv("REDDIT_SUBREDDIT"))
    exceptions = 0  # number of times the stream had to be restarted after an error
    while True:
        try:
            async for submission in subreddit.stream.submissions(skip_existing=True):
                print(f"New submission: {submission.title}")
        except Exception as exc:
            exceptions += 1
            print(f"Unexpected error occurred ({exceptions} so far): {exc!r}")

async def reddit_make_posts():
    subreddit = await reddit.subreddit(os.getenv("REDDIT_SUBREDDIT"))
    await asyncio.sleep(5) # 5 seconds to ensure that the reddit_submissions starts successfully
    submission = await subreddit.submit("This post will be deleted", "self")
    print("Submitted 1st")
    await asyncio.sleep(10) # 10 seconds to ensure that the reddit_submissions grabs it
    await submission.mod.remove()
    print("Removed")
    await asyncio.sleep(10) # 10 seconds to ensure that reddit removes this
    submission = await subreddit.submit("This post will never be grabbed", "self")
    print("Submitted 2nd")

loop.create_task(reddit_submissions())
loop.create_task(reddit_make_posts())
loop.run_forever()

@PythonCoderAS
Contributor

Is this reproducible in regular PRAW as well?

@LilSpazJoekp
Member

This is very interesting, and it would affect main PRAW as well. This could be why some streams randomly stop producing results. I wonder if this is related to praw-dev/praw#1025.

@LilSpazJoekp
Member

Does this occur with deleted items as well?

@cmays90
Contributor

cmays90 commented Nov 26, 2020

My testing with main PRAW does not have the same bug. Main PRAW seems to occasionally poll the new.json endpoint without the before parameter.

Looking through PRAW vs asyncPRAW, the only difference in the StreamGenerator that could trigger this is one line:

PRAW: if not exclude_before:
https://github.com/praw-dev/praw/blob/a1f7e015a8a80c08ef70069d341e45bd74f9145e/praw/models/util.py#L177

asyncPRAW: if not exclude_before and before_attribute:
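
To see why that one condition could matter, here is a toy model of the polling loop. It is not the PRAW or asyncPRAW source: `fetch_new`, `poll`, and `simulate` are hypothetical names, and the listing behaviour (an empty result whenever `before` points at a removed post) is an assumption based on the behaviour reported in this thread. The only parts taken from the thread are the two conditions being compared.

```python
def fetch_new(posts, removed, before=None, limit=100):
    """Toy 'new' listing, newest first.

    Assumption: if `before` names a removed post, nothing is returned.
    """
    if before is None:
        visible = [p for p in posts if p not in removed]
        return visible[::-1][:limit]
    if before in removed or before not in posts:
        return []
    newer = [p for p in posts[posts.index(before) + 1:] if p not in removed]
    return newer[::-1][:limit]


def poll(posts, removed, state, stale_before_bug):
    """One polling round; returns the unseen items, oldest first."""
    if stale_before_bug:
        # Modeled on: if not exclude_before and before_attribute:
        # params are only rewritten when there is a new anchor, so an old
        # "before" value lingers after a round that found nothing.
        if state["before"]:
            state["params"] = {"before": state["before"]}
    else:
        # Modeled on: if not exclude_before:
        # params are rewritten every round, so "before" resets to None.
        state["params"] = {"before": state["before"]}
    batch = fetch_new(posts, removed, before=state["params"].get("before"))
    fresh = [p for p in reversed(batch) if p not in state["seen"]]
    state["seen"].update(fresh)
    state["before"] = fresh[-1] if fresh else None
    return fresh


def simulate(stale_before_bug):
    """Replay the scenario from the reproduction script above."""
    posts, removed = ["t3_a"], set()
    state = {"before": None, "params": {}, "seen": set()}
    delivered = poll(posts, removed, state, stale_before_bug)  # stream sees t3_a
    removed.add("t3_a")                                        # mod removes it
    delivered += poll(posts, removed, state, stale_before_bug)
    posts.append("t3_b")                                       # second post
    for _ in range(3):
        delivered += poll(posts, removed, state, stale_before_bug)
    return delivered
```

In this model, `simulate(False)` delivers both posts, while `simulate(True)` keeps sending `before="t3_a"` and never delivers the second post, which matches the "occasionally polls without the before parameter" observation above.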


@cmays90
Contributor

cmays90 commented Nov 27, 2020

Verified the PR works with the sample code provided.

@LilSpazJoekp LilSpazJoekp added the Bug Something isn't working label Jan 18, 2021