Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent stuck videos from carrying over to next download session ["unavailable fragments" ?] #193

Merged
merged 10 commits into from
Jun 19, 2024
6 changes: 6 additions & 0 deletions cps/tasks/download.py
Original file line number Diff line number Diff line change
Expand Up @@ -95,6 +95,12 @@ def run(self, worker_thread):
if error:
log.error("[xklb] An error occurred while trying to download %s: %s", error[1], error[0])
self.message = f"{error[1]} failed to download: {error[0]}"
else:
log.error("%s failed to download: No path or error found in the database (likely the video failed due to unavailable fragments?)", self.media_url)
self.message = f"{self.media_url_link} failed to download: No path or error found in the database (likely the video failed due to unavailable fragments?)"
media_id = conn.execute("SELECT id FROM media WHERE webpath = ?", (self.media_url,)).fetchone()[0]
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is one record with this webpath guaranteed to exist in the media table, even when the query at line 89 returned no records?

Since the query above has the extra condition ... AND path NOT LIKE 'http%', it's possible that this less restrictive query might return one result (or more?).

If the query fails, trying to access field [0] will throw, and you may need to add an extra check:

results = conn.execute("SELECT id FROM media WHERE webpath = ?", (self.media_url,)).fetchone()
if results:
  media_id = results[0]
  conn.execute("DELETE ...")

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure I understand your point well. I am confident one record with this webpath will always be present because it's in fact "the" trigger of the actual download task. It's mandatory. However the condition AND path NOT LIKE 'http%' might not be met under certain circumstances, i.e when the video fails to have it path updated with a path.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reviewed how the media table is created and updated by xklb, and it seems that a record with webpath = media_url will be inserted by the lb dl child process at some point...

Could there be cases where the child process fails before updating the database?

If it's possible (even if rare), then this query will return 0 records, and this code should defensively check before trying to access the result.

Copy link
Collaborator Author

@deldesir deldesir Jun 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The running of the child process alone attests the existence of webpath = media_url. Yes there are cases where lb dl can fail, for example live videos, videos without an available/suitable format, videos stuck due to unavailable fragments or network issues. But it will never return 0 record because it's "metadata fetch" that creates the record exists and ensures this specific record is used per

xklb_full_cmd="lb dl ${XKLB_DB_FILE} --video --search ${URL_OR_SEARCH_TERM} ${FORMAT_OPTIONS} --write-thumbnail --subs -o ${OUTTMPL} ${VERBOSITY}"
(notice the --search option)

conn.execute("DELETE FROM media WHERE webpath = ?", (self.media_url,))
conn.execute("DELETE FROM captions WHERE media_id = ?", (media_id,))
return
except sqlite3.Error as db_error:
log.error("An error occurred while trying to connect to the database: %s", db_error)
Expand Down