Large collection of pdf files #85

Elektrik00713 · 2024-10-08T14:45:27Z

How can I feed a large collection of PDF files (about 1,000) to Semantra? I'm running it on WSL2/Ubuntu from a bash script. When I process the entire collection in one go, the embedding step completes without errors, but when I launch the local web server and start typing a query, all I see is "Loading" and nothing else happens. The following two lines appear in the CMD window:
RuntimeWarning: Mean of an empty slice.
RuntimeWarning: Invalid value found during scalar divide

If I instead use the bash script to process the files in batches of 100, I have no problems and can query the collection, but only those 100 files at a time.
Has anyone come across this before? How can I feed my whole collection of files at once?
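
For reference, the batching workaround has roughly the shape of the sketch below. This is not the exact script in use: the function name `run_in_batches`, the `./pdfs` directory, and the assumption that `semantra` takes file paths as positional arguments are all illustrative. It relies on GNU `find`, `sort -z`, and `xargs -r`, which are available on Ubuntu.

```shell
#!/usr/bin/env bash
# Sketch only: run a command over the PDFs in a directory, N files per invocation.
# run_in_batches is a hypothetical helper name, not part of Semantra.

run_in_batches() {
    local batch_size="$1"   # number of files per invocation, e.g. 100
    local dir="$2"          # directory containing the PDF files
    shift 2                 # remaining arguments: the command to run

    # List PDFs NUL-separated so unusual file names survive, sort them for a
    # stable order, then append at most $batch_size paths to each command run.
    find "$dir" -maxdepth 1 -type f -iname '*.pdf' -print0 |
        sort -z |
        xargs -0 -r -n "$batch_size" "$@"
}

# Assumed usage (uncomment to run; each batch is handled separately, so only
# that batch's files are searchable in the resulting session):
# run_in_batches 100 ./pdfs semantra
```

Replacing `semantra` with `echo` in the usage line prints the file groups without processing anything, which is a quick way to check how the collection is being split.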
