Large collection of pdf files #85

Open
Elektrik00713 opened this issue Oct 8, 2024 · 0 comments


How can I feed a large collection of PDF files (about 1,000) to Semantra? I'm running it on WSL2/Ubuntu from a bash script. When I process the entire collection in one run, the embedding step completes without errors, but when I launch the local web server and start typing a query, all I see is "Loading" and nothing else happens. The following two lines appear on the command line:
RuntimeWarning: Mean of an empty slice.
RuntimeWarning: Invalid value found during scalar divide
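
For context, the first message matches the warning NumPy raises when a mean is taken over an empty array, and in recent NumPy versions it is followed by a second warning about an invalid scalar divide. The sketch below only reproduces that generic NumPy behaviour. Whether Semantra hits this exact code path is an assumption I have not verified, and `mean_of_empty` is a made-up helper name.

```python
import warnings

import numpy as np


def mean_of_empty():
    """Take the mean of an empty array and collect the warnings NumPy emits."""
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning instead of printing it once and suppressing repeats.
        warnings.simplefilter("always")
        result = np.mean(np.array([], dtype=float))
    return result, [str(w.message) for w in caught]


result, messages = mean_of_empty()
print("result:", result)  # the mean of nothing is NaN
for message in messages:
    print("warning:", message)
```

If this is what is happening, some step is averaging over zero items for at least one query or document, which would fit a search that never returns.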

If I instead use the bash script to batch the files 100 at a time, everything works and I can query the collection, but only those 100 files.
Has anyone come across this before? How can I feed in my whole collection at once?
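
For anyone trying to reproduce this, the batching workaround looks roughly like the sketch below. It is a Python stand-in, not the actual bash script, and it only builds the command lines without running them. The file names are placeholders, `batched` and `build_commands` are hypothetical helpers, and passing several file paths as positional arguments to the `semantra` command is an assumption here.

```python
from typing import Iterator, List, Sequence


def batched(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items`, each holding at most `size` entries."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def build_commands(
    pdf_paths: Sequence[str],
    batch_size: int = 100,
    executable: str = "semantra",  # assumed CLI name; adjust to your install
) -> List[List[str]]:
    """Build one argument list per batch, suitable for subprocess.run()."""
    return [[executable, *batch] for batch in batched(pdf_paths, batch_size)]


# Placeholder file names standing in for the real collection.
paths = [f"doc_{i:04d}.pdf" for i in range(1000)]
commands = build_commands(paths)
print(len(commands))  # 10 batches of 100 files each
```

Each batch is a separate run, so each one can only be queried on its own, which is exactly the limitation I'm trying to get around.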
