Google Books OCR batch 2 #265

Open
eroux opened this issue Dec 23, 2023 · 5 comments

Comments

@eroux
Contributor

eroux commented Dec 23, 2023

Here's the list of 3719 pechas to import from Google Books (it's all already on S3):

ocr_archive.csv

@kaldan007
Contributor

I have imported the Google Books works in batches. These are the three batches that have been completed to date. I am having some issues with the remaining works. I will examine the bug and keep you updated.

batch_01_opfs.csv
batch_02.csv
batch_03_opf_catalog.csv

@eroux
Contributor Author

eroux commented Mar 7, 2024

thanks a lot!

@kaldan007
Contributor

2024-03-07 11:48:38,191 - ERROR - Error downloading W10206--Batch 2022 missing
2024-03-07 11:48:41,843 - ERROR - Error downloading W19792--Batch 2022 missing
2024-03-07 11:48:48,362 - ERROR - Error downloading W1KG25527--Batch 2022 missing
2024-03-07 11:49:04,086 - ERROR - Error downloading W1PD159442--Batch 2022 missing
2024-03-07 11:49:06,488 - ERROR - Error downloading W1PD159533--Batch 2022 missing
2024-03-07 11:49:26,673 - ERROR - Error downloading W3CN7719--Batch 2022 missing
2024-03-07 11:49:28,995 - ERROR - Error downloading W8LS18002--Batch 2022 missing

@kaldan007
Contributor

@eroux these work IDs don't have batch_2022
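
For anyone who wants to turn a log like the one above into a list of affected work IDs, here is a minimal sketch. It assumes the log lines keep exactly the format shown in this thread; the function name `missing_work_ids` and the regular expression are illustrative and not part of the import tooling.

```python
import re

# Matches lines such as:
# 2024-03-07 11:48:38,191 - ERROR - Error downloading W10206--Batch 2022 missing
# The pattern is an assumption based only on the log excerpt in this issue.
MISSING_BATCH = re.compile(
    r"ERROR - Error downloading (?P<work_id>\S+?)--Batch (?P<batch>\d+) missing"
)


def missing_work_ids(log_text, batch="2022"):
    """Return work IDs reported as lacking the given batch, in log order."""
    ids = []
    for line in log_text.splitlines():
        match = MISSING_BATCH.search(line)
        if match is None or match.group("batch") != batch:
            continue
        work_id = match.group("work_id")
        if work_id not in ids:  # skip repeated reports of the same work
            ids.append(work_id)
    return ids


log = """\
2024-03-07 11:48:38,191 - ERROR - Error downloading W10206--Batch 2022 missing
2024-03-07 11:48:41,843 - ERROR - Error downloading W19792--Batch 2022 missing
"""
print(missing_work_ids(log))
```

The resulting list could then be compared against ocr_archive.csv to see which works still need a different source batch.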

@kaldan007
Contributor

Here is the last batch.

batch_04.csv
