[QUESTION] Is there a memory leak in huggingface embedding with pipeline mode #2054

Open
mshakirDr opened this issue Aug 10, 2024 · 1 comment
Labels: question (Further information is requested)

@mshakirDr
Question

I have been trying to ingest about 1000 PDFs through PGPT. After testing, I found that the pipeline with 1 worker is the fastest option on my system (more workers actually slow it down). However, the 8 GB of VRAM and 32 GB (out of 64 GB) of shared memory on my system quickly fill up even when I ingest only 10 PDFs at a time. I tried to work around the memory hogging by restarting the pipeline for every batch. Below is the chunking solution I built around LocalIngestWorker from ingest_folder.py.

    from pathlib import Path

    from private_gpt.di import global_injector
    from private_gpt.server.ingest.ingest_service import IngestService
    from private_gpt.settings.settings import Settings
    from scripts.ingest_folder import LocalIngestWorker

    def split_into_chunks(lst, n):
        # Split a list into consecutive batches of size n.
        return [lst[i:i + n] for i in range(0, len(lst), n)]

    files = get_list_of_combined_files(folders)
    print(len(files))
    chunks = split_into_chunks(files, 10)  # batches of 10 PDFs
    ignored = []  # patterns to skip, as in ingest_folder.py's --ignored
    for index, chunk in enumerate(chunks):
        print("Chunk number", index, "of", len(chunks))
        destination = r"\Temp\\"
        copy_new_files(destination, chunk)
        # Rebuild the ingestion stack for every batch, hoping the previous
        # one gets garbage collected and its memory released.
        ingest_service = global_injector.get(IngestService)
        settings = global_injector.get(Settings)
        worker = LocalIngestWorker(ingest_service, settings)
        worker.ingest_folder(Path(destination), ignored)
        del worker
        del ingest_service
        del settings

However, this does not release the memory at the end of the loop, and the same problem persists (I even tried del, with no luck). I searched around for potential memory leak issues with the HuggingFace text embeddings solution and found this memory leak issue.
Is anyone else facing the same issue with the ingest mode pipeline and HuggingFace embeddings on an NVIDIA GPU? I would appreciate any solutions or suggestions.
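A minimal sketch of the allocator behavior at play here (an illustration, not code from PGPT): del only drops the Python reference, and PyTorch's caching allocator keeps the freed blocks reserved for reuse until torch.cuda.empty_cache() hands them back to the driver.

    import torch

    # Allocate a large tensor on the GPU (~4 GB of float32).
    x = torch.randn(1024, 1024, 1024, device="cuda")
    print(torch.cuda.memory_allocated(), torch.cuda.memory_reserved())

    del x  # the tensor is freed, but its blocks stay in PyTorch's cache
    print(torch.cuda.memory_allocated(), torch.cuda.memory_reserved())
    # memory_allocated() is now ~0, but memory_reserved() is still ~4 GB,
    # which is why tools like nvidia-smi keep showing the VRAM as occupied.

    torch.cuda.empty_cache()  # return the cached blocks to the CUDA driver
    print(torch.cuda.memory_reserved())  # now back down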

@mshakirDr (Author)

I have found a workaround: ingest 5 PDFs at a time, clear the torch CUDA cache, and restart the process for the next batch (pipeline mode, mock profile, HuggingFace embedding model). It is slow, but it works: the memory is reset after every batch. Writing the results to the database takes time and the GPU sits idle in the meantime, but it is the most efficient approach I could find for my hardware. I added the following at the end of each iteration of my code adapted from ingest_folder.py.

    import gc

    import torch

    del worker
    del settings
    del ingest_service
    with torch.no_grad():
        torch.cuda.empty_cache()  # return cached, unreferenced blocks to the driver
        gc.collect()              # sweep any remaining Python references
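If the in-process cleanup ever stops being enough, a heavier but more reliable variant of the same idea is to run each batch in a short-lived child process, so the driver reclaims all of its VRAM when the process exits. A minimal sketch, assuming PGPT's import paths (the batch folder names and the ingest_batch helper are placeholders, not part of the original code):

    import multiprocessing as mp
    from pathlib import Path

    def ingest_batch(folder: str) -> None:
        # Import inside the child so CUDA is initialized per process.
        from private_gpt.di import global_injector
        from private_gpt.server.ingest.ingest_service import IngestService
        from private_gpt.settings.settings import Settings
        from scripts.ingest_folder import LocalIngestWorker

        ingest_service = global_injector.get(IngestService)
        settings = global_injector.get(Settings)
        worker = LocalIngestWorker(ingest_service, settings)
        worker.ingest_folder(Path(folder), [])  # [] = nothing ignored

    if __name__ == "__main__":
        ctx = mp.get_context("spawn")  # "spawn" is the safe start method with CUDA
        for batch_folder in ["batch_0", "batch_1"]:  # placeholder folders
            p = ctx.Process(target=ingest_batch, args=(batch_folder,))
            p.start()
            p.join()  # the child's VRAM is fully released when it exits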
