From c0e2c9e4e0453510aaeb651c9b8f1252e406b1da Mon Sep 17 00:00:00 2001
From: Sara Adkins
Date: Thu, 20 Jun 2024 16:54:05 +0000
Subject: [PATCH] update README memory requirements

---
 examples/llama7b_sparse_quantized/README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/examples/llama7b_sparse_quantized/README.md b/examples/llama7b_sparse_quantized/README.md
index 1a48c01afc..35183345d9 100644
--- a/examples/llama7b_sparse_quantized/README.md
+++ b/examples/llama7b_sparse_quantized/README.md
@@ -2,7 +2,7 @@
 This example uses SparseML and Compressed-Tensors to create a 2:4 sparse and quantized Llama2-7b model. The model is calibrated and trained with the ultachat200k dataset.
-At least 75GB of GPU memory is required to run this example.
+At least 85GB of GPU memory is required to run this example.
 
 Follow the steps below one by one in a code notebook, or run the full example script as `python examples/llama7b_sparse_quantized/llama7b_sparse_w4a16.py`