
Implement download subcommand, optional positional model name argument #234

Merged: 14 commits merged into pytorch:main on Apr 19, 2024

Conversation

@GregoryComer (Member) commented on Apr 17, 2024:

Implements a download subcommand to download and convert a model from HuggingFace. Also adds an optional positional argument to the other torchchat subcommands to use a downloaded model. The model name can be either a known HF path, such as meta-llama/Llama-2-7b-chat-hf, or an alias, such as llama2. Per-model configuration, including the download channel and model aliases, is under config/models.json.
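For context, a minimal sketch of how the alias lookup could work, assuming an illustrative schema for config/models.json (field names such as "aliases" are assumptions for illustration, not the actual file format):

```python
import json
from pathlib import Path

# Hypothetical sketch: resolve a user-supplied model name or alias to a
# canonical HF path using config/models.json. Model names are matched
# case-insensitively, per this PR.
def resolve_model_name(name: str, config_path: Path = Path("config/models.json")) -> str:
    with open(config_path) as f:
        configs = json.load(f)
    name = name.lower()
    for hf_path, cfg in configs.items():
        if name == hf_path.lower() or name in (a.lower() for a in cfg.get("aliases", [])):
            return hf_path
    raise ValueError(f"unknown model name or alias: {name}")
```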

Example usage:

python torchchat.py generate llama2

# Can also explicitly download via
python torchchat.py download llama2

As a follow-up, I intend to refactor the CLI model positional-arg handling. It might also be nice to intelligently handle multiple file types with the positional arg, such as a GGUF file.

Test Plan:

python torchchat.py generate llama2

rm -rf .model-artifacts
python torchchat.py download llama2
python torchchat.py generate llama2

python torchchat.py generate meta-llama/Llama-2-7b-chat-hf
python torchchat.py generate llama2 --dtype fp16 --device cuda

python torchchat.py generate --checkpoint-path=.model-artifacts/meta-llama/Llama-2-7b-chat-hf/model.pth --dtype fp16 --device cuda

python torchchat.py generate stories15M
python torchchat.py generate stories110M
python torchchat.py generate mistral-7b-instruct

CI for the model options is covered here:
--gguf-path:

- name: Run GGUF export + inference

--dso-path:
python3 -W ignore generate.py --dtype ${DTYPE} --checkpoint-path "$CHECKPOINT_PATH" --dso-path "$MODEL_DIR/${MODEL_NAME}.so" --prompt "$PROMPT" --device "$TARGET_DEVICE" > "$MODEL_DIR/output_aoti" || exit 1

--pte-path:
python3 -W ignore export.py --checkpoint-path "$CHECKPOINT_PATH" --output-pte-path "$MODEL_DIR/${MODEL_NAME}.pte" -d "fp32" || exit 1

Since there are many ways to load a model, I'm relying on CI to exercise many of the paths.

@facebook-github-bot added the "CLA Signed" label (managed by the Meta Open Source bot) on Apr 17, 2024
@GregoryComer force-pushed the streamline-download branch 3 times, most recently from 9d595bd to 7a60392, on April 17, 2024 10:31
@GregoryComer changed the title from "Implement download subcommand (WIP)" to "Implement download subcommand, positional model name argument (WIP)" on Apr 17, 2024
@mergennachin (Contributor) commented:

Make sure to run the linter.

Setup:

pip install -r requirements-lintrunner.txt
lintrunner init

lintrunner -a --all-files

@GregoryComer changed the title from "Implement download subcommand, positional model name argument (WIP)" to "Implement download subcommand, positional model name argument" on Apr 17, 2024
@GregoryComer marked this pull request as ready for review on April 17, 2024 19:46
@GregoryComer changed the title from "Implement download subcommand, positional model name argument" to "Implement download subcommand, optional positional model name argument" on Apr 17, 2024
@mergennachin (Contributor) commented:

Can you add a CI test to exercise the download path?

@GregoryComer force-pushed the streamline-download branch 5 times, most recently from 4f56a24 to 29516d5, on April 17, 2024 22:09
@GregoryComer (Member, Author) commented on Apr 17, 2024:

> Can you add a CI test to exercise the download path?

I'm going to actually defer this because converting some of the larger models takes over an hour. We do need CI coverage, but I might need to experiment with runner size and the choice of model, and I want to land this to unblock others.

Tracking via T186104081.

@GregoryComer force-pushed the streamline-download branch 3 times, most recently from 88859fd to 3f6eb29, on April 17, 2024 23:37
@mikekgfb (Contributor) commented:

> Can you add a CI test to exercise the download path?
>
> I'm going to actually defer this because converting some of the larger models takes over an hour. We do need CI coverage, but I might need to experiment with runner size and the choice of model, and I want to land this to unblock others.
>
> Tracking via T186104081.

Sounds like you should download a GGUF file that's heavily quantized, and/or stories15M!

@GregoryComer (Member, Author) commented on Apr 18, 2024:

> Can you add a CI test to exercise the download path?
>
> I'm going to actually defer this because converting some of the larger models takes over an hour. We do need CI coverage, but I might need to experiment with runner size and the choice of model, and I want to land this to unblock others.
> Tracking via T186104081.
>
> Sounds like you should download a GGUF file that's heavily quantized, and/or stories15M!

GGUF has its own conversion logic. Stories is also a little special because it has a unique format, and I'll have to add special logic to handle the download. That being said, it would be nice to have, so I'll probably do that.

I want to look more into why it takes upwards of an hour to convert a 7B model on the runner, though. Seems like something is wrong. It shouldn't take that long to shuffle around the weights.

Edit:
I've done a bit more refactoring and added support for the stories models via the positional argument. There is a new ModelConfig class and dict that encapsulate the differences in download and conversion.
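Roughly, a sketch of the shape such a config could take (the field and enum names here are illustrative assumptions, not the exact implementation):

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Sequence

# Hypothetical sketch of a per-model config entry. The distribution channel
# distinguishes HuggingFace snapshot downloads from direct-URL downloads
# (e.g. the stories checkpoints); names are assumptions for illustration.
class DistributionChannel(Enum):
    HUGGINGFACE_SNAPSHOT = "HuggingFaceSnapshot"
    DIRECT_DOWNLOAD = "DirectDownload"

@dataclass
class ModelConfig:
    name: str
    aliases: Sequence[str] = field(default_factory=list)
    channel: DistributionChannel = DistributionChannel.HUGGINGFACE_SNAPSHOT
    distribution_path: str = ""

# Known models keyed by canonical name; loaded from config/models.json in practice.
MODEL_CONFIGS: Dict[str, ModelConfig] = {
    "stories15M": ModelConfig(
        name="stories15M",
        aliases=["stories15m"],
        channel=DistributionChannel.DIRECT_DOWNLOAD,
        distribution_path="https://example.com/stories15M.pt",  # placeholder URL
    ),
}
```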

@GregoryComer force-pushed the streamline-download branch 6 times, most recently from 8fbc926 to da23171, on April 18, 2024 10:28
@byjlw (Contributor) left a comment:

Resolve Michael's issue, merge in the changes to support Llama3, and then merge.

cli.py (outdated):
"--checkpoint-dir",
type=Path,
default=None,
help="Model checkpoint directory.",
Contributor:

not sure what you mean by developer-only option

.github/workflows/pull.yml (resolved)
cli.py (outdated), comment on lines 101 to 106:
parser.add_argument(
"--gguf-path",
type=Path,
default=None,
help="GGUF file path.",
)
Contributor:

I see that currently, specifying this together with a DSO or PTE path is only a warning; IMO we should hard-error, because it's easily fixed and a great way to waste a lot of time.

Member (Author):

Agreed, though we should probably take this as a follow-up.
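For what it's worth, a minimal sketch of what that hard error could look like at argument-validation time (the function name and attribute access are assumptions that mirror the CLI flags discussed above):

```python
import argparse

# Hypothetical sketch: fail fast when --gguf-path is combined with an
# exported artifact path, instead of only emitting a warning.
def validate_model_source(args: argparse.Namespace) -> None:
    exported = [p for p in (getattr(args, "dso_path", None), getattr(args, "pte_path", None)) if p]
    if getattr(args, "gguf_path", None) and exported:
        raise RuntimeError("--gguf-path cannot be combined with --dso-path or --pte-path")
```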

Comment on lines +30 to +31
if model_dir is None:
model_dir = Path("checkpoints/meta-Transformer/Transformer-2-7b-chat-hf")
Contributor:

why is the default something that's not even in models.json?

Member (Author):

@mikekgfb Do we need this default value anymore?

Contributor:

no, but we should put this or a similarly situated chat model into the models.json.

BTW, I really think it's bad to have even the model name default to something (unless we're so excited about llama3 that we make it that.... but that will require users to have obtained a token)

Comment on lines +80 to +81
if model in model_aliases:
model = model_aliases[model]
Contributor:

nit: model = model_aliases.get(model, model) is shorter FWIW

Comment on lines +63 to +64
print(f"Downloading {url}...")
urllib.request.urlretrieve(url, str(local_path.absolute()))
Contributor:

would be nice to use progressbar or tqdm to show a progress bar since these downloads can be big; can leave for follow-up

Member (Author):

I was thinking about this for checkpoint conversion, as well. My only concern was an additional dependency, but if that's not a worry, I can go ahead and add it.

Contributor:

I think that makes sense. Let's make sure we lazily import it; maybe I don't want to wait for the import if I'm not downloading/converting.
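As one possible follow-up, a sketch of a lazily imported tqdm progress hook wrapped around the existing urllib.request.urlretrieve call (the helper name is hypothetical):

```python
import urllib.request
from pathlib import Path

# Hypothetical sketch: show a progress bar for large downloads while only
# importing tqdm when a download actually happens.
def download_with_progress(url: str, local_path: Path) -> None:
    from tqdm import tqdm  # lazy import: skipped entirely if nothing is downloaded

    print(f"Downloading {url}...")
    with tqdm(unit="B", unit_scale=True, desc=url.split("/")[-1]) as bar:
        def reporthook(block_num: int, block_size: int, total_size: int) -> None:
            if total_size > 0:
                bar.total = total_size
            bar.update(block_num * block_size - bar.n)

        urllib.request.urlretrieve(url, str(local_path.absolute()), reporthook=reporthook)
```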

@GregoryComer merged commit f08eb05 into pytorch:main on Apr 19, 2024
19 checks passed
@mikekgfb (Contributor) left a comment:

Please review and address the comments (either explain why we should do something else, or do as suggested; either works, but we should document why we chose what we chose).

@@ -134,9 +144,12 @@ def from_args(cls, args): # -> TokenizerArgs:

if args.tokenizer_path:
tokenizer_path = args.tokenizer_path
elif args.model: # Using a named, well-known model
model_config = resolve_model_config(args.model)
tokenizer_path = Path(args.model_directory) / model_config.name / "tokenizer.model"
Contributor:

Well-known doesn't mean it's local. How do you know where the tokenizer is?


@@ -546,8 +550,6 @@ def callback(x):


def main(args):
is_chat = args.subcommand == "chat"

# If a named model was provided and not downloaded, download it.
if args.model and not is_model_downloaded(args.model, args.model_directory):
Contributor:

You should intercept this in a central place, like in cli(), because all subcommands basically need to do the same thing; otherwise we duplicate it in a gazillion places.
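A sketch of what that centralized check might look like right after argument parsing (the hook point and download_model() are placeholders, not the actual code; is_model_downloaded() is the existing helper shown above):

```python
from argparse import Namespace

# Hypothetical sketch: run the "download if missing" check once, right after
# argument parsing, instead of repeating it in every subcommand's main().
def ensure_model_downloaded(args: Namespace) -> None:
    if getattr(args, "model", None) and not is_model_downloaded(args.model, args.model_directory):
        download_model(args)  # placeholder for the real download entry point
```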

@mikekgfb deleted the streamline-download branch on April 19, 2024 21:34
malfet pushed a commit that referenced this pull request on Jul 17, 2024 (#234)

* Implement download option

* Add support for model aliases

* Support model name as a positional parameter

* Merge GenerateArgs changes

* Run lint

* Revert chat subcommand/arg changes

* Add mistral-7b-instruct alias, fix lints

* Add model config for known models

* Move known model config to config/models.json

* Make model names case-insensitive

* Move known model configuration from build/model.py to config/model_config.py

* Fix lints

* Fixing issues after rebasing

* Update README