Better error messages #63

brentyi · 2023-08-25T17:06:26Z

This PR takes an initial step in the direction of #62. I expect there's more work to be done; I'd also like to add tests before merging. @vwxyzjn if you have any thoughts or suggestions I'd also be interested!

(1)
We previously relied on argparse for most of the error messages reported by tyro. We continue to do this, but the formatting now better emphasizes the actual error.

Before:

After:

(2)
When an invalid value is passed in, we show the argument's documentation + helptext.

Before:

After:

(3)
When an argument that doesn't exist is passed in, we now highlight similar arguments.

Before:

After:

(4)
Finally, when subcommands are used, similar arguments are coupled with the subcommands that they can be found in:
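As a rough illustration of item (4), the sketch below groups close matches by the subcommand that owns them. It is not tyro's implementation: `find_similar`, the `flags_by_subcommand` mapping, and the 0.6 cutoff are hypothetical names and values chosen for the example. Only `difflib.get_close_matches` is a real standard-library call.

```python
import difflib
from typing import Dict, List


def find_similar(
    unknown: str, flags_by_subcommand: Dict[str, List[str]]
) -> Dict[str, List[str]]:
    """Map each subcommand to the flags that resemble `unknown`."""
    results: Dict[str, List[str]] = {}
    for subcommand, flags in flags_by_subcommand.items():
        # cutoff=0.6 is difflib's default; it is an assumption here, not tyro's value.
        matches = difflib.get_close_matches(unknown, flags, n=3, cutoff=0.6)
        if matches:
            results[subcommand] = matches
    return results


# Hypothetical CLI layout, for illustration only.
flags = {
    "dataset:mnist": ["--dataset.binary"],
    "optimizer:sgd": ["--optimizer.lr", "--optimizer.momentum"],
}

# A misspelled flag is reported together with the subcommand it belongs to.
for subcommand, matches in find_similar("--dataset.binray", flags).items():
    print(f"{', '.join(matches)}  (in {subcommand} --help)")
```

Keeping the subcommand as the dictionary key means the report can tell the user where a flag belongs, not only that a similar flag exists.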

codecov · 2023-08-25T17:13:06Z

Codecov Report

Patch coverage: 100.00% and project coverage change: +0.16% 🎉

Comparison is base (5340498) 98.96% compared to head (4e71903) 99.12%.

Additional details and impacted files

@@            Coverage Diff             @@
##             main      #63      +/-   ##
==========================================
+ Coverage   98.96%   99.12%   +0.16%     
==========================================
  Files          23       23              
  Lines        1829     1953     +124     
==========================================
+ Hits         1810     1936     +126     
+ Misses         19       17       -2

Flag	Coverage Δ
unittests	`99.12% <100.00%> (+0.16%)`	⬆️

Flags with carried forward coverage won't be shown.

Files Changed	Coverage Δ
tyro/_fields.py	`99.66% <ø> (ø)`
tyro/_parsers.py	`98.66% <ø> (ø)`
tyro/_argparse_formatter.py	`97.02% <100.00%> (+2.78%)`	⬆️
tyro/_arguments.py	`100.00% <100.00%> (ø)`
tyro/_calling.py	`100.00% <100.00%> (ø)`
tyro/_cli.py	`100.00% <100.00%> (ø)`
tyro/_instantiators.py	`98.72% <100.00%> (+<0.01%)`	⬆️
tyro/_strings.py	`100.00% <100.00%> (ø)`



brentyi · 2023-08-26T23:43:44Z

(5)

Improvements when subcommands are present.

For example, when we pass in an argument in the wrong location:

# Incorrect
python 03_multiple_subcommands.py dataset:mnist --dataset.binary True optimizer:sgd

# Correct
python 03_multiple_subcommands.py dataset:mnist optimizer:sgd --dataset.binary True

The old error would be completely irrelevant:

Now:

This has consequences for a lot of things. For example, in nerfstudio --data is a valid argument but --dataset is not.

Previously, if you ran ns-train nerfacto --dataset /some_path the error would be super cryptic:

Now, we get:

vwxyzjn · 2023-08-26T23:57:23Z

Love it! Maybe in the error message, also give out the docs for the Similar arguments:, so the user would not necessarily need to do python my.py --help.

vwxyzjn · 2023-08-27T00:00:52Z

Maybe another suggestion is that the Parsing error should appear before the usage section, since the Parsing error is more informative and directly helpful.

brentyi · 2023-08-27T10:24:58Z

Makes sense! I added the help messages, so as an example if you write --model.features-per-level instead of --pipeline.model.features-per-level you now get:

On the ordering, do you feel strongly about this? To me it's slightly more intuitive to have the most relevant (error) message on the bottom, since it reduces the likelihood that the user needs to scroll in order to see it. Kind of like a stack trace; it's also closer to the original argparse error format.

In any case I'm going to let this PR stew for a few days before merging + releasing; I'd like to spend more time thinking about edge cases + test cases.

vwxyzjn · 2023-08-27T13:54:39Z

Nice! Did you push the related changes? I am still getting the same result.

pip install git+https://github.com/brentyi/tyro.git@brent/better_errors
Collecting git+https://github.com/brentyi/tyro.git@brent/better_errors
  Cloning https://github.com/brentyi/tyro.git (to revision brent/better_errors) to /tmp/pip-req-build-a_nma7k_
  Running command git clone --filter=blob:none --quiet https://github.com/brentyi/tyro.git /tmp/pip-req-build-a_nma7k_
  Running command git checkout -b brent/better_errors --track origin/brent/better_errors
  Switched to a new branch 'brent/better_errors'
  Branch 'brent/better_errors' set up to track remote branch 'brent/better_errors' from 'origin'.
  Resolved https://github.com/brentyi/tyro.git to commit ef6e733d301c8f8cd0609d986cab59e4012905c5

brentyi · 2023-08-28T01:39:48Z

Does it work with your original flag, --rewar.flag? The current similarity metric is just based on difflib and fairly naive, so --reward.track and --track aren't considered similar enough. I can add some heuristics for catching cases like this before merging.
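To make the limitation concrete, here is a minimal sketch of a difflib-based score plus one possible heuristic. It is an assumption-laden illustration, not the code in this PR: `similarity`, `is_similar`, and the 0.8 threshold are hypothetical, and the suffix check is only one way such a heuristic could look.

```python
import difflib


def similarity(a: str, b: str) -> float:
    """Return difflib's similarity ratio in [0, 1] for two flag names."""
    return difflib.SequenceMatcher(None, a, b).ratio()


def is_similar(unknown: str, known: str, threshold: float = 0.8) -> bool:
    """Hypothetical check: fuzzy match, or same name after the last dot."""
    if similarity(unknown, known) >= threshold:
        return True
    # Heuristic (assumed, not from the PR): `--track` should match
    # `--reward.track` because the final dotted component is identical.
    return known.split(".")[-1] == unknown.lstrip("-")


# The raw ratio is penalized by the long `reward.` prefix, so a strict
# threshold alone would reject this pair.
print(round(similarity("--track", "--reward.track"), 3))
print(is_similar("--track", "--reward.track"))
print(is_similar("--track", "--reward.seed"))
```

A pure edit-distance style score treats a missing prefix like any other difference, which is why a structural check on the dotted name can catch cases the ratio misses.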

vwxyzjn · 2023-08-30T22:24:36Z

Btw really good work! I am loving the changes.

On the ordering, do you feel strongly about this? To me it's slightly more intuitive to have the most relevant (error) message on the bottom, since it reduces the likelihood that the user needs to scroll in order to see it. Kind of like a stack trace; it's also closer to the original argparse error format.

I do like the fact that the error shows up at the bottom, but at the same time most of the messages do not feel useful. For example, I have the following with the latest PR. It's not until the end that I see the most useful message.

Maybe as an alternative you can say "for usage info please run python xx.py --help" and just show the parse error message?

 python lm_human_preference_details/train_both_accelerate.py --track
2023-08-30 17:10:35.136557: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
usage: train_both_accelerate.py [-h] [--exp-name STR] [--seed INT]
                                [--reward.exp-name STR]
                                [--reward.seed INT]
                                [--reward.track | --reward.no-track]
                                [--reward.wandb-project-name STR]
                                [--reward.wandb-entity {None}|STR]
                                [--reward.cuda | --reward.no-cuda]
                                [--reward.base-model STR]
                                [--reward.label-dataset STR]
                                [--reward.local-batch-size INT]
                                [--reward.gradient-accumulation-steps INT]
                                [--reward.lr FLOAT] [--reward.eps FLOAT]
                                [--reward.rollout-batch-size INT]
                                [--reward.local-normalize-samples INT]
                                [--reward.debug-normalize INT]
                                [--reward.normalize-before | 
--reward.no-normalize-before]
                                [--reward.normalize-after | 
--reward.no-normalize-after]
                                [--reward.print-sample-output-freq INT]
                                [--reward.save-path STR]
                                [--reward.use-tensorflow-adam | 
--reward.no-use-tensorflow-adam]
                                [--reward.task.query-length INT]
                                [--reward.task.query-dataset STR]
                                [--reward.task.query-prefix STR]
                                [--reward.task.query-suffix STR]
                                [--reward.task.start-text {None}|STR]
                                [--reward.task.end-text {None}|STR]
                                [--reward.task.response-length INT]
                                [--reward.task.temperature FLOAT]
                                [--reward.labels.type STR]
                                [--reward.labels.num-train INT]
                                [--reward.labels.num-labels INT]
                                [--reward.labels.source STR]
                                [--policy.exp-name STR]
                                [--policy.seed INT]
                                [--policy.track | --policy.no-track]
                                [--policy.wandb-project-name STR]
                                [--policy.wandb-entity {None}|STR]
                                [--policy.cuda | --policy.no-cuda]
                                [--policy.base-model STR]
                                [--policy.print-sample-output-freq INT]
                                [--policy.save-path STR]
                                [--policy.use-tensorflow-adam | 
--policy.no-use-tensorflow-adam]
                                [--policy.task.query-length INT]
                                [--policy.task.query-dataset STR]
                                [--policy.task.query-prefix STR]
                                [--policy.task.query-suffix STR]
                                [--policy.task.start-text {None}|STR]
                                [--policy.task.end-text {None}|STR]
                                [--policy.task.response-length INT]
                                [--policy.task.truncate-token INT]
                                [--policy.task.truncate-after INT]
                                [--policy.task.penalty-reward-value INT]
                                [--policy.task.temperature FLOAT]
                                [--policy.rewards.kl-coef FLOAT]
                                [--policy.rewards.trained-model 
{None}|STR]
                                [--policy.ppo.total-episodes INT]
                                [--policy.ppo.local-batch-size INT]
                                [--policy.ppo.gradient-accumulation-steps 
INT]
                                [--policy.ppo.nminibatches INT]
                                [--policy.ppo.noptepochs INT]
                                [--policy.ppo.lr FLOAT]
                                [--policy.ppo.eps FLOAT]
                                [--policy.ppo.vf-coef FLOAT]
                                [--policy.ppo.cliprange FLOAT]
                                [--policy.ppo.cliprange-value FLOAT]
                                [--policy.ppo.gamma FLOAT]
                                [--policy.ppo.lam FLOAT]
                                [--policy.ppo.whiten-rewards | 
--policy.ppo.no-whiten-rewards]
                                [{policy.rewards.adaptive-kl:adaptive-kl-p
arams,policy.rewards.adaptive-kl:None}]

╭─ Parsing error ──────────────────────────────────────────────────────────╮
│ Unrecognized arguments: --track                                          │
│ ──────────────────────────────────────────────────────────────────────── │
│ Similar arguments:                                                       │
│     --policy.track, --policy.no-track                                    │
│         if toggled, this experiment will be tracked with Weights and     │
│         Biases (default: False)                                          │
│             in train_both_accelerate.py --help                           │
│     --reward.track, --reward.no-track                                    │
│         if toggled, this experiment will be tracked with Weights and     │
│         Biases (default: False)                                          │
│             in train_both_accelerate.py --help                           │
╰──────────────────────────────────────────────────────────────────────────╯

brentyi · 2023-08-31T17:43:49Z

Ok! We now won't print usage messages that are 400 characters or longer:
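The idea can be sketched with plain argparse. This is not how tyro implements it: `format_error` and `max_usage_chars` are hypothetical names, and the fallback hint text is made up for the example. `ArgumentParser.format_usage()` and `prog` are real argparse API.

```python
import argparse


def format_error(
    parser: argparse.ArgumentParser, message: str, max_usage_chars: int = 400
) -> str:
    """Build an error report, including the usage string only when it is short."""
    usage = parser.format_usage()
    if len(usage) < max_usage_chars:
        header = usage.rstrip()
    else:
        # Long usage strings bury the error, so point at --help instead.
        header = f"For full usage, run: {parser.prog} --help"
    return f"{header}\nerror: {message}"


short_parser = argparse.ArgumentParser(prog="short.py")
short_parser.add_argument("--seed", type=int)

# Many options make the usage string far longer than the threshold.
long_parser = argparse.ArgumentParser(prog="long.py")
for i in range(100):
    long_parser.add_argument(f"--reward.option-{i}", type=int)

print(format_error(short_parser, "Unrecognized arguments: --track"))
print(format_error(long_parser, "Unrecognized arguments: --track"))
```

Passing `max_usage_chars=0` to this sketch would suppress the usage string in every case.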

vwxyzjn · 2023-09-03T20:58:52Z

Ok! We now won't print usage messages that are 400 characters or longer:

Nice! FYI absl does this:

poetry run python falcon.py 
FATAL Flags parsing error:
  flag --dataset_name=None: Flag --dataset_name must have a value other than None.
  flag --ckpt_path=None: Flag --ckpt_path must have a value other than None.
Pass --helpshort or --helpfull to see help on flags.

Maybe the Pass --helpshort or --helpfull to see help on flags message could be something to consider as well. Also possibly this can be made configurable as well (e.g., output_help_text_on_error=True).

brentyi · 2023-09-04T09:35:42Z

Interesting! To clarify, are you proposing both:

1. To introduce distinct --helpshort and --helpfull flags;
2. To include help messages like Flag --dataset_name must have a value other than None.?

vwxyzjn · 2023-09-04T14:01:03Z

To include help messages like Flag --dataset_name must have a value other than None.?

This. Currently

from dataclasses import asdict, dataclass, field
import tyro
@dataclass
class Args:
    exp_name: str
    seed: int = 1
    track: bool = False

args = tyro.cli(Args)

gives

but when it's long, it's hardly readable. E.g.,

Ok! We now won't print usage messages that are 400 characters or longer:

I am also proposing to make this configurable so that the user can choose not to print usage messages that are N characters or longer. Personally I would set N=0 because I could just type python teest.py --help if I want the help text; when the command fails, I am only interested in the relevant errors.

Thanks for being patient with my feature requests!

brentyi · 2023-09-04T22:31:20Z

Just made a PR with these suggestions, thanks!

I am also proposing to make this configurable so that the user can choose not to print usage messages that are N characters or longer. Personally I would set N=0 because I could just type python teest.py --help if I want the help text; when the command fails, I am only interested in the relevant errors.

For now I just did the equivalent of setting N=0. For things like this (especially while I still feel like we're iterating), I'd prefer to err on the side of being prescriptive. It's hard to remove configurable parameters once they've been added. 🙂

Thanks for being patient with my feature requests!

I appreciate your help with making the CLI experience better!

brentyi added 6 commits August 24, 2023 19:11

Improve error messages

664fdd9

Tweak error messages, fix types

847b131

Use difflib, add help messages

a1cfcf8

ruff + black --preview

60a6531

allow_abbrev=False in subparsers

3772c41

Fix exit codes

df189d3

Fix type errors

645075d

brentyi force-pushed the brent/better_errors branch from 1ab9fe2 to 645075d Compare August 25, 2023 17:24

brentyi added 5 commits August 25, 2023 10:27

Tune similarity thresholds

48fd39b

Fix metavar ordering regression

3170d70

Tweak unrecognized argument filter

a8b3ce1

Improve error messages for unrecognized arguments with subparsers (by a lot!)

f7a1a9f

Fix unrecognized argument false positives

29f6a05

brentyi force-pushed the brent/better_errors branch from 7605184 to 29f6a05 Compare August 26, 2023 23:12

brentyi added 3 commits August 26, 2023 16:14

Python 3.7-compatible mypy

1427f6c

Polish unrecognized argument error message when subcommands are present

9612e01

Fix for duplicate arguments

f432d58

Fix same_exists default

3a8a668

brentyi added 3 commits August 27, 2023 02:14

Print helptext of similar arguments

75d47bf

Refine helptext in error messages

a623dcd

Remove delete

ef6e733

brentyi added 2 commits August 28, 2023 23:29

Add tests + heuristics

1ff595d

Tweaks for coverage

b69307f

brentyi added 8 commits August 28, 2023 23:56

Fix or suppress mypy error

9034b23

Fix test

ef1daea

Bump version

6d17914

Error message cleanup, test fixes

2dbc399

Handle flags / custom actions in unrecognized argument errors

3f3e6f2

Fix Python 3.7

94fa93f

Merge branch 'main' into brent/better_errors

f4e3041

Hack for jax error

057863a

brentyi added 3 commits August 30, 2023 16:05

Revert jax hack

5026e81

Suppress long usage prints

30a28de

Error message consistency

15c3452

brentyi added 2 commits September 1, 2023 00:35

Test coverage

f61cb41

Fix test typo

4e71903

brentyi merged commit 83592d6 into main Sep 1, 2023
10 checks passed

brentyi deleted the brent/better_errors branch September 1, 2023 07:43

brentyi mentioned this pull request Sep 2, 2023

Bump tyro to 0.5.10 nerfstudio-project/nerfstudio#2393

Merged

brentyi mentioned this pull request Sep 4, 2023

More error message tweaks #67

Merged
