A Python-based CLI tool for tagging images with wd14 models.
I made this repo because I wanted to caption images cross-platform (on my old MBP, my Windows gaming PC, or a Docker-based Linux cloud server like Google Colab), but I didn't want to install a huge WebUI just for such a small job. Also, some cloud services are unfriendly to Gradio-based UIs.
So this repo was born.
All models are from SmilingWolf & Z3D (👏👏).
Hugging Face hosts the original sources; the ModelScope repos are pure mirrors of the Hugging Face ones (because Hugging Face is blocked in some regions).
Model | Hugging Face Link | ModelScope Link |
---|---|---|
wd-eva02-large-tagger-v3 | Hugging Face | ModelScope |
wd-vit-large-tagger-v3 | Hugging Face | ModelScope |
wd-swinv2-tagger-v3 | Hugging Face | ModelScope |
wd-vit-tagger-v3 | Hugging Face | ModelScope |
wd-convnext-tagger-v3 | Hugging Face | ModelScope |
wd-v1-4-moat-tagger-v2 | Hugging Face | ModelScope |
wd-v1-4-swinv2-tagger-v2 | Hugging Face | ModelScope |
wd-v1-4-convnextv2-tagger-v2 | Hugging Face | ModelScope |
wd-v1-4-vit-tagger-v2 | Hugging Face | ModelScope |
wd-v1-4-convnext-tagger-v2 | Hugging Face | ModelScope |
wd-v1-4-vit-tagger | Hugging Face | ModelScope |
wd-v1-4-convnext-tagger | Hugging Face | ModelScope |
Z3D-E621-Convnext | Hugging Face | ModelScope |
Make a simple UI with Jupyter widgets (when my laziness is finally cured 😊).
Python 3.10-3.12 works fine.
Open a shell terminal and follow the steps below:
# Clone this repo
git clone https://github.com/fireicewolf/wd14-tagger-cli.git
cd wd14-tagger-cli
# Create a Python venv
python -m venv .venv
# Activate it (on Linux/macOS use `source .venv/bin/activate` instead)
.venv\Scripts\activate
# Install dependencies
# Base dependencies; inference models will be downloaded via Python request libraries.
pip install -U -r requirements.txt
# If you want to download or cache models via the Hugging Face Hub, install this.
pip install -U -r huggingface-requirements.txt
# If you want to download or cache models via the ModelScope hub, install this.
pip install -U -r modelscope-requirements.txt
# If you want to use CUDA devices for inference, install one of these two depending on your CUDA version.
# For CUDA 11.8
pip install -U -r cuda118-requirements.txt
# For CUDA 12.x
pip install -U -r cuda12x-requirements.txt
CUDA, ROCm and OpenVINO execution providers are included in the inference code, but not all of them have been tested (ROCm and OpenVINO are untested).
In the code, the device priority for inference is CUDA -> ROCm -> OpenVINO -> CPU.
You may need to install an extra SDK or pip package for ROCm or OpenVINO to work; please follow the relevant doc on the onnxruntime.ai website.
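If you want to double-check which execution providers your installed onnxruntime build actually exposes (an optional sanity check, not something this tool requires), you can run the one-liner below inside the activated venv. It uses onnxruntime's standard get_available_providers() API and only reports what is available, not what this tool will pick.
# Optional: list the execution providers your onnxruntime build exposes
python -c "import onnxruntime; print(onnxruntime.get_available_providers())"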
Make sure your python venv has been activated first!
python caption.py your_datasets_path
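As a concrete starting point, assuming your images live in a folder named my_dataset (a hypothetical path) and you also want to scan its subfolders, a run could look like this; the --recursive flag is explained in the options below.
# Hypothetical example: tag every supported image under my_dataset, including subfolders
python caption.py ./my_dataset --recursive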
To run with more options, you can get help by running the command below, or see the Options section.
python caption.py -h
Advanced options
A combined example using several of these options is shown after the list.
--recursive
Include all supported image formats found in your input dataset path and its subdirectories.
--force_use_cpu
Force using the CPU for inference.
--batch_size N
Batch size for inference, default is 1.
--model_name MODEL_NAME
ONNX model name used for inference; default is wd-swinv2-v3 (for more models, check config/default.json).
--model_site MODEL_SITE
Model site to download the ONNX model from (huggingface or modelscope); default is huggingface.
--models_save_path MODEL_SAVE_PATH
Path where models are saved; default is models (under the project folder).
--download_method
Download models via sdk or url; default is sdk.
If the huggingface_hub or modelscope SDK is not installed, or the download fails, it will automatically retry with url download.
--use_sdk_cache
Use the Hugging Face or ModelScope SDK cache to store models; this option requires the huggingface_hub or modelscope SDK to be installed.
If enabled, --models_save_path will be ignored.
--custom_onnx_path CUSTOM_ONNX_PATH
--custom_csv_path CUSTOM_CSV_PATH
These two args need to be used together; they let you use an existing model of your own.
--custom_caption_save_path CUSTOM_CAPTION_SAVE_PATH
Save caption files to a custom path instead of alongside the images (their directory structure is preserved).
--log_level LOG_LEVEL
Log level for the terminal console and log file; default is INFO (choices: DEBUG, INFO, WARNING, ERROR, CRITICAL).
--save_logs
Save logs to a file; the log will be saved at the same level as data_dir_path.
--caption_extension CAPTION_EXTENSION
Caption file extension, default is .txt
--append_tags APPEND_TAGS
Append tags to the caption file if it already exists.
--not_overwrite
Do not overwrite the caption file if it already exists.
--remove_underscore
Remove "_" symbol in tags(not include kmoji like o_o).
--undesired_tags UNDESIRED_TAGS
Tags you don't want to appear in captions; separate them with commas, e.g. "black,yellow".
--tags_frequency
Enabling this will collect statistics on how often each tag occurred.
--threshold THRESHOLD
Confidence threshold for adding a tag to the caption; default is 0.35.
--general_threshold GENERAL_THRESHOLD
Confidence threshold for tags in the general category; if not set, --threshold is used.
--character_threshold CHARACTER_THRESHOLD
Confidence threshold for tags in the character category; if not set, --threshold is used.
--add_rating_tags_to_first
Add rating tags at the beginning of the caption.
--add_rating_tags_to_last
Add rating tags at the end of the caption.
--character_tags_first
Put character tags at the beginning of the caption.
--always_first_tags ALWAYS_FIRST_TAGS
Tags (comma-separated, e.g. "1boy,solo") that you want to put at the beginning of the caption.
--caption_separator CAPTION_SEPARATOR
Separator for captions (include a space if needed); default is ", ".
--tag_replacement TAG_REPLACEMENT
Tag replacement in the format "source1,target1;source2,target2;...". Escape , and ; with \. e.g. "tag1,tag2;tag3,tag4" replaces tag1 with tag2 and tag3 with tag4.
--character_tag_expand
Expand the trailing parenthesis of a character tag into a separate tag. e.g. character_name_(series) will be expanded to character_name, series.
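Putting several of the options above together, a more involved run might look like the sketch below (shell line continuations are shown for readability; the dataset path, model name, thresholds and tag lists are illustrative values, and the model name must match an entry in config/default.json):
# Illustrative example combining several advanced options
python caption.py ./my_dataset --recursive \
    --model_name wd-eva02-large-tagger-v3 \
    --model_site modelscope \
    --batch_size 4 \
    --threshold 0.35 \
    --character_threshold 0.85 \
    --undesired_tags "simple_background,white_background" \
    --remove_underscore \
    --character_tags_first \
    --save_logs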
Most of the tag-processing code is from kohya-ss/sd-scripts.
Some of the image-processing code is from toriato/stable-diffusion-webui-wd14-tagger and SmilingWolf/wd-tagger.
Without their work (👏👏), this repo wouldn't exist.