Anime Style Illustration Specific Image Search App with ViT Tagger x BM25/Doc2Vec

image

What's This?

  • An image search app specialized for anime-style illustrations, built with ML techniques
    • It can also be used for photos, but flexible photo search is already offered by Google Photos and similar services :)
  • For some reason, cloud photo album services search illustration image files poorly
  • So I wrote some simple scripts

Method

  • Images matching the query text are searched for in a latent semantic vector space and with BM25
    • Vectors are generated with an embedding model: a tagger that uses a Vision Transformer (ViT) internally, combined with Doc2Vec
    • Scores calculated with BM25 are used in combination
    • An internal re-ranking method is also introduced
      • Assumption: users asymptotically improve their queries based on the top search results and eventually find appropriate queries
      • If you want to know the details of the method, please read webui.py :)
  • Doc2Vec is mainly used to compensate for tagging precision
    • Simple search logic could be implemented with BM25 only
    • However, because the index data is composed of vectors generated with a Doc2Vec model, you can also search with tags that are difficult for the tagger to assign
      • Implemented with the Gensim library
  • ( The Web UI is implemented with Streamlit )
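
As a rough illustration of combining a BM25 score with a vector similarity, here is a minimal pure-Python sketch. It is not the logic in webui.py: the function names, the `alpha` mixing weight, and the max-normalization of BM25 scores are all assumptions made for this example, and the plain lists passed as `query_vec` / `doc_vecs` only stand in for vectors that a Doc2Vec model would produce.

```python
import math
from collections import Counter


def bm25_scores(query_tags, docs, k1=1.5, b=0.75):
    """Return one BM25 score per document (each document is a list of tags)."""
    if not docs:
        return []
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents in which each tag appears.
    df = Counter(tag for d in docs for tag in set(d))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        score = 0.0
        for tag in query_tags:
            if tag not in tf:
                continue
            idf = math.log((n_docs - df[tag] + 0.5) / (df[tag] + 0.5) + 1.0)
            norm = tf[tag] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[tag] * (k1 + 1) / norm
        scores.append(score)
    return scores


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(a * c for a, c in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(c * c for c in v))
    return dot / (nu * nv) if nu and nv else 0.0


def combined_ranking(query_tags, query_vec, docs, doc_vecs, alpha=0.5):
    """Rank document indices by a weighted mix of BM25 and cosine similarity.

    alpha is a hypothetical mixing weight, not a value taken from the repo.
    """
    bm25 = bm25_scores(query_tags, docs)
    top = max(bm25, default=0.0) or 1.0  # avoid dividing by zero
    combined = [
        alpha * (s / top) + (1 - alpha) * cosine(query_vec, v)
        for s, v in zip(bm25, doc_vecs)
    ]
    return sorted(range(len(docs)), key=lambda i: combined[i], reverse=True)
```

The vector term lets an image rank above unrelated ones even when none of its tags literally match the query, which is the role the list above gives to Doc2Vec.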

Usage

  • (Environment in which operation has been confirmed)
    • (Windows 11 Pro 64bit 23H2)
    • Python 3.10.4
    • pip 22.0.4
  • $ pip install -r requirements.txt
  • $ python tagging.py --dir "IMAGE FILES CONTAINED DIR PATH"
    • The script searches the directory structure recursively :)
    • This takes quite a while...
      • About 1.7 sec/file on a mid-range desktop PC (GPU is not used)
        • AMD Ryzen 7 5700X 8-Core Processor 4.50 GHz
      • You may be able to speed this up by setting up the libraries and drivers needed to use a GPU :)
        • Please see here
          • The current PyTorch version of this repo is v2.4.1
        • You should additionally install a PyTorch package with CUDA support that matches the CUDA library on your machine. A cuDNN library that matches the CUDA library should also be installed :)
          • Example PyTorch install command line: $ pip install torch==2.4.1 torchvision==0.19.1 torchaudio==2.4.1 --index-url https://download.pytorch.org/whl/cu121
          • If your graphics card is not supported by the CUDA 12.1.x library, you should change the versions of the torchXXXXX packages
    • Paths and tags of image files are saved to tags-wd-tagger.txt
  • $ python genmodel.py
    • This takes quite a while...
  • $ streamlit run webui.py
    • The search app opens in your web browser
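
The recursive directory search mentioned for tagging.py can be pictured with a short sketch like the following. This is only an illustration: `find_image_files` and the `IMAGE_EXTENSIONS` set are hypothetical names, and the real script may accept a different set of file types.

```python
from pathlib import Path

# Hypothetical set of extensions; the real script may accept a different set.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".webp", ".bmp"}


def find_image_files(root):
    """Recursively collect image file paths under root, sorted for stable output."""
    root_path = Path(root)
    return sorted(
        p
        for p in root_path.rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )
```

Counting the returned list before running the tagger gives a rough time estimate at about 1.7 sec/file on CPU.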

Index Data Updating

  • When you obtain and store new image files, you should update the index data so that the new files can be hit by searches on webui.py
  • Procedure
    • 1 Back up all files generated by the scripts in this repo!
      • Model files in your home directory are the exception :)
    • 2 $ python tagging.py --dir "IMAGE FILES CONTAINED DIR PATH" --after "YYYY-MM-DD"
      • The --dir parameter doesn't have to be changed
      • Adding the --after option is required. Please specify a date after the last creation or update of the index data
        • Tagging targets are filtered by the specified date: YYYY-MM-DD <= added date (cdate attribute)
    • 3 $ python genmodel.py --update
    • That's all!
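
A date filter of the kind the --after option implies could look like the sketch below. It assumes that the option keeps files added on or after the given date, as its name suggests, and that the added date is read from `st_ctime`. The helper names are hypothetical and the real tagging.py may work differently.

```python
import os
from datetime import datetime


def added_on_or_after(path, after):
    """Return True if the file's st_ctime is on or after the YYYY-MM-DD date."""
    threshold = datetime.strptime(after, "%Y-%m-%d")
    # On Windows, st_ctime has historically reported the creation time.
    # On Unix-like systems it is the time of the last metadata change.
    created = datetime.fromtimestamp(os.stat(path).st_ctime)
    return created >= threshold


def filter_new_files(paths, after):
    """Keep only the paths that pass the date check (assumed --after semantics)."""
    return [p for p in paths if added_on_or_after(p, after)]
```

Because the meaning of `st_ctime` differs between operating systems, files copied or restored from a backup may get a newer date than expected and be tagged again.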

Use Character Image Feature Based Reranking Mode (Optional)

  • Reranking based on similarity calculation with a quantized CCIP (Contrastive Anime Character Image Pre-Training) model
    • When the index data described below exists, this mode becomes selectable in webui.py
  • Additional index data preparation is needed
    • $ python gen_cfeatures.py --dir "IMAGE FILES CONTAINED DIR PATH"
      • The PyPI modules in requirements_features.txt are needed instead of the modules listed in requirements.txt...
      • You should use venv (virtualenv) to get an isolated Python environment, and the 'onnxruntime-gpu' module may crash on your machine...
        • If the 'onnxruntime-gpu' module does not work, please uninstall it and install the normal 'onnxruntime'...
      • Best of luck!
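
To make the idea of feature-based reranking concrete, here is a minimal sketch that reorders search results by similarity to one reference image. Cosine similarity is swapped in as a stand-in metric, since the comparison actually used with CCIP features may differ. The `features` dictionary (path to feature vector) and both function names are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def rerank_by_feature(results, features, reference_path):
    """Reorder results so images most similar to the reference come first.

    features maps each path to a precomputed feature vector; here it is a
    plain dict standing in for the additional index data.
    """
    ref = features[reference_path]
    return sorted(
        results,
        key=lambda path: cosine_similarity(features[path], ref),
        reverse=True,
    )
```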

Usage (Binary Package of Windows at Release Page)

  • Same as above, except that you do not need to run python and the execution path (current path) is a little different :)
  • First, unzip the package and launch Command Prompt or PowerShell :)
  • $ cd anime-illust-image-searcher-pkg
  • $ .\cmd_run\cmd_run.exe tagging --dir "IMAGE FILES CONTAINED DIR PATH"
  • $ .\cmd_run\cmd_run.exe genmodel
    • Same as above :)
  • $ .\run_webui.exe
    • The search app opens in your web browser!

Tips (Attention)

  • Words (tags) that did not appear during tagging cannot be used in queries
    • Solution
      • Check whether the words you want to use exist by searching tags-wd-tagger.txt with grep, an editor, or something similar
      • If they exist, there is no problem. If not, think of similar words and search for them in the same manner :)
  • Specifying each tag's weight (format -> TAG:WEIGHT, WEIGHT should be an integer)
    • Examples
      • "girl:3 dragon"
      • "girl:2 boy:3"
    • Exclude tag marking
      • A weight specification that starts with '-' indicates that images tagged with it should be excluded
      • ex: "girl boy:-3"
        • Images tagged 'boy' are removed from the results. The numerical weight value is ignored but can't be omitted :)
    • Required tag marking
      • A weight specification that starts with '+' indicates that the tag is required
      • ex: "girl:+3 dragon"
        • Images not tagged 'girl' are removed from the results
        • The weight value is NOT ignored in the score calculation
  • Search result exporting feature
    • You can export the list of file paths hit by a search
    • Pressing the 'Export' button saves the list as a text file in the path where the Web UI was executed
    • The file name is the query text with a timestamp, and the contents are delimited by line breaks
      • Some viewer tools such as IrfanView can load image files when passed a text file containing a list of paths :)
      • IrfanView can also run slideshows. It's nice :)
    • On Windows, the character encoding is Shift_JIS (sjis). On other OSes, it is UTF-8
  • Character encoding of file paths
    • If a file path contains characters that can't be converted to Unicode or UTF-8, the scripts may output an error message when processing the file
    • This doesn't mean that your usage of the scripts is wrong, though these files are ignored or not displayed in the Web UI :|
      • This is a problem with the current implementation. It may occur when you use the scripts on Windows and the character encoding of directory/file names isn't UTF-8
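
The TAG:WEIGHT query format described in these tips can be parsed along the following lines. This is a sketch under stated assumptions, not the parser in webui.py: the `QueryTerm` class, the function names, and the default weight of 1 for tags without a weight are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class QueryTerm:
    tag: str
    weight: int = 1  # assumed default when no weight is given
    required: bool = False
    excluded: bool = False


def parse_query(query):
    """Parse a query such as "girl:+3 dragon boy:-3" into QueryTerm objects."""
    terms = []
    for token in query.split():
        tag, sep, raw = token.rpartition(":")
        if not sep:
            terms.append(QueryTerm(tag=token))
            continue
        required = raw.startswith("+")
        excluded = raw.startswith("-")
        digits = raw[1:] if (required or excluded) else raw
        # int() raises ValueError when the number is missing or not an integer,
        # which mirrors the rule that the weight cannot be omitted after a sign.
        terms.append(QueryTerm(tag, int(digits), required, excluded))
    return terms


def passes_filters(terms, image_tags):
    """Apply only the exclude/required rules; scoring is out of scope here."""
    for term in terms:
        if term.excluded and term.tag in image_tags:
            return False
        if term.required and term.tag not in image_tags:
            return False
    return True
```

For example, `parse_query("girl boy:-3")` marks 'boy' as excluded, so `passes_filters` rejects any image whose tag set contains 'boy'. The parsed number for an excluded tag is kept but never used.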

Information Related to Copyrights

For Busy People

  • Tagging using a Google Colab environment!
    • 1 Make preprocessed data with utility/make_tensor_files.py
    • 2 Zip the output directory
    • 3 Upload the zipped file to Google Drive
    • 4 Use a Google Colab environment like this
    • 5 Get tags-wd-tagger.txt and replace the file paths in it so that they match the actual paths of your image files :)
    • 6 Run genmodel.py!
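
The path replacement in step 5 could be done with a small script like the one below. It assumes that each line of tags-wd-tagger.txt begins with the file path and that the file is UTF-8 encoded. Neither is confirmed here, so check the actual file first. The function names and the example prefixes are hypothetical.

```python
def replace_path_prefix(lines, old_prefix, new_prefix):
    """Swap old_prefix for new_prefix on every line that starts with it."""
    return [
        new_prefix + line[len(old_prefix):] if line.startswith(old_prefix) else line
        for line in lines
    ]


def rewrite_tag_file(path, old_prefix, new_prefix, encoding="utf-8"):
    """Rewrite the tag file in place (encoding is an assumption; back up first)."""
    with open(path, encoding=encoding) as f:
        lines = f.read().splitlines()
    updated = replace_path_prefix(lines, old_prefix, new_prefix)
    with open(path, "w", encoding=encoding) as f:
        f.write("\n".join(updated) + "\n")
```

A call such as `rewrite_tag_file("tags-wd-tagger.txt", "/content/images", "D:/illust")` would then point the entries at the local image directory, with both prefixes being placeholders.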

TODO

  • Search on latent representations generated by a CLIP model
    • This method was already tried, but precision was not good because currently public CLIP models do not fit anime-style illustrations :|
      • If CLIP models fine-tuned with anime-style illustrations become available, this method will be better than the current one
  • Weight specification for tags, like the prompt format of Stable Diffusion Web UI
    • The current implementation uses all tags equally. But there are many cases where users want to emphasize specific tags and can't get appropriate results without that!
  • Fix a bug: some types of tags in tags-wd-tagger.txt can't be used in queries
  • Incremental index updating when image files are added
  • Similar image search by specifying an image file
    • This is practically realized by 'Character Image Feature Based Reranking Mode' :)
  • Exporting the list of found files
    • As a text file. Once you get the list, you can use many other tools and viewers you like :)
  • Making a binary package of this app that doesn't require building a Python environment

Screenshots of Demo

  • As an example search target, I used about 1000 image files collected from Irasutoya, which offers free image materials

    • Note: image materials from Irasutoya have restrictions on commercial use
  • Partial tagging result: ./tagging_example.txt

    • The generation script was executed on Windows
    • File paths in the linked file have been partially masked
  • Search "standing"

    • image
  • Search "standing animal"

    • image
  • Image info page

    • image
  • Slideshow feature

    • Auto slide at 5-second intervals (loop)
    • image