The Dockerfile at commit d0558ab fails to build because the yaml library is missing #955

Closed
cyicz123 opened this issue Nov 14, 2024 · 2 comments
Labels
bug Something isn't working

Comments

@cyicz123

Description of the bug

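Building with Docker as described in the documentation fails because the yaml library is missing. Modifying the Dockerfile to also install PyYAML resolves the problem.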

# Install PyYAML alongside modelscope
RUN /bin/bash -c "pip3 install modelscope PyYAML && \
    wget https://gitee.com/myhloli/MinerU/raw/master/scripts/download_models.py && \
    python3 download_models.py && \
    sed -i 's|cpu|cuda|g' /root/magic-pdf.json"

How to reproduce the bug

(base) ➜  MinerU docker build -t mineru:latest .
[+] Building 398.4s (10/10) FINISHED                                                                                                                                 docker:default
 => [internal] load build definition from Dockerfile                                                                                                                           2.3s
 => => transferring dockerfile: 2.11kB                                                                                                                                         0.0s
 => [internal] load metadata for docker.io/library/ubuntu:22.04                                                                                                               32.5s
 => [internal] load .dockerignore                                                                                                                                              0.3s
 => => transferring context: 2B                                                                                                                                                0.0s
 => [1/7] FROM docker.io/library/ubuntu:22.04@sha256:0e5e4a57c2499249aafc3b40fcd541e9a456aab7296681a3994d631587203f97                                                          0.0s
 => CACHED [2/7] RUN apt-get update &&     apt-get install -y         software-properties-common &&     add-apt-repository ppa:deadsnakes/ppa &&     apt-get update &&     ap  0.0s
 => CACHED [3/7] RUN update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.10 1                                                                              0.0s
 => CACHED [4/7] RUN python3 -m venv /opt/mineru_venv                                                                                                                          0.0s
 => CACHED [5/7] RUN /bin/bash -c "source /opt/mineru_venv/bin/activate &&     pip3 install --upgrade pip &&     wget https://gitee.com/myhloli/MinerU/raw/master/requirement  0.0s
 => CACHED [6/7] RUN /bin/bash -c "wget https://gitee.com/myhloli/MinerU/raw/master/magic-pdf.template.json &&     cp magic-pdf.template.json /root/magic-pdf.json &&     sou  0.0s
 => ERROR [7/7] RUN /bin/bash -c "pip3 install modelscope &&     wget https://gitee.com/myhloli/MinerU/raw/master/scripts/download_models.py &&     python3 download_models  361.5s
------
 > [7/7] RUN /bin/bash -c "pip3 install modelscope &&     wget https://gitee.com/myhloli/MinerU/raw/master/scripts/download_models.py &&     python3 download_models.py &&     sed -i 's|cpu|cuda|g' /root/magic-pdf.json":
33.05 WARNING: Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ReadTimeoutError("HTTPSConnectionPool(host='pypi.org', port=443): Read timed out. (read timeout=15)")': /simple/modelscope/
35.60 Collecting modelscope
37.59   Downloading modelscope-1.20.0-py3-none-any.whl (5.8 MB)
311.2      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 5.8/5.8 MB 15.4 kB/s eta 0:00:00
313.0 Collecting urllib3>=1.26
313.1   Downloading urllib3-2.2.3-py3-none-any.whl (126 kB)
320.4      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 126.3/126.3 KB 19.3 kB/s eta 0:00:00
321.9 Collecting tqdm>=4.64.0
322.0   Downloading tqdm-4.67.0-py3-none-any.whl (78 kB)
326.3      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 78.6/78.6 KB 17.7 kB/s eta 0:00:00
327.4 Collecting requests>=2.25
327.5   Downloading requests-2.32.3-py3-none-any.whl (64 kB)
330.5      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 64.9/64.9 KB 24.8 kB/s eta 0:00:00
331.3 Collecting certifi>=2017.4.17
331.4   Downloading certifi-2024.8.30-py3-none-any.whl (167 kB)
339.0      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 167.3/167.3 KB 22.4 kB/s eta 0:00:00
339.3 Collecting idna<4,>=2.5
339.4   Downloading idna-3.10-py3-none-any.whl (70 kB)
344.0      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 70.4/70.4 KB 13.9 kB/s eta 0:00:00
347.3 Collecting charset-normalizer<4,>=2
347.4   Downloading charset_normalizer-3.4.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (144 kB)
353.2      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 144.8/144.8 KB 25.3 kB/s eta 0:00:00
353.6 Installing collected packages: urllib3, tqdm, idna, charset-normalizer, certifi, requests, modelscope
359.3 Successfully installed certifi-2024.8.30 charset-normalizer-3.4.0 idna-3.10 modelscope-1.20.0 requests-2.32.3 tqdm-4.67.0 urllib3-2.2.3
359.3 WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
359.4 --2024-11-14 04:53:21--  https://gitee.com/myhloli/MinerU/raw/master/scripts/download_models.py
359.4 Resolving gitee.com (gitee.com)... 180.76.198.225, 180.76.198.77
359.4 Connecting to gitee.com (gitee.com)|180.76.198.225|:443... connected.
359.5 HTTP request sent, awaiting response... 200 OK
359.7 Length: 1921 (1.9K) [text/plain]
359.7 Saving to: 'download_models.py'
359.7
359.7      0K .                                                     100%  164M=0s
359.7
359.7 2024-11-14 04:53:21 (164 MB/s) - 'download_models.py' saved [1921/1921]
359.7
359.9 Traceback (most recent call last):
359.9   File "/usr/local/lib/python3.10/dist-packages/modelscope/utils/import_utils.py", line 451, in _get_module
359.9     return importlib.import_module('.' + module_name, self.__name__)
359.9   File "/usr/lib/python3.10/importlib/__init__.py", line 126, in import_module
359.9     return _bootstrap._gcd_import(name[level:], package, level)
359.9   File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
359.9   File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
359.9   File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
359.9   File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
359.9   File "<frozen importlib._bootstrap_external>", line 883, in exec_module
359.9   File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
359.9   File "/usr/local/lib/python3.10/dist-packages/modelscope/fileio/io.py", line 8, in <module>
359.9     from .format import JsonHandler, YamlHandler
359.9   File "/usr/local/lib/python3.10/dist-packages/modelscope/fileio/format/__init__.py", line 5, in <module>
359.9     from .yaml import YamlHandler
359.9   File "/usr/local/lib/python3.10/dist-packages/modelscope/fileio/format/yaml.py", line 2, in <module>
359.9     import yaml
359.9 ModuleNotFoundError: No module named 'yaml'
359.9
359.9 The above exception was the direct cause of the following exception:
359.9
359.9 Traceback (most recent call last):
359.9   File "/usr/local/lib/python3.10/dist-packages/modelscope/utils/import_utils.py", line 451, in _get_module
359.9     return importlib.import_module('.' + module_name, self.__name__)
359.9   File "/usr/lib/python3.10/importlib/__init__.py", line 126, in import_module
359.9     return _bootstrap._gcd_import(name[level:], package, level)
359.9   File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
359.9   File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
359.9   File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
359.9   File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
359.9   File "<frozen importlib._bootstrap_external>", line 883, in exec_module
359.9   File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
359.9   File "/usr/local/lib/python3.10/dist-packages/modelscope/hub/snapshot_download.py", line 11, in <module>
359.9     from modelscope.hub.api import HubApi, ModelScopeConfig
359.9   File "/usr/local/lib/python3.10/dist-packages/modelscope/hub/api.py", line 26, in <module>
359.9     from modelscope.fileio import io
359.9   File "<frozen importlib._bootstrap>", line 1075, in _handle_fromlist
359.9   File "/usr/local/lib/python3.10/dist-packages/modelscope/utils/import_utils.py", line 432, in __getattr__
359.9     value = self._get_module(name)
359.9   File "/usr/local/lib/python3.10/dist-packages/modelscope/utils/import_utils.py", line 453, in _get_module
359.9     raise RuntimeError(
359.9 RuntimeError: Failed to import modelscope.fileio.io because of the following error (look up to see its traceback):
359.9 No module named 'yaml'
359.9
359.9 The above exception was the direct cause of the following exception:
359.9
359.9 Traceback (most recent call last):
359.9   File "//download_models.py", line 5, in <module>
359.9     from modelscope import snapshot_download
359.9   File "<frozen importlib._bootstrap>", line 1075, in _handle_fromlist
359.9   File "/usr/local/lib/python3.10/dist-packages/modelscope/utils/import_utils.py", line 434, in __getattr__
359.9     module = self._get_module(self._class_to_module[name])
359.9   File "/usr/local/lib/python3.10/dist-packages/modelscope/utils/import_utils.py", line 453, in _get_module
359.9     raise RuntimeError(
359.9 RuntimeError: Failed to import modelscope.hub.snapshot_download because of the following error (look up to see its traceback):
359.9 Failed to import modelscope.fileio.io because of the following error (look up to see its traceback):
359.9 No module named 'yaml'
------
Dockerfile:44
--------------------
  43 |     # Download models and update the configuration file
  44 | >>> RUN /bin/bash -c "pip3 install modelscope && \
  45 | >>>     wget https://gitee.com/myhloli/MinerU/raw/master/scripts/download_models.py && \
  46 | >>>     python3 download_models.py && \
  47 | >>>     sed -i 's|cpu|cuda|g' /root/magic-pdf.json"
  48 |
--------------------
ERROR: failed to solve: process "/bin/sh -c /bin/bash -c \"pip3 install modelscope &&     wget https://gitee.com/myhloli/MinerU/raw/master/scripts/download_models.py &&     python3 download_models.py &&     sed -i 's|cpu|cuda|g' /root/magic-pdf.json\"" did not complete successfully: exit code: 1

Operating system

Linux

Python version

3.10

Software version (magic-pdf --version)

0.9.x

Device mode

cuda

cyicz123 added the bug label Nov 14, 2024
@myhloli
Collaborator

myhloli commented Nov 14, 2024

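After retesting, I confirmed that this is caused by modelscope 1.20.0 adding an import of yaml without updating its requirements.txt. As a temporary workaround, pin modelscope to version 1.19.2 or install pyyaml yourself.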
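
A minimal sketch of a pre-flight check for this kind of failure, assuming it is run inside the image before download_models.py. The helper name missing_modules is hypothetical and is not part of MinerU or modelscope. It relies only on the standard library's importlib.util.find_spec to report which top-level modules cannot be located, so a missing yaml would show up as a short list rather than a chained traceback.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names in `names` that cannot be located.

    Hypothetical helper for illustration only. find_spec returns None when
    no installed module matches a top-level name, so nothing is imported
    or executed by this check.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


# PyYAML is imported under the module name "yaml".
# An empty list means the module was found; ["yaml"] means it is missing.
print(missing_modules(["yaml"]))
```

If the printed list contains yaml, either of the two workarounds mentioned above (installing pyyaml or pinning modelscope to 1.19.2) should apply.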

@myhloli
Collaborator

myhloli commented Nov 14, 2024
