Update cli.py: encoding='utf-8' #696

Open: BaseMax wants to merge 1 commit into adrienverge:master from MaxFork:supports-utf8

@BaseMax BaseMax commented Nov 3, 2024

The issue occurred in our project at SalamLang/Salam#265, in the pre-commit hook that lints YAML files.

UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 445: character maps to <undefined>
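
A minimal sketch of how an error of this kind can arise, assuming the file holds UTF-8 encoded Persian text and Python falls back to a Windows code page such as cp1252 (the sample string and the `try_decode` helper are illustrative assumptions, not taken from the affected repository):

```python
# Persian sample text written with escapes; "\u0641" is the letter feh.
# The string is an assumption, not content from the affected YAML files.
text = "\u0641\u0627\u0631\u0633\u06cc"
data = text.encode("utf-8")

# U+0641 encodes to the two bytes D9 81 in UTF-8, so 0x81 is present.
assert b"\x81" in data


def try_decode(raw, encoding):
    """Return the decoded text, or the error message if decoding fails."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError as exc:
        return f"UnicodeDecodeError: {exc}"


# cp1252 has no character assigned to byte 0x81, so decoding fails.
cp1252_result = try_decode(data, "cp1252")
# The same bytes decode cleanly when the correct encoding is named.
utf8_result = try_decode(data, "utf-8")

print(cp1252_result)
print(utf8_result)
```

The byte position differs from the one in the report because the sample is much shorter than the real file.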

@coveralls

Coverage Status

coverage: 99.825%. remained the same
when pulling 6ca01fe on MaxFork:supports-utf8
into 95e17b3 on adrienverge:master.

@BaseMax
Author

BaseMax commented Nov 3, 2024

cc @adrienverge @jbampton

@adrienverge
Owner

Hello and thanks for the proposal. Could you check out other pull requests related to character encoding? How does this one differ from them?

@BaseMax
Author

BaseMax commented Nov 3, 2024

Hi @adrienverge, happy connecting.

There are 3 merge requests in total related to encoding:
1- #630
2- #240
3- #696 (CURRENT MERGE REQUEST)

The https://github.com/adrienverge/yamllint/pull/630/files#diff-2e0288fc9fc3cda09f90a25f76bedb9ce0cea019d01147b436e575c71a3e674eR222 merge request looks fine, but it doesn't include the change I applied.

My problem is that I have Persian UTF-8 text in my YAML files, and the error came from the 'cli.py' file.

For my issue, https://github.com/adrienverge/yamllint/pull/240/files looks like a good patch, since it automatically detects the encoding and then uses it when reading the file. However, I can see your comments there, and it seems you are not happy to add new dependencies. Quote: "I'm very against adding dependencies (like chardet)."
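
For illustration, encoding detection does not strictly need a third-party package. The sketch below is an assumption-laden example that uses only the standard library: it checks for a byte order mark and otherwise falls back to UTF-8. The names `guess_encoding` and `decode_yaml_bytes` are hypothetical, and the code is not taken from #240, #630, or yamllint itself.

```python
import codecs

# Order matters: the UTF-32 LE BOM begins with the UTF-16 LE BOM,
# so the UTF-32 marks must be tested first.
_BOMS = (
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_BE, "utf-16"),
    (codecs.BOM_UTF16_LE, "utf-16"),
)


def guess_encoding(raw):
    """Hypothetical helper: pick a codec from a leading BOM, else UTF-8."""
    for bom, name in _BOMS:
        if raw.startswith(bom):
            return name
    return "utf-8"


def decode_yaml_bytes(raw):
    """Hypothetical helper: decode bytes; the chosen codecs strip the BOM."""
    return raw.decode(guess_encoding(raw))


print(guess_encoding("key: value\n".encode("utf-8")))
print(decode_yaml_bytes(codecs.BOM_UTF8 + "key: \u0641\n".encode("utf-8")))
```

A sketch like this does not handle files without a BOM in encodings other than UTF-8, which is the case a library such as chardet tries to cover.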

@adrienverge
Owner

Hello Max, thanks. It looks like #630 solves the same problem but is more complete and future-proof. Also, your PR doesn't fix encoding problems for other files such as configuration. What do you think?

> My problem is that I have Persian UTF-8 text in my YAML files, and the error came from the 'cli.py' file.

In the meantime, a solution is to tell Python to read files as UTF-8 by default:

export PYTHONUTF8=1
yamllint your-file.yaml
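
For comparison, the kind of change named in this PR's title, passing `encoding='utf-8'` explicitly when opening a file, can be sketched as follows. This is a minimal illustration and not yamllint's actual code: `read_text_utf8` and the sample file are hypothetical. `PYTHONUTF8=1` enables Python's UTF-8 Mode (Python 3.7+), which changes the default for `open()` process-wide, whereas an explicit argument only affects the call that uses it.

```python
import os
import sys
import tempfile


def read_text_utf8(path):
    """Hypothetical helper: read a file as UTF-8 regardless of the locale."""
    with open(path, encoding="utf-8", newline="") as f:
        return f.read()


# Sample YAML with Persian text (an assumption for demonstration only).
sample = "greeting: \u0641\u0627\u0631\u0633\u06cc\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.yaml")
    # Write raw UTF-8 bytes so the locale plays no part in creating the file.
    with open(path, "wb") as f:
        f.write(sample.encode("utf-8"))
    content = read_text_utf8(path)

# 1 when UTF-8 Mode is enabled (for example via PYTHONUTF8=1), otherwise 0.
print(sys.flags.utf8_mode)
```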

@BaseMax
Author

BaseMax commented Nov 5, 2024

Thank you @adrienverge, I added the PYTHONUTF8 variable to our pre-commit env config: SalamLang/Salam@db7e870

@jbampton and I will do more testing.
