Fix issue #318: test failures when cchardet is installed
feedparser imports cchardet or chardet depending on what's installed:
https://github.com/kurtmckee/feedparser/blob/11990ea1d8791acc76c67781f1d2011daf0c3a99/feedparser/encodings.py#L37-L40

Although these libraries are mostly equivalent, they return slightly different
encoding strings, even though both are correct and lead to successful decoding.
This change allows the tests to be run with either library by accepting both
encoding names as correct.
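
As a rough illustration of why both names in each pair can be treated as correct, the sketch below uses only Python's standard `codecs` module. The helper names `canonical` and `decodes_identically` are hypothetical and are not part of feedparser or its test harness. It assumes that "equivalent" means either the two labels resolve to the same Python codec, or one codec is a superset of the other for the sample text.

```python
import codecs


def canonical(name):
    """Return Python's canonical codec name for an encoding label."""
    return codecs.lookup(name).name


def decodes_identically(text, narrow, wide):
    """Check that bytes produced by the `narrow` codec decode to the
    same text under the `wide` codec (hypothetical helper)."""
    return text.encode(narrow).decode(wide) == text


if __name__ == "__main__":
    # Pairs copied from the old and new Expect lines in this commit.
    for old, new in [
        ("Big5", "BIG5"),
        ("EUC-KR", "UHC"),
        ("GB2312", "GB18030"),
        ("windows-1255", "WINDOWS-1255"),
    ]:
        print(old, "->", canonical(old), "|", new, "->", canonical(new))

    # UHC (cp949) and GB18030 extend EUC-KR and GB2312 respectively,
    # so text encoded with the narrower codec still decodes correctly.
    print(decodes_identically("한글", "EUC-KR", "UHC"))
    print(decodes_identically("中文", "GB2312", "GB18030"))
```

Two of the pairs differ only in letter case and resolve to the same codec. The other two name a wider encoding that still decodes the narrower one's bytes, which is consistent with both detection results leading to successful decoding.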
maksverver committed Aug 29, 2024
1 parent d27a9d9 commit cfd3618
Showing 4 changed files with 6 additions and 6 deletions.
4 changes: 2 additions & 2 deletions tests/illformed/chardet/big5.xml
@@ -1,8 +1,8 @@
<!--
SkipUnless: __import__('chardet')
Description: Big5 with no encoding information
-Expect: bozo and encoding == 'Big5'
+Expect: bozo and encoding in ('Big5', 'BIG5')
-->
<feed xmlns="http://www.w3.org/2005/Atom">
<title>我希望??很容易?其翻?成中文,并有助于改??件。 感?您??本文。</title>
-</feed>
+</feed>
4 changes: 2 additions & 2 deletions tests/illformed/chardet/euckr.xml
@@ -1,7 +1,7 @@
<!--
SkipUnless: __import__('chardet')
Description: EUC-KR with no encoding information
-Expect: bozo and encoding == 'EUC-KR'
+Expect: bozo and encoding in ('EUC-KR', 'UHC')
-->
<rss>
<channel>
@@ -10,4 +10,4 @@ Expect: bozo and encoding == 'EUC-KR'
<description>TypeKey 시스템이 UTF-8로 돌아가는데, 거기서 한글로 된 닉네임을 정할 경우에, EUC-KR로 된 무버블타입 블록에선 리다이렉트되어 전송되어오는 닉네임이 UTF라 당연히 깨어져 나타난다. 실제 블록 등에서 사용하는 필명 내지는 닉네임은 한글로 사용하는 많은 분들도 타입키에서의 닉네임은 이런 문제때문에 울며겨자먹기로 영어로 짓고 있다....</description>
</item>
</channel>
-</rss>
+</rss>
2 changes: 1 addition & 1 deletion tests/illformed/chardet/gb2312.xml
@@ -1,7 +1,7 @@
<!--
SkipUnless: __import__('chardet')
Description: GB2312 with no encoding information
-Expect: bozo and encoding == 'GB2312'
+Expect: bozo and encoding in ('GB2312', 'GB18030')
-->
<rss>
<channel>
2 changes: 1 addition & 1 deletion tests/illformed/chardet/windows1255.xml
@@ -1,7 +1,7 @@
<!--
SkipUnless: __import__('chardet')
Description: windows-1255 with no encoding information
-Expect: bozo and encoding == 'windows-1255'
+Expect: bozo and encoding in ('windows-1255', 'WINDOWS-1255')
-->
<rss>
<channel>