UTF-8 has no byte order (endianness) #1

Open
lilydjwg opened this issue Apr 12, 2016 · 1 comment

@lilydjwg

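The Unicode encodings that do depend on byte order are UTF-16, UCS-2, UTF-32, and UCS-4, so the description isn't quite right. It looks like your program is actually using UTF-16LE?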
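To make the byte-order point concrete, here is a minimal sketch in Python. The sample character (U+4E2D) is my own choice for illustration and does not come from the project. It encodes the same code point under several schemes: UTF-8 yields a single byte sequence, while UTF-16 and UTF-32 each have a little-endian and a big-endian form.

```python
# Encode one code point (U+4E2D, chosen only for illustration) in several schemes.
text = "\u4e2d"

utf8 = text.encode("utf-8")          # no LE/BE variants exist for UTF-8
utf16_le = text.encode("utf-16-le")  # low byte of each 16-bit unit first
utf16_be = text.encode("utf-16-be")  # high byte of each 16-bit unit first
utf32_le = text.encode("utf-32-le")
utf32_be = text.encode("utf-32-be")

print(utf8.hex())      # e4b8ad
print(utf16_le.hex())  # 2d4e
print(utf16_be.hex())  # 4e2d
print(utf32_le.hex())  # 2d4e0000
print(utf32_be.hex())  # 00004e2d
```

If the bytes the program writes match the `utf-16-le` line, that would support the guess that it is really producing UTF-16LE.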

@lilydjwg
Author

Also, section 6.1 of the README has quite a few problems.

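「序号」 ("ordinal number")? If you mean "code point", I'd suggest translating it as 「码点」. UTF-16 is also variable-length: each code point takes two or four bytes (four when a surrogate pair is needed).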
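A rough sketch of why UTF-16 is variable-length, again in Python. The helper name `utf16_code_units` is hypothetical and only for illustration. Code points below U+10000 fit in one 16-bit code unit, while higher ones are split into a high and a low surrogate.

```python
def utf16_code_units(ch: str) -> list:
    """Return the UTF-16 code units for one code point (hypothetical helper).

    Assumes `ch` is a single character that is not itself a surrogate.
    """
    cp = ord(ch)
    if cp < 0x10000:
        return [cp]                  # one 16-bit unit: two bytes
    cp -= 0x10000                    # 20 bits remain
    high = 0xD800 + (cp >> 10)       # top 10 bits go into the high surrogate
    low = 0xDC00 + (cp & 0x3FF)      # bottom 10 bits go into the low surrogate
    return [high, low]               # two 16-bit units: four bytes

bmp = "\u4e2d"         # U+4E2D, inside the Basic Multilingual Plane
astral = "\U0001F600"  # U+1F600, outside it

print([hex(u) for u in utf16_code_units(bmp)])     # ['0x4e2d']
print([hex(u) for u in utf16_code_units(astral)])  # ['0xd83d', '0xde00']
print(len(bmp.encode("utf-16-be")), len(astral.encode("utf-16-be")))  # 2 4
```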

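I can't make sense of the sentence 「unicode、UCS-2、UTF-16,三者在数值上相同」 ("Unicode, UCS-2, and UTF-16 are numerically identical"). Unicode by itself cannot be stored or transmitted directly, whereas the latter two can be once a byte order has been specified. None of them has a notion of a "numeric value".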

At present UTF-8 uses at most four bytes per code point, because Unicode code points only go up to U+10FFFF.
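The four-byte limit can be checked with a short Python sketch. The sample code points are my own picks: each is the last code point of a UTF-8 length range, and U+10FFFF is the last code point overall.

```python
# Last code point that fits in 1, 2, 3, and 4 UTF-8 bytes respectively.
samples = {
    "U+007F": "\u007f",
    "U+07FF": "\u07ff",
    "U+FFFF": "\uffff",
    "U+10FFFF": "\U0010ffff",
}

lengths = {name: len(ch.encode("utf-8")) for name, ch in samples.items()}
print(lengths)  # {'U+007F': 1, 'U+07FF': 2, 'U+FFFF': 3, 'U+10FFFF': 4}

# Python refuses to create a code point beyond U+10FFFF.
try:
    chr(0x110000)
    beyond_limit_ok = True
except ValueError:
    beyond_limit_ok = False
print(beyond_limit_ok)  # False
```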
