All data is stored (in RAM, on disk, ...) as bits (0 and 1), so every character has to be mapped to a number.
ASCII maps basic Western characters to numbers between 0 and 127.
Each ASCII character fits in one byte: 5 characters -> 5 bytes
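A minimal sketch of the ASCII mapping, using Python's built-in `str.encode`. The sample word `hello` is an assumption chosen for illustration, not something from the notes above.

```python
# Encode a 5-character ASCII string and inspect the resulting bytes.
text = "hello"  # hypothetical sample string
encoded = text.encode("ascii")

# Each character maps to one number in the 0..127 range, stored in one byte.
print(list(encoded))            # [104, 101, 108, 108, 111]
print(len(text), len(encoded))  # 5 5
```

Trying the same call on a non-ASCII character such as `华` raises `UnicodeEncodeError`, which is the gap Unicode fills.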
Unicode was created to handle the vast number of languages and the complex cases ASCII cannot represent: accents, emojis, modifiers, and other special characters.
Grapheme: a single unit of a human writing system (d or 华 or 🙀 ...).
Code point: a number that Unicode assigns to a character or modifier. One or more code points combine to form a grapheme.
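The relationship between graphemes and code points can be sketched in Python, where `len()` on a `str` counts code points rather than graphemes. The accented letter `é` is an assumed example, picked because it can be written either as one code point or as two.

```python
import unicodedata

# The graphemes from the notes, each a single code point.
for ch in ["d", "华", "🙀"]:
    print(ch, f"U+{ord(ch):04X}")

# One grapheme, two possible spellings:
single = "\u00e9"     # é as one precomposed code point
combined = "e\u0301"  # e followed by COMBINING ACUTE ACCENT

print(len(single), len(combined))  # 1 2
print(single == combined)          # False

# NFC normalization composes the pair into the single code point.
print(unicodedata.normalize("NFC", combined) == single)  # True
```

Both spellings render as the same grapheme, which is why comparing Unicode text usually calls for normalization first.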
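Different encodings turn code points into bytes in different ways:
UTF-32: each code point is encoded as exactly 4 bytes:
pros: every code point has the same size -> easier to search and index
cons: wastes space on simple characters (an ASCII letter takes 4 bytes instead of 1)
UTF-8: each code point is encoded as 1 to 4 bytes:
pros: backward compatible with ASCII, saves space
cons: harder to index because code points have different byte lengths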
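The trade-off above can be checked with a short sketch that encodes the three sample graphemes from these notes in both encodings. The `utf-32-be` codec is used here as an assumption so that no byte order mark is added to the output.

```python
# Compare byte lengths per character in UTF-8 and UTF-32.
sizes = {}
for ch in ["d", "华", "🙀"]:
    utf8 = ch.encode("utf-8")
    utf32 = ch.encode("utf-32-be")  # big-endian, no BOM
    sizes[ch] = (len(utf8), len(utf32))
    print(ch, len(utf8), len(utf32))
# d 1 4
# 华 3 4
# 🙀 4 4

# Fixed width makes indexing trivial in UTF-32:
# code point i always lives at bytes [4*i : 4*i + 4].
text = "d华🙀"
raw32 = text.encode("utf-32-be")
second = raw32[4:8].decode("utf-32-be")
print(second)  # 华

# In UTF-8 the same text is smaller, but offsets depend on earlier characters.
raw8 = text.encode("utf-8")
print(len(raw8), len(raw32))  # 8 12
```

Finding the second character in the UTF-8 bytes requires scanning from the start, since the first character could occupy anywhere from 1 to 4 bytes.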