
> The Unicode input alphabet is up to 4096 symbols.

Huh?

How did you come up with this number?



Ah, I apologize for the misdirection. In general it could be much bigger, of course. Wikipedia [1] says UTF-8 can encode up to 1,112,064 code points. Either way, it is a much bigger "alphabet" than the usual set of lexical tokens :)

[1] https://en.wikipedia.org/wiki/UTF-8
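
For anyone wondering where 1,112,064 comes from, here is a rough sketch of the arithmetic. It assumes the standard Unicode code space (U+0000 through U+10FFFF) minus the surrogate range (U+D800 through U+DFFF), which is reserved for UTF-16 and is not valid in UTF-8. The constant names are just mine, for illustration.

```python
# Whole Unicode code space: U+0000 through U+10FFFF inclusive.
CODE_SPACE = 0x10FFFF + 1

# Surrogate code points U+D800..U+DFFF are reserved for UTF-16
# and cannot be encoded in well-formed UTF-8.
SURROGATES = 0xDFFF - 0xD800 + 1

# What remains is the number of code points UTF-8 can encode.
ENCODABLE = CODE_SPACE - SURROGATES

print(f"{CODE_SPACE:,} - {SURROGATES:,} = {ENCODABLE:,}")
```

So the count is the size of the code space, not the number of characters actually assigned, which is considerably smaller.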



