https://www.ruanyifeng.com/blog/2007/10/ascii_unicode_and_utf-8.html
An introductory article by Ruan Yifeng (阮一峰), clear and easy to follow
https://en.wikipedia.org/wiki/UTF-8
https://home.unicode.org/
https://www.ietf.org/rfc/rfc3629.txt
Unicode code point of 曹: 26361 (U+66F9)
UTF-8 encoding: 11100110 10011011 10111001 (hex E6 9B B9)
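A quick way to check the numbers above, using only Python built-ins (`ord` and `str.encode`). The variable names are just for illustration.

```python
ch = "曹"

# ord() returns the Unicode code point as an integer.
code_point = ord(ch)

# str.encode("utf-8") returns the UTF-8 bytes for the character.
encoded = ch.encode("utf-8")

# Render each byte as 8 binary digits so it can be compared with the note.
bits = " ".join(f"{b:08b}" for b in encoded)

print(code_point, f"U+{code_point:04X}")  # 26361 U+66F9
print(bits)                               # 11100110 10011011 10111001
```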
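- ASCII characters take 1 byte, and their UTF-8 encoding is identical to their ASCII encoding
- In a multi-byte sequence, the number of leading 1 bits in the first byte gives the total number of bytes; every following byte starts with 10
- 1110xxxx 10xxxxxx 10xxxxxx: the x positions are the payload bits (this is the 3-byte form)
- UTF-8 is the most common variable-length encoding; each character takes 1 to 4 bytes
- Unicode is the character encoding standard; UTF-8 is one way of encoding Unicode code points as bytes for storage; other encoding forms include UTF-16 and UTF-32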
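The rules above can be sketched as a small hand-written encoder. This is only an illustration of the bit layout: `utf8_encode` is a hypothetical helper name, and in practice `str.encode("utf-8")` does the job. The range boundaries (0x80, 0x800, 0x10000, 0x10FFFF) and the surrogate exclusion follow RFC 3629.

```python
def utf8_encode(cp: int) -> bytes:
    """Encode one Unicode code point to UTF-8 by hand (illustrative helper)."""
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError(f"not an encodable code point: {cp!r}")
    if cp < 0x80:
        # 1 byte: 0xxxxxxx, same as ASCII
        return bytes([cp])
    if cp < 0x800:
        # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:
        # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([
            0xE0 | (cp >> 12),
            0x80 | ((cp >> 6) & 0x3F),
            0x80 | (cp & 0x3F),
        ])
    # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([
        0xF0 | (cp >> 18),
        0x80 | ((cp >> 12) & 0x3F),
        0x80 | ((cp >> 6) & 0x3F),
        0x80 | (cp & 0x3F),
    ])


# 曹 (26361) falls in the 3-byte range.
print(utf8_encode(26361).hex(" "))  # e6 9b b9

# The same character in the other encoding forms, for comparison.
print(len("曹".encode("utf-16-be")), len("曹".encode("utf-32-be")))  # 2 4
```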
The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)