hexcode of é î Latin-1 Supplement

Time: 2023-09-11 14:14:17

hexcode of é

https://www.codetable.net/hex/e9

Symbol Name: Latin Small Letter E With Acute
Html Entity: &eacute;
Hex Code: &#xE9;
Decimal Code: &#233;
Unicode Group: Latin-1 Supplement

 

http://www.unicode.org/charts/index.html   When searching on this page, enter 00E9; the letters must be uppercase. The result shows that the character belongs to Latin-1 Supplement.

Latin-1 Supplement  https://www.unicode.org/charts/PDF/U0080.pdf

 

 

Character encoding for French Accents

If intérêt shows up as intÃ©rÃªt, you likely (i.e. short of corruption due to double encoding) have UTF-8 encoded text being displayed as if it were ISO-8859-1.

Make sure the headers are correctly formed and present the content as being UTF-8 encoded.
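
The mismatch is easy to reproduce. The following is a minimal C# sketch, not part of the quoted answer; the names MojibakeDemo and ShowAsLatin1 are made up for illustration. It encodes a string as UTF-8 and then decodes the same bytes as ISO-8859-1, which is what a viewer does when it is told the wrong charset:

```csharp
using System;
using System.Text;

public static class MojibakeDemo
{
    // Encodes the text as UTF-8, then decodes those bytes as if they were ISO-8859-1.
    public static string ShowAsLatin1(string text)
    {
        byte[] utf8Bytes = Encoding.UTF8.GetBytes(text);
        return Encoding.GetEncoding("ISO-8859-1").GetString(utf8Bytes);
    }

    public static void Main()
    {
        Console.OutputEncoding = Encoding.UTF8;
        // \u00E9 is é and \u00EA is ê; escapes keep the sample independent of the source file encoding.
        Console.WriteLine(ShowAsLatin1("int\u00E9r\u00EAt")); // intÃ©rÃªt
    }
}
```

Each accented letter occupies two bytes in UTF-8, so it turns into two Latin-1 characters, while the plain ASCII letters are unaffected.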

 

 

Double encoded UTF-8 strings in C#

This article shows how to convert a string that has been double encoded using UTF-8.

For example, say you have the string MÃ¼ller instead of the string Müller.

How did it happen?

The letter ü is encoded in UTF-8 as 2 bytes: 195 and 188.

If those bytes are misread as single-byte characters and encoded again, byte 195 is read as Ã and becomes 195 and 131.

Byte 188 is read as ¼ and becomes 194 and 188.
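
The byte values above can be checked with a short C# sketch. It is not from the original article; DoubleEncodingBytes and DoubleEncode are illustrative names, and ISO-8859-1 is assumed to be the single-byte encoding that misread the bytes:

```csharp
using System;
using System.Text;

public static class DoubleEncodingBytes
{
    // Encodes as UTF-8, misreads the bytes as ISO-8859-1 characters, then encodes as UTF-8 again.
    public static byte[] DoubleEncode(string text)
    {
        byte[] once = Encoding.UTF8.GetBytes(text);
        string misread = Encoding.GetEncoding("ISO-8859-1").GetString(once);
        return Encoding.UTF8.GetBytes(misread);
    }

    public static void Main()
    {
        string u = "\u00FC"; // ü
        Console.WriteLine(string.Join(", ", Encoding.UTF8.GetBytes(u))); // 195, 188
        Console.WriteLine(string.Join(", ", DoubleEncode(u)));           // 195, 131, 194, 188
    }
}
```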

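Given a garbled string, the conversion steps are as follows:

1. Use UTF-8 to convert the string into a UTF-8 byte array.

2. Convert the UTF-8 byte array into an ISO-8859-1 byte array.

3. Use UTF-8 again to decode the ISO-8859-1 byte array into the correct string.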

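        // Requires: using System; using System.Text; using NUnit.Framework;
        [Test]
        public void Test20210409003()
        {
            string correctFormat = "125,chaînes"; // This is the correct format

            // Double-encoded input: the UTF-8 bytes of "î" (0xC3 0xAE) were misread as "Ã®".
            var utf8Str = "125,chaÃ®nes";
            Encoding iso = Encoding.GetEncoding("ISO-8859-1");

            Encoding utf8 = Encoding.UTF8;

            // Step 1: string -> UTF-8 bytes
            byte[] utfBytes = utf8.GetBytes(utf8Str);

            // Step 2: UTF-8 bytes -> ISO-8859-1 bytes (one byte per character)
            byte[] isoBytes = Encoding.Convert(utf8, iso, utfBytes);

            // Step 3: decode the ISO-8859-1 bytes as UTF-8
            var result = utf8.GetString(isoBytes);
            Console.WriteLine(result);
            Assert.That(result, Is.EqualTo(correctFormat));
        }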
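
Running this repair on a string that was never double encoded would damage it. The sketch below is one possible guard and is not from the original article: the names EncodingRepair and TryRepair are hypothetical, and it encodes the string straight to ISO-8859-1 instead of going through Encoding.Convert, which produces the same bytes. Both encodings are created in strict mode so that any character or byte that does not fit throws instead of being silently replaced, and in that case the input is returned unchanged.

```csharp
using System;
using System.Text;

public static class EncodingRepair
{
    // Returns the repaired string, or the input unchanged when it does not look double encoded.
    public static string TryRepair(string suspect)
    {
        // Strict ISO-8859-1: throws on characters above U+00FF.
        Encoding strictLatin1 = Encoding.GetEncoding(
            "ISO-8859-1", EncoderFallback.ExceptionFallback, DecoderFallback.ExceptionFallback);
        // Strict UTF-8: throws on invalid byte sequences.
        Encoding strictUtf8 = new UTF8Encoding(false, true);
        try
        {
            byte[] raw = strictLatin1.GetBytes(suspect);
            return strictUtf8.GetString(raw);
        }
        catch (EncoderFallbackException)
        {
            return suspect; // contains characters outside ISO-8859-1
        }
        catch (DecoderFallbackException)
        {
            return suspect; // the bytes are not valid UTF-8, so it was not double encoded this way
        }
    }

    public static void Main()
    {
        Console.OutputEncoding = Encoding.UTF8;
        Console.WriteLine(TryRepair("125,cha\u00C3\u00AEnes")); // 125,chaînes
        Console.WriteLine(TryRepair("125,cha\u00EEnes"));       // 125,chaînes (already correct, left alone)
    }
}
```

This guard only covers text that was misread as ISO-8859-1. If the misreading went through Windows-1252 instead, some characters (for example those in the 0x80 to 0x9F byte range) map differently and would need that code page in place of ISO-8859-1.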

 

 

C# Convert string from UTF-8 to ISO-8859-1 (Latin1) H

Use Encoding.Convert to adjust the byte array before attempting to decode it into your destination encoding.

Encoding iso = Encoding.GetEncoding("ISO-8859-1");
Encoding utf8 = Encoding.UTF8;
byte[] utfBytes = utf8.GetBytes(Message);
byte[] isoBytes = Encoding.Convert(utf8, iso, utfBytes);
string msg = iso.GetString(isoBytes);

 

 

 

Latin-1 Supplement (Unicode block)

The Latin-1 Supplement (also called C1 Controls and Latin-1 Supplement) is the second Unicode block in the Unicode standard. It encodes the upper range of ISO 8859-1: 80 (U+0080) - FF (U+00FF). Controls C1 (0080–009F) are not graphic. This block ranges from U+0080 to U+00FF, contains 128 characters and includes the C1 controls, Latin-1 punctuation and symbols, 30 pairs of majuscule and minuscule accented Latin characters and 2 mathematical operators.

The C1 controls and Latin-1 Supplement block has been included in its present form, with the same character repertoire since version 1.0 of the Unicode Standard.[3] Its block name in Unicode 1.0 was simply Latin1.
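
The block boundaries quoted above translate directly into a range check. This is a small C# sketch with made-up names (Latin1Supplement, Contains, IsC1Control), based only on the ranges given in the text:

```csharp
using System;

public static class Latin1Supplement
{
    public const char First = '\u0080';
    public const char Last = '\u00FF';

    // True when the character lies in the Latin-1 Supplement block (U+0080 to U+00FF).
    public static bool Contains(char c) => c >= First && c <= Last;

    // True for the non-graphic C1 controls (U+0080 to U+009F).
    public static bool IsC1Control(char c) => c >= '\u0080' && c <= '\u009F';

    public static void Main()
    {
        Console.WriteLine(Last - First + 1);      // 128
        Console.WriteLine(Contains('\u00E9'));    // True  (é)
        Console.WriteLine(Contains('e'));         // False (Basic Latin)
        Console.WriteLine(IsC1Control('\u0085')); // True
    }
}
```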

 

 

 

Character Encoding Issue UTF-8 and ISO-8859-1

The answer would be you have wrong data in the database. What probably happened is that you did a conversion ISO-8859-1 -> UTF-8 on data that's already in UTF-8. Therefore, doing a conversion UTF-8 -> ISO-8859-1 gives you the original UTF-8 data back.

Make sure you're not calling utf8_encode (which does an ISO-8859-1 -> UTF-8 conversion) on UTF-8 data!  This is a double-encoding problem: a string that was already encoded as UTF-8 went through another ISO-8859-1 to UTF-8 conversion.

Since every UTF-8 string is also a valid ISO-8859-1 string (well, not quite, but it's commonly extended so that that's the case), you have no errors on the ISO-8859-1 -> UTF-8 conversion over UTF-8 data.
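
The round trip described in this answer can be imitated in C#. The sketch below is an illustration under assumptions, not the PHP code the answer refers to: Encoding.Convert stands in for utf8_encode and its inverse, and the names DoubleEncodingRoundTrip, Latin1ToUtf8 and Utf8ToLatin1 are made up.

```csharp
using System;
using System.Linq;
using System.Text;

public static class DoubleEncodingRoundTrip
{
    static readonly Encoding Latin1 = Encoding.GetEncoding("ISO-8859-1");

    // ISO-8859-1 -> UTF-8: every input byte is treated as one Latin-1 character.
    public static byte[] Latin1ToUtf8(byte[] bytes) => Encoding.Convert(Latin1, Encoding.UTF8, bytes);

    // UTF-8 -> ISO-8859-1: the reverse conversion.
    public static byte[] Utf8ToLatin1(byte[] bytes) => Encoding.Convert(Encoding.UTF8, Latin1, bytes);

    public static void Main()
    {
        byte[] original = Encoding.UTF8.GetBytes("M\u00FCller"); // already UTF-8
        byte[] doubled = Latin1ToUtf8(original);                 // the mistaken second conversion
        byte[] restored = Utf8ToLatin1(doubled);                 // undoing it

        Console.WriteLine(original.Length);                  // 7
        Console.WriteLine(doubled.Length);                   // 9
        Console.WriteLine(restored.SequenceEqual(original)); // True
    }
}
```

No step throws or loses data, which is why the mistaken conversion goes unnoticed until the text is displayed.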

 

 

 î read with the wrong encoding

Code page  Name          Description
850        ibm850        OEM Multilingual Latin 1; Western European (DOS)
1252       windows-1252  ANSI Latin 1; Western European (Windows)
28591      iso-8859-1    ISO 8859-1 Latin 1; Western European (ISO)

        [Test]
        public void Test20210414001()
        {
            // On .NET Core / .NET 5+, code pages 850 and 1252 are only available after
            // Encoding.RegisterProvider(CodePagesEncodingProvider.Instance) has been called.
            Console.WriteLine(Encoding.Default.EncodingName);
            Console.WriteLine(Encoding.Default.CodePage);

            string str = "î";
            var array = Encoding.UTF8.GetBytes(str); // 0xC3 0xAE

            var encoding2 = Encoding.GetEncoding(850);
            var str2 = encoding2.GetString(array);
            Console.WriteLine(str2);

            var encoding3 = Encoding.GetEncoding(1252);
            var str3 = encoding3.GetString(array);
            Console.WriteLine(str3);

            var encoding4 = Encoding.GetEncoding(28591);
            var str4 = encoding4.GetString(array); // was encoding3 in the original, a copy-paste slip
            Console.WriteLine(str4);
        }

 

Code page 850 decodes the bytes as ├«
Code page 1252 decodes the bytes as Ã®
Code page 28591 decodes the bytes as Ã®

 

 Tests

 

http://string-functions.com/encodedecode.aspx

QQ农场 encoded with gb2312 and decoded with Windows-1252 gives QQÅ©³¡
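
The same result can be reproduced locally with a few lines of C#. This is a hedged sketch rather than the code from the linked test file: CrossDecodeDemo and Transcode are illustrative names, and it assumes .NET Core 3.0 or later (or .NET 5+), where gb2312 and windows-1252 become available only after CodePagesEncodingProvider is registered.

```csharp
using System;
using System.Text;

public static class CrossDecodeDemo
{
    // Encodes the text with one encoding and decodes the resulting bytes with another.
    public static string Transcode(string text, string encodeWith, string decodeWith)
    {
        // Assumption: running on .NET Core / .NET 5+, where legacy code pages need this provider.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        byte[] bytes = Encoding.GetEncoding(encodeWith).GetBytes(text);
        return Encoding.GetEncoding(decodeWith).GetString(bytes);
    }

    public static void Main()
    {
        Console.OutputEncoding = Encoding.UTF8;
        // \u519C\u573A is 农场; in gb2312 the two characters are the bytes C5 A9 and B3 A1.
        Console.WriteLine(Transcode("QQ\u519C\u573A", "gb2312", "windows-1252")); // QQÅ©³¡
    }
}
```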

https://github.com/ChuckTest/UnitTest/blob/master/UnitTest/EncodingTest.cs

string QQ农场 with 00936(gb2312)[Chinese Simplified (GB2312)], decode with 01252(Windows-1252)[Western European (Windows)] as: QQÅ©³¡
string QQ农场 with 00936(gb2312)[Chinese Simplified (GB2312)], decode with 01254(windows-1254)[Turkish (Windows)] as: QQÅ©³¡
string QQ农场 with 00936(gb2312)[Chinese Simplified (GB2312)], decode with 01258(windows-1258)[Vietnamese (Windows)] as: QQÅ©³¡
string QQ农场 with 00936(gb2312)[Chinese Simplified (GB2312)], decode with 28591(iso-8859-1)[Western European (ISO)] as: QQÅ©³¡
string QQ农场 with 00936(gb2312)[Chinese Simplified (GB2312)], decode with 28599(iso-8859-9)[Turkish (ISO)] as: QQÅ©³¡
string QQ农场 with 00936(gb2312)[Chinese Simplified (GB2312)], decode with 28605(iso-8859-15)[Latin 9 (ISO)] as: QQÅ©³¡
string QQ农场 with 00936(gb2312)[Chinese Simplified (GB2312)], decode with 65000(utf-7)[Unicode (UTF-7)] as: QQÅ©³¡
string QQ农场 with 10008(x-mac-chinesesimp)[Chinese Simplified (Mac)], decode with 01252(Windows-1252)[Western European (Windows)] as: QQÅ©³¡
string QQ农场 with 10008(x-mac-chinesesimp)[Chinese Simplified (Mac)], decode with 01254(windows-1254)[Turkish (Windows)] as: QQÅ©³¡
string QQ农场 with 10008(x-mac-chinesesimp)[Chinese Simplified (Mac)], decode with 01258(windows-1258)[Vietnamese (Windows)] as: QQÅ©³¡
string QQ农场 with 10008(x-mac-chinesesimp)[Chinese Simplified (Mac)], decode with 28591(iso-8859-1)[Western European (ISO)] as: QQÅ©³¡
string QQ农场 with 10008(x-mac-chinesesimp)[Chinese Simplified (Mac)], decode with 28599(iso-8859-9)[Turkish (ISO)] as: QQÅ©³¡
string QQ农场 with 10008(x-mac-chinesesimp)[Chinese Simplified (Mac)], decode with 28605(iso-8859-15)[Latin 9 (ISO)] as: QQÅ©³¡
string QQ农场 with 10008(x-mac-chinesesimp)[Chinese Simplified (Mac)], decode with 65000(utf-7)[Unicode (UTF-7)] as: QQÅ©³¡
string QQ农场 with 20936(x-cp20936)[Chinese Simplified (GB2312-80)], decode with 01252(Windows-1252)[Western European (Windows)] as: QQÅ©³¡
string QQ农场 with 20936(x-cp20936)[Chinese Simplified (GB2312-80)], decode with 01254(windows-1254)[Turkish (Windows)] as: QQÅ©³¡
string QQ农场 with 20936(x-cp20936)[Chinese Simplified (GB2312-80)], decode with 01258(windows-1258)[Vietnamese (Windows)] as: QQÅ©³¡
string QQ农场 with 20936(x-cp20936)[Chinese Simplified (GB2312-80)], decode with 28591(iso-8859-1)[Western European (ISO)] as: QQÅ©³¡
string QQ农场 with 20936(x-cp20936)[Chinese Simplified (GB2312-80)], decode with 28599(iso-8859-9)[Turkish (ISO)] as: QQÅ©³¡
string QQ农场 with 20936(x-cp20936)[Chinese Simplified (GB2312-80)], decode with 28605(iso-8859-15)[Latin 9 (ISO)] as: QQÅ©³¡
string QQ农场 with 20936(x-cp20936)[Chinese Simplified (GB2312-80)], decode with 65000(utf-7)[Unicode (UTF-7)] as: QQÅ©³¡
string QQ农场 with 50227(x-cp50227)[Chinese Simplified (ISO-2022)], decode with 01252(Windows-1252)[Western European (Windows)] as: QQÅ©³¡
string QQ农场 with 50227(x-cp50227)[Chinese Simplified (ISO-2022)], decode with 01254(windows-1254)[Turkish (Windows)] as: QQÅ©³¡
string QQ农场 with 50227(x-cp50227)[Chinese Simplified (ISO-2022)], decode with 01258(windows-1258)[Vietnamese (Windows)] as: QQÅ©³¡
string QQ农场 with 50227(x-cp50227)[Chinese Simplified (ISO-2022)], decode with 28591(iso-8859-1)[Western European (ISO)] as: QQÅ©³¡
string QQ农场 with 50227(x-cp50227)[Chinese Simplified (ISO-2022)], decode with 28599(iso-8859-9)[Turkish (ISO)] as: QQÅ©³¡
string QQ农场 with 50227(x-cp50227)[Chinese Simplified (ISO-2022)], decode with 28605(iso-8859-15)[Latin 9 (ISO)] as: QQÅ©³¡
string QQ农场 with 50227(x-cp50227)[Chinese Simplified (ISO-2022)], decode with 65000(utf-7)[Unicode (UTF-7)] as: QQÅ©³¡
string QQ农场 with 51936(EUC-CN)[Chinese Simplified (EUC)], decode with 01252(Windows-1252)[Western European (Windows)] as: QQÅ©³¡
string QQ农场 with 51936(EUC-CN)[Chinese Simplified (EUC)], decode with 01254(windows-1254)[Turkish (Windows)] as: QQÅ©³¡
string QQ农场 with 51936(EUC-CN)[Chinese Simplified (EUC)], decode with 01258(windows-1258)[Vietnamese (Windows)] as: QQÅ©³¡
string QQ农场 with 51936(EUC-CN)[Chinese Simplified (EUC)], decode with 28591(iso-8859-1)[Western European (ISO)] as: QQÅ©³¡
string QQ农场 with 51936(EUC-CN)[Chinese Simplified (EUC)], decode with 28599(iso-8859-9)[Turkish (ISO)] as: QQÅ©³¡
string QQ农场 with 51936(EUC-CN)[Chinese Simplified (EUC)], decode with 28605(iso-8859-15)[Latin 9 (ISO)] as: QQÅ©³¡
string QQ农场 with 51936(EUC-CN)[Chinese Simplified (EUC)], decode with 65000(utf-7)[Unicode (UTF-7)] as: QQÅ©³¡
string QQ农场 with 54936(GB18030)[Chinese Simplified (GB18030)], decode with 01252(Windows-1252)[Western European (Windows)] as: QQÅ©³¡
string QQ农场 with 54936(GB18030)[Chinese Simplified (GB18030)], decode with 01254(windows-1254)[Turkish (Windows)] as: QQÅ©³¡
string QQ农场 with 54936(GB18030)[Chinese Simplified (GB18030)], decode with 01258(windows-1258)[Vietnamese (Windows)] as: QQÅ©³¡
string QQ农场 with 54936(GB18030)[Chinese Simplified (GB18030)], decode with 28591(iso-8859-1)[Western European (ISO)] as: QQÅ©³¡
string QQ农场 with 54936(GB18030)[Chinese Simplified (GB18030)], decode with 28599(iso-8859-9)[Turkish (ISO)] as: QQÅ©³¡
string QQ农场 with 54936(GB18030)[Chinese Simplified (GB18030)], decode with 28605(iso-8859-15)[Latin 9 (ISO)] as: QQÅ©³¡
string QQ农场 with 54936(GB18030)[Chinese Simplified (GB18030)], decode with 65000(utf-7)[Unicode (UTF-7)] as: QQÅ©³¡