Decoding \x-escaped data into Chinese text

In Python 2, calling decode('utf-8') on the byte string is all it takes:

>>> "\xE5\x85\x84\xE5\xBC\x9F\xE9\x9A\xBE\xE5\xBD\x93 \xE6\x9D\x9C\xE6\xAD\x8C".decode('utf-8')
u'\u5144\u5f1f\u96be\u5f53 \u675c\u6b4c'
>>> print "\xE5\x85\x84\xE5\xBC\x9F\xE9\x9A\xBE\xE5\xBD\x93 \xE6\x9D\x9C\xE6\xAD\x8C".decode('utf-8')
兄弟难当 杜歌
>>>
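
The transcript above is Python 2, where a quoted literal is already a byte string. As a rough sketch for Python 3: the same call works on a bytes literal, and when the escapes arrive as literal text (a backslash followed by x, as often seen in log lines), one possible approach is to turn each run of escapes into bytes first. The helper name decode_x_escapes and the sample request line are made up for illustration.

```python
import re

# Matches a run of literal \xHH escapes, e.g. the 12 characters \xE5\x85\x84
_ESCAPE_RUN = re.compile(r'(?:\\x[0-9A-Fa-f]{2})+')


def decode_x_escapes(text):
    """Replace each run of literal \\xHH escapes in text with the UTF-8 text it encodes."""
    def _decode(match):
        hex_digits = match.group(0).replace('\\x', '')
        # errors='replace' keeps malformed byte runs from raising an exception
        return bytes.fromhex(hex_digits).decode('utf-8', errors='replace')
    return _ESCAPE_RUN.sub(_decode, text)


# Bytes literal: the Python 3 counterpart of the Python 2 call above
print(b"\xE5\x85\x84\xE5\xBC\x9F".decode('utf-8'))

# Text that contains literal backslash-x escapes (hypothetical log fragment)
print(decode_x_escapes(r"GET /\xE6\x9D\x9C\xE6\xAD\x8C HTTP/1.1"))
```

Text outside the escape runs is passed through unchanged, so the helper can be applied to a whole log line.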

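In Java I did not find a function that decodes this directly, but once you understand how the data is encoded, decoding it yourself is quick. Recommended reading: http://www.ruanyifeng.com/blog/2007/10/ascii_unicode_and_utf-8.html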

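UTF-8 is one concrete way of storing Unicode: Unicode assigns each character a code point, and UTF-8 defines how that code point is laid out as one to four bytes.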
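
To make the distinction between a code point and its UTF-8 bytes concrete, here is a small Python 3 sketch using the first character from the example above. The helper utf8_length is a hypothetical name introduced for illustration; the thresholds are the standard UTF-8 ranges.

```python
def utf8_length(code_point):
    """Return how many bytes UTF-8 uses for the given code point."""
    if code_point < 0x80:
        return 1   # 0xxxxxxx
    if code_point < 0x800:
        return 2   # 110xxxxx 10xxxxxx
    if code_point < 0x10000:
        return 3   # 1110xxxx 10xxxxxx 10xxxxxx
    return 4       # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx


ch = "\u5144"  # the character 兄
code_point = ord(ch)
utf8_bytes = ch.encode("utf-8")

print(hex(code_point))                           # 0x5144
print(" ".join(f"{b:02x}" for b in utf8_bytes))  # e5 85 84
print(utf8_length(code_point))                   # 3
```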

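The \x escapes are the bytes of UTF-8 encoded data. Applying the UTF-8 rules converts them back into Unicode code points, which give the corresponding Chinese characters. The rule is simple: strip the \x prefix, parse each pair of hex digits as a number, then shift and combine the bits. Note that you must first determine, from the lead byte, how many bytes the UTF-8 sequence has: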

 val pattern = """(\d+\.\d+\.\d+\.\d+) \- (\S+) (\S+) \[([^\]]+)\] \"(\w+) (\S+) \S+\" (\S+) (\S+) \"([^\"]+)\" \"([^\"]+)\" \"([^\"]+)\" \"([^\"]+)""".r

  import scala.util.matching.Regex

  // One or more consecutive \xHH escapes (hex digits in either case)
  val decodeDataPattern = """(\\x[0-9A-Fa-f]{2})+""".r

  // Replace every run of \xHH escapes in the input with the text it encodes.
  // quoteReplacement keeps '$' and '\' in the decoded text from being read as group references.
  def decodeUtf8(utf8Str: String): String =
    decodeDataPattern.replaceAllIn(utf8Str, m => Regex.quoteReplacement(decodeXdata(m.matched)))

  // Decode one run of \xHH escapes by reassembling the UTF-8 bytes into code points.
  // Malformed byte sequences are not validated.
  def decodeXdata(utf8Str: String): String = {
    val result = new StringBuilder()
    var remaining = 0 // continuation bytes still expected for the current character
    var current = 0   // code point being assembled
    for (item <- utf8Str.split("\\\\x") if item.trim.nonEmpty) {
      val currentCode = Integer.parseInt(item.trim, 16)
      if (remaining == 0) {
        if ((currentCode & 0xf8) == 0xf0) {        // 11110xxx: 4-byte sequence
          remaining = 3
          current = currentCode & 0x07
        } else if ((currentCode & 0xf0) == 0xe0) { // 1110xxxx: 3-byte sequence
          remaining = 2
          current = currentCode & 0x0f
        } else if ((currentCode & 0xe0) == 0xc0) { // 110xxxxx: 2-byte sequence
          remaining = 1
          current = currentCode & 0x1f
        } else {
          current = currentCode                    // 0xxxxxxx: single byte
        }
      } else {
        // 10xxxxxx: shift in the low six bits of the continuation byte
        current = (current << 6) | (currentCode & 0x3f)
        remaining -= 1
      }
      if (remaining == 0) {
        // Character is complete: append it (as a surrogate pair when above U+FFFF)
        result.appendAll(Character.toChars(current))
        current = 0
      }
    }
    result.toString()
  }
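
The bit operations in the code above can be traced on a single character. The following Python 3 sketch decodes one UTF-8 sequence by hand; the function name decode_one is hypothetical, and unlike the code above it rejects malformed input instead of passing it through.

```python
def decode_one(seq):
    """Decode a single UTF-8 byte sequence into its Unicode code point."""
    lead = seq[0]
    if (lead & 0x80) == 0x00:      # 0xxxxxxx: 1 byte
        length, cp = 1, lead
    elif (lead & 0xE0) == 0xC0:    # 110xxxxx: 2 bytes
        length, cp = 2, lead & 0x1F
    elif (lead & 0xF0) == 0xE0:    # 1110xxxx: 3 bytes
        length, cp = 3, lead & 0x0F
    elif (lead & 0xF8) == 0xF0:    # 11110xxx: 4 bytes
        length, cp = 4, lead & 0x07
    else:
        raise ValueError("invalid lead byte: 0x%02x" % lead)
    if len(seq) != length:
        raise ValueError("expected %d bytes, got %d" % (length, len(seq)))
    for b in seq[1:]:
        if (b & 0xC0) != 0x80:     # continuation bytes must look like 10xxxxxx
            raise ValueError("invalid continuation byte: 0x%02x" % b)
        cp = (cp << 6) | (b & 0x3F)
    return cp


# \xE5\x85\x84 -> 0101 | 000101 | 000100 -> U+5144
cp = decode_one(bytes([0xE5, 0x85, 0x84]))
print(hex(cp), chr(cp))
```

For E5 85 84 the lead byte contributes 0101, and the two continuation bytes contribute 000101 and 000100, which together give 0101 0001 0100 0100, that is U+5144.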
