A quick analysis of UTF-8 / GBK / Unicode (UTF-16 / UCS-2)

Unicode: u'\u4e00'
UTF-8: '\xe4\xb8\x80'
GBK: '\xd2\xbb'
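
A block like the one above can be reproduced with a few lines of Python 3. This is a minimal sketch, not the code from the gist: `hex_escape` and `show_encodings` are hypothetical helper names, and the `\uXXXX` formatting assumes every character is in the Basic Multilingual Plane.

```python
# Python 3 sketch; helper names are hypothetical, not taken from the gist.

def hex_escape(data):
    """Render every byte as a backslash-x escape, as in these notes."""
    return "".join("\\x%02x" % b for b in data)


def show_encodings(text):
    """Return the Unicode / UTF-8 / GBK lines for a string (BMP only)."""
    code_points = "".join("\\u%04x" % ord(ch) for ch in text)
    return [
        "Unicode: u'%s'" % code_points,
        "UTF-8: '%s'" % hex_escape(text.encode("utf-8")),
        "GBK: '%s'" % hex_escape(text.encode("gbk")),
    ]


for line in show_encodings("\u4e00"):
    print(line)
```

Formatting the bytes by hand rather than using `repr()` keeps printable bytes as `\xNN` too, so the output always has the same shape.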

Unicode: u'\u9fa5'
UTF-8: '\xe9\xbe\xa5'
GBK: '\xfd\x9b'

Unicode: u'\u70eb'
UTF-8: '\xe7\x83\xab'
GBK: '\xcc\xcc'

Unicode: u'\u951f'
UTF-8: '\xe9\x94\x9f'
GBK: '\xef\xbf'

Unicode: u'\u65a4'
UTF-8: '\xe6\x96\xa4'
GBK: '\xbd\xef'

Unicode: u'\u62f7'
UTF-8: '\xe6\x8b\xb7'
GBK: '\xbf\xbd'

锟斤拷

Unicode: u'\u951f\u65a4\u62f7'
UTF-8: '\xe9\x94\x9f\xe6\x96\xa4\xe6\x8b\xb7'
GBK: '\xef\xbf\xbd\xef\xbf\xbd'
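
The six GBK bytes above are also the UTF-8 encoding of two U+FFFD replacement characters (`ef bf bd` each). That is the usual way this string shows up: text already replaced with U+FFFD is written out as UTF-8 and then read back as GBK. A minimal Python 3 sketch of that reading, assuming the standard `utf-8` and `gbk` codecs:

```python
# Python 3 sketch: U+FFFD is the Unicode replacement character.
replacement = "\ufffd" * 2

# Two replacement characters become six bytes: ef bf bd ef bf bd.
raw = replacement.encode("utf-8")

# GBK reads the same six bytes as three two-byte characters.
garbled = raw.decode("gbk")

print(garbled)
```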

Unicode: u'\u941c'
UTF-8: '\xe9\x90\x9c'
GBK: '\xe7\x8e'

Unicode: u'\u5b2a'
UTF-8: '\xe5\xac\xaa'
GBK: '\x8b\xe5'

Unicode: u'\u608d'
UTF-8: '\xe6\x82\x8d'
GBK: '\xba\xb7'

鐜嬪悍

Unicode: u'\u941c\u5b2a\u608d'
UTF-8: '\xe9\x90\x9c\xe5\xac\xaa\xe6\x82\x8d'
GBK: '\xe7\x8e\x8b\xe5\xba\xb7'

Unicode: u'\u738b'
UTF-8: '\xe7\x8e\x8b'
GBK: '\xcd\xf5'

Unicode: u'\u5eb7'
UTF-8: '\xe5\xba\xb7'
GBK: '\xbf\xb5'

王康

Unicode: u'\u738b\u5eb7'
UTF-8: '\xe7\x8e\x8b\xe5\xba\xb7'
GBK: '\xcd\xf5\xbf\xb5'

王康

UTF-8 misdecoded as GBK:
鐜嬪悍
u'\u941c\u5b2a\u608d'
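
This case is easy to reproduce, and here it is also reversible, because the six UTF-8 bytes happen to form three valid GBK characters. A minimal Python 3 sketch (variable names are illustrative only):

```python
# Python 3 sketch of the UTF-8 -> GBK mix-up.
original = "\u738b\u5eb7"                 # the two-character name used in these notes

utf8_bytes = original.encode("utf-8")     # e7 8e 8b e5 ba b7
mojibake = utf8_bytes.decode("gbk")       # same bytes, read two at a time as GBK

# Undo the mistake: back to the raw bytes, then decode with the right codec.
recovered = mojibake.encode("gbk").decode("utf-8")

print(mojibake)
print(recovered)
```

The round trip only works when every byte pair is a valid GBK character. With an odd number of UTF-8 bytes, or an invalid pair, the GBK decode fails or loses information.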

GBK misdecoded as UTF-8:
>_< (decoding fails: '\xcd\xf5\xbf\xb5' is not valid UTF-8)
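
A minimal Python 3 sketch of this direction, assuming the standard `utf-8` codec: a strict decode raises `UnicodeDecodeError`, while `errors="replace"` substitutes U+FFFD for each malformed sequence, so the original text cannot be recovered afterwards.

```python
# Python 3 sketch of the GBK -> UTF-8 mix-up.
gbk_bytes = "\u738b\u5eb7".encode("gbk")  # cd f5 bf b5

try:
    gbk_bytes.decode("utf-8")             # strict mode is the default
    strict_ok = True
except UnicodeDecodeError as exc:
    strict_ok = False
    print("strict decode failed:", exc.reason)

# Lenient mode never raises, but the original bytes are lost.
lenient = gbk_bytes.decode("utf-8", errors="replace")
print(ascii(lenient))
```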

The code that generated the output above is in this gist: https://gist.github.com/scateu/faa03c167066bf40d388