[Python编程(第4版)].(Programming.Python.4th.Edition).Mark.Lutz.文字版

>>> s
'AÄBäC'

>>> x = s.encode('utf-8').decode('latin-1')      # decoding works, result is garbage
>>> x
UnicodeEncodeError: 'charmap' codec can't encode character '\xc3' in position 2:...

>>> len(s), len(x)                               # no longer the same string
(5, 7)

>>> s.encode('utf-8')                            # no longer same code points
b'A\xc3\x84B\xc3\xa4C'
>>> x.encode('utf-8')
b'A\xc3\x83\xc2\x84B\xc3\x83\xc2\xa4C'

>>> s.encode('latin-1')
b'A\xc4B\xe4C'
>>> x.encode('latin-1')
b'A\xc3\x84B\xc3\xa4C'
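The UnicodeEncodeError shown above appears to come from the console trying to display the scrambled string, not from the decode itself (the assignment to x succeeds). As a minimal sketch, the built-in ascii() can be used to inspect both strings without depending on what the console can print; the string is spelled with escapes here only so the example does not depend on the source file's own encoding:

```python
s = 'A\u00c4B\u00e4C'                        # same text as 'AÄBäC', written with escapes
x = s.encode('utf-8').decode('latin-1')     # wrong decoder: each UTF-8 byte becomes one character

# ascii() escapes every non-ASCII character, so printing is safe on any console
print(ascii(s))                             # 'A\xc4B\xe4C'
print(ascii(x))                             # 'A\xc3\x84B\xc3\xa4C'

# The two accented characters each turned into two characters
print(len(s), len(x))                       # 5 7
print([hex(ord(c)) for c in x])             # code points of the scrambled string
```

Each two-byte UTF-8 sequence in the encoded data has become two separate Latin-1 characters, which accounts for the length growing from 5 to 7.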

Curiously, the original string may still be recoverable after a mismatch like this: if we
encode the scrambled string back to Latin-1 again (one byte per character) and then decode
those bytes properly as UTF-8, we get the original back, because Latin-1 maps every byte
value to exactly one character and so loses nothing. In some contexts this can constitute
a sort of second chance if data is decoded wrong initially:

>>> s
'AÄBäC'
>>> s.encode('utf-8').decode('latin-1')
UnicodeEncodeError: 'charmap' codec can't encode character '\xc3' in position 2:...

>>> s.encode('utf-8').decode('latin-1').encode('latin-1')
b'A\xc3\x84B\xc3\xa4C'
>>> s.encode('utf-8').decode('latin-1').encode('latin-1').decode('utf-8')
'AÄBäC'
>>> s.encode('utf-8').decode('latin-1').encode('latin-1').decode('utf-8') == s
True
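The second-chance idea can be wrapped in a small helper. This is only a sketch: repair_mojibake is a hypothetical name, not something from the book or the standard library, and it assumes the text was UTF-8 data mistakenly decoded as Latin-1. If the reverse trip fails, it simply returns its input unchanged:

```python
def repair_mojibake(text, wrong='latin-1', right='utf-8'):
    """Hypothetical helper: undo a decode made with the wrong codec.

    Re-encodes text with the codec that was wrongly used, then decodes
    the resulting bytes with the codec that should have been used.
    """
    try:
        return text.encode(wrong).decode(right)
    except UnicodeError:
        # Covers both UnicodeEncodeError and UnicodeDecodeError:
        # the text does not look like this kind of mismatch, so leave it alone
        return text

s = 'A\u00c4B\u00e4C'                             # 'AÄBäC', written with escapes
garbled = s.encode('utf-8').decode('latin-1')     # the mismatch shown earlier

print(repair_mojibake(garbled) == s)              # True: original restored
print(repair_mojibake(s) == s)                    # True: correct text is left as is
print(repair_mojibake('spam') == 'spam')          # True: ASCII is unaffected
```

The fallback matters because a correctly decoded string such as s does not form valid UTF-8 when re-encoded as Latin-1, so the decode step raises and the helper returns the text untouched. A string that happens to survive the trip by coincidence would still be altered, so this is best applied only when a mismatch is already suspected.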

On the other hand, we can use a different encoding name to decode, as long as it’s
compatible with the format of the data; ASCII, UTF-8, and Latin-1, for instance, all
format ASCII text the same way:

>>> 'spam'.encode('utf8').decode('latin1')
'spam'
>>> 'spam'.encode('latin1').decode('ascii')
'spam'
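To see where this compatibility ends, the following sketch compares the bytes each encoding produces. The variable names are illustrative only. For pure ASCII text all three encodings agree byte for byte; once a non-ASCII character appears, UTF-8 and Latin-1 diverge and the ASCII decoder refuses the data:

```python
ascii_text = 'spam'
mixed_text = 'A\u00c4B'                       # 'AÄB': one non-ASCII character

# Pure ASCII text: every encoding yields the same bytes
names = ('ascii', 'utf-8', 'latin-1')
ascii_bytes = {name: ascii_text.encode(name) for name in names}
same_for_ascii = len(set(ascii_bytes.values())) == 1

# Non-ASCII text: the encodings no longer agree
utf8_bytes = mixed_text.encode('utf-8')       # b'A\xc3\x84B'
latin1_bytes = mixed_text.encode('latin-1')   # b'A\xc4B'
same_for_mixed = utf8_bytes == latin1_bytes

# ...and the ASCII decoder rejects bytes above 127
try:
    utf8_bytes.decode('ascii')
    ascii_decode_failed = False
except UnicodeDecodeError:
    ascii_decode_failed = True

print(same_for_ascii, same_for_mixed, ascii_decode_failed)   # True False True
```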

It’s important to remember that a string’s decoded value doesn’t depend on the encoding
it came from: once decoded, a string has no notion of encoding and is simply a sequence
of Unicode characters (“code points”). Hence, we really only need to care about encodings
at the point of transfer to and from files:

>>> s
'AÄBäC'
>>> s.encode('utf-16').decode('utf-16') == s.encode('latin-1').decode('latin-1')
True
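The same point can be illustrated with real files. In this sketch, round_trip is a hypothetical helper that writes a string to a temporary file in a given encoding and reads it back with the same one; the stored bytes differ from encoding to encoding, but the string that comes back does not:

```python
import os
import tempfile

s = 'A\u00c4B\u00e4C'                       # 'AÄBäC', written with escapes

def round_trip(text, encoding):
    """Hypothetical helper: save text in the given encoding, then reload it.

    Returns the reloaded string and the file's size in bytes.
    """
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, 'w', encoding=encoding) as f:
            f.write(text)
        size = os.path.getsize(path)
        with open(path, 'r', encoding=encoding) as f:
            return f.read(), size
    finally:
        os.remove(path)

results = {enc: round_trip(s, enc) for enc in ('latin-1', 'utf-8', 'utf-16')}
for enc, (text, size) in results.items():
    # File sizes vary with the encoding; the decoded string is always equal to s
    print(enc, size, text == s)
```

The encoding name matters only in the two open calls. Everything between them works on the same sequence of code points.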

542 | Chapter 9: A tkinter Tour, Part 2
