Programming Python, 4th Edition, Mark Lutz (text edition)

>>> t = Text()
>>> t.insert('1.0', open('udata', 'rb').read())
>>> t.pack() # string appears in GUI OK
>>> t.get('1.0', 'end')
'AÄBäC\n'

It works the same if we pass a str fetched in text mode, but we then need to know the
file's encoding on the Python side of the fence, because reads will fail (or yield the
wrong characters) if the encoding name we pass doesn't match the stored data:


>>> t = Text()
>>> t.insert('1.0', open('ldata', 'r', encoding='latin-1').read())
>>> t.pack()
>>> t.get('1.0', 'end')
'AÄBäC\n'
>>>
>>> t = Text()
>>> t.insert('1.0', open('udata', 'r', encoding='utf-8').read())
>>> t.pack()
>>> t.get('1.0', 'end')
'AÄBäC\n'
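
The need for a matching encoding can also be checked without a GUI. The following is a minimal sketch, assuming the same sample text as in the sessions above. It decodes in-memory bytes instead of reading the udata and ldata files, and its variable names are illustrative only:

```python
# Same sample text as above, written with escapes: 'AÄBäC\n'
text = 'A\xc4B\xe4C\n'

latin_bytes = text.encode('latin-1')   # one byte per character
utf8_bytes = text.encode('utf-8')      # two bytes for each accented character

# Decoding with the matching encoding recovers the original str
assert latin_bytes.decode('latin-1') == text
assert utf8_bytes.decode('utf-8') == text

# Latin-1 bytes are not valid UTF-8, so this decode raises an exception
try:
    latin_bytes.decode('utf-8')
    failed = False
except UnicodeDecodeError:
    failed = True

# UTF-8 bytes decoded as Latin-1 raise no error, but the text is wrong:
# every byte maps to some Latin-1 character, so the result is garbled
garbled = utf8_bytes.decode('latin-1')
print(failed, garbled == text, len(text), len(garbled))
```

Note that the second mismatch is the quieter one: it produces no exception, only a longer string with the wrong characters in it.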

Either way, though, the fetched content is always a Unicode str, so binary mode really
only addresses loads: we still need to know an encoding to store, whether we write in
text mode directly or write in binary mode after manual encoding:


>>> c = t.get('1.0', 'end')
>>> c # content is str
'AÄBäC\n'

>>> open('cdata', 'wb').write(c) # binary mode needs bytes
TypeError: must be bytes or buffer, not str

>>> open('cdata', 'w', encoding='latin-1').write(c) # each write returns 6
>>> open('cdata', 'rb').read()
b'A\xc4B\xe4C\r\n'

>>> open('cdata', 'w', encoding='utf-8').write(c) # different bytes on files
>>> open('cdata', 'rb').read()
b'A\xc3\x84B\xc3\xa4C\r\n'

>>> open('cdata', 'w', encoding='utf-16').write(c)
>>> open('cdata', 'rb').read()
b'\xff\xfeA\x00\xc4\x00B\x00\xe4\x00C\x00\r\x00\n\x00'

>>> open('cdata', 'wb').write( c.encode('latin-1') ) # manual encoding first
>>> open('cdata', 'rb').read() # same but no \r on Win
b'A\xc4B\xe4C\n'

>>> open('cdata', 'w', encoding='ascii').write(c) # still must be compatible
UnicodeEncodeError: 'ascii' codec can't encode character '\xc4' in position 1: o

Notice the last test here: like manual encoding, file writes can still fail if the data cannot
be encoded in the target scheme. Because of that, programs may need to recover from


Chapter 9: A tkinter Tour, Part 2

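
One way a program might recover from an encoding failure on save is to try several encodings in turn. The following is a minimal sketch under stated assumptions: the helper name save_text and the order of candidate encodings are hypothetical and do not come from the text above, and the sketch writes in binary mode after manual encoding so that no end-of-line translation occurs:

```python
import os
import tempfile

def save_text(path, text, encodings=('ascii', 'latin-1', 'utf-8')):
    """Hypothetical helper: write text using the first encoding that can
    represent it, and return the name of the encoding that was used."""
    for name in encodings:
        try:
            data = text.encode(name)       # manual encoding may fail
        except UnicodeEncodeError:
            continue                       # try the next candidate
        with open(path, 'wb') as f:        # binary mode: bytes written as is
            f.write(data)
        return name
    raise ValueError('no candidate encoding can represent the text')

content = 'A\xc4B\xe4C\n'                  # same sample text: 'AÄBäC\n'
path = os.path.join(tempfile.mkdtemp(), 'cdata')
used = save_text(path, content)            # ascii fails, latin-1 succeeds

with open(path, 'rb') as f:
    stored = f.read()
print(used, stored)
```

A real program would likely also record or ask the user for the chosen encoding, since the same name is needed to read the file back correctly.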