- Tcl attempts to convert byte strings to its internal UTF-8 format, and generally
supports translation using the platform and locale encodings in the local operating
system with Latin-1 as a fallback.
- Python’s tkinter passes bytes strings to Tcl directly, but copies Python str Unicode
strings to and from Tcl Unicode string objects.
- Tk inherits all of Tcl’s Unicode policies, but adds additional font selection policies
for display.
In other words, GUIs that display text in tkinter are somewhat at the mercy of multiple
layers of software, above and beyond the Python language itself. In general, though,
Unicode is broadly supported by Tk’s Text widget for Python str, but not for Python
bytes. As you can probably tell, this story quickly becomes very low-level and
detailed, so we won’t explore it further in this book; see the Web and other resources
for more on tkinter, Tk, and Tcl, and the interfaces between them.
Other binary mode considerations
Even in contexts where it’s sufficient, using binary mode files to finesse encodings for
display is more complicated than you might think. We always need to be careful to
write output in binary mode, too, so what we read is what we later write—if we read
in binary mode, content end-lines will be \r\n on Windows, and we don’t want text-
mode files to expand this to \r\r\n. Moreover, there’s another difference in tkinter for
str and bytes. A str read from a text-mode file appears in the GUI as you expect, and
end-lines are mapped on Windows as usual:
C:\...\PP4E\Gui\Tour> python
>>> from tkinter import *
>>> T = Text() # str from text-mode file
>>> T.insert('1.0', open('jack.txt').read()) # platform default encoding
>>> T.pack() # appears in GUI normally
>>> T.get('1.0', 'end')[:75]
'000) All work and no play makes Jack a dull boy.\n001) All work and no pla'
If you pass in a bytes obtained from a binary-mode file, however, it’s odd in the GUI
on Windows—there’s an extra space at the end of each line, which reflects the \r that
is not stripped by binary mode files:
C:\...\PP4E\Gui\Tour> python
>>> from tkinter import *
>>> T = Text() # bytes from binary-mode
>>> T.insert('1.0', open('jack.txt', 'rb').read()) # no decoding occurs
>>> T.pack() # lines have space at end!
>>> T.get('1.0', 'end')[:75]
'000) All work and no play makes Jack a dull boy.\r\n001) All work and no pl'
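The \r\n pairs visible in this result are also why output must be written in binary mode when input was read that way. The following is a minimal sketch of that hazard, not code from this book's examples: the sample bytes and the scratch file name demo.txt are made up for illustration, and passing newline='\r\n' to open is an assumption used here to imitate Windows text-mode output on any platform.

```python
import os
import tempfile

# Hypothetical sample: bytes as they would be read in binary mode on Windows
data = b'000) line one\r\n001) line two\r\n'

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'demo.txt')         # scratch file for this demo only

    # Binary-mode write: the bytes are stored exactly as given
    with open(path, 'wb') as f:
        f.write(data)
    with open(path, 'rb') as f:
        binary_result = f.read()

    # Text-mode write: newline='\r\n' imitates the Windows default translation,
    # so every \n is expanded to \r\n and the existing \r is left in place
    with open(path, 'w', encoding='latin-1', newline='\r\n') as f:
        f.write(data.decode('latin-1'))
    with open(path, 'rb') as f:
        text_result = f.read()

print(binary_result == data)                     # binary round trip is unchanged
print(text_result)                               # each end-line is now \r\r\n
```

Reading the file back in binary mode in both cases shows what was actually stored: the binary-mode write matches the original bytes, while the text-mode write gains an extra \r on every line.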
To use bytes to allow for arbitrary text but make the text appear as expected by users,
we also have to strip the \r characters at line end manually. This assumes that a \r\n
combination doesn’t mean something special in the text’s encoding scheme, though
data in which this sequence does not mean end-of-line will likely have other issues when