Python Programming: An Introduction to Computer Science

44 CHAPTER 4. COMPUTING WITH STRINGS

imagination, you might use the numbers 1–26 to represent the letters a–z. Instead of the word “sourpuss,” you
would write “19, 15, 21, 18, 16, 21, 19, 19.” To those who don’t know the code, this looks like a meaningless
string of numbers. For you and your friend, however, it represents a word.
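
As a rough illustration of this letter-numbering scheme, the sketch below maps each lowercase letter to its position in the alphabet. The function name letters_to_numbers and its first parameter are hypothetical, introduced only for this example, and the code assumes the word contains nothing but the letters a–z.

```python
import string

def letters_to_numbers(word, first=1):
    """Return the alphabet position of each letter, counting from `first`.

    Hypothetical helper for illustration; assumes `word` contains only
    the lowercase letters a-z.
    """
    alphabet = string.ascii_lowercase
    return [alphabet.index(letter) + first for letter in word]

print(letters_to_numbers("sourpuss"))           # numbering a-z as 1-26
print(letters_to_numbers("sourpuss", first=0))  # numbering a-z as 0-25
```

Which number the alphabet starts from is arbitrary; what matters is that both friends agree on it.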
This is how a computer represents strings. Each character is translated into a number, and the entire
string is stored as a sequence of (binary) numbers in computer memory. It doesn’t really matter what number
is used to represent any given character as long as the computer is consistent about the encoding/decoding
process. In the early days of computing, different designers and manufacturers used different encodings. You
can imagine what a headache this was for people transferring data between different systems.
Consider the situation that would result if, say, PCs and Macintosh computers each used their own
encoding. If you type a term paper on a PC and save it as a text file, the characters in your paper are represented
as a certain sequence of numbers. Then if the file were read into your instructor’s Macintosh computer, the
numbers would be displayed on the screen as different characters from the ones you typed. The result would
look like gibberish!
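
The mismatch described above can be reproduced in a few lines. This is only a sketch: it assumes Python 3 and uses two legacy codecs that ship with Python, cp437 (an old PC code page) and mac_roman (the classic Macintosh encoding), as stand-ins for the two machines. The sample text is made up.

```python
text = "résumé"

# Save the text as numbers (bytes) using an old PC encoding.
pc_bytes = text.encode("cp437")

# Reading the same numbers back with the same encoding recovers the text.
same_system = pc_bytes.decode("cp437")

# Reading them with a different encoding shows different characters.
other_system = pc_bytes.decode("mac_roman")

print(list(pc_bytes))        # the stored numbers
print(same_system == text)   # consistent encoding: text survives
print(other_system == text)  # mismatched encoding: text is altered
print(other_system)
```

The stored numbers never change; only the table used to interpret them does.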
To avoid this sort of problem, computer systems today use industry standard encodings. One important
standard is called ASCII (American Standard Code for Information Interchange). ASCII uses the numbers
0 through 127 to represent the characters typically found on an (American) computer keyboard, as well
as certain special values known as control codes that are used to coordinate the sending and receiving of
information. For example, the capital letters A–Z are represented by the values 65–90, and the lowercase
versions have codes 97–122.
One problem with the ASCII encoding, as its name implies, is that it is American-centric. It does not have
symbols that are needed in many other languages. Extended ASCII encodings have been developed by the
International Organization for Standardization (ISO) to remedy this situation. Most modern systems are moving
to support Unicode, a much larger standard that aims to cover the characters of all written languages. Newer
versions of Python include support for Unicode as well as ASCII.
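
To make the difference concrete, here is a small sketch that assumes Python 3, where strings are Unicode by default. It relies on the built-in ord function described in the next paragraph; the sample characters are arbitrary.

```python
# Codes 0-127 are the ASCII range; Unicode continues far beyond it.
samples = ["A", "é", "€"]
codes = [ord(ch) for ch in samples]

for code in codes:
    kind = "ASCII" if code < 128 else "beyond ASCII"
    print(code, kind)
```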
Python provides a couple of built-in functions that allow us to switch back and forth between characters and
the numeric values used to represent them in strings. The ord function returns the numeric (“ordinal”) code
of a single-character string, while chr goes the other direction. Here are some interactive examples:





>>> ord("a")
97
>>> ord("A")
65
>>> chr(97)
'a'
>>> chr(90)
'Z'
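
The letter ranges quoted earlier (65–90 for A–Z, 97–122 for a–z) can be checked with the same two functions. A small sketch using only built-ins; the variable names are illustrative.

```python
# Rebuild the alphabets from the ASCII ranges given in the text.
uppercase = "".join(chr(code) for code in range(65, 91))   # codes 65..90
lowercase = "".join(chr(code) for code in range(97, 123))  # codes 97..122
print(uppercase)
print(lowercase)

# chr undoes ord, so a round trip returns the original character.
round_trip = chr(ord("q"))
print(round_trip == "q")

# Each lowercase letter sits a fixed distance above its uppercase partner.
offset = ord("a") - ord("A")
print(offset)
```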





4.3.2 Programming an Encoder


Let’s return to the note-passing example. Using the Python ord and chr functions, we can write some
simple programs that automate the process of turning messages into strings of numbers and back again. The
algorithm for encoding the message is simple.


get the message to encode
for each character in the message:
    print the letter number of the character
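
One way the pseudocode above might turn into working code is sketched below. It is not the book's program, which is developed step by step in the text that follows; encode_message and decode_message are hypothetical names, and the numbers produced are whatever ord returns (ASCII or Unicode values), not the 1–26 scheme from the note-passing example.

```python
def encode_message(message):
    """Return the numeric code of each character in the message."""
    codes = []
    for ch in message:
        codes.append(ord(ch))
    return codes

def decode_message(codes):
    """Turn a sequence of numeric codes back into a string."""
    return "".join(chr(code) for code in codes)

codes = encode_message("What a Sourpuss!")
print(codes)
print(decode_message(codes))
```

Returning a list instead of printing inside the loop is a design choice made here so the result can be passed straight to the decoder.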


Getting the message from the user is easy; a raw_input will take care of that for us.

message = raw_input("Please enter the message to encode: ")
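
Note that raw_input exists only in Python 2; Python 3 renamed it to input. The sketch below shows one way to write the prompt so that it runs under either version. The helper name get_message and its reader parameter are assumptions made for this example, chosen so the function can be tried without waiting on the keyboard.

```python
try:
    read_line = raw_input  # Python 2
except NameError:
    read_line = input      # Python 3: input() returns the typed text as a string

def get_message(reader=read_line):
    """Prompt for the message; `reader` is any function that takes a prompt."""
    return reader("Please enter the message to encode: ")

# Simulate a user typing "hello" instead of reading from the keyboard.
print(get_message(lambda prompt: "hello"))
```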


Implementing the loop requires a bit more effort. We need to do something for each character of the message.
Recall that a for loop iterates over a sequence of objects. Since a string is a kind of sequence, we can just
use a for loop to run through all the characters of the message.


for ch in message:
