Python Programming: An Introduction to Computer Science

196 CHAPTER 11. DATA COLLECTIONS

counts = {}
for w in words:
    try:
        counts[w] = counts[w] + 1
    except KeyError:
        counts[w] = 1
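
As a small check of the counting loop, here is a minimal runnable sketch. The sample list assigned to words is hypothetical; in the chapter, words comes from the document being analyzed.

```python
# Hypothetical sample input standing in for the words of a document.
words = ["spam", "foo", "spam", "bar", "foo", "spam"]

counts = {}
for w in words:
    try:
        # The word has been seen before: add one to its count.
        counts[w] = counts[w] + 1
    except KeyError:
        # First occurrence of the word: start its count at 1.
        counts[w] = 1

print(counts)
```

Each word ends up as a key whose value is the number of times it appeared in the list.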

Our last step is to print a report that summarizes the contents of counts. One approach might be to print
out the list of words and their associated counts in alphabetical order. Here's how that could be done:


# get list of words that appear in document
uniqueWords = counts.keys()

# put list of words in alphabetical order
uniqueWords.sort()

# print words and associated counts
for w in uniqueWords:
    print w, counts[w]
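
The listing above relies on Python 2 behavior, where keys() returns a list and print is a statement. In Python 3, keys() returns a view object that has no sort method. A rough Python 3 equivalent, assuming a small hypothetical counts dictionary, might look like this:

```python
# Hypothetical counts, standing in for the dictionary built from a document.
counts = {"spam": 3, "foo": 2, "bar": 1}

# sorted() accepts any iterable and returns a new list of the keys
# in alphabetical order.
uniqueWords = sorted(counts)

# print words and associated counts
for w in uniqueWords:
    print(w, counts[w])
```

The report has the same shape as before: one word per line, followed by its count.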


For a large document, however, this is unlikely to be useful. There will be far too many words, most of
which only appear a few times. A more interesting analysis is to print out the counts for the n most frequent
words in the document. In order to do that, we will need to create a list that is sorted by counts (most to
fewest) and then select the first n items in the list.
We can start by getting a list of key-value pairs using the items method for dictionaries.


items = counts.items()


Here items will be a list of tuples (e.g., [('foo',5), ('bar',7), ('spam',376), ...]). If we
simply sort this list (items.sort()) Python will put them in a standard order. Unfortunately, when Python
compares tuples, it orders them by components, left to right. Since the first component of each pair is the
word, items.sort() will put this list in alphabetical order, which is not what we want.
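
To see the left-to-right tuple ordering in action, here is a short sketch using the three example pairs from the text. The list is written out by hand so that the snippet stands alone.

```python
# The example word-count pairs, written out as a list of tuples.
items = [("foo", 5), ("bar", 7), ("spam", 376)]

# With no extra arguments, sort() compares the tuples component by
# component, so the words decide the order and the counts are only
# consulted when two words are equal.
items.sort()

print(items)  # [('bar', 7), ('foo', 5), ('spam', 376)]
```

The pairs come out alphabetized by word, and the counts play no part in the ordering.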
In order to put our pair list in the proper order, we need to investigate the sorting method for lists a bit
more carefully. When we first covered the sort method, I mentioned that it can take a comparison function
as an optional parameter. We can use this feature to tell Python how to sort the list of pairs.
If no comparison function is given, Python orders a list according to the built-in function cmp. This
function accepts two values as parameters and returns -1, 0 or 1, corresponding to the relative ordering of the
parameters. Thus, cmp(a,b) returns -1 if a precedes b, 0 if they are the same, and 1 if a follows b. Here
are a few examples.





>>> cmp(1,2)
-1
>>> cmp("a","b")
-1
>>> cmp(3,1)
1
>>> cmp(3.1,3.1)
0
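
The built-in cmp is available in Python 2 only; it was removed in Python 3. For readers trying these examples on Python 3, a stand-in can be sketched as follows. The definition here is an illustration, not the built-in itself.

```python
def cmp(a, b):
    # Illustrative replacement for the Python 2 built-in:
    # gives -1 if a precedes b, 0 if they are equal, 1 if a follows b.
    # Each comparison yields True or False, which subtract as 1 and 0.
    return (a > b) - (a < b)

print(cmp(1, 2))      # -1
print(cmp("a", "b"))  # -1
print(cmp(3, 1))      # 1
print(cmp(3.1, 3.1))  # 0
```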





To sort our list of items, we need a comparison function that takes two items (i.e., word-count pairs) and
returns either -1, 0 or 1, giving the relative order in which we want those two items to appear in the sorted
list. Here is the code for a suitable comparison function:


def compareItems((w1,c1), (w2,c2)):
    if c1 > c2:
        return -1
    elif c1 == c2:
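
As a separate illustration, and not a completion of the listing above, here is one way a count-based comparison function could be used in Python 3. Python 3 does not allow tuple parameters in a function header, and its sort method takes a key function instead of a comparison function, so this sketch unpacks the pairs inside the function and wraps it with functools.cmp_to_key. The name compare_by_count, the sample data, and the handling of equal counts are all assumptions made for the example.

```python
from functools import cmp_to_key

def compare_by_count(item1, item2):
    # Hypothetical comparison function: pairs with larger counts come first.
    (w1, c1), (w2, c2) = item1, item2
    if c1 > c2:
        return -1
    elif c1 == c2:
        return 0   # assumption: equal counts are treated as a tie
    else:
        return 1

# Sample word-count pairs taken from the example in the text.
items = [("foo", 5), ("bar", 7), ("spam", 376)]

# cmp_to_key adapts an old-style comparison function for sort().
items.sort(key=cmp_to_key(compare_by_count))

n = 2                # hypothetical number of words to report
top = items[:n]      # the n most frequent pairs
print(top)           # [('spam', 376), ('bar', 7)]
```

Slicing off the first n items of the sorted list gives the n most frequent words that the analysis calls for.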
