46 CHAPTER 5 ■ WHAT'S NEW IN PHP 6
Listing 5-6. Unicode Casting
Note that not all characters can be converted to binary format. If you were to take the
Canadian Aboriginal Syllabics from Listing 5-4 and try to convert them to binary format, you
would end up with all of your text replaced by question marks (0x3f bytes), because the
question mark is the default error-substitution character.
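The substitution behavior can be observed with a rough sketch like the following. It does not use the PHP 6 cast syntax; as a stand-in, it assumes the mbstring extension is loaded and uses mb_convert_encoding() to convert UTF-8 text to ISO-8859-1, which has no representation for syllabics. The two characters are picked for illustration and are not the text of Listing 5-4.

```php
<?php
// Assumption: the mbstring extension is loaded (stand-in for the PHP 6 cast).
// Two syllabics characters chosen for illustration, written as UTF-8 bytes:
// U+1403 and U+14C4.
$syllabics = "\xE1\x90\x83\xE1\x93\x84";

// Make the error-substitution character explicit: 0x3f is "?".
mb_substitute_character(0x3f);

// ISO-8859-1 cannot represent these characters, so each one is substituted.
$binary = mb_convert_encoding($syllabics, 'ISO-8859-1', 'UTF-8');

echo $binary, "\n";          // ??
echo bin2hex($binary), "\n"; // 3f3f
```

Each unconvertible character becomes a single substitution byte, so the original text cannot be recovered from the converted string.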
Unicode Collations
In addition to generally supporting Unicode, PHP 6 will support Unicode collations. One
feature of collations is that they allow you to sort a list based on the sorting rules of a specific
language or region. Listing 5-7 demonstrates sorting in the traditional Spanish collation, which
uses an extended 30-character alphabet, in which ch, ll, ñ, and rr are all considered separate
characters. Because of these extra characters, ll is sorted after l, just as the letter b is normally
sorted after a.
Listing 5-7. Unicode Collation Sorting
<?php
$list = array('luna','llaves','limonada');

//Normal alphabetical sort, lla before lu
sort($list);
print_r($list);

//Collated sort, lla after lu
locale_set_default('es_VE@collation=traditional');
sort($list, SORT_LOCALE_STRING);
print_r($list);
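For comparison, the same ordering can be sketched with the Collator class from the intl extension instead of sort() with SORT_LOCALE_STRING. This is an alternative under stated assumptions, not the listing's method: it assumes intl (ICU) is loaded, and the variable names and the plain 'es' locale used for the baseline are illustrative.

```php
<?php
// Assumption: the intl extension (ICU) is loaded.
$list = array('luna', 'llaves', 'limonada');

// Default Spanish rules: "ll" is compared as two separate letters.
$modern = new Collator('es');
$modernList = $list;
$modern->sort($modernList);
print_r($modernList);      // limonada, llaves, luna

// Traditional Spanish rules: "ll" is a single letter that sorts after "l".
$traditional = new Collator('es@collation=traditional');
$traditionalList = $list;
$traditional->sort($traditionalList);
print_r($traditionalList); // limonada, luna, llaves
```

Sorting copies of the array keeps the original order available, which makes it easy to compare the two collations side by side.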
