http://www.phparch.com \ May 2019 \ 31
Education Station
Data Structures, Part One
I do not find myself using stacks
nearly as often as queues, but they can
be convenient under the right circum-
stances. I just come across those use
cases much less often than cases where
I need to use FIFO.
Heaps
What do you do when you need
something other than FIFO or LIFO?
What about when you need to deal with
data in a specific order, but that order
is not tied to the actual insertion order?
Well, we have heaps!
Heaps are a data structure that sorts
data based on specific criteria. We can
somewhat simulate this with a bare
array and usort() as in Listing 8.
The big difference between a heap
and this example is that a heap keeps
the values automatically sorted for you.
In the above example, the “heap” is not
sorted until we call usort(). Yes, this is
what sort() and rsort() are precisely
for, but this is a good example to show
that sometimes we need to order our
values in a precise way.
We do have some proper heap classes:
\SplHeap, \SplMinHeap and \SplMaxHeap.
The last two classes automatically sort
scalar values by their lowest and highest
values, while the base class of \SplHeap
is useful for when you need to have a
custom comparator. For example, if
you have a set of objects you want to
sort, you can pass a custom comparison
function that knows how to sort the
objects.
You can specify a custom compar-
ison function for \SplMinHeap and
\SplMaxHeap if you need to. At that
point though, I’d suggest using
\SplHeap since it is a much more
generic heap implementation.
If we want to do the same thing as
above, our best bet is to use \SplMinHeap,
which iterates over the values from
smallest to largest, see Listing 9.
The big reason to use the various heap
classes is it automatically sorts for you.
While you can supply a custom sort
function for arrays, or use the built-in
sort() and rsort() functions, it relies
on you to actually call them to sort. It is
very easy to accidentally miss a call and
get things back in the wrong order.
Use the Best Tool for the Job
These are some of the more common
data structures that PHP lets you use.
I can not lie—PHP arrays are pretty
powerful on their own, and most of the
time I use them. When I need some-
thing more strict, it is nice that the SPL
provides some common data structures.
The problem is they kind of suck. \
SplQueue, which is billed as a queue and
should allow you to only move from
first to last, allows you to run $queue-
>pop() to get the value at the end. You
can even use $queue->unshift() to put
things at the front of the queue. Since
both \SplQueue and inherit from`, you
can manipulate them in ways that you
should not be able to.
Most of the time, these special data
structures are not any faster than a
bare array. In most applications, any
benefit your data might get from a
specialized data structure is negli-
gible, and sometimes even worse. For
example, inserting 10 million random
numbers into \SplMinHeap took roughly
14 seconds on my desktop, versus six
seconds to add and then sort the same
amount of numbers into an array. PHP’s
internal hash map it uses for arrays is
really efficient in practice.
What you do gain is more explicit
communication of your intent. If you
use a \SplStack object, you are very
clearly stating how you expect to use
this list of data. In your specific appli-
cation, the speed difference might not
even make a difference. Honestly, I
would take clearer, slightly slower code
over obtuse code any day (unless I
desperately need those microseconds).
What if you do need them? There
is an extension that provides properly
implemented data structures. It is the
aptly named “Data Structures” exten-
sion, and we look at it next month. If
you need heavy use of data structures, it
may be what you are looking for if, like
me, you are not impressed with the SPL
classes.
Chris Tankersley is a husband, father, author, speaker,
podcast host, and PHP developer. He works for InQuest, a
network security company out of Washington, DC, but lives in
Northwest Ohio. Chris has worked with many different
frameworks and languages throughout his twelve years of
programming but spends most of his day working in PHP and
Python. He is the author of Docker for Developers and works
with companies and developers for integrating containers into
their workflows. @dragonmantank
Listing 9
- $heap = new \SplMinHeap();
- $heap->insert( 6 );
- $heap->insert( 5 );
- $heap->insert( 10 );
- $heap->insert( 1 );
- foreach ($heap as $value) {
- print $value. PHP_EOL;
- }
- // Output
- // 1
- // 5
- // 6
- // 10
Related Reading
- Internal Aparatus: Memory Abstractions by Edward Barnard. May 2019.
https://phparch.com/magazine/2019-2/may/ - Leveling Up: The Art of Transduction by David Stockton. June 2016.
https://phparch.com/magazine/2016-2/june/