The MagPi - July 2018



to be robust enough and free to use
for our project.”
The results surprised him in
terms of accuracy and level of
detail: “People, pets, and large
objects seem to be the sweet spot.”

Even when the wand gets it
wrong, the results can be amusing.
“My kids had a lot of fun whenever
something was misidentified, such
as pointing at a toy robot on a table
and having it identified as ‘a small
child on a chair’. Another example
was pointing at our garage with a
sloping roof and being informed
there was ‘a skateboarder coming
down a hill’ – still not sure what it
thought the skateboarder was. My
favourite, though, had to be when
we pointed it at clouds and heard
what sounded like ‘Superman
flying across a blue sky’.”

Wiring the electronics
Components include a Pi Zero W,
Camera Module, and Speaker
pHAT. Wiring is currently via a mini
breadboard. The device is powered
by a 2200 mAh power cube.

PVC housing
The electronics are crammed into a PVC
tube. The camera fits into a closet-rod-
supporting end cap and is held in place by
rigid insulation, with its lens up against the
cap’s screw hole.

Two buttons
The breadboard holds two push-buttons:
one to take a photo of the item you want
to identify, and the other – wired to the
GPIO 03 and GND pins – to turn the Pi
Zero W on and off.
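
As a rough illustration of how the two buttons might be handled in software, here is a minimal Python sketch using the gpiozero and picamera libraries. It is not Robert's code. The photo button's pin (GPIO 17), the image path, the use of espeak for speech, and the placeholder describe() function are all assumptions made for the example; only the GPIO 03 power button comes from the article. Shorting GPIO 03 to GND wakes a halted Pi without any code, so the sketch only needs to handle shutting down.

```python
import subprocess


def identify_and_speak(capture, describe, speak):
    """Handle one press of the photo button: capture, describe, speak."""
    image_path = capture()
    try:
        caption = describe(image_path)
    except Exception:
        # Fall back to a spoken error if the description step fails.
        caption = "Sorry, I could not identify that"
    speak(caption)
    return caption


def main():
    # Hardware imports live here so the helper above can be tried off-device.
    from signal import pause
    from gpiozero import Button
    from picamera import PiCamera

    camera = PiCamera()
    photo_button = Button(17)              # assumed pin; the article does not name it
    power_button = Button(3, hold_time=2)  # GPIO 03 to GND, as in the article

    def capture():
        path = "/tmp/wand.jpg"             # hypothetical location for the photo
        camera.capture(path)
        return path

    def describe(path):
        # Placeholder: the call to an image-description service would go here.
        return "a photo was taken"

    def speak(text):
        # Assumes the espeak command-line tool is installed.
        subprocess.run(["espeak", text], check=False)

    photo_button.when_pressed = lambda: identify_and_speak(capture, describe, speak)
    # Holding the power button for two seconds asks the Pi to shut down.
    power_button.when_held = lambda: subprocess.run(
        ["sudo", "shutdown", "-h", "now"], check=False
    )
    pause()
```

On the Pi itself, calling main() would start listening for both buttons. Keeping the capture, describe, and speak steps as plain functions means the flow can be checked on a desktop with stand-ins before any wiring is done.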


As per its original inspiration,
however, the Seeing Wand could
be of serious use to partially
sighted people. “Although there
are smartphone apps that do the
same thing, this could be a less
expensive and more human-friendly
device.”

A Pi Camera Module is used to take photos of items,
while speech is output through a Speaker pHAT

Robert admits that the prototype
wand is a little rough around the
edges. “We have talked about
making improvements both to the
hardware and software. On the
hardware side, we would solder
all wires and buttons, and use a
smaller battery in order to make
it truly palm-sized and thinner
so it could fit as the holding end
of a white (blind) cane. For the
software, we’d like to integrate
the text recognition and possibly
language translation services so
signs and printed material could
be read, and the face recognition
service so people could be
identified. Also, as the cognitive
services are not yet perfect, it
would be interesting to ‘poll’
multiple services and determine
which identification is best through
our own cognitive meta-service.”
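
The idea of polling several services and choosing the best answer could be approached in many ways. The Python sketch below shows one simple possibility, offered purely as an illustration: each service returns a caption with a confidence score, and the caption with the highest combined confidence wins. The function name, the service names, and the scores are all hypothetical, and real services would rarely return identical wording, so some fuzzy matching of captions would likely be needed in practice.

```python
from collections import defaultdict


def best_caption(results):
    """Pick the caption with the highest total confidence.

    results is a list of (service_name, caption, confidence) tuples.
    Captions are compared after trimming whitespace and lower-casing.
    Returns None when there are no results.
    """
    totals = defaultdict(float)
    for _service, caption, confidence in results:
        totals[caption.strip().lower()] += confidence
    if not totals:
        return None
    return max(totals, key=totals.get)


# Hypothetical responses from three services for the same photo.
responses = [
    ("service_a", "A small child on a chair", 0.55),
    ("service_b", "a toy robot on a table", 0.40),
    ("service_c", "A toy robot on a table", 0.35),
]
print(best_caption(responses))  # a toy robot on a table
```

Here two services agreeing with moderate confidence outweigh one service that is more confident on its own, which is the kind of behaviour a meta-service might want.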
