Street Photography Magazine

Invisible Watermarks | Tracing Images

Alongside effective watermarking, finding
marked images is another very interesting
aspect of the data security process. The best
watermark in the world is no use at all if it
simply disappears in the mass of data stored
on the Internet. All the major watermark
services run their own crawler services for
finding and retrieving watermarked images
from the endless depths of the Web.


Capacity Is Key


Server capacity and available bandwidth are
critical to the success of a crawler service.
Nowadays, any small company has access to
entire server farms via Amazon Web Services
or similar services, so you need to treat the
statistics quoted by a potential crawler
service provider with care. As an end user, it
is virtually impossible to find out exactly
what resources a provider has at its disposal.
According to company data, market leader
Digimarc crawls the Web monthly and claims
that it usually takes between one and six
months to locate a stolen image, although
precise estimates are simply not possible.
Like search engine optimization, Web
crawling is subject to a number of factors
that influence a provider’s ability to find
exactly what it is looking for.
An image has to be recognizable to a
crawler if it is to be found at all. If stolen
images are presented as part of a Flash gallery,
most crawlers will have trouble finding them.
Even if an image is stored in a format that
the crawler can read, there is still no guarantee
that it can be found. The chances of finding
a stolen image are best if it is part of a
well-frequented and technically well-built
website. Less popular sites that are not so
search engine-friendly make it much more
difficult to produce positive results. If the
particular page on which an image is posted
is not linked internally or externally, a crawler
will quickly reach the limits of its capabilities.
Crawlers depend on Web content that has
been appropriately optimized for search
engines.
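
The point about unlinked pages can be illustrated with a minimal sketch. It assumes a crawler that discovers pages only by following links from a start page; the site map, page names, and the function reachable_pages are hypothetical and are not taken from any real crawler service.

```python
from collections import deque

# Hypothetical site map: each page lists the pages it links to.
LINKS = {
    "/index.html": ["/gallery.html", "/about.html"],
    "/gallery.html": ["/photo1.html"],
    "/about.html": ["/index.html"],
    "/photo1.html": [],
    "/hidden/stolen.html": [],  # assumption: no other page links here
}


def reachable_pages(links, start):
    """Return the set of pages a link-following crawler reaches from start."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


found = reachable_pages(LINKS, "/index.html")
# Pages that exist on the server but are never discovered by the crawl.
missed = sorted(set(LINKS) - found)
print(missed)
```

In this sketch the page holding the stolen image is never visited, because the breadth-first walk only sees pages that something else links to.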

Social Networks and Crawlers Don’t Get On Well

The situation is also quite tricky at photo
portals like Google’s Picasa and on social
networks like Facebook. These types of
website deliver dynamic content stored in
CMS (Content Management System)
databases. This is common practice at larger
websites and their operators always try to
create search-friendly pages at all levels of the
database structure. Ideally, a site will use its CMS
to simulate a static structure that a crawler can
easily interpret. Search engine optimization
plays only a minor role at sites such as
Facebook where most of the content is only
visible to members, making it extremely
difficult for a crawler to find all the relevant
related content. Additionally, a spider will not
usually have appropriate access rights, and
will not be able to search relevant pages even
if it can find them.
Generally, crawler software can only find
publicly available material or material for
which it has appropriate access rights, making
password-protected or pay-to-view sites off
limits. Adult entertainment sites are nowadays
almost exclusively pay-to-view, and major
publishers are placing more and more of their
content behind commercial barriers of one
sort or another. If a thief steals images
protected using just a watermark and posts
them on a pay-to-view web page, crawler
software will be powerless and the watermark
becomes useless.
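
As a rough sketch of how a crawler might recognize content it is not allowed to read, the following assumes the server signals the barrier with a standard HTTP status code. In practice many sites return an ordinary login page instead, so this is only an illustration; the function classify_response and its labels are hypothetical.

```python
from http import HTTPStatus

# Status codes that typically mean "you need credentials or payment".
BLOCKED = {
    HTTPStatus.UNAUTHORIZED,      # 401: login required
    HTTPStatus.PAYMENT_REQUIRED,  # 402: pay-to-view
    HTTPStatus.FORBIDDEN,         # 403: access denied
}


def classify_response(status_code):
    """Label a response by whether an anonymous crawler can read the page."""
    if status_code in BLOCKED:
        return "blocked"
    if 200 <= status_code < 300:
        return "readable"
    return "other"


for code in (200, 401, 403, 404):
    print(code, classify_response(code))
```

A page labeled "blocked" here may still contain a watermarked image, but the crawler has no way to download and inspect it.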

Not Perfect, but Still Useful


A manual search using visual source material
at TinEye or Google is painstaking, whereas
crawler services are capable of monitoring
large collections of images automatically.
Crawler technology is imperfect, and it is
fairly simple for a webmaster to undermine
the rudimentary protection it provides.
Nevertheless, crawler services are a useful
tool for content owners and make it
relatively easy to document image theft. If a
thief goes public with a stolen image, the
chances are pretty high that a crawler will
find it sooner or later.

Using Crawler Software to Find Stolen Images


Invisible watermarks make it possible to use crawler software to find stolen images in
the wilds of the Internet, although there are various technical challenges that you need
to overcome before you can start to retrieve lost data. Even the best spider bots are
choosy about what they can find, and they often take their time too.


The ADP Tools crawler searches the Web for images in a similar
way to the familiar Google Image Search
