storage, the cost of big data would be prohibitive. In order to be a practical and useful
solution, big data, of necessity, must be able to use inexpensive storage.


The Roman Census Approach


One of the cornerstones of big data architecture is a style of processing referred to as
the “Roman census approach.” By using the Roman census approach, a big data
architecture can accommodate the processing of almost unlimited amounts of data.


When people first hear the term “Roman census approach,” it sounds counterintuitive
and unfamiliar. The reaction most people have is “and just exactly what is a Roman
census approach?” Yet architecturally, the approach is at the core of how big data
functions. And, surprisingly, it turns out that many people are much more familiar
with the Roman census approach than they ever realized.


Once upon a time, about 2000 years ago, the Romans decided that they wanted to tax
everyone in the Roman empire. But in order to tax the citizens of the Roman empire, the
Romans first had to have a census. The Romans quickly figured out that trying to get
every person in the Roman empire to march through the gates of Rome in order to be
counted was an impossibility. There were people in North Africa, in Spain, in Germany,
in Greece, in Persia, in Israel, in England, and so forth. Not only were there a lot of
people in faraway places, but transporting all of them on ships and carts and donkeys to
and from the city of Rome was simply not feasible.


So, the Romans realized that creating a census where the processing (i.e., the counting
and the taking of the census) was done centrally was not going to work. The Romans
solved the problem by creating a body of “census takers.” The census takers were
organized in Rome and then were sent all over the Roman empire, and on the appointed
day, a census was taken. After taking the census, the census takers headed back to
Rome, where the results were tabulated centrally.


In such a fashion, the work was sent to the data, rather than the data being sent to a
central location and all the work being done in one place. By distributing the processing,
the Romans solved the problem of taking a census of a large, diverse population.
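
The idea of sending the work to the data can be sketched in a few lines of Python. This is only an illustrative sketch, not how any particular big data product is implemented: the province names, the record values, and the function names are all hypothetical, and a thread pool stands in for machines that would, in practice, each sit next to their own storage. Each "census taker" counts the records where they live, and only the small tallies travel back to be combined centrally.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: each province holds its own records locally.
PROVINCES = {
    "Hispania": ["citizen", "citizen", "freedman"],
    "Graecia": ["citizen", "freedman"],
    "Africa": ["citizen", "citizen", "citizen"],
}


def take_local_census(records):
    """Count the records where they live; only the tally is returned."""
    return Counter(records)


def tabulate_centrally(local_tallies):
    """Combine the per-province tallies into one empire-wide result."""
    total = Counter()
    for tally in local_tallies:
        total.update(tally)
    return total


def roman_census(provinces):
    """Send one census taker to each province, then tabulate in 'Rome'."""
    # The thread pool is an assumption standing in for distributed workers.
    with ThreadPoolExecutor() as pool:
        local_tallies = list(pool.map(take_local_census, provinces.values()))
    return tabulate_centrally(local_tallies)


empire_total = roman_census(PROVINCES)
print(dict(empire_total))
```

Note that the raw records never leave their province in this sketch. Only the per-province counts are moved, which is why the amount of data that can be handled grows with the number of census takers rather than being limited by what one central location can hold.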


Many people are very familiar with the Roman census method and don't realize it.
You see, there once was a story about two people, Mary and Joseph, who had to
travel to a small city, Bethlehem, for the taking of a Roman census. On the


Chapter 4.2: What Is Big Data?