
Fig. 2.1.9 The evolving architecture.

The Data Lake Architecture


The architectural component surrounding the data lake deserves a deeper explanation. In
front of the data lake is a mechanism for capturing and preparing the data about to enter
the data lake from external sources. There are several reasons why such an elaborate
ingestion interface is needed. The primary reasons are the following:


Data arrive so fast that the data lake cannot ingest them as rapidly as they are generated.
The volume of data is so large that some sort of landing zone is appropriate.
Raw editing of data needs to be employed before the data arrive in the data lake. In some cases, data
are discarded. In other cases, data are categorized. In yet other cases, data are refurbished before their
entry into the data lake.
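
The three reasons above can be pictured with a minimal sketch in Python. It is only an illustration under stated assumptions, not a description of any particular product: the class name IngestionInterface, its methods, the record layout (a dictionary with a "value" key), and the editing rules are all hypothetical. A queue stands in for the landing zone, a plain list stands in for the data lake, and the edit step shows discarding, refurbishing, and categorizing.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class IngestionInterface:
    """Hypothetical ingestion layer sitting in front of a data lake."""

    # Landing zone: buffers records that arrive faster than the lake ingests.
    landing_zone: deque = field(default_factory=deque)
    # Stand-in for the data lake itself (an assumption for illustration).
    data_lake: list = field(default_factory=list)

    def receive(self, record: dict) -> None:
        """Accept a record immediately; loading into the lake happens later."""
        self.landing_zone.append(record)

    def edit(self, record: dict) -> Optional[dict]:
        """Raw editing: discard, refurbish, and categorize one record."""
        value = record.get("value")
        if value is None:
            return None  # discarded: nothing usable to store
        cleaned = dict(record)
        if isinstance(value, str):
            cleaned["value"] = value.strip()  # refurbished: whitespace trimmed
        # categorized: a simple, assumed two-way classification
        is_text = isinstance(cleaned["value"], str)
        cleaned["category"] = "text" if is_text else "numeric"
        return cleaned

    def drain(self, batch_size: int) -> int:
        """Move up to batch_size records from the landing zone into the lake.

        Returns the number of records actually loaded (discards excluded).
        """
        loaded = 0
        for _ in range(min(batch_size, len(self.landing_zone))):
            edited = self.edit(self.landing_zone.popleft())
            if edited is not None:
                self.data_lake.append(edited)
                loaded += 1
        return loaded


if __name__ == "__main__":
    interface = IngestionInterface()
    for rec in [{"value": " click "}, {"value": None}, {"value": 42}]:
        interface.receive(rec)  # arrivals are never blocked by the lake
    interface.drain(batch_size=2)  # the lake ingests at its own pace
    print(interface.data_lake, len(interface.landing_zone))
```

The design choice worth noting is the separation of receive from drain: arrival speed and ingestion speed are decoupled, which is the role the landing zone plays in the text.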

Fig. 2.1.10 shows the data lake infrastructure.


Chapter 2.1: The End-State Architecture—The “World Map”