Data Analysis and Artificial Intelligence in Python

Interpreting real data with NumPy, pandas and scikit-learn


The analysis and interpretation of data that originates from our surroundings and environment is a topic of increasing interest. Such data is now an established part of our everyday lives, ranging from that collected about our climate to data acquired during smart manufacturing processes. The quantity of data available allows us, in theory, to characterize any phenomenon. However, dealing with it requires the mastery of many skills, both theoretical and practical. Here we examine some of these techniques and make use of Python for the analysis of this real-world data.


By Angelo Cardellicchio (Italy)


Terms such as big data and artificial intelligence have become a permanent part of our everyday language. This is primarily due to two factors. The first is the increasing and pervasive diffusion of data acquisition systems, which has allowed the creation of virtually endless knowledge repositories. The second is the continued growth in computational capability, thanks to the widespread use of GPGPUs (general-purpose graphics processing units) [1], which has made it possible to tackle computational challenges whose resolution was once considered essentially impossible.


Let’s start with the description of an application scenario that will accompany us through this article. Let’s imagine having to monitor an entire production chain (the precise product does not play a role here). We have the ability to acquire data from a wide range of sources. For example, we can place sensors along the entire production line, or make use of contextual information that indicates the age and type of each machine. This set of data, or dataset, can be used for different purposes. One of these is predictive maintenance, allowing us to evaluate and predict the occurrence of abnormal situations, plan orders for replacement parts, or undertake repairs before failures occur, all of which result in cost savings and increased productivity. In addition, knowledge of the data’s history allows us to correlate the data measured by each sensor, highlighting possible cause/effect relationships. As an example, if a sudden increase in the temperature and humidity of the room is followed by a decrease in the number of pieces manufactured, it may be necessary to keep the climatic conditions constant using air conditioning.
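
To give a flavour of what such an analysis looks like in practice, here is a minimal sketch using pandas (one of the libraries introduced below); the sensor readings and column names are invented purely for illustration.

import pandas as pd

# Invented hourly readings from our hypothetical production line
data = pd.DataFrame({
    "temperature": [21.5, 21.7, 24.9, 26.3, 26.8, 22.1],
    "humidity":    [40.2, 41.0, 55.3, 58.9, 60.1, 42.5],
    "pieces_made": [118, 121, 97, 88, 85, 116],
})

# Pairwise Pearson correlation: values close to -1 would support the
# suspicion that output drops as temperature and humidity rise
print(data.corr())

On this toy data, pieces_made is strongly negatively correlated with both climate variables, which is exactly the kind of cause/effect hint described above.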

The implementation of such a system is certainly not within everyone’s reach. However, it is greatly simplified by the tools made available by the open source community. All that is required is a PC (or, alternatively, our trusty Raspberry Pi, if the amount of data to be processed is not huge), a knowledge of Python (which you can deepen by following a tutorial such as the one at [2]) and, of course, some knowledge of the ’tools of the trade’. OK — let’s get started and discover them together!

The tools of the trade
Needless to say, we must be able to create programs written in Python. To do so, we will have to install the interpreter, which can be downloaded from the official Python website [3]. In the rest of this article we will assume that Python has already been installed and added to the system’s PATH environment variable.
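
As a quick sanity check, you can ask the interpreter for its version from a terminal; the exact version string will of course differ on your system, and any reasonably recent Python 3 release is fine for what follows.

python --version

If this prints something like Python 3.9.1, the interpreter is installed and reachable from the PATH.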

The virtual environment
Once the Python setup is complete, it is time to set up a virtual environment. This is implemented as a sort of ’container’ that is separate from the rest of our system and into which the libraries used are installed. The reason for using a virtual environment rather than installing libraries globally relates to the rapid evolution of the Python world: very often, substantial differences arise even between successive versions of the same library.
