Open Source For You — December 2017

(Steven Felgate) #1
http://www.OpenSourceForU.com | OPEN SOURCE FOR YOU | DECEMBER 2017 | 79

Insight Developers

Collecting data: Data plays a vital role in the machine
learning process. It can come from various sources and in
various formats, such as Excel, Access and text files. The
higher the quality and quantity of the data, the better the
machine learns. This is the base for future learning.
Preparing the data: Once the data has been collected, its
quality must be checked, and noise and disturbances that are
not of interest should be removed from it. Steps must also be
taken to handle missing data and to treat outliers.
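
The two fixes named in this step can be sketched in a few lines of Python. This is only an illustration under assumptions of our own: missing entries are marked as None and filled with the median, outliers are clipped with the common 1.5 x IQR rule, and the readings in `raw` are made up. The helper names `fill_missing` and `clip_outliers` are hypothetical.

```python
from statistics import median, quantiles

def fill_missing(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    mid = median(observed)
    return [mid if v is None else v for v in values]

def clip_outliers(values, k=1.5):
    """Clip values lying more than k * IQR beyond the first or third quartile."""
    q1, _, q3 = quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [min(max(v, low), high) for v in values]

# Made-up readings (assumption): one missing entry and one obvious outlier.
raw = [10.0, 12.0, None, 11.0, 13.0, 95.0, 12.0]
cleaned = clip_outliers(fill_missing(raw))
print(cleaned)
```

Clipping is only one possible treatment. Depending on the data, dropping the outlying records or investigating them by hand may be more appropriate.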
Training the model: The appropriate algorithm is
selected in this step and the data is represented in the form
of a model. The cleaned data is divided into training data
and testing data. The training data is used to develop the
model, while the testing data is used as a reference to
check that the model has been trained well enough to produce
accurate results.
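
As a rough sketch of the split described above, the cleaned records can be shuffled and divided into two lists. The function name `train_test_split`, the 25 per cent test fraction and the fixed seed are our own choices for the illustration, not something the process prescribes.

```python
import random

def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of rows and return (training, testing) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

rows = list(range(20))  # stand-in for 20 cleaned records (assumption)
train, test = train_test_split(rows)
print(len(train), "training rows,", len(test), "testing rows")
```

Shuffling before splitting matters when the records are stored in some order, for example by date or by class, since otherwise the testing data may not resemble the training data.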
Model evaluation: In this step, the accuracy and
precision of the chosen algorithm are measured using the
results obtained on the test data. This step is used to
evaluate the choice of algorithm.
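
To make the two measures concrete, here is a small sketch that assumes "accuracy" and "precision" are meant in their usual classification sense: accuracy is the share of all predictions that are correct, and precision is the share of the predictions for one class that are correct. The labels below are made up for the illustration.

```python
def accuracy(actual, predicted):
    """Fraction of predictions that match the true labels."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)

def precision(actual, predicted, positive):
    """Of the items predicted as `positive`, the fraction that really are."""
    truths = [a for a, p in zip(actual, predicted) if p == positive]
    if not truths:
        return 0.0  # the class was never predicted
    return sum(a == positive for a in truths) / len(truths)

# Made-up test labels and model predictions (assumption).
actual    = ["apple", "apple", "mango", "mango", "apple", "mango"]
predicted = ["apple", "mango", "mango", "mango", "apple", "apple"]

acc = accuracy(actual, predicted)
prec = precision(actual, predicted, positive="apple")
print(f"accuracy={acc:.2f} precision(apple)={prec:.2f}")
```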
Performance improvement: If the results are not
satisfactory, a different model can be chosen for the same
task, or more variables can be introduced to increase
efficiency.
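
Choosing a different model can be as simple as scoring several candidates on the same held-out data and keeping the best one. The sketch below assumes two hypothetical threshold models and made-up fruit weights. None of the names or numbers come from a real data set.

```python
def accuracy(model, examples):
    """Fraction of (input, label) pairs that the model labels correctly."""
    return sum(model(x) == label for x, label in examples) / len(examples)

# Two hypothetical candidates: each maps a fruit's weight in grams to a label.
def threshold_150(weight):
    return "mango" if weight > 150 else "apple"

def threshold_200(weight):
    return "mango" if weight > 200 else "apple"

# Made-up held-out data, for illustration only.
held_out = [(120, "apple"), (140, "apple"), (180, "apple"),
            (230, "mango"), (300, "mango"), (350, "mango")]

candidates = [threshold_150, threshold_200]
scores = {m.__name__: accuracy(m, held_out) for m in candidates}
best = max(candidates, key=lambda m: accuracy(m, held_out))
print(scores, "->", best.__name__)
```

In practice the comparison is usually made on a separate validation set, so that the final test data still gives an unbiased estimate of how the chosen model performs.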

Types of machine learning algorithms
Machine learning algorithms have been classified into three
major categories.
Supervised learning: Supervised learning is the most
commonly used type. Here, the algorithm produces a function
that predicts future outcomes based on the input given
(historical data). As the name suggests, the output is
generated in a supervised fashion: the predictive model is
told what needs to be learnt and how it is to be learnt. The
model iterates over the training data until it achieves an
acceptable level of accuracy.
To illustrate this method, consider an algorithm that sorts
apples and mangoes from a basket full of fruit.

Figure 1: Traditional programming vs machine learning

Figure 2: The process of teaching machines


Figure 3: Implementing machine learning

Figure 4: Classification of algorithms

Figure 5: Supervised learning model (Image credit: Google)

Here we already know how to identify the fruits based on
their colour, shape, size, etc.
Some of the algorithms we can use here are neural
networks, nearest neighbour, Naïve Bayes, decision
trees and regression.
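
Of the algorithms listed, nearest neighbour is the easiest to sketch. The example below labels a fruit by majority vote among its k closest labelled examples. The two features (weight in hundreds of grams and a redness score between 0 and 1), the training values and the function name `classify` are all assumptions made up for this illustration.

```python
import math
from collections import Counter

# Hypothetical labelled fruits: (weight in 100 g units, redness 0-1) -> label.
TRAINING = [
    ((1.5, 0.9), "apple"),
    ((1.7, 0.8), "apple"),
    ((1.6, 0.7), "apple"),
    ((3.0, 0.3), "mango"),
    ((3.5, 0.2), "mango"),
    ((2.8, 0.4), "mango"),
]

def classify(features, k=3):
    """Label a fruit by majority vote among its k nearest training examples."""
    nearest = sorted(TRAINING, key=lambda ex: math.dist(ex[0], features))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

print(classify((1.65, 0.85)))  # a small, red fruit
print(classify((3.2, 0.25)))   # a large, less red fruit
```

The features are kept on similar scales on purpose. With raw grams next to a 0 to 1 score, the weight would dominate the distance and the colour would hardly count.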

[Figure 1 labels: traditional programming: Data, Program -> Output; machine learning: Data, Output -> Program]

[Figure 2 labels: Training data for the machine, such as text files, SQL databases and spreadsheets. Actual learning happens here, by representing the data in a simpler, logical format using an algorithm. Practical application happens here: it is used to generalise the real-time data to derive new insights. Diagram labels: PERSON, STUDENT, TEACHER, ISA]

[Figure 3 stages: Collecting data, Preparing the data, Training the model, Model evaluation, Performance improvement]

[Figure 4 labels: Machine learning algorithms classification: Supervised (classification, regression/prediction); Unsupervised (clustering, dimensionality reduction); Reinforcement (association analysis)]

[Figure 5 labels: Training text, documents, images, sounds; Labels; Features vector; Machine learning algorithm; New text, document, image, sound; Predictive model; Expected label]