http://www.OpenSourceForU.com | OPEN SOURCE FOR YOU | DECEMBER 2017 | 79
Insight Developers
Collecting data: Data plays a vital role in the machine
learning process. It can come from various sources and in
various formats, such as Excel, Access and text files. The
higher the quality and quantity of the data, the better the
machine learns. This data is the base for future learning.
Preparing the data: Once the data has been collected, its
quality must be checked, and noise and disturbances that
are not of interest should be eliminated. We need to take
steps to fix issues such as missing data and to treat
outliers.
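The two issues named above, missing data and outliers, can be sketched as follows. This is only a minimal illustration, assuming numeric readings where a missing value is stored as None. The helper names fill_missing and remove_outliers, the median fill and the 1.5 x IQR rule are choices made for this sketch and are not prescribed by the article. The sample weights are made up.

```python
from statistics import median, quantiles

def fill_missing(values):
    """Replace None entries with the median of the values that are present."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]

def remove_outliers(values, k=1.5):
    """Keep only values inside [Q1 - k*IQR, Q3 + k*IQR] (a common rule of thumb)."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if low <= v <= high]

# Made-up fruit weights in grams: one reading is missing, one is implausible.
raw_weights = [150, 160, None, 155, 900, 158, 152]
cleaned = remove_outliers(fill_missing(raw_weights))
print(cleaned)
```

Filling with the median rather than the mean keeps the implausible 900 g reading from distorting the replacement value. Whether an outlier should be dropped, capped or kept depends on the data set.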
Training the model: The appropriate algorithm is
selected in this step and the data is represented in the form
of a model. The cleaned data is divided into training data
and testing data. The training data is used to develop the
model, while the testing data is used as a reference to
check that the model has been trained well enough to
produce accurate results.
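Dividing the cleaned data into training and testing data might look like the sketch below. The helper name split_rows, the 25 per cent test share and the fixed seed are assumptions made for illustration. The article does not state a split ratio.

```python
import random

def split_rows(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of rows and return (training_rows, testing_rows)."""
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(rows)
    # A seeded generator makes the split repeatable between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

rows = list(range(20))          # stand-in for 20 cleaned records
train, test = split_rows(rows)
print(len(train), len(test))
```

Shuffling before splitting avoids a test set made up only of the last records collected, which may not be representative of the rest.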
Model evaluation: In this step, the accuracy and
precision of the chosen algorithm are measured using the
results obtained on the test data. This step is used to
evaluate the choice of algorithm.
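For a classification task, accuracy and precision on the test data could be computed roughly as shown here. This sketch assumes the usual classification definitions: accuracy is the share of correct predictions, and precision is the share of predicted positives that really are positive. The labels and predictions are made up.

```python
def accuracy(actual, predicted):
    """Share of predictions that match the actual labels."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)

def precision(actual, predicted, positive):
    """Share of items predicted as `positive` that really are `positive`."""
    hits = [a for a, p in zip(actual, predicted) if p == positive]
    if not hits:
        return 0.0  # nothing was predicted positive
    return sum(a == positive for a in hits) / len(hits)

# Made-up test labels and model predictions.
actual    = ["apple", "apple", "mango", "mango", "apple", "mango"]
predicted = ["apple", "mango", "mango", "mango", "apple", "apple"]

print(round(accuracy(actual, predicted), 2))
print(round(precision(actual, predicted, "apple"), 2))
```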
Performance improvement: If the results are not
satisfactory, a different model can be chosen for the same
task, or more variables can be introduced to increase
efficiency.
Types of machine learning algorithms
Machine learning algorithms have been classified into three
major categories.
Supervised learning: Supervised learning is the most
commonly used type. Here, algorithms produce a function
that predicts future outcomes based on the input given
(historical data). As the name suggests, the output is
generated in a supervised fashion: these predictive models
are given instructions on what needs to be learnt and how
it is to be learnt. The model iterates over the training
data until it achieves an acceptable level of efficiency
or accuracy.
Figure 1: Traditional programming vs machine learning
Figure 2: The process of teaching machines
Figure 3: Implementing machine learning
Figure 4: Classification of algorithms
Figure 5: Supervised learning model (Image credit: Google)
To illustrate this method, consider an algorithm for
sorting apples and mangoes from a basket full of fruit.
Here we already know how to identify the fruits based on
their colour, shape, size, etc.
Some of the algorithms we can use here are neural
networks, nearest neighbour, Naïve Bayes, decision
trees and regression.
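The fruit-sorting illustration can be sketched with nearest neighbour, one of the algorithms listed above. Everything specific here is an assumption made for illustration: the three features (redness, length-to-width ratio, weight), the sample values, and the names TRAINING, scale and classify. It is a minimal 1-nearest-neighbour sketch, not a production classifier.

```python
import math

# Hypothetical labelled examples:
# (redness 0-1, length-to-width ratio, weight in grams) -> fruit
TRAINING = [
    ((0.9, 1.0, 150), "apple"),
    ((0.8, 1.1, 170), "apple"),
    ((0.7, 1.0, 160), "apple"),
    ((0.3, 1.5, 300), "mango"),
    ((0.4, 1.6, 280), "mango"),
    ((0.2, 1.4, 320), "mango"),
]

def scale(features, lows, highs):
    """Min-max scale each feature so that weight does not dominate the distance."""
    return tuple(
        (f - lo) / (hi - lo) if hi > lo else 0.0
        for f, lo, hi in zip(features, lows, highs)
    )

def classify(sample, training=TRAINING):
    """Return the label of the training example closest to `sample`."""
    columns = list(zip(*(features for features, _ in training)))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    target = scale(sample, lows, highs)

    def distance(row):
        return math.dist(scale(row[0], lows, highs), target)

    _, label = min(training, key=distance)
    return label

print(classify((0.85, 1.05, 155)))
print(classify((0.35, 1.5, 290)))
```

The labelled examples play the role of the historical data: the known colour, shape and size of each fruit are what allow a new fruit to be sorted.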
[Figure 1 labels: Data, Program, Output]
[Figure 2 labels: Training data for the machine, like text files, SQL databases, spreadsheets, etc. Actual learning happens here, by representing data in a simpler and logical format using an algorithm. Practical application happens here; it is used to generalize the real-time data to derive new insights. Example nodes: PERSON, STUDENT, TEACHER, ISA.]
[Figure 3 steps: Collecting data, Preparing the data, Training the model, Model evaluation, Performance improvement]
[Figure 4 labels: Machine learning algorithms classification: Supervised (Classification, Regression/Prediction); Unsupervised (Clustering, Dimensionality Reduction); Reinforcement (Association Analysis)]
[Figure 5 labels: Training text documents, images, sounds...; Labels; Features vector; Machine learning algorithm; New text document, image, sound; Predictive model; Expected label]