Open Source For You — December 2017

http://www.OpenSourceForU.com | OPEN SOURCE FOR YOU | DECEMBER 2017 | 79

Insight Developers

Collecting data: Data plays a vital role in the machine learning process. It can be from various sources and formats like Excel, Access, text files, etc. The higher the quality and quantity of the data, the better the machine learns. This is the base for future learning. Preparing the data: After collecting data, its quality must be checked and unnecessary noise and disturbances that are not of interest should be eliminated from the data. We need to take steps to fix issues such as missing data and the treatment of outliers. Training the model: The appropriate algorithm is selected in this step and the data is represented in the form of a model. The cleaned data is divided into training data and testing data. The training data is used to develop the data model, while the testing data is used as reference to ensure that the model has been trained well to produce accurate results. Model evaluation: In this step, the accuracy and precision of the chosen algorithm is ensured based on the results obtained using the test data. This step is used to evaluate the choice of the algorithm. Performance improvement: If the results are not satisfactory, then a different model can be chosen to implement the same or more variables are introduced to increase efficiency.

Types of machine learning algorithms Machine learning algorithms have been classified into three major categories. Supervised learning: Supervised learning is the most commonly used. In this type of learning, algorithms produce a function which predicts the future outcome based on the input given (historical data). The name itself suggests that it generates output in a supervised fashion. So these predictive models are given instructions on what needs to be learnt and how it is to be learnt. Until the model achieves some acceptable level of efficiency or accuracy, it iterates over the training data. To illustrate this method, we can use the algorithm for sorting apples and mangoes from a basket full of fruits.

Figure 1: Traditional programming vs machine learning Figure 2: The process of teaching machines

Figure 3: Implementing machine learning

Figure 4: Classification of algorithms

Figure 5: Supervised learning model (Image credit: Google)

Here we know how we can identify the fruits based on their colour, shape, size, etc. Some of the algorithms we can use here are the neural network, nearest neighbour, Naïve Bayes, decision trees and regression.

Data

Program

Output

Program

Training data for the machine like text files, SQL databases, spreadsheets etc.

Practical application happens here. It is used to generalize the real-time data to derive new insights

Actual learning happens here by representing data in simpler and logical format using algorithm

PERSON STUDENT TEACHER

ISA

dataCollecting^ the dataPreparing^ the modelTraining^ EvaluationModel^ ImprovementPerformance

Machine Learning Algorithm

Expected Label Predictive Model

New Text Document, Image, Sound

Training Text Documents, Images, Sounds...

features vector

Labels

Machine Learning Algorithms Classification

Reinforcement (Association Analysis)

Unsupervised (Clustering, Dimensionality Reduction)

Supervised (Classification, Regression/ Prediction)

Open Source For You — December 2017

Get our desktop app

Company

Features

Documentation

Resources