Open Source For You — December 2017



By: Palak Shah
The author is an associate consultant at Capgemini and loves to explore new technologies. She is an avid reader and writer of technology-related articles. She can be reached at [email protected]

Machine learning tools
Many open source tools and frameworks are available
for implementing machine learning on a system.
You can choose any of them, based on your preference for a
specific language or environment.
Shogun: Shogun is one of the oldest machine learning
libraries available. It provides a wide range
of efficient machine learning methods. It supports
many languages, such as Python, Octave, R, Java/Scala,
Lua, C# and Ruby, and runs on platforms such as Linux/UNIX,
MacOS and Windows. It is easy to use, and is quite fast at
compilation and execution.
Weka: Weka is data mining software that offers a
collection of machine learning algorithms for mining data.
These algorithms can be applied directly to a data set or
called from your own Java code.
Weka is a collection of tools for:
Regression
Clustering
Association rules
Data pre-processing
Classification
Visualisation
Apache Mahout: Apache Mahout is a free and
open source project. It provides an environment
for quickly creating scalable machine learning algorithms
in fields such as collaborative filtering, clustering and
classification. It also provides Java libraries and
collections for various kinds of mathematical operations.
TensorFlow: TensorFlow performs numerical
computations using data flow graphs, in which nodes
represent operations and edges carry the data between
them. It optimises these computations well, offers Python
and C++ APIs along with other language options, and is
highly flexible and portable.
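To make the idea of a data flow graph concrete, here is a minimal sketch in plain Python. It does not use TensorFlow's API; the `Node` class and the `constant`, `add` and `multiply` helpers are hypothetical names invented for this illustration. It only shows the general pattern of describing a computation as a graph first and evaluating it afterwards.

```python
# Toy data flow graph. This is NOT TensorFlow's API, only an
# illustration of the idea; all names here are hypothetical.
class Node:
    """A graph node: either a constant value or an operation on input nodes."""

    def __init__(self, op=None, inputs=(), value=None):
        self.op = op
        self.inputs = tuple(inputs)
        self.value = value

    def evaluate(self, cache=None):
        """Evaluate this node, computing each upstream node only once."""
        if cache is None:
            cache = {}
        if id(self) not in cache:
            if self.op is None:
                cache[id(self)] = self.value
            else:
                args = [node.evaluate(cache) for node in self.inputs]
                cache[id(self)] = self.op(*args)
        return cache[id(self)]


def constant(value):
    return Node(value=value)


def add(a, b):
    return Node(op=lambda x, y: x + y, inputs=(a, b))


def multiply(a, b):
    return Node(op=lambda x, y: x * y, inputs=(a, b))


# Build the graph for (a + b) * b; nothing is computed at this point.
a = constant(2.0)
b = constant(3.0)
result = multiply(add(a, b), b)

# The computation happens only when the graph is evaluated.
print(result.evaluate())  # 15.0
```

Because the whole computation is known before anything runs, a framework built on this pattern has the opportunity to optimise the graph or run it on different devices.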
CUDA-Convnet: CUDA-Convnet is a machine
learning library widely used for neural network
applications. It has been developed in C++, but can also
be used by those who prefer Python over C++. The
neural nets produced by this library
can be saved as Python-pickled objects, and those objects
can then be loaded and accessed from Python.
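As a rough illustration of what working with Python-pickled objects looks like, the sketch below saves and restores an object with the standard `pickle` module. The `net` dictionary is a hypothetical stand-in; the actual layout of a CUDA-Convnet checkpoint is not described in this article and may well differ.

```python
import pickle

# Hypothetical stand-in for a trained network; the real CUDA-Convnet
# checkpoint layout is an unknown here and may differ.
net = {
    "layers": [
        {"name": "conv1", "type": "conv", "filters": 32},
        {"name": "fc10", "type": "fc", "outputs": 10},
    ],
    "epoch": 5,
}

# Serialise to bytes (pickle.dump would write to an open file instead).
blob = pickle.dumps(net)

# Load it back and inspect it like any other Python object.
restored = pickle.loads(blob)
layer_names = [layer["name"] for layer in restored["layers"]]
print(layer_names)
```

Only unpickle files from sources you trust, since loading a pickle can execute arbitrary code.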
H2O: This is an open source machine learning and
deep learning framework. It is written in Java, offers
interfaces for Python and R, and has a powerful graphical
interface that makes it easy to control training. H2O's algorithms are mainly
used for business processes like fraud or trend predictions.

Languages that support machine learning
The languages given below support the implementation of
machine learning:
MATLAB
R
Python
Java

For a non-programmer, however, Weka is highly recommended for working with machine learning algorithms.

Advantages and challenges
The advantages of machine learning are:
Machine learning helps the system to decide, based on the training data provided, in a dynamic or undetermined state.
It can handle multi-dimensional, multi-variety data, and can extract implicit relationships within large data sets in a dynamic, complex and chaotic environment.
It saves a lot of time, since different aspects of an algorithm can be tweaked, added or dropped to better structure the data.
It supports continuous quality improvement for any large or complex process.
Multiple iterations are performed to deliver the highest possible accuracy in the final model.
Machine learning allows easy application and comfortable adjustment of parameters to improve classification performance.

The challenges of machine learning are as follows:
A common challenge is the collection of relevant data. Once the data is available, it has to be pre-processed according to the requirements of the specific algorithm used, and this has a serious effect on the final results.
Non-differentiable, discontinuous loss functions are difficult to optimise with machine learning techniques. Discontinuous loss functions are important in cases such as sparse representations. Non-differentiable loss functions are therefore approximated by smooth loss functions without much loss in sparsity.
Machine learning algorithms are not guaranteed to work in every possible case. Choosing the right algorithm requires some awareness of the problem and some experience.
Collecting such large amounts of data can sometimes be an unmanageable and unwieldy task.
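The point above about approximating non-differentiable loss functions with smooth ones can be illustrated with the absolute-error loss, which has no derivative at zero. The sketch below uses one common smooth surrogate, sqrt(r^2 + eps^2). Both this particular surrogate and the value of `eps` are assumptions chosen for illustration, not something the article prescribes.

```python
import math


def abs_loss(r):
    """Absolute-error loss |r|: not differentiable at r = 0."""
    return abs(r)


def smooth_abs_loss(r, eps=1e-3):
    """Smooth surrogate sqrt(r^2 + eps^2); eps is an assumed smoothing constant."""
    return math.sqrt(r * r + eps * eps)


def smooth_abs_grad(r, eps=1e-3):
    """Derivative of the surrogate, defined everywhere, including r = 0."""
    return r / math.sqrt(r * r + eps * eps)


# Compare the two losses and show the surrogate's gradient at a few residuals.
for r in (-1.0, 0.0, 0.5):
    print(r, abs_loss(r), smooth_abs_loss(r), smooth_abs_grad(r))
```

Away from zero the surrogate stays very close to |r|, while near zero it is rounded off, so gradient-based optimisers can be applied. A smaller `eps` tracks |r| more tightly at the cost of a sharper change in the gradient around zero.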

