
Thinking Pi

FORGE

model = DecisionTreeClassifier()

model.fit(X_train, y_train)

That’s it! Now you get predictions on the test data:

model.predict(X_test)

You can see how many of these it predicts correctly by manually comparing them to the answers stored in y_test. You should find that they’re mostly correct, but there is an easier way to check the model’s performance. Scikit-learn models have a .score method that takes some input data and the matching answers, and tells you what fraction of the predictions are right:

model.score(X_test, y_test)
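For a classifier, the number returned by .score is the fraction of predictions that match the answers, which is the same manual comparison described above. A minimal sketch of that comparison in plain Python follows; fraction_correct is a hypothetical helper name and the example labels are made up for illustration, not taken from y_test.

```python
def fraction_correct(predictions, answers):
    """Return the fraction of predictions that match the known answers."""
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must be the same length")
    matches = sum(1 for p, a in zip(predictions, answers) if p == a)
    return matches / len(answers)


# Illustrative labels only (0 = setosa, 1 = versicolor, 2 = virginica);
# these are NOT the real contents of y_test.
example_answers = [0, 1, 2, 2, 1, 0, 1, 2]
example_predictions = [0, 1, 2, 1, 1, 0, 1, 2]

# Seven of the eight made-up predictions match their answers.
example_score = fraction_correct(example_predictions, example_answers)
print(example_score)
```

With the variables from this article, fraction_correct(list(model.predict(X_test)), list(y_test)) should give the same value as model.score(X_test, y_test).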

We got a result of 0.9473684, which means just under 95% of the predictions were correct. It’s not perfect, but pretty good for a few lines of code. Now, because we are on a Raspberry Pi using Python, we can control anything we want with the output. Some LEDs perhaps? We can import the gpiozero library, which will allow us to easily control hardware using Python.

from gpiozero import LED

# connect an LED to GPIO 17 on the Raspberry Pi
led = LED(17)

We can create a function that will turn on the LED if we predict a flower to be of type setosa (output label of 0), and turn it off otherwise.

def led_on_if_setosa(input_data):
    # input data is a list with the following values:
    # [sepal length (cm), sepal width (cm), petal length (cm), petal width (cm)]
    prediction = model.predict([input_data])[0]

    if prediction == 0:
        led.on()
    else:
        led.off()

Let’s see if we’ve found a setosa flower:

led_on_if_setosa([4.7, 3.2, 1.3, 0.2])

Hopefully your LED has turned on. This is the basic structure for code to bring machine learning to physical computing projects. You need some training data, a learning algorithm, and a way of performing actions depending on predictions. If an automatic flower-identifying kit isn’t what you’re after, then you can send almost any type of data into this system (provided it’s not too noisy). Temperature and other environmental sensors can work really well, but it depends on what you want to control.
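To make the three parts concrete for a different kind of input, here is a rough sketch using temperature readings. Everything in it is an assumption for illustration: the readings and labels are made up, FakeFan, learn_threshold, and fan_on_if_warm are hypothetical names, and the learning step is a simple single-threshold rule written in plain Python rather than scikit-learn’s DecisionTreeClassifier. FakeFan stands in for a real output device so the code runs without any hardware attached.

```python
class FakeFan:
    """Stand-in for a real output device so the sketch runs without hardware."""

    def __init__(self):
        self.is_on = False

    def on(self):
        self.is_on = True

    def off(self):
        self.is_on = False


def learn_threshold(readings, labels):
    """Pick the cut-off that classifies the most training readings correctly.

    A reading is predicted as 1 ("too warm") when it is above the threshold.
    """
    ordered = sorted(set(readings))
    # Candidate cut-offs sit halfway between neighbouring distinct readings.
    candidates = [(a + b) / 2 for a, b in zip(ordered, ordered[1:])]
    if not candidates:
        raise ValueError("need at least two distinct readings to learn from")

    def correct(threshold):
        # Count training examples where the rule agrees with the label.
        return sum((r > threshold) == bool(l) for r, l in zip(readings, labels))

    return max(candidates, key=correct)


def fan_on_if_warm(reading, threshold, fan):
    """Act on a prediction: switch the device on only for 'too warm' readings."""
    if reading > threshold:
        fan.on()
    else:
        fan.off()


# 1. Training data (made up): temperatures in degrees C, and whether the
#    room felt too warm (1) or not (0).
temperatures = [18.0, 19.5, 21.0, 22.5, 25.0, 26.5, 28.0]
too_warm = [0, 0, 0, 0, 1, 1, 1]

# 2. Learning algorithm: find the best single cut-off.
threshold = learn_threshold(temperatures, too_warm)

# 3. Action that depends on the prediction for a new reading.
fan = FakeFan()
fan_on_if_warm(27.0, threshold, fan)
print(threshold, fan.is_on)
```

On a Raspberry Pi, the stand-in could be swapped for a gpiozero device such as the LED used earlier, since the function only relies on on() and off().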

Above: Iris versicolor, also known as the Blue Flag or Purple Iris, is the official flower of Quebec. Credit: Danielle Langlois, CC-BY-SA

IRIS DATASET

The iris dataset is one of the most commonly used in machine learning. It was first presented by Ronald Fisher in the paper ‘The use of multiple measurements in taxonomic problems’, published in 1936. The four attributes (Sepal Length, Sepal Width, Petal Length, and Petal Width) combine to determine the species, though no single attribute can do it by itself. It’s a great dataset to get started with. If you want to try your new-found machine learning skills with more inputs, there are some other datasets at hsmag.cc/HGGFOa that you can download and use (though some of them may need a little manipulation before they’re in a suitable format).
