Classification:
It is the process of categorizing a given set of data into classes. It can be performed on both structured and unstructured data. The process starts with predicting the class of given data points. The classes are often referred to as targets, labels or categories.
Random forest classifier: Random forest, as its name implies, consists of a large number of individual decision trees that operate as an ensemble. Each individual tree in the random forest outputs a class prediction, and the class with the most votes becomes the model's prediction.
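The voting idea can be sketched in a few lines of plain Python. This is only an illustration of majority voting, not how any particular library implements it: the function name majority_vote and the sample votes are made up for this example. (scikit-learn's RandomForestClassifier, used later in this post, averages the class probabilities of its trees rather than counting hard votes, but the intuition is the same.)

```python
from collections import Counter


def majority_vote(tree_predictions):
    """Return the class label predicted by the most trees.

    Hypothetical helper for illustration only. On a tie, the label
    that reached the top count first in the input order is returned.
    """
    counts = Counter(tree_predictions)
    label, _ = counts.most_common(1)[0]
    return label


# Made-up predictions from five imaginary trees for a single sample
votes = [
    "Iris-setosa",
    "Iris-versicolor",
    "Iris-versicolor",
    "Iris-virginica",
    "Iris-versicolor",
]
print(majority_vote(votes))
```

Here three of the five trees agree, so their label wins even though two trees disagree with it.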
Iris Dataset: The data set contains 3 classes of 50 instances each, where each class refers to a type of iris plant.
Attribute Information:
1. sepal length in cm
2. sepal width in cm
3. petal length in cm
4. petal width in cm
5. class:
-- Iris Setosa
-- Iris Versicolour
-- Iris Virginica
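A quick way to check these numbers is to load the copy of the dataset that ships with scikit-learn. Note that this sketch uses sklearn.datasets.load_iris, not the Iris.csv file used in the rest of this post, so the column and class names differ slightly from the CSV.

```python
import numpy as np
from sklearn.datasets import load_iris

# Load the copy of the Iris dataset bundled with scikit-learn
iris = load_iris()

# Number of samples and number of feature columns
print(iris.data.shape)

# Names of the four measurement columns
print(iris.feature_names)

# Number of samples per class (targets are encoded as 0, 1, 2)
for name, count in zip(iris.target_names, np.bincount(iris.target)):
    print(name, count)
```

The shape confirms 150 samples with 4 attributes, and the per-class counts confirm 50 instances of each species.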
Below is the code to create a Random Forest classifier that classifies custom samples supplied by the user. The output is the class label (plant type: Setosa/Versicolour/Virginica).
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
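# Load the dataset (the column indices below assume an Id column first,
# then the four measurements, then the species label)
data = pd.read_csv('Iris.csv')
data_points = data.iloc[:, 1:5]  # four measurement columns
labels = data.iloc[:, 5]  # species column
# Split into 80% training data and 20% testing data
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(data_points, labels, test_size=0.2)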
# Classify using Random forest
from sklearn.ensemble import RandomForestClassifier
random_forest = RandomForestClassifier()
random_forest.fit(x_train, y_train)
print('Training data accuracy: {:.2f}%'.format(random_forest.score(x_train, y_train) * 100))
print('Testing data accuracy: {:.2f}%'.format(random_forest.score(x_test, y_test) * 100))
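# Predict for user input: each row is sepal length, sepal width, petal length, petal width (cm)
# Reusing the training column names avoids scikit-learn's feature-name warning
X_new = pd.DataFrame(np.array([[3, 2, 1, 0.2], [4.9, 2.2, 3.8, 1.1], [5.3, 2.5, 4.6, 1.9]]), columns=data_points.columns)
# Classification of the species from the input vectors
classify = random_forest.predict(X_new)
print("Classification of species: {}".format(classify))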
The output is the predicted class label for each input sample.