Part 1¶
Tutorial Part 1: Regression classification model end-to-end example/tutorial¶
Getting data¶
# Import libraries
from sklearn.datasets import make_blobs
import tensorflow as tf
import matplotlib.pyplot as plt
# Generate toy dataset with scikit-learn
X, y = make_blobs(n_samples=1000, centers=5, n_features=2) # in NN X is often used for input data and y for output
print(X.shape, y.shape)
# Plot dataset with matplotlib
plt.figure(figsize=(10, 8)) # initialize matplotlib figure
plt.title('Our dataset') # add title to our graph
plt.scatter(X[:, 0], X[:, 1], marker='o', c=y)
(1000, 2) (1000,)
<matplotlib.collections.PathCollection at 0x15c661647f0>
Note 1: Here we have (of one point):
- Input: coordinates of a point in form (x, y)
- Output/label/feature: color of it.
So here we have basic regression classification problem - by given coordinates of a point predict its class (on graph it could be seen as different colors).
We have 5 different classes - [0, 1, 2, 3, 4]
Note 2: because we have multiple labels (not 2) it is multiclass classification problem. If we had 2 classes (f.e. blue or red), then it would be binary classification problem.
Prepare data¶
# Split into train/test datasets
# Note: Usually whole dataset is split into train, test and validation datasets with ratio 70/20/10.
# Here we won't have validation dataset so ratio train/test will be 80/20.
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
X_train.shape, X_test.shape, y_train.shape, y_test.shape
((800, 2), (200, 2), (800,), (200,))
Build model¶
# With Sequential API
model = tf.keras.models.Sequential([
tf.keras.layers.Dense(10, activation='relu'),
tf.keras.layers.Dense(5, activation='softmax')
])
Note: this NN consists of only 2 layers - Input and output.
In input layer we have non-linear activation function - ReLU
.
By using linear activation functions, we have decision boundary only as straight line. Usually it leaves some data out (data isn't always linearly divided in the real world), models have less accuracy. Because of that in the most cases non-linear activation functions are used in NN. If you use non-linear activation functions, our model will be more elastic in categorizing data. In such way there will be fewer points that are falsely categorized (output != ground truth). Visit this website to play with concept of non-linearity!
In output layer we have:
- 5 neurons as we have 5 classes
In multiclass classification problem neuron number in output layer should be the same as label number. In binary classification we would've had only 1 neuron! In classification problems on output we have probabilities of input data being one of classes.
softmax
activation function - more information.In binary classification we would have
sigmoid
activation function.
# compile model
model.compile(loss=tf.keras.losses.SparseCategoricalCrossentropy(),
optimizer=tf.keras.optimizers.Adam(),
metrics=['accuracy'])
# train model
model.fit(x=X_train,
y=y_train,
validation_data=(X_test, y_test),
epochs=20)
Epoch 1/20 25/25 [==============================] - 2s 15ms/step - loss: 4.0471 - accuracy: 0.2325 - val_loss: 3.6445 - val_accuracy: 0.2650 Epoch 2/20 25/25 [==============================] - 0s 5ms/step - loss: 3.4670 - accuracy: 0.2788 - val_loss: 3.1015 - val_accuracy: 0.2850 Epoch 3/20 25/25 [==============================] - 0s 4ms/step - loss: 2.9913 - accuracy: 0.3338 - val_loss: 2.6844 - val_accuracy: 0.4250 Epoch 4/20 25/25 [==============================] - 0s 5ms/step - loss: 2.6197 - accuracy: 0.4737 - val_loss: 2.3478 - val_accuracy: 0.5700 Epoch 5/20 25/25 [==============================] - 0s 5ms/step - loss: 2.3139 - accuracy: 0.5150 - val_loss: 2.0752 - val_accuracy: 0.5900 Epoch 6/20 25/25 [==============================] - 0s 5ms/step - loss: 2.0563 - accuracy: 0.5537 - val_loss: 1.8375 - val_accuracy: 0.6300 Epoch 7/20 25/25 [==============================] - 0s 5ms/step - loss: 1.8286 - accuracy: 0.5900 - val_loss: 1.6283 - val_accuracy: 0.6600 Epoch 8/20 25/25 [==============================] - 0s 5ms/step - loss: 1.6274 - accuracy: 0.6162 - val_loss: 1.4430 - val_accuracy: 0.6700 Epoch 9/20 25/25 [==============================] - 0s 6ms/step - loss: 1.4477 - accuracy: 0.6375 - val_loss: 1.2726 - val_accuracy: 0.6700 Epoch 10/20 25/25 [==============================] - 0s 5ms/step - loss: 1.2823 - accuracy: 0.6525 - val_loss: 1.1239 - val_accuracy: 0.6800 Epoch 11/20 25/25 [==============================] - 0s 5ms/step - loss: 1.1370 - accuracy: 0.6513 - val_loss: 0.9876 - val_accuracy: 0.6800 Epoch 12/20 25/25 [==============================] - 0s 4ms/step - loss: 1.0076 - accuracy: 0.6525 - val_loss: 0.8738 - val_accuracy: 0.6850 Epoch 13/20 25/25 [==============================] - 0s 5ms/step - loss: 0.9019 - accuracy: 0.6162 - val_loss: 0.7812 - val_accuracy: 0.6600 Epoch 14/20 25/25 [==============================] - 0s 6ms/step - loss: 0.8208 - accuracy: 0.6762 - val_loss: 0.7146 - val_accuracy: 0.7750 Epoch 15/20 25/25 [==============================] - 0s 6ms/step - loss: 0.7630 - accuracy: 0.7300 - val_loss: 0.6670 - val_accuracy: 0.7750 Epoch 16/20 25/25 [==============================] - 0s 5ms/step - loss: 0.7212 - accuracy: 0.7412 - val_loss: 0.6332 - val_accuracy: 0.7850 Epoch 17/20 25/25 [==============================] - 0s 5ms/step - loss: 0.6883 - accuracy: 0.7500 - val_loss: 0.6056 - val_accuracy: 0.7900 Epoch 18/20 25/25 [==============================] - 0s 4ms/step - loss: 0.6606 - accuracy: 0.7713 - val_loss: 0.5806 - val_accuracy: 0.8100 Epoch 19/20 25/25 [==============================] - 0s 4ms/step - loss: 0.6360 - accuracy: 0.7925 - val_loss: 0.5587 - val_accuracy: 0.8350 Epoch 20/20 25/25 [==============================] - 0s 4ms/step - loss: 0.6139 - accuracy: 0.8125 - val_loss: 0.5380 - val_accuracy: 0.8450
<keras.callbacks.History at 0x15c6639b6a0>
# model summary
model.summary()
Model: "sequential" _________________________________________________________________ Layer (type) Output Shape Param # ================================================================= dense (Dense) (None, 10) 30 dense_1 (Dense) (None, 5) 55 ================================================================= Total params: 85 Trainable params: 85 Non-trainable params: 0 _________________________________________________________________
# plot model
tf.keras.utils.plot_model(model)
Note: here would go evaluation part.
Plot results¶
pred_prob = model.predict([[2.0, -5.0]])
pred_prob
1/1 [==============================] - 0s 44ms/step
array([[9.2120802e-01, 2.4555121e-03, 4.4698096e-03, 9.5306876e-07, 7.1865663e-02]], dtype=float32)
import numpy as np
res = np.argmax(pred_prob, axis=1)
res
array([0], dtype=int64)
# Plot decision boundary
import numpy as np
def plot_decision_boundary(model, X, y):
"""
Plots the decision boundary created by a model predicting on X.
This function has been adapted from two phenomenal resources:
1. CS231n - https://cs231n.github.io/neural-networks-case-study/
2. Made with ML basics - https://github.com/GokuMohandas/MadeWithML/blob/main/notebooks/08_Neural_Networks.ipynb
"""
# Define the axis boundaries of the plot and create a meshgrid
x_min, x_max = X[:, 0].min() - 0.1, X[:, 0].max() + 0.1
y_min, y_max = X[:, 1].min() - 0.1, X[:, 1].max() + 0.1
xx, yy = np.meshgrid(np.linspace(x_min, x_max, 100),
np.linspace(y_min, y_max, 100))
# Create X values (we're going to predict on all of these)
x_in = np.c_[xx.ravel(), yy.ravel()] # stack 2D arrays together: https://numpy.org/devdocs/reference/generated/numpy.c_.html
# Make predictions using the trained model
y_pred = model.predict(x_in)
# Check for multi-class
if model.output_shape[-1] > 1: # checks the final dimension of the model's output shape, if this is > (greater than) 1, it's multi-class
print("doing multiclass classification...")
# We have to reshape our predictions to get them ready for plotting
y_pred = np.argmax(y_pred, axis=1).reshape(xx.shape)
else:
print("doing binary classification...")
y_pred = np.round(np.max(y_pred, axis=1)).reshape(xx.shape)
# Plot decision boundary
plt.contourf(xx, yy, y_pred, cmap=plt.cm.RdYlBu, alpha=0.7)
plt.scatter(X[:, 0], X[:, 1], c=y, s=40, cmap=plt.cm.RdYlBu)
plt.xlim(xx.min(), xx.max())
plt.ylim(yy.min(), yy.max())
plot_decision_boundary(model, X, y)
doing multiclass classification...
## Save model
model.save('model') # save model in TensorFlow default format
model.save('model.h5') # save model in .h5 format
tf.keras.models.load_model('model.h5')
INFO:tensorflow:Assets written to: model\assets
<keras.engine.sequential.Sequential at 0x20919878220>
More information about saving models.