One of the most important steps in machine learning is defining proper performance metrics. A good metric both guides model improvement and makes the results easier to interpret.
F1 score
The F1 score balances precision and recall by taking their harmonic mean. In the following function, we convert the model's raw predictions into class indices so they can be compared with the true labels.
import numpy as np
from sklearn.metrics import f1_score

def f1_score_func(preds, labels):
    # preds: per-class scores generated by the model, one row per sample,
    #        e.g. [[0.7, 0.2, 0.04, 0.06, 0.0]]
    # labels: the true class indices
    # argmax turns each row of scores into the index of its top class,
    # e.g. [0.7, 0.2, 0.04, 0.06, 0.0] -> 0
    # the purpose of flatten is to convert [[]] into []
    preds_flat = np.argmax(preds, axis=1).flatten()
    labels_flat = labels.flatten()
    # 'weighted' averages the per-class F1 scores, weighted by class support
    return f1_score(labels_flat, preds_flat, average='weighted')
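To make the `average='weighted'` setting less of a black box, here is a minimal dependency-free sketch of the same idea: compute F1 for each class, then average with weights proportional to how many true samples each class has. The helper name `weighted_f1` and the toy labels are illustrative assumptions, not part of scikit-learn or of the original code.

```python
from collections import Counter

def weighted_f1(y_true, y_pred):
    """Per-class F1, averaged with weights proportional to class support."""
    support = Counter(y_true)          # number of true samples per class
    total = len(y_true)
    score = 0.0
    for cls, n_true in support.items():
        # true positives: predicted cls and actually cls
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        n_pred = sum(1 for p in y_pred if p == cls)
        precision = tp / n_pred if n_pred else 0.0
        recall = tp / n_true
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        score += f1 * n_true / total   # weight by class support
    return score

# toy example (made-up labels)
y_true = [0, 0, 0, 1, 1, 2]
y_pred = [0, 0, 1, 1, 1, 2]
print(round(weighted_f1(y_true, y_pred), 4))  # 0.8333
```

Here classes 0 and 1 each get an F1 of 0.8 and class 2 gets 1.0, so the weighted result is 0.8 * 3/6 + 0.8 * 2/6 + 1.0 * 1/6. Because the weights follow class frequency, frequent classes dominate the score on imbalanced data.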
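Accuracy: the most intuitive metric
Accuracy tells the user what fraction of predictions are correct. The function below reports it separately for each class, which shows which classes the model handles well and which it struggles with.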
def accuracy_per_class(preds, labels):
    # invert label_dict (name -> index, assumed to be defined elsewhere)
    # so that e.g. index 1 -> happy ; index 2 -> sad
    label_dict_inverse = {v: k for k, v in label_dict.items()}
    preds_flat = np.argmax(preds, axis=1).flatten()
    labels_flat = labels.flatten()
    # iterate over each class that appears in the true labels
    for label in np.unique(labels_flat):
        # select all positions where the true label is `label`,
        # and look at the values of preds_flat at those positions
        y_preds = preds_flat[labels_flat == label]
        y_true = labels_flat[labels_flat == label]
        print(f'class: {label_dict_inverse[label]}')
        print(f'Accuracy: {len(y_preds[y_preds == label]) / len(y_true)}\n')
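For a quick sanity check, the same per-class computation can be sketched in plain Python so that it returns the numbers instead of printing them. The function name `per_class_accuracy`, the `label_dict` mapping, and the scores below are all made up for illustration; the list comprehension with `max` stands in for `np.argmax(preds, axis=1)`.

```python
def per_class_accuracy(preds, labels, label_dict):
    """Return {class name: fraction of that class's samples predicted correctly}."""
    names = {v: k for k, v in label_dict.items()}
    # index of the highest score in each row (plain-Python argmax)
    pred_ids = [max(range(len(row)), key=row.__getitem__) for row in preds]
    result = {}
    for cls in sorted(set(labels)):
        # one True/False per sample whose true label is cls
        hits = [p == cls for p, t in zip(pred_ids, labels) if t == cls]
        result[names[cls]] = sum(hits) / len(hits)
    return result

label_dict = {"happy": 0, "sad": 1}   # hypothetical name -> index mapping
preds = [[0.9, 0.1], [0.4, 0.6], [0.2, 0.8], [0.3, 0.7]]
labels = [0, 0, 1, 1]
print(per_class_accuracy(preds, labels, label_dict))
# {'happy': 0.5, 'sad': 1.0}
```

Note that each value is computed only over the samples whose true label is that class, so this "accuracy per class" is the same quantity as per-class recall.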