One of the most important steps in machine learning is defining an appropriate performance metric. A well-chosen metric helps you build a better model and makes the results easier to interpret.
F1 score
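The F1 score is a good way to balance precision and recall: it is their harmonic mean, so it is high only when both are high. In the following function, we convert the model's raw predictions into class indices and compute the weighted F1 score against the true labels.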
import numpy as np
from sklearn.metrics import f1_score

def f1_score_func(preds, labels):
    # preds: the scores generated by the model, one row per sample, e.g. [0.7, 0.2, 0.04, 0.06, 0.0]
    # labels: the true labels
    # argmax turns each row into the index of its highest score
    # (0 for the row above, i.e. the one-hot vector [1, 0, 0, 0, 0])
    # the purpose of flatten is to convert [[...]] into [...]
    preds_flat = np.argmax(preds, axis=1).flatten()
    labels_flat = labels.flatten()
    return f1_score(labels_flat, preds_flat, average='weighted')
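To make the `average='weighted'` option concrete, here is a minimal dependency-free sketch of the same calculation: compute precision, recall, and F1 for each class, then average the per-class F1 values using the number of true samples per class as weights. The helper name `weighted_f1` and the toy labels are made up for illustration. It is meant to show the arithmetic, not to replace the scikit-learn call.

```python
from collections import Counter

def weighted_f1(y_true, y_pred):
    """Per-class F1, averaged with each class's support (true count) as weight."""
    support = Counter(y_true)
    total = 0.0
    for cls, n_true in support.items():
        # true positives: samples of this class that were predicted as this class
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        n_pred = sum(1 for p in y_pred if p == cls)
        precision = tp / n_pred if n_pred else 0.0
        recall = tp / n_true
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        total += f1 * n_true
    return total / len(y_true)

# hypothetical toy data: three classes, one sample of class 0 mistaken for class 1
y_true = [0, 0, 0, 1, 1, 2]
y_pred = [0, 0, 1, 1, 1, 2]
score = weighted_f1(y_true, y_pred)
print(round(score, 4))
```

Here class 0 has perfect precision but recall 2/3, class 1 has perfect recall but precision 2/3, and class 2 is perfect, so the per-class F1 values are 0.8, 0.8, and 1.0, weighted by 3, 2, and 1 samples respectively.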
Accuracy: the most intuitive metric
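Accuracy tells the user how often the model predicts correctly. Reporting it separately for each class shows which classes the model handles well and which it struggles with.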
def accuracy_per_class(preds, labels):
    # label_dict (class name -> index) is assumed to be defined elsewhere
    # invert it so an index maps back to its name, e.g. index 1 -> happy ; index 2 -> sad
    label_dict_inverse = {v: k for k, v in label_dict.items()}
    preds_flat = np.argmax(preds, axis=1).flatten()
    labels_flat = labels.flatten()
    # iterate through each class that appears in the true labels
    for label in np.unique(labels_flat):
        # select all positions where the true label is `label`,
        # and look at the values of preds_flat at those positions
        y_preds = preds_flat[labels_flat == label]
        y_true = labels_flat[labels_flat == label]
        print(f'class: {label_dict_inverse[label]}')
        print(f'Accuracy: {len(y_preds[y_preds == label]) / len(y_true)}\n')
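The following sketch shows why a per-class breakdown can be more informative than a single accuracy number. It is a dependency-free variant that returns a dictionary instead of printing. The function name `per_class_accuracy`, the `label_names` mapping, and the toy scores are all hypothetical and chosen only for illustration.

```python
def per_class_accuracy(scores, labels, label_names):
    """Return {class name: fraction of that class's samples predicted correctly}."""
    # argmax by hand: index of the largest score in each row
    preds = [max(range(len(row)), key=row.__getitem__) for row in scores]
    result = {}
    for cls in sorted(set(labels)):
        # positions where the true label is this class
        idx = [i for i, t in enumerate(labels) if t == cls]
        correct = sum(1 for i in idx if preds[i] == cls)
        result[label_names[cls]] = correct / len(idx)
    return result

label_names = {0: "happy", 1: "sad"}  # hypothetical index -> name mapping
scores = [[0.9, 0.1], [0.8, 0.2], [0.6, 0.4], [0.7, 0.3]]  # model always favors class 0
labels = [0, 0, 0, 1]

acc = per_class_accuracy(scores, labels, label_names)
overall = sum(acc[label_names[t]] for t in labels) / len(labels)
print(acc, overall)
```

In this toy case the model predicts "happy" for every sample. Overall accuracy is 0.75, which looks reasonable, yet the per-class view shows that no "sad" sample is ever recognized.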