Evaluation Metrics - Classification Accuracy
Evaluation of classification models #
Generative model: models the joint distribution of the features X and the target Y, then derives the posterior probability P(y|x) via Bayes' rule. e.g. Naive Bayes, Bayesian Networks and Hidden Markov Models
Discriminative model: directly models the posterior probability P(y|x) by learning the input-to-output mapping and minimising the error. e.g. Logistic Regression, Support Vector Machines and Conditional Random Fields
True Positive Rate (or) Sensitivity (or) Recall #
- Measures how good the model is at predicting the positive class when the actual outcome is positive.
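In confusion-matrix terms:
\[Recall = \frac{TP}{TP + FN}\]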
False Positive Rate (or) 1 − Specificity (or) False alarm rate #
- Measures how often the positive class is predicted when the actual outcome is negative.
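In confusion-matrix terms (equivalently, 1 − Specificity):
\[FPR = \frac{FP}{FP + TN}\]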
Specificity #
\[Specificity = \frac{TN}{TN + FP}\]
Precision #
- Measures how many of the predicted positive cases are actually positive
- Neither Precision nor Recall uses TN
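In confusion-matrix terms:
\[Precision = \frac{TP}{TP + FP}\]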
F-measure (or) F1-Score #
- Harmonic mean of precision and recall
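As a formula:
\[F1 = \frac{2 \cdot Precision \cdot Recall}{Precision + Recall}\]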
Akaike Information Criterion (AIC) #
- Estimates the quality of models fit to the same dataset, relative to each other
- Used as a means of selecting the best model
- The lower the score, the better the model
- Useful when test data is scarce: train on the entire dataset and use AIC to compare model performance
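Computed from the model's fit and complexity:
\[AIC = 2k - 2\ln(\hat{L})\]
where k is the number of estimated parameters and \(\hat{L}\) is the maximised value of the model's likelihood function.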
Bayes Factor #
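- Compares two competing models by the ratio of their marginal likelihoods given the data D
- A value above 1 favours model M1; below 1 favours M2
\[BF_{12} = \frac{P(D \mid M_1)}{P(D \mid M_2)}\]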
Precision Recall curves #
- Used in binary classification
- Precision on the y-axis and recall on the x-axis for different thresholds
- Depicts trade-offs between TPR and positive predictive value using different probability thresholds
- Appropriate for imbalanced datasets
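A minimal sketch of plotting a precision-recall curve with scikit-learn and matplotlib; the synthetic imbalanced dataset and the logistic regression classifier are stand-ins for illustration:

```python
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_recall_curve
from sklearn.model_selection import train_test_split

# Synthetic imbalanced binary dataset (~10% positives)
X, y = make_classification(n_samples=1000, weights=[0.9], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Probability scores for the positive class
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
y_score = clf.predict_proba(X_test)[:, 1]

# Precision/recall pairs for every decision threshold
precision, recall, thresholds = precision_recall_curve(y_test, y_score)

plt.plot(recall, precision)   # recall on the x-axis, precision on the y-axis
plt.xlabel("Recall")
plt.ylabel("Precision")
plt.title("Precision-Recall curve")
plt.show()
```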
Receiver Operating Characteristic #
- Metric used to evaluate classifier output quality.
- Depicts trade-offs between TPR and FPR using different probability thresholds
- True +ve rate on the y-axis and False +ve rate on the x-axis. The top left corner (FPR = 0, TPR = 1) is the ideal point.
- Appropriate for balanced datasets
- Larger AUC (Area under the curve) is better
- Typically used with binary classifiers; to use it with a multilabel classifier, the output must be binarized.
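A similar sketch for the ROC curve and its AUC, again using scikit-learn on a synthetic balanced dataset as a placeholder:

```python
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic balanced binary dataset
X, y = make_classification(n_samples=1000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
y_score = clf.predict_proba(X_test)[:, 1]

# FPR/TPR pairs for every decision threshold
fpr, tpr, thresholds = roc_curve(y_test, y_score)
print("AUC:", roc_auc_score(y_test, y_score))

plt.plot(fpr, tpr)                        # FPR on the x-axis, TPR on the y-axis
plt.plot([0, 1], [0, 1], linestyle="--")  # chance-level diagonal
plt.xlabel("False Positive Rate")
plt.ylabel("True Positive Rate")
plt.title("ROC curve")
plt.show()
```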
Micro-averaging #
- ROC alternative for multilabel classifiers: aggregates the contributions of all classes (pooled TP, FP, FN counts) before computing the metric
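A minimal sketch of a micro-averaged ROC AUC for a multilabel problem; the one-vs-rest logistic regression setup is an illustrative choice, not a requirement:

```python
from sklearn.datasets import make_multilabel_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier

# Synthetic multilabel dataset: each sample can carry several of 3 labels
X, Y = make_multilabel_classification(n_samples=1000, n_classes=3, random_state=0)
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, random_state=0)

# One binary classifier per label; predict_proba gives one column per label
clf = OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(X_train, Y_train)
Y_score = clf.predict_proba(X_test)

# 'micro' pools every (sample, label) decision before computing the AUC
print("Micro-averaged ROC AUC:", roc_auc_score(Y_test, Y_score, average="micro"))
```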