Evaluation: from Precision, Recall and F-measure to ROC, Informedness, Markedness and Correlation

Powers, David Martin

Files

Powers Evaluation.pdf (1.76 MB)

Date

2011-12-15

Authors

Powers, David Martin

Publisher

Bioinfo Publications

Rights

Author retains copyright of this version.

Rights Holder

Copyright Bioinfo Publications

Abstract

Commonly used evaluation measures, including Recall, Precision, F-Measure and Rand Accuracy, are biased and should not be used without a clear understanding of those biases and a corresponding identification of the chance or base-case levels of the statistic. A system that performs worse in the objective sense of Informedness can appear to perform better under any of these commonly used measures. We discuss several concepts and measures that reflect the probability that a prediction is informed versus chance, namely Informedness, and introduce Markedness as a dual measure for the probability that a prediction is marked versus chance. Finally, we demonstrate elegant connections between the concepts of Informedness, Markedness, Correlation and Significance, as well as their intuitive relationships with Recall and Precision, and outline the extension from the dichotomous case to the general multi-class case.
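
The abstract's central claim can be illustrated with a small sketch for the dichotomous case. It assumes the standard definitions Informedness = Recall + Inverse Recall - 1 and Markedness = Precision + Inverse Precision - 1, and takes the correlation as the geometric mean of the two. The function name `measures` and both contingency tables are hypothetical, chosen only for illustration; they do not come from the paper.

```python
from math import sqrt


def measures(tp, fn, fp, tn):
    """Compute common and chance-corrected measures from a 2x2 contingency table.

    Assumes every row and column total is non-zero (no division-by-zero guard).
    """
    n = tp + fn + fp + tn
    recall = tp / (tp + fn)             # true positive rate
    inverse_recall = tn / (tn + fp)     # true negative rate
    precision = tp / (tp + fp)          # positive predictive value
    inverse_precision = tn / (tn + fn)  # negative predictive value
    informedness = recall + inverse_recall - 1
    markedness = precision + inverse_precision - 1
    # Informedness and Markedness share the sign of tp*tn - fp*fn,
    # so their product is non-negative and the square root is defined.
    sign = 1 if informedness >= 0 else -1
    return {
        "recall": recall,
        "precision": precision,
        "f1": 2 * tp / (2 * tp + fp + fn),
        "accuracy": (tp + tn) / n,
        "informedness": informedness,
        "markedness": markedness,
        "correlation": sign * sqrt(informedness * markedness),
    }


# Hypothetical data: 90 real positives and 10 real negatives.
# System A labels almost everything positive; System B discriminates better.
system_a = measures(tp=90, fn=0, fp=9, tn=1)
system_b = measures(tp=72, fn=18, fp=1, tn=9)

for name, m in (("A", system_a), ("B", system_b)):
    print(name, {k: round(v, 3) for k, v in m.items()})
```

On these made-up tables, System A scores higher on Recall, F-Measure and Accuracy, yet its Informedness is about 0.1 against about 0.7 for System B, because A mostly exploits the 90% prevalence of the positive class.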

Keywords

Computational linguistics, Computer science, Artificial intelligence

Citation

Powers, D.M.W., 2011. Evaluation: from Precision, Recall and F-measure to ROC, Informedness, Markedness and Correlation. Journal of Machine Learning Technologies, 2(1), 37-63.

URI

http://hdl.handle.net/2328/27165

Collections

David Powers
Australian Research Council (ARC)
Flinders Open Access Research
