Evaluation: from Precision, Recall and F-measure to ROC, Informedness, Markedness and Correlation
Date
2011-12-15
Authors
Powers, David Martin
Publisher
Bioinfo Publications
Rights
Author retains copyright of this version.
Rights Holder
Copyright Bioinfo Publications
Abstract
Commonly used evaluation measures including Recall, Precision, F-Measure and Rand Accuracy are
biased and should not be used without clear understanding of the biases, and corresponding identification of chance
or base case levels of the statistic. A system that performs worse in the objective sense of
Informedness can appear to perform better under any of these commonly used measures. We discuss several
concepts and measures that reflect the probability that prediction is informed versus chance, notably Informedness, and
introduce Markedness as a dual measure for the probability that prediction is marked versus chance. Finally we
demonstrate elegant connections between the concepts of Informedness, Markedness, Correlation and Significance
as well as their intuitive relationships with Recall and Precision, and outline the extension from the dichotomous case
to the general multi-class case.
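
The bias described in the abstract can be illustrated with a minimal sketch for the dichotomous case. It assumes the usual contingency-table definitions: Informedness = Recall + Inverse Recall - 1, Markedness = Precision + Inverse Precision - 1, and Correlation as the Matthews correlation coefficient. The function name `measures`, the two hypothetical systems and their counts are illustrative assumptions, not data from the paper.

```python
from math import sqrt


def measures(tp, fn, fp, tn):
    """Evaluation measures for a 2x2 contingency table.

    tp, fn, fp, tn are counts of true positives, false negatives,
    false positives and true negatives. All four marginal totals are
    assumed to be non-zero (no guard against division by zero).
    """
    n = tp + fn + fp + tn
    recall = tp / (tp + fn)             # true positive rate
    inverse_recall = tn / (tn + fp)     # true negative rate
    precision = tp / (tp + fp)          # positive predictive value
    inverse_precision = tn / (tn + fn)  # negative predictive value
    return {
        "recall": recall,
        "precision": precision,
        "f1": 2 * precision * recall / (precision + recall),
        "accuracy": (tp + tn) / n,
        "informedness": recall + inverse_recall - 1,
        "markedness": precision + inverse_precision - 1,
        # Matthews correlation, computed directly from the counts.
        "correlation": (tp * tn - fp * fn)
        / sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    }


# Hypothetical data set: 90 positive and 10 negative cases.
# System A labels almost everything positive; System B discriminates.
system_a = measures(tp=89, fn=1, fp=9, tn=1)
system_b = measures(tp=72, fn=18, fp=1, tn=9)

for name in ("recall", "f1", "accuracy", "informedness", "markedness", "correlation"):
    print(f"{name:>12}  A={system_a[name]:.3f}  B={system_b[name]:.3f}")
```

In this constructed example System A scores higher on Recall, F-measure and accuracy, yet System B has the higher Informedness, because A gains mostly by exploiting the 90% prevalence of the positive class. For both systems the squared correlation equals the product of Informedness and Markedness, which is one of the connections the abstract refers to.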
Keywords
Computational linguistics, Computer science, Artificial intelligence
Citation
Powers, D.M.W., 2011. Evaluation: from Precision, Recall and F-measure to ROC, Informedness, Markedness and Correlation. Journal of Machine Learning Technologies, 2(1), 37-63.