While thinking about classification metrics, we have all felt that it would be nice to have metrics that reward better classification methods. To give a concrete example, we would prefer a binary classification method that assigns probabilities p and (1-p) for an event to belong to the two classes over a method that only assigns 0, 1 or 1, 0 values. Obviously, this preference should hold only if the classifier reports those probabilities reliably; if not, it would be preferable to choose the discrete (0/1) classifier. This might be an example of a consistency condition that we need. Can we figure out whether such a consistency condition exists, and ensure that our metrics respect it?
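For what it's worth, proper scoring rules such as the log-loss already behave this way: honest intermediate probabilities incur a modest penalty, while a confident wrong 0/1 assignment is penalized heavily. Here is a minimal numpy sketch (illustrative numbers only, not a proposal for the actual metric):

```python
import numpy as np

def log_loss(y_true, p_class1, eps=1e-15):
    """Binary cross-entropy; lower is better."""
    p = np.clip(p_class1, eps, 1 - eps)
    return -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

y_true = np.array([1, 0, 1, 1, 0])            # true classes
p_soft = np.array([0.8, 0.3, 0.4, 0.9, 0.2])  # honest, calibrated probabilities
p_hard = (p_soft > 0.5).astype(float)         # the same calls forced to 0/1

print(log_loss(y_true, p_soft))  # ~0.36: modest penalty, uncertainty acknowledged
print(log_loss(y_true, p_hard))  # ~6.9: one overconfident wrong call dominates
```

The flip side, as noted above, is that a classifier reporting unreliable intermediate probabilities can score worse under such a rule than a well-tuned discrete one, which is roughly the consistency condition in question.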
After a chat with @rbiswas4, I think we can identify some properties of classifiers we'd like our metric to favor and disfavor.
We do not want to encourage classifiers that perform well only on the most common class(es) by ignoring rarer classes, i.e. not all test set objects should carry equal weight.
We may want to reward classifiers that respect hierarchical classes, i.e. there could be a smaller penalty for misclassifying a SN Ib as SN II than for misclassifying a SN Ia as some type of AGN.
We may not want to favor classifiers that perform well only on the most pristine data (brightest/lowest noise/best possible sampling), i.e. a classifier that still performs well on lower quality data could be rewarded more than one that only functions on the highest quality data. Conversely, we might want a higher penalty for misclassifying an object observed with high quality data than one observed with low quality data.
These considerations should influence the weightings that may need to be used in multi-class metrics (a rough sketch of one weighting scheme follows below). Do folks have other major considerations we want to bake into the metrics?
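To make the first point concrete, here is one possible weighting scheme (purely an assumption for illustration, not a decision): weight each test object by the inverse frequency of its true class, so a classifier cannot score well by ignoring the rare classes.

```python
import numpy as np

def weighted_log_loss(y_true, probs, eps=1e-15):
    """Multi-class log-loss with inverse-frequency class weights.

    y_true : (N,) integer class labels
    probs  : (N, M) predicted class probabilities (rows sum to 1)
    """
    probs = np.clip(probs, eps, 1.0)
    counts = np.bincount(y_true, minlength=probs.shape[1])
    # down-weight common classes, up-weight rare ones (absent classes get 0)
    class_weights = np.where(counts > 0, 1.0 / np.maximum(counts, 1), 0.0)
    per_object = -np.log(probs[np.arange(len(y_true)), y_true])
    weights = class_weights[y_true]
    return np.sum(weights * per_object) / np.sum(weights)

# tiny example: class 0 is common, class 1 is rare
y_true = np.array([0, 0, 0, 1])
probs = np.array([[0.9, 0.1],
                  [0.8, 0.2],
                  [0.7, 0.3],
                  [0.4, 0.6]])
print(weighted_log_loss(y_true, probs))
```

Hierarchical penalties and data-quality dependence could enter similarly, e.g. via a per-class-pair cost matrix or per-object weights, but they are not included in this sketch.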