
WIP prettyToString(ConfusionMatrix<MultiLabel>) and labelConfusionMat… #128

Open - wants to merge 8 commits into main
Conversation

nezda (Contributor) commented Apr 6, 2021

Description

A simpler attempt at "conversion" of MultiLabelConfusionMatrix to a corresponding LabelConfusionMatrix - not meant to actually be merged. How can this be done better?

Seems correct, except that in testMultiLabelConfusionMatrixToStrings() the labelConfusionMatrixToString output has off-diagonal values?

Motivation

Such a feature would make analysis of MultiLabel models simpler: in particular, MultiLabelConfusionMatrix.toString() is hard to interpret. Compare testMultiLabel()'s current toString():

[DenseMatrix(dim1=2,dim2=2,values=
	row 0 [ 3.000000000000000, 0.000000000000000];
	row 1 [ 0.000000000000000, 1.000000000000000];
)
DenseMatrix(dim1=2,dim2=2,values=
	row 0 [ 0.000000000000000, 1.000000000000000];
	row 1 [ 1.000000000000000, 2.000000000000000];
)
DenseMatrix(dim1=2,dim2=2,values=
	row 0 [ 2.000000000000000, 1.000000000000000];
	row 1 [ 1.000000000000000, 0.000000000000000];
)
]

to the pretty toString:

(LabelSet={a})
    [tn: 3 fn: 0]
    [fp: 0 tp: 1]
(LabelSet={b})
    [tn: 0 fn: 1]
    [fp: 1 tp: 2]
(LabelSet={c})
    [tn: 2 fn: 1]
    [fp: 1 tp: 0]

or labelConfusionMatrixToString:

       a   b   c
a      1   1   0
b      0   2   1
c      0   1   0
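For illustration, the per-label blocks shown above could be produced by a formatter along these lines. This is only a sketch of the output format, not Tribuo's actual implementation; the class and method names here are made up:

```java
public class PrettyBlock {
    // Formats one per-label 2x2 confusion block in the layout shown above.
    // The tn/fn/fp/tp counts are assumed to come from that label's 2x2 matrix.
    static String block(String labelSet, int tn, int fn, int fp, int tp) {
        return String.format("(LabelSet={%s})%n    [tn: %d fn: %d]%n    [fp: %d tp: %d]",
                labelSet, tn, fn, fp, tp);
    }

    public static void main(String[] args) {
        // Reproduces the first block from the example above:
        // (LabelSet={a})
        //     [tn: 3 fn: 0]
        //     [fp: 0 tp: 1]
        System.out.println(block("a", 3, 0, 0, 1));
    }
}
```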

Unfortunately, something seems off in testMultiLabelConfusionMatrixToStrings()

current toString()

[DenseMatrix(dim1=2,dim2=2,values=
	row 0 [ 2.000000000000000, 0.000000000000000];
	row 1 [ 0.000000000000000, 2.000000000000000];
)
DenseMatrix(dim1=2,dim2=2,values=
	row 0 [ 2.000000000000000, 0.000000000000000];
	row 1 [ 0.000000000000000, 2.000000000000000];
)
DenseMatrix(dim1=2,dim2=2,values=
	row 0 [ 2.000000000000000, 0.000000000000000];
	row 1 [ 0.000000000000000, 2.000000000000000];
)
]

whereas the pretty toString looks fine:

(LabelSet={MONKEY})
    [tn: 2 fn: 0]
    [fp: 0 tp: 2]
(LabelSet={PUZZLE})
    [tn: 2 fn: 0]
    [fp: 0 tp: 2]
(LabelSet={TREE})
    [tn: 2 fn: 0]
    [fp: 0 tp: 2]

but why does labelConfusionMatrixToString have off-diagonal values?

          MONKEY  PUZZLE    TREE
MONKEY         2       1       1
PUZZLE         1       2       1
TREE           1       1       2

…rixToString(ConfusionMatrix<MultiLabel>)

- seems correct except testMultiLabelConfusionMatrixToStrings() labelConfusionMatrixToString has off-diagonal values?
Craigacp (Member) commented Apr 6, 2021

Thanks for working this up. Before I can review this in detail you'll need to either sign the OCA, or let me know what email address you've previously signed it under. There will be a bot which does this automatically at some point, but it isn't enabled on the Tribuo repo yet.

nezda (Contributor, Author) commented Apr 6, 2021

Hmm, I made an account and tried to submit an Individual OCA, but Tribuo isn't listed in the Projects list at https://oca.opensource.oracle.com/?ojr=contrib-create%3Btype%3Dindividual. I must be doing something silly?

[screenshot]

Craigacp (Member) commented Apr 6, 2021

I'll follow up internally. The OCA signing process recently changed, maybe Tribuo isn't enrolled in it properly.

Craigacp (Member) commented Apr 6, 2021

Could you try again? It should be in the system now.

nezda (Contributor, Author) commented Apr 6, 2021

All good @Craigacp - my OCA is "Under Review"

Craigacp (Member) commented Apr 6, 2021

Thanks. I think the turnaround time is 24-48 hours for this, which is frustrating, but at least it's a one time thing. Hopefully as the new system beds in it'll get quicker (in the old system you had to mail in a form).

nezda (Contributor, Author) commented Apr 12, 2021

@Craigacp my OCA was accepted.

Craigacp added the "OCA signed" label (this PR is from a person/organisation who has signed the OCA) on Apr 13, 2021
Craigacp (Member) commented:
Excellent. We should get the bot running at some point soon so it'll tag things when people have signed it.

I'll dig into the confusion matrix having off-diagonal entries; that might be an issue in how we tabulate the results. We've got some tests that cover this, so I'll try to figure out what's different in your example.

In the meantime the prettyToString looks good - could you replace MultiLabelConfusionMatrix.toString with it? I think the main issue with the other one is coming up with a good name for it, and also, to a lesser extent, that users access the confusion matrix through a superclass, so it's not going to have that method. Maybe it could have a definition on MultiLabelEvaluation and be implemented by MultiLabelEvaluationImpl. I'm not sure what a good name for it is though - any ideas?

Craigacp (Member) commented:
After some consideration, I'm not sure what the off-diagonal elements of an all-label-by-all-label multi-label confusion matrix should correspond to. If the true label is a and the prediction is b then that's straightforward to record, but if the true label is (a,c) and the prediction is b, or if the true label is a and the prediction is (a,b,c), then I'm not sure what values should be updated. In which case I'm not clear on what we're currently computing.

How would you expect the off diagonal entries to be computed?

nezda (Contributor, Author) commented Apr 16, 2021

> the prettyToString looks good, could you replace MultiLabelConfusionMatrix.toString with it?

Done - PR updated. I added a little whitespace and simplified the implementation slightly with String.join.

> How would you expect the off diagonal entries to be computed?
>
> if the true label is (a,c) and the prediction is b

I guess treat each atomic label independently, as if it were: true label a, prediction b; true label c, prediction b.

> if the true label is a and the prediction is (a,b,c)

Treating each atomic label independently would give: true label a, prediction a; true label a, prediction b; true label a, prediction c.
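Treating each atomic label independently in this way amounts to taking the cross product of the true and predicted label sets. A minimal sketch of that expansion (a hypothetical helper written for illustration, not Tribuo code):

```java
import java.util.ArrayList;
import java.util.List;

public class CrossPairs {
    // Expands a multi-label (truth, prediction) pair into single-label
    // (truth, prediction) pairs by treating each atomic label independently,
    // i.e. the cross product truth x prediction.
    static List<String[]> expand(List<String> truth, List<String> pred) {
        List<String[]> pairs = new ArrayList<>();
        for (String t : truth) {
            for (String p : pred) {
                pairs.add(new String[]{t, p});
            }
        }
        return pairs;
    }

    // Renders the pairs as "(t,p) (t,p) ..." for display.
    static String format(List<String[]> pairs) {
        StringBuilder sb = new StringBuilder();
        for (String[] p : pairs) {
            sb.append('(').append(p[0]).append(',').append(p[1]).append(") ");
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // true (a,c), predicted b  ->  (a,b) (c,b)
        System.out.println(format(expand(List.of("a", "c"), List.of("b"))));
        // true a, predicted (a,b,c)  ->  (a,a) (a,b) (a,c)
        System.out.println(format(expand(List.of("a"), List.of("a", "b", "c"))));
    }
}
```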

I attempted to "fix" my current implementation and both test cases now almost give the results I'd expect, but I don't feel great about the logic yet.

IndependentMultiLabelTest.testMultiLabelConfusionMatrixToStrings()

labelConfusionMatrixToString
          MONKEY  PUZZLE    TREE
MONKEY         2       0       0
PUZZLE         0       2       0
TREE           0       0       2

MultiLabelConfusionMatrixTest.testMultiLabel()

wrong!

labelConfusionMatrixToString
       a   b   c
a      1   0   0
b      0   2   1
c      0   1   0

I think it should be (update: I think this is wrong too!)

labelConfusionMatrixToString
       a   b   c
a      1   0   0
b      1*  2   1
c      0   1   0

Craigacp (Member) commented:
I think the issue with treating it completely independently is that summing the all-class version of a multi-label confusion matrix doesn't lead to a sensible number. Normally, when you sum all the entries you end up with the number of examples (or the number of labels). That feels like a nice thing to enforce, but if you do then you need to add extra columns for "false positive" and "false negative", and then you only have entries on the label/label diagonal and in those two columns.

@nezda nezda force-pushed the MultiLabelEvaluationAsLabelEvaluation-take2 branch 2 times, most recently from a41a33c to b5b1206 Compare April 22, 2021 16:23
…redictions)

- partially represent false negatives by calling them false positives tied to some predicted label if there is one
@nezda nezda force-pushed the MultiLabelEvaluationAsLabelEvaluation-take2 branch from b5b1206 to 0dcab25 Compare April 22, 2021 16:44
nezda (Contributor, Author) commented Apr 22, 2021

Learned some things - maybe I got my intended result this time.

  1. discarded labelConfusionMatrixToString
  2. created LabelConfusionMatrix singleLabelConfusionMatrix(List<Prediction<MultiLabel>> predictions), which I think does what I want: provide an easy API to visualize multi-label errors as if they were simpler multi-class, single-label errors. The implementation is straightforward: it converts multi-label predictions into corresponding single-label predictions and then "directly" makes a LabelConfusionMatrix from those. I did have to hack access to the package-private constructor of MutableLabelInfo.
  3. learned that I was originally thinking the rows were the predicted labels and the columns were the truth, but now I know it's vice versa in this library (and in sklearn's confusion_matrix). The Wikipedia article was useful :) (not!) https://en.wikipedia.org/wiki/Confusion_matrix

> Each row of the matrix represents the instances in a predicted class, while each column represents the instances in an actual class (or vice versa)

I now believe the following is correct and useful (currently in MultiLabelConfusionMatrixTest.testTabulateMultiLabel())

multi-label predictions:

trueLabels: [a] predicted: [a, b]
trueLabels: [b, c] predicted: [b]
trueLabels: [b] predicted: [b]
trueLabels: [b] predicted: [c]

don't handle false negatives

singleLabelConfusionMatrix
       a   b   c
a      1   1   0
b      0   2   1
c      0   0   0

"charge" false negatives to a predicted label if there is one

       a   b   c
a      1   1   0
b      0   2   2
c      0   1   0
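The two tabulation rules above can be sketched as follows. This is a self-contained illustration, not the PR's actual implementation; the Example holder and label handling are made up for the sketch. Rows are the true label and columns the predicted label, following the convention noted earlier:

```java
import java.util.Arrays;
import java.util.List;

public class SingleLabelTabulation {

    // Hypothetical holder for one multi-label prediction (not a Tribuo class).
    record Example(List<String> truth, List<String> pred) {}

    static final List<String> LABELS = List.of("a", "b", "c");

    // For each example:
    //   - a true label that was also predicted scores on the diagonal;
    //   - a predicted label absent from the truth (a false positive) is
    //     paired with every true label;
    //   - optionally, a true label that was not predicted (a false negative)
    //     is "charged" to the first predicted label, if there is one.
    static int[][] tabulate(List<Example> examples, boolean chargeFalseNegatives) {
        int n = LABELS.size();
        int[][] m = new int[n][n];
        for (Example ex : examples) {
            for (String t : ex.truth()) {
                if (ex.pred().contains(t)) {
                    m[LABELS.indexOf(t)][LABELS.indexOf(t)]++;
                } else if (chargeFalseNegatives && !ex.pred().isEmpty()) {
                    m[LABELS.indexOf(t)][LABELS.indexOf(ex.pred().get(0))]++;
                }
            }
            for (String p : ex.pred()) {
                if (!ex.truth().contains(p)) {
                    for (String t : ex.truth()) {
                        m[LABELS.indexOf(t)][LABELS.indexOf(p)]++;
                    }
                }
            }
        }
        return m;
    }

    public static void main(String[] args) {
        // The four predictions from the comment above.
        List<Example> examples = List.of(
                new Example(List.of("a"), List.of("a", "b")),
                new Example(List.of("b", "c"), List.of("b")),
                new Example(List.of("b"), List.of("b")),
                new Example(List.of("b"), List.of("c")));
        // -> [[1, 1, 0], [0, 2, 1], [0, 0, 0]] (false negatives dropped)
        System.out.println(Arrays.deepToString(tabulate(examples, false)));
        // -> [[1, 1, 0], [0, 2, 2], [0, 1, 0]] (false negatives charged)
        System.out.println(Arrays.deepToString(tabulate(examples, true)));
    }
}
```

Running this reproduces both tables above. Note that the entries sum to 5 and 7 respectively rather than the number of examples (4), which illustrates the concern raised earlier about the summed totals of an all-label confusion matrix.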

Craigacp (Member) commented:
So I'd like to get the MultiLabelConfusionMatrix.toString merged in before 4.1, which will come out shortly after the TF PR is merged and we've done the final docs & tutorials passes. I think we'll need to carefully document what singleLabelConfusionMatrix does and figure out where to expose it in the hierarchy, and that might take longer than the dev time left for 4.1.

Therefore it might be best to split this into two PRs; I'm agnostic as to whether you put the toString in a separate PR or the singleLabelConfusionMatrix, whichever is easiest. When you do, could you back out the import rearrangement changes that are in the various files? We have non-JDK imports, then JDK imports, then static imports in the Tribuo codebase, and it looks like your IDE rearranged them. We've not had a chance to set up a style checker for Tribuo, so I'd prefer all style changes to be kept minimal until something is enforced.
