Prediction speed scales with training data size rather than output size #79

davidfstein · 2023-06-14T14:28:35Z

I am running some experiments with the Mulan wrapper. Particularly, I added the COCOA method from that repository and am running the following for training:

java -cp "~/bin/meka-release-1.9.8-SNAPSHOT/lib/*" meka.classifiers.multilabel.MULAN -S COCOA -verbosity 8 -split-percentage 100 -t "train.arff" -d "clf.dmp" -W weka.classifiers.trees.J48
and for inference:
java -cp "~/bin/meka-release-1.9.8-SNAPSHOT/lib/*" meka.classifiers.multilabel.MULAN -S COCOA -verbosity 8 -t "train.arff" -T "test.arff" -l "clf.dmp" -W weka.classifiers.trees.J48

Notably, training time increases moderately but reasonably as "train.arff" grows. However, with a fixed "test.arff" size, inference time scales exponentially with "train.arff" size. It almost seems as if training is not actually occurring during the first command but rather during the second. My Java is very rusty, so perhaps that is indeed what is happening. Is this the expected behavior?
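To put numbers on this, each phase could be timed separately while varying the size of "train.arff". A minimal sketch, assuming bash; `elapsed_seconds` is a hypothetical helper and not part of MEKA:

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of MEKA).
# elapsed_seconds CMD [ARGS...]: run a command, discard its output,
# and print the wall-clock time it took, in whole seconds.
# The command's exit status is ignored; only the duration is reported.
elapsed_seconds() {
  local start end
  start=$(date +%s)
  "$@" >/dev/null 2>&1
  end=$(date +%s)
  echo $((end - start))
}
```

Prefixing each of the two java commands above with `elapsed_seconds`, for several training-set sizes, would show which phase actually grows with "train.arff".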


fracpete · 2023-06-14T23:40:38Z

I just submitted a fix (0608eef) that allows you to evaluate a previously trained model on a test set. This wasn't possible before: the model was always retrained on the training data.

With the latest snapshot, you would use something like this:

java -cp "~/bin/meka-release-1.9.8-SNAPSHOT/lib/*" meka.classifiers.multilabel.MULAN -S COCOA -verbosity 8 -threshold 1 -T "test.arff" -l "clf.dmp"

davidfstein · 2023-06-15T15:02:37Z

Thanks for the quick fix!

I rebuilt from master, but I'm running into this error now:

java.lang.ArrayIndexOutOfBoundsException: Index 1341 out of bounds for length 1341
    at weka.core.DenseInstance.value(DenseInstance.java:347)
    at mulan.transformations.BinaryRelevanceTransformation.transformInstance(BinaryRelevanceTransformation.java:126)
    at mulan.classifier.transformation.BinaryRelevance.makePredictionInternal(BinaryRelevance.java:83)
    at mulan.classifier.MultiLabelLearnerBase.makePrediction(MultiLabelLearnerBase.java:113)
    at mulan.classifier.transformation.COCOA.makePredictionforThreshold(COCOA.java:305)
    at mulan.classifier.transformation.COCOA.makePredictionInternal(COCOA.java:324)
    at mulan.classifier.MultiLabelLearnerBase.makePrediction(MultiLabelLearnerBase.java:113)
    at meka.classifiers.multilabel.MULAN.distributionForInstance(MULAN.java:263)
    at meka.classifiers.multilabel.Evaluation.testClassifier(Evaluation.java:617)
    at meka.classifiers.multilabel.Evaluation.evaluateModel(Evaluation.java:419)
    at meka.classifiers.multilabel.Evaluation.runExperiment(Evaluation.java:301)
    at meka.classifiers.multilabel.ProblemTransformationMethod.runClassifier(ProblemTransformationMethod.java:172)
    at meka.classifiers.multilabel.ProblemTransformationMethod.evaluation(ProblemTransformationMethod.java:152)
    at meka.classifiers.multilabel.MULAN.main(MULAN.java:273)

fracpete · 2023-06-15T20:51:41Z

Please provide a minimal example that replicates this problem.
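While putting a minimal example together, a quick header comparison may help narrow things down. The exception says that index 1341 was requested on an instance holding only 1341 values. One possible explanation, which is an assumption and not confirmed anywhere in this thread, is that "test.arff" declares fewer attributes than the data the model was trained on. A rough sketch of that check, assuming bash and grep; both function names are hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of MEKA or MULAN).

# count_attributes FILE: print the number of @attribute declarations
# in an ARFF file. ARFF keywords are case-insensitive, hence grep -i.
count_attributes() {
  grep -ci '^[[:space:]]*@attribute' "$1"
}

# same_attribute_count TRAIN TEST: print both counts and succeed only
# if the two files declare the same number of attributes.
same_attribute_count() {
  local train_n test_n
  train_n=$(count_attributes "$1")
  test_n=$(count_attributes "$2")
  echo "train: $train_n attributes, test: $test_n attributes"
  [ "$train_n" -eq "$test_n" ]
}
```

Running `same_attribute_count train.arff test.arff` prints both counts and returns a non-zero status when they differ. Matching counts would not rule out other header differences, such as attribute order or label placement.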

fracpete added a commit that referenced this issue Jun 14, 2023


can test previously trained (and serialized) model now on test set (#79)

0608eef

fracpete self-assigned this Jun 14, 2023

fracpete added the bug label Jun 14, 2023
