Arbitrary column reported as number of classes if no default_target_attribute set #528

Closed
amueller opened this issue Dec 5, 2017 · 34 comments

@amueller

amueller commented Dec 5, 2017

As mentioned in #527, https://www.openml.org/d/40945 has no default_target_attribute but reports 370 classes. That doesn't make a lot of sense to me. I think we shouldn't report "number of classes" if there's no default target.

@janvanrijn
Member

This one was fixed by @joaquinvanschoren

@amueller
Author

Thanks for checking. Is there a unit test? Generally it would be great if you could link to the PR or commit that closes an issue (which will also show the unit test).

@amueller
Author

Actually, can you maybe fix openml/openml-python#346?
That kind of prevents me from checking whether this is fixed, because in Python I need to download the ARFF just to check whether default_target_attribute is set :-/

@amueller
Author

Actually it's not fixed:
https://www.openml.org/d/471

@amueller amueller reopened this Oct 19, 2018
@amueller
Author

The point was that there shouldn't be some arbitrary number of classes if the default_target_attribute is not set, not that titanic doesn't have a target attribute.

@janvanrijn
Member

I get it. Cc @joaquinvanschoren

@joaquinvanschoren
Contributor

That's an EvaluationEngine issue, right?
openml/EvaluationEngine#13

The consensus was to not compute any supervised meta-features in that case?
I believe that was implemented in this PR openml/EvaluationEngine#14
Specifically, here: https://github.com/openml/EvaluationEngine/pull/14/files#diff-4749e7478cbf2cdbe2f061811900d63aR104

@janvanrijn: has the evaluationengine been exported in a new jar and pulled to master? Then it's a matter of rerunning all the meta-features?

@janvanrijn
Member

I am surprised. I checked the meta-feature engine with the latest code base, and it seems to handle this fine (obviously, as your code is correct).

Are we running old meta-feature engines somewhere? The server's engine should be up to date (I would be highly surprised if it were not).

@joaquinvanschoren
Contributor

This was only merged recently, right? Are we already running the new jar, and did we remove the old meta-features so that they would be recomputed?

@janvanrijn
Member

> This was only merged recently, right?

Last night we merged some other PR. This is the one that (presumably) solved this issue:
openml/EvaluationEngine#8

@janvanrijn
Member

janvanrijn commented Oct 19, 2018

Correction, this one:
openml/EvaluationEngine#5

@amueller
Author

There's no regression test?!

@amueller
Author

Right now I get:

NumberOfClasses not defined:  471
NumberOfClasses not 0:  501
NumberOfClasses not 0:  502
NumberOfClasses not 0:  1184
NumberOfClasses not 0:  1419
NumberOfClasses not 0:  4139
NumberOfClasses not 0:  4140
NumberOfClasses not 0:  4535
NumberOfClasses not defined:  4546
NumberOfClasses not defined:  4562
NumberOfClasses not defined:  4563
NumberOfClasses not 0:  23383
NumberOfClasses not 0:  40864
NumberOfClasses not 0:  40869
NumberOfClasses not 0:  40992
NumberOfClasses not 0:  40993
NumberOfClasses not 0:  41039

Shouldn't NumberOfClasses be always defined?

@janvanrijn
Member

There is. The code in the repository results in the correct behavior. However, as we are working with a web system with multiple components, it is not impossible that an old version of one of the components is running somewhere, which is my only explanation for why these results have come online.

I checked all instances of the evaluation engine I am running, and these are up to date. This query reveals that there are more datasets that were calculated wrongly. I will reset them all, and we should check whether they still have this problem once they are re-evaluated.

SELECT d.did, d.name, q.value 
FROM data_quality q, dataset d 
WHERE q.data = d.did 
AND d.default_target_attribute IS NULL 
AND q.quality = "NumberOfClasses" 
AND q.value > 0 
LIMIT 1000 
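
The check this query performs can be tried out on a toy database. The following is a minimal sketch using Python's built-in sqlite3 module, not the OpenML server's actual schema: only the columns named in the query are modelled, the column types and the three rows are made up for illustration, and the string literal is single-quoted because SQLite treats double quotes as identifier quoting.

```python
import sqlite3

# Toy in-memory database; the schema is an assumption based only on the
# column names that appear in the query above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dataset (did INTEGER PRIMARY KEY, name TEXT, default_target_attribute TEXT);
CREATE TABLE data_quality (data INTEGER, quality TEXT, value REAL);
""")

# Hypothetical rows: two datasets without a default target, one with.
conn.executemany("INSERT INTO dataset VALUES (?, ?, ?)", [
    (1, "no_target_bad", None),
    (2, "no_target_ok", None),
    (3, "with_target", "class"),
])
conn.executemany("INSERT INTO data_quality VALUES (?, ?, ?)", [
    (1, "NumberOfClasses", 370.0),  # a class count without a target: suspicious
    (2, "NumberOfClasses", None),   # stored NULL: the intended outcome
    (3, "NumberOfClasses", 2.0),    # has a target, so a count is expected
])

# Same logic as the query above, written with an explicit JOIN.
suspicious = conn.execute("""
    SELECT d.did, d.name, q.value
    FROM data_quality q
    JOIN dataset d ON q.data = d.did
    WHERE d.default_target_attribute IS NULL
      AND q.quality = 'NumberOfClasses'
      AND q.value > 0
    ORDER BY d.did
    LIMIT 1000
""").fetchall()
print(suspicious)
```

Only the first toy dataset is flagged: it has no default target yet carries a positive class count.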

@amueller
Author

There was no regression test in the PR though ;)

@janvanrijn
Member

> There was no regression test in the PR though ;)

I asked @joaquinvanschoren to make a unit test, and it was implemented in this PR:
https://github.com/openml/EvaluationEngine/pull/6/files

> Shouldn't NumberOfClasses be always defined?

I get a different result. I just executed the following query:

SELECT d.did, d.name
FROM data_quality q, dataset d 
WHERE q.data = d.did 
AND q.quality = "NumberOfClasses" 
AND q.value IS NULL

I get 207 results (where there is no number of classes defined). I am checking them now.

@janvanrijn
Member

The following query is a bit more accurate:

SELECT d.did, d.name 
FROM data_processed p, dataset d 
LEFT JOIN data_quality q ON q.data = d.did AND q.quality = "NumberOfClasses" 
WHERE q.value IS NULL AND d.did = p.did AND p.evaluation_engine_id = 1 AND p.error IS NULL 

It only shows datasets that are already processed and that were processed without errors. I still get about 30 results:

did | name
471 | analcatdata_draft
1088 | variousCancers_final
23389 | OAEI-Person11
23417 | yagoSchema.ttl
23418 | yagoSchema.ttl
23419 | yagoSchema.ttl
23425 | yagoSchema.ttl
23428 | yagoSchema.ttl
23466 | yagoSchema.ttl
23485 | yagoSchema.ttl
23490 | yagoSchema.ttl
23500 | yagoSchema.ttl
23502 | yagoSchema.ttl
23503 | yagoSchema.ttl
23504 | yagoSchema.ttl
23505 | yagoSchema.ttl
23506 | yagoSchema.ttl
23507 | yagoSchema.ttl
23510 | yagoSchema.ttl
23511 | yagoSchema.ttl
40751 | subsample_delays_zurich_transport
40818 | KC2
40968 | feedback_new
41043 | ClientClosing
41067 | audio

@janvanrijn
Member

This is the right query:

SELECT d.did, d.name 
FROM data_processed p, dataset d 
LEFT JOIN data_quality q ON q.data = d.did AND q.quality = "NumberOfClasses" 
WHERE q.value IS NULL 
AND d.did = p.did 
AND p.evaluation_engine_id = 1 
AND p.error IS NULL 
AND q.quality IS NOT NULL 

This shows 1 result:
471 | analcatdata_draft.

@amueller For all your other results: there is a difference between a quality that has not been calculated yet and a quality that is set to NULL. The former is not optimal, but not a bug, I would say (the evaluation engine can be slow for certain qualities, making it lag behind when an initial try has crashed for whatever reason).

The second is a problem. I will investigate.
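
The distinction between "not calculated yet" and "set to NULL" is easy to lose behind a LEFT JOIN, because both cases produce a NULL in q.value. Here is a small sketch with Python's sqlite3 module under assumed, simplified tables (the data_processed table and its filters are left out, and all rows are invented) showing how the extra `q.quality IS NOT NULL` condition separates the two:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dataset (did INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE data_quality (data INTEGER, quality TEXT, value REAL);
-- Hypothetical rows: dataset 1 has no quality row at all,
-- dataset 2 has a row whose value is NULL, dataset 3 has a real value.
INSERT INTO dataset VALUES (1, 'not_calculated_yet'), (2, 'stored_as_null'), (3, 'has_value');
INSERT INTO data_quality VALUES (2, 'NumberOfClasses', NULL), (3, 'NumberOfClasses', 2.0);
""")

query = """
    SELECT d.did
    FROM dataset d
    LEFT JOIN data_quality q ON q.data = d.did AND q.quality = 'NumberOfClasses'
    WHERE q.value IS NULL {extra}
    ORDER BY d.did
"""

# Without the extra filter, a missing row and a stored NULL look identical.
both = [row[0] for row in conn.execute(query.format(extra=""))]

# q.quality is only non-NULL when a matching row exists, so this keeps
# just the datasets whose quality was stored with a NULL value.
stored_null = [row[0] for row in conn.execute(query.format(extra="AND q.quality IS NOT NULL"))]

print(both, stored_null)
```

In this toy setup the first list contains datasets 1 and 2, while the second contains only dataset 2.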

@janvanrijn
Member

So the Analcat data problem hinges on the following: It does not have a default target attribute (problem is actually bigger, I opened #834 for this). I think the NULL value is OK, as we now have the following distinction:

number of classes > 0: classification dataset
number of classes = 0: regression dataset
number of classes undefined: we don't know.
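
As a rough illustration of this three-way rule (a sketch only, not the EvaluationEngine's code), a hypothetical helper could look like the following. The function name, the "numeric" type label as a plain string, and counting distinct observed values rather than declared nominal values are all assumptions.

```python
from typing import Optional, Sequence


def number_of_classes(target_type: Optional[str],
                      target_values: Sequence = ()) -> Optional[int]:
    """Hypothetical helper mirroring the rule above.

    target_type is None when the dataset has no default target attribute.
    """
    if target_type is None:
        return None  # undefined: we don't know
    if target_type == "numeric":
        return 0     # regression dataset
    # Classification: count distinct non-missing target values (an assumption;
    # a real implementation might count the declared nominal values instead).
    return len({v for v in target_values if v is not None})


print(number_of_classes(None))
print(number_of_classes("numeric", [1.0, 2.5]))
print(number_of_classes("nominal", ["a", "b", "a"]))
```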

I am actually confused why other datasets whose default target attribute IS NULL have other values. I will investigate this.

@janvanrijn
Member

Found an issue in the OpenML API:
Admins did not have the right to download data features of private datasets. I fixed this and created a unit test in the Java API to ensure correct behavior in the future.
This small mistake meant that private datasets often did not have the correct number of data qualities. The fix is in the accumulated PR on the OpenML server, and is waiting for @joaquinvanschoren's review.

@janvanrijn
Member

So, a small update. This is the query that finds the problematic datasets:

SELECT d.did, d.name, d.visibility, q.value 
FROM data_processed p, dataset d 
LEFT JOIN data_quality q ON q.data = d.did AND q.quality = "NumberOfClasses" 
WHERE d.default_target_attribute IS NULL 
AND d.did = p.did AND p.evaluation_engine_id = 1 
AND p.error IS NULL 
AND q.value IS NOT NULL 
LIMIT 100 

These are the datasets:

did | name | visibility | NumberOfClasses
1184 | BNG(baseball) | public | 3.0
1420 | cpu.with.vendor | public | 0.0
4133 | kdd_internet_usage | public | 10108.0
4138 | DBpedia(YAGO).arff | public | 0.0
4140 | NELL | public | 2.0
4352 | YearPredictionMSD | friends | 0.0
4353 | Concrete_Data | public | 0.0
4444 | InterferenceSeperation | private | 2.0
4531 | parkinsons-telemonitoring | public | 0.0
4533 | KEGGMetabolicReactionNetwork | public | 0.0
4540 | ParkinsonSpeechDatasetwithMultipleTypesofSoundReco... | public | 0.0
4548 | BuzzinsocialmediaTomsHardware | public | 0.0
4551 | WaveformDatabaseGenerator | public | 0.0
4553 | TurkiyeStudentEvaluation | public | 0.0
23394 | COMET_MC_SAMPLE | public | 0.0
23396 | COMET_MC_SAMPLE | public | 0.0
23416 | YAGO_Schema_KGE | private | 0.0
23420 | yagoSchema.ttl | public | 0.0
40598 | XYZ | public | 0.0
40600 | Glass | public | 6.0
40630 | testpvy | public | 3.0
40725 | PredictItTesting | public | 0.0
40729 | olympic-marathon-men | public | 0.0
40730 | xw-pen | public | 0.0
40759 | Abalone-train | public | 26.0
40760 | Abalone-test | public | 24.0
40762 | Amazon-test | public | 50.0
40767 | Dexter-train | public | 2.0
40768 | Dexter-test | public | 2.0
40769 | Dorothea-train | public | 2.0
40771 | GermanCredit-train | public | 2.0
40772 | GermanCredit-test | public | 2.0
40773 | Gisette-train | public | 2.0
40774 | Gisette-test | public | 2.0
40775 | KR-vs-KP-train | public | 2.0
40779 | Secom-train | public | 2.0
40782 | Semeion-test | public | 10.0
40787 | Yeast-train | public | 10.0
40788 | Yeast-test | public | 9.0
40871 | test | public | 3.0
40883 | hcc-dataset | public | 2.0
40917 | USvid | public | 0.0
40925 | QuaLiKiz-4D | public | 0.0
40952 | UploadTestWithURL | public | 3.0
40953 | UploadTestWithURL | public | 3.0
40956 | Bankdata | private | 0.0
40957 | Bankdata | public | 0.0
40958 | Bankdata | public | 0.0
40960 | sponge | public | 3.0
40976 | Bike | public | 0.0
40977 | cpu | private | 0.0
40989 | climate-model-simulation-crashes | public | 2.0
40991 | Breast_Cancer | public | 2.0
41012 | NPSdecay | public | 0.0
41015 | qType_5prev | public | 55.0
41016 | qType_30prev | public | 55.0
41029 | data-v3.en-es-lit.clean.anno.uniform.20top | public | 0.0
41030 | data-v3.en-es-lit.clean.anno.uniform.without_20top | public | 0.0
41031 | data-v3.en-es-lit.clean.anno.uniform.30top | public | 0.0
41032 | data-v3.en-es-lit.clean.anno.uniform.without_30top | public | 0.0
41035 | Random | private | 0.0
41036 | TestofRandom | private | 0.0
41038 | Pizza | public | 9.0
41041 | intuit | public | 0.0
41060 | London-smart-meter-data-preprocessed-for-clusterin... | public | 0.0
41068 | CF-metadataset | public | 5.0
41080 | test | private | 0.0
41088 | Airline_delay | public | 0.0
41089 | sod1_mouse | public | 0.0
41090 | elevators | public | 0.0
41092 | Cmu_mocap_35_walk_jog | public | 0.0
41093 | creep | public | 0.0
41094 | brendan_faces | public | 0.0
41095 | Silhouette | public | 0.0
41097 | Incremental-gradual-balanced | private | 6.0
41111 | Vili | public | 0.0
41117 | JNUG | public | 0.0
41118 | JNUG | public | 0.0
41119 | JNUG | public | 2.0
41120 | English_Spanish_Annotated_Mappings | public | 0.0
41121 | English_Spanish_Annotated_IRI_Mappings | public | 0.0
41122 | TitanicTestData | public | 370.0
41123 | doc2vec_510 | public | 0.0
41171 | Color | public | 0.0
41174 | LMMM_Dataset | public | 0.0
41177 | GaneTest | public | 0.0
41188 | LogisticaAPagar | private | 0.0
41190 | cwurData | public | 3.0
41191 | Data.arff | public | 0.0

As argued, I expected these to have value NULL for NumberOfClasses. But they don't. I will investigate.

@janvanrijn
Member

I am rerunning the latest version of the evaluation engine on all these datasets, and they seem to be fine now. This leads me to believe that an old version of the evaluation engine was running for a while, or that one is still running somewhere (let's hope not).

We should probably start versioning evaluation engines, and make the api check for the latest version before accepting results.
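
A version gate of this kind could be sketched as follows. This is purely illustrative: the function names are hypothetical, and dotted numeric version strings such as "1.2.0" are an assumption, since the thread does not say how evaluation engine versions are encoded.

```python
def parse_version(text: str) -> tuple:
    """Turn a dotted numeric version such as '1.10.0' into (1, 10, 0)."""
    return tuple(int(part) for part in text.split("."))


def accept_result(reported_version: str, latest_version: str) -> bool:
    """Accept a result only if the engine that produced it is not older
    than the latest version known to the server (hypothetical policy)."""
    return parse_version(reported_version) >= parse_version(latest_version)


print(accept_result("1.2.0", "1.2.0"))   # same version
print(accept_result("1.1.9", "1.2.0"))   # outdated engine
print(accept_result("1.10.0", "1.9.0"))  # numeric, not lexical, comparison
```

Comparing integer tuples rather than raw strings avoids ranking "1.10.0" below "1.9.0".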

@janvanrijn
Member

I reapplied the query, and everything seems fine. Please close if you agree. Note that NumberOfClasses can be NULL, according to this rule:

number of classes > 0: classification dataset
number of classes = 0: regression dataset
number of classes undefined: we don't know.

Is there a good place to document this? Probably the meta-feature record? @joaquinvanschoren

Why were the results wrong in the first place? Although I am not 100% sure about this, my best guess is that an old version of the evaluation engine was calculating meta-features somewhere. I opened #835 for this reason.

@joaquinvanschoren
Contributor

joaquinvanschoren commented Oct 22, 2018 via email

@joaquinvanschoren
Contributor

joaquinvanschoren commented Oct 22, 2018 via email

@amueller
Author

Nope:
https://www.openml.org/d/501

@janvanrijn
Member

Thanks!

I forgot this query:

SELECT d.did, d.name, p.processing_date
FROM data_processed p, dataset d 
LEFT JOIN data_quality q ON q.data = d.did AND q.quality = "NumberOfClasses" 
WHERE (d.default_target_attribute IS NULL OR d.default_target_attribute = "") 
AND q.value IS NOT NULL 
AND d.did = p.did 
AND p.evaluation_engine_id = 1 

which gives me this result:

did | name | processing_date
-- | -- | --
490 | hip | 2018-10-03 21:35:58
493 | wind_correlations | 2018-10-03 21:22:37
502 | analcatdata_whale | 2018-10-03 21:24:38
532 | analcatdata_uktrainacc | 2018-10-03 21:22:49
680 | chscase_funds | 2018-10-03 21:23:13
705 | chscase_health | 2018-10-03 21:22:50

(501 was also in there before I reran it). I will reset them all.

@janvanrijn
Member

> The evaluation engine is already versioned, right?

Yes, it is versioned, but this version is not communicated to the OpenML server, and the version with which something was run is not reported.

> Even with the version number, right now anyone can update the metafeatures simply by downloading and running the evaluationengine, right?

Nope, this is a (semi-)admin function, so only (semi-)admins can do this. We still need to find a way to check that (semi-)admins can only run evaluation engines that belong to their own user account.

@amueller
Author

NumberOfClasses not defined:  471
NumberOfClasses not defined:  501
NumberOfClasses not 0:  502
NumberOfClasses not 0:  1184
NumberOfClasses not 0:  1419
NumberOfClasses not defined:  1420
NumberOfClasses not 0:  4139
NumberOfClasses not 0:  4140
NumberOfClasses not 0:  4535
NumberOfClasses not defined:  4546
NumberOfClasses not defined:  4562
NumberOfClasses not defined:  4563
NumberOfClasses not 0:  23383
NumberOfClasses not 0:  40864
NumberOfClasses not 0:  40869
NumberOfClasses not 0:  40992
NumberOfClasses not 0:  40993
NumberOfClasses not 0:  41039

@janvanrijn
Member

As a gentle reminder: datasets without a default target feature will have NumberOfClasses set to NULL, and datasets with a numeric default target feature will have NumberOfClasses set to 0.

@joaquinvanschoren was correct, the production server didn't run the latest version of the EvalEngine. I updated it. I will reset all faulty runs.

For future reference, the list of currently wrong datasets:

did | name | processing_date
493 | wind_correlations | 2018-10-24 03:00:24
502 | analcatdata_whale | 2018-10-24 03:06:48
532 | analcatdata_uktrainacc | 2018-10-24 03:46:16
680 | chscase_funds | 2018-10-24 03:42:40
705 | chscase_health | 2018-10-24 03:42:09
1184 | BNG(baseball) | 2018-10-22 23:54:22
1419 | contact-lenses | 2018-10-22 23:48:37
1456 | appendicitis | 2018-10-22 23:50:26
4133 | kdd_internet_usage | 2018-10-24 03:00:30
4138 | DBpedia(YAGO).arff | 2018-10-24 03:37:47
4139 | Wikidata | 2018-10-24 03:00:20
4140 | NELL | 2018-10-24 03:42:39
4352 | YearPredictionMSD | 2018-10-22 23:52:28
4353 | Concrete_Data | 2018-10-22 23:48:12
4444 | InterferenceSeperation | 2018-10-22 23:48:36
4531 | parkinsons-telemonitoring | 2018-10-24 03:39:21
4533 | KEGGMetabolicReactionNetwork | 2018-10-24 03:00:45
4535 | Census-Income | 2018-10-24 03:42:22
4540 | ParkinsonSpeechDatasetwithMultipleTypesofSoundReco... | 2018-10-24 03:50:03
4548 | BuzzinsocialmediaTomsHardware | 2018-10-22 23:47:07
4551 | WaveformDatabaseGenerator | 2018-10-22 23:52:29
4553 | TurkiyeStudentEvaluation | 2018-10-22 23:48:15
23383 | SensorDataResource | 2018-10-24 03:05:58
23386 | CoverType5percent | 2018-10-24 03:38:18
23394 | COMET_MC_SAMPLE | 2018-10-24 03:38:07
23396 | COMET_MC_SAMPLE | 2018-10-22 23:48:18
23416 | YAGO_Schema_KGE | 2018-10-24 03:07:04
23420 | yagoSchema.ttl | 2018-10-22 23:48:29
40598 | XYZ | 2018-10-22 23:48:27
40600 | Glass | 2018-10-24 03:39:18
40630 | testpvy | 2018-10-24 03:00:32
40631 | pvy1 | 2018-10-24 03:53:02
40729 | olympic-marathon-men | 2018-10-22 23:54:25
40739 | weather.nominal | 2018-10-22 23:53:47
40747 | Reuters-Grain | 2018-10-24 03:07:03
40759 | Abalone-train | 2018-10-24 03:49:03
40760 | Abalone-test | 2018-10-22 23:53:48
40761 | Amazon-train | 2018-10-24 03:00:37
40762 | Amazon-test | 2018-10-22 23:53:40
40763 | Car-train | 2018-10-24 03:42:07
40764 | Car-test | 2018-10-24 03:07:08
40765 | Convex-train | 2018-10-24 03:48:09
40766 | Convex-test | 2018-10-24 03:47:45
40767 | Dexter-train | 2018-10-22 23:47:15
40768 | Dexter-test | 2018-10-24 03:06:19
40769 | Dorothea-train | 2018-10-24 03:41:40
40770 | Dorothea-test | 2018-10-22 23:47:39
40771 | GermanCredit-train | 2018-10-24 03:07:08
40772 | GermanCredit-test | 2018-10-24 03:43:03
40773 | Gisette-train | 2018-10-24 03:07:00
40774 | Gisette-test | 2018-10-22 23:52:36
40775 | KR-vs-KP-train | 2018-10-24 04:00:02
40776 | KR-vs-KP-test | 2018-10-24 03:38:09
40777 | Madelon-train | 2018-10-24 03:38:17
40778 | Madelon-test | 2018-10-24 03:00:22
40779 | Secom-train | 2018-10-22 23:53:44
40780 | Secom-test | 2018-10-22 23:50:20
40781 | Semeion-train | 2018-10-22 23:50:23
40782 | Semeion-test | 2018-10-22 23:48:28
40784 | Waveform-test | 2018-10-24 03:00:42
40785 | WineQualityWhite-train | 2018-10-24 03:06:16
40786 | WineQualityWhite-test | 2018-10-24 03:42:08
40787 | Yeast-train | 2018-10-22 23:50:29
40788 | Yeast-test | 2018-10-24 03:46:17
40864 | Honey_bee_Seasonal_mortality | 2018-10-24 03:42:06
40865 | epilobee_mortality | 2018-10-24 03:06:49
40869 | pathogen_survey_dataset | 2018-10-22 23:48:19
40871 | test | 2018-10-24 03:38:03
40884 | KDDCup99 | 2018-10-22 23:50:19
40912 | Teste_1 | 2018-10-24 03:57:02
40913 | Teste_2 | 2018-10-22 23:54:23
40917 | USvid | 2018-10-22 23:47:09
40921 | Devnagari_Script_Dataset | 2018-10-24 03:39:17
40925 | QuaLiKiz-4D | 2018-10-24 03:00:26
40952 | UploadTestWithURL | 2018-10-24 03:38:22
40953 | UploadTestWithURL | 2018-10-22 23:52:30
40954 | UploadTestWithURL | 2018-10-24 03:53:03
40955 | UploadTestWithURL | 2018-10-24 03:37:52
40956 | Bankdata | 2018-10-22 23:50:22
40957 | Bankdata | 2018-10-22 23:48:17
40958 | Bankdata | 2018-10-22 23:48:32
40960 | sponge | 2018-10-24 03:38:08
40967 | feedback | 2018-10-24 03:07:12
40969 | feedback_1 | 2018-10-24 03:38:15
40976 | Bike | 2018-10-24 03:07:05
40977 | cpu | 2018-10-22 23:52:31
40989 | climate-model-simulation-crashes | 2018-10-22 23:53:50
40990 | climate-model-simulation-crashes | 2018-10-22 23:48:31
40991 | Breast_Cancer | 2018-10-22 23:48:14
40992 | sylva_agnostic | 2018-10-24 03:38:21
40993 | ada_agnostic | 2018-10-24 03:07:06
40995 | climate-model-simulation-chrashes | 2018-10-24 03:07:07
41008 | Diabetes | 2018-10-24 03:37:54
41010 | Homicide | 2018-10-22 23:48:22
41012 | NPSdecay | 2018-10-24 03:06:47
41015 | qType_5prev | 2018-10-24 03:06:27
41016 | qType_30prev | 2018-10-24 03:57:11
41023 | orcid | 2018-10-24 03:06:15
41024 | food-truck-test | 2018-10-22 23:50:46
41029 | data-v3.en-es-lit.clean.anno.uniform.20top | 2018-10-22 23:48:16
41030 | data-v3.en-es-lit.clean.anno.uniform.without_20top | 2018-10-24 03:47:03
41031 | data-v3.en-es-lit.clean.anno.uniform.30top | 2018-10-24 03:06:50
41032 | data-v3.en-es-lit.clean.anno.uniform.without_30top | 2018-10-22 23:50:27
41035 | Random | 2018-10-24 03:37:53
41036 | TestofRandom | 2018-10-24 03:37:51
41037 | Pizza | 2018-10-22 23:47:08
41038 | Pizza | 2018-10-24 03:06:25
41039 | EMNIST_Balanced | 2018-10-22 23:53:37
41060 | London-smart-meter-data-preprocessed-for-clusterin... | 2018-10-24 03:46:13
41068 | CF-metadataset | 2018-10-22 23:54:24
41077 | balance-scale | 2018-10-24 03:00:25
41079 | a | 2018-10-22 23:50:23
41080 | test | 2018-10-22 23:50:31
41088 | Airline_delay | 2018-10-22 23:54:29
41089 | sod1_mouse | 2018-10-22 23:48:11
41090 | elevators | 2018-10-24 03:37:50
41092 | Cmu_mocap_35_walk_jog | 2018-10-24 03:06:51
41093 | creep | 2018-10-22 23:48:30
41094 | brendan_faces | 2018-10-22 23:53:46
41095 | Silhouette | 2018-10-24 03:39:19
41096 | ripley_synth | 2018-10-22 23:48:35
41098 | N3 | 2018-10-24 03:53:03
41099 | Stagger | 2018-10-22 23:50:47
41100 | Stagger | 2018-10-24 03:38:10
41111 | Vili | 2018-10-22 23:53:43
41117 | JNUG | 2018-10-22 23:50:25
41118 | JNUG | 2018-10-22 23:50:25
41119 | JNUG | 2018-10-24 03:00:52
41120 | English_Spanish_Annotated_Mappings | 2018-10-24 03:55:02
41121 | English_Spanish_Annotated_IRI_Mappings | 2018-10-22 23:52:30
41122 | TitanicTestData | 2018-10-24 03:38:11
41123 | doc2vec_510 | 2018-10-22 23:53:49
41124 | iris | 2018-10-24 03:38:23
41174 | LMMM_Dataset | 2018-10-22 23:47:03
41175 | OpenML_trainingdata_placings | 2018-10-24 03:00:53
41177 | GaneTest | 2018-10-24 03:50:04
41188 | LogisticaAPagar | 2018-10-24 03:00:41
41190 | cwurData | 2018-10-22 23:50:28
41191 | Data.arff | 2018-10-24 03:07:13
41199 | Students | 2018-10-27 20:03:26

@janvanrijn
Member

For future reference, here is the other query that needs to be checked:

SELECT d.did, d.name, p.processing_date, f.name, f.data_type, q.value
FROM data_processed p, data_feature f, dataset d
LEFT JOIN data_quality q ON q.data = d.did AND q.quality = "NumberOfClasses" 
WHERE d.default_target_attribute = f.name 
AND d.did = f.did
AND f.data_type = 'numeric'
AND q.value > 0
AND d.did = p.did 
AND p.evaluation_engine_id = 1
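
Taken together, the two consistency queries in this thread (no default target but a non-NULL NumberOfClasses, and a numeric default target but NumberOfClasses > 0) could serve as the basis for a regression check. Below is a minimal sketch in plain Python over in-memory records; the DatasetRecord type, the function name, and the sample rows are hypothetical and do not come from the OpenML code base.

```python
from typing import Iterable, List, NamedTuple, Optional


class DatasetRecord(NamedTuple):
    did: int
    target_type: Optional[str]          # None means no default target attribute
    number_of_classes: Optional[float]  # None means stored as NULL


def find_inconsistent(records: Iterable[DatasetRecord]) -> List[int]:
    """Return ids of datasets that violate either rule checked by the queries."""
    bad = []
    for r in records:
        if r.target_type is None and r.number_of_classes is not None:
            bad.append(r.did)  # no target, yet a class count was stored
        elif r.target_type == "numeric" and (r.number_of_classes or 0) > 0:
            bad.append(r.did)  # numeric target, yet a positive class count
    return bad


# Invented sample rows covering the consistent and inconsistent cases.
records = [
    DatasetRecord(1, None, None),
    DatasetRecord(2, None, 3.0),
    DatasetRecord(3, "numeric", 0.0),
    DatasetRecord(4, "numeric", 5.0),
    DatasetRecord(5, "nominal", 2.0),
]
print(find_inconsistent(records))
```

An empty result would correspond to both queries returning no rows.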

@janvanrijn
Member

Seems solved. Shall we close?

@amueller
Author

I didn't have time to check, but if you're certain, we can close.

@janvanrijn
Member

Both of my queries still return no results. Will close for now.
