Our current DefaultClassificationTemplate runs the CorEx text primitive before the imputer. If no hyperparameters are specified for CorEx, the run succeeds.
If only one set of hyperparameters is specified, for example
10, 0, 1, .9, .02 or 10, 0, 5, .9, .02
we get a successful run.
If we allow 'n_grams' to take its values from the list [(1), (5)], cross-validation fails.
To reproduce the error, uncomment the hyperparameters in the CorEx step. The run below is on 38_sick. Note that CorEx shouldn't be doing any computation on this dataset; it should just return its input as output.
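One Python detail that may or may not be related: parentheses around a single value do not create a tuple, so the list written above holds plain integers. The snippet below only demonstrates that language behavior. Whether the CorEx step expects integers or tuples for 'n_grams' is not something this report establishes, and the variable names are illustrative.

```python
# Parentheses alone do not make a tuple; a trailing comma does.
n_grams = [(1), (5)]
as_tuples = [(1,), (5,)]

# The list from the report is just a list of ints.
same_as_ints = n_grams == [1, 5]
all_ints = all(isinstance(n, int) for n in n_grams)

# The trailing-comma form really does contain tuples.
all_tuples = all(isinstance(n, tuple) for n in as_tuples)

print(same_as_ints, all_ints, all_tuples)
```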
Error log:
(dsbox-devel-710) [stan@dsbox01 python]$ python ta2-search /nas/home/stan/dsbox/runs2/config-seed/38_sick_config.json
Namespace(configuration_file='/nas/home/stan/dsbox/runs2/config-seed/38_sick_config.json', cpus=-1, debug=False, output_prefix=None, timeout=-1)
Using configuation:
{'cpus': '10',
'dataset_schema': '/nfs1/dsbox-repo/data/datasets/seed_datasets_current/38_sick/38_sick_dataset/datasetDoc.json',
'executables_root': '/nfs1/dsbox-repo/stan/dsbox-ta2/python/output/38_sick/executables',
'pipeline_logs_root': '/nfs1/dsbox-repo/stan/dsbox-ta2/python/output/38_sick/logs',
'problem_root': '/nfs1/dsbox-repo/data/datasets/seed_datasets_current/38_sick/38_sick_problem',
'problem_schema': '/nfs1/dsbox-repo/data/datasets/seed_datasets_current/38_sick/38_sick_problem/problemDoc.json',
'ram': '10Gi',
'saved_pipeline_ID': '',
'saving_folder_loc': '/nfs1/dsbox-repo/stan/dsbox-ta2/python/output/38_sick',
'temp_storage_root': '/nfs1/dsbox-repo/stan/dsbox-ta2/python/output/38_sick/temp',
'timeout': 9,
'training_data_root': '/nfs1/dsbox-repo/data/datasets/seed_datasets_current/38_sick/38_sick_dataset'}
[INFO] No test data config found! Will split the data.
[INFO] Succesfully parsed test data
{'structural_type': <class 'd3m.container.pandas.DataFrame'>, 'semantic_types': ('https://metadata.datadrivendiscovery.org/types/Table', 'https://metadata.datadrivendiscovery.org/types/DatasetEntryPoint'), 'dimension': {'name': 'rows', 'semantic_types': ('https://metadata.datadrivendiscovery.org/types/TabularRow',), 'length': 3018}}
{'dimension': <FrozenOrderedDict OrderedDict([('name', 'rows'), ('semantic_types', ('https://metadata.datadrivendiscovery.org/types/TabularRow',)), ('length', 3018)])>,
'semantic_types': ('https://metadata.datadrivendiscovery.org/types/Table',
'https://metadata.datadrivendiscovery.org/types/DatasetEntryPoint'),
'structural_type': <class 'd3m.container.pandas.DataFrame'>}
{'structural_type': <class 'd3m.container.pandas.DataFrame'>, 'semantic_types': ('https://metadata.datadrivendiscovery.org/types/Table', 'https://metadata.datadrivendiscovery.org/types/DatasetEntryPoint'), 'dimension': {'name': 'rows', 'semantic_types': ('https://metadata.datadrivendiscovery.org/types/TabularRow',), 'length': 754}}
{'dimension': <FrozenOrderedDict OrderedDict([('name', 'rows'), ('semantic_types', ('https://metadata.datadrivendiscovery.org/types/TabularRow',)), ('length', 754)])>,
'semantic_types': ('https://metadata.datadrivendiscovery.org/types/Table',
'https://metadata.datadrivendiscovery.org/types/DatasetEntryPoint'),
'structural_type': <class 'd3m.container.pandas.DataFrame'>}
[INFO] Template choices:
Template ' Default_classification_template ' has been added to template base.
[INFO] Worker started, id: <_MainProcess(MainProcess, started)>
/nfs1/dsbox-repo/stan/miniconda/envs/dsbox-devel-710/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
from ._conv import register_converters as _register_converters
Using TensorFlow backend.
[INFO] Push@cache: ('d3m.primitives.dsbox.Denormalize', 4986999622121787936)
/nfs1/dsbox-repo/stan/miniconda/envs/dsbox-devel-710/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
from ._conv import register_converters as _register_converters
Using TensorFlow backend.
[INFO] Push@cache: ('d3m.primitives.datasets.DatasetToDataFrame', 4986999622121787936)
[INFO] Push@cache: ('d3m.primitives.data.ExtractColumnsBySemanticTypes', -2701047265198232908)
[INFO] Push@cache: ('d3m.primitives.data.ExtractColumnsBySemanticTypes', 4441544048499093159)
[INFO] Push@cache: ('d3m.primitives.data.ColumnParser', 6916788228332877018)
[INFO] Push@cache: ('d3m.primitives.data.CastToType', -5585081685236413210)
[INFO] Push@cache: ('d3m.primitives.dsbox.CorexText', -1029106721422580684)
[INFO] Push@cache: ('d3m.primitives.sklearn_wrap.SKImputer', 2755365990599631608)
[INFO] Push@cache: ('d3m.primitives.sklearn_wrap.SKMultinomialNB', 2607380340696403083)
[INFO] Hit@cache: ('d3m.primitives.sklearn_wrap.SKImputer', 2755365990599631608)
The following pipeline file will be loaded:
/nfs1/dsbox-repo/stan/dsbox-ta2/python/output/38_sick/pipelines/c27faa63-98ab-4b3b-93ab-b08084762be5.json
Pickling succeeded
****************************************************************************************************
[INFO] Running Pool: 1
[INFO] Worker started, id: <ForkProcess(ForkPoolWorker-2, started daemon)>
[INFO] Hit@cache: ('d3m.primitives.dsbox.Denormalize', 4986999622121787936)
[INFO] Hit@cache: ('d3m.primitives.datasets.DatasetToDataFrame', 4986999622121787936)
[INFO] Hit@cache: ('d3m.primitives.data.ExtractColumnsBySemanticTypes', -2701047265198232908)
[INFO] Hit@cache: ('d3m.primitives.data.ExtractColumnsBySemanticTypes', 4441544048499093159)
[INFO] Hit@cache: ('d3m.primitives.data.ColumnParser', 6916788228332877018)
[INFO] Hit@cache: ('d3m.primitives.data.CastToType', -5585081685236413210)
[INFO] Push@cache: ('d3m.primitives.dsbox.CorexText', -8598879279764128775)
[INFO] Hit@cache: ('d3m.primitives.sklearn_wrap.SKImputer', 2755365990599631608)
[INFO] Hit@cache: ('d3m.primitives.sklearn_wrap.SKMultinomialNB', 2607380340696403083)
[INFO] Hit@cache: ('d3m.primitives.sklearn_wrap.SKImputer', 2755365990599631608)
The following pipeline file will be loaded:
/nfs1/dsbox-repo/stan/dsbox-ta2/python/output/38_sick/pipelines/ab7db90a-48f3-46d9-b7b8-7a7f5aad7307.json
Pickling succeeded
[WARN] write_training_results
[WARN] write_training_results
[WARN] write_training_results
[WARN] write_training_results
[WARN] write_training_results
[WARN] write_training_results
[WARN] write_training_results
[WARN] write_training_results
[WARN] write_training_results
[WARN] write_training_results
Traceback (most recent call last):
File "/nfs1/dsbox-repo/stan/dsbox-ta2/python/dsbox/template/search.py", line 192, in search_one_iter
cross_validation_values.append(res['cross_validation_metrics'][0]['value'])
IndexError: list index out of range
Traceback (most recent call last):
File "ta2-search", line 141, in <module>
result = main(args)
File "ta2-search", line 110, in main
status = controller.train()
File "/nfs1/dsbox-repo/stan/dsbox-ta2/python/dsbox/controller/controller.py", line 378, in train
candidate, value = search.search_one_iter()
File "/nfs1/dsbox-repo/stan/dsbox-ta2/python/dsbox/template/search.py", line 224, in search_one_iter
best_cv_index = cross_validation_values.index(max(cross_validation_values))
ValueError: max() arg is an empty sequence
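Reading the two tracebacks together, it looks like every candidate came back with an empty 'cross_validation_metrics' list, so nothing was appended to cross_validation_values and max() was then called on an empty list. A minimal sketch of a guard, assuming each result is a dict shaped like the one indexed in search.py (the function name pick_best_candidate and the sample data are hypothetical, not part of dsbox-ta2):

```python
from typing import Any, Dict, List, Optional, Tuple


def pick_best_candidate(results: List[Dict[str, Any]]) -> Optional[Tuple[int, float]]:
    """Return (index, value) of the best-scoring result, or None if none has metrics.

    Assumes a higher metric value is better, as the max() call in the traceback suggests.
    """
    scored = []
    for i, res in enumerate(results):
        metrics = res.get("cross_validation_metrics") or []
        if not metrics:
            # This candidate produced no cross-validation metrics; skip it
            # instead of indexing into an empty list.
            continue
        scored.append((i, metrics[0]["value"]))
    if not scored:
        # Nothing to compare; the caller can report a failed search here.
        return None
    return max(scored, key=lambda pair: pair[1])


# Hypothetical data: the first mimics the failing run, the second a partly successful one.
all_empty = [{"cross_validation_metrics": []}, {"cross_validation_metrics": []}]
one_scored = [{"cross_validation_metrics": []},
              {"cross_validation_metrics": [{"value": 0.9}]}]

print(pick_best_candidate(all_empty))
print(pick_best_candidate(one_scored))
```

A guard like this would only turn the crash into a clean "no valid candidate" result. It does not explain why the CorEx candidates return no cross-validation metrics in the first place.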