The Python cleaning script `data-preparation/preprocessing/training/01a_catalogue_cleaning_and_filtering/clean.py` currently uses only the train split. It needs to iterate over all splits and apply the filters to each of them.
Hint: simply dropping the square brackets from `load_from_disk(dataset_path)['train']` is not enough, because you then get a `DatasetDict` object instead of a `Dataset`. As a consequence, `dataset.select()` can no longer be called, since that method exists only on `Dataset`.
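One possible way to handle this is to loop over the splits and call `select()` on each `Dataset` individually. The following is only a sketch, not the actual change to `clean.py`: the names `keep_indices`, `filter_all_splits`, `clean_dataset`, and the `keep_row` predicate are hypothetical, and the filters the script really applies are not shown in this issue.

```python
from typing import Any, Callable, Dict


def keep_indices(split, keep_row: Callable[[Dict[str, Any]], bool]):
    """Return the positions of the rows in one split that pass keep_row."""
    return [i for i, row in enumerate(split) if keep_row(row)]


def filter_all_splits(splits, keep_row):
    """Apply the same row filter to every split of a DatasetDict-like mapping.

    Each value only needs to be iterable over rows and offer .select(indices),
    which is what datasets.Dataset provides.
    """
    filtered = {
        name: split.select(keep_indices(split, keep_row))
        for name, split in splits.items()
    }
    # Rebuild the same container type (DatasetDict is a dict subclass).
    return type(splits)(filtered)


def clean_dataset(dataset_path, keep_row):
    """Hypothetical entry point: load every split from disk and filter it."""
    # Imported here so the helpers above stay free of third-party dependencies.
    from datasets import load_from_disk

    # No ['train'] here, so this is assumed to be a DatasetDict with all splits.
    splits = load_from_disk(dataset_path)
    return filter_all_splits(splits, keep_row)
```

A call could look like `clean_dataset(dataset_path, lambda row: len(row["text"]) > 0)`, where the `text` column is an assumption for illustration. Because `select()` is only ever called on the individual `Dataset` values, the missing `DatasetDict.select()` is no longer a problem.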