Releases: zooniverse/aggregation-for-caesar
Version 4.0.0
New feature
Version 4 of the aggregation code brings the ability to extract and reduce data across major workflow versions. Because of this change the command line interface has been updated with a new way to enter workflow versions:
The old way
Previously, the minor workflow version was entered with a separate flag:
panoptes_aggregation config penguin-watch-workflows.csv 6465 -v 52 -m 76
The new way
Now it is entered together with the major workflow version under the -v flag:
panoptes_aggregation config penguin-watch-workflows.csv 6465 -v 52.76
Specifying a version range
If you want to aggregate across all minor versions of a major workflow version, just include the major version in the -v flag:
panoptes_aggregation config penguin-watch-workflows.csv 6465 -v 52
If you want to cover a range of specific versions (even across major versions), use the new --min_version and --max_version flags (both flags are inclusive):
panoptes_aggregation config penguin-watch-workflows.csv 6465 --min_version 51.3 --max_version 53.5
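To illustrate how an inclusive version range behaves, here is a minimal Python sketch. It is not the package's implementation: the helper names parse_version and in_range are hypothetical. It assumes a version is written as MAJOR or MAJOR.MINOR, that a bare major lower bound starts at minor version 0, and that a bare major upper bound covers every minor version of that major. Versions are compared as integer pairs rather than floats, because as floats 52.8 would sort above 52.76 even though minor version 8 comes before minor version 76.

```python
import math


def parse_version(text, default_minor=0):
    """Split 'MAJOR' or 'MAJOR.MINOR' into a (major, minor) pair."""
    major, _, minor = str(text).partition(".")
    return int(major), (int(minor) if minor else default_minor)


def in_range(version, min_version, max_version):
    """Return True if version lies between the two bounds, both inclusive."""
    low = parse_version(min_version, default_minor=0)
    # Assumption: a bare major upper bound includes all of its minor versions.
    high = parse_version(max_version, default_minor=math.inf)
    return low <= parse_version(version) <= high


# Versions checked against the range used in the command above.
for candidate in ["51.2", "51.3", "52.76", "53.5", "53.6"]:
    print(candidate, in_range(candidate, "51.3", "53.5"))
```

With these assumptions, 51.3 and 53.5 themselves fall inside the range, while 51.2 and 53.6 fall outside it.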
New documentation string for panoptes_aggregation config
usage: panoptes_aggregation config [-h] [-d DIR] [-v VERSION]
                                   [--min_version MIN_VERSION]
                                   [--max_version MAX_VERSION] [-k KEYWORDS]
                                   [-vv]
                                   workflow_csv workflow_id

Make configuration files for panoptes data extraction and reduction based on a
workflow export

optional arguments:
  -h, --help            show this help message and exit

Load Workflow Files:
  This file can be exported from a project's Data Export tab

  workflow_csv          The csv file containing the workflow data

Workflow ID and version numbers:
  Enter the workflow ID, major version number, and minor version number

  workflow_id           the workflow ID you would like to extract
  -v VERSION, --version VERSION
                        The workflow version to extract. If only a major
                        version is given (e.g. -v 3) all minor versions will
                        be extracted at once. If a minor version is provided
                        (e.g. -v 3.14) only that specific version will be
                        extracted.
  --min_version MIN_VERSION
                        The minimum workflow version to extract (inclusive).
                        This can be provided as either a major version (e.g.
                        --min_version 3) or a major version with a minor
                        version (e.g. --min_version 3.14). If this flag is
                        provided the --version flag will be ignored.
  --max_version MAX_VERSION
                        The maximum workflow version to extract (inclusive).
                        This can be provided as either a major version (e.g.
                        --max_version 3) or a major version with a minor
                        version (e.g. --max_version 3.14). If this flag is
                        provided the --version flag will be ignored.

Other keywords:
  Additional keywords to be passed into the configuration files

  -k KEYWORDS, --keywords KEYWORDS
                        keywords to be passed into the configuration of a task
                        in the form of a json string, e.g. '{"T0":
                        {"dot_freq": "line"} }' (note: double quotes must be
                        used inside the brackets)

Save Config Files:
  The directory to save the configuration files to

  -d DIR, --dir DIR     The directory to save the configuration files to

Other options:
  -vv, --verbose        increase output verbosity
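Because the -k value must be valid JSON with double quotes, one way to avoid quoting mistakes is to build the string with Python's json module and let shlex quote it for the shell. The sketch below only assembles and prints a command line and does not run the tool. The T0 and dot_freq values come from the help text's own example, and the file name, workflow ID, and version are reused from the commands above. Combining -v and -k in one call is an assumption based on the usage line.

```python
import json
import shlex

# Keywords for task T0, as in the help text's example.
keywords = {"T0": {"dot_freq": "line"}}

# json.dumps always emits double-quoted keys and strings, as the flag requires.
keywords_json = json.dumps(keywords)

command = [
    "panoptes_aggregation", "config",
    "penguin-watch-workflows.csv", "6465",
    "-v", "52.76",
    "-k", keywords_json,
]

# shlex.join quotes each argument so the JSON survives the shell intact.
print(shlex.join(command))
```

The printed line can be pasted into a POSIX shell. On other shells the quoting rules differ, so the JSON argument may need different escaping.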
Version 3.7.0
This version adds support for new clustering options for drawing tasks:
- The OPTICS clustering algorithm is now available for all drawing task types
- The "intersection over union" metric can be used for all closed shape drawing task types
A full explanation of all the clustering options can be found in the documentation: https://aggregation-caesar.zooniverse.org/How_Clustering_Works.html
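For readers unfamiliar with the "intersection over union" metric, the following Python sketch computes it for two axis-aligned rectangles. It illustrates the general idea only and is not the package's code: the function name rectangle_iou and the (x, y, width, height) layout are assumptions chosen for this example.

```python
def rectangle_iou(a, b):
    """Intersection over union of two rectangles given as (x, y, width, height)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    # Width and height of the overlapping region (zero if the shapes are disjoint).
    overlap_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    overlap_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    intersection = overlap_w * overlap_h
    union = aw * ah + bw * bh - intersection
    # Two zero-area rectangles have no meaningful overlap; report 0.
    return intersection / union if union > 0 else 0.0


print(rectangle_iou((0, 0, 2, 2), (1, 0, 2, 2)))
```

Identical shapes score 1, disjoint shapes score 0, and partial overlaps fall in between, which is what makes the value usable as a similarity measure for closed shapes.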
Version 3.6.0
This version adds support for the new simple-dropdown task that is part of the new classifier v2.0 on the Zooniverse. Changes were also made to the way dropdown task values show up in the auto-config key lookup table. The hash values used in the classification export are now included in the lookup table alongside the text associated with each dropdown option.
Other changes
Various dependency bumps including changing the minimum version of Pandas to 1.0.0.
Version 3.5
This release mostly focuses on bumping the versions of various dependencies. Most notably, NumPy 1.20 or higher is now required. This ensures that the package installs correctly with pip and that the same version of NumPy is used both to compile the other dependencies (e.g. HDBSCAN) and to run the code.
Minor updates to the OPTICS transcription reducer
- The default gutter_eps value is now 300.0 to better match the kind of data using this reducer.
Version 3.4.5
Additions
- New extractor: all_tasks_empty_extractor to check if every task is empty for a single classification
- New reducer: first_n_true_reducer for checking if the first N items in a boolean list are True
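The check performed by the new reducer can be pictured with the short Python sketch below. It is not the reducer's actual code: the function name first_n_true and its n argument are hypothetical, and it assumes that a list shorter than N does not pass.

```python
def first_n_true(values, n=1):
    """Return True if the list has at least n items and the first n are all True."""
    head = values[:n]
    # Assumption: a list with fewer than n items fails the check.
    return len(head) == n and all(head)


print(first_n_true([True, True, False], n=2))
print(first_n_true([True, True, False], n=3))
```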
Updates
- A flagged field is added to transcription reductions that copies the output of the low_consensus field. This is needed for auto line flagging in ALICE
- Various dependencies have been bumped to their latest versions
Bug fixes
- The dropdown extractor failed if all classifications were blank; this has been fixed
Version 3.4.4
New features
- The ability to handle classifier v2.0 subtasks
Bug fixes
- Text subtasks now work as expected
- Userify will create a "blank" user if Panoptes gives a user not found error
Version 3.4.3
Another bug fix for the text reducer.
Version 3.4.2
This fixes some bugs with the offline version of the text reducer and brings the output of the text reducer more in line with the output of the transcription reducers.
Version 3.3
What's New
Multiprocessing
Multiprocessing is now available for the command line tools. The -c or --cpu_count flag can be used with either the panoptes_aggregation extract or panoptes_aggregation reduce commands. This sets how many CPU cores should be used when processing the CSV files (defaults to 1 core). Running with 2 cores gives a significant speed-up, but using 3 or more cores brings little further improvement in running time.
This option can also be set in the GUI.
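As a rough picture of what a core-count option does, the Python sketch below splits rows across worker processes with the standard multiprocessing module. It is not the package's code: extract_row, process_rows, and the row layout are hypothetical stand-ins, and only the default of 1 core is taken from the text above.

```python
import multiprocessing


def extract_row(row):
    """Hypothetical stand-in for the per-classification work."""
    return {"subject_id": row["subject_id"], "n_marks": len(row["marks"])}


def process_rows(rows, cpu_count=1):
    """Apply extract_row to every row, optionally across several processes."""
    if cpu_count == 1:
        # Single-core path: no pool, so no process start-up overhead.
        return [extract_row(row) for row in rows]
    with multiprocessing.Pool(cpu_count) as pool:
        # pool.map preserves the input order of the rows.
        return pool.map(extract_row, rows)
```

On platforms that start workers by spawning a fresh interpreter, a call such as process_rows(rows, cpu_count=2) should sit under an if __name__ == "__main__": guard. The start-up and data-transfer cost of each extra process is one plausible reason why gains can flatten out as more cores are added.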
Improvements in the running time of the offline reduction code
Previously, the (slow) filters for repeated classifications by a single volunteer were always run on all subjects. The code now checks whether there are any repeated classifications before applying the filters.
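The idea of checking first and filtering only when needed can be sketched in Python as follows. This is an illustration under assumptions, not the package's implementation: the function names, the user and subject keys, and the keep-first policy are all hypothetical.

```python
def has_repeats(classifications):
    """Check whether any (user, subject) pair appears more than once."""
    pairs = [(c["user"], c["subject"]) for c in classifications]
    return len(set(pairs)) < len(pairs)


def drop_repeats(classifications):
    """Keep only the first classification for each (user, subject) pair."""
    seen = set()
    kept = []
    for c in classifications:
        key = (c["user"], c["subject"])
        if key not in seen:
            seen.add(key)
            kept.append(c)
    return kept


def filter_if_needed(classifications):
    """Skip the filter entirely when there is nothing to remove."""
    if not has_repeats(classifications):
        return classifications
    return drop_repeats(classifications)
```

In this toy version the check costs about as much as the filter itself. The saving matters when the real filter is considerably more expensive than the check and repeated classifications are rare.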
General bug fixes
Several bug fixes for extractors and reducers based on errors seen in Caesar.
Version 3.2.1
- Lock down all packages
- Fix deprecation warnings