Allow non-active scalars to be modified in python operators #2061

psavery · 2020-03-05T15:09:25Z

This primarily adds a new function, dataset.set_scalars(), which allows the user to modify non-active scalars in the python operators. Additionally, the user can add new scalars with dataset.set_scalars().

If there are other scalars that are present in the dataset when dataset.set_scalars() is called, and if the dimensions of the other scalars do not match the dimensions of the new scalars, the other scalars are discarded. This is because all scalars must have the same dimensions.

Some other new parts from this PR:

Remove the deep copy of the parent in the external dataset.create_child_dataset(), because that isn't consistent with the internal dataset.
Add a new option to dataset.create_child_dataset(), to specify whether the child should be a volume (default is True) or not. Child datasets that aren't volumes will retain their tilt angles.
Enforce a spacing of 1.0 for the tilt axis in a few places, so that the data does not change spacing in certain operation paths.

Some examples of operations involving multiscalars are attached here:
multiscalars_operators.tar.gz
They include:

Make 3 channels from one channel
Invert all scalars
Bin 2x Tilt Images on multi-scalar data
SIRT Recon on multi-scalar data

Some caveats:

For the internal pipeline, a description.json file must be present that specifies what the result will be named in the results dict, or else it won't read the output. This is not a requirement for the external pipeline.
~~Live reconstruction (setting self.progress.data in the python operators) does not currently work for multiple scalars, and causes a crash. This might should be fixed.~~ The crash is fixed, but it appears that only the active scalars get updated in the render window during live updates. Need to fix that.
~~In the internal pipeline only, if a live update of the data is performed on data in an operator without a description.json file, tomviz crashes.~~

Update: Caveat 1 is part of the master branch (it is not new with this PR), and it is now documented in #2075.

Update: Caveat 3 is also a part of the master branch (it is not new with this PR), but it is not documented anywhere. It is related, however, to an issue mentioned in #2075, where a child data source does not appear to get created properly if there is no description.json file for the internal pipeline.

psavery · 2020-03-07T21:24:03Z

I should also mention that when I perform in an external pipeline these multiscalars operations from above: Multiscalar Bin2x Tilt Series -> Multiscalar Recon, it does not appear to produce reconstruction output (but it produces an output identical to the input instead). It seems like it might be caused by a rare bug under specific circumstances. I'll look into it.

Update: this problem is part of the master branch, and it is not new here. It is now documented as part of #2075.

psavery · 2020-04-08T11:26:30Z

This PR is now functional. It has some bigger changes to the pipeline in 514cc36, which were needed primarily to make the behavior of child-producing operators consistent (#2075), which are needed because multi-scalars operators use child-producing operators. These changes are also a step towards saving intermediate data sources.

The changes in the pipeline result in the snapshot operator not working. But the addition of intermediate data sources should bring the snapshot operator back (since the snapshot operator is essentially creating an intermediate data source).

Previously, the child dataset for the external pipeline would create a deep copy of the parent dataset. This was not done in the internal dataset. Instead, in the internal dataset, the child dataset would be empty, and it would only have spacing and dimensions that matched that of the parent (with the exception of when the parent was a tilt series, then the spacing of the tilt axis of the child would be altered). Additionally, in the internal dataset, if the active_scalars were set on the empty child dataset, a new scalars named 'Scalars' would be created. The changes in this commit cause the external dataset to match the behavior of the internal dataset, in that, the child dataset is an empty dataset with the same spacing as that of the parent (except for the tilt series parent case), and if the active scalars get set on an empty child dataset, a new scalars named 'Scalars' would be created. Signed-off-by: Patrick Avery <[email protected]>

Previously, child data sources from python operators could only be volumes. However, new changes coming soon will allow child data sources to be tilt series as well. Add in a check so that a child data source is appropriately marked as a tilt series if it has tilt angles. Signed-off-by: Patrick Avery <[email protected]>

In the python operator "Dataset" classes, allow the non-active scalars to be modified via "set_scalars()". If there are scalars already present in the dataset, and the newly set scalars have different dimensions compared to the other scalars, the other scalars are discarded. Additionally, this allows a child dataset to be either a volume or another tilt series, so that the user can indicate whether the tilt angles and spacing should be preserved. Signed-off-by: Patrick Avery <[email protected]>

In the EMD format, the spacing of the tilt axis was previously calculated by comparing the distance between the first and second angles. Other formats, though, like the GenericHDF5 format, do not give the tilt axis a spacing. For consistency, and so that the data does not change spacing when a GenericHDF5 file is loaded and used in an external pipeline, use a spacing of 1.0 both in the EMD format, and in the pipeline executor. Signed-off-by: Patrick Avery <[email protected]>

For the internal pipeline, try to set the child data sources' active scalars to match that of the parent if possible. This prevents the active scalars changing due to operations that produce child data sources. Signed-off-by: Patrick Avery <[email protected]>

When live updates are performed on data that contains multiple scalars, the number of scalars can change in between updates. Add in a safety check in the DataPropertiesPanel to account for this. This prevents a crash. Signed-off-by: Patrick Avery <[email protected]>

Signed-off-by: Patrick Avery <[email protected]>

This is for a little more general use than the previous "Pipeline::moveModulesDown()" was. Signed-off-by: Patrick Avery <[email protected]>

Signed-off-by: Patrick Avery <[email protected]>

Previously, if other operators were performed after a child-producing operator (such as reconstruction), the behavior would be somewhat unpredictable, depending on several factors, including whether a description.json file was present, and whether the pipeline was internal or external. This fixes the behavior so that operators can be performed after child-producing operators. This was done by making sure that the output from child-producing operators gets passed down as input to the next operator. This needed to be done both in the external pipeline, and in the internal pipeline. Additionally, this also required that modules be passed down to the output of the next operator as well. This commit additionally allows child-producing operators to work in the internal pipeline without the description.json file present, which was not possible before (it required that the child be named explicitly). The external pipeline should not emit that the pipeline is finished if a child datasource is encountered. That was removed. The snapshot operator does not have any useful behavior, currently. But it should work well with intermediate data sources, since it will essentially create an intermediate data source. Signed-off-by: Patrick Avery <[email protected]>

ssize_t is producing an error on Windows. Signed-off-by: Patrick Avery <[email protected]>

This ensures the modules get cleaned up. Signed-off-by: Patrick Avery <[email protected]>

psavery · 2020-04-09T21:12:42Z

To provide an update on this branch: most things seem to be working very well. You can perform multi-scalars operations (which produce children) and non-child-producing operators, and they can both be used as a part of the pipeline, internal and external. You can add/remove operators as well - the pipeline appears to be flexible.

There may be other bugs we haven't found yet, but the one big bug that's hanging up this PR is that, if you have operators in your pipeline, and you close the application, it crashes. Unfortunately, the backtrace from gdb varies from run to run, and it is not very clear what the problem is. Most likely, it is some kind of memory issue (deleting an object twice, accessing an array out-of-bounds, etc.). Once we figure out what the issue is, this will hopefully be ready to merge.

There is a new datasource every time now, so this is not necessary. Additionally, this is causing crashing when closing the application when operators are present, and it is also causing crashing occasionally when deleting datasources that have operators. Signed-off-by: Patrick Avery <[email protected]>

psavery · 2020-04-09T22:08:58Z

The bug seems to be fixed. This PR now seems fairly robust - I have a hard time making it crash. I did encounter one crash, though, that is not fixed:

If you have two operators, and a volume module on the output, and you delete the first operator, it crashes. This doesn't occur for me with the slice/contour modules, nor with only one operator present, nor by deleting the second operator. It's a very specific crash it seems.

Backtrace shows it has something to do with the histogram:

Thread 1 "tomviz" received signal SIGSEGV, Segmentation fault.
0x00005555559db872 in QScopedPointer<tomviz::DataSource::DSInternals, QScopedPointerDeleter<tomviz::DataSource::DSInternals> >::operator-> (this=0x10) at /usr/include/x86_64-linux-gnu/qt5/QtCore/qscopedpointer.h:118
118	        return d;
(gdb) where
#0  0x00005555559db872 in QScopedPointer<tomviz::DataSource::DSInternals, QScopedPointerDeleter<tomviz::DataSource::DSInternals> >::operator->() const (this=0x10) at /usr/include/x86_64-linux-gnu/qt5/QtCore/qscopedpointer.h:118
#1  0x00005555559d6cfa in tomviz::DataSource::proxy() const (this=0x0) at ../tomviz/DataSource.cxx:625
#2  0x00005555559d9694 in tomviz::DataSource::algorithm() const (this=0x0) at ../tomviz/DataSource.cxx:1250
#3  0x00005555559d96be in tomviz::DataSource::dataObject() const (this=0x0) at ../tomviz/DataSource.cxx:1255
#4  0x0000555555b67148 in tomviz::ModuleVolume::<lambda(vtkSmartPointer<vtkImageData>, vtkSmartPointer<vtkImageData>)>::operator()(vtkSmartPointer<vtkImageData>, vtkSmartPointer<vtkImageData>) const (__closure=0x55555c2085b0, image=..., histogram2D=...)
    at ../tomviz/modules/ModuleVolume.cxx:43
#5  0x0000555555b69fc3 in QtPrivate::FunctorCall<QtPrivate::IndexesList<0, 1>, QtPrivate::List<vtkSmartPointer<vtkImageData>, vtkSmartPointer<vtkImageData> >, void, tomviz::ModuleVolume::ModuleVolume(QObject*)::<lambda(vtkSmartPointer<vtkImageData>, vtkSmartPointer<vtkImageData>)> >::call(tomviz::ModuleVolume::<lambda(vtkSmartPointer<vtkImageData>, vtkSmartPointer<vtkImageData>)> &, void **) (f=..., arg=0x7fffffffd4a0) at /usr/include/x86_64-linux-gnu/qt5/QtCore/qobjectdefs_impl.h:130
#6  0x0000555555b69f2b in QtPrivate::Functor<tomviz::ModuleVolume::ModuleVolume(QObject*)::<lambda(vtkSmartPointer<vtkImageData>, vtkSmartPointer<vtkImageData>)>, 2>::call<QtPrivate::List<vtkSmartPointer<vtkImageData>, vtkSmartPointer<vtkImageData> >, void>(tomviz::ModuleVolume::<lambda(vtkSmartPointer<vtkImageData>, vtkSmartPointer<vtkImageData>)> &, void *, void **) (f=..., arg=0x7fffffffd4a0) at /usr/include/x86_64-linux-gnu/qt5/QtCore/qobjectdefs_impl.h:240
#7  0x0000555555b69e73 in QtPrivate::QFunctorSlotObject<tomviz::ModuleVolume::ModuleVolume(QObject*)::<lambda(vtkSmartPointer<vtkImageData>, vtkSmartPointer<vtkImageData>)>, 2, QtPrivate::List<vtkSmartPointer<vtkImageData>, vtkSmartPointer<vtkImageData> >, void>::impl(int, QtPrivate::QSlotObjectBase *, QObject *, void **, bool *) (which=1, this_=0x55555c2085a0, r=0x55555c0af410, a=0x7fffffffd4a0, ret=0x0) at /usr/include/x86_64-linux-gnu/qt5/QtCore/qobject_impl.h:168
#8  0x00007fffed89366f in QMetaObject::activate(QObject*, int, int, void**) () at /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#9  0x00005555559767ba in tomviz::HistogramManager::histogram2DReady(vtkSmartPointer<vtkImageData>, vtkSmartPointer<vtkImageData>) (this=0x5555560c69c0 <tomviz::HistogramManager::instance()::theInstance>, _t1=..., _t2=...)
    at tomviz/tomvizlib_autogen/EWIEGA46WW/moc_HistogramManager.cpp:164
#10 0x0000555555a155e1 in tomviz::HistogramManager::histogram2DReadyInternal(vtkSmartPointer<vtkImageData>, vtkSmartPointer<vtkImageData>) (this=0x5555560c69c0 <tomviz::HistogramManager::instance()::theInstance>, image=..., histogram=...)
    at ../tomviz/HistogramManager.cxx:316
#11 0x00005555559763ea in tomviz::HistogramManager::qt_static_metacall(QObject*, QMetaObject::Call, int, void**) (_o=0x5555560c69c0 <tomviz::HistogramManager::instance()::theInstance>, _c=QMetaObject::InvokeMetaMethod, _id=3, _a=0x7fffac106de0)
    at tomviz/tomvizlib_autogen/EWIEGA46WW/moc_HistogramManager.cpp:95
#12 0x00007fffed8940c2 in QObject::event(QEvent*) () at /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#13 0x00007ffff5d7983c in QApplicationPrivate::notify_helper(QObject*, QEvent*) () at /usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5
---Type <return> to continue, or q <return> to quit---
#14 0x00007ffff5d81104 in QApplication::notify(QObject*, QEvent*) () at /usr/lib/x86_64-linux-gnu/libQt5Widgets.so.5
#15 0x00007fffed8648d8 in QCoreApplication::notifyInternal2(QObject*, QEvent*) () at /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#16 0x00007fffed86704d in QCoreApplicationPrivate::sendPostedEvents(QObject*, int, QThreadData*) ()
    at /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#17 0x00007fffed8be263 in  () at /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#18 0x00007fffde996417 in g_main_context_dispatch () at /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
#19 0x00007fffde996650 in  () at /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
#20 0x00007fffde9966dc in g_main_context_iteration () at /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
#21 0x00007fffed8bd88f in QEventDispatcherGlib::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) ()
    at /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#22 0x00007fffed86290a in QEventLoop::exec(QFlags<QEventLoop::ProcessEventsFlag>) () at /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#23 0x00007fffed86b9b4 in QCoreApplication::exec() () at /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#24 0x0000555555951deb in main(int, char**) (argc=1, argv=0x7fffffffde18) at ../tomviz/main.cxx:60

psavery · 2020-04-10T16:24:32Z

Here are a few things to think about for the design of this branch when we return to it:

In both master and this branch, python operators create a new child data source when:
a) There is a progress update with data
b) There is a result dict returned from the operator

If a child data source was not created, it is assumed that the operator's input vtkDataObject was modified in place. That input vtkDataObject is what gets passed along to the next operator.

In the master branch, if a child data source was created, the input vtkDataObject is just ignored and the child data source is used as the output. This would result in future operators "not seeing" the changes made by child-producing operators. However, in this branch, the data inside the child data source is set on the input vtkDataObject, and since that gets passed along to the next operator, the next operator can see the changes. In this way, this branch moves us more toward a proper pipeline approach, I think, where the input data object gets modified is the output.

In both master and this branch, there is always a child data source at the end of a chain of operators, that is considered the final output. In master, this data source gets re-used and moved up/down when operators are removed/added. In this branch, a new data source is created each time to be used at the end when operators are added or taken away. This is something that can change, I think we might be able to move this branch back to re-using data sources if that is desired.

When we introduce intermediate/persistent data sources into the pipeline chain, however, the intermediate data sources will not be moved up/down when the operators are removed/added, but a new one will need to be created. It's something we'll need to consider, and we might need to consider moving modules to/from the intermediate data sources when they are created/destroyed.

cjh1 · 2020-12-01T20:29:04Z

@psavery We should revisit this and get it merged.

psavery force-pushed the dataset-set-scalars branch 2 times, most recently from 6018bca to fe7fbfd Compare March 10, 2020 13:55

psavery mentioned this pull request Mar 12, 2020

Intermediate data sources when running multiple child-producing operators #2075

Open

psavery force-pushed the dataset-set-scalars branch 2 times, most recently from ce045ee to 514cc36 Compare April 7, 2020 09:28

psavery marked this pull request as ready for review April 8, 2020 14:52

psavery force-pushed the dataset-set-scalars branch 2 times, most recently from 643374c to ea9ce44 Compare April 9, 2020 19:08

psavery added 13 commits April 9, 2020 16:38

Add method to get keys from python dict

a917344

Signed-off-by: Patrick Avery <[email protected]>

Add Dict functions to get size and delete items

7e21bde

Signed-off-by: Patrick Avery <[email protected]>

Added moveModules function to ModuleManager

73ba20e

This is for a little more general use than the previous "Pipeline::moveModulesDown()" was. Signed-off-by: Patrick Avery <[email protected]>

Remove unused header

0ea3789

Signed-off-by: Patrick Avery <[email protected]>

Use long long instead of ssize_t

f4c2d66

ssize_t is producing an error on Windows. Signed-off-by: Patrick Avery <[email protected]>

When a datasource is destroyed, remove all modules

4482381

This ensures the modules get cleaned up. Signed-off-by: Patrick Avery <[email protected]>

psavery force-pushed the dataset-set-scalars branch from ea9ce44 to b4616b7 Compare April 9, 2020 21:24

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Allow non-active scalars to be modified in python operators #2061

Allow non-active scalars to be modified in python operators #2061

psavery commented Mar 5, 2020 •

edited

Loading

psavery commented Mar 7, 2020 •

edited

Loading

psavery commented Apr 8, 2020

psavery commented Apr 9, 2020

psavery commented Apr 9, 2020

psavery commented Apr 10, 2020 •

edited

Loading

cjh1 commented Dec 1, 2020

Allow non-active scalars to be modified in python operators #2061

Are you sure you want to change the base?

Allow non-active scalars to be modified in python operators #2061

Conversation

psavery commented Mar 5, 2020 • edited Loading

psavery commented Mar 7, 2020 • edited Loading

psavery commented Apr 8, 2020

psavery commented Apr 9, 2020

psavery commented Apr 9, 2020

psavery commented Apr 10, 2020 • edited Loading

cjh1 commented Dec 1, 2020

psavery commented Mar 5, 2020 •

edited

Loading

psavery commented Mar 7, 2020 •

edited

Loading

psavery commented Apr 10, 2020 •

edited

Loading