
Algorithms don't work across both dicom and luna scans #229

Open · 2 of 4 tasks
reubano opened this issue Nov 17, 2017 · 13 comments

Comments


reubano commented Nov 17, 2017

Overview

Currently, the classification algorithm is tested only with a luna scan, while the segmentation and identification algorithms are tested only with dicom scans. This leaves open the possibility that each algorithm works solely with the scan format it happens to be tested against. I confirmed this by adding tests for the complementary scan format in each case, with the following results (a sketch of such a complementary test follows the table):

algorithm        dicom    luna
classification   FAILS    PASSES
segmentation     PASSES   FAILS
identification   crashes  crashes
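
For reference, this is roughly the shape the complementary tests take. The sketch below is hypothetical, not the project's actual test code: the fixture names (dicom_paths, metaimage_path, nodule_locations, model_path) are borrowed from the tracebacks further down, the import path is assumed, and the parametrize-by-fixture-name idiom uses pytest's request.getfixturevalue.

import pytest

from src.algorithms.classify import trained_model


@pytest.mark.parametrize('scan_fixture', ['dicom_paths', 'metaimage_path'])
def test_classify_both_formats(scan_fixture, nodule_locations, model_path, request):
    # Resolve the fixture by name so that a single test body covers both
    # the dicom scan and the luna scan.
    scan = request.getfixturevalue(scan_fixture)
    predicted = trained_model.predict(scan, nodule_locations, model_path)
    assert predicted is not None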

Expected Behavior

All algorithm tests should pass for both dicom and luna scans.

Technical details

Below are the specific tracebacks:

test_classification.py

(alcf) ⚡ docker-compose -f local.yml run prediction pytest -vrsk src/tests/test_classification.py
========================================== test session starts ==========================================
platform darwin -- Python 3.6.2, pytest-3.1.3, py-1.4.34, pluggy-0.4.0 -- /opt/local/bin/python
cachedir: .cache
rootdir: /Users/reubano/Documents/Projects/alcf/prediction, inifile:
collected 3 items

src/tests/test_classification.py::test_classify_predict_load PASSED
src/tests/test_classification.py::test_classify_dicom FAILED
src/tests/test_classification.py::test_classify_luna PASSED

=============================================== FAILURES ================================================
__________________________________________ test_classify_dicom __________________________________________

dicom_paths = ['/Users/reubano/Documents/Projects/alcf/images_full/LIDC-IDRI-0002/1.3.6.1.4.1.14519.5.2.1.6279.6001.4901573811602007...14519.5.2.1.6279.6001.298806137288633453246975630178/1.3.6.1.4.1.14519.5.2.1.6279.6001.179049373636438705059720603192']
nodule_locations = [{'x': 50, 'y': 50, 'z': 21}]
model_path = '/Users/reubano/Documents/Projects/alcf/prediction/src/algorithms/classify/assets/gtr123_model.ckpt'

    def test_classify_dicom(dicom_paths, nodule_locations, model_path):
>       predicted = trained_model.predict(dicom_paths, nodule_locations, model_path)

src/tests/test_classification.py:9:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
src/algorithms/classify/trained_model.py:40: in predict
    return gtr123_model.predict(dicom_path, centroids, model_path)
src/algorithms/classify/src/gtr123_model.py:264: in predict
    ct_array, meta = preprocess(*load_ct(ct_path))
src/preprocess/load_ct.py:96: in load_ct
    dicom_pattern = os.path.join(path, '*.dcm')
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

a = ['/Users/reubano/Documents/Projects/alcf/images_full/LIDC-IDRI-0002/1.3.6.1.4.1.14519.5.2.1.6279.6001.4901573811602007...14519.5.2.1.6279.6001.298806137288633453246975630178/1.3.6.1.4.1.14519.5.2.1.6279.6001.179049373636438705059720603192']
p = ('*.dcm',)

    def join(a, *p):
        """Join two or more pathname components, inserting '/' as needed.
        If any component is an absolute path, all previous path components
        will be discarded.  An empty last part will result in a path that
        ends with a separator."""
>       a = os.fspath(a)
E       TypeError: expected str, bytes or os.PathLike object, not list

../../../../.virtualenvs/alcf/lib/python3.6/posixpath.py:78: TypeError
================================== 1 failed, 2 passed in 12.92 seconds ==================================
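
The classification failure boils down to load_ct assuming its path argument is a directory it can glob for *.dcm files, while the dicom test passes a list of file paths instead. Below is a minimal sketch of a guard that would accept both inputs; only load_ct and dicom_pattern come from the traceback, the rest is an assumption about how a fix could look:

import glob
import os


def load_ct(path):
    # Sketch: normalize the input before globbing. If we were handed a
    # list of .dcm files, use it as-is; otherwise treat `path` as a
    # directory, as the current code does.
    if isinstance(path, list):
        dicom_files = path
    else:
        dicom_pattern = os.path.join(path, '*.dcm')
        dicom_files = sorted(glob.glob(dicom_pattern))
    return dicom_files  # the existing loading logic would continue from here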

test_segmentation.py

(alcf) ⚡ docker-compose -f local.yml run prediction pytest -vrsk src/tests/test_segmentation.py
========================================== test session starts ==========================================
platform darwin -- Python 3.6.2, pytest-3.1.3, py-1.4.34, pluggy-0.4.0 -- /opt/local/bin/python
cachedir: .cache
rootdir: /Users/reubano/Documents/Projects/alcf/prediction, inifile:
collected 5 items

src/tests/test_segmentation.py::test_correct_paths PASSED
src/tests/test_segmentation.py::test_segment_predict_load PASSED
src/tests/test_segmentation.py::test_segment_dicom PASSED
src/tests/test_segmentation.py::test_segment_luna FAILED
src/tests/test_segmentation.py::test_lung_segmentation SKIPPED
======================================== short test summary info ========================================
SKIP [1] src/tests/test_segmentation.py:38: Takes very long

=============================================== FAILURES ================================================
___________________________________________ test_segment_luna ___________________________________________

metaimage_path = '/Users/reubano/Documents/Projects/alcf/images/LUNA-0001/1.3.6.1.4.1.14519.5.2.1.6279.6001.102133688497886810253331438797/1.3.6.1.4.1.14519.5.2.1.6279.6001.102133688497886810253331438797.mhd'
luna_nodule = {'x': 0, 'y': 100, 'z': 556}

    def test_segment_luna(metaimage_path, luna_nodule):
>       predicted = predict(metaimage_path, [luna_nodule])

src/tests/test_segmentation.py:33:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
src/algorithms/segment/trained_model.py:58: in predict
    volumes = calculate_volume(segment_path, centroids)
src/algorithms/segment/trained_model.py:85: in calculate_volume
    labels = [mask[centroid['x'], centroid['y'], centroid['z']] for centroid in centroids]
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

.0 = <list_iterator object at 0x13b1a4f28>

>   labels = [mask[centroid['x'], centroid['y'], centroid['z']] for centroid in centroids]
E   IndexError: index 556 is out of bounds for axis 2 with size 128

src/algorithms/segment/trained_model.py:85: IndexError
============================ 1 failed, 3 passed, 1 skipped in 43.85 seconds =============================
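
Here the luna centroid (z=556) is expressed in the original scan's voxel space, while the mask indexed in calculate_volume is only 128 voxels along that axis, so the coordinates and the mask disagree about resolution. One possible shape of a fix, purely as a sketch (the rescaling idea is an assumption; only the mask and the centroid dicts come from the traceback):

def rescale_centroid(centroid, mask_shape, scan_shape):
    # Sketch: map a centroid given in the original scan's voxel space
    # onto a mask produced at a different resolution, axis by axis.
    return {axis: int(centroid[axis] * m / s)
            for axis, m, s in zip('xyz', mask_shape, scan_shape)}

Whether the right fix is to rescale the centroids, return a full-resolution mask, or validate the indices and fail early is exactly what the failing test should pin down.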

test_identification.py

(alcf) ⚡ docker-compose -f local.yml run -e RUN_SLOW_TESTS=true prediction pytest -vrsk src/tests/test_identification.py::test_identify_dicom_001
========================================== test session starts ==========================================
platform linux -- Python 3.6.3, pytest-3.1.3, py-1.5.2, pluggy-0.4.0 -- /usr/bin/python3.6
cachedir: .cache
rootdir: /app, inifile:
collected 1 item 

src/tests/test_identification.py::test_identify_dicom_001 

(alcf) ⚡ docker-compose -f local.yml run -e RUN_SLOW_TESTS=true prediction pytest -vrsk src/tests/test_identification.py::test_identify_dicom_003
========================================== test session starts ==========================================
platform linux -- Python 3.6.3, pytest-3.1.3, py-1.5.2, pluggy-0.4.0 -- /usr/bin/python3.6
cachedir: .cache
rootdir: /app, inifile:
collected 1 item 

src/tests/test_identification.py::test_identify_dicom_003

(alcf) ⚡ docker-compose -f local.yml run -e RUN_SLOW_TESTS=true prediction pytest -vrsk src/tests/test_identification.py::test_identify_luna
========================================== test session starts ===========================================
platform linux -- Python 3.6.3, pytest-3.1.3, py-1.5.2, pluggy-0.4.0 -- /usr/bin/python3.6
cachedir: .cache
rootdir: /app, inifile:
collected 1 item 

src/tests/test_identification.py::test_identify_luna

Steps to Reproduce

To replicate these results, first check out the commit below and apply the attached patch file:

git checkout 8380847
git apply patch.txt

Acceptance criteria

NOTES

  • All PRs must follow the standard PR checklist
  • Full points will be awarded separately for each boxed issue
  • A PR may include fixes for one or more boxed issues
  • For the above boxed issues, a check means the issue has been solved and a ... means someone is working on a PR
  • Once you start working on a boxed issue, please indicate so in the comments below

WGierke commented Nov 19, 2017

Once #237 is merged, I'd like to work on "segmentation test(s) should pass for luna scans".


WGierke commented Nov 19, 2017

I took a quick glance at the identification problems. When running the tests, I get a RuntimeError: $ Torch: not enough memory: you tried to allocate 21GB. Buy new RAM! at /pytorch/torch/lib/TH/THGeneral.c:270. This basically means that our client / web server (depending on where the project will be deployed in the end) needs at least 21 GB of RAM. Is it realistic to assume that this will be the case? Otherwise we might need to crunch the model a bit ...
PS: 👍 for providing tests that need to pass :)
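
For anyone hitting the same wall on a smaller machine, generic PyTorch inference-memory measures may help while debugging, though none of this is specific to the gtr123 identification model. In the sketch below the network and input are stand-ins, the chunk size is illustrative, and torch.no_grad() is the modern spelling (on the 0.3-era API used in this thread, Variable(..., volatile=True) played the same role):

import torch
import torch.nn as nn

# Stand-ins for the real identification network and CT input:
model = nn.Sequential(nn.Conv3d(1, 8, 3), nn.ReLU())
ct_volume = torch.randn(32, 1, 32, 32, 32)  # batch of 3D patches

model.eval()
with torch.no_grad():  # skip storing activations for backprop
    # Run the volume through in slabs instead of one giant batch, so the
    # peak allocation stays bounded by the chunk size.
    outputs = [model(chunk) for chunk in ct_volume.split(8, dim=0)]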


reubano commented Nov 23, 2017

This basically means that our client / web server (depending on where the project will be deployed in the end) needs at least 21 GB of RAM. Is it realistic to assume that this will be the case?

I'd be inclined to say 'no', but I suppose @isms or @pjbull would know for sure.


isms commented Nov 23, 2017

Per the design doc, the goal here is running the app on premise on a local computer, not optimizing to move lots of packets across networks. We can assume a workstation with a reasonable amount of RAM.

For practical purposes we are running the demo on a machine without unlimited RAM so it's not boundless. But it's an AWS c5.large or something similar so RAM is not in short supply.


WGierke commented Dec 1, 2017

@reubano could you do me a favor and check out my branch 229_fix_segmentation? All I did was apply the patch for the segmentation tests and merge the current master. The outcome was that all segmentation tests passed (I also ran the slow tests):

➜  concept-to-clinic git:(229_fix_segmentation) ✗ docker-compose -f local.yml run prediction pytest -vrsk src/tests/test_segmentation.py
Starting base ... 
Starting base ... done
======================================== test session starts ========================================
platform linux -- Python 3.6.1, pytest-3.1.3, py-1.4.34, pluggy-0.4.0 -- /usr/bin/python3.6
cachedir: .cache
rootdir: /app, inifile:
collected 7 items 

src/tests/test_segmentation.py::test_correct_paths PASSED
src/tests/test_segmentation.py::test_segment_predict_load PASSED
src/tests/test_segmentation.py::test_segment_dicom PASSED
src/tests/test_segmentation.py::test_segment_luna PASSED
src/tests/test_segmentation.py::test_nodule_segmentation PASSED
src/tests/test_segmentation.py::test_lung_segmentation PASSED
src/tests/test_segmentation.py::test_stop_timeout PASSED

==================================== 7 passed in 514.55 seconds =====================================

I think the issues behind the segmentation tests that were failing when this issue was opened may have been resolved in #237. Can you confirm?

This was referenced Dec 3, 2017
WGierke added a commit to WGierke/concept-to-clinic that referenced this issue Dec 3, 2017
lamby pushed a commit that referenced this issue Dec 4, 2017

lamby commented Dec 4, 2017

merged in 40c1b41 :)

lamby closed this as completed Dec 4, 2017
reubano reopened this Dec 4, 2017

reubano commented Dec 4, 2017

Leaving open until all sub-issues are solved.


WGierke commented Dec 17, 2017

@isms I understand that RAM usage is not a hard constraint. However, this makes it impossible for me (and others with less than 32 GB of RAM) to debug the model (and improve it, as described in #130). Do you see any way to address this problem?


reubano commented Dec 19, 2017

For practical purposes we are running the demo on a machine without unlimited RAM so it's not boundless. But it's an AWS c5.large or something similar so RAM is not in short supply.

According to the c5 spec page, a c5.large has only 4 GB of RAM; the c5.4xlarge, however, has 32 GB and is priced at $0.680/hr. Alternatively, the r4.xlarge has less compute power than the c5.4xlarge but a similar amount of RAM, for $0.266/hr.
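For a rough sense of debugging cost at those rates: a full 24-hour day comes to about $16.32 on the c5.4xlarge ($0.680 × 24) versus roughly $6.38 on the r4.xlarge ($0.266 × 24).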

@WGierke looks like your best bet may be to spin up an EC2 instance.


isms commented Dec 19, 2017

I can understand that RAM usage is not a hard constraint. However, this makes it impossible for me (and others with less than 32GB RAM) to debug (and improve as described in #130 ) the model. Do you see any possibility to address this problem?

@WGierke I know that @caseyfitz (on our team) is looking into the models, and we can potentially report out some accuracy statistics from the status quo model in case it informs a parallel approach that uses less memory. That being said, we don't know how long that would take, so if you are blocked by the system requirements, I'm not sure what to recommend other than using a cloud VM with higher specs, as @reubano suggested.


WGierke commented Dec 19, 2017

@isms Alright, thanks a lot for the insights and the feedback. I'll have a look at how I can solve this problem for me. Thanks!


Raab70 commented Dec 19, 2017

My 2 cents, as someone who is lurking and interested in contributing: lower the barrier to entry as much as possible. If AWS is the answer, can we use Terraform or CloudFormation to create a template that makes it easy to spin up a dev instance, or an instance that runs the test suite? And make sure the costs of any such template are well documented so people know what they're getting into ahead of time.

Also, I have experience with Terraform and ECS if you want me to try to put something like that together.


isms commented Dec 19, 2017

@Raab70 That sounds like quite a bit of "meta" work given the limited amount of time left in the competition! That being said, if you think it would be helpful, feel free to give it a try and report back.

isms modified the milestones: 2-feature-building, 3-packaging Jan 5, 2018
isms removed this from the 3-packaging milestone Jan 5, 2018
louisgv pushed a commit to louisgv/concept-to-clinic that referenced this issue Jan 16, 2018