SampleSupervisor #101

rabii-chaarani · 2024-06-12T02:18:21Z

Description

This PR adds the new SampleSupervisor class, which handles all sample storage and processing operations, both within Project() and outside of it. The class allows simple calls to get samples for any calculation.
This PR also adds various tests for SampleSupervisor, and the class passes all of them. If you have any other test cases to propose, please let me know.
As more features are added over time, I expect to rewrite SampleSupervisor to use graphs so that the necessary computations are processed in the correct order.

Fixes #(issue)

Type of change

Documentation update
Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
Test improvement

How Has This Been Tested?

All tests performed are included in tests/sample_storage/test_sample_storage.py

Checklist:

This branch is up-to-date with master
All gh-action checks are passing
I have performed a self-review of my own code
My code follows the style guidelines of this project
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
I have added tests that prove my fix is effective or that my feature works
My tests run with pytest from the map2loop folder
New and existing tests pass locally with my changes

Checklist continued (if PR includes changes to documentation)

I have built the documentation locally with make.bat
I have built this documentation in docker, following the docker configuration in map2loop/docs
I have checked my spelling and grammar

lachlangrose

There are a lot of changes in this pull request, which makes it a bit difficult to follow. I have left some comments on the code, but I am not sure I follow the design/reasoning completely. I think it is important to think about how this class will interact with the other classes/methods within map2loop. I think a plan, either as a design document and/or a flow chart, would really help with understanding these changes.

When I initially suggested this class, my intention was to have a class with a get_sample() method for each type of sample, which either calls the appropriate sampler or returns the stored data, depending on the current state. This would mean that within other samplers there could be a call to sample_supervisor.get_samples for any other data type (although some care would be needed to prevent circular calls). The motivation was to allow the user to change the underlying data or the sampling algorithm parameters, with map2loop then recomputing only what is required. From what I could follow, the current implementation does allow for this, but the order of the calls is defined by the supervisor instead of the samplers. I think it would make more sense for the samplers to control the order, because what each sampler actually uses depends on the algorithm.
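To make the idea above concrete, here is a minimal sketch of a supervisor in which the samplers, not the supervisor, decide what they depend on. It is not the code in this PR: the two-member SampleType enum, the callable samplers, and the names _dependents, _stack and _invalidate are all assumptions made for illustration. A sampler asks the supervisor for any other sample it needs, the supervisor records who asked, and replacing a sampler drops only the stored samples that were computed from it.

```python
from enum import Enum, auto
from typing import Any, Callable, Dict, List, Set


class SampleType(Enum):
    # Hypothetical subset of sample types, for illustration only.
    CONTACT = auto()
    FAULT = auto()


Sampler = Callable[["SampleSupervisor"], Any]


class SampleSupervisor:
    """Caches samples and recomputes only those whose inputs changed."""

    def __init__(self) -> None:
        self._samplers: Dict[SampleType, Sampler] = {}
        self._samples: Dict[SampleType, Any] = {}
        # For each sample type, the types whose samplers requested it.
        self._dependents: Dict[SampleType, Set[SampleType]] = {}
        # Sample types currently being computed, innermost last.
        self._stack: List[SampleType] = []

    def set_sampler(self, sampletype: SampleType, sampler: Sampler) -> None:
        self._samplers[sampletype] = sampler
        self._invalidate(sampletype)

    def _invalidate(self, sampletype: SampleType) -> None:
        # Drop the stored sample and everything that was computed from it.
        self._samples.pop(sampletype, None)
        for dependent in self._dependents.pop(sampletype, set()):
            self._invalidate(dependent)

    def get_sample(self, sampletype: SampleType) -> Any:
        if sampletype in self._stack:
            raise RuntimeError(f"circular sample request for {sampletype.name}")
        if self._stack:
            # Record which sampler asked, so it can be invalidated later.
            self._dependents.setdefault(sampletype, set()).add(self._stack[-1])
        if sampletype not in self._samples:
            self._stack.append(sampletype)
            try:
                self._samples[sampletype] = self._samplers[sampletype](self)
            finally:
                self._stack.pop()
        return self._samples[sampletype]


calls = {"contact": 0, "fault": 0}


def sample_contacts(supervisor: SampleSupervisor) -> list:
    calls["contact"] += 1
    return ["c1", "c2"]


def sample_faults(supervisor: SampleSupervisor) -> int:
    calls["fault"] += 1
    # The sampler, not the supervisor, decides what it depends on.
    return len(supervisor.get_sample(SampleType.CONTACT))


supervisor = SampleSupervisor()
supervisor.set_sampler(SampleType.CONTACT, sample_contacts)
supervisor.set_sampler(SampleType.FAULT, sample_faults)

supervisor.get_sample(SampleType.FAULT)  # computes CONTACT, then FAULT
supervisor.get_sample(SampleType.FAULT)  # served from storage
supervisor.set_sampler(SampleType.CONTACT, lambda s: ["c1", "c2", "c3"])
result = supervisor.get_sample(SampleType.FAULT)  # both are recomputed
```

The in-progress stack is what guards against circular calls: a sampler that ends up requesting its own sample type raises instead of recursing forever.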

I think that the calls to the project should not exist, and that these functions (calculate_fault_orientations, summarise_fault_data, extract_geology_contacts) should all be run by sampler-type classes that are managed by this class.

My last comment is that there seem to be some changes to m2l included in this pull request that are not really related to the sample supervisor. I know it is a pain to separate them, but it really makes reviewing the pull request easier if only the relevant changes are in the PR.

lachlangrose · 2024-06-13T00:00:57Z

map2loop/sample_storage.py

+    from .project import Project
+
+
+class AccessStorage(ABC):


Is it really necessary to have an abstract class for this? I can't think of a case where there are multiple sample supervisors

lachlangrose · 2024-06-13T00:03:24Z

map2loop/sample_storage.py

+    def type(self):
+        return self.storage_label
+
+    def set_default_samplers(self):


Shouldn't default be chosen with appropriate parameterisation given the map/dataset?

lachlangrose · 2024-06-13T00:04:20Z

map2loop/sample_storage.py

+        self.sampler_dirtyflags[sampletype] = True
+
+    @beartype.beartype
+    def get_sampler(self, sampletype: SampleType):


Should this return the sample, not the sample type? It should also have a type hint for the return type.

lachlangrose · 2024-06-13T00:04:59Z

map2loop/sample_storage.py

+        return self.samplers[sampletype].sampler_label
+
+    @beartype.beartype
+    def store(self, sampletype: SampleType, data):


What type is data? Does this matter? Should there be any validation?
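As one possible answer to the validation question, here is a rough sketch of a store() that checks its input before keeping it. Everything specific in it is an assumption: it treats a sample as a list of dict-like records, and the field names in REQUIRED_FIELDS are placeholders, not the columns map2loop actually uses. If the stored samples are pandas DataFrames, the same kind of check could be made against data.columns instead.

```python
from collections.abc import Mapping
from enum import Enum, auto
from typing import Any, Dict, Sequence


class SampleType(Enum):
    # Hypothetical subset of sample types, for illustration only.
    CONTACT = auto()
    STRUCTURE = auto()


# Placeholder field names; the real required columns may differ.
REQUIRED_FIELDS: Dict[SampleType, Sequence[str]] = {
    SampleType.CONTACT: ("X", "Y", "featureId"),
    SampleType.STRUCTURE: ("X", "Y", "DIP", "DIPDIR"),
}


def validate_sample(sampletype: SampleType, data: Any) -> None:
    """Raise if data is not a list of records carrying the required fields."""
    if not isinstance(data, list):
        raise TypeError(
            f"{sampletype.name} samples must be a list, got {type(data).__name__}"
        )
    for index, record in enumerate(data):
        if not isinstance(record, Mapping):
            raise TypeError(f"{sampletype.name} record {index} is not a mapping")
        missing = [f for f in REQUIRED_FIELDS[sampletype] if f not in record]
        if missing:
            raise ValueError(f"{sampletype.name} record {index} is missing {missing}")


class SampleStore:
    """Keeps one validated sample per sample type."""

    def __init__(self) -> None:
        self._samples: Dict[SampleType, list] = {}

    def store(self, sampletype: SampleType, data: Any) -> None:
        # Reject malformed input here, before any sampler relies on it.
        validate_sample(sampletype, data)
        self._samples[sampletype] = data

    def get(self, sampletype: SampleType) -> list:
        return self._samples[sampletype]


store = SampleStore()
store.store(SampleType.CONTACT, [{"X": 1.0, "Y": 2.0, "featureId": 0}])
```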

lachlangrose · 2024-06-13T00:16:02Z

map2loop/sample_storage.py

+        datatype = Datatype(sampletype)
+
+        if datatype == Datatype.DTM:
+            self.map_data.load_raster_map_data(datatype)
+
+        else:
+            # load map data
+            self.map_data.load_map_data(datatype)


I know this is from the mapdata class, but why can't we just have map_data.load(datatype), which takes care of what type the data is? That would probably be more future-proof, as raster data could also include geophysical grids.
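Something along these lines is what I have in mind. This is only a sketch: the MapData class here is a stand-in with stub loaders, the Datatype members are a made-up subset, and RASTER_TYPES is a hypothetical attribute. Only the method names load_map_data and load_raster_map_data are taken from the diff above.

```python
from enum import Enum, auto


class Datatype(Enum):
    # Hypothetical subset of datatypes, for illustration only.
    GEOLOGY = auto()
    STRUCTURE = auto()
    DTM = auto()


class MapData:
    """Stand-in for the real map data class; the loaders are stubs."""

    # Datatypes stored as rasters. New raster types (for example
    # geophysical grids) would only need to be added to this set.
    RASTER_TYPES = frozenset({Datatype.DTM})

    def __init__(self) -> None:
        self.loaded = {}

    def load_map_data(self, datatype: Datatype) -> None:
        self.loaded[datatype] = "vector"  # placeholder for the real loading

    def load_raster_map_data(self, datatype: Datatype) -> None:
        self.loaded[datatype] = "raster"  # placeholder for the real loading

    def load(self, datatype: Datatype) -> None:
        """Single entry point: callers no longer need to know the storage kind."""
        if datatype in self.RASTER_TYPES:
            self.load_raster_map_data(datatype)
        else:
            self.load_map_data(datatype)


map_data = MapData()
map_data.load(Datatype.DTM)
map_data.load(Datatype.GEOLOGY)
```

With this, the if/else in the supervisor would collapse to a single self.map_data.load(datatype) call.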

lachlangrose · 2024-06-13T00:17:54Z

map2loop/sample_storage.py

+            sampletype (SampleType): The type of the sample.
+        """
+
+        if sampletype == SampleType.CONTACT:


If the samplers are changed so that they have a __call__ method this would avoid having to pass different arguments to the sample methods.
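A rough sketch of that suggestion, under the assumption that every sampler can get what it needs from the map data object: each sampler implements __call__ with the same signature and picks its own inputs, so the supervisor needs no per-type branch. The MapData stand-in, its attribute contents, and the two sampler classes are made up for illustration and are not the existing map2loop samplers.

```python
from abc import ABC, abstractmethod
from enum import Enum, auto
from typing import Dict


class SampleType(Enum):
    # Hypothetical subset of sample types, for illustration only.
    CONTACT = auto()
    GEOLOGY = auto()


class MapData:
    """Stand-in for the map data container; the contents are made up."""

    def __init__(self) -> None:
        self.basal_contacts = ["contact_a", "contact_b"]
        self.geology = ["unit_1"]


class Sampler(ABC):
    @abstractmethod
    def __call__(self, map_data: MapData) -> list:
        """Return the samples for this sampler's data type."""


class ContactSampler(Sampler):
    def __call__(self, map_data: MapData) -> list:
        # The sampler chooses its own inputs from map_data.
        return [f"sample of {c}" for c in map_data.basal_contacts]


class GeologySampler(Sampler):
    def __call__(self, map_data: MapData) -> list:
        return [f"sample of {g}" for g in map_data.geology]


class SampleSupervisor:
    def __init__(self, map_data: MapData) -> None:
        self.map_data = map_data
        self.samplers: Dict[SampleType, Sampler] = {
            SampleType.CONTACT: ContactSampler(),
            SampleType.GEOLOGY: GeologySampler(),
        }
        self.samples: Dict[SampleType, list] = {}

    def process(self, sampletype: SampleType) -> list:
        # One code path for every sample type: no per-type if/else.
        self.samples[sampletype] = self.samplers[sampletype](self.map_data)
        return self.samples[sampletype]


supervisor = SampleSupervisor(MapData())
contacts = supervisor.process(SampleType.CONTACT)
```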

lachlangrose · 2024-06-13T00:21:44Z

map2loop/sample_storage.py

+            self.store(
+                SampleType.CONTACT,
+                self.samplers[SampleType.CONTACT].sample(
+                    self.map_data.basal_contacts, self.map_data


Why are basal contacts not an attribute of the sample supervisor? Isn't it something that is sampled?

lachlangrose · 2024-06-13T00:24:00Z

map2loop/sample_storage.py

+            )
+
+    @beartype.beartype
+    def reprocess(self, sampletype: SampleType):


I don't really see why there is a reprocess. Shouldn't reprocess just be process?

lachlangrose · 2024-06-13T00:26:41Z

map2loop/sample_storage.py

+            self.process(SampleType.FOLD)
+
+    @beartype.beartype
+    def __call__(self, sampletype: SampleType):


I like having a call method, but I think the logic is a bit complicated. It should just call self.process and return the data. All other checks should be done within process.
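Roughly what I mean, as a sketch and not a drop-in replacement: process owns the dirty-flag check and returns the data, so a separate reprocess is not needed and __call__ is a one-liner. The sampler_dirtyflags name comes from the diff; the zero-argument samplers and the string keys are simplifications made for illustration.

```python
from typing import Any, Callable, Dict


class SampleSupervisor:
    def __init__(self, samplers: Dict[str, Callable[[], Any]]) -> None:
        self.samplers = dict(samplers)
        self.samples: Dict[str, Any] = {}
        # Everything starts dirty, so the first request always computes.
        self.sampler_dirtyflags = {key: True for key in self.samplers}

    def set_sampler(self, sampletype: str, sampler: Callable[[], Any]) -> None:
        self.samplers[sampletype] = sampler
        self.sampler_dirtyflags[sampletype] = True

    def process(self, sampletype: str) -> Any:
        # All checks live here: recompute only when the sampler is dirty.
        if self.sampler_dirtyflags.get(sampletype, True):
            self.samples[sampletype] = self.samplers[sampletype]()
            self.sampler_dirtyflags[sampletype] = False
        return self.samples[sampletype]

    def __call__(self, sampletype: str) -> Any:
        return self.process(sampletype)


runs = []


def sample_folds() -> list:
    runs.append("fold")
    return ["fold_1"]


supervisor = SampleSupervisor({"fold": sample_folds})
first = supervisor("fold")   # computed
second = supervisor("fold")  # returned from storage
supervisor.set_sampler("fold", lambda: ["fold_1", "fold_2"])
third = supervisor("fold")   # recomputed with the new sampler
```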

rabii-chaarani · 2024-06-14T00:36:01Z

Rebasing did not solve the problem of having too many file changes. I will start over in another branch!

rabii-chaarani and others added 30 commits June 3, 2024 14:01

feat: added SampleSupervisor

731910e

fix: added StateType enum

42a6238

doc: added libtiff-dev instead of libtiff5

8e6652d

fix: updated to align SampleSupervisor

aca1164

fix: fault_obs_data aligned to master branch

439c383

test: updated InterpolatedStructure test

47a0b69

test: add m2l import

cff5d39

test: changed test data

06e6b10

test: fixed structure config for test

f6618cf

fix: inplace to false

4be78bc

fix: fixed set_crs method

d425b3c

fix: ensure all dipdir vals < 360

09dd7f3

docs: update libtiff version

a37058b

fix: add test for all structures less than 360

d3fc6b5

fix: add catchall exception to mapdata_parse_structure & comment code

9efe63b

fix: make the templates a bit easier to fill out

1aa5add

fix: update question template

c61e225

fix: move location of the PR template

8be28c8

fix: fix for altitude not being saved properly in LPF

f7b29c6

fix: update tests that depend on server to skip in case server is una…

6efdc98

…vailable

test: fix for when server not available

933130e

fix: allow 2 minutes for server connection & add add available ga ser…

b4fba56

…ver link

fix: add 2 more links to try from GA WCS if timing out & prints out w…

78c5a80

…here the DEM was downloaded from

fix: revert back to original

4a05764

fix: create the tmp file function missing

024f67f

fix: return just column with new vals instead of inplace modification

f6fe370

fix: use gpd.points_from_xy

7d9e6d7

fix: verbose dipdir 360 test name

f7fd8e8

fix: revert back to original timeout

9285be5

fix: revert back to original url

1ade12d

AngRodrigues added 4 commits June 12, 2024 16:05

fix: return just column with new vals instead of inplace modification

dac2a90

fix: verbose dipdir 360 test name

4e042e5

fix: revert back to original timeout

3978108

fix: revert back to original url

9a19373

lachlangrose requested changes Jun 13, 2024

AngRodrigues and others added 24 commits June 14, 2024 09:58

fix: remove matplotlib & assign random colors & tidy up imports

11d507f

style: style fixes by ruff and autoformatting by black

6a8197d

fix: assign the right beartype output in create_points

51f878d

fix: add invalid hex input handling

437e47d

fix: comment the code

27e14a8

tests: add test for mapdata.colour_units

c1fcfda

tests: reorganise tests

072bb34

fix: make sure all colours are unique

c2c8f1e

fix: small fixes - double check for integer & add test information at…

40b3042

… the beginning of the test scripts

fix: fix for duplicate units

049f00c

tests: add tests for duplicate units in colour_units()

9e63732

tests: rephrase the unit must be unique assertion in colour_unit_tests

afe4d5a

fix: add the suggestions

5c87e5d

fix: update the tests for new function names (hex_to_rgb)

fad1f3c

chore(master): release 3.1.6

d5726f3

feat: added SampleSupervisor

975863d

feat: added issue_templates

eb42ed5

fix: add pull request template

ca40991

fix: few typos grammar

e9d3b8b

Revert "Update issue templates"

eeddac3

feat: added issue_templates

1eff115

edit dddf70b chore: added issue_templates

fix: add issue templates back

78e5c58

fix: correct grammar for clarity

80826a7

fix: add 2 more links to try from GA WCS if timing out & prints out w…

4f299b9

…here the DEM was downloaded from
