Add a KBMOD results filter for matching "known objects" #741

wilsonbb · 2024-11-12T19:31:38Z

Adds a filter for matching and filtering KBMOD results against "known objects", as part of addressing #528. Known objects are defined by a user-provided astropy table: a catalog of objects we expect to find in the KBMOD data. The catalog can hold either cached information about real known objects from a service such as astroquery, or inserted synthetic fakes that a user has added to the data.

Such a catalog must have columns representing an object's:

Name (either a synthetically generated name or a real target name as defined in https://astroquery.readthedocs.io/en/latest/imcce/imcce.html)
RA and Dec for each observation (which can be reflex-corrected to a given guess distance if the user wants to compare against reflex-corrected results)
MJD of each observation
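
As a rough illustration of the catalog layout described above, here is a minimal sketch with one row per observation of each object. The column names ("Name", "RA", "DEC", "mjd_mid") and the validate_catalog helper are assumptions made for this example; the description does not state the exact names the filter expects. A plain dict of columns stands in for the astropy table.

```python
# Hypothetical column names: the PR description does not state the exact
# names the filter expects, so these are placeholders for illustration.
REQUIRED_COLUMNS = ("Name", "RA", "DEC", "mjd_mid")


def validate_catalog(columns):
    """Check a column-name -> values mapping and return its row count.

    Each row is one observation of one object, so every required column
    must be present and all of them must have the same length.
    """
    missing = [c for c in REQUIRED_COLUMNS if c not in columns]
    if missing:
        raise ValueError(f"catalog is missing columns: {missing}")
    lengths = {len(columns[c]) for c in REQUIRED_COLUMNS}
    if len(lengths) != 1:
        raise ValueError("catalog columns must all have the same length")
    return lengths.pop()


# Two observations of one synthetic fake and one observation of another.
catalog = {
    "Name": ["fake_001", "fake_001", "fake_002"],
    "RA": [200.0010, 200.0025, 201.5000],    # degrees
    "DEC": [-10.0000, -10.0004, -11.2500],   # degrees
    "mjd_mid": [60000.10, 60000.15, 60000.10],
}
n_rows = validate_catalog(catalog)
```

A dict of equal-length columns like this can be passed directly to astropy.table.Table to build the table form.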

On klone/hyak, loading and filtering were tested with an approximately 750 MB catalog of cached astroquery results corresponding to cone searches around boresights in the DEEP search; together they took about 30 seconds. There is room for optimization, but this is currently unlikely to be a scaling bottleneck.

The filter can be called either in the KBMOD search workflow in src/kbmod/run_search.py, where it can be called multiple times for different data sources, or from any post-processing step with a saved KBMOD Results object and its saved WCS.

The filter matches each result observation to potentially multiple objects, and the user can apply thresholds for how close they need to be both spatially and temporally. Observations that match known objects are then set as invalid in the Results table's "obs_valid" column. When remove_match_obs=True, the filter also applies Results table filtering to remove results that no longer have enough matching observations.
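
The matching rule described above can be sketched in plain Python. This is an illustration under stated assumptions, not the PR's implementation: the function names are hypothetical, the separation threshold is assumed to be in arcseconds, and the brute-force double loop stands in for whatever matching strategy the filter actually uses.

```python
import math


def angular_sep_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation in arcseconds between two (RA, Dec) points in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # Haversine formula, which stays accurate for very small separations.
    h = (
        math.sin((dec2 - dec1) / 2) ** 2
        + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2
    )
    return math.degrees(2 * math.asin(math.sqrt(h))) * 3600.0


def match_observations(result_obs, catalog_rows, sep_thresh_arcsec, time_thresh_s):
    """Map each known-object name to the result observation indices it matches.

    result_obs: list of (ra_deg, dec_deg, mjd) for one result.
    catalog_rows: list of (name, ra_deg, dec_deg, mjd) for known objects.
    """
    matches = {}
    for i, (ra, dec, mjd) in enumerate(result_obs):
        for name, cat_ra, cat_dec, cat_mjd in catalog_rows:
            close_in_time = abs(mjd - cat_mjd) * 86400.0 <= time_thresh_s
            close_on_sky = angular_sep_arcsec(ra, dec, cat_ra, cat_dec) <= sep_thresh_arcsec
            if close_in_time and close_on_sky:
                matches.setdefault(name, set()).add(i)
    return matches


def invalidate_matched(obs_valid, matches):
    """Return a copy of obs_valid with every matched observation set to False."""
    matched = set().union(*matches.values()) if matches else set()
    return [valid and i not in matched for i, valid in enumerate(obs_valid)]


result_obs = [
    (200.000, -10.0, 60000.00),
    (200.001, -10.0, 60000.01),
    (200.002, -10.0, 60000.02),
]
catalog_rows = [
    ("obj_a", 200.000, -10.0001, 60000.000),  # 0.36 arcsec from obs 0, same time
    ("obj_b", 200.002, -10.0000, 60000.021),  # on obs 2, about 86 s later
    ("obj_c", 200.001, -10.0000, 60000.500),  # on obs 1, but about 12 hours later
]
matches = match_observations(result_obs, catalog_rows, sep_thresh_arcsec=1.0, time_thresh_s=600)
obs_valid = invalidate_matched([True, True, True], matches)
```

Here obj_c sits exactly on the second observation but fails the time threshold, so only the first and third observations are marked invalid.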

The function returns which known objects matched which observations for each KBMOD result (regardless of how many observations matched and of the value of remove_match_obs). This preserves as much matching information as possible for cases such as multiple known objects intersecting different parts of a result trajectory. While this PR does not provide a convenient list of the expected objects that were not recovered in the KBMOD results, as requested in #528, the caller of the filter has all of the information needed to construct one.
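
One way a caller might build the list of unrecovered objects from the returned matches is sketched below. The data layout (one dict per result, mapping object name to matched observation indices), the helper name, and the rule that a single matched observation counts as recovered are all assumptions for illustration.

```python
def unrecovered_objects(expected_names, matches_per_result):
    """Return the expected object names that matched no observation in any result.

    matches_per_result: one dict per KBMOD result, mapping a known-object
    name to the set of observation indices it matched (hypothetical layout).
    """
    recovered = set()
    for matches in matches_per_result:
        # Assumption: an object counts as recovered if it matched at least
        # one observation of at least one result.
        recovered.update(name for name, obs_idx in matches.items() if obs_idx)
    return sorted(set(expected_names) - recovered)


expected = ["fake_001", "fake_002", "fake_003"]
matches_per_result = [
    {"fake_001": {0, 1, 2}},
    {"fake_002": {4}, "fake_001": {7}},
    {},
]
missing = unrecovered_objects(expected, matches_per_result)
```

A stricter notion of recovery, such as a minimum fraction of matched observations, would only change the condition inside the loop.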

An example workflow for filtering out known objects, identifying recovered fakes, and then processing potentially real results is provided below:

from astropy.table import Table

from kbmod.filters.known_object_filters import KnownObjsMatcher
from kbmod.results import Results

res = Results.read_table("/path/to/results")

# Remove all observations that match real known objects from the results.
# Note: obstimes (the observation times of this KBMOD run) must be defined by the caller.
real_obj_filter_params = {
    "filter_type": "real_obj_matches",
    "known_obj_thresh": 0.5,
    "known_obj_sep_thresh": 1.0,
    "known_obj_sep_time_thresh_s": 600,
}
real_obj_table = Table.read("/path/to/real_obj_table/")
real_obj_matcher = KnownObjsMatcher(real_obj_table, obstimes, real_obj_filter_params)
res = real_obj_matcher.apply_known_obj_valid_obs_filter(
    res,
    wcs=res["wcs"][0],
    update_obs_valid=True,
)

# Identify all recovered fakes
fake_filter_params = {
    "filter_type": "fake_matches",
    "known_obj_thresh": 0.5,
    "known_obj_sep_thresh": 1.0,
    "known_obj_sep_time_thresh_s": 600,
    "recovered_fake_matches_obs_ratio": 0.5,
}
fake_table = Table.read("/path/to/fake_table/")
fake_matcher = KnownObjsMatcher(fake_table, obstimes, fake_filter_params)

# Here we match observations to our fakes. Note that this does not update the "obs_valid"
# column of the Results table.
res = fake_matcher.match_known_obj_filters(
    res,
    wcs=res["wcs"][0],
)

# Apply a threshold for how many observations from the fake catalog we had to recover
# in order for the fake to count as found. (Note that we already filter near the obstimes
# for this KBMOD run, so fakes on distant nights shouldn't matter.) Our cutoff ratio is
# taken from fake_filter_params["recovered_fake_matches_obs_ratio"].
res = fake_matcher.apply_known_obj_match_obs_ratio(res)

# Now we should have the column "recovered_fake_matches_obs_ratio" in the Results table,
# and we can use it to generate a list of recovered fakes.
recovered_fakes = set()
for matched_fakes in res["recovered_fake_matches_obs_ratio"]:
    # Add the fakes recovered by this result (not the whole column on every pass).
    recovered_fakes.update(matched_fakes)
print(f"Recovered fakes: {recovered_fakes}")

# Now we can filter out all recovered fakes to continue processing potential results with ML.
res = fake_matcher.filter_known_obj(res, "recovered_fake_matches_obs_ratio", "recovered_fake_matches_obs_ratio")
# ml_magic_yay is a placeholder for any downstream ML processing step.
res = ml_magic_yay(res)
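
The observation-ratio cutoff used in the workflow above can be sketched as follows. This is a hedged illustration only: the helper name and data layout are hypothetical, and the ratio is assumed to be the number of matched observations divided by the number of catalog observations of that object near the run's obstimes.

```python
def recovered_by_ratio(matches, catalog_obs_counts, min_ratio):
    """Return the names whose matched-observation fraction is at least min_ratio.

    matches: known-object name -> set of matched observation indices.
    catalog_obs_counts: known-object name -> number of catalog observations
        of that object near the run's obstimes (assumed denominator).
    """
    recovered = set()
    for name, obs_idx in matches.items():
        total = catalog_obs_counts.get(name, 0)
        if total > 0 and len(obs_idx) / total >= min_ratio:
            recovered.add(name)
    return recovered


matches = {"fake_001": {0, 1, 2}, "fake_002": {4}}
catalog_obs_counts = {"fake_001": 4, "fake_002": 5}
recovered = recovered_by_ratio(matches, catalog_obs_counts, min_ratio=0.5)
```

With a cutoff of 0.5, fake_001 (3 of 4 observations matched) counts as recovered while fake_002 (1 of 5) does not.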

wilsonbb added 7 commits November 8, 2024 17:08

Add filter for known objects

4def7ee

Modify init with known object filter

50ab379

Merge branch 'main' into results_filter

ec65f66

Clean up comments and tests

77aeeb3

Lint fixes

fd37556

Refactored to KnownObjsMatcher and added filters

2d2b79e

More refactoring and renaming

3d0c6fb
