-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Austenem/CAT-960 Fix EPIC redirects #3583
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great job always including tests!
context/app/utils.py
Outdated
def find_earliest_dataset_ancestor(client, uuid): | ||
dataset = client.get_entities( | ||
'datasets', | ||
query_override={ | ||
"bool": { | ||
"must": { | ||
"term": { | ||
"uuid": uuid | ||
} | ||
} | ||
} | ||
}, | ||
non_metadata_fields=['hubmap_id', 'uuid', 'immediate_ancestors', 'entity_type'] | ||
) | ||
|
||
# If no dataset is found or it has no ancestors, return the current dataset UUID | ||
if not dataset or not dataset[0].get('immediate_ancestors'): | ||
return uuid | ||
|
||
# Traverse through immediate ancestors to find the earliest dataset ancestor | ||
for ancestor in dataset[0]['immediate_ancestors']: | ||
if ancestor.get('entity_type') == 'Dataset': | ||
uuid = find_earliest_dataset_ancestor(client, ancestor.get('uuid')) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We shouldn't depend on immediate_ancestors
since it's somewhat deprecated and we should also avoid recursion and multiple requests if possible.
Can we use a query that returns the dataset which has processing === 'raw'
, does not have the field ancestor_counts.entity_type.Dataset
, and its uuid
is included in the given dataset's ancestors_ids
? Let me know if you need help drafting the query.
I spent a bit of time working on updating the util tests, and then realized that style of testing (mocking the client response) isn't applicable to this new approach. I removed the tests, but if there's a better approach/a way to verify this query works apart from manually testing redirects - which look fine so far - I'd be interested in it! |
You could test the redirect with a couple of e2e tests. |
context/app/routes_browse.py
Outdated
redirectedFromPipeline=entity.get('pipeline'))) | ||
# Check whether the oldest found ancestor exists and is of an expected type. 404 is | ||
# preferable to a page that shows only a support entity or processed/component dataset. | ||
if not earliest_dataset or should_redirect_entity(earliest_dataset[0]): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to call should_redirect_entity
again?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was originally to check that the retrieved entity was not actually a processed/support dataset - with the updated query, this shouldn't be necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me!
Summary
Update support entity and processed/component dataset redirects to go only to the appropriate primary datasets. If the primary dataset does not exist, a 404 error will be shown. This addresses an issue that caused non-unified pages for EPIC datasets to be shown.
Design Documentation/Original Tickets
CAT-960 Jira ticket
Testing
Unit tested helper function and manually checked for regressions in redirects with EPIC, centrally processed, component, and primary datasets. There are currently no EPIC datasets with a parent processed dataset on dev that can be used to check the success case for this update.
Once EPICs fitting this case are released, this update should be tested again.
Note: the EPIC dataset linked in the Jira ticket (
0739bdc773cd2c294e5bdb3fed1b4cb4
) is missing a primary dataset ancestor, and so cannot be used to test this update. It can, however, be used to test the error case, which is successful (previously the page redirected to a processed dataset ancestor, and it now redirects to a 404 page).Checklist
CHANGELOG-your-feature-name-here.md
is present in the root directory, describing the change(s) in full sentences.