Add remaining GET endpoints to parity check #1847

ethax-ross · 2024-10-23T12:24:52Z

Context

We want to expand the parity check to include all other GET endpoints so that we can compare the results. This also involves supporting individual GET resource endpoints, which require a different id in the path for each provider.

Changes proposed in this pull request

Optimise the ResponseComparison deletion on re-run

Deleting the existing parity check results can be slow as they can be very large records in the database. We don't care about callbacks or validation here, so we can optimise the deletion with delete_all.
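
As a rough illustration of the trade-off, the sketch below uses a hypothetical in-memory stand-in (`FakeComparison`, not the real `ResponseComparison` model) to mimic the difference: `destroy_all` visits every record and runs its callbacks, whereas `delete_all` removes everything in one step and skips them. The class and its methods are assumptions made for the example only.

```ruby
# Hypothetical in-memory stand-in for an ActiveRecord model.
# It is NOT the real ResponseComparison; it only mimics the two deletion styles.
class FakeComparison
  @rows = []
  @callbacks_run = 0

  class << self
    attr_reader :rows, :callbacks_run

    def create(body)
      @rows << new(body)
    end

    # Mimics destroy_all: visits each record and runs its callback first.
    def destroy_all
      @rows.each(&:before_destroy)
      @rows.clear
    end

    # Mimics delete_all: removes everything in one step, with no callbacks,
    # and returns the number of rows removed.
    def delete_all
      count = @rows.size
      @rows.clear
      count
    end

    def record_callback
      @callbacks_run += 1
    end
  end

  def initialize(body)
    @body = body
  end

  def before_destroy
    self.class.record_callback
  end
end

3.times { |i| FakeComparison.create("large response #{i}") }
deleted = FakeComparison.delete_all
puts "deleted=#{deleted} callbacks_run=#{FakeComparison.callbacks_run}"
```

With real models the saving comes from issuing a single SQL DELETE rather than loading each large record into memory first.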

Support checking an individual resource

We need to be able to request an individual resource as part of the parity check. The difficulty with this is that we need a valid ID for each lead provider. Instead of specifying these manually, I've opted to allow dynamic substitution by evaluating code from the YAML file.

We can now specify an ID placeholder in the path with :id and a corresponding option in the YAML file that, when evaluated (in the context of the Client), should return a valid id for the lead provider.
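
A minimal sketch of how the substitution might look, assuming a simplified client. `SketchClient`, `LeadProviderStub` and the `ecf_ids` attribute are hypothetical stand-ins; the real `Client` queries the database instead. The snippet from the YAML file is evaluated with the client as `self`, so `lead_provider` is in scope.

```ruby
require "yaml"

# Hypothetical stand-in for a lead provider; the real one is a database model.
LeadProviderStub = Struct.new(:name, :ecf_ids)

# Simplified, hypothetical client used only to show the :id substitution.
class SketchClient
  attr_reader :lead_provider, :path, :options

  def initialize(lead_provider:, path:, options:)
    @lead_provider = lead_provider
    @path = path
    @options = options
  end

  # Replaces ":id" with the result of evaluating the YAML-supplied snippet
  # in the context of this client.
  def formatted_path
    return path unless options[:id] && path.include?(":id")

    path.sub(":id", instance_eval(options[:id]).to_s)
  end
end

# Example config shape; the id expression here is made up for the sketch.
config = YAML.safe_load(<<~CONFIG)
  "/api/v1/npq-applications/:id":
    id: "lead_provider.ecf_ids.first"
  "/api/v1/npq-applications":
    paginate: true
CONFIG

provider = LeadProviderStub.new("Example Provider", ["abc-123"])

config.each do |path, opts|
  client = SketchClient.new(lead_provider: provider, path: path, options: opts.transform_keys(&:to_sym))
  puts client.formatted_path
end
```

Evaluating strings from a config file is only reasonable when the file is trusted and checked in alongside the code.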

Add remaining GET endpoints to YAML file

Add all the listing and individual GET resource endpoints to the YAML file.

Convert CSV responses to hexdigest

The CSV responses are very large and cause issues when trying to insert into Postgres. To be able to include them for now we are converting the response CSV to a hexdigest that can be stored and compared. If we want more details we will have to make the requests manually to inspect the responses.
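
A small sketch of the idea using Ruby's standard `Digest` library. SHA-256 is an assumption (the description does not say which digest is used), and `csv_fingerprint` is a hypothetical helper name.

```ruby
require "digest"

# Reduces a large CSV body to a short, fixed-length string that can be
# stored and compared. SHA-256 is an assumption made for this sketch.
def csv_fingerprint(csv_body)
  Digest::SHA256.hexdigest(csv_body)
end

ecf_csv     = "id,name\n1,Alice\n2,Bob\n"
npq_csv     = "id,name\n1,Alice\n2,Bob\n"
changed_csv = "id,name\n1,Alice\n2,Bobby\n"

# Identical bodies give identical digests; any difference changes the digest.
puts csv_fingerprint(ecf_csv) == csv_fingerprint(npq_csv)
puts csv_fingerprint(ecf_csv) == csv_fingerprint(changed_csv)
```

The digest only says whether the two responses differ, not where, which is why the requests have to be repeated manually to inspect a mismatch.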

Stop paginating on first match

To finish the parity check as soon as possible, we stop paginating responses as soon as we encounter a discrepancy. A mismatch in an earlier page is likely to affect later pages anyway, so the first mismatch is all we are interested in (any other issues will surface on future runs once it's fixed).
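
The early exit can be sketched roughly as follows. `compare_pages` and the lambda-based page fetchers are hypothetical; the real client makes HTTP requests against both applications.

```ruby
# Hypothetical page fetchers: each lambda returns the body for a page number,
# or nil when there are no more pages.
def compare_pages(fetch_ecf, fetch_npq, max_pages: 100)
  (1..max_pages).each do |page|
    ecf_body = fetch_ecf.call(page)
    npq_body = fetch_npq.call(page)

    # Both sides ran out of pages without a mismatch.
    return { matched: true, pages_checked: page - 1 } if ecf_body.nil? && npq_body.nil?

    # Stop at the first discrepancy instead of fetching the remaining pages.
    return { matched: false, pages_checked: page, first_mismatch: page } if ecf_body != npq_body
  end

  { matched: true, pages_checked: max_pages }
end

ecf_pages = { 1 => "a", 2 => "b", 3 => "c", 4 => "d" }
npq_pages = { 1 => "a", 2 => "X", 3 => "c", 4 => "d" }

result = compare_pages(->(n) { ecf_pages[n] }, ->(n) { npq_pages[n] })
puts result
```

In this example pages 3 and 4 are never requested, because the responses already differ on page 2.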

Remove CSV endpoints

These cause issues with timeouts on the database; I'm not sure why as we hash the large CSV response so it shouldn't cause issues.

Remove CSV endpoints; if we want to include them we can add another ticket to make them work.

Persist formatted path to ResponseComparison

As the id we grab for the lead provider is random, we need to persist it in the ResponseComparison to be able to replicate the request.

Add id to request_path before saving (avoids adding a new attribute to ResponseComparison).

Pretty format JSON in diff

When we output the JSON diff we want to pretty format it or we end up with a single line which is hard to compare.
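
For illustration, Ruby's standard `json` library can do this with `JSON.pretty_generate`. The sample payload is made up, and whether the PR uses this exact call is an assumption.

```ruby
require "json"

# A made-up single-line response body.
single_line = '{"data":{"id":"abc-123","attributes":{"status":"accepted"}}}'

# Parse and re-serialise with indentation so a line-based diff has
# one field per line to compare.
pretty = JSON.pretty_generate(JSON.parse(single_line))
puts pretty
```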

Deep sort responses

If the response is JSON we deep sort the resulting hash so that we can compare the contents irrespective of order (as the new serializers in NPQ reg may serialize the fields in a different order).
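
A minimal sketch of a deep sort, assuming only hash keys are reordered and array element order is left alone (the description does not say how arrays are treated). `deep_sort` is a hypothetical helper name.

```ruby
require "json"

# Recursively sorts hash keys so that two payloads with the same content
# serialise identically. Array order is preserved (an assumption).
def deep_sort(value)
  case value
  when Hash
    value.sort_by { |key, _| key.to_s }.to_h { |key, nested| [key, deep_sort(nested)] }
  when Array
    value.map { |item| deep_sort(item) }
  else
    value
  end
end

ecf = JSON.parse('{"id":"1","attributes":{"b":2,"a":1},"items":[{"y":1,"x":2}]}')
npq = JSON.parse('{"attributes":{"a":1,"b":2},"items":[{"x":2,"y":1}],"id":"1"}')

puts JSON.generate(ecf) == JSON.generate(npq)
puts JSON.generate(deep_sort(ecf)) == JSON.generate(deep_sort(npq))
```

Ruby's `Hash#==` already ignores key order, so the sort matters mainly for the serialised text that ends up in the diff.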

Guidance for review

Best reviewed commit by commit.

I left the CSV hexdigest commit in as we will still need it if we bring the CSV endpoints in again (we return all results, which would be a really big diff/thing to put into the database without hashing it).

Screenshots (images not preserved): Results, Results (expanded), Updated diff

github-actions · 2024-10-23T12:35:39Z

Review app deployed to https://npq-registration-review-1847-web.test.teacherservices.cloud/

cwrw · 2024-10-24T08:11:23Z

config/parity_check_endpoints.yml

+    paginate: true
+
+  "/api/v1/participants/npq/:id":
+    id: 'User.includes(:applications).where(applications: { lead_provider: }).order("RANDOM()").limit(1).pick(:ecf_id)'


can we make sure those are accepted applications?

cwrw · 2024-10-24T08:23:52Z

app/services/migration/parity_check/client.rb

@@ -87,7 +91,11 @@ def url(app:)
    def formatted_path
      return path unless options[:id] && path.include?(":id")

-      path.sub(":id", eval(options[:id]).to_s) # rubocop:disable Security/Eval
+      id = eval(options[:id]) # rubocop:disable Security/Eval


curious about this, why wouldn't we be able to find an id 🤔 ideally all should be there?

Ah so I put this in when I was finding some returning no results, but it was down to a bug in one of my queries; I'll remove this

cwrw

looks great, some minor comments 🙌

mooktakim · 2024-10-24T09:21:34Z

app/services/migration/parity_check/client.rb

+      @formatted_path ||= begin
+        return path unless options[:id] && path.include?(":id")
+
+        path.sub(":id", eval(options[:id]).to_s) # rubocop:disable Security/Eval


Rather than calling eval on the YAML file content, can we pass in instructions instead?
e.g.:

    "/api/v1/npq-applications/:id":
      id: 'random_lead_provider_ecf_id'

And then have a method in Client to translate the instruction into an action:

    def process_instruct(n)
      case n
      when "random_lead_provider_ecf_id"
        lead_provider.applications.order("RANDOM()").limit(1).pick(:ecf_id)
      end
    end

    ecf_id = process_instruct(options[:id])

hmm somehow prefer it in the yaml file 🤔 otherwise we'll have a huge list in the client

I prefer it out of the YAML file; easier to test, avoids eval and less duplication

ok fair enough, I can see it looks better 👍

👍 for the meme

mooktakim · 2024-10-24T09:32:57Z

config/parity_check_endpoints.yml

+    paginate: true
+
+  "/api/v1/npq-applications/:id":
+    id: 'lead_provider.applications.order("RANDOM()").limit(1).pick(:ecf_id)'


possible feature, option to run the single request multiple times? eg: "run_count: 100"

Yeah one for the future maybe as this would have knock on effects to how we display/store the results

Hmm, i'm thinking we could do it now for free by duplicating the same URL like this:

/api/v1/npq-applications/:id#1

/api/v1/npq-applications/:id#2

/api/v1/npq-applications/:id#3

lol

sonarcloud · 2024-10-24T09:57:28Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
100.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarCloud

leandroalemao

👏 tks @ethax-ross 👍

ethax-ross added 2 commits October 23, 2024 13:15

Optimise the ResponseComparison deletion on re-run

1919ad1

Deleting the existing parity check results can be slow as they can be very large records in the database. We don't care about callbacks or validation here, so we can optimise the deletion with `delete_all`.

ethax-ross temporarily deployed to review October 23, 2024 12:28 — with GitHub Actions Inactive

ethax-ross temporarily deployed to staging October 23, 2024 12:30 — with GitHub Actions Inactive

ethax-ross force-pushed the 3617-support-get-single-resource branch from a4b7ba3 to 218490f Compare October 23, 2024 12:44

ethax-ross temporarily deployed to review October 23, 2024 12:48 — with GitHub Actions Inactive

ethax-ross temporarily deployed to staging October 23, 2024 12:51 — with GitHub Actions Inactive

ethax-ross had a problem deploying to review October 23, 2024 14:12 — with GitHub Actions Error

ethax-ross temporarily deployed to staging October 23, 2024 14:14 — with GitHub Actions Inactive

ethax-ross force-pushed the 3617-support-get-single-resource branch from 9d0192a to 2b08acf Compare October 23, 2024 14:17

ethax-ross had a problem deploying to review October 23, 2024 14:21 — with GitHub Actions Failure

ethax-ross temporarily deployed to staging October 23, 2024 14:23 — with GitHub Actions Inactive

ethax-ross temporarily deployed to review October 23, 2024 14:27 — with GitHub Actions Inactive

ethax-ross marked this pull request as ready for review October 23, 2024 15:15

ethax-ross requested a review from a team as a code owner October 23, 2024 15:15

ethax-ross requested a review from a team October 23, 2024 15:15

cwrw assigned mooktakim and leandroalemao Oct 23, 2024

ethax-ross added 3 commits October 24, 2024 08:48

Add remaining GET endpoints to YAML file

60b95bd

Add all the listing and individual GET resource endpoints to the YAML file.

ethax-ross force-pushed the 3617-support-get-single-resource branch from 2b08acf to 885afa4 Compare October 24, 2024 07:49

ethax-ross temporarily deployed to review October 24, 2024 07:52 — with GitHub Actions Inactive

ethax-ross temporarily deployed to staging October 24, 2024 07:54 — with GitHub Actions Inactive

cwrw reviewed Oct 24, 2024

cwrw reviewed Oct 24, 2024

Remove CSV endpoints

c752d79

These cause issues with timeouts on the database; I'm not sure why as we hash the large CSV response so it shouldn't cause issues. Remove CSV endpoints; if we want to include them we can add another ticket to make them work.

ethax-ross added 4 commits October 24, 2024 10:13

Persist formatted path to ResponseComparison

0d74ceb

As the id we grab for the lead provider is random, we need to persist it in the `ResponseComparison` to be able to replicate the request. Add `id` to `request_path` before saving (avoids adding a new attribute to `ResponseComparison`).

Pretty format JSON in diff

6ebc3f5

When we output the JSON diff we want to pretty format it or we end up with a single line which is hard to compare.

Deep sort responses

e57502d

If the response is JSON we deep sort the resulting hash so that we can compare the contents irrespective of order (as the new serializers in NPQ reg may serialize the fields in a different order).

Restrict NPQ participant id to accepted applications

ffb3de2

ethax-ross force-pushed the 3617-support-get-single-resource branch from 885afa4 to ffb3de2 Compare October 24, 2024 09:15

ethax-ross temporarily deployed to review October 24, 2024 09:19 — with GitHub Actions Inactive

ethax-ross temporarily deployed to staging October 24, 2024 09:20 — with GitHub Actions Inactive

mooktakim reviewed Oct 24, 2024

Replace use of eval with safer solution

abba51f

Instead of evaluating the YAML contents directly, we can specify a method in the YAML file and call in the `Client`. This also has the benefit of being able to more easily test the possible `id` options.
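
One way this could look, as a hedged sketch rather than the code in the commit: the YAML option names a method, and the client only calls it if the name is on an allowlist. `AllowlistClient`, `ProviderStub`, `ID_METHODS` and `application_ecf_id` are hypothetical names chosen for the example.

```ruby
# Hypothetical stand-in for a lead provider with some application ids.
ProviderStub = Struct.new(:application_ecf_ids)

# Simplified, hypothetical client that resolves ids via named methods
# instead of evaluating arbitrary code from the YAML file.
class AllowlistClient
  # Only these option values may be turned into method calls.
  ID_METHODS = %w[application_ecf_id].freeze

  class UnsupportedIdOption < StandardError; end

  def initialize(lead_provider:, path:, options:)
    @lead_provider = lead_provider
    @path = path
    @options = options
  end

  def formatted_path
    return @path unless @options[:id] && @path.include?(":id")

    @path.sub(":id", resolve_id(@options[:id]).to_s)
  end

  private

  # Rejects anything that is not an explicitly supported id option.
  def resolve_id(name)
    raise UnsupportedIdOption, "Unsupported id option: #{name}" unless ID_METHODS.include?(name)

    send(name)
  end

  # Picks a random id for the lead provider (in-memory here; the real
  # client would query the database).
  def application_ecf_id
    @lead_provider.application_ecf_ids.sample
  end
end

provider = ProviderStub.new(["abc-123"])
client = AllowlistClient.new(
  lead_provider: provider,
  path: "/api/v1/npq-applications/:id",
  options: { id: "application_ecf_id" },
)
puts client.formatted_path
```

Because each id option is an ordinary method, it can be unit tested in isolation, and an unexpected value in the YAML file raises an error instead of being executed.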

ethax-ross temporarily deployed to review October 24, 2024 09:52 — with GitHub Actions Inactive

ethax-ross temporarily deployed to staging October 24, 2024 09:54 — with GitHub Actions Inactive

mooktakim approved these changes Oct 24, 2024

cwrw approved these changes Oct 24, 2024

leandroalemao approved these changes Oct 24, 2024

ethax-ross added this pull request to the merge queue Oct 25, 2024

Merged via the queue into main with commit 37ca1c6 Oct 25, 2024
18 checks passed

ethax-ross deleted the 3617-support-get-single-resource branch October 25, 2024 06:51

ethax-ross temporarily deployed to review October 25, 2024 06:51 — with GitHub Actions Inactive
