Skip to content

Commit

Permalink
Add remaining GET endpoints to parity check (#1847)
Browse files Browse the repository at this point in the history
* Optimise the `ResponseComparison` deletion on re-run

Deleting the existing parity check results can be slow as they can be very
large records in the database. We don't care about callbacks or validation
here, so we can optimise the deletion with `delete_all`.

* Support checking an individual resource

We need to be able to request an individual resource as part of the parity
check. The difficulty with this is that we need a valid ID for each lead
provider. Instead of specifying these manually, I've opted to allow dynamic
substitution by evaluating code from the YAML file.

We can now specify an ID attribute in the path with `:id` and a corresponding
option in the YAML file that, when evaluated (in the context of the `Client`)
should return a valid id for the lead provider.

* Add remaining GET endpoints to YAML file

Add all the listing and individual GET resource endpoints to the YAML file.

* Convert CSV responses to hexdigest

The CSV responses are very large and cause issues when trying to insert into
Postgres. To be able to include them for now we are converting the response CSV
to a hexdigest that can be stored and compared. If we want more details we will
have to make the requests manually to inspect the responses.

* Stop paginating on first mismatch

In order to finish the parity check as soon as possible, we stop paginating
responses as soon as we encounter a discrepency. A mismatch in a earlier page
is likely to effect later pages anyway, so this is all we will be interested in
(and other issues would be surfaced on future runs when its fixed).

* Remove CSV endpoints

These cause issues with timeouts on the database; I'm not sure why as we hash
the large CSV response so it shouldn't cause issues.

Remove CSV endpoints; if we want to include them we can add another ticket to
make them work.

* Persist formatted path to ResponseComparison

As the id we grab for the lead provider is random, we need to persist it in the
`ResponseComparison` to be able to replicate the request.

Add `id` to `request_path` before saving (avoids adding a new attribute to
`ResponseComparison`).

* Pretty format JSON in diff

When we output the JSON diff we want to pretty format it or we end up with a
single line which is hard to compare.

* Deep sort responses

If the response is JSON we deep sort the resulting hash so that we can compare
the contents irrespective of order (as the new serializers in NPQ reg may
serialize the fields in a different order).

* Restrict NPQ participant id to accepted applications

* Replace use of eval with safer solution

Instead of evaluating the YAML contents directly, we can specify a method in
the YAML file and call in the `Client`. This also has the benefit of being able
to more easily test the possible `id` options.
  • Loading branch information
ethax-ross authored Oct 25, 2024
1 parent c3425d4 commit 37ca1c6
Show file tree
Hide file tree
Showing 9 changed files with 323 additions and 18 deletions.
1 change: 1 addition & 0 deletions Gemfile
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,7 @@ gem "canonical-rails"
gem "countries"
gem "cssbundling-rails", "~> 1.4"
gem "daemons"
gem "deepsort"
gem "delayed_cron_job"
gem "delayed_job", "~> 4.1"
gem "delayed_job_active_record"
Expand Down
2 changes: 2 additions & 0 deletions Gemfile.lock
Original file line number Diff line number Diff line change
Expand Up @@ -149,6 +149,7 @@ GEM
daemons (1.4.1)
date (3.3.4)
declarative (0.0.20)
deepsort (0.5.0)
delayed_cron_job (0.9.0)
fugit (>= 1.5)
delayed_job (4.1.12)
Expand Down Expand Up @@ -685,6 +686,7 @@ DEPENDENCIES
countries
cssbundling-rails (~> 1.4)
daemons
deepsort
delayed_cron_job
delayed_job (~> 4.1)
delayed_job_active_record
Expand Down
30 changes: 29 additions & 1 deletion app/models/migration/parity_check/response_comparison.rb
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
module Migration
class ParityCheck::ResponseComparison < ApplicationRecord
before_validation :clear_response_bodies_when_equal
before_validation :digest_csv_response_bodies, :format_json_response_bodies, :clear_response_bodies_when_equal

belongs_to :lead_provider

Expand Down Expand Up @@ -69,6 +69,34 @@ def response_body_diff

private

def format_json_response_bodies
self.ecf_response_body = pretty_format(ecf_response_body_hash) if ecf_response_body_hash
self.npq_response_body = pretty_format(npq_response_body_hash) if npq_response_body_hash
end

def pretty_format(hash)
JSON.pretty_generate(hash)
end

def ecf_response_body_hash
@ecf_response_body_hash ||= JSON.parse(ecf_response_body).deep_sort
rescue JSON::ParserError, TypeError
nil
end

def npq_response_body_hash
@npq_response_body_hash ||= JSON.parse(npq_response_body).deep_sort
rescue JSON::ParserError, TypeError
nil
end

def digest_csv_response_bodies
return unless request_path&.include?(".csv")

self.ecf_response_body = Digest::SHA2.hexdigest(ecf_response_body) if ecf_response_body
self.npq_response_body = Digest::SHA2.hexdigest(npq_response_body) if npq_response_body
end

def clear_response_bodies_when_equal
return if different?

Expand Down
7 changes: 4 additions & 3 deletions app/services/migration/parity_check.rb
Original file line number Diff line number Diff line change
Expand Up @@ -11,7 +11,8 @@ def prepare!
Rails.cache.write(:parity_check_started_at, Time.zone.now)
Rails.cache.write(:parity_check_completed_at, nil)

Migration::ParityCheck::ResponseComparison.destroy_all
# We want this to be fast, so we're not bothering with callbacks.
Migration::ParityCheck::ResponseComparison.delete_all
end

def running?
Expand Down Expand Up @@ -56,8 +57,8 @@ def call_endpoints(lead_provider)
paths.each do |path, options|
client = Client.new(lead_provider:, method:, path:, options:)

client.make_requests do |ecf_result, npq_result, page|
save_comparison!(lead_provider:, path:, method:, page:, ecf_result:, npq_result:)
client.make_requests do |ecf_result, npq_result, formatted_path, page|
save_comparison!(lead_provider:, path: formatted_path, method:, page:, ecf_result:, npq_result:)
end
end
end
Expand Down
60 changes: 56 additions & 4 deletions app/services/migration/parity_check/client.rb
Original file line number Diff line number Diff line change
@@ -1,5 +1,7 @@
module Migration
class ParityCheck::Client
class UnsupportedIdOption < RuntimeError; end

attr_reader :lead_provider, :method, :path, :options, :page

PAGINATION_PER_PAGE = 10
Expand All @@ -17,7 +19,7 @@ def make_requests(&block)
ecf_result = timed_response { send("#{method}_request", app: :ecf) }
npq_result = timed_response { send("#{method}_request", app: :npq) }

block.call(ecf_result, npq_result, page)
block.call(ecf_result, npq_result, formatted_path, page)

break unless next_page?(ecf_result, npq_result)

Expand All @@ -29,7 +31,17 @@ def make_requests(&block)

def next_page?(ecf_response, npq_response)
return false unless paginate?
return false unless responses_match?(ecf_response, npq_response)

pages_remain?(ecf_response, npq_response)
end

def responses_match?(ecf_response, npq_response)
ecf_response[:response].code == npq_response[:response].code &&
ecf_response[:response].body == npq_response[:response].body
end

def pages_remain?(ecf_response, npq_response)
[ecf_response[:response].body, npq_response[:response].body].any? do |body|
JSON.parse(body)["data"]&.size == PAGINATION_PER_PAGE
rescue JSON::ParserError
Expand All @@ -49,7 +61,7 @@ def timed_response(&request)
end

def get_request(app:)
HTTParty.get(url(app:, path:), query:, headers:)
HTTParty.get(url(app:), query:, headers:)
end

def token_provider
Expand All @@ -70,8 +82,48 @@ def headers
}
end

def url(app:, path:)
Rails.application.config.npq_separation[:parity_check]["#{app}_url".to_sym] + path
def url(app:)
Rails.application.config.npq_separation[:parity_check]["#{app}_url".to_sym] + formatted_path
end

def formatted_path
@formatted_path ||= begin
return path unless options[:id] && path.include?(":id")

raise UnsupportedIdOption, "Unsupported id option: #{options[:id]}" unless respond_to?(options[:id], true)

path.sub(":id", send(options[:id]).to_s)
end
end

def application_ecf_id
lead_provider.applications.order("RANDOM()").limit(1).pick(:ecf_id)
end

def declaration_ecf_id
Declaration.where(lead_provider:).order("RANDOM()").limit(1).pick(:ecf_id)
end

def participant_outcome_ecf_id
ParticipantOutcome
.includes(declaration: { application: :user })
.where(declaration: { lead_provider: })
.order("RANDOM()")
.limit(1)
.pick("users.ecf_id")
end

def participant_ecf_id
User
.includes(:applications)
.where(applications: { lead_provider:, lead_provider_approval_status: :accepted })
.order("RANDOM()")
.limit(1)
.pick(:ecf_id)
end

def statement_ecf_id
lead_provider.statements.order("RANDOM()").limit(1).pick(:ecf_id)
end
end
end
59 changes: 59 additions & 0 deletions config/parity_check_endpoints.yml
Original file line number Diff line number Diff line change
@@ -1,7 +1,66 @@
get:
"/api/v1/npq-applications":
paginate: true
"/api/v2/npq-applications":
paginate: true
"/api/v3/npq-applications":
paginate: true

"/api/v1/npq-applications/:id":
id: application_ecf_id
"/api/v2/npq-applications/:id":
id: application_ecf_id
"/api/v3/npq-applications/:id":
id: application_ecf_id

"/api/v1/participant-declarations":
paginate: true
"/api/v2/participant-declarations":
paginate: true
"/api/v3/participant-declarations":
paginate: true

"/api/v1/participant-declarations/:id":
id: declaration_ecf_id
"/api/v2/participant-declarations/:id":
id: declaration_ecf_id
"/api/v3/participant-declarations/:id":
id: declaration_ecf_id

"/api/v1/participants/npq/outcomes":
paginate: true
"/api/v2/participants/npq/outcomes":
paginate: true
"/api/v3/participants/npq/outcomes":
paginate: true

"/api/v1/participants/npq/:id/outcomes":
id: participant_outcome_ecf_id
"/api/v2/participants/npq/:id/outcomes":
id: participant_outcome_ecf_id
"/api/v3/participants/npq/:id/outcomes":
id: participant_outcome_ecf_id

"/api/v1/participants/npq":
paginate: true
"/api/v2/participants/npq":
paginate: true
"/api/v3/participants/npq":
paginate: true

"/api/v1/participants/npq/:id":
id: participant_ecf_id
"/api/v2/participants/npq/:id":
id: participant_ecf_id
"/api/v3/participants/npq/:id":
id: participant_ecf_id

"/api/v3/statements":
paginate: true
"/api/v3/statements?filter[cohort]=2021":
paginate: true
"/api/v3/statements?filter[updated_since]=2024-01-01T12:00:00Z":
paginate: true

"/api/v3/statements/:id":
id: statement_ecf_id
55 changes: 55 additions & 0 deletions spec/models/migration/parity_check/response_comparison_spec.rb
Original file line number Diff line number Diff line change
Expand Up @@ -42,6 +42,44 @@
expect(response_comparison.ecf_response_body).to be_nil
expect(response_comparison.npq_response_body).to be_nil
end

it "converts CSV response bodies to a hexdigest" do
response_comparison = build(:response_comparison, request_path: "/path.csv", ecf_response_body: "ecf_response", npq_response_body: "npq_response")
response_comparison.valid?

expect(response_comparison.ecf_response_body).to eq("051e069f1405735e3c348cae238eaee95a20c898e950e23c63f118df1518327e")
expect(response_comparison.npq_response_body).to eq("bdbfa59f301e06b4598976d83df868f62b2c1ce3aad679ee9034370fdaf9ea31")
end

it "performs a deep sort on JSON response bodies" do
response_comparison = build(:response_comparison, ecf_response_body: %({ "foo": "bar", "baz": [2, 1] }), npq_response_body: %({ "foo": "baz", "bar": [5, 1, 4] }))
response_comparison.valid?

expect(response_comparison.ecf_response_body).to eq(
<<~JSON.strip,
{
"baz": [
1,
2
],
"foo": "bar"
}
JSON
)

expect(response_comparison.npq_response_body).to eq(
<<~JSON.strip,
{
"bar": [
1,
4,
5
],
"foo": "baz"
}
JSON
)
end
end

describe "scopes" do
Expand Down Expand Up @@ -108,6 +146,23 @@
DIFF
)
}

context "when the response bodies are JSON" do
let(:instance) { create(:response_comparison, :different, ecf_response_body: %({ "baz": "baz", "foo": "bar" }), npq_response_body: %({ "foo": "bar", "baz": "qux" })) }

it {
expect(diff.to_s(:text)).to eq(
<<~DIFF,
{
- "baz": "baz",
+ "baz": "qux",
"foo": "bar"
}
\\ No newline at end of file
DIFF
)
}
end
end

describe "#response_times_by_path" do
Expand Down
Loading

0 comments on commit 37ca1c6

Please sign in to comment.