Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RUM-6263 Trigger based approach #2074

Open
wants to merge 17 commits into
base: performance_improvements
Choose a base branch
from

Conversation

maciejburda
Copy link
Contributor

@maciejburda maciejburda commented Oct 3, 2024

What and why?

This PR changes the default behaviour of the SR recorder from scheduler to trigger based approach. Significantly reducing the SR footprint on idle.

How?

It's achieved by introducing RecordingTrigger which is responsible for subscribing to two events: UIView.layoutSubviews and UIApplication.sendEvent using obj-c swizzling. Trigger takes a weak dependency on the recording coordinator to start/stop it, as well as invoke captureNextRecord.

Those events happen frequently (every touch interaction and each view's layout change), they are throttled for 100ms (which matches scheduler capture frequency).

Below you can find screenshots from the basic Xcode profiler showcasing CPU usage for similar scenario (go through checkout and stay idle on dashboard):

Before
Screenshot 2024-10-09 at 13 15 16

After
Screenshot 2024-10-09 at 13 09 47

Review checklist

  • Feature or bugfix MUST have appropriate tests (unit, integration)
  • Make sure each commit and the PR mention the Issue number or JIRA reference
  • Add CHANGELOG entry for user facing changes

@datadog-datadog-prod-us1
Copy link

datadog-datadog-prod-us1 bot commented Oct 4, 2024

Datadog Report

Branch report: maciey/RUM-6263-trigger-based-recording
Commit report: 82c2e7a
Test service: dd-sdk-ios

❌ 1 Failed (0 Known Flaky), 3236 Passed, 0 Skipped, 1m 54.38s Total Time
🔻 Test Sessions change in coverage: 1 decreased (-0.01%), 4 increased, 5 no change

❌ Failed Tests (1)

  • test_whenEmptyRUMContextAndTriggered_itShouldNotRecord - RecordingCoordinatorTests - Details

    Expand for error
     
     Assertion Failure at RecordingCoordinatorTests.swift:97: XCTAssertEqual failed: ("1") is not equal to ("0")
     Assertion Failure at RecordingCoordinatorTests.swift:98: XCTAssertEqual failed: ("1") is not equal to ("2")
    

🔻 Code Coverage Decreases vs Default Branch (1)

  • test DatadogRUMTests iOS 81.16% (-0.01%) - Details

@maciejburda maciejburda marked this pull request as ready for review October 9, 2024 11:01
@maciejburda maciejburda requested review from a team as code owners October 9, 2024 11:01
Copy link
Collaborator

@ncreated ncreated left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great optimization idea 👍 💯 . I left few feedbacks, but most importantly a change request to offload the main thread from unnecessary tasks.

@maciejburda maciejburda force-pushed the maciey/RUM-6263-trigger-based-recording branch from ce52eaa to 2c0baea Compare October 22, 2024 12:47
@maciejburda maciejburda changed the base branch from develop to performance_improvements October 22, 2024 12:47
@maciejburda
Copy link
Contributor Author

I have addressed all the feedback, which led to small refactor. But I think it makes more sense now.

Key change is that trigger is a dependency of coordinator and it notifies it through a delegate pattern.

On checking shouldSkipFrame outside of queue. I made some tests and it's significantly more performant when done from within the queue.
Screenshot 2024-10-23 at 13 37 00
Screenshot 2024-10-23 at 13 38 20

Copy link
Member

@maxep maxep left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good 👍 I have left some suggestions and a change-request regard thread safety

stopRecording()
} catch {
telemetry.error("[SR] Failed to take snapshot", error: error)
queue.run { [weak self] in
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/question We can assume we are already on the main thread, no?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this runs async job on main thread

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean the captureNextRecord is called by triggers from the main thread, so I wonder why we jump on the main thread again?

Comment on lines 123 to 125
private var shouldSkipFrame: Bool {
return dateProvider.now.timeIntervalSince(lastFrameTimestamp ?? .distantPast) < Constants.throttingRate
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/suggestion We could have a Debouncer object that could take care of this, similar to the future TimeBank, WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Totally - but for now I'd suggest merging this as is - and refactor once we have timebank in place. It will shed some more slight on our approach of unit testing this

@maciejburda
Copy link
Contributor Author

maciejburda commented Oct 23, 2024

Addressed all the requested changes, and did some extra tests (internal).

SR Enabled:
Screenshot 2024-10-23 at 15 39 04
SR Disabled:
Screenshot 2024-10-23 at 15 40 40

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants