Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix pthread_mutex_t dangling pointers #111

Closed
wants to merge 8 commits into from

Conversation

digitaliz
Copy link

@digitaliz digitaliz commented Apr 29, 2021

Fixes #110

There is some background reasoning on the original issue #110 (comment)

@digitaliz digitaliz force-pushed the fix-pthread_mutex_t-dangling-pointer branch from 0949384 to ca1a7a2 Compare April 29, 2021 10:52
@digitaliz digitaliz marked this pull request as ready for review May 12, 2021 15:29
@digitaliz digitaliz requested a review from a team as a code owner May 12, 2021 15:29
func unlock() {
pthread_mutex_unlock(self)
}

/// Will lock `self`, call `block`, then unlock `self`
Copy link
Author

@digitaliz digitaliz May 17, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed protect because it triggered Thread 1: Simultaneous accesses to 0x1032d57a8, but modification requires exclusive access when running the FlowTests. I currently don't have a good enough mental model to understand why that is (but a guess is recursively calling protect, starts two possible mutating scopes). Since I was unable to clearly reason about this construct, and for now can't do more research, I decided to remove it in favour of straight mutex lock/unlock that I understand better.

Unclear to me if it's supported
@digitaliz digitaliz force-pushed the fix-pthread_mutex_t-dangling-pointer branch from 4c49587 to f32c157 Compare May 17, 2021 15:24
@@ -41,9 +41,11 @@ public final class FutureQueue<Resource> {
queueScheduler = executeOn
OSAtomicIncrement32(&futureQueueUnitTestAliveCount)
memPrint("Queue init", futureQueueUnitTestAliveCount)
mutex.initialize(as: .recursive)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before adding the ability to initialize this as a recursive mutex, it deadlocked in FlowTests.FutureQueueTests testBatchQueue.

PhilippOtto89
PhilippOtto89 previously approved these changes May 17, 2021
Copy link
Contributor

@PhilippOtto89 PhilippOtto89 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great that you looked into this... lets check how it works!

@alasdairlaw
Copy link

alasdairlaw commented May 21, 2021

Hey, I think it's possible we've encountered this issue before;

We've been seeing stack traces where it looks like we crash trying to use this nonatomic dangling mutex pointer 💀

I'm far from an expert on it, but from my understanding it seems to fit with your & The Eskimo's explanation of what's going on here.
It also seems to fit with the The Eskimo's prediction that this could happen when we change unrelated code, so we've had a hard time trying to reproduce the crash.

@alasdairlaw
Copy link

Is everyone happy to merge this? Would be great to have his fix available to use

@moglistree
Copy link
Contributor

Hey @alasdairlaw we are sing some issues with this update on the Go apps side. Looking in to it for how to work around them.

@digitaliz
Copy link
Author

@moglistree We should open up an issue/PR for the problems we see with UITests in Go project. I'll write a comment here for now to just capture thoughts/where we are right now.

I've just done bit of investigation on the failing UITests, my results are that they are all happening in edit mode for ProductDetailView, ie the view for adding a product to the Product Library. I still need to verify but seem connected to testing of input fields in that view. Specifically for fields presenting a keyboard I think it could be that at the point of dismissal/relinquish main responder, main thread is deadlocked.

The abstraction distance between this change and failing UITests in Go app is quite long. My suggestion is that I should pair with someone that has experience with the product detail view and it's UITests, we can possibly figure it out much faster together? I'll ping @achernenko-pp here to get his thoughts :)

@digitaliz
Copy link
Author

digitaliz commented Jun 4, 2021

Follow-up thought for a possible action, for the Flow test suite to pass I had to enable recursive locks in FutureQueue, #111 (comment). The idea I have now is that the failing test in PL UITests is because the same type of issue, there is re-entrant locking going on but the lock is not enabled as recursive. My thinking is that this could be something that is not captured by the Flow tests suite. So for a next action I'm going to see if making Flow locks recursive by default makes the PL UITests pass.

@digitaliz
Copy link
Author

digitaliz commented Jun 5, 2021

CircleCI is down atm. but locally Flow test suite passes after making recursive locks the default.

Running PL UITests locally after fastlane update frameworks, it seems making recursive locks the default does not make the PL UITests pass. I'd like to verify with @moglistree it's actually running the version with default recursive locks before conclusively closing that investigation.

An activity I think could be next up is to manually execute the UITests that are failing, if failing behaviour is experienced then we can rule out that the UI testing code/framework itself is involved in the failure.

@digitaliz digitaliz closed this Sep 17, 2021
@digitaliz digitaliz deleted the fix-pthread_mutex_t-dangling-pointer branch September 17, 2021 11:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Warnings: Initialization of 'PThreadMutex' results in a dangling pointer
4 participants