Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed use case when bundle start level will be set within an bundle activator, test case FrameworkStartLevelImplTest #190

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

JochenHiller
Copy link

We found problems when manipulating start levels from a bundle activator.
This test case can reproduce that, see console logs.

We can also extend that to check on results adding some assertions

@JochenHiller
Copy link
Author

Related to https://issues.apache.org/jira/browse/FELIX-6586

@JochenHiller
Copy link
Author

JochenHiller commented Dec 16, 2022

@karlpauls @cziegeler I added now a proposal for a fix of the test case.

From my point of view there are two problems happening:

  1. When a bundle start level will be changed from a bundle activator which is started by FrameworkStartLevel thread, the way to process that is by default asynchronous. Which does not work when bundle startup is already done by FrameworkStartLevel thread, so the processing has to be done in current thread synchronous by calling m_felix.setBundleStartLevel() on Felix framework

  2. When this method m_felix.setBundleStartLevel() will be called, there is already a computed TreeMap of m_startLevelBundles in Felix. This list contains the bundle with old start level. So new start level requested needs to be updated by removing old tuple and adding new tuple to get it in right sort-order in TreeMap.

I found that by adding a huge number of debug messages. I will leave them in for your testing. If you would agree on change, I would remove all debug output again, and do some re-formatting of code to Felix code style.

My tests are running fine, also the whole Felix framework tests did also work well.
Do you consider to re-run OSGi TCK test suite for a complete test? I can offer to support on that as well if helpful.

@JochenHiller JochenHiller changed the title Added test case for FrameworkStartLevelImpl Fixed use case when bundle start level will be set within an bundle activator, test case FrameworkStartLevelImplTest Dec 16, 2022
@tjwatson
Copy link
Member

  1. When a bundle start level will be changed from a bundle activator which is started by FrameworkStartLevel thread, the way to process that is by default asynchronous. Which does not work when bundle startup is already done by FrameworkStartLevel thread, so the processing has to be done in current thread synchronous by calling m_felix-setBundleStartLevel() on Felix framework

I'm not in the position to review this change, but I will state that the specification requires that the start-level value be persisted synchronously, but then the actual act of starting or stopping the bundle be done asynchronously in reaction to the bundle start-level being changed with org.osgi.framework.startlevel.BundleStartLevel.setStartLevel(int). That should include being done by the actual start-level processing thread.

If a bundle changes another bundles start-level during the start-level processing such that the bundle must be stopped or started according to the current start level then that operation must be queued up to happen asynchronously. That is until the current start-level framework setting is fully processed to the "final" start-level. Any bundles that have not been processed yet should use the synchronously persisted start-level value for the bundle. Any bundles that have already been processed by the framework start-level change operation will "catch-up" after the "final" framework start-level has be reached and processed. That is they will get the async job that got queued up run at this point to either stop/start the bundle according to the current state of the framework start-level and the bundle's start-level.

@JochenHiller
Copy link
Author

Thanks a lot for your comment @tjwatson

If a bundle changes another bundles start-level during the start-level processing such that the bundle must be stopped or started according to the current start level then that operation must be queued up to happen asynchronously. That is until the current start-level framework setting is fully processed to the "final" start-level. Any bundles that have not been processed yet should use the synchronously persisted start-level value for the bundle. Any bundles that have already been processed by the framework start-level change operation will "catch-up" after the "final" framework start-level has be reached and processed. That is they will get the async job that got queued up run at this point to either stop/start the bundle according to the current state of the framework start-level and the bundle's start-level.

If I get you right, than the test case is invalid, or? If I first set framework start level of 11 (lower than the changed 20-40) after bundle M has been running, and then AFTERWARDS go to 100 then it does work, bundle start order will be recalculated. I verified that with test case.

I am unsure what shall happen if initial bundle start level is e.g. 12, bundle M will manipulate to 20-40, and framework start level is 15. We got this ERROR message

ERROR: Bundle A [1] Error starting file:/var/folders/vy/jx2pf8_12ygf55yqsp6xz9t00000gq/T/generated-bundles5059031965402415424.dir/bundleA-6783478623024065226.jar (org.osgi.framework.BundleException: Cannot start bundle A [1] because its start level is 40, which is greater than the framework's start level of 12.)
org.osgi.framework.BundleException: Cannot start bundle A [1] because its start level is 40, which is greater than the framework's start level of 12.
	at org.apache.felix.framework.Felix.startBundle(Felix.java:2227)
	at org.apache.felix.framework.Felix.setActiveStartLevel(Felix.java:1590)
	at org.apache.felix.framework.FrameworkStartLevelImpl.run(FrameworkStartLevelImpl.java:315)
	at java.lang.Thread.run(Thread.java:748)

Is that ERROR ok if we change the bundle start level during the phase to go to request framework start level?
It seems it does not have any implications except we get this ERROR log.

@tjwatson
Copy link
Member

Is that ERROR ok if we change the bundle start level during the phase to go to request framework start level?
It seems it does not have any implications except we get this ERROR log.

I don't believe that error is OK. In my opinion the framework should avoid logging an error in this case and move on in the processing of the rest of framework start-level changes. But that is about it. It sounds like the framework is acting upon a snap-shot of the bundle start-levels for the duration of the framework start-level processing. Some bundle's start-levels may change that make it impossible to actual start the bundle at the previously determined bundle start-level snap-shot, because it has changed synchronously.

From a spec perspective, I think this is fine as long as the async operation that gets queued up will stop/start the bundle according to the levels that got set during the whole framework start-level change processing.

@pkriens
Copy link
Contributor

pkriens commented Dec 19, 2022

@tjwatson I agree. The behavior is ok, there should just not be an error reported in this case.

@JochenHiller
Copy link
Author

@karlpauls @cziegeler
Based on discussions above, I would propose:

  • I will recall this PR, as there is indeed no functional bug in Felix framework
  • I propose to rethink if this ERROR log should maybe a WARN log instead. Maybe in source code there could more clarification how this can happen and if there are investigations by framework user needed, e.g. check the way start levels will be set during startup
  • I you consider that as useful, I could re-work the test cases to avoid this problem, and use the test case as sample how to use the start levels CORRECT.

Any thoughts?

@JochenHiller
Copy link
Author

JochenHiller commented Dec 19, 2022

I've updated the PR according to my proposal above.

Feedback welcome.

@pkriens
Copy link
Contributor

pkriens commented Jan 31, 2023

@tjwatson @karlpauls @cziegeler Would it be possible to take this PR?

The log error message is not warranted here imho. In very large system, when this happens, it creates an avalanche of error logs. Would be very much appreciated if this could be in the next release.

@tjwatson
Copy link
Member

@karlpauls or @cziegeler should review the framework code.

@pkriens
Copy link
Contributor

pkriens commented Feb 1, 2023

@karlpauls or @cziegeler should review the framework code.

Any idea how to motivate them? :-) I could offer a beer but my (beloved) travel restrictions keep me out of their hoods ...

@cziegeler
Copy link
Contributor

I'll try to stay away from touching the framework code and leave it up to @karlpauls . I'm sure he'll have a look as soon as he has time

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants