way to elegantly stop MM2 in the middle of the execution #10

Open
mauricioaniche opened this issue Nov 26, 2015 · 7 comments

Comments

@mauricioaniche
Owner

No description provided.

@davisjam
Contributor

I think we could do this readily in RepositoryMining.

  1. "Finish processing the current repo"
  2. "Finish processing the current commit"

Implementation options:

  1. Make mine asynchronous and add methods to poll for completion as well as to inject "finish up" requests.
  2. Add thread-safe RepositoryMining APIs and have mine check the state variables appropriately.
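A minimal sketch of option 2, assuming a flag that any thread can set and that the mining loop checks at commit and repository boundaries. All names here (StopRequest, StoppableMiner, requestStop) are hypothetical and are not RepoDriller's actual API; repositories and commits are stood in for by plain lists of strings.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical names throughout; this is not RepoDriller's actual API.
enum StopRequest { NONE, AFTER_CURRENT_REPO, AFTER_CURRENT_COMMIT }

class StoppableMiner {
    // Written by any thread, read by the mining thread.
    private final AtomicReference<StopRequest> stop =
            new AtomicReference<>(StopRequest.NONE);
    private int processedCommits = 0;

    // Thread-safe: may be called while mine() is running.
    void requestStop(StopRequest request) {
        stop.set(request);
    }

    int processedCommits() {
        return processedCommits;
    }

    // Each inner list stands in for the commits of one repository.
    void mine(List<List<String>> repos) {
        for (List<String> commits : repos) {
            for (String commit : commits) {
                process(commit);
                // "Finish processing the current commit", then leave.
                if (stop.get() == StopRequest.AFTER_CURRENT_COMMIT) return;
            }
            // "Finish processing the current repo", then leave.
            if (stop.get() != StopRequest.NONE) return;
        }
    }

    // Placeholder for the real per-commit work.
    protected void process(String commit) {
        processedCommits++;
    }
}
```

The loop only ever exits between commits, so a stop request never interrupts work on a repository midway through a commit.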

@mauricioaniche Preference?

@mauricioaniche
Owner Author

I think we can do that in RepositoryMining#mine too. I also thought about a simpler solution: at every iteration of the loop, we check whether System.getenv("repodrillerstop") equals TRUE. If so, we gracefully stop repodriller.

What do you think?

The rationale behind the feature suggestion: in practice, you start a repodriller run, a few minutes later you find a problem by inspecting the CSV, and you want to stop it. When you kill repodriller's process directly, depending on what repodriller is doing at that moment, it can leave the Git repository in a bad state; the next run then fails because of it.

@davisjam
Contributor

I'm a bit confused about your environment variable idea. I thought that once a process A has begun, another process B cannot (easily) change A's environment variables. SO seems to agree, e.g. this question.

Fundamentally, to exit gracefully, a RepoDriller process needs to monitor for external input somehow (signal handler, magic file, etc.) and then convince the mine method to finish.

On this topic...does RepoDriller strive to work outside of UNIX environments?

@mauricioaniche
Owner Author

Woah, I did not know that! Interesting!

Yes, we should make sure it works in Windows as well, as we do have students using Windows machines.

A magic file in the same directory as the jar file should do (and is simple to implement). Sounds good?
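A rough sketch of the magic-file check, assuming the mining loop polls for the file once per iteration. The class name StopFileCheck and the file name repodriller.stop are hypothetical, and the default below resolves against the working directory, which matches the jar's directory only when the jar is launched from there.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper; not part of RepoDriller.
class StopFileCheck {
    private final Path stopFile;

    StopFileCheck(Path stopFile) {
        this.stopFile = stopFile;
    }

    // Assumption: the working directory is where the jar was started from.
    static StopFileCheck inWorkingDirectory() {
        return new StopFileCheck(Paths.get("repodriller.stop"));
    }

    // True once someone has created the magic file.
    boolean stopRequested() {
        return Files.exists(stopFile);
    }
}
```

The loop in mine would then call stopRequested() at the top of each iteration and break when it returns true. Creating the file works the same way on Windows and UNIX, for example with an empty file from a file manager or a shell.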

@davisjam
Contributor

Design

The overall design is independent of mechanism:

  • Some background thread listens for requests
  • It "does something"
  • It sends a response back

Requests might be "what is the status" (#75), "please die gracefully" (#10), etc.

External interface

To be honest I'm not really thrilled with the "magic file" approach as a portable signaling mechanism. It would work for this specific case, but it wouldn't extend well to other future needs.

  1. Since RepoDriller is a framework it should opt for readily-extendable mechanisms.
  2. Since it needs to be cross-platform, it should be something more generic than FS-based.

I suggest that the communication mechanism be HTTP-based instead. Then you can use your favorite HTTP Client (wget, curl, etc.) to talk to RepoDriller. This will be more portable than using the FS.

This will also be easier to use in a distributed setting, as you described in #24.
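To give a sense of scale, here is a sketch of the HTTP variant using the HttpServer that ships with the JDK (com.sun.net.httpserver). The class name ControlServer, the /stop path, and the choice to bind to the loopback address only are assumptions for illustration, not an agreed design.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical control endpoint; not part of RepoDriller.
class ControlServer {
    private final AtomicBoolean stopRequested = new AtomicBoolean(false);
    private final HttpServer server;

    // Port 0 asks the OS for any free port; see port() for the actual one.
    ControlServer(int port) throws IOException {
        server = HttpServer.create(new InetSocketAddress("127.0.0.1", port), 0);
        // "please die gracefully": record the request and acknowledge it.
        server.createContext("/stop", exchange -> {
            stopRequested.set(true);
            byte[] body = "stop requested\n".getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
    }

    // Requests are handled on a background thread started here.
    void start() {
        server.start();
    }

    void shutdown() {
        server.stop(0);
    }

    int port() {
        return server.getAddress().getPort();
    }

    // Polled by the mining loop.
    boolean stopRequested() {
        return stopRequested.get();
    }
}
```

With the server started on, say, port 8080 (a made-up value), a request such as curl http://127.0.0.1:8080/stop would set the flag. A status request like the one in #75 could be another context registered on the same server.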

@mauricioaniche
Owner Author

The HTTP approach is indeed a very elegant solution, but I fear it will just be a waste of our limited development time. I do not expect this feature to be that popular; I basically use it when I'm still testing/debugging my study. So, hopefully, other people won't need fancy ways to stop the execution either.

The magic file, although less elegant, will cost us just 2 lines of code.

I suggest we go for the simple implementation right now. Then, if we see more people asking about a fancier solution, we go for the HTTP solution.

@davisjam
Contributor

davisjam commented Oct 1, 2017

I don't think the HTTP approach is all that scary. How to extend the various components to support various reporting/response mechanisms is the bigger question.

@mauricioaniche Can you advise on the places that might need extension? Here's my list so far:

Die gracefully

  1. RepositoryMining: An API that says "please die gracefully".
  2. CommitVisitor: We'll be terminating the study early. CommitVisitors should have their finalize callback invoked. Do they need a special finalizeEarly callback, or a way to know that they are being terminated early?
  3. Anything else affected?

The minimal change to support exiting early is just to short-circuit the loop in mine. I think this should be fine, because the expected use case as I understand it is to abort a study gone bad. I don't think informing the CommitVisitors about the exact circumstance of their finalize callback really matters.
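The minimal change could look roughly like the sketch below, assuming the visitors need no special early-exit callback. StudyVisitor and EarlyExitMiner are hypothetical stand-ins (finish plays the role of the finalize callback), and the stop condition is passed in as a BooleanSupplier so that any signaling mechanism can back it.

```java
import java.util.List;
import java.util.function.BooleanSupplier;

// Hypothetical stand-in for a commit visitor; not RepoDriller's interface.
interface StudyVisitor {
    void process(String commit);
    void finish(); // plays the role of the finalize callback
}

class EarlyExitMiner {
    static void mine(List<String> commits,
                     List<StudyVisitor> visitors,
                     BooleanSupplier stopRequested) {
        try {
            for (String commit : commits) {
                // Short-circuit between commits, never in the middle of one.
                if (stopRequested.getAsBoolean()) break;
                for (StudyVisitor v : visitors) v.process(commit);
            }
        } finally {
            // Same callback whether the study completed or was cut short.
            for (StudyVisitor v : visitors) v.finish();
        }
    }
}
```

Because the callbacks sit in a finally block, they also run if a visitor throws; whether that is desirable is a separate design question.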

Progress report

  1. RepositoryMining: An API to say "how far along are you?". The resulting ProgressReport would include the progress through the current repo (based on the CommitRange, which is potentially an upper bound due to filters), the number of completed repositories, and the number of pending repos. This is pretty easily obtained by having RepositoryMining maintain a MiningStatistics object.
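One possible shape for this, as a sketch only: the names MiningStatistics and ProgressReport come from the item above, but every field and method below is an assumption. The mining thread updates the counters, and any other thread can take a consistent snapshot.

```java
// Immutable snapshot handed to whoever asks "how far along are you?".
final class ProgressReport {
    final int completedRepos;
    final int pendingRepos;
    final int commitsVisitedInCurrentRepo;
    // Upper bound only: filters may skip commits inside the range.
    final int commitsInRangeUpperBound;

    ProgressReport(int completedRepos, int pendingRepos,
                   int commitsVisitedInCurrentRepo, int commitsInRangeUpperBound) {
        this.completedRepos = completedRepos;
        this.pendingRepos = pendingRepos;
        this.commitsVisitedInCurrentRepo = commitsVisitedInCurrentRepo;
        this.commitsInRangeUpperBound = commitsInRangeUpperBound;
    }
}

// Hypothetical counters maintained by the mining loop.
class MiningStatistics {
    private final int totalRepos;
    private int completedRepos = 0;
    private boolean repoInProgress = false;
    private int visited = 0;
    private int upperBound = 0;

    MiningStatistics(int totalRepos) {
        this.totalRepos = totalRepos;
    }

    synchronized void startRepo(int commitsInRange) {
        repoInProgress = true;
        visited = 0;
        upperBound = commitsInRange;
    }

    synchronized void commitVisited() {
        visited++;
    }

    synchronized void finishRepo() {
        repoInProgress = false;
        completedRepos++;
        visited = 0;
        upperBound = 0;
    }

    // Pending repos exclude both completed ones and the one in progress.
    synchronized ProgressReport snapshot() {
        int pending = totalRepos - completedRepos - (repoInProgress ? 1 : 0);
        return new ProgressReport(completedRepos, pending, visited, upperBound);
    }
}
```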

Other things we might want to be able to query on-the-fly
Ideas?


2 participants