way to elegantly stop MM2 in the middle of the execution #10

mauricioaniche · 2015-11-26T16:50:15Z

No description provided.

davisjam · 2017-09-28T16:25:29Z

I think we could do this readily in RepositoryMining.

"Finish processing the current repo"
"Finish processing the current commit"

Implementation options:

Make mine asynchronous and add methods to poll for completion as well as to inject "finish up" requests.
Add thread-safe RepositoryMining APIs and have mine check the state variables appropriately.
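A minimal sketch of the second option, assuming a simplified stand-in for RepositoryMining. `StoppableMiner`, `requestStop`, and the string-based commit list are hypothetical names for illustration, not RepoDriller's actual API. The idea is only that `mine` checks a thread-safe flag at each commit boundary, which gives the "finish processing the current commit" behaviour:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

// Hypothetical stand-in for RepositoryMining; not the real RepoDriller class.
class StoppableMiner {
    // Thread-safe state variable that any thread may set.
    private final AtomicBoolean stopRequested = new AtomicBoolean(false);
    private int processed = 0;

    // Safe to call from another thread while mine() is running.
    void requestStop() {
        stopRequested.set(true);
    }

    int processedCount() {
        return processed;
    }

    // Visits commits in order and checks the flag before starting each one,
    // so the commit currently being processed is always finished.
    void mine(List<String> commits, Consumer<String> visitor) {
        for (String commit : commits) {
            if (stopRequested.get()) {
                break;
            }
            visitor.accept(commit);
            processed++;
        }
    }
}
```

A "finish processing the current repo" request could use a second flag checked in the outer loop over repositories.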

@mauricioaniche Preference?

mauricioaniche · 2017-09-28T18:32:46Z

I think we can do that in RepositoryMining#mine too. I also thought of an even simpler solution: at every iteration of the loop, we check whether System.getenv("repodrillerstop") equals TRUE. If so, we gracefully stop repodriller.

What do you think?

The rationale behind the feature suggestion: in practice, you start a repodriller run, a few minutes later you find a problem by inspecting the CSV, and you want to stop it. If you kill repodriller's process directly, then depending on what repodriller is doing at that moment, the Git project repository can be left in a bad state, and the next run fails because of it.

davisjam · 2017-09-28T19:51:32Z

I'm a bit confused about your environment variable idea. I thought that once a process A has begun, another process B cannot (easily) change A's environment variables. Stack Overflow seems to agree, e.g. this question.

Fundamentally, to exit gracefully, a RepoDriller process needs to monitor for external input somehow (signal handler, magic file, etc.) and then convince the mine method to finish.

On this topic...does RepoDriller strive to work outside of UNIX environments?

mauricioaniche · 2017-09-29T09:35:33Z

Woah, I did not know that! Interesting!

Yes, we should make sure it works in Windows as well, as we do have students using Windows machines.

A magic file in the same directory as the jar file should do (and is simple to implement). Sounds good?
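A rough sketch of what the magic-file check could look like. The class name `StopFile` and the file name `repodriller.stop` are assumptions made up for this example. The mine loop would call `stopRequested()` once per iteration:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper; the marker file name is chosen by the caller.
class StopFile {
    private final Path marker;

    StopFile(Path marker) {
        this.marker = marker;
    }

    // True once someone has created the marker file.
    boolean stopRequested() {
        return Files.exists(marker);
    }

    // Removes the marker so the next run does not stop immediately.
    void clear() {
        try {
            Files.deleteIfExists(marker);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Since it relies only on java.nio.file, the same check should behave the same way on Windows and UNIX. To stop a run, the user creates the file (for example an empty `repodriller.stop`) next to the jar.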

davisjam · 2017-09-29T21:49:11Z

Design

The overall design is independent of mechanism:

Some background thread listens for requests
It "does something"
It sends a response back

Requests might be "what is the status" (#75), "please die gracefully" (#10), etc.

External interface

To be honest I'm not really thrilled with the "magic file" approach as a portable signaling mechanism. It would work for this specific case, but it wouldn't extend well to other future needs.

Since RepoDriller is a framework it should opt for readily-extendable mechanisms.
Since it needs to be cross-platform, it should be something more generic than FS-based.

I suggest that the communication mechanism be HTTP-based instead. Then you can use your favorite HTTP Client (wget, curl, etc.) to talk to RepoDriller. This will be more portable than using the FS.

This will also be easier to use in a distributed setting, as you described in #24.
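To make the proposal concrete, here is a small sketch using the HTTP server that ships with the JDK (`com.sun.net.httpserver`). The class name `ControlServer`, the `/stop` path, and the choice to bind to 127.0.0.1 are all assumptions for illustration. The server's own thread handles the request and flips a flag that the mining loop would poll:

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical control endpoint; not part of RepoDriller.
class ControlServer {
    private final AtomicBoolean stopRequested = new AtomicBoolean(false);
    private final HttpServer server;

    // Port 0 asks the OS for any free port; see port() for the one chosen.
    ControlServer(int port) throws IOException {
        server = HttpServer.create(new InetSocketAddress("127.0.0.1", port), 0);
        // "Please die gracefully": record the request and acknowledge it.
        server.createContext("/stop", exchange -> {
            stopRequested.set(true);
            byte[] body = "stopping\n".getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
    }

    void start() {
        server.start();
    }

    void shutdown() {
        server.stop(0);
    }

    int port() {
        return server.getAddress().getPort();
    }

    // Polled by the mining loop.
    boolean stopRequested() {
        return stopRequested.get();
    }
}
```

If the server were started on port 8080, `curl http://127.0.0.1:8080/stop` would request a graceful stop. A status request such as the one in #75 could be another context on the same server.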

mauricioaniche · 2017-09-30T19:32:02Z

The HTTP approach is indeed a very elegant solution, but I fear it would be a waste of our limited development time. I do not expect this feature to be that popular; I basically use it only while I'm still testing/debugging my study. So, hopefully, other people won't need fancy ways to stop the execution either.

The magic file, although less elegant, will cost us just 2 lines of code.

I suggest we go for the simple implementation right now. Then, if we see more people asking about a fancier solution, we go for the HTTP solution.

davisjam · 2017-10-01T01:32:32Z

I don't think the HTTP approach is all that scary. How to extend the various components to support various reporting/response mechanisms is the bigger question.

@mauricioaniche Can you advise on the places that might need extension? Here's my list so far:

Die gracefully

RepositoryMining: An API that says "please die gracefully".
CommitVisitor: We'll be terminating the study early. CommitVisitors should have their finalize callback invoked. Do they need a special finalizeEarly callback, or a way to know that they are being terminated early?
Anything else affected?

The minimal change to support exiting early is just to short-circuit the loop in mine. I think this should be fine, because the expected use case as I understand it is to abort a study gone bad. I don't think informing the CommitVisitors about the exact circumstance of their finalize callback really matters.
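As a sketch of that minimal change, under the assumption of a simplified visitor interface: `SketchVisitor`, `onFinalize`, and `EarlyExitMiner` are hypothetical stand-ins, not the real CommitVisitor or RepositoryMining. The loop is short-circuited, and the finalize callback still runs exactly once, with no extra information about why:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for CommitVisitor.
interface SketchVisitor {
    void process(String commit);

    // Stand-in for the finalize callback; invoked on normal and early exit alike.
    void onFinalize();
}

// Hypothetical stand-in for RepositoryMining.
class EarlyExitMiner {
    private final AtomicBoolean stopRequested = new AtomicBoolean(false);

    // The "please die gracefully" API.
    void dieGracefully() {
        stopRequested.set(true);
    }

    // Returns true if every commit was processed, false if the run stopped early.
    boolean mine(List<String> commits, List<SketchVisitor> visitors) {
        boolean completed = true;
        try {
            for (String commit : commits) {
                if (stopRequested.get()) {
                    completed = false;
                    break;
                }
                for (SketchVisitor visitor : visitors) {
                    visitor.process(commit);
                }
            }
        } finally {
            // Visitors get to flush their output even when the study is aborted.
            for (SketchVisitor visitor : visitors) {
                visitor.onFinalize();
            }
        }
        return completed;
    }
}
```

The boolean return value is one possible way for the caller, rather than the visitors, to learn that the study ended early.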

Progress report

RepositoryMining: An API to say "how far along are you?". The resulting ProgressReport would include the progress through the current repo (based on the CommitRange, which is potentially an upper bound due to filters), the number of completed repositories, and the number of pending repos. This is pretty easily obtained by having RepositoryMining maintain a MiningStatistics object.
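One possible shape for these two objects, as a sketch only. The field and method names below are invented for illustration, and "pending" is assumed to mean repositories that have not been started yet:

```java
// Hypothetical immutable snapshot returned to whoever asks "how far along are you?".
final class ProgressReport {
    final int commitsDoneInCurrentRepo;
    final int commitsUpperBound; // from the CommitRange; filters may lower the real count
    final int reposCompleted;
    final int reposPending;

    ProgressReport(int commitsDoneInCurrentRepo, int commitsUpperBound,
                   int reposCompleted, int reposPending) {
        this.commitsDoneInCurrentRepo = commitsDoneInCurrentRepo;
        this.commitsUpperBound = commitsUpperBound;
        this.reposCompleted = reposCompleted;
        this.reposPending = reposPending;
    }
}

// Hypothetical counters updated by the mining thread and read by a reporting thread.
class MiningStatistics {
    private final int totalRepos;
    private int reposStarted;
    private int reposCompleted;
    private int commitsDone;
    private int commitsUpperBound;

    MiningStatistics(int totalRepos) {
        this.totalRepos = totalRepos;
    }

    synchronized void startRepo(int commitUpperBound) {
        reposStarted++;
        commitsDone = 0;
        commitsUpperBound = commitUpperBound;
    }

    synchronized void commitDone() {
        commitsDone++;
    }

    synchronized void repoDone() {
        reposCompleted++;
    }

    // synchronized so the snapshot is consistent across all counters
    synchronized ProgressReport snapshot() {
        return new ProgressReport(commitsDone, commitsUpperBound,
                reposCompleted, totalRepos - reposStarted);
    }
}
```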

Other things we might want to be able to query on-the-fly
Ideas?

mauricioaniche added the enhancement label Nov 26, 2015
