Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Only requeue compaction when there was activity #759

Merged
merged 1 commit into from
Nov 9, 2018

Conversation

keith-turner
Copy link
Contributor

While working on an experimental compaction strategy for Fluo for apache/fluo#1054 I ran into a situation where Accumulo was calling my compaction strategy hundreds of thousands of times per second. I tracked down the problem and fixed it in this PR.

@@ -41,12 +41,18 @@ public void run() {
return;
}

tablet.majorCompact(reason, queued);
CompactionStats stats = tablet.majorCompact(reason, queued);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would have thought Spotbugs would have dinged this method returning stats that aren't being used...

// make blocking calls to gather information. Without the following check these strategies would
// endlessly requeue. So only check if a subsequent compaction is needed if the previous
// compaction actually did something.
if (stats != null && stats.getEntriesRead() > 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't this prevent compactions from happening when stats.getEntriesRead() == 0?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, tablets are checked periodically to see if they need a compaction by the tablet server. This code is just an optimization to re-queue the tablet for compaction if there is still work to do (without waiting for the periodic check). If we run a compaction and it does nothing, its likely there is no follow on compaction work to be done.

For background, tablets are limited in the number of files they will compact at once. So if a tablet had 100 files to compact and the limit is 20, then when it compacts 20 files it would be nice if immediately re-queued instead of waiting for the tablet server to check all tablets again. With these changes, it will still re-queued immediately in that case.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK thanks for making that clear.

@keith-turner keith-turner merged commit b6409ef into apache:1.9 Nov 9, 2018
@keith-turner keith-turner deleted the majc-requeue-check branch November 9, 2018 15:36
@ctubbsii ctubbsii added this to the 1.9.3 milestone Jul 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants