Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add health checker #119

Closed
wants to merge 3 commits into from
Closed

Add health checker #119

wants to merge 3 commits into from

Conversation

mxpv
Copy link
Contributor

@mxpv mxpv commented Aug 29, 2019

Signed-off-by: Maksym Pavlenko [email protected]

Issue #, if available:
#115

Description of changes:
Adds IsAlive

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

Signed-off-by: Maksym Pavlenko <[email protected]>
sipsma
sipsma previously approved these changes Aug 29, 2019
machine.go Outdated
@@ -707,6 +707,13 @@ func (m *Machine) UpdateGuestDrive(ctx context.Context, driveID, pathOnHost stri
return nil
}

// IsAlive makes sure that a Firecracker instance is running and responds via API by
// calling `DescribeInstance` endpoint behind.
func (m *Machine) IsAlive(ctx context.Context) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like one of the valid states that can be returned by DescribeInstance is Halted: https://github.com/firecracker-microvm/firecracker/blob/35b73437641122865b1603bd74c144035e1790c9/api_server/swagger/firecracker.yaml#L453

There's also Uninitialized. That makes me feel like IsAlive is not really what one would expect. It seems more like it's checking whether the VM manager is alive rather than what we'd consider the Machine.

If we want to keep the behavior as is, I can't think of any great names, but maybe something like Exists would be better than IsAlive?

On the other hand, would it make sense to change the behavior and return an error here if the State is Halted or Uninitialized? What's the actual motivation behind this change?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I accidentally clicked the approve button, I think this should be addressed before merging)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(@sipsma - I've "dismissed" your review per your comment here.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some suggestions for names: IsVMMAlive, IsVMMResponsive, Reachable

@sipsma sipsma self-requested a review August 29, 2019 22:03
@samuelkarp samuelkarp dismissed sipsma’s stale review August 29, 2019 22:07

"I accidentally clicked the approve button, I think this should be addressed before merging" - @sipsma

Signed-off-by: Maksym Pavlenko <[email protected]>
return err
}

state := StringValue(instance.Payload.State)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it makes sense to verify the state if the purpose of this method is to see whether Firecracker responds to an API. I think getting any non-error response would meet the success criteria.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the motivation behind the change is to test the responsiveness of the API, then agreed and also think IsVMMResponsive is a good name.

If the motivation is instead to test that the Machine is in the running state, then can we just add a State function that returns the state value, whatever it may be?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@haikuoliu which behavior would suite better here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now I've renamed it to IsRunning which tells whether a VM is in Starting or Running state.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a bit confusing to see that we have Running as a state, and IsRunning actually means IsRunningOrStarting. Naming is hard...

I'm inclined to expose Firecracker's state as is instead of introducing our own notion of the state (running, responsive, alive, ...).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The status type could provide rich methods later, like https://godoc.org/inet.af/http#Status.IsClientError (inet.af is bradfitz's net/http rework proposal, which is referenced from golang/go#29652)

Signed-off-by: Maksym Pavlenko <[email protected]>
@mxpv
Copy link
Contributor Author

mxpv commented Sep 15, 2019

Closing until we figure out requirements.

@mxpv mxpv closed this Sep 15, 2019
@mxpv mxpv deleted the ping branch September 15, 2019 18:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants