Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

vTPM communication and error handling refactoring #4400

Open
wants to merge 8 commits into
base: master
Choose a base branch
from

Conversation

shjala
Copy link
Member

@shjala shjala commented Oct 25, 2024

This pull request includes changes to the vtpm, domainmgr and kvm components to improve the handling of virtual TPM (vTPM) instances. The changes enhance error handling, refactor the vTPM launch and termination processes, and introduce HTTP-based communication for vTPM commands.

  • vTPM : refactor control socket communication and error handling
    This changes refactors the control socket communication and error handling
    in the vTPM (server) and KVM (client). The control socket communication
    is now handled by HTTP over UDS, and the error handling is improved,
    since the vTPM server now returns an error message when an error occurs.

  • Domainmgr : refactor virtual TPM setup and termination
    Use a defer function to ensure that the virtual TPM is always terminated
    when the domain manager hits an error during the setup process or boot
    process.

  • vTPM : fix bug when getting launch request
    When server gets a launch request, it checks if the the requested
    instance is already running, but it only checks the internal list and
    not actually the running instances. This can lead to server thinking
    the instance is running but client fails to get the PID with error
    "failed to get pid from file ...".

  • domainmgr : call vTPM asynchronously
    Refactor vTPM setup/term/teardown functions to call the vTPM server
    endpoints asynchronously, this remove the timeout guessworks and make the
    vTPM setup more reliable.

@shjala shjala changed the title Vtpm.server vTPM communication and error handling refactoring Oct 25, 2024
pkg/vtpm/swtpm-vtpm/src/main.go Fixed Show fixed Hide fixed
pkg/vtpm/swtpm-vtpm/src/main.go Fixed Show fixed Hide fixed
pkg/vtpm/swtpm-vtpm/src/main.go Fixed Show fixed Hide fixed
pkg/vtpm/swtpm-vtpm/src/main.go Fixed Show fixed Hide fixed
pkg/vtpm/swtpm-vtpm/src/main.go Fixed Show fixed Hide fixed
pkg/vtpm/swtpm-vtpm/src/main.go Fixed Show fixed Hide fixed
@shjala shjala changed the title vTPM communication and error handling refactoring [WIP] vTPM communication and error handling refactoring Oct 25, 2024
@shjala shjala force-pushed the vtpm.server branch 2 times, most recently from c0960cb to e39c549 Compare October 28, 2024 10:26
This changes refactors the control socket communication and error handling
in the vTPM (server) and KVM (client). The control socket communication
is now handled by HTTP over UDS, and the error handling is improved,
since the vTPM server now returns an error message when an error occurs.

Signed-off-by: Shahriyar Jalayeri <[email protected]>
Use a defer function to ensure that the virtual TPM is always terminated
when the domain manager hits an error during the setup process or boot
process.

Signed-off-by: Shahriyar Jalayeri <[email protected]>
When server gets a launch request, it checks if the the requested
instance is already running, but it only checks the internal list and
not actually the running instances. This can lead to server thinking
the instance is running but client fails to get the PID with error
"failed to get pid from file ...".

Signed-off-by: Shahriyar Jalayeri <[email protected]>
Validate ID before using it in, it must be in form of a UUID.

Signed-off-by: Shahriyar Jalayeri <[email protected]>
Kick watchdog more often when waiting, better be safe
than sorry.

Signed-off-by: Shahriyar Jalayeri <[email protected]>
@shjala shjala force-pushed the vtpm.server branch 3 times, most recently from 36e4d69 to 16be694 Compare October 28, 2024 14:51
@shjala shjala changed the title [WIP] vTPM communication and error handling refactoring vTPM communication and error handling refactoring Oct 28, 2024
Copy link

codecov bot commented Oct 28, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 20.93%. Comparing base (dd27fe1) to head (4808f88).
Report is 18 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #4400   +/-   ##
=======================================
  Coverage   20.93%   20.93%           
=======================================
  Files          13       13           
  Lines        2895     2895           
=======================================
  Hits          606      606           
  Misses       2163     2163           
  Partials      126      126           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@OhmSpectator
Copy link
Member

@shjala. could you please add pkg/vtpm/swtpm-vtpm/vendor/ into https://github.com/lf-edge/eve/blob/e0b943fd55e773f2000c8e63611f242f615c902c/.spdxignore

@shjala
Copy link
Member Author

shjala commented Oct 29, 2024

@shjala. could you please add pkg/vtpm/swtpm-vtpm/vendor/ into https://github.com/lf-edge/eve/blob/e0b943fd55e773f2000c8e63611f242f615c902c/.spdxignore

sure can do.

@OhmSpectator
Copy link
Member

I'm looking into PR, but I'm frequently interrupted, so it will take time. However, it's not abandoned!

status.VirtualTPM = true
defer func(status *types.DomainStatus) {
if status.BootFailed || status.HasError() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have discussed these conditions with @shjala, and it looks like a more reliable way to check that we do not need to terminate the vTMP is status.Activated set to true, as it guarantees that ActivateTails finished successfully.

@shjala shjala force-pushed the vtpm.server branch 2 times, most recently from d51a9fa to 141ff1b Compare October 29, 2024 13:15
Refactor vTPM setup/term/teardown functions to call the vTPM server
endpoints asynchronously, this remove the timeout guessworks and make the
vTPM setup more reliable.

Refactor vTPM setup functions to accept all watchdog related parameters
as struct.

Signed-off-by: Shahriyar Jalayeri <[email protected]>
The domainmanager calls vTPM server asynchronously, so we dont need to
worry and set the wait time too low to return quicly to prevent a watchdog
kill on pillar.

Signed-off-by: Shahriyar Jalayeri <[email protected]>
@shjala shjala force-pushed the vtpm.server branch 2 times, most recently from 624a4e5 to 4fd6d20 Compare October 29, 2024 13:20
Add vtpm vendor directory to .spdxignore.

Signed-off-by: Shahriyar Jalayeri <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants