Getting the MAC table fails on recent ONIE switches #7463

Closed
kcgthb opened this issue Aug 8, 2024 · 2 comments

kcgthb commented Aug 8, 2024

As described in #7462, getting the MAC table from ONIE switches is initially attempted via passwordless SSH.

The issue is that the command used in MacMap.pm doesn't seem to work on more recent Cumulus versions (I'm using 5.9), likely due to a path change:

my @res=xCAT::Utils->runcmd("ssh -o StrictHostKeyChecking=no -o PasswordAuthentication=no $switch 'bridge fdb show|grep -i -v permanent|tr A-Z a-z 2>/dev/null' 2>/dev/null",-1);

$ ssh $switch 'bridge fdb show'
Debian GNU/Linux 12
bash: line 1: bridge: command not found

Although it would be possible to force the full path:

$ ssh $switch '/usr/sbin/bridge fdb show'
Debian GNU/Linux 12
b0:cf:0e:0d:11:72 dev swp7 master br_default permanent
b0:cf:0e:0d:11:73 dev swp8 master br_default permanent
b0:cf:0e:0d:11:28 dev swp9 master br_default permanent
[...]

that's probably not a good fix, as it may not survive new path changes in the future.

But the real issue here is that the "command not found" error is masked by the grep and tr pipes: the exit status of the pipeline is that of its last command (tr), so the initial error code is not returned through SSH, and the error message itself is discarded by 2>/dev/null. From the script's perspective, the ssh command succeeds but doesn't return any MAC address, so refresh_switch() returns an empty list and nodes cannot be discovered.
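To illustrate the masking locally, without a switch, here is a minimal sketch. It assumes bash is available, and `no_such_bridge_cmd` is a made-up name standing in for the `bridge` binary that is missing from the remote PATH:

```shell
#!/usr/bin/env bash
# no_such_bridge_cmd is a hypothetical name standing in for the `bridge`
# binary that cannot be found on the remote side.
pipeline='no_such_bridge_cmd fdb show 2>/dev/null | grep -i -v permanent | tr A-Z a-z'

# Default behaviour: the pipeline reports the status of its last command (tr),
# so the failure of the first command is lost.
masked_status=0
bash -c "$pipeline" || masked_status=$?

# With pipefail: the pipeline reports the rightmost non-zero status instead.
pipefail_status=0
bash -c "set -o pipefail; $pipeline" || pipefail_status=$?

echo "without pipefail: $masked_status"
echo "with pipefail:    $pipefail_status"
```

Without `pipefail` the status is 0 even though nothing ran. With `pipefail` it is non-zero. One thing to keep in mind: `grep -v` also exits non-zero when it selects no lines, so with `pipefail` a table that contains only permanent entries would be reported as a failure too.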

The bridge command is not found because running ssh host cmd will not start a login shell, so the remote PATH variable is not properly set. So maybe the remote command should be wrapped in a login shell, to make sure all the environment is set correctly and commands are found?
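A rough sketch of what wrapping the remote command in a login shell could look like. This is only an illustration, not necessarily the form the eventual fix takes: `wrap_in_login_shell` is a hypothetical helper, and it assumes bash is the shell on the switch (which the "bash: line 1:" error above suggests) and that the login profile there adds the directory containing `bridge` to PATH:

```shell
#!/usr/bin/env bash
# wrap_in_login_shell is a hypothetical helper, not part of xCAT.
# It prints a command line that runs its argument under `bash -l` with
# pipefail enabled, quoted so that it survives being re-parsed by the
# remote shell when passed through ssh.
wrap_in_login_shell() {
    local remote_cmd="set -o pipefail; $1"
    printf 'bash -l -c %q\n' "$remote_cmd"
}

wrapped=$(wrap_in_login_shell "bridge fdb show | grep -i -v permanent | tr A-Z a-z")
echo "$wrapped"

# Usage against a real switch (assumes $switch is set):
#   ssh -o StrictHostKeyChecking=no -o PasswordAuthentication=no "$switch" "$wrapped"
```

The login shell reads the profile files, so PATH is set the same way as in an interactive session, and `set -o pipefail` lets a failure anywhere in the pipeline show up in the exit status that ssh returns.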

I'll propose a PR shortly.

kcgthb added a commit to stanford-rc/xcat-core that referenced this issue Aug 8, 2024
- run the remote `bridge` command in a login shell, to make sure PATH is
  properly defined
- add `set -o pipefail` to ensure that errors are properly propagated
  back through the remote SSH command
kcgthb commented Aug 8, 2024

Proposed fix in #7464

@Obihoernchen

Merged

Obihoernchen added this to the 2.17 milestone Aug 9, 2024