suite.placeholder: Speed up ansible.cephlab #1821

Draft
wants to merge 2 commits into
base: main

zmc
Member

@zmc zmc commented Mar 13, 2023

By skipping the tags "user,pubkeys,nagios,nrpe" we can save ~3min per job, assuming users won't need to ssh in to a testnode while a job is running - which is probably very close to 100% of jobs.

Signed-off-by: Zack Cerza <[email protected]>
@zmc
Member Author

zmc commented Mar 23, 2023

Tested here: https://pulpito.ceph.com/zack-2023-03-23_17:45:29-smoke-main-distro-default-smithi/

ansible timing (skipping user,pubkeys,nagios,nrpe)

Thursday 23 March 2023  18:01:48 +0000 (0:00:00.384)       0:04:48.939 ******** 
=============================================================================== 
testnode : Install Amazon::S3. ---------------------------------------- 115.70s
testnode : Ensure perl-doc and cpanminus is installed on apt systems. -- 33.78s
testnode : Upgrade packages -------------------------------------------- 16.40s
container-host : Install container packages ---------------------------- 14.41s
testnode : Zap all non-root disks --------------------------------------- 9.45s
common : Update apt cache ----------------------------------------------- 4.17s
Gathering Facts --------------------------------------------------------- 3.44s
testnode : Create logical volume(s) ------------------------------------- 3.27s
container-host : Restart docker service --------------------------------- 2.49s
testnode : Blow away lingering OSD data and FSIDs ----------------------- 2.21s
testnode : Install packages --------------------------------------------- 2.10s
common : Mask sleep units ----------------------------------------------- 1.93s
container-host : Install registries-conf-ctl ---------------------------- 1.86s
testnode : Install packages via pip ------------------------------------- 1.69s
Gathering Facts --------------------------------------------------------- 1.31s
testnode : Write /scratch_devs ------------------------------------------ 1.12s
Gathering Facts --------------------------------------------------------- 1.11s
testnode : Update apt cache. -------------------------------------------- 1.09s
Gathering Facts --------------------------------------------------------- 1.07s
Gathering Facts --------------------------------------------------------- 0.95s

4m48s

@zmc
Member Author

zmc commented Mar 23, 2023

Tested here: https://pulpito.ceph.com/zack-2023-03-23_18:57:45-smoke-main-distro-default-smithi/

ansible timing (skipping user,pubkeys,nagios,nrpe)

Thursday 23 March 2023  19:09:45 +0000 (0:00:00.410)       0:02:13.301 ******** 
=============================================================================== 
testnode : Upgrade packages -------------------------------------------- 17.71s
container-host : Install container packages ---------------------------- 13.81s
testnode : Zap all non-root disks --------------------------------------- 8.35s
testnode : Install packages --------------------------------------------- 3.18s
common : Update apt cache ----------------------------------------------- 3.10s
Gathering Facts --------------------------------------------------------- 2.80s
testnode : Install packages via pip ------------------------------------- 2.78s
testnode : Create logical volume(s) ------------------------------------- 2.29s
container-host : Restart docker service --------------------------------- 2.29s
container-host : Install registries-conf-ctl ---------------------------- 2.15s
common : Mask sleep units ----------------------------------------------- 2.07s
testnode : Blow away lingering OSD data and FSIDs ----------------------- 1.46s
testnode : Update apt cache. -------------------------------------------- 1.33s
testnode : List any leftover Ceph artifacts from previous jobs ---------- 1.18s
Gathering Facts --------------------------------------------------------- 1.17s
Gathering Facts --------------------------------------------------------- 1.17s
testnode : Stop apache2 ------------------------------------------------- 1.14s
Gathering Facts --------------------------------------------------------- 1.09s
Gathering Facts --------------------------------------------------------- 0.95s
testnode : Install apt keys --------------------------------------------- 0.92s

2m13s

@zmc
Member Author

zmc commented Mar 23, 2023

Tested here: https://pulpito.ceph.com/zack-2023-03-22_22:52:17-smoke-main-distro-default-smithi/

ansible timing (skipping nothing - current main branch behavior)

Thursday 23 March 2023  05:16:06 +0000 (0:00:00.427)       0:07:42.674 ******** 
=============================================================================== 
testnode : Install Amazon::S3. ---------------------------------------- 106.82s
users : Create all admin users with sudo access. ----------------------- 59.39s
users : Update authorized_keys using the keys repo --------------------- 46.26s
testnode : Ensure perl-doc and cpanminus is installed on apt systems. -- 23.61s
container-host : Install container packages ---------------------------- 14.29s
testnode : Upgrade packages --------------------------------------------- 5.86s
users : Remove revoked users -------------------------------------------- 4.63s
testnode : Create logical volume(s) ------------------------------------- 3.24s
users : Update authorized_keys for each user with literal keys ---------- 2.69s
Gathering Facts --------------------------------------------------------- 2.57s
container-host : Restart docker service --------------------------------- 2.43s
testnode : Update apt cache. -------------------------------------------- 2.14s
testnode : Install apt keys --------------------------------------------- 1.81s
container-host : Install registries-conf-ctl ---------------------------- 1.80s
testnode : List any leftover Ceph artifacts from previous jobs ---------- 1.26s
testnode : Write /scratch_devs ------------------------------------------ 1.24s
Gathering Facts --------------------------------------------------------- 1.22s
testnode : Install packages via pip ------------------------------------- 1.18s
Gathering Facts --------------------------------------------------------- 1.14s
ansible-managed : Create the cephlab_sudo sudoers.d file. --------------- 1.08s

7m42s
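
To see how much of the baseline run goes to the `users` role, the per-task lines can be totalled by role. The following is a minimal sketch, not part of this PR: the helper name `seconds_by_role` and the regular expression are assumptions made for illustration, and the input is only the four `users` lines copied from the baseline report (the report lists the slowest tasks, so this is a lower bound for the role).

```python
import re
from collections import defaultdict

# Matches report lines such as "users : Remove revoked users ------- 4.63s".
# The pattern is an assumption based on the lines quoted in this thread.
LINE_RE = re.compile(r"^(?P<name>.+?)\s+-{2,}\s+(?P<secs>\d+(?:\.\d+)?)s$")


def seconds_by_role(report: str) -> dict:
    """Sum the per-task seconds for each role found in a timing report."""
    totals = defaultdict(float)
    for line in report.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip timestamps, separators, and blank lines
        name = match.group("name")
        # Lines without a "role : task" prefix (e.g. "Gathering Facts")
        # are grouped under a placeholder key.
        role = name.split(" : ", 1)[0] if " : " in name else "(no role)"
        totals[role] += float(match.group("secs"))
    return dict(totals)


# The four "users" lines from the baseline run quoted above.
BASELINE_EXCERPT = """\
users : Create all admin users with sudo access. ----------------------- 59.39s
users : Update authorized_keys using the keys repo --------------------- 46.26s
users : Remove revoked users -------------------------------------------- 4.63s
users : Update authorized_keys for each user with literal keys ---------- 2.69s
"""

totals = seconds_by_role(BASELINE_EXCERPT)
print(round(totals["users"], 2))  # 112.97
```

On these numbers, the listed `users` tasks alone account for roughly 113 seconds of the 7m42s baseline.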

@zmc
Member Author

zmc commented Mar 23, 2023

So, if this behavior is consistent (and safe) it could save ~5m30s per job.
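
The savings can be checked from the total elapsed times in the three reports above. This is a small sketch under stated assumptions: the helper names `to_seconds` and `fmt` and the run labels are made up for illustration, and only the elapsed totals are taken from the quoted output.

```python
def to_seconds(stamp: str) -> float:
    """Convert an elapsed time such as "0:07:42.674" to seconds."""
    hours, minutes, seconds = stamp.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def fmt(seconds: float) -> str:
    """Format a duration as e.g. "5m29s", rounded to the nearest second."""
    minutes, rest = divmod(round(seconds), 60)
    return f"{minutes}m{rest:02d}s"


# Total elapsed times copied from the three timing reports in this thread.
ELAPSED = {
    "baseline (nothing skipped)": "0:07:42.674",
    "first skip-tags run": "0:04:48.939",
    "second skip-tags run": "0:02:13.301",
}

baseline = to_seconds(ELAPSED["baseline (nothing skipped)"])
savings = {
    label: baseline - to_seconds(stamp)
    for label, stamp in ELAPSED.items()
    if not label.startswith("baseline")
}

for label, saved in savings.items():
    print(f"{label}: saves {fmt(saved)}")
# first skip-tags run: saves 2m54s
# second skip-tags run: saves 5m29s
```

These are single runs, so the figures are indicative rather than averages.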

@kshtsk
Contributor

kshtsk commented Jul 2, 2024

so, what is the decision?

@kshtsk
Contributor

kshtsk commented Aug 8, 2024

Took another look; this looks like a good speed improvement. I'm just wondering whether it makes sense to run the user,pubkeys tags on request, e.g. through a teuthology-suite option that overrides this default... or whether, instead of doing this via ansible, it makes more sense to implement it as a provisioning feature.
From my experience running teuthology tests downstream, or "hypothetically" if someone wanted to run tests on a customer's site, they certainly don't want to give everyone in ceph/keys access to their nodes.
