clock detected is not right for hybrid CPU. #93

Open
edisonchan opened this issue Aug 15, 2022 · 6 comments

edisonchan commented Aug 15, 2022

OS: Ubuntu 22.04.1, kernel 5.19.1
CPU: Intel 12900KF

Situation 1
BIOS: all E-cores disabled, CPU frequency set to 4 GHz; idle=poll added to the GRUB kernel command line (to disable C-states).
cpupower shows all cores (P-cores) running at 4 GHz.
Then:
export UARCH_BENCH_CLOCK_MHZ=4000
sudo ./uarch-bench.sh

Using timer: clock
Welcome to uarch-bench (e4f54d5)
Supported CPU features: SSE3 PCLMULQDQ VMX EST TM2 SSSE3 FMA CX16 SSE4_1 SSE4_2 MOVBE POPCNT AES AVX RDRND TSC_ADJ BMI1 AVX2 BMI2 ERMS RDSEED ADX CLFLUSHOPT CLWB INTEL_PT SHA
Pinned to CPU 0
Source pages allocated with transparent hugepages: 100.0
Median CPU speed: 3.201 GHz
Running benchmarks groups using timer clock

Median CPU speed: 3.201 GHz
UARCH_BENCH_CLOCK_MHZ does not work?

Situation 2
BIOS: E-cores enabled, CPU frequency at default; idle=poll added to the GRUB kernel command line (to disable C-states).
cpupower shows the P-cores running at 5.2 GHz and the E-cores at 3.7 GHz.

UARCH_BENCH_CLOCK_MHZ is not set.

sudo ./uarch-bench.sh

Succesfully disabled turbo boost using intel_pstate/no_turbo
Using timer: clock
Welcome to uarch-bench (e4f54d5)
Supported CPU features: SSE3 PCLMULQDQ VMX EST TM2 SSSE3 FMA CX16 SSE4_1 SSE4_2 MOVBE POPCNT AES AVX RDRND TSC_ADJ BMI1 AVX2 BMI2 ERMS RDSEED ADX CLFLUSHOPT CLWB INTEL_PT SHA
Pinned to CPU 0
Source pages allocated with transparent hugepages: 100.0
Median CPU speed: 3.101 GHz
Running benchmarks groups using timer clock

Median CPU speed: 3.101 GHz

The problem here is that the median CPU speed was detected as 3.101 GHz, which is not right for either the P-cores or the E-cores.

edisonchan changed the title from "UARCH_BENCH_CLOCK_MHZ does not work." to "clock detected is not right for hybrid CPU." on Aug 15, 2022
travisdowns (Owner) commented

Hi Edison,

Thanks for this report.

export UARCH_BENCH_CLOCK_MHZ=4000
sudo ./uarch-bench.sh

Median CPU speed: 3.201 GHz
UARCH_BENCH_CLOCK_MHZ does not work?

This is because sudo, by default, does not pass through environment variables from the calling environment. So you export that variable, but it won't be seen by the process(es) running inside the sudo call.

You could do it like this:

 sudo UARCH_BENCH_CLOCK_MHZ=4000 ./uarch-bench.sh

This explicitly sets the variable for the sudo'd process. Alternatively, use sudo -E to pass through all variables from the parent process, or sudo --preserve-env=UARCH_BENCH_CLOCK_MHZ ./uarch-bench.sh to pass through only the specific variable you care about.
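
As a rough illustration of the behavior described above, the sketch below uses env -i as a stand-in for sudo's default environment reset. That substitution is an assumption made so the example runs without root; what sudo actually keeps depends on the sudoers policy. child_sees is a hypothetical helper, not part of uarch-bench.

```shell
#!/bin/sh
# Hypothetical helper: run a child shell behind the given launcher prefix
# and print what the child sees for UARCH_BENCH_CLOCK_MHZ.
child_sees() {
    "$@" /bin/sh -c 'echo "${UARCH_BENCH_CLOCK_MHZ:-unset}"'
}

export UARCH_BENCH_CLOCK_MHZ=4000

# Plain child process: the exported variable is inherited.
inherited=$(child_sees env)

# Scrubbed environment (env -i stands in for sudo's default reset):
# the exported variable does not reach the child.
scrubbed=$(child_sees env -i)

# Variable set explicitly on the launcher's command line,
# analogous to `sudo UARCH_BENCH_CLOCK_MHZ=4000 ./uarch-bench.sh`.
explicit=$(child_sees env -i UARCH_BENCH_CLOCK_MHZ=4000)

echo "inherited=$inherited scrubbed=$scrubbed explicit=$explicit"
```

To check against sudo itself, sudo printenv UARCH_BENCH_CLOCK_MHZ should print nothing under a typical default policy, while sudo UARCH_BENCH_CLOCK_MHZ=4000 printenv UARCH_BENCH_CLOCK_MHZ should print 4000.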

travisdowns (Owner) commented

Median CPU speed: 3.201 GHz

@edisonchan that is because ./uarch-bench.sh disables turbo frequencies by default for the duration of the run, and the 12900KF has a non-turbo frequency of 3.2 GHz. This gives a much more stable measurement in "cycles", which is usually what I'm interested in, since you don't see as many frequency transitions as you would when running at max turbo speed.

However, you can run the test with turbo enabled if you'd like: just run the binary ./uarch-bench directly, rather than the ./uarch-bench.sh wrapper script. The wrapper just calls into the binary after disabling turbo and setting the performance governor, and you can do the latter step by hand.
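
A minimal sketch of doing this by hand, assuming an intel_pstate system where the turbo switch lives at /sys/devices/system/cpu/intel_pstate/no_turbo (the log above mentions intel_pstate/no_turbo). turbo_state is a hypothetical helper; the root-only commands are left commented out.

```shell
#!/bin/sh
# Hypothetical helper: report the turbo state from an intel_pstate-style
# no_turbo file. A value of 1 means turbo is disabled, 0 means enabled.
turbo_state() {
    file=${1:-/sys/devices/system/cpu/intel_pstate/no_turbo}
    if [ ! -r "$file" ]; then
        echo "unknown"
        return 0
    fi
    case $(cat "$file") in
        1) echo "disabled" ;;
        0) echo "enabled" ;;
        *) echo "unknown" ;;
    esac
}

echo "turbo: $(turbo_state)"

# Set the performance governor by hand, then run the binary directly so the
# wrapper never touches the turbo setting (needs root, so commented out here):
# sudo cpupower frequency-set -g performance
# ./uarch-bench
```

On systems that do not use intel_pstate the file is absent and the helper reports "unknown".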

travisdowns (Owner) commented

You can confirm this by starting a ./uarch-bench.sh run and then checking cpupower or another frequency-reporting tool while the test is running. It should show 3.2 GHz.
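
If cpupower is not at hand, one possible alternative is to read the cpufreq sysfs files directly. The sketch below assumes the standard Linux scaling_cur_freq file, which reports kHz; khz_to_ghz and cur_freq_ghz are hypothetical helper names.

```shell
#!/bin/sh
# Convert a cpufreq reading in kHz (the unit sysfs reports) to GHz,
# printed with three decimals.
khz_to_ghz() {
    awk -v khz="$1" 'BEGIN { printf "%.3f\n", khz / 1000000 }'
}

# Print the current frequency of one CPU in GHz. Defaults to CPU 0,
# the CPU the benchmark reported being pinned to in the logs above.
cur_freq_ghz() {
    f="/sys/devices/system/cpu/cpu${1:-0}/cpufreq/scaling_cur_freq"
    if [ -r "$f" ]; then
        khz_to_ghz "$(cat "$f")"
    else
        echo "n/a"
    fi
}

echo "cpu0: $(cur_freq_ghz 0) GHz"
```

Running this in a second terminal while the benchmark is active (for example under watch -n 1) gives a rough view of the frequency the pinned core is actually using.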

edisonchan (Author) commented Sep 12, 2022

@travisdowns thanks for the replies.
I chose to set UARCH_BENCH_CLOCK_MHZ because I can get a very stable clock on Intel (idle=poll added to the kernel command line and 4 GHz set in the BIOS) and on AMD (turbo disabled in Linux and 4 GHz set in the BIOS).

I also have a related question. According to Intel, idle=poll means the CPU keeps running NOPs while idle; will that make the test results incorrect?

travisdowns (Owner) commented

I chose to set UARCH_BENCH_CLOCK_MHZ because I can get a very stable clock on Intel (idle=poll added to the kernel command line and 4 GHz set in the BIOS) and on AMD (turbo disabled in Linux and 4 GHz set in the BIOS).

Makes sense. In that case you can run ./uarch-bench directly, so the wrapper doesn't change the turbo setting. I should add a mode to the wrapper that keeps turbo on.

travisdowns (Owner) commented

I also have a related question. According to Intel, idle=poll means the CPU keeps running NOPs while idle; will that make the test results incorrect?

It would not directly affect "steady state" test results. It just means the CPU will be "hot polling" while it has nothing to do, e.g., before and after the test. The effects are generally: (a) the CPU always runs at a high frequency, rather than ramping down during idle and then back up under load, and (b) more heat is generated and the CPU runs hotter, for the same reason.

Effect (a) can mean that results are more stable since you don't have the frequency ramp-up period, but on the other hand uarch-bench already has lots of warmup in there to try to ensure the CPU is already running at its target frequency before the measurements start.

Effect (b) can cut the other way: idle=poll might produce slower and less stable results if the CPU throttles from heat because it starts the test already at a high temperature (e.g., the heat sink is already saturated). In the alternative case, the CPU starts cool and can take advantage of the thermal mass of the cooling system to run at higher than steady-state frequency for a while, which may be long enough to complete the test.
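
One way to see whether thermal throttling kicked in during a run is to compare throttle counters before and after. This is only a sketch: it assumes the Intel-specific core_throttle_count files under each CPU's thermal_throttle sysfs directory, which may not exist on every kernel or CPU, and throttle_total is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: sum core_throttle_count across all CPUs found
# under a sysfs-style root directory. Missing files count as zero.
throttle_total() {
    root=${1:-/sys/devices/system/cpu}
    total=0
    for f in "$root"/cpu[0-9]*/thermal_throttle/core_throttle_count; do
        [ -r "$f" ] || continue
        total=$((total + $(cat "$f")))
    done
    echo "$total"
}

before=$(throttle_total)
# ./uarch-bench    # run the benchmark here
after=$(throttle_total)
echo "new throttle events: $((after - before))"
```

A non-zero difference suggests the CPU throttled during the run, in which case the idle=poll results may be skewed by heat rather than by the code under test.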
