
Relax a test to address poorer than expected performance on Windows? #60

Closed
ablaom opened this issue Sep 12, 2023 · 6 comments

Comments

@ablaom
Member

ablaom commented Sep 12, 2023

A Windows fail was detected for a PR making a change apparently unrelated to the failing test. Here is the log:

https://github.com/JuliaAI/CategoricalDistributions.jl/actions/runs/6160604583/job/16728423043?pr=59#step:6:232

Instead of a 10x speedup in a test for array-optimised arithmetic, we're only getting a 2x speedup. I guess this is because of some change in the virtual machine GitHub uses for Windows testing, or similar?

It may be that the failure is intermittent.

In any case, a little investigation is probably warranted before relaxing the test.

@ablaom ablaom mentioned this issue Sep 12, 2023
@skleinbo
Contributor

I can reproduce the test regression locally on macOS.

julia> versioninfo()
Julia Version 1.9.3
Commit bed2cd540a1 (2023-08-24 14:43 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: macOS (arm64-apple-darwin22.4.0)
  CPU: 8 × Apple M1
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-14.0.6 (ORCJIT, apple-m1)
  Threads: 4 on 4 virtual cores
Environment:
  JULIA_CONDAPKG_BACKEND = Current
  JULIA_EDITOR = code
  JULIA_NUM_THREADS = 4

Runs fine on dev (4be9db4), but fails with #59.

Curiously, I cannot reproduce on Windows 10.

@ablaom
Member Author

ablaom commented Sep 13, 2023

Thanks @skleinbo for the further investigation. This is pretty weird. I'm wondering if we can reproduce this error on any Julia version prior to 1.9.

@skleinbo
Contributor

skleinbo commented Sep 14, 2023

@ablaom Both versions, pre- and post #59, pass for me on Julia 1.8.5.

The poor performance isn't real, by the way; for some reason recompilation seems to be triggered.

This is going to be impossible to debug, I fear. I can make the test pass by putting literally any kind of additional print statement in that test set. For example, annotating

t_slow = @elapsed @eval 42*slow
t_fast = @elapsed @eval 42*fast

with @show, or even putting a completely unrelated @info "Test" somewhere, will make the test pass. It even passes when I swap those two lines; no, I am not kidding.
I haven't the slightest clue how any of this connects, or where to even start debugging this.

Is it a good idea to rely on a single measurement of wall time anyway? This seems prone to spurious failure.
Instead, these tests could use @belapsed from BenchmarkTools. Granted, it'll make the test suite run a couple of seconds longer and adds a further test dependency, but it seems solid.
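A minimal sketch of that suggestion, with hypothetical stand-ins for the slow and fast paths (the real test compares scalar arithmetic against the array-optimised method; the names `slow_op`/`fast_op` below are illustrative only). `@belapsed` runs the expression many times and returns the minimum elapsed time in seconds, which is far less noisy than a single `@elapsed` measurement:

```julia
using BenchmarkTools, Test

# Hypothetical stand-ins: a generator-based sum (no SIMD) vs. the
# array-optimised sum. The real test would substitute the package's
# slow and fast code paths here.
x = rand(10_000)
slow_op() = sum(x[i] for i in eachindex(x))
fast_op() = sum(x)

# Interpolate with $ so the benchmark measures the call itself,
# not global-variable lookup.
t_slow = @belapsed $slow_op()
t_fast = @belapsed $fast_op()

@test t_fast < t_slow
```

Because `@belapsed` reports a minimum over repeated samples, spurious one-off stalls (GC, recompilation, a busy CI runner) no longer dominate the measurement, at the cost of a slightly longer test-suite run.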

@ablaom
Member Author

ablaom commented Sep 14, 2023

@skleinbo Thanks for the further analysis. By the way, the problem is intermittent; see #62.

Instead these tests could use @belapsed from BenchmarkTools. Granted, it'll make the test suite run a couple seconds longer and adds a further test dependency, but seems solid.

This is an excellent suggestion.

@skleinbo
Contributor

Okay, great! Please see #63.

@ablaom
Member Author

ablaom commented Sep 19, 2023

Closed by #63.

@ablaom ablaom closed this as completed Sep 19, 2023