
evalv3 performance on cycles #3517

Open
b4nst opened this issue Oct 17, 2024 · 9 comments
Labels: NeedsInvestigation, Triage (Requires triage/attention)

b4nst commented Oct 17, 2024

What version of CUE are you using (cue version)?

$ cue version
cue version v0.11.0-alpha.3

go version go1.23.2
      -buildmode exe
       -compiler gc
       -trimpath true
     CGO_ENABLED 0
          GOARCH arm64
            GOOS darwin
         GOARM64 v8.0
cue.lang.version v0.11.0

Does this issue reproduce with the latest stable release?

Yes

What did you do?

When evaluating that example

package main

#TimeSpan: {
	start:    float
	duration: float
	end:      float

	start:    end - duration
	duration: end - start
	end:      start + duration

	start: <=end
}

#Task: {
	action:   string
	timeSpan: #TimeSpan
}

#TaskSet: {
	action: string
	#tasks: [string]: #Task
	tasks: [for _, v in #tasks {v}]
	tasks: [...#Task]
	timeSpan?: #TimeSpan
}

#T1: #Task & {
	action: "Task 1"
	timeSpan: #TimeSpan & {
		duration: 4.2
	}
}

#T2: #Task & {
	action: "Task 2"
	timeSpan: #TimeSpan & {
		duration: 120.0
	}
}

ts: #TaskSet
ts: this={
	action: "Task 1 and Task 2"
	#tasks: {
		t1: #T1 & {
			timeSpan: end: this.#tasks.t2.timeSpan.start
		}
		t2: #T2 & {
			timeSpan: end: this.timeSpan.end
		}
	}
	timeSpan: start: this.#tasks.t1.timeSpan.start
}
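
For context, the three arithmetic constraints in #TimeSpan are mutually cyclic, so the struct can only resolve once enough fields are pinned to concrete values. A minimal sketch with hypothetical values (not part of the original report):

```cue
// Hypothetical example: pinning duration and end makes the cycle resolvable,
// since start can then be computed as end - duration = 95.8.
span: #TimeSpan & {
	duration: 4.2
	end:      100.0
}
```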

performance without evalv3:

0.29 real         0.31 user         0.03 sys
            51806208  maximum resident set size
                   0  average shared memory size
                   0  average unshared data size
                   0  average unshared stack size
                2797  page reclaims
                 645  page faults
                   0  swaps
                   0  block input operations
                   0  block output operations
                   0  messages sent
                   0  messages received
                 626  signals received
                 561  voluntary context switches
                2136  involuntary context switches
          4477669464  instructions retired
          1141032608  cycles elapsed
            35964416  peak memory footprint

with evalv3:

26.65 real        28.49 user        19.72 sys
          5464260608  maximum resident set size
                   0  average shared memory size
                   0  average unshared data size
                   0  average unshared stack size
             4601153  page reclaims
                  11  page faults
                   0  swaps
                   0  block input operations
                   0  block output operations
                   0  messages sent
                   0  messages received
               49995  signals received
               23226  voluntary context switches
             1507128  involuntary context switches
        533667430315  instructions retired
        149088659879  cycles elapsed
         31108296192  peak memory footprint

What did you expect to see?

Similar (or better) performance (both memory and time) with the new evalv3 algorithm.

What did you see instead?

Significantly worse performance than with the previous algorithm.

Notes

This might be related to #3208.
Also, #TaskSet.tasks looks like the troublemaker here: when that part of the definition is removed, performance is back to nominal.
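
For reference, the variant without the comprehension would look roughly like this (a sketch of the reduced definition, not verified here):

```cue
#TaskSet: {
	action: string
	#tasks: [string]: #Task
	// `tasks: [for _, v in #tasks {v}]` and `tasks: [...#Task]` removed
	timeSpan?: #TimeSpan
}
```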

@b4nst b4nst added NeedsInvestigation Triage Requires triage/attention labels Oct 17, 2024
@mvdan mvdan self-assigned this Oct 17, 2024
mvdan (Member) commented Oct 17, 2024

Assigning to myself to attempt to reduce this a bit further.

As of 860906a, evalv2 seems to give an error similar to #3511, although that was an evalv3 regression, and not an error that evalv2 would give:

> env CUE_EXPERIMENT=evalv3=0
> exec cue export
[stderr]
cycle error
[exit status 1]

The new evaluator seems to want to consume all of my available RAM; it was using about 30GiB when I killed it.

mvdan (Member) commented Oct 17, 2024

@b4nst what exactly are you running? With cue export with the old evaluator as per the testscript below, I get the same cycle error failure on master, v0.11.0-alpha.3, v0.10.0, and v0.9.2:

env CUE_EXPERIMENT=evalv3=0
exec cue export

-- main.cue --
package p

#TimeSpan: {
	start:    float
	duration: float
	end:      float

	start:    end - duration
	duration: end - start
	end:      start + duration

	start: <=end
}

#Task: {
	action:   string
	timeSpan: #TimeSpan
}

#TaskSet: {
	action: string
	#tasks: [string]: #Task
	tasks: [for _, v in #tasks {v}]
	tasks: [...#Task]
	timeSpan?: #TimeSpan
}

#T1: #Task & {
	action: "Task 1"
	timeSpan: #TimeSpan & {
		duration: 4.2
	}
}

#T2: #Task & {
	action: "Task 2"
	timeSpan: #TimeSpan & {
		duration: 120.0
	}
}

ts: #TaskSet
ts: this={
	action: "Task 1 and Task 2"
	#tasks: {
		t1: #T1 & {
			timeSpan: end: this.#tasks.t2.timeSpan.start
		}
		t2: #T2 & {
			timeSpan: end: this.timeSpan.end
		}
	}
	timeSpan: start: this.#tasks.t1.timeSpan.start
}

You can run this with https://pkg.go.dev/github.com/rogpeppe/go-internal/cmd/testscript.

b4nst (Author) commented Oct 17, 2024

Oh sorry, I thought I had included the command. I'm running cue eval -e ts main.cue, just to get the output without trying to break the cycle (which is indeed unbreakable until you set ts.timeSpan.end).
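
For clarity, the two invocations being compared are roughly the following (assuming the main.cue from the report is in the current directory):

```shell
# old evaluator: fails fast with "cycle error"
CUE_EXPERIMENT=evalv3=0 cue eval -e ts main.cue

# new evaluator: memory use grows until the process is killed
CUE_EXPERIMENT=evalv3=1 cue eval -e ts main.cue
```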

When doing an export like you did, without CUE_EXPERIMENT set to evalv3 I do get the cycle error. With CUE_EXPERIMENT set to evalv3, it starts eating all my memory.

mvdan (Member) commented Oct 17, 2024

OK, thank you. Is there a particular reason you would usually use cue eval, or is it only for the sake of this bug report? I would typically want to reproduce performance issues with cue export instead, because that's the command users actually care about.

Is there a cue export command I can use? Or is the bug still present with that command?

b4nst (Author) commented Oct 17, 2024

I'll try to provide a reproducible example without leaking too many details, but basically, yes, we do use export eventually. The configuration is humongous, though; this was a short example that helped reproduce the error we're currently facing.

To give a little bit of context: in our use case, some of the data is user-provided (as a tag, or an input file). Some of that input ends up in a TimeSpan's start, end, or duration field. We've been hit by multiple issues, some user-related (corrupted input), some caused by configuration issues (multiple preference markers on the same field, value conflicts, etc.), and some by CUE itself (float conflicts on the exact same value, because of different precision in different iterations of the same cycle). But the root cause is always a conflict that leads to a cycle error on a TimeSpan (the conflict makes the cycle unbreakable).
The thing is, those errors are getting harder and harder for us to identify. Often CUE goes OOM, effectively hiding the actual issue. Sometimes it also "loses" (probably the wrong technical term) the context, making it hard to track the cycle error back to its origin.
For that reason, we started evaluating the v3 algorithm. Long story short, we experienced insane performance gains using it, which led to overall simpler identification of input and configuration issues.
However, those tests were done on a reduced and simplified code base. We cannot migrate just yet, and it seems to me the remaining issue comes from that tasks definition (the for loop). It seems to make CUE go OOM with evalv3, and not only when there's a cycle error (I experienced it on code bases where evalv2 was able to export properly).

mvdan (Member) commented Oct 17, 2024

Understood, thank you.

It seems to make CUE go OOM with evalv3, and not only when there's a cycle error (I experienced it on code bases where evalv2 was able to export properly).

If you are able to share this with us, either as a reduced/simplified example you can share publicly, or privately with me or via https://cuelabs.dev/unity/, that would be really helpful.

In the meantime, I'll try to reduce the cue eval example you already shared above, but ideally we would test your entire private project via Unity, which lets us ensure that your whole config works on the new evaluator and test it for regressions with upcoming evaluator changes.

b4nst (Author) commented Oct 17, 2024

Thanks a lot! I'll work tomorrow on a reproducible example we can put on the Unity project. It'll help me cut down the refactor anyway.

b4nst (Author) commented Oct 18, 2024

OK, I think I can validate the behaviour with that example and export. Basically, by adding ts: timeSpan: end: 100.0 to break the cycle, both evalv2 and evalv3 show similar performance:

env CUE_EXPERIMENT=evalv3
exec time -l cue export -e ts

-- main.cue --
package p

#TimeSpan: {
	start:    float
	duration: float
	end:      float

	start:    end - duration
	duration: end - start
	end:      start + duration

	start: <=end
}

#Task: {
	action:   string
	timeSpan: #TimeSpan
}

#TaskSet: {
	action: string
	#tasks: [string]: #Task
	tasks: [for _, v in #tasks {v}]
	tasks: [...#Task]
	timeSpan?: #TimeSpan
}

#T1: #Task & {
	action: "Task 1"
	timeSpan: #TimeSpan & {
		duration: 4.2
	}
}

#T2: #Task & {
	action: "Task 2"
	timeSpan: #TimeSpan & {
		duration: 120.0
	}
}

ts: #TaskSet
ts: this={
	action: "Task 1 and Task 2"
	#tasks: {
		t1: #T1 & {
			timeSpan: end: this.#tasks.t2.timeSpan.start
		}
		t2: #T2 & {
			timeSpan: end: this.timeSpan.end
		}
	}
	timeSpan: start: this.#tasks.t1.timeSpan.start
}

ts: timeSpan: end: 100.0
0.02 real         0.02 user         0.00 sys
            21364736  maximum resident set size
                   0  average shared memory size
                   0  average unshared data size
                   0  average unshared stack size
                1569  page reclaims
                   3  page faults
                   0  swaps
                   0  block input operations
                   0  block output operations
                   0  messages sent
                   0  messages received
                  33  signals received
                  39  voluntary context switches
                 222  involuntary context switches
           153678554  instructions retired
            62430448  cycles elapsed
            11503104  peak memory footprint

Our issue arises when there is a cycle error. evalv2 can detect it, although it shows significantly worse performance than without the error:

ts: cycle error
       0.50 real         0.62 user         0.02 sys
           52264960  maximum resident set size
                  0  average shared memory size
                  0  average unshared data size
                  0  average unshared stack size
               3459  page reclaims
                  2  page faults
                  0  swaps
                  0  block input operations
                  0  block output operations
                  0  messages sent
                  0  messages received
                707  signals received
                544  voluntary context switches
               2150  involuntary context switches
         9502462162  instructions retired
         2105547898  cycles elapsed
           39781952  peak memory footprint

With evalv3 on the other hand, I can't get it to raise the cycle error. The process starts eating all my memory.

So our problem can be summarized as:

  • evalv2:
    • Nominal execution can take a lot of resources (time + memory), especially when there are a lot of nested cycles in our configuration. This example being super simple, it's not really representative of that performance gap.
    • Cycle errors, when they happen, are hard to track. With no stack trace pointing to the actual definition raising the cycle error, it is often hard for us to identify the configuration or user input that caused it. Add preference markers and float operations in user inputs on top of that, and it quickly becomes an actual nightmare.
  • evalv3:
    • We experienced massive performance gains in nominal situations with nested cycles, on dedicated examples (I can give some, but I bet you already have tons of them). However, when applied to our code base (which currently resolves successfully using 0.4.3), it goes OOM. I do suspect a cycle that 0.4.3 was somehow able to break, but it's hard to troubleshoot.
    • When a cycle error happens, it seems it cannot manage to get to the result (= cycle error). This simple example is enough to demonstrate that. This makes the upgrade hard to carry out, as we're not able to easily identify what's currently causing the problem in our code base.

On another note, our legal team is currently looking into the Unity project policies, so hopefully we'll soon be allowed to share more concrete examples of those situations.

Thanks again for all your help, it is really appreciated!

b4nst (Author) commented Oct 28, 2024

Hey, just for information: I can reproduce the same error with v0.11.0-alpha.4 and evalv3.
