
evalv3 performance on cycles #3517

Open
b4nst opened this issue Oct 17, 2024 · 9 comments
Labels: NeedsInvestigation, Triage (Requires triage/attention)

b4nst commented Oct 17, 2024

What version of CUE are you using (cue version)?

$ cue version
cue version v0.11.0-alpha.3

go version go1.23.2
      -buildmode exe
       -compiler gc
       -trimpath true
     CGO_ENABLED 0
          GOARCH arm64
            GOOS darwin
         GOARM64 v8.0
cue.lang.version v0.11.0

Does this issue reproduce with the latest stable release?

Yes

What did you do?

When evaluating that example

package main

#TimeSpan: {
	start:    float
	duration: float
	end:      float

	start:    end - duration
	duration: end - start
	end:      start + duration

	start: <=end
}

#Task: {
	action:   string
	timeSpan: #TimeSpan
}

#TaskSet: {
	action: string
	#tasks: [string]: #Task
	tasks: [for _, v in #tasks {v}]
	tasks: [...#Task]
	timeSpan?: #TimeSpan
}

#T1: #Task & {
	action: "Task 1"
	timeSpan: #TimeSpan & {
		duration: 4.2
	}
}

#T2: #Task & {
	action: "Task 2"
	timeSpan: #TimeSpan & {
		duration: 120.0
	}
}

ts: #TaskSet
ts: this={
	action: "Task 1 and Task 2"
	#tasks: {
		t1: #T1 & {
			timeSpan: end: this.#tasks.t2.timeSpan.start
		}
		t2: #T2 & {
			timeSpan: end: this.timeSpan.end
		}
	}
	timeSpan: start: this.#tasks.t1.timeSpan.start
}
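
For context, the three arithmetic constraints in #TimeSpan are mutually cyclic, so the struct can only resolve once enough fields are pinned to concrete values. A minimal sketch with hypothetical values (not part of the original report):

```cue
// Hypothetical example: pinning duration and end makes the cycle resolvable,
// since start can then be computed as end - duration = 95.8.
span: #TimeSpan & {
	duration: 4.2
	end:      100.0
}
```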

performance without evalv3:

0.29 real         0.31 user         0.03 sys
            51806208  maximum resident set size
                   0  average shared memory size
                   0  average unshared data size
                   0  average unshared stack size
                2797  page reclaims
                 645  page faults
                   0  swaps
                   0  block input operations
                   0  block output operations
                   0  messages sent
                   0  messages received
                 626  signals received
                 561  voluntary context switches
                2136  involuntary context switches
          4477669464  instructions retired
          1141032608  cycles elapsed
            35964416  peak memory footprint

with evalv3:

26.65 real        28.49 user        19.72 sys
          5464260608  maximum resident set size
                   0  average shared memory size
                   0  average unshared data size
                   0  average unshared stack size
             4601153  page reclaims
                  11  page faults
                   0  swaps
                   0  block input operations
                   0  block output operations
                   0  messages sent
                   0  messages received
               49995  signals received
               23226  voluntary context switches
             1507128  involuntary context switches
        533667430315  instructions retired
        149088659879  cycles elapsed
         31108296192  peak memory footprint

What did you expect to see?

Similar (or better) performance (both memory and time) with the new evalv3 algorithm.

What did you see instead?

Significantly worse performance than with the previous algorithm.

Notes

This might be related to #3208.
Also, #TaskSet.tasks looks like the troublemaker here: when that part of the definition is removed, performance is back to nominal.
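
For reference, the variant without the comprehension would look roughly like this (a sketch of the reduced definition, not verified here):

```cue
#TaskSet: {
	action: string
	#tasks: [string]: #Task
	// `tasks: [for _, v in #tasks {v}]` and `tasks: [...#Task]` removed
	timeSpan?: #TimeSpan
}
```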

@b4nst b4nst added NeedsInvestigation Triage Requires triage/attention labels Oct 17, 2024
@mvdan mvdan self-assigned this Oct 17, 2024
mvdan (Member) commented Oct 17, 2024

Assigning to myself to attempt to reduce this a bit further.

As of 860906a, evalv2 seems to give an error similar to #3511, although that was an evalv3 regression, and not an error that evalv2 would give:

> env CUE_EXPERIMENT=evalv3=0
> exec cue export
[stderr]
cycle error
[exit status 1]

The new evaluator seems to want to consume all of my available RAM; it was using about 30GiB when I killed it.

mvdan (Member) commented Oct 17, 2024

@b4nst what exactly are you running? With cue export with the old evaluator as per the testscript below, I get the same cycle error failure on master, v0.11.0-alpha.3, v0.10.0, and v0.9.2:

env CUE_EXPERIMENT=evalv3=0
exec cue export

-- main.cue --
package p

#TimeSpan: {
	start:    float
	duration: float
	end:      float

	start:    end - duration
	duration: end - start
	end:      start + duration

	start: <=end
}

#Task: {
	action:   string
	timeSpan: #TimeSpan
}

#TaskSet: {
	action: string
	#tasks: [string]: #Task
	tasks: [for _, v in #tasks {v}]
	tasks: [...#Task]
	timeSpan?: #TimeSpan
}

#T1: #Task & {
	action: "Task 1"
	timeSpan: #TimeSpan & {
		duration: 4.2
	}
}

#T2: #Task & {
	action: "Task 2"
	timeSpan: #TimeSpan & {
		duration: 120.0
	}
}

ts: #TaskSet
ts: this={
	action: "Task 1 and Task 2"
	#tasks: {
		t1: #T1 & {
			timeSpan: end: this.#tasks.t2.timeSpan.start
		}
		t2: #T2 & {
			timeSpan: end: this.timeSpan.end
		}
	}
	timeSpan: start: this.#tasks.t1.timeSpan.start
}

You can run this with https://pkg.go.dev/github.com/rogpeppe/go-internal/cmd/testscript.

b4nst (Author) commented Oct 17, 2024

Oh sorry, I thought I had included the command. I'm running cue eval -e ts main.cue, just to get the output without trying to break the cycle (which is indeed unbreakable until you set ts.timeSpan.end).
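
For clarity, the two invocations being compared are roughly the following (assuming the main.cue from the report is in the current directory):

```shell
# old evaluator: fails fast with "cycle error"
CUE_EXPERIMENT=evalv3=0 cue eval -e ts main.cue

# new evaluator: memory use grows until the process is killed
CUE_EXPERIMENT=evalv3=1 cue eval -e ts main.cue
```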

When doing an export like you did, without CUE_EXPERIMENT set to evalv3 I do get the cycle error. With CUE_EXPERIMENT set to evalv3, it starts eating all my memory.

mvdan (Member) commented Oct 17, 2024

OK, thank you. Is there a particular reason you would usually use cue eval, or is it only for the sake of this bug report? I would typically want to reproduce performance issues with cue export instead, because that's the command users actually care about.

Is there a cue export command I can use? Or is the bug still present with that command?

b4nst (Author) commented Oct 17, 2024

I'll try to provide a reproducible example without leaking too many details, but basically, yes, we do use export eventually. The configuration is humongous, though; this was a short example that helped reproduce the error we're currently facing.

To give a little bit of context: in our use case, some of the data is user-provided (as a tag, or an input file). Some of that input ends up in a TimeSpan's start, end, or duration field. We've been hit by multiple issues, some user-related (corrupted input), some caused by configuration issues (multiple preference markers on the same field, value conflicts, etc.), and some by CUE itself (float conflicts on the exact same value, because of different precision in different iterations of the same cycle). But the root cause is always a conflict that leads to a cycle error on a TimeSpan (the conflict makes the cycle unbreakable).
The thing is, those errors are getting harder and harder for us to identify. Often CUE goes OOM, effectively hiding the actual issue. Sometimes it also "loses" (probably the wrong technical term) the context, making it hard to track the cycle error back to its origin.
For that reason, we started evaluating the v3 algorithm. Long story short, we experienced insane performance gains using it, which led to overall simpler identification of input and configuration issues.
However, those tests were done on a reduced and simplified code base. We cannot migrate just yet, and it seems to me the remaining issue comes from that tasks definition (the for loop). It seems to make CUE go OOM with evalv3, and not only when there's a cycle error (I experienced it on code bases where evalv2 was able to export properly).

mvdan (Member) commented Oct 17, 2024

Understood, thank you.

It seems to make CUE go OOM with evalv3, and not only when there's a cycle error (I experienced it on code bases where evalv2 was able to export properly).

If you are able to share this with us, either as a reduced/simplified example you can share publicly, or privately with me or via https://cuelabs.dev/unity/, that would be really helpful.

In the meantime, I'll try to reduce the cue eval example you already shared above, but ideally we would test your entire private project via Unity, which lets us ensure that your whole config works on the new evaluator and test it for regressions with upcoming evaluator changes.

b4nst (Author) commented Oct 17, 2024

Thanks a lot! I'll work tomorrow on a reproducible example we can put on the Unity project. It'll help me cut down the refactor anyway.

b4nst (Author) commented Oct 18, 2024

OK, I think I can validate the behaviour with that example and export. Basically, by adding ts: timeSpan: end: 100.0 to break the cycle, both evalv2 and evalv3 show similar performance:

env CUE_EXPERIMENT=evalv3
exec time -l cue export -e ts

-- main.cue --
package p

#TimeSpan: {
	start:    float
	duration: float
	end:      float

	start:    end - duration
	duration: end - start
	end:      start + duration

	start: <=end
}

#Task: {
	action:   string
	timeSpan: #TimeSpan
}

#TaskSet: {
	action: string
	#tasks: [string]: #Task
	tasks: [for _, v in #tasks {v}]
	tasks: [...#Task]
	timeSpan?: #TimeSpan
}

#T1: #Task & {
	action: "Task 1"
	timeSpan: #TimeSpan & {
		duration: 4.2
	}
}

#T2: #Task & {
	action: "Task 2"
	timeSpan: #TimeSpan & {
		duration: 120.0
	}
}

ts: #TaskSet
ts: this={
	action: "Task 1 and Task 2"
	#tasks: {
		t1: #T1 & {
			timeSpan: end: this.#tasks.t2.timeSpan.start
		}
		t2: #T2 & {
			timeSpan: end: this.timeSpan.end
		}
	}
	timeSpan: start: this.#tasks.t1.timeSpan.start
}

ts: timeSpan: end: 100.0
0.02 real         0.02 user         0.00 sys
            21364736  maximum resident set size
                   0  average shared memory size
                   0  average unshared data size
                   0  average unshared stack size
                1569  page reclaims
                   3  page faults
                   0  swaps
                   0  block input operations
                   0  block output operations
                   0  messages sent
                   0  messages received
                  33  signals received
                  39  voluntary context switches
                 222  involuntary context switches
           153678554  instructions retired
            62430448  cycles elapsed
            11503104  peak memory footprint

Our issue arises when there is a cycle error. evalv2 can detect it, although it shows significantly worse performance than without the error:

ts: cycle error
       0.50 real         0.62 user         0.02 sys
           52264960  maximum resident set size
                  0  average shared memory size
                  0  average unshared data size
                  0  average unshared stack size
               3459  page reclaims
                  2  page faults
                  0  swaps
                  0  block input operations
                  0  block output operations
                  0  messages sent
                  0  messages received
                707  signals received
                544  voluntary context switches
               2150  involuntary context switches
         9502462162  instructions retired
         2105547898  cycles elapsed
           39781952  peak memory footprint

With evalv3 on the other hand, I can't get it to raise the cycle error. The process starts eating all my memory.

So our problem can be summarized as:

  • evalv2:
    • Nominal execution can take a lot of resources (time + memory), especially when there are a lot of nested cycles in our configuration. This example being super simple, it's not really representative of that performance gap.
    • Cycle errors, when they happen, are hard to track. With no stack trace pointing to the actual definition raising the cycle error, it is often hard for us to identify the configuration or user input that caused it. Add preference markers and float operations in user inputs on top of that, and it quickly becomes an actual nightmare.
  • evalv3:
    • We experienced massive performance gains in nominal situations with nested cycles, on dedicated examples (I can give some, but I bet you already have tons of them). However, when applied to our code base (which currently resolves successfully using 0.4.3), it goes OOM. I do suspect a cycle that 0.4.3 was somehow able to break, but it's hard to troubleshoot.
    • When a cycle error happens, it seems it cannot manage to get to the result (= cycle error). This simple example is enough to demonstrate that. This makes the upgrade hard to carry out, as we're not able to easily identify what's currently causing the problem in our code base.

On another note, our legal team is currently looking into the Unity project policies, so hopefully we'll soon be allowed to share more concrete examples of those situations.

Thanks again for all your help, it is really appreciated!

b4nst (Author) commented Oct 28, 2024

Hey, just for information: I can reproduce the same error with v0.11.0-alpha.4 and evalv3.
