Push conversion of cumulative timeseries into deltas into the database #6888
Well that sucks. From the reference on groupArray: values can be added to the array in any (indeterminate) order.
That is very frustrating, since it means we need to incur an additional sort on the arrays as we group them. For example, doing some prototyping on data from the dogfood rack, we have this:
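Purely as a sketch of what such a group-array query might look like (not the actual prototype), assuming a hypothetical table named `measurements_cumulativeu64` with columns `timeseries_key`, `start_time`, `timestamp`, and `datum`; the real oximeter schema may differ:

```sql
SELECT
    timeseries_key,
    groupArray(start_time) AS start_times,
    groupArray(timestamp) AS timestamps,
    groupArray(datum) AS data,
    -- Adjacent differences between the grouped datapoints.
    arrayDifference(data) AS deltas
FROM measurements_cumulativeu64
GROUP BY timeseries_key
```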
That group-array query is creating an array of all the start times, timestamps, and datapoints, along with the difference between adjacent array elements in those groups. We're then computing those differences over the arrays exactly as they were grouped, in whatever order ClickHouse happened to produce them. That means we'll need to do an explicit sort of the arrays, all of them, by timestamp. That will either mean an array zip + sort by timestamp + array unzip; or some other form of indirect sort. That's not awesome.
This monstrosity will I think do the trick, but it is pretty nauseating:
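A rough sketch of that shape (not the actual query), zipping the columns into tuples, sorting by timestamp, unzipping, and then taking differences, on the same hypothetical table and columns as above:

```sql
SELECT
    timeseries_key,
    -- Zip into tuples and sort by timestamp, since groupArray makes no
    -- ordering guarantee.
    arraySort(x -> x.1, groupArray((timestamp, start_time, datum))) AS sorted,
    arrayMap(x -> x.1, sorted) AS timestamps,
    arrayMap(x -> x.2, sorted) AS start_times,
    arrayMap(x -> x.3, sorted) AS data,
    arrayDifference(data) AS deltas
FROM measurements_cumulativeu64
GROUP BY timeseries_key
```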
Let's check if any of the timestamp deltas are negative:
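A sanity check along those lines might be an arrayExists over the adjacent timestamp differences, for example (assuming the timestamps are DateTime64 with nanosecond precision, so `toUnixTimestamp64Nano` applies):

```sql
SELECT count() AS out_of_order_series
FROM (
    SELECT
        timeseries_key,
        arraySort(x -> x.1, groupArray((toUnixTimestamp64Nano(timestamp), datum))) AS sorted,
        arrayDifference(arrayMap(x -> x.1, sorted)) AS timestamp_deltas
    FROM measurements_cumulativeu64
    GROUP BY timeseries_key
)
WHERE arrayExists(d -> d < 0, timestamp_deltas)
```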
Whew, at least it seems to work.
I think we should seriously consider just storing deltas in ClickHouse directly, moving this complexity to insertion time where we care much less about the latency (and are operating on much less data in each query). If we do the above query on the last 24 hours of data (from all 205 timeseries with data in that time range), here's the resource usage:
That efficiency is pretty incredible, but we could still avoid almost all of that if we stored deltas, since we could just select the data directly from the table as-is. Another reason to do that is that the above query is not actually correct. What I want to return is the adjacent difference between successive array elements, using the actual value for the first element. The `arrayDifference` function instead emits a zero for the first element of its result.
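For instance:

```sql
SELECT arrayDifference([3, 5, 9, 9]) AS deltas
-- [0, 2, 4, 0]: the first element is 0, not the original 3.
```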
This works, and is also terrible:
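One way the push/pop part might go is to prepend a zero so the first delta becomes the raw first value, and then drop the leading zero that `arrayDifference` produces; a sketch on the same hypothetical table (the actual query does more than this):

```sql
SELECT
    timeseries_key,
    arraySort(x -> x.1, groupArray((toUnixTimestamp64Nano(timestamp), datum))) AS sorted,
    arrayMap(x -> x.2, sorted) AS data,
    -- Prepend a zero so the first "delta" is the first raw value, then drop
    -- the leading zero that arrayDifference itself emits.
    arrayPopFront(arrayDifference(arrayPushFront(data, 0))) AS deltas
FROM measurements_cumulativeu64
GROUP BY timeseries_key
```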
The trickery around the pushing / popping on the arrays is for two things:
We get things like this:
It's also important to note that this does not handle histograms at all. I don't have any idea what that would look like yet, but to do this correctly, you'd need to compute the difference between corresponding array elements of each histogram. That sounds like another array map / zip / reduce of some kind.
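Just to make the shape of that concrete, an element-wise difference between two successive histogram bin-count arrays could be an arrayMap over both arrays at once, e.g.:

```sql
WITH
    [1, 4, 9, 9]  AS prev_bins,
    [2, 6, 12, 9] AS curr_bins
SELECT arrayMap((c, p) -> c - p, curr_bins, prev_bins) AS bin_deltas
-- [1, 2, 3, 0]
```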
One more thing to consider on this front -- we actually don't need the
I talked with @ahl about this a bit. There is one huge benefit of storing the raw cumulative values in the database, which is that we can always select any time range of data and have the full picture. If you don't store those, then we need to sum all the values before the start time of the query to correctly compute the values within the time range specified by the query. We could also entertain two other options: storing the deltas alongside the raw cumulative values, or computing the deltas in a view over the raw table rather than storing them.
Both of these have their own tradeoffs. The first obviously plays fast and loose with storage space. It's 100% redundant data, so while it might compress very well, we are definitely using customer space to store something we could otherwise get. In the second, we don't pay any storage costs since the view isn't materialized, but it might not be possible to actually do that. Specifically, the materialized view would need to collect the entire time history of a timeseries into an array.

As Adam pointed out, either of these basically makes clear that what we're trading off is query latency vs storage space. In the absence of some compelling data that the query is "too slow", we should probably not use more disk at this point. The OxQL queries that prompted this investigation do time out on the client side, to be clear, but that can probably be improved by lengthening timeouts. We should keep an eye on it, but it's still probably worth being more parsimonious with customer disk space at this point.
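For what it's worth, the second option could be as little as putting the earlier delta query behind a (non-materialized) view; a sketch, with the view and table names made up:

```sql
CREATE VIEW measurements_cumulativeu64_deltas AS
SELECT
    timeseries_key,
    arraySort(x -> x.1, groupArray((toUnixTimestamp64Nano(timestamp), datum))) AS sorted,
    arrayMap(x -> x.1, sorted) AS timestamps,
    arrayMap(x -> x.2, sorted) AS data,
    arrayPopFront(arrayDifference(arrayPushFront(data, 0))) AS deltas
FROM measurements_cumulativeu64
GROUP BY timeseries_key
```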
Given a query like `get <some_cumulative_timeseries> | ....`, we automatically convert those cumulative values into deltas. That's to make the data easier to operate on, and is required for grouping, alignment, or joining. We currently do that outside the ClickHouse database, after selecting out the minimal set of raw data that matches the query's filtering predicates. That has been fine thus far, but to help with #6480, we'll need to figure out how to do that inside the database. That's needed so that we can push other operations, like alignment, into the database for cumulative timeseries.

This one will be pretty tricky, I think. ClickHouse has some methods for computing adjacent differences, but they're explicitly only valid within a single block. There might be a way to force ClickHouse to select the data into exactly one block, but I'm not sure that's possible. Other options might be grouping all the data into an array, and then using `arrayDifference`. That makes us subject to the 1 million-element array size limit, though. That might be fine for today's data. For example, given a cumulative counter sampled at 1Hz, that would enable queries selecting up to about 11.5 days of data, which is pretty good.

There might be smarter ways to do it though, such as using a window function or another method entirely. This will need a bit of research, but I think will bear lots of fruit.
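As one sketch of the window-function idea (table and column names are assumptions, not a worked-out design), `lagInFrame` over each timeseries avoids collecting everything into a single array, and a default of zero makes the first row's delta the raw value itself:

```sql
SELECT
    timeseries_key,
    timestamp,
    datum - lagInFrame(datum, 1, 0) OVER (
        PARTITION BY timeseries_key
        ORDER BY timestamp
        ROWS BETWEEN 1 PRECEDING AND CURRENT ROW
    ) AS delta
FROM measurements_cumulativeu64
```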