state tag set under zpool_stats should be a field set instead #9
There is a generic problem here: we want to know the state as a field, but also query on the state. In InfluxDB we query on tags, not values. So we do need the state as a tag in order to make queries such as "count the number of unhealthy pools, vdevs, or devices" or "show the devices that are offline" in an efficient manner (because tags are indexed). But there are also cases where state makes sense as a field. I'm not opposed to adding it as a field. NB, this is especially painful for Prometheus because the only data type for a field (value) is double. For the zpool_prometheus implementation we put state as a tag (label) as well as a field (value). This is still quite painful because the data type must be a number, and the state and other enums can change over the lifetime of the project. So a state field query is only really feasible when the field is text. Thoughts?
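To illustrate why a tag filter is cheap, here is a toy sketch in Python of an inverted index over tag values. It is not how InfluxDB is implemented internally; the `build_tag_index` helper, the `name` tag key, and all pool, vdev, and state values are made up for illustration.

```python
from collections import defaultdict

# Hypothetical series, each identified by its tag set (all values are made up).
SERIES = [
    {"name": "tank", "vdev": "disk-0", "state": "ONLINE"},
    {"name": "tank", "vdev": "disk-1", "state": "OFFLINE"},
    {"name": "backup", "vdev": "disk-0", "state": "DEGRADED"},
    {"name": "backup", "vdev": "disk-1", "state": "ONLINE"},
]


def build_tag_index(series):
    """Map each (tag key, tag value) pair to the ids of the series carrying it."""
    index = defaultdict(set)
    for series_id, tags in enumerate(series):
        for key, value in tags.items():
            index[(key, value)].add(series_id)
    return index


INDEX = build_tag_index(SERIES)


def series_with_state(state):
    """Find series by state with a single index lookup, without scanning values."""
    return sorted(INDEX.get(("state", state), set()))


def count_unhealthy():
    """Count series whose state tag is anything other than ONLINE."""
    return sum(
        len(ids)
        for (key, value), ids in INDEX.items()
        if key == "state" and value != "ONLINE"
    )
```

With this toy data, `series_with_state("OFFLINE")` touches only the index entry for that state, which is the property the "show the devices that are offline" query relies on.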
Also, the default interactive query generator for Grafana is somewhat limited. When we want to see the state column, we hand edit the query. If you have a desired Grafana dashboard, can you share it?
Hello, so maybe I am missing something, but without STATE as a `_field`, how can I display "zpool state" in a Grafana card? That was going to be my primary use case for this, but using the InfluxDB Flux query generator I can't output the state since it isn't a field. Is there any way to do this with a Flux query? I'm looking for a card like "Backup zPool status: DEGRADED".
Would like to know this too. I am having the same problem.
For Grafana, there are two methods of creating an InfluxDB query: the query builder or editing by hand. In general, you can toggle back and forth between the two. Editing by hand will allow you to select a tag, but the Grafana query builder only allows you to select a field. If you'd like to see an example, I can put together something over the weekend.
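As a rough idea of what a hand-edited query could look like, the Python sketch below assembles an InfluxQL string that groups by the `state` tag. This is one plausible shape, not necessarily the query the maintainer had in mind. The `state_query` helper is hypothetical, the `name` tag key is an assumption, and `$timeFilter` is the Grafana macro for the dashboard time range.

```python
def state_query(measurement="zpool_stats", field="alloc",
                tags=("name", "vdev", "state")):
    """Build an InfluxQL query that returns one series per tag combination.

    InfluxQL needs at least one field in the SELECT clause, so the state tag
    is surfaced through GROUP BY instead of being selected on its own.
    """
    group_by = ", ".join(f'"{tag}"' for tag in tags)
    return (
        f'SELECT last("{field}") FROM "{measurement}" '
        f"WHERE $timeFilter GROUP BY {group_by}"
    )


print(state_query())
```

In a panel, the grouped tag can then be shown through an alias pattern such as `$tag_state`.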
Very much appreciated!
Following this topic, as it wasn't straightforward at first for me either. I would recommend considering making that a field.
Adds duplicate of tag "state" data into a field:
+ for `zpool_stats` the field is `vdev_state`
+ for `zpool_scan_stats` the field is `scan_state`

Though the data is the same, the tag and field names are different to avoid confusion and the issues described in: https://docs.influxdata.com/influxdb/v1.8/troubleshooting/frequently-asked-questions/#tag-and-field-key-with-the-same-name

Signed-off-by: Richard Elling <[email protected]>
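A minimal sketch of what a point with the duplicated state might look like in InfluxDB line protocol, written in Python. The `format_point` helper, the `name` tag key, the sample values, and the use of the `i` integer suffix are assumptions for illustration; the actual output of `zpool_influxdb` may differ in tag set, field set, and number formatting.

```python
def escape_tag(value):
    """Escape commas, equals signs, and spaces in a tag key or value."""
    return value.replace(",", "\\,").replace("=", "\\=").replace(" ", "\\ ")


def format_field(value):
    """Render a field value: strings are double-quoted, integers get an 'i' suffix."""
    if isinstance(value, str):
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'
    if isinstance(value, int) and not isinstance(value, bool):
        return f"{value}i"
    raise TypeError(f"unsupported field type: {type(value).__name__}")


def format_point(measurement, tags, fields, timestamp_ns):
    """Format one point; assumes the measurement name needs no escaping."""
    tag_part = ",".join(
        f"{escape_tag(k)}={escape_tag(v)}" for k, v in sorted(tags.items())
    )
    field_part = ",".join(f"{k}={format_field(v)}" for k, v in fields.items())
    return f"{measurement},{tag_part} {field_part} {timestamp_ns}"


# The state appears twice: as the indexed tag "state" and as the field "vdev_state".
line = format_point(
    "zpool_stats",
    {"name": "tank", "state": "ONLINE", "vdev": "root"},
    {"alloc": 1024, "vdev_state": "ONLINE"},
    1,
)
print(line)
```

Keeping the tag lets queries filter on state, while the differently named string field gives query builders something to select.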
Comments on #12 appreciated. If this satisfies, then I'll make a similar PR to OpenZFS.
ping
Hi @richardelling, first off let me just say this is some great work. I've been following the original feature request for this on the telegraf side since late 2017 and I'm very excited to see it's now shipping in OpenZFS 2.1.0.
For the past 13 years, I've been using a Bash script to screen scrape the output of `zpool status` and feed it into various data stores and monitoring solutions. This has been brittle, and given the sheer variety of useful metrics `zpool_influxdb` is reporting, I'd like to now start using `zpool_influxdb`.
I've taken a look at the InfluxDB line protocol emitted by `zpool_influxdb` and I'm curious about the design decision behind making `state` a tag set.
At the moment, I'm trying to find a way to extract status information about a given disk in a given vdev in a given pool (a use case which lends itself very well to templating in Grafana). Currently, `zpool_influxdb` embeds things like the `state`, `path`, and `vdev` of a disk into a tag set associated with a measurement named `zpool_stats`:
Including `path` and `vdev` in the `zpool_stats` measurement as a tag set makes sense: InfluxDB tags are typically expected to be invariant over the life of a measurement, and so it's reasonable to keep properties that normally don't change (i.e., a disk's by-id path and its vdev location) stored as tags. But the `state` of a disk can be quite dynamic (off the top of my head I know there's `OFFLINE`, `UNAVAILABLE`, `FAULTED`, `REMOVED`, `DEGRADED`, and probably a few others I'm forgetting), so I believe it'd make more sense for `state` to be a field set, just like what you've done with `alloc`, `free`, `size`, `read_bytes`, etc.
Shifting `state` from a tag to a field would not only be more in line with InfluxDB's world view, it would also help prevent InfluxDB instances consuming `zpool_influxdb`'s output from creating unnecessary measurement entries whenever a disk's, vdev's, or zpool's `state` changes. See here for InfluxDB's perspective on tag sets and series cardinality, but basically: tags containing variable information will lead to a large number of unique series in an InfluxDB database each time those tags change. This in turn results in high series cardinality, which is a primary driver of high memory usage across many different types of workloads, even low usage ones.
Lastly, shifting `state` to a field would also enable the new Stats panel in Grafana to display the state of each disk under an alias constructed from that disk's path or vdev location (which I don't believe is currently possible if `state` is stored in a tag, since I'm not aware of a way to coax Grafana to convert a single tag set into a field set and display it under another tag set, though I could be wrong).
To be explicit, it'd be great if `zpool_influxdb`'s output looked a little something like this:
(Note the elimination of the `state` tag set and the addition of the `state` field set at the end right before the timestamp.)
What do you think?