Queries in Pathivu can be used to aggregate log data and perform operations on it. Multiple queries can be piped together to perform a complex action.
Queries can be made through two media in Pathivu:
- Pathivu Web: This is the web user interface for Pathivu. It provides a simple UI for querying Pathivu. Learn more about it here.
- Katchi CLI: This is a command-line interface which allows seamless querying from the comfort of a terminal. Learn more about how to query using Katchi here.
Pathivu listens for query requests on port 5180 via an HTTP(S) server. This exposes a simple way of sending commands and queries to the Pathivu backend. It currently supports the queries described below.
If you would like to see more queries, feel free to create an issue on our source repository, and we will get back to you.
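As a concrete illustration of the HTTP interface, a raw query request to a local instance might look like the sketch below. The /query path and the body shape are assumptions made for this example, not a confirmed part of the API; consult the API reference or the Katchi source for the exact endpoint:
POST /query HTTP/1.1
Host: localhost:5180
Content-Type: application/json

{"query": "message = \"warn\""}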
Before getting started with the various query commands supported by Pathivu, let us enumerate the features available when calling a query (a sketch of how these might appear in a request follows the list):
- Ordering: Ascending and descending order of log output can be decided at the query level.
- Pagination: You can set a maximum number of logs to show in the output and start displaying logs from a particular offset.
- Timestamping: We can declare a starting and an ending timestamp to output only the logs within a given time frame.
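Building on the request sketch above, these options might map onto request-body fields such as the following. The field names count, offset, forward, start_ts, and end_ts are assumptions for illustration only, not a confirmed part of the API:
{
    "query": "level = \"info\"",
    "count": 20,
    "offset": 0,
    "forward": false,
    "start_ts": 1,
    "end_ts": 3
}
Here count and offset would drive pagination, forward would flip between ascending and descending output, and start_ts and end_ts would bound the time frame.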
Consider the following JSON:
{
    "data": [
        {
            "ts": 3,
            "level": "warn",
            "details": {
                "message": "APIKEY not provided"
            },
            "from": "app"
        },
        {
            "ts": 2,
            "level": "fatal",
            "details": {
                "message": "Error connecting to database",
                "error_code": "500"
            },
            "from": "app"
        }
    ]
}
Pathivu supports two types of search queries, namely fuzzy search and structured query search.
The message keyword is used for fuzzy searching in Pathivu. The fuzziness level is configurable. A simple example is given below:
message = "warn"
This query will give you the following output:
{
    "data": [
        {
            "ts": 3,
            "level": "warn",
            "details": {
                "message": "APIKEY not provided"
            },
            "from": "app"
        }
    ]
}
Flattened JSON fields can be used for structured query searches and exact matches. In the following example, we are querying the logs which have error code 500 within an embedded struct.
details.error_code = "500"
This query will give you the following output:
{
    "data": [
        {
            "ts": 2,
            "level": "fatal",
            "details": {
                "message": "Error connecting to database",
                "error_code": "500"
            },
            "from": "app"
        }
    ]
}
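To spell out the flattening convention used above: nested fields are addressed by joining the keys along their path with a dot. For example, a log entry containing
"details": {
    "error_code": "500"
}
is matched by the flattened key details.error_code. The same dotted-key convention is used by the count, avg, and distinct queries described below.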
The count query can be used to get the total number of logs pertaining to a particular key in the log structure. Counts can be of two types, namely base count and aggregated count.
Consider the following log JSON:
{
    "data": [
        {
            "ts": 3,
            "entry": {
                "details": {
                    "error_code": "500",
                    "message": "Error connecting to database"
                },
                "level": "fatal",
                "from": "backend"
            },
            "source": "demo"
        },
        {
            "ts": 2,
            "entry": {
                "details": {
                    "error_code": "500",
                    "message": "Error connecting to database"
                },
                "level": "fatal",
                "from": "app"
            },
            "source": "demo"
        },
        {
            "ts": 1,
            "entry": {
                "details": {
                    "message": "APIKEY not provided"
                },
                "level": "warn",
                "from": "app"
            },
            "source": "demo"
        }
    ]
}
Base count is a powerful command that can be used for counting the number of logs that exist for a particular field. For example, the query below gives the count of all logs with from defined.
count(from) as src
Running this command will give you the following output:
{
    "data": [
        {
            "src": "3"
        }
    ]
}
Aggregations can be added to the count query to group counts according to a particular field. This is achieved using the by keyword. For example, the following query will count all level fields and group them by from.
count(level) as level_count by from
The result will look like this:
{
    "data": [
        {
            "level_count": "2",
            "from": "app"
        },
        {
            "level_count": "1",
            "from": "backend"
        }
    ]
}
Structured JSON matching can also be used for counting. For example, the following command returns the count of all logs that have the error_code field inside the details sub-structure, grouped by from.
count(details.error_code) as error_code_count by from
The output looks like this:
{
    "data": [
        {
            "error_code_count": "1",
            "from": "backend"
        },
        {
            "error_code_count": "1",
            "from": "app"
        }
    ]
}
The avg keyword can be used to find the average of numerical fields in a structured logging scheme. It supports aggregations as well.
Let us consider the following log JSON:
{
    "data": [
        {
            "ts": 3,
            "entry": {
                "country": "Afghanistan",
                "details": {
                    "latency": 9.82
                },
                "level": "info"
            },
            "source": "demo"
        },
        {
            "ts": 2,
            "entry": {
                "country": "Pakistan",
                "details": {
                    "latency": 6.45
                },
                "level": "info"
            },
            "source": "demo"
        },
        {
            "ts": 1,
            "entry": {
                "country": "India",
                "details": {
                    "latency": 3.26
                },
                "level": "info"
            },
            "source": "demo"
        }
    ]
}
The following query will find the average latency of your service.
avg(details.latency) as average_latency
The output looks like this:
{
    "data": [
        {
            "average_latency": "6.51"
        }
    ]
}
Average also supports aggregations. For example, the following query will compute the country-wise average latency.
avg(details.latency) as average_latency by country
The output looks like this:
{
    "data": [
        {
            "average_latency": "3.26",
            "country": "India"
        },
        {
            "average_latency": "6.45",
            "country": "Pakistan"
        },
        {
            "average_latency": "9.82",
            "country": "Afghanistan"
        }
    ]
}
Distinct elements can be found, aggregated, and printed using the distinct keyword. The distinct command also provides a feature to count the number of distinct logs matched.
Reusing the log JSON from the count example, the following command will give you a list of all distinct levels in the logs.
distinct(level)
The output will look something like this:
{
    "data": [
        "fatal",
        "warn"
    ]
}
In order to find distinct value counts, you can use the distinct_count keyword. The following command will give you a list of all distinct levels in the logs along with their counts.
distinct_count(level)
The output will look something like this:
{
    "data": [
        {
            "fatal": 2
        },
        {
            "warn": 1
        }
    ]
}
Structured JSON matching can also be used here. For example, the following command will return a list of all distinct error codes along with their count.
distinct_count(details.error_code)
The output looks like this:
{
    "data": [
        {
            "500": 2
        }
    ]
}
The limit command can be used to cap the number of responses returned by a Pathivu query. For example, on the logs provided here, the following query limits the output to a single log:
limit 1
The output will look like this:
{
    "data": [
        {
            "ts": 3,
            "entry": {
                "country": "Afghanistan",
                "details": {
                    "latency": 9.82
                },
                "level": "info"
            },
            "source": "demo"
        }
    ]
}
By default, limits are applied from the latest timestamp in Pathivu.
Pathivu supports piping as well: you can combine two or more queries, one after the other. This gives Pathivu immense querying capabilities.
Below are a couple of examples of how piping can be used to build powerful and meaningful queries. Note that all of the queries are performed on the following JSON:
{
    "data": [
        {
            "ts": 3,
            "entry": {
                "country": "Afghanistan",
                "details": {
                    "latency": 9.82
                },
                "level": "info",
                "transaction": "succeeded"
            },
            "source": "demo"
        },
        {
            "ts": 2,
            "entry": {
                "country": "Pakistan",
                "details": {
                    "latency": 6.45
                },
                "level": "info",
                "transaction": "failed"
            },
            "source": "demo"
        },
        {
            "ts": 1,
            "entry": {
                "country": "India",
                "details": {
                    "latency": 3.26
                },
                "level": "info",
                "transaction": "succeeded"
            },
            "source": "demo"
        }
    ]
}
- The following query will give you the count of failed transaction logs for each country.
transaction="failed" | distinct_count(country) as failed_transaction_country_wise
So the output will look something like this:
{
    "data": [
        {
            "Pakistan": 1
        }
    ]
}
- The following command will give you the count of all info-level logs.
level="info" | count(level) as level_count
The output looks like this:
{
    "data": [
        {
            "level_count": "3"
        }
    ]
}
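Filters can be piped into the other aggregations documented above as well. As a sketch, assuming avg composes with a filter the same way count does, the following query should give the average latency of successful transactions:
transaction="succeeded" | avg(details.latency) as succeeded_latency
On the JSON above, the succeeded transactions have latencies 9.82 and 3.26, so the output should look something like this:
{
    "data": [
        {
            "succeeded_latency": "6.54"
        }
    ]
}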
You can specify the sources that you would like to search on using the source keyword. Multiple sources are separated by commas:
source=master,slave
This will output all logs from the master and slave sources.