Slack bot that listens for commands from Slack users to:
- Search the Elasticsearch cluster
- Query the health status of the Elasticsearch cluster
- Acknowledge an alert created by elastalert
- Triage alerts or arbitrary issues
Slack users can search the Elasticsearch cluster for arbitrary search criteria using the Lucene syntax. This can be useful for maintaining a history of searches, but it needs to be used with caution. Slack communities with public access should not enable this feature if the Elasticsearch cluster contains sensitive data.
Examples:
- Generic search across all indices and fields
!search hello
Response:
Found 672664 matching record(s), showing 1:
index: myindex-2018.05.19
@timestamp: 2018-05-19T14:12:52.141Z
message: hello
@version: 1
- Provide 3 most recent records that match this search
!search hello|3
- Specific index and field match:
!search _index:"myindex-*" message:hello
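The `query|count` argument format used in the examples above could be parsed along these lines. This is only a sketch, not Elastabot's actual parser: the `SearchCommand` type, the `parse_search` function, and the default of one record (inferred from the "showing 1" response) are assumptions made for illustration.

```python
from typing import NamedTuple, Optional


class SearchCommand(NamedTuple):
    query: str  # Lucene query text, passed through unchanged
    count: int  # number of most recent records to show


def parse_search(text: str, prefix: str = "!") -> Optional[SearchCommand]:
    """Parse '<prefix>search <query>|<count>'; return None for other messages."""
    command = prefix + "search"
    if text != command and not text.startswith(command + " "):
        return None
    argument = text[len(command):].strip()
    # Split on the last pipe so that pipes inside the query are left alone.
    query, sep, tail = argument.rpartition("|")
    if sep and tail.strip().isdigit():
        return SearchCommand(query.strip(), int(tail))
    # No numeric suffix: assume a single record, as in the first example.
    return SearchCommand(argument, 1)
```

Splitting on the last pipe keeps the Lucene query intact, so a query such as `_index:"myindex-*" message:hello` reaches Elasticsearch unchanged.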
Display the current health state of the Elasticsearch cluster, including number of active nodes, queue wait times, and more.
Example:
- Query the cluster health:
!health
Response:
Elasticsearch Cluster Health -> yellow
Name: docker-cluster
Nodes: 1
Data Nodes: 1
Active Shards: 817 (50%)
Initializing Shards: 0
Unassigned Shards: 795
Pending Tasks: 0
Inflight Tasks: 0
Max Queue Time MS: 0
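A response like the one above can be assembled from the JSON returned by the Elasticsearch `_cluster/health` API. The sketch below assumes the response has already been fetched and decoded into a dict. The `format_health` function is hypothetical, and the mapping of "Inflight Tasks" and "Max Queue Time MS" to particular response fields is a guess rather than something confirmed by Elastabot's source.

```python
def format_health(health: dict) -> str:
    """Render a _cluster/health response dict as a Slack-friendly text block."""
    # Truncate the percentage so that 50.68 is shown as 50%, as in the example.
    percent = int(health["active_shards_percent_as_number"])
    lines = [
        f"Elasticsearch Cluster Health -> {health['status']}",
        f"Name: {health['cluster_name']}",
        f"Nodes: {health['number_of_nodes']}",
        f"Data Nodes: {health['number_of_data_nodes']}",
        f"Active Shards: {health['active_shards']} ({percent}%)",
        f"Initializing Shards: {health['initializing_shards']}",
        f"Unassigned Shards: {health['unassigned_shards']}",
        f"Pending Tasks: {health['number_of_pending_tasks']}",
        # Assumed mapping: the bot's "Inflight Tasks" may use a different field.
        f"Inflight Tasks: {health['number_of_in_flight_fetch']}",
        f"Max Queue Time MS: {health['task_max_waiting_in_queue_millis']}",
    ]
    return "\n".join(lines)
```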
When told to ack an alert generated by Elastalert, Elastabot will look for the alert and silence it by creating a silence document in the appropriate Elasticsearch index. Additionally, if the ack command includes a question mark, ?, then the alert will be sent through the triage process. The question mark symbolizes that there are unanswered questions related to the alert and therefore the alert needs to be triaged.
Deadman Switch rules are typically used in an inverse pattern, where the rule is always alerting, to show that the alerting system is working. Therefore, any rule beginning with "Deadman" will automatically be excluded from the ack command.
NOTE: Alert names provided in the command argument are searched as-is. The only character replacement occurs on the space character, which is escaped before the query is sent to Elasticsearch. This means that acknowledging an alert for a rule name provided as My rule* will include the asterisk in the search without escaping it, so Elasticsearch will apply the wildcard match. It is important to consider this raw behavior of the query before exposing this bot to a public-facing Slack community. No additional input filtering/cleansing is currently implemented.
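To make the raw query behavior concrete, here is a minimal sketch of how such a query string might be assembled. The `build_alert_query` function, the `rule_name` field, and the way the Deadman exclusion is expressed are assumptions made for illustration; the real bot may build its query differently.

```python
def build_alert_query(rule_name: str = "") -> str:
    """Build a Lucene query string for finding a recently fired alert."""
    # Rules whose names begin with "Deadman" are always excluded from ack.
    clauses = ["NOT rule_name:Deadman*"]
    if rule_name:
        # Only spaces are escaped; characters such as * and ? pass through
        # untouched, so Elasticsearch treats them as wildcards.
        escaped = rule_name.replace(" ", "\\ ")
        clauses.insert(0, f"rule_name:{escaped}")
    return " AND ".join(clauses)
```

For the rule name `My rule*`, this produces `rule_name:My\ rule* AND NOT rule_name:Deadman*`, which shows the wildcard reaching Elasticsearch unescaped.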
Examples:
- Acknowledge the most recently triggered alert and start the triage process:
!ack ?
Response:
Acknowledged alert *IDS Offline* until 2018-05-18 16:59:13.595827 UTC
Triage process has started
- Acknowledge the most recently triggered alert for rule IDS Offline for the next 2 hours (no triage in this example):
!ack IDS Offline|120
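The ack arguments shown above (an optional rule name, an optional `|minutes` suffix, and an optional question mark) could be interpreted roughly as follows. This is a sketch under stated assumptions: `AckRequest`, `parse_ack`, and `silence_until` are hypothetical names, and the default of 240 minutes is taken from the example `elastalert.silenceMinutes` setting.

```python
from datetime import datetime, timedelta, timezone
from typing import NamedTuple


class AckRequest(NamedTuple):
    rule_name: str  # empty string means "most recently triggered alert"
    minutes: int    # how long to silence the alert
    triage: bool    # True when the argument contained a question mark


def parse_ack(argument: str, default_minutes: int = 240) -> AckRequest:
    """Interpret the text that follows the ack command."""
    triage = "?" in argument
    argument = argument.replace("?", "").strip()
    # A trailing "|<number>" overrides the default silence duration.
    name, sep, tail = argument.rpartition("|")
    if sep and tail.strip().isdigit():
        return AckRequest(name.strip(), int(tail), triage)
    return AckRequest(argument, default_minutes, triage)


def silence_until(now: datetime, minutes: int) -> datetime:
    """Return the moment at which the silence expires."""
    return now + timedelta(minutes=minutes)
```

With these helpers, `parse_ack("IDS Offline|120")` yields a 120-minute silence with no triage, while `parse_ack("?")` targets the most recent alert, uses the default duration, and requests triage.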
Elastabot understands the vague notion of a triage command. Currently, this is simply the generation of an SMTP email. This is useful for pushing issues into a ticketing system, such as Atlassian's JIRA, and avoids the complexities of integrating directly with those tools, such as handling downtime of the tool itself or the licensing costs of additional service users.
Triage is included with the !ack command, provided a question mark is added as an argument. However, to explicitly start the triage process for an arbitrary topic, without an associated alert, use the !triage command.
Examples:
- Start the triage process to investigate a drop in user logins:
!triage Unexpected drop in user logins
Response:
Triage process has started
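Since triage currently amounts to sending an email, the flow can be sketched with Python's standard `email` and `smtplib` modules. The function names, the message body, and the `requested_by` parameter are illustrative assumptions; only the `smtp` settings (`host`, `port`, `secure`, `starttls`, `timeoutSeconds`, `to`, `from`, `subjectPrefix`) come from the configuration file described in this document.

```python
import smtplib
from email.message import EmailMessage
from typing import Optional


def build_triage_email(smtp_config: dict, topic: str, requested_by: str) -> EmailMessage:
    """Compose the triage email from the 'smtp' section of the config."""
    message = EmailMessage()
    message["From"] = smtp_config["from"]
    message["To"] = smtp_config["to"]
    message["Subject"] = smtp_config.get("subjectPrefix", "") + topic
    # Hypothetical body text; the real bot may include more detail.
    message.set_content(f"Triage requested by {requested_by}.\n\nTopic: {topic}\n")
    return message


def send_triage_email(message: EmailMessage, smtp_config: dict,
                      username: Optional[str] = None,
                      password: Optional[str] = None) -> None:
    """Send the message, honoring the secure and starttls settings."""
    client_class = smtplib.SMTP_SSL if smtp_config.get("secure") else smtplib.SMTP
    timeout = smtp_config.get("timeoutSeconds", 4)
    with client_class(smtp_config["host"], smtp_config["port"], timeout=timeout) as client:
        # STARTTLS only makes sense when the connection is not already TLS.
        if smtp_config.get("starttls") and not smtp_config.get("secure"):
            client.starttls()
        if username and password:
            client.login(username, password)
        client.send_message(message)
```

Because the ticketing system only receives an email, an outage on its side does not block the bot; the SMTP server queues the message as usual.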
Users will interact with Elastabot in the Slack interface. A Slack community admin will need to register a bot for Elastabot and provide the bot token needed for Elastabot to connect to Slack. Once Elastabot connects to Slack, users can invite the bot into one or more channels, or send direct messages to interact with Elastabot.
To see a list of available commands, users can type:
!help
Or, to get detailed help on a specific command, users can type:
!ack help
Elastabot expects two groups of configuration inputs.
- JSON configuration file with non-sensitive values
- Environment variables with sensitive values
An example configuration file is shown below, followed by descriptions of each setting.
{
"elasticsearch": {
"host": "elasticsearch",
"port": 9200,
"sslEnabled": false,
"sslStrictEnabled": false,
"timeoutSeconds": 10,
"urlPrefix":""
},
"elastalert": {
"index": "elastalert_status",
"silenceMinutes": 240,
"recentMinutes": 4320
},
"smtp": {
"host": "email-smtp.us-east-1.amazonaws.com",
"port": 587,
"secure": false,
"starttls": true,
"timeoutSeconds": 4,
"to": "[email protected]",
"from": "[email protected]",
"subjectPrefix": "[mini] ",
"debug": false
},
"commandPrefix": "!",
"triageTarget": "smtp",
"searchEnabled": true
}
setting | description |
---|---|
elasticsearch.host | Hostname for the Elasticsearch server |
elasticsearch.port | Port for the Elasticsearch server |
elasticsearch.sslEnabled | If true, uses SSL/TLS to connect to Elasticsearch |
elasticsearch.sslStrictEnabled | If true, the SSL/TLS certificates will be validated against known certificate authorities |
elasticsearch.timeoutSeconds | Number of seconds to wait for an Elasticsearch response |
elasticsearch.urlPrefix | URL prefix for Elasticsearch, typically an empty string |
elastalert.index | The index prefix used by Elastalert within Elasticsearch, typically elastalert or elastalert_status |
elastalert.silenceMinutes | Number of minutes to silence an acknowledged alert if a silence duration is not explicitly given with the ack command. |
elastalert.recentMinutes | Number of minutes to look back in history for a fired alert in the Elasticsearch index |
smtp.host | Hostname for the SMTP server |
smtp.port | Port for the SMTP server |
smtp.secure | If true, will connect to the SMTP host over SSL/TLS |
smtp.starttls | If true, will send the starttls command (typically not used with smtp.secure=true) |
smtp.timeoutSeconds | Number of seconds to wait for the SMTP server to respond |
smtp.to | Email address that will receive the triage email |
smtp.from | Sender email address |
smtp.subjectPrefix | If non-empty string, will be prepended to each email subject |
smtp.debug | If true, the SMTP connectivity details will be logged to stdout |
commandPrefix | Special character or phrase to trigger the bot, typically an exclamation point, !. Ex: !ack |
triageTarget | How to initiate the triage process, currently only smtp is supported. |
searchEnabled | Allows generic, arbitrary searching of the Elasticsearch cluster. Should not be enabled if sensitive information is stored in the cluster. |
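As an illustration of how these settings fit together, the sketch below loads the JSON text, checks that the three nested sections exist, and derives the Elasticsearch base URL. The `load_config` and `elasticsearch_url` functions are hypothetical, and defaulting `searchEnabled` to false when it is absent is a cautious assumption rather than documented behavior.

```python
import json

REQUIRED_SECTIONS = ("elasticsearch", "elastalert", "smtp")


def load_config(text: str) -> dict:
    """Parse the configuration JSON and check the top-level sections."""
    config = json.loads(text)
    missing = [name for name in REQUIRED_SECTIONS if name not in config]
    if missing:
        raise ValueError("missing configuration sections: " + ", ".join(missing))
    # Assumed defaults; the real bot may require these keys explicitly.
    config.setdefault("commandPrefix", "!")
    config.setdefault("searchEnabled", False)
    return config


def elasticsearch_url(es: dict) -> str:
    """Build the base URL from the 'elasticsearch' section."""
    scheme = "https" if es.get("sslEnabled") else "http"
    base = f"{scheme}://{es['host']}:{es['port']}"
    prefix = es.get("urlPrefix", "").strip("/")
    return f"{base}/{prefix}" if prefix else base
```

With the example configuration above, `elasticsearch_url` returns `http://elasticsearch:9200`.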
The following environment variables are used as inputs for sensitive information.
variable | required | description |
---|---|---|
SLACK_BOT_TOKEN | true | The Slack-generated bot token, provided by slack.com |
ELASTICSEARCH_USERNAME | false | Optional Elasticsearch username, provided by your ES admin |
ELASTICSEARCH_PASSWORD | false | Optional Elasticsearch password, provided by your ES admin |
SMTP_USERNAME | false | Optional SMTP username, provided by your SMTP admin |
SMTP_PASSWORD | false | Optional SMTP password, provided by your SMTP admin |
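Reading these variables might look like the following sketch, which fails fast when the one required variable is absent. The `read_secrets` function and the shape of the dict it returns are assumptions made for illustration; only the variable names come from the table above.

```python
import os
from typing import Mapping, Optional, Tuple


def _pair(env: Mapping[str, str], user_key: str, pass_key: str) -> Optional[Tuple[str, str]]:
    """Return (username, password) only when both variables are set."""
    username, password = env.get(user_key), env.get(pass_key)
    if username and password:
        return (username, password)
    return None


def read_secrets(env: Mapping[str, str] = os.environ) -> dict:
    """Collect sensitive inputs from the environment."""
    token = env.get("SLACK_BOT_TOKEN")
    if not token:
        raise RuntimeError("SLACK_BOT_TOKEN is required")
    return {
        "slack_bot_token": token,
        "elasticsearch_auth": _pair(env, "ELASTICSEARCH_USERNAME", "ELASTICSEARCH_PASSWORD"),
        "smtp_auth": _pair(env, "SMTP_USERNAME", "SMTP_PASSWORD"),
    }
```

Accepting the environment as a parameter keeps the function easy to exercise with a plain dict instead of the real process environment.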
A Dockerfile is provided for Elastabot, and a Docker image will auto-build at hub.docker.com/jertel/elastabot.
The image will expect a configuration file to exist in the /opt/elastabot/elastabot.json location, so the recommended way to configure Elastabot is to use a file-based volume mount override from the host to this location.
Ex:
docker run --rm -v /host/path/elastabot.json:/opt/elastabot/elastabot.json jertel/elastabot
Elastabot was originally written for installation into a Kubernetes cluster via Helm. A chart is available in the official Kubernetes chart repository: https://github.com/kubernetes/charts/tree/master/stable/elastabot