Documentation for Outbound Integration for Prometheus #3

Open
Alexvianet opened this issue Aug 21, 2019 · 19 comments

Comments

@Alexvianet

I can't find any documentation on an Outbound Integration that acknowledges alerts and silences them for a period of time.

@xMTinkerer
Contributor

Hey @Alexvianet. Does Prometheus provide a silence API? They don't list one specifically here, but I did see this thread that mentions you might be able to do something like:

curl -H "Content-Type: application/json" -X POST -d '{
    "comment": "test1",
    "createdBy": "test1",
    "endsAt": "2019-02-20T18:00:59.46418637Z",
    "matchers": [
    {
        "isRegex": false,
        "name": "severity",
        "value": "critical"
    },
    {
        "isRegex": false,
        "name": "job",
        "value": "prometheus"
    },
    {
        "isRegex": false,
        "name": "instance",
        "value": "localhost:9090"
    },
    {
        "isRegex": false,
        "name": "alertname",
        "value": "InstanceDown"
    }]
}' http://alerts.example.org:9093/api/v1/silences

So you could turn that into a request in an outbound integration (or hopefully a custom step in flow designer?) and make the request. Back when we first wrote this, there wasn't even an API to talk to, so we didn't create any of the outbound integration pieces to do so.

Actually, this outbound step wouldn't be a bad idea for our new Step Library.
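
As a rough starting point for such a step, here is a minimal sketch of building the same JSON body in JavaScript. It only mirrors the fields from the curl example above; buildSilencePayload and its parameters are hypothetical names, and how the body is actually sent from an outbound integration is left out because that depends on the environment.

```javascript
// Build the JSON body for a silence request, mirroring the curl example.
// buildSilencePayload is a hypothetical helper name, not an existing API.
function buildSilencePayload(alertname, author, comment, hours, now) {
  var start = now || new Date();
  // endsAt is the start time plus the requested number of hours.
  var end = new Date(start.getTime() + hours * 60 * 60 * 1000);
  return {
    comment: comment,
    createdBy: author,
    endsAt: end.toISOString(),
    matchers: [
      { isRegex: false, name: 'alertname', value: alertname }
    ]
  };
}

// Example values only; a fixed start time keeps the output predictable.
var payload = buildSilencePayload('InstanceDown', 'test1', 'test1', 1,
  new Date('2019-02-20T17:00:59.464Z'));
console.log(JSON.stringify(payload, null, 2));
```

Passing the start time in as a parameter is just a convenience for trying the function out; in a real script it would default to the current time.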

@Alexvianet
Author

Alexvianet commented Aug 21, 2019 via email

@xMTinkerer
Contributor

Ah, I see. I think we'll probably take this one in the direction of Flow Designer, but we still need import/export support for the canvas, which should be coming in the next month or two.

Until then, it shouldn't be too hard to add this yourself. On the Flows tab, you can create a new flow and drag in the appropriate outbound trigger you want to work with. Some details here. The steps have support for running in an xMatters Agent, useful if the systems you want to call are behind a firewall.

Is that what you are looking for?

@Alexvianet
Author

Alexvianet commented Aug 22, 2019 via email

@xMTinkerer
Contributor

Cool. Happy to help. I'll keep this open so we can track the request when we revisit the integration.

@Alexvianet
Author

@xMTinkerer any updates on import/export support for the Outbound Integration?
I can't handle it myself =( I need help =)

@xMTinkerer
Contributor

Hey @Alexvianet! Yep, we now have import-export!
They still don't have an API documented for silences, so I'm wary of building something on an undocumented API.
However, I was pointed to amtool in Alertmanager. Have you used it? We could build a step that uses the xMatters Agent installed on the Alertmanager box to run this command and create the silence. Having an agent adds a little more complexity, but it also allows for behind-the-firewall interactions.

I've added this item to the stack, but I'm not sure I can provide a timeframe at this point.

@Alexvianet
Author

I have installed the xMatters Agent, but I have no idea how to make it run amtool or what to write in JavaScript:
Screenshot from 2019-10-25 16-34-29

@Alexvianet
Author

Alexvianet commented Oct 25, 2019

For me, curl:

curl https://alertmanager:9093/api/v1/silences -d '{
    "comment": "COMMENT",
    "createdBy": "USER THAT PUSHED Acknowledge",
    "endsAt": "CURRENT DATE +1h",
    "matchers": [
    {
        "isRegex": false,
        "name": "alertname",
        "value": "ALERTNAME FROM Inbound"
    }]
}'

(endsAt format, e.g.: "2019-10-26T17:09:53.652Z")

also works, but I have no idea how to write it in JavaScript either.
With amtool:
/usr/local/bin/amtool --alertmanager.url http://localhost:9093 silence add alertname="ALERTNAME FROM Inbound" --comment="COMMENT" --author="USER THAT PUSHED Acknowledge" --duration="${X}h"
does the same.

It would be very cool if we could set a specific downtime for alerts, with a pop-up numeric field next to the comment field for the number of hours to add to the current date, e.g.: "endsAt": "CURRENT DATE +${X}h"
${X} would come from the new pop-up field, e.g.:
Screenshot from 2019-10-25 17-39-18

@xMTinkerer
Contributor

Ok, cool. There are two options. The first is to use the "Response" trigger, which fires when a user responds. Unfortunately, this doesn't take dynamic input (such as the comment screen in your screenshot). This means you need to set up static responses such as "Silence 1h", "Silence 2h", etc. The script then needs to parse out the "2h" and run the command.
The other option is to use the "Comments" trigger and allow the user to enter a number. There isn't a way to add a second input field, so we'll have to use the comments input box.
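
For the first option, a minimal sketch of pulling the hours out of a static response label could look like this. It assumes the labels follow the "Silence Nh" pattern mentioned above; parseSilenceHours is a hypothetical helper name, and how the label text reaches the script is not shown.

```javascript
// Extract the number of hours from a response label such as "Silence 2h".
// Returns null when the label does not match the expected pattern.
// parseSilenceHours is a hypothetical helper, not part of any xMatters API.
function parseSilenceHours(responseLabel) {
  var match = /^silence\s+(\d{1,2})h$/i.exec(String(responseLabel).trim());
  return match ? parseInt(match[1], 10) : null;
}

console.log(parseSilenceHours('Silence 2h'));
console.log(parseSilenceHours('Acknowledge'));
```

Returning null for anything unexpected lets the calling script skip responses that are not silence requests.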

Unfortunately there might be a problem. In order to enter comments, you have to respond first. To get your response into the system ASAP, we process the response immediately. But since we have to wait for user input for the comment, the Comment trigger doesn't fire until they click submit on the comment. This means a response with a comment fires the Response trigger as well as the Comment trigger.
I'm not sure how amtool behaves if someone adds multiple silences, so this is something to keep in mind. I'd be interested in your take on this.

Anyway, assuming you want to go with the comments route, you will add a new outbound script that looks like so:

image

The agent has access to a Shell (doc here). I took a stab based on the command you provided (caution, untested):

if( annotation.response.response.toLowerCase() != 'silence' ) {
	console.log( 'Ignoring response "' + annotation.response.response + '"' );
	return;
}

// Create a shell by requiring the xm-shell library.
// If the integration is running in the cloud then this call will fail.
try
{
 var Shell = require('xm-shell');
}
catch (e)
{
 console.log("Could not load library:" + e);
 console.log("Running in cloud - exiting.");
 // Without the library there is no shell to run amtool in, so stop here.
 return;
}

// Probably need some error checking if the user
// didn't enter a positive integer..
// or a bash injection...
var duration = annotation.comment;

// request the operating system name from the shell
var osname = Shell.osname();
// create a script and pass some parameters into it.
var script = Shell.script(function () {/*#### PLACE YOUR BASH SCRIPT BETWEEN HERE ####
echo hello world

/usr/local/bin/amtool --alertmanager.url http://localhost:9093 silence add \
   alertname="${alertname}" --comment="${comment}" --author="${author}" --duration="${duration}h"
 #### AND HERE #### */},
{ alertname: 'ALERTNAME FROM Inbound', 
  comment: 'User ' + annotation.author.targetName + ' silenced alert.', 
  author: annotation.author.targetName, 
  duration: duration 
});

// Execute the script once and keep the result. Calling Shell.exec
// three times would run amtool (and add a silence) three times.
var result = Shell.exec('bash', script);
console.log(result.output());
console.log(result.exitCode());
console.log(result.error());

You can see the alertname, comment, author, and duration parameters being passed into the Shell script and referenced appropriately.

You can see details on the Comments trigger here and the annotation object here

Let me know how you get on with that!

@Alexvianet
Author

Alexvianet commented Oct 25, 2019 via email

@xMTinkerer
Contributor

If you just want to send the silence command to Alertmanager for a set duration every time, then you could use the Response trigger and set duration = 2h or whatever.
Note that the Response trigger uses the respondedTo object instead of the annotation object. Keep this in mind when you want to build out the comment.

@xMTinkerer
Contributor

xMTinkerer commented Oct 25, 2019

Oh and I think for the Comments trigger, to help deal with shell injection issues and validate the input is in the correct format, you could use this little expression on the duration value:

var duration = annotation.comment;

// Validate the comment is in the form Nh, where
// N is 1 or 2 digits and the h is optional.
// The ^ and $ anchors make the whole comment match, not just part of it.
if( !duration.match( /^\d{1,2}h?$/ ) ) {
   console.log( 'Comment "' + duration + '" is not in expected format of "Nh"' );
   return;
}

Edit, added a note explaining the RegEx
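
To illustrate why anchoring matters for this kind of check, here is a small sketch comparing an unanchored pattern with an anchored one. It also normalizes the comment to a plain number of hours, which would avoid ending up with "2hh" if the bash script appends its own "h". normalizeHours is a hypothetical helper name, and the two-digit limit is only an assumption carried over from the note above.

```javascript
// An unanchored pattern matches if the digits appear anywhere in the input.
var unanchored = /\d{1,2}h?/;
// An anchored pattern requires the whole input to be "N" or "Nh".
var anchored = /^\d{1,2}h?$/;

// Return the number of hours as an integer, or null for invalid input.
// normalizeHours is a hypothetical helper, not part of xm-shell.
function normalizeHours(comment) {
  var text = String(comment).trim();
  if (!anchored.test(text)) {
    return null;
  }
  return parseInt(text, 10); // parseInt ignores the trailing "h"
}

['2', '12h', '2h; rm -rf /', 'abc'].forEach(function (input) {
  console.log(JSON.stringify(input),
    'unanchored:', unanchored.test(input),
    'anchored:', anchored.test(input),
    'hours:', normalizeHours(input));
});
```

The third input is the interesting one: the unanchored pattern accepts it because it contains "2h", while the anchored one rejects it.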

@Alexvianet
Author

My script:

//Create a shell by requiring the xm-shell library.
//If the integration is running in the cloud then this call will fail.

if( annotation.response.response.toLowerCase() != 'silence' ) {
	console.log( 'Ignoring response "' + annotation.response.response + '"' );
	return;
}

try
{
 var Shell = require('xm-shell');
}
catch (e)
{
 console.log("Could not load library:" + e);
 console.log("Running in cloud - exiting.");
 return;
}

// Probably need some error checking if the user
// didn't enter a positive integer..
// or a bash injection...
var duration = annotation.comment;

// Validate the comment is in the form Nh, where
// N is 1 to 8 digits and the h is optional
if( !duration.match( /^\d{1,8}h?$/ ) ) {
   console.log( 'Comment "' + duration + '" is not in expected format of "Nh"' );
   return;
}
// request the operating system name from the shell
var osname = Shell.osname();
// create a script and pass some parameters into it.
var script = Shell.script(function () {/*#### PLACE YOUR BASH SCRIPT BETWEEN HERE ####
echo hello world

/usr/local/bin/amtool --alertmanager.url http://localhost:9093 silence add \
   alertname="${alertname}" --comment="${comment}" --author="${author}" --duration="${duration}h"
 #### AND HERE #### */},
{ alertname: 'ALERTNAME FROM Inbound', 
  comment: 'User ' + annotation.author.targetName + ' silenced alert.', 
  author: annotation.author.targetName, 
  duration: annotation.comment 
});

// Execute the script once and reuse the result.
var result = Shell.exec('bash', script);
console.log(result.output());
console.log(result.exitCode());
console.log(result.error());

Screenshot from 2019-11-06 15-56-21
Screenshot from 2019-11-06 15-56-12

and get:

Screenshot from 2019-11-06 15-56-58

@xMTinkerer
Contributor

Hrm, that's not good. Can you post the details of the /var/log/xmatters/xmatters-xa/agent-communication-xmatters.log file? If you'd rather email them to me tdepuy at xmatters.com I can take a look.

@Alexvianet
Author

Alexvianet commented Nov 11, 2019

xMatters systemd unit:

[Unit]
Description=xMatters Agent
After=syslog.target network.target

[Service]
Environment="XMATTERS_HOSTNAME=my.xmatters.com"
Environment="XMATTERS_KEY=key"
Environment="API_KEY=564caafcxxxxxxxxxxxxxxxx9763cfd"
Environment="XA_PROXY_IP=10.10.10.10"
Environment="XA_PROXY_PORT=8080"
Environment="XA_PROXY_DOMAIN=proxy.my.net"
Environment="XA_PROXY_BYPASS=127.0.0.1,localhost"
User=xmatters
WorkingDirectory=/opt/xmatters/xa
ExecStart=/opt/xmatters/xa/bin/systemd.start.sh
ExecStop=/opt/xmatters/xa/bin/systemd.stop.sh
Restart=on-failure
RestartSec=30s
SyslogFacility=local1
SyslogIdentifier=xmatters-xa

[Install]
WantedBy=multi-user.target

The /var/log/xmatters/xmatters-xa/agent-communication-xmatters.log file shows:

2019-11-11 05:10:05,381 11742 [Thread-415] com.xmatters.xagent.services.RemoteHyraxService ERROR --- Websocket connection to xMatters is lost. Connection closed 
2019-11-11 05:10:05,620 11742 [pool-1-thread-1] com.xmatters.xagent.services.RemoteHyraxService INFO --- Attempting to connect to xMatters websocket: 1 
2019-11-11 05:10:06,196 11742 [Thread-418] com.xmatters.xagent.services.RemoteHyraxService INFO --- Websocket connection to xMatters established. 
2019-11-11 06:10:16,054 11742 [Thread-418] com.xmatters.xagent.services.RemoteHyraxService ERROR --- Websocket connection to xMatters is lost. Connection closed 
2019-11-11 06:10:16,290 11742 [pool-1-thread-1] com.xmatters.xagent.services.RemoteHyraxService INFO --- Attempting to connect to xMatters websocket: 1 
2019-11-11 06:10:16,879 11742 [Thread-421] com.xmatters.xagent.services.RemoteHyraxService INFO --- Websocket connection to xMatters established. 
2019-11-11 07:10:26,738 11742 [Thread-421] com.xmatters.xagent.services.RemoteHyraxService ERROR --- Websocket connection to xMatters is lost. Connection closed 
2019-11-11 07:10:26,976 11742 [pool-1-thread-1] com.xmatters.xagent.services.RemoteHyraxService INFO --- Attempting to connect to xMatters websocket: 1 
2019-11-11 07:10:27,695 11742 [Thread-424] com.xmatters.xagent.services.RemoteHyraxService INFO --- Websocket connection to xMatters established. 
2019-11-11 08:10:37,553 11742 [Thread-424] com.xmatters.xagent.services.RemoteHyraxService ERROR --- Websocket connection to xMatters is lost. Connection closed 
2019-11-11 08:10:37,798 11742 [pool-1-thread-1] com.xmatters.xagent.services.RemoteHyraxService INFO --- Attempting to connect to xMatters websocket: 1 
2019-11-11 08:10:38,370 11742 [Thread-427] com.xmatters.xagent.services.RemoteHyraxService INFO --- Websocket connection to xMatters established. 
2019-11-11 09:10:48,228 11742 [Thread-427] com.xmatters.xagent.services.RemoteHyraxService ERROR --- Websocket connection to xMatters is lost. Connection closed 
2019-11-11 09:10:48,470 11742 [pool-1-thread-1] com.xmatters.xagent.services.RemoteHyraxService INFO --- Attempting to connect to xMatters websocket: 1 
2019-11-11 09:10:49,028 11742 [Thread-430] com.xmatters.xagent.services.RemoteHyraxService INFO --- Websocket connection to xMatters established. 
2019-11-11 10:10:58,886 11742 [Thread-430] com.xmatters.xagent.services.RemoteHyraxService ERROR --- Websocket connection to xMatters is lost. Connection closed 
2019-11-11 10:10:59,124 11742 [pool-1-thread-1] com.xmatters.xagent.services.RemoteHyraxService INFO --- Attempting to connect to xMatters websocket: 1 
2019-11-11 10:10:59,698 11742 [Thread-433] com.xmatters.xagent.services.RemoteHyraxService INFO --- Websocket connection to xMatters established. 

@xMTinkerer
Contributor

@Alexvianet Did you ever get this resolved? Are you still having issues with the agent?

@Alexvianet
Author

Nope, I'm still stuck at my previous comment, and I've been busy with other stuff.

@xMTinkerer
Contributor

Darn. Can you open a support ticket when you get a chance? I don't have insight into the backend to debug further. Cruise over here and they'll help you out:
https://support.xmatters.com/hc/en-us/requests/new
