{"id":3630,"date":"2018-10-07T08:14:26","date_gmt":"2018-10-07T08:14:26","guid":{"rendered":"http:\/\/dalelane.co.uk\/blog\/?p=3630"},"modified":"2019-07-07T11:38:38","modified_gmt":"2019-07-07T11:38:38","slug":"setting-up-slack-alerts-to-monitor-ibm-event-streams","status":"publish","type":"post","link":"https:\/\/dalelane.co.uk\/blog\/?p=3630","title":{"rendered":"Setting up Slack alerts to monitor IBM Event Streams"},"content":{"rendered":"<p><a href=\"https:\/\/www.ibm.com\/cloud\/event-streams\">IBM Event Streams<\/a> brings <a href=\"https:\/\/kafka.apache.org\/\">Apache Kafka<\/a> to IBM Cloud Private (together with a bunch of other useful stuff to make it easier to run and use Kafka). <\/p>\n<p>Monitoring is an important part of running a Kafka cluster. There are a variety of metrics that are useful indicators of the health of the cluster and serve as warnings of potential future problems. <\/p>\n<p>To that end, Event Streams collects metrics from all of the Kafka brokers and exports them to a <a href=\"https:\/\/prometheus.io\/docs\/introduction\/overview\/\">Prometheus<\/a>-based monitoring platform. <\/p>\n<p>There are three ways to use this:<\/p>\n<p><strong>1) A selection of metrics can be viewed from a dashboard in the Event Streams admin UI.<\/strong><br \/>\n<a href=\"https:\/\/www.flickr.com\/photos\/dalelane\/44231396755\/in\/datetaken-public\/\" title=\"eventstreams-monitoring-20181006-11\"><img loading=\"lazy\" decoding=\"async\" style=\"border: thin black solid\" src=\"https:\/\/farm2.staticflickr.com\/1906\/44231396755_30627eacfb.jpg\" width=\"450\" height=\"211\" alt=\"eventstreams-monitoring-20181006-11\"\/><\/a><br \/>\nThis is good for a quick way to get started.  
<\/p>\n<p><strong>2) Grafana is pre-configured and available out-of-the-box to create custom dashboards<\/strong><br \/>\n<a href=\"https:\/\/www.flickr.com\/photos\/dalelane\/45143027761\/in\/datetaken-public\/\" title=\"eventstreams-monitoring-20181006-13\"><img loading=\"lazy\" decoding=\"async\" style=\"border: thin black solid\" src=\"https:\/\/farm2.staticflickr.com\/1952\/45143027761_2a6d7f9dc4.jpg\" width=\"450\" height=\"220\" alt=\"eventstreams-monitoring-20181006-13\"\/><\/a><br \/>\nThis will be useful for long-term projects, as <a href=\"https:\/\/grafana.com\/grafana\">Grafana<\/a> lets you create dashboards showing the metrics that are most important for your unique needs. A <a href=\"https:\/\/github.com\/IBM\/charts\/blob\/master\/stable\/ibm-eventstreams-dev\/additionalFiles\/ibm-eventstreams-grafanadashboard.json\">sample dashboard<\/a> is included to help get you started. <\/p>\n<p><strong>3) Alerts can be created, so that metrics that meet predefined criteria can be used to push notifications to a variety of tools, like Slack, PagerDuty, HipChat, OpsGenie, email, and many, many more.<\/strong><br \/>\n<a href=\"https:\/\/www.flickr.com\/photos\/dalelane\/43329420650\/in\/datetaken-public\/\" title=\"eventstreams-monitoring-20181006-20\"><img loading=\"lazy\" decoding=\"async\" style=\"border: thin black solid\" src=\"https:\/\/farm2.staticflickr.com\/1903\/43329420650_7095f6dc53.jpg\" width=\"450\" height=\"142\" alt=\"eventstreams-monitoring-20181006-20\"\/><\/a><br \/>\nThis is useful for being able to respond to changes in the metrics values when you&#8217;re not looking at the Monitor UI or Grafana dashboard. 
<\/p>\n<p>For example, you might want a combination of alert approaches like:<\/p>\n<ul>\n<li>metrics and\/or metric values that might not be urgent but should get some attention could result in <strong>an automated email<\/strong> being sent to a team email address\n<\/li>\n<li>metrics and\/or metric values that suggest a more severe issue could result in <strong>a Slack message<\/strong> to a team workspace\n<\/li>\n<li>metrics and\/or metric values that suggest an urgent critical issue could result in creating <strong>a PagerDuty ticket<\/strong> so that it gets immediate attention\n<\/li>\n<\/ul>\n<p>This post is about the third of these uses of monitoring and metrics: <strong>how you can configure alerts based on the metrics available from your Kafka brokers in IBM Event Streams<\/strong>. <\/p>\n<p><!--more-->I&#8217;ll use Slack as a worked example of how to set up an alert for a specific metric value, but the same approach and config would work for the variety of use cases described above. <\/p>\n<p>The steps described below are:<\/p>\n<ul>\n<li type=\"square\">Preparing the destination where the alerts will be sent (e.g. Slack, PagerDuty, etc.)\n<\/li>\n<li type=\"square\">Choosing the metric(s) that should trigger an alert\n<\/li>\n<li type=\"square\">Specifying the criteria that should trigger the alert\n<\/li>\n<li type=\"square\">Defining where the alert should be sent\n<\/li>\n<li type=\"square\">Testing and viewing the alerts\n<\/li>\n<\/ul>\n<h3>Preparing the alert destination<\/h3>\n<p>In this case, I&#8217;m using Slack. 
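<\/p>\n<p>Once created, a Slack Incoming Webhook accepts plain JSON posted over HTTPS, which makes it easy to sanity-check from the command line before wiring it into Prometheus. The webhook URL below is a placeholder &#8211; substitute the one that Slack generates for you:<\/p>\n<pre style=\"border: thin solid silver; background-color: #eeeeee; padding: 0.7em; font-size: 1.1em; overflow: auto;\">dalelane$ <strong>curl -X POST -H 'Content-type: application\/json' --data '{\"text\":\"Test message\"}' https:\/\/hooks.slack.com\/services\/TXXXXXXXX\/BXXXXXXXX\/XXXXXXXXXXXXXXXXXXXXXXXX<\/strong><\/pre>\n<p>If the test message appears in the channel, the webhook URL is ready to use.<\/p>\n<p>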
The first step is to create an Incoming Webhook for my Slack workspace.<\/p>\n<p><a href=\"https:\/\/www.flickr.com\/photos\/dalelane\/30205748027\/in\/datetaken-public\/\" title=\"eventstreams-monitoring-20181006-9\"><img loading=\"lazy\" decoding=\"async\" style=\"border: thin black solid\" src=\"https:\/\/farm2.staticflickr.com\/1978\/30205748027_81f0ab98da.jpg\" width=\"450\" height=\"239\" alt=\"eventstreams-monitoring-20181006-9\"\/><\/a><\/p>\n<p>The value I need is the webhook URL. <\/p>\n<p><a href=\"https:\/\/www.flickr.com\/photos\/dalelane\/45095031202\/in\/datetaken-public\/\" title=\"eventstreams-monitoring-20181006-10\"><img loading=\"lazy\" decoding=\"async\" style=\"border: thin black solid\" src=\"https:\/\/farm2.staticflickr.com\/1951\/45095031202_4238ac75a3.jpg\" width=\"450\" height=\"383\" alt=\"eventstreams-monitoring-20181006-10\"\/><\/a><\/p>\n<h3>Choose the metric(s) to trigger an alert<\/h3>\n<p>One quick way to get a list of the metrics that are available is to use the drop-down list provided when you add a new panel in Grafana.<\/p>\n<p><a href=\"https:\/\/www.flickr.com\/photos\/dalelane\/43330385120\/in\/datetaken-public\/\" title=\"eventstreams-monitoring-20181006-14\"><img loading=\"lazy\" decoding=\"async\" style=\"border: thin black solid\" src=\"https:\/\/farm2.staticflickr.com\/1938\/43330385120_62a0b04297.jpg\" width=\"450\" height=\"182\" alt=\"eventstreams-monitoring-20181006-14\"\/><\/a><\/p>\n<p>A better way is to use the Prometheus API to fetch the labels available &#8211; an HTTP GET to https:\/\/CLUSTERIP:8443\/prometheus\/api\/v1\/label\/__name__\/values is a good starting point.<\/p>\n<p><a href=\"https:\/\/www.flickr.com\/photos\/dalelane\/43330384120\/in\/datetaken-public\/\" title=\"eventstreams-monitoring-20181006-16\"><img loading=\"lazy\" decoding=\"async\" style=\"border: thin black solid\" src=\"https:\/\/farm2.staticflickr.com\/1914\/43330384120_8014dba7af_n.jpg\" width=\"272\" height=\"320\" 
alt=\"eventstreams-monitoring-20181006-16\"\/><\/a><\/p>\n<p>An explanation of what the different metrics mean can be found in <a href=\"https:\/\/docs.confluent.io\/current\/kafka\/monitoring.html\">Monitoring Kafka<\/a>.<\/p>\n<p>(Note that not all of the metrics that Kafka can produce are published to Prometheus by default. The metrics that are published are controlled by a <a href=\"https:\/\/github.com\/IBM\/charts\/blob\/master\/stable\/ibm-eventstreams-dev\/templates\/metrics-configmap.yaml\">config-map<\/a>.)<\/p>\n<p>For the rest of this post, I&#8217;ll be using the number of under-replicated partitions.<\/p>\n<h3>Specifying the criteria for an alert<\/h3>\n<p>The next step is to specify the criteria that should trigger an alert. The place to do that is the <code>monitoring-prometheus-alertrules<\/code> config map.<\/p>\n<p>Start by having a look at the default empty list of rules. It should look something like this:<\/p>\n<pre style=\"border: thin solid silver; background-color: #eeeeee; padding: 0.7em; font-size: 1.1em; overflow: auto;\">dalelane$ <strong>kubectl get configmap -n kube-system monitoring-prometheus-alertrules -o yaml<\/strong>\r\n\r\napiVersion: v1\r\n<strong>data:<\/strong>\r\n  <strong>alert.rules: \"\"<\/strong>\r\nkind: ConfigMap\r\nmetadata:\r\n  creationTimestamp: 2018-10-05T13:07:48Z\r\n  labels:\r\n    app: monitoring-prometheus\r\n    chart: ibm-icpmonitoring-1.2.0\r\n    component: prometheus\r\n    heritage: Tiller\r\n    release: monitoring\r\n  name: monitoring-prometheus-alertrules\r\n  namespace: kube-system\r\n  resourceVersion: \"4564\"\r\n  selfLink: \/api\/v1\/namespaces\/kube-system\/configmaps\/monitoring-prometheus-alertrules\r\n  uid: a87b5766-c89f-11e8-9f94-00000a3304c0<\/pre>\n<p>For this post, I&#8217;ll be adding a rule that will trigger if the number of under-replicated partitions is greater than 0 for over a minute.<\/p>\n<pre style=\"border: thin solid silver; background-color: #eeeeee; padding: 0.7em; 
dalelane$">
font-size: 1.1em; overflow: auto;\">dalelane$ <strong>kubectl edit configmap -n kube-system monitoring-prometheus-alertrules<\/strong>\r\n\r\napiVersion: v1\r\ndata:\r\n  <strong>sample.rules: |-<\/strong>\r\n    <strong>groups:<\/strong>\r\n    <strong>- name: alert.rules<\/strong>\r\n      #\r\n      # Each of the alerts you want to create will be listed here\r\n      <strong>rules:<\/strong>\r\n      # Posts an alert if there are any under-replicated partitions\r\n      #  for longer than a minute\r\n      <strong>- alert: under_replicated_partitions<\/strong>\r\n        <strong>expr: kafka_server_replicamanager_underreplicatedpartitions_value > 0<\/strong>\r\n        <strong>for: 1m<\/strong>\r\n        labels:\r\n          # Labels should match the alert manager so that it is received by the Slack hook\r\n          severity: critical\r\n        # The contents of the Slack messages that are posted are defined here\r\n        <strong>annotations:<\/strong>\r\n          <strong>identifier: \"Under-replicated partitions\"<\/strong>\r\n          <strong>description: \"There are {{ $value }} under-replicated partition(s) reported by broker {{ $labels.kafka }}\"<\/strong>\r\nkind: ConfigMap\r\nmetadata:\r\n  creationTimestamp: 2018-10-05T13:07:48Z\r\n  labels:\r\n    app: monitoring-prometheus\r\n    chart: ibm-icpmonitoring-1.2.0\r\n    component: prometheus\r\n    heritage: Tiller\r\n    release: monitoring\r\n  name: monitoring-prometheus-alertrules\r\n  namespace: kube-system\r\n  resourceVersion: \"84156\"\r\n  selfLink: \/api\/v1\/namespaces\/kube-system\/configmaps\/monitoring-prometheus-alertrules\r\n  uid: a87b5766-c89f-11e8-9f94-00000a3304c0<\/pre>\n<p><em>(You only need to fill in the <code>data<\/code> section &#8211; the <code>metadata<\/code> will change when you make changes, but you can leave that alone).<\/em><\/p>\n<h3>Defining where the alert should be sent<\/h3>\n<p>The next step is to define what should happen with the alerts. 
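dalelane$">
<\/p>\n<p>It is worth knowing that Alertmanager configuration can be validated offline with <code>amtool<\/code>, which ships with Prometheus Alertmanager. For example, if you draft the config in a local file first (the filename <code>alertmanager.yml<\/code> here is just an assumption), you can check it for mistakes before putting it into the cluster:<\/p>\n<pre style=\"border: thin solid silver; background-color: #eeeeee; padding: 0.7em; font-size: 1.1em; overflow: auto;\">dalelane$ <strong>amtool check-config alertmanager.yml<\/strong><\/pre>\n<p>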
The place to do that is the <code>monitoring-prometheus-alertmanager<\/code> config map.<\/p>\n<p>Start by having a look at the default empty list of receivers. It should look something like this:<\/p>\n<pre style=\"border: thin solid silver; background-color: #eeeeee; padding: 0.7em; font-size: 1.1em; overflow: auto;\">dalelane$ <strong>kubectl get configmap -n kube-system monitoring-prometheus-alertmanager -o yaml<\/strong>\r\n\r\napiVersion: v1\r\n<strong>data:<\/strong>\r\n  <strong>alertmanager.yml: |-<\/strong>\r\n    <strong>global:<\/strong>\r\n    <strong>receivers:<\/strong>\r\n      <strong>- name: default-receiver<\/strong>\r\n    <strong>route:<\/strong>\r\n      <strong>group_wait: 10s<\/strong>\r\n      <strong>group_interval: 5m<\/strong>\r\n      <strong>receiver: default-receiver<\/strong>\r\n      <strong>repeat_interval: 3h<\/strong>\r\nkind: ConfigMap\r\nmetadata:\r\n  creationTimestamp: 2018-10-05T13:07:48Z\r\n  labels:\r\n    app: monitoring-prometheus\r\n    chart: ibm-icpmonitoring-1.2.0\r\n    component: alertmanager\r\n    heritage: Tiller\r\n    release: monitoring\r\n  name: monitoring-prometheus-alertmanager\r\n  namespace: kube-system\r\n  resourceVersion: \"4565\"\r\n  selfLink: \/api\/v1\/namespaces\/kube-system\/configmaps\/monitoring-prometheus-alertmanager\r\n  uid: a87bdb44-c89f-11e8-9f94-00000a3304c0<\/pre>\n<p>For this post, I&#8217;ll be adding the Slack webhook I created before.<\/p>\n<pre style=\"border: thin solid silver; background-color: #eeeeee; padding: 0.7em; font-size: 1.1em; overflow: auto;\">dalelane$ <strong>kubectl edit configmap -n kube-system monitoring-prometheus-alertmanager<\/strong>\r\napiVersion: v1\r\ndata:\r\n  alertmanager.yml: |-\r\n    global:\r\n      # This is the URL for the Incoming Webhook you created in Slack\r\n      <strong>slack_api_url:  https:\/\/hooks.slack.com\/services\/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<\/strong>\r\n    receivers:\r\n      - name: default-receiver\r\n        
#\r\n        # Adding a Slack channel integration to the default Prometheus receiver\r\n        #  see <a href=\"https:\/\/prometheus.io\/docs\/alerting\/configuration\/#%3Cslack_config%3E\">https:\/\/prometheus.io\/docs\/alerting\/configuration\/#%3Cslack_config%3E<\/a>\r\n        #  for details about the values to enter\r\n        slack_configs:\r\n        - <strong>send_resolved: true<\/strong>\r\n\r\n          # The name of the Slack channel that alerts should be posted to\r\n          <strong>channel: \"#ibm-eventstreams-demo\"<\/strong>\r\n\r\n          # The username to post alerts as\r\n          <strong>username: \"IBM Event Streams\"<\/strong>\r\n\r\n          # An icon for posts in Slack\r\n          <strong>icon_url: https:\/\/developer.ibm.com\/messaging\/wp-content\/uploads\/sites\/18\/2018\/09\/icon_dev_32_24x24.png<\/strong>\r\n\r\n          #\r\n          # The content for posts to Slack when alert conditions are fired\r\n          # Improves on the formatting from the default, with support for handling\r\n          #  alerts containing multiple events.\r\n          # (Modified from the examples in\r\n          #   <a href=\"https:\/\/medium.com\/quiq-blog\/better-slack-alerts-from-prometheus-49125c8c672b\">https:\/\/medium.com\/quiq-blog\/better-slack-alerts-from-prometheus-49125c8c672b<\/a>)\r\n          <strong>title: |-<\/strong>\r\n            <strong>[{{ .Status | toUpper }}{{ if eq .Status \"firing\" }}:{{ .Alerts.Firing | len }}{{ end }}] {{ if or (and (eq (len .Alerts.Firing) 1) (eq (len .Alerts.Resolved) 0)) (and (eq (len .Alerts.Firing) 0) (eq (len .Alerts.Resolved) 1)) }}{{ range .Alerts.Firing }} @ {{ .Annotations.identifier }}{{ end }}{{ range .Alerts.Resolved }} @ {{ .Annotations.identifier }}{{ end }}{{ end }}<\/strong>\r\n          <strong>text: |-<\/strong>\r\n            <strong>{{ if or (and (eq (len .Alerts.Firing) 1) (eq (len .Alerts.Resolved) 0)) (and (eq (len .Alerts.Firing) 0) (eq (len .Alerts.Resolved) 1)) }}<\/strong>\r\n 
           <strong>{{ range .Alerts.Firing }}{{ .Annotations.description }}{{ end }}{{ range .Alerts.Resolved }}{{ .Annotations.description }}{{ end }}<\/strong>\r\n            <strong>{{ else }}<\/strong>\r\n            <strong>{{ if gt (len .Alerts.Firing) 0 }}<\/strong>\r\n            <strong>*Alerts Firing:*<\/strong>\r\n            <strong>{{ range .Alerts.Firing }}- {{ .Annotations.identifier }}: {{ .Annotations.description }}<\/strong>\r\n            <strong>{{ end }}{{ end }}<\/strong>\r\n            <strong>{{ if gt (len .Alerts.Resolved) 0 }}<\/strong>\r\n            <strong>*Alerts Resolved:*<\/strong>\r\n            <strong>{{ range .Alerts.Resolved }}- {{ .Annotations.identifier }}: {{ .Annotations.description }}<\/strong>\r\n            <strong>{{ end }}{{ end }}<\/strong>\r\n            <strong>{{ end }}<\/strong>\r\n    route:\r\n      group_wait: 10s\r\n      group_interval: 5m\r\n      receiver: default-receiver\r\n      repeat_interval: 3h\r\n      #\r\n      # The criteria for events that should go to Slack\r\n      <strong>routes:<\/strong>\r\n      <strong>- match:<\/strong>\r\n          <strong>severity: critical<\/strong>\r\n        <strong>receiver: default-receiver<\/strong>\r\nkind: ConfigMap\r\nmetadata:\r\n  creationTimestamp: 2018-10-05T13:07:48Z\r\n  labels:\r\n    app: monitoring-prometheus\r\n    chart: ibm-icpmonitoring-1.2.0\r\n    component: alertmanager\r\n    heritage: Tiller\r\n    release: monitoring\r\n  name: monitoring-prometheus-alertmanager\r\n  namespace: kube-system\r\n  resourceVersion: \"4565\"\r\n  selfLink: \/api\/v1\/namespaces\/kube-system\/configmaps\/monitoring-prometheus-alertmanager\r\n  uid: a87bdb44-c89f-11e8-9f94-00000a3304c0<\/pre>\n<p><em>(As before, you only need to fill in the <code>data<\/code> section &#8211; the <code>metadata<\/code> will change when you make changes, but you can leave that alone)<\/em>.<\/p>\n<p>It might take a minute or two before the alert rule takes effect. 
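<\/p>\n<p>One way to check that the rule has been loaded is the Prometheus HTTP API. As before, <code>CLUSTERIP<\/code> is a placeholder for your own cluster address; <code>-k<\/code> skips certificate verification for a self-signed certificate, and depending on how your cluster is secured you may also need to add authentication to the request:<\/p>\n<pre style=\"border: thin solid silver; background-color: #eeeeee; padding: 0.7em; font-size: 1.1em; overflow: auto;\">dalelane$ <strong>curl -k https:\/\/CLUSTERIP:8443\/prometheus\/api\/v1\/rules<\/strong><\/pre>\n<p>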
You can tell when it is ready by reviewing the <strong>Alerts<\/strong> or <strong>Rules<\/strong> tabs in the Prometheus UI (https:\/\/CLUSTERIP:8443\/prometheus). <\/p>\n<p><a href=\"https:\/\/www.flickr.com\/photos\/dalelane\/30206221287\/in\/datetaken-public\/\" title=\"eventstreams-monitoring-20181006-18\"><img loading=\"lazy\" decoding=\"async\" style=\"border: thin black solid\" src=\"https:\/\/farm2.staticflickr.com\/1931\/30206221287_8d22c453f1.jpg\" width=\"450\" height=\"212\" alt=\"eventstreams-monitoring-20181006-18\"\/><\/a><\/p>\n<p><a href=\"https:\/\/www.flickr.com\/photos\/dalelane\/43330787530\/in\/datetaken-public\/\" title=\"eventstreams-monitoring-20181006-17\"><img loading=\"lazy\" decoding=\"async\" style=\"border: thin black solid\" src=\"https:\/\/farm2.staticflickr.com\/1978\/43330787530_cff56c65b6.jpg\" width=\"450\" height=\"176\" alt=\"eventstreams-monitoring-20181006-17\"\/><\/a><\/p>\n<p>That&#8217;s it.<\/p>\n<p>All that remains is to show the alerts in action.<\/p>\n<h3>Testing &#8211; viewing the alerts<\/h3>\n<h4>Individual alerts<\/h4>\n<p>If I intentionally knock over one of the Kafka brokers in my cluster, I can see the effect in the Prometheus alerts view. 
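<\/p>\n<p>A blunt way to &#8220;knock over&#8221; a broker is to delete its pod and let Kubernetes reschedule it. The namespace and pod name below are hypothetical &#8211; use <code>kubectl get pods<\/code> to find the Kafka broker pods for your own release:<\/p>\n<pre style=\"border: thin solid silver; background-color: #eeeeee; padding: 0.7em; font-size: 1.1em; overflow: auto;\">dalelane$ <strong>kubectl get pods -n eventstreams | grep kafka<\/strong>\r\ndalelane$ <strong>kubectl delete pod -n eventstreams eventstreams-ibm-es-kafka-sts-0<\/strong><\/pre>\n<p>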
Until the 1-minute threshold I specified is exceeded, the alert shows as <code>PENDING<\/code>.<\/p>\n<p><a href=\"https:\/\/www.flickr.com\/photos\/dalelane\/31277195798\/in\/datetaken-public\/\" title=\"eventstreams-monitoring-20181006-1\"><img loading=\"lazy\" decoding=\"async\" style=\"border: thin black solid\" src=\"https:\/\/farm2.staticflickr.com\/1966\/31277195798_ee6663be04.jpg\" width=\"450\" height=\"182\" alt=\"eventstreams-monitoring-20181006-1\"\/><\/a><\/p>\n<p>If the number of under-replicated partitions remains above 0 for a minute, the status changes to <code>FIRING<\/code>.<\/p>\n<p><a href=\"https:\/\/www.flickr.com\/photos\/dalelane\/43337062620\/in\/datetaken-public\/\" title=\"eventstreams-monitoring-20181006-2\"><img loading=\"lazy\" decoding=\"async\" style=\"border: thin black solid\"  src=\"https:\/\/farm2.staticflickr.com\/1931\/43337062620_912c29d471.jpg\" width=\"450\" height=\"185\" alt=\"eventstreams-monitoring-20181006-2\"\/><\/a><\/p>\n<p>More importantly, an alert is posted to the receiver that I defined before &#8211; in this case, my Slack channel.<\/p>\n<p><a href=\"https:\/\/www.flickr.com\/photos\/dalelane\/43329420650\/in\/datetaken-public\/\" title=\"eventstreams-monitoring-20181006-20\"><img loading=\"lazy\" decoding=\"async\" style=\"border: thin black solid\" src=\"https:\/\/farm2.staticflickr.com\/1903\/43329420650_7095f6dc53.jpg\" width=\"450\" height=\"142\" alt=\"eventstreams-monitoring-20181006-20\"\/><\/a><\/p>\n<p>If I leave the cluster alone to recover, a new alert is posted once the number of under-replicated partitions returns to 0. 
(If you don&#8217;t want it to do that, you can set <code>send_resolved<\/code> to false in the config above).<\/p>\n<p><a href=\"https:\/\/www.flickr.com\/photos\/dalelane\/43337105950\/in\/datetaken-public\/\" title=\"eventstreams-monitoring-20181006-21\"><img loading=\"lazy\" decoding=\"async\" style=\"border: thin black solid\" src=\"https:\/\/farm2.staticflickr.com\/1973\/43337105950_dbe5f2cf62.jpg\" width=\"450\" height=\"156\" alt=\"eventstreams-monitoring-20181006-21\"\/><\/a><\/p>\n<h4>Multiple alerts<\/h4>\n<p>Another example &#8211; but this time I&#8217;ll leave the broker &#8220;broken&#8221; for a bit longer. From the Grafana dashboard, you can create a view to monitor the same value.<\/p>\n<p><a href=\"https:\/\/www.flickr.com\/photos\/dalelane\/30212860757\/in\/datetaken-public\/\" title=\"eventstreams-monitoring-20181006-6\"><img loading=\"lazy\" decoding=\"async\" style=\"border: thin black solid\" src=\"https:\/\/farm2.staticflickr.com\/1961\/30212860757_c466a498c8.jpg\" width=\"450\" height=\"194\" alt=\"eventstreams-monitoring-20181006-6\"\/><\/a><\/p>\n<p>This time, there are two brokers that are reporting under-replicated partitions. 
The alerts that are posted can include multiple values in such cases:<\/p>\n<p><a href=\"https:\/\/www.flickr.com\/photos\/dalelane\/44429179854\/in\/datetaken-public\/\" title=\"eventstreams-monitoring-20181006-22\"><img loading=\"lazy\" decoding=\"async\" style=\"border: thin black solid\" src=\"https:\/\/farm2.staticflickr.com\/1936\/44429179854_858e3829fc.jpg\" width=\"450\" height=\"234\" alt=\"eventstreams-monitoring-20181006-22\"\/><\/a><\/p>\n<p>I left the cluster in this state for a little longer this time.<\/p>\n<p><a href=\"https:\/\/www.flickr.com\/photos\/dalelane\/44429182584\/in\/datetaken-public\/\" title=\"eventstreams-monitoring-20181006-23\"><img loading=\"lazy\" decoding=\"async\" style=\"border: thin black solid\" src=\"https:\/\/farm2.staticflickr.com\/1924\/44429182584_b62946b7d6.jpg\" width=\"450\" height=\"189\" alt=\"eventstreams-monitoring-20181006-23\"\/><\/a><\/p>\n<p>As before, once the value returns to 0, a new notification is posted that the alert has been resolved.<\/p>\n<p><a href=\"https:\/\/www.flickr.com\/photos\/dalelane\/44429185314\/in\/datetaken-public\/\" title=\"eventstreams-monitoring-20181006-24\"><img loading=\"lazy\" decoding=\"async\" style=\"border: thin black solid\" src=\"https:\/\/farm2.staticflickr.com\/1969\/44429185314_6b7c5da887.jpg\" width=\"450\" height=\"340\" alt=\"eventstreams-monitoring-20181006-24\"\/><\/a><\/p>\n<h3>Summary<\/h3>\n<p>You can use this technique to generate alerts in a variety of different applications, including <a href=\"https:\/\/prometheus.io\/docs\/alerting\/configuration\/#%3Chipchat_config%3E\">HipChat<\/a>, <a href=\"https:\/\/prometheus.io\/docs\/alerting\/configuration\/#%3Cpagerduty_config%3E\">PagerDuty<\/a>, <a href=\"https:\/\/prometheus.io\/docs\/alerting\/configuration\/#%3Copsgenie_config%3E\">OpsGenie<\/a>, <a href=\"https:\/\/prometheus.io\/docs\/alerting\/configuration\/#%3Cwechat_config%3E\">WeChat<\/a> and <a 
href=\"https:\/\/prometheus.io\/docs\/alerting\/configuration\/#%3Cemail_config%3E\">sending emails<\/a>. You can also use this technique to generate <a href=\"https:\/\/prometheus.io\/docs\/alerting\/configuration\/#%3Chttp_config%3E\">HTTP calls<\/a>, which lets you easily do custom things with the alerts if you define a flow in something like Node-RED or App Connect. <\/p>\n","protected":false},"excerpt":{"rendered":"<p>IBM Event Streams brings Apache Kafka to IBM Cloud Private (together with a bunch of other useful stuff to make it easier to run and use Kafka). Monitoring is an important part of running a Kafka cluster. There are a variety of metrics that are useful indicators of the health of the cluster and serve [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[7,4],"tags":[593,582,583,584],"class_list":["post-3630","post","type-post","status-publish","format-standard","hentry","category-code","category-ibm","tag-apachekafka","tag-eventstreams","tag-ibmeventstreams","tag-kafka"],"_links":{"self":[{"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=\/wp\/v2\/posts\/3630","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=3630"}],"version-history":[{"count":0,"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=\/wp\/v2\/posts\/3630\/revisions"}],"wp:attachment":[{"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=3630"}],"wp:term":[{"taxonomy":"category","embedd
able":true,"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=3630"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=3630"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}