{"id":5152,"date":"2024-04-08T21:14:43","date_gmt":"2024-04-08T21:14:43","guid":{"rendered":"https:\/\/dalelane.co.uk\/blog\/?p=5152"},"modified":"2024-04-18T20:06:22","modified_gmt":"2024-04-18T20:06:22","slug":"using-mirror-maker-2-with-ibm-event-streams-to-create-a-failover-cluster","status":"publish","type":"post","link":"https:\/\/dalelane.co.uk\/blog\/?p=5152","title":{"rendered":"Using Mirror Maker 2 with IBM Event Streams to create a failover cluster"},"content":{"rendered":"<p><strong>This is the fourth in a series of blog posts sharing examples of ways to use Mirror Maker 2 with IBM Event Streams.<\/strong><\/p>\n<ul>\n<li><a href=\"https:\/\/dalelane.co.uk\/blog\/?p=5098\"><strong>Using Mirror Maker 2 to aggregate events from multiple regions<\/strong><\/a><\/li>\n<li><a href=\"https:\/\/dalelane.co.uk\/blog\/?p=5111\"><strong>Using Mirror Maker 2 to broadcast events to multiple regions<\/strong><\/a><\/li>\n<li><a href=\"https:\/\/dalelane.co.uk\/blog\/?p=5142\"><strong>Using Mirror Maker 2 to share topics across multiple regions<\/strong><\/a><\/li>\n<li><a href=\"https:\/\/dalelane.co.uk\/blog\/?p=5152\"><strong>Using Mirror Maker 2 to create a failover cluster<\/strong><\/a><\/li>\n<li><a href=\"https:\/\/dalelane.co.uk\/blog\/?p=5171\"><strong>Using Mirror Maker 2 to restore events from a backup cluster<\/strong><\/a><\/li>\n<li><a href=\"https:\/\/dalelane.co.uk\/blog\/?p=5191\"><strong>Using Mirror Maker 2 to migrate to a different region<\/strong><\/a><\/li>\n<\/ul>\n<p>Mirror Maker 2 is a powerful and flexible tool for moving Kafka events between Kafka clusters.<\/p>\n<p>For this fourth post, I\u2019ll look at using Mirror Maker to create an <strong>active\/passive topology with a backup cluster ready to failover to<\/strong>.<\/p>\n<p><!--more-->This is a different type of demo to the last three. 
In previous posts, I&#8217;ve shown how to use MM2 as a part of your topology &#8211; illustrating the benefits of mirroring to optimize normal day-to-day operation.<\/p>\n<p>In this post, I&#8217;ll show how to use MM2 to create a passive backup environment, only to be used in the event of the loss of the active primary environment.<\/p>\n<h3>Demo<\/h3>\n<p>For a demonstration of this, I created a three-region version of this pattern:<\/p>\n<p><img decoding=\"async\" style=\"border: thin black solid; width: 100%; max-width: 600px;\" src=\"https:\/\/github.com\/dalelane\/eventstreams-mirrormaker2-demos\/blob\/master\/05-active-passive\/diagram.png?raw=true\"\/><\/p>\n<p>Three Kubernetes namespaces (&#8220;north-america&#8221;, &#8220;south-america&#8221;, &#8220;europe&#8221;) represent three different regions.<\/p>\n<p>The &#8220;North America region&#8221; represents the primary, active environment for the Kafka cluster.<\/p>\n<p>Applications run in the &#8220;South America region&#8221; and produce and consume from topics in the Kafka cluster.<\/p>\n<p>As with previous posts, the producer application is regularly producing randomly generated events, themed around a fictional clothing retailer, <a href=\"https:\/\/github.com\/IBM\/kafka-connect-loosehangerjeans-source\/\">Loosehanger Jeans<\/a>.<\/p>\n<p>In the background, Mirror Maker 2 is maintaining a passive mirror of the Kafka cluster in the &#8220;Europe region&#8221;.<\/p>\n<p>After simulating the loss of the &#8220;North America region&#8221;, applications switch to using the &#8220;Europe region&#8221;.<\/p>\n<p><img decoding=\"async\" style=\"border: thin black solid; width: 100%; max-width: 600px;\" src=\"https:\/\/github.com\/dalelane\/eventstreams-mirrormaker2-demos\/blob\/master\/05-active-passive\/diagram-failover.png?raw=true\"\/><\/p>\n<p>The applications are able to resume more-or-less where they left off before the failover, enabling our fictional Loosehanger Jeans business to 
continue.<\/p>\n<h3>To create the demo for yourself<\/h3>\n<h4>Setting everything up<\/h4>\n<p>There is an Ansible playbook here which creates the first stage of this:<br \/>\n<a href=\"https:\/\/github.com\/dalelane\/eventstreams-mirrormaker2-demos\/blob\/master\/05-active-passive\/setup.yaml\">github.com\/dalelane\/eventstreams-mirrormaker2-demos\/blob\/master\/05-active-passive\/setup.yaml<\/a><\/p>\n<p>An example of how to run it can be found in the script at: <br \/><a href=\"https:\/\/github.com\/dalelane\/eventstreams-mirrormaker2-demos\/blob\/master\/setup-05-active-passive.sh\"><code>setup-05-active-passive.sh<\/code><\/a><\/p>\n<p>This gets you to this state:<\/p>\n<p><img decoding=\"async\" style=\"border: thin black solid; width: 100%; max-width: 600px;\" src=\"https:\/\/github.com\/dalelane\/eventstreams-mirrormaker2-demos\/blob\/master\/05-active-passive\/diagram.png?raw=true\"\/><\/p>\n<p>This script will also display the URL and username\/password for the Event Streams web UI for North America and Europe regions, to make it easier to log in and see the events.<\/p>\n<p>Once you&#8217;ve created the demo, you can run the <a href=\"https:\/\/github.com\/dalelane\/eventstreams-mirrormaker2-demos\/blob\/master\/consumer-southamerica.sh\"><code>consumer-southamerica.sh<\/code><\/a> script to see the events being received by the consumer application in the &#8220;South America region&#8221;.<\/p>\n<p><img decoding=\"async\" style=\"border: thin black solid; width: 100%; max-width: 600px;\" src=\"https:\/\/github.com\/dalelane\/eventstreams-mirrormaker2-demos\/blob\/master\/docs\/activepassive-consumer.png?raw=true\"\/><\/p>\n<p>If you log in to the Event Streams web UI for the active cluster in the &#8220;North America region&#8221;, you will see information about the consumer application listed there. 
<\/p>\n<p>You should see that the offset lag for every partition is 0 (or close to it), as the consumer is running and keeping up with the events on each of the LH topics.<\/p>\n<p><img decoding=\"async\" style=\"border: thin black solid; width: 100%; max-width: 600px;\" src=\"https:\/\/github.com\/dalelane\/eventstreams-mirrormaker2-demos\/blob\/master\/docs\/activepassive-consumer-group-active.png?raw=true\"\/><\/p>\n<p>If you log in to the Event Streams web UI for the passive cluster in the &#8220;Europe region&#8221;, you will see the consumer application listed there as well. <\/p>\n<p>This isn&#8217;t a separate application. No consumer is connected to the &#8220;Europe region&#8221;. <\/p>\n<p>For this scenario, Mirror Maker is mirroring the state of consumer applications as well as topics &#8211; this is a mirrored record of the same application consuming from the &#8220;North America region&#8221;.<\/p>\n<p>Notice that the offset lag shown for each partition is likely to be anything up to around 25.<\/p>\n<p><img decoding=\"async\" style=\"border: thin black solid; width: 100%; max-width: 600px;\" src=\"https:\/\/github.com\/dalelane\/eventstreams-mirrormaker2-demos\/blob\/master\/docs\/activepassive-consumer-group-passive.png?raw=true\"\/><\/p>\n<p>This is because, for this demo, I configured Mirror Maker to wait until the application&#8217;s offset has increased by 25 (since the last time it was synced) before mirroring the updated offset. 
<\/p>\n<p><script src=\"https:\/\/emgithub.com\/embed-v2.js?target=https%3A%2F%2Fgithub.com%2Fdalelane%2Feventstreams-mirrormaker2-demos%2Fblob%2F019a09af57dc5a82f40b027dbf06efafc9904c11%2F05-active-passive%2Ftemplates%2Fmm2.yaml%23L43-L44&#038;style=default&#038;type=code&#038;showBorder=on&#038;showLineNumbers=on&#038;showFileMeta=on&#038;showFullPath=on\"><\/script><\/p>\n<h4>Simulating a disaster<\/h4>\n<p>In this demo, Kubernetes namespaces are being used to represent regions. To represent the total loss of the &#8220;North America region&#8221;, you can delete the <code>north-america<\/code> namespace.<\/p>\n<pre style=\"border: thin #AA0000 solid; color: #770000; padding: 1em; font-size: 0.95em; background-color: #ffffc0;\">\noc delete project north-america\n<\/pre>\n<h4>Failing over to the &#8220;Europe region&#8221;<\/h4>\n<p>There is an Ansible playbook which performs the failover. <\/p>\n<p>This playbook re-configures both the producer and consumer applications to resume what they were doing, using the Kafka cluster in the &#8220;Europe region&#8221;: <a href=\"https:\/\/github.com\/dalelane\/eventstreams-mirrormaker2-demos\/blob\/master\/05-active-passive\/failover.yaml\">github.com\/dalelane\/eventstreams-mirrormaker2-demos\/blob\/master\/05-active-passive\/failover.yaml<\/a><\/p>\n<p>An example of how to run it can be found in the script at: <br \/><a href=\"https:\/\/github.com\/dalelane\/eventstreams-mirrormaker2-demos\/blob\/master\/setup-05-failover-to-passive.sh\"><code>setup-05-failover-to-passive.sh<\/code><\/a><\/p>\n<p>This gets you to this state:<\/p>\n<p><img decoding=\"async\" style=\"border: thin black solid; width: 100%; max-width: 600px;\" src=\"https:\/\/github.com\/dalelane\/eventstreams-mirrormaker2-demos\/blob\/master\/05-active-passive\/diagram-failover.png?raw=true\"\/><\/p>\n<p>Now that there really is an application consuming from the Kafka cluster in the &#8220;Europe region&#8221;, the Event Streams UI for the &#8220;Europe 
region&#8221; will show a live view of the consumer.<\/p>\n<p><img decoding=\"async\" style=\"border: thin black solid; width: 100%; max-width: 600px;\" src=\"https:\/\/github.com\/dalelane\/eventstreams-mirrormaker2-demos\/blob\/master\/docs\/activepassive-consumer-group-afterfailover.png?raw=true\"\/><\/p>\n<p>After the failover, you can run the <a href=\"https:\/\/github.com\/dalelane\/eventstreams-mirrormaker2-demos\/blob\/master\/consumer-southamerica.sh\"><code>consumer-southamerica.sh<\/code><\/a> script again to see the events received by the consumer application.<\/p>\n<h3>Observing what happened<\/h3>\n<p>You can use the output from the consumer application before and after the failover to see what happened for yourself.<\/p>\n<p>Filtering the logs for a specific topic name can help make it easier to follow. For example, when I grep&#8217;ped my logs to show only lines with <code>LH.ORDERS<\/code>, I got this for the end of the <a href=\"https:\/\/github.com\/dalelane\/eventstreams-mirrormaker2-demos\/blob\/master\/docs\/logs\/activepassive-consumer-beforefailover.log\">consumer log immediately before the failover<\/a>:<\/p>\n<p><img decoding=\"async\" style=\"border: thin black solid; width: 100%; max-width: 600px;\" src=\"https:\/\/github.com\/dalelane\/eventstreams-mirrormaker2-demos\/blob\/master\/docs\/activepassive-consumer-beforefailover.png?raw=true\"\/><\/p>\n<p>And I got this when I grep&#8217;ped the start of the <a href=\"https:\/\/github.com\/dalelane\/eventstreams-mirrormaker2-demos\/blob\/master\/docs\/logs\/activepassive-consumer-afterfailover.log\">consumer log immediately after the failover<\/a>:<\/p>\n<p><img decoding=\"async\" style=\"border: thin black solid; width: 100%; max-width: 600px;\" src=\"https:\/\/github.com\/dalelane\/eventstreams-mirrormaker2-demos\/blob\/master\/docs\/activepassive-consumer-afterfailover.png?raw=true\"\/><\/p>\n<p>I&#8217;ve colour-coded the rows to make it easier to see what 
happened.<\/p>\n<p><strong>Green<\/strong> : messages that were produced to the &#8220;North America region&#8221;, and consumed from the &#8220;North America region&#8221;<\/p>\n<p><strong>Red<\/strong> : messages that were produced to the &#8220;North America region&#8221;, and consumed <strong>twice<\/strong> &#8211; from the &#8220;North America region&#8221;, and then again from the &#8220;Europe region&#8221;<\/p>\n<p><strong>Blue<\/strong> : messages that were produced to the &#8220;Europe region&#8221;, and consumed from the &#8220;Europe region&#8221;.<\/p>\n<p>Roughly 25 events (29 in this case) were processed twice &#8211; before and after the failover.<\/p>\n<p>However, if you check the topic you should see that there are no duplicate events on the topic, as Mirror Maker has been configured to avoid this:<\/p>\n<p><script src=\"https:\/\/emgithub.com\/embed-v2.js?target=https%3A%2F%2Fgithub.com%2Fdalelane%2Feventstreams-mirrormaker2-demos%2Fblob%2F019a09af57dc5a82f40b027dbf06efafc9904c11%2F05-active-passive%2Ftemplates%2Fmm2.yaml%23L115-L116&#038;style=default&#038;type=code&#038;showBorder=on&#038;showLineNumbers=on&#038;showFileMeta=on&#038;showFullPath=on\"><\/script><\/p>\n<p>So the rows in red are events that appear only once on the topic, but were processed twice by the consumer &#8211; once when they were consumed from the &#8220;North America region&#8221;, and then again when they were consumed from the &#8220;Europe region&#8221;. <\/p>\n<h3>Understanding what happened<\/h3>\n<p>Here is a simplified illustration of what the logs show. 
To keep the diagrams simple, I&#8217;ve reduced the number of events.<\/p>\n<p>Imagine that this was the state at the point where the disaster happens and the &#8220;North America region&#8221; is lost:<\/p>\n<p><img decoding=\"async\" style=\"border: thin black solid; width: 100%; max-width: 600px;\" src=\"https:\/\/github.com\/dalelane\/eventstreams-mirrormaker2-demos\/blob\/master\/docs\/activepassive-before.png?raw=true\"\/><\/p>\n<p>The producer has sent events about these orders to the <code>LH.ORDERS<\/code> topic:<br \/>\n1, 2, 3, 4, 5, 6, 7, 8, 9, 10<\/p>\n<p>There are two orders that the consumer hasn&#8217;t yet processed. It has so far only processed orders:<br \/>\n1, 2, 3, 4, 5, 6, 7, 8<\/p>\n<p>It has committed an offset so the Kafka cluster in the &#8220;North America region&#8221; has a record that it has processed these orders.<\/p>\n<p>Mirror Maker is mirroring the order events to the &#8220;Europe region&#8221;. It has mirrored the events for orders:<br \/>\n1, 2, 3, 4, 5, 6, 7, 8, 9<\/p>\n<p>Mirror Maker hasn&#8217;t mirrored the event for order 10 yet. It was just about to, before the region was lost.<\/p>\n<p>Mirror Maker last mirrored the consumer application&#8217;s offset when it was up to the offset for order 5. It was waiting for the lag to be higher before it updated this again.<\/p>\n<p>Then the region was lost, and everything failed over to the &#8220;Europe region&#8221;.<\/p>\n<p>Everything started running again, and after a brief time, things looked like this:<\/p>\n<p><img decoding=\"async\" style=\"border: thin black solid; width: 100%; max-width: 600px;\" src=\"https:\/\/github.com\/dalelane\/eventstreams-mirrormaker2-demos\/blob\/master\/docs\/activepassive-after.png?raw=true\"\/><\/p>\n<p>The producer started sending events about new orders:<br \/>\n11, 12, 13, 14<\/p>\n<p>The consumer started consuming from the last offset that was mirrored before the failover, so it started by processing orders 6, 7 and 8 (again), followed by order 9 (for the first time). 
Then it carried on processing the new orders, 11, 12, 13, 14.<\/p>\n<h4>What this is meant to illustrate<\/h4>\n<p><strong>Impact on the producer:<\/strong> None<br \/>\nIt was able to resume producing events to the topic on the failover cluster.<\/p>\n<p><strong>Impact on the consumer:<\/strong> Some events were re-processed<br \/>\nConsumer offsets are mirrored periodically. Events consumed since the last time the offset was mirrored will be re-processed, because the failover cluster doesn&#8217;t have a record of the application having already processed them. The number of events that are likely to be re-processed is related to the offset lag setting discussed above.<\/p>\n<p><strong>Impact on the topic:<\/strong> An order event was lost.<br \/>\nThe event for order 10 was produced just before the active region was lost, so there wasn&#8217;t time for it to be mirrored.<br \/>\n(<em>I didn&#8217;t see this happen in my logs &#8211; as far as I can see, I got lucky this time and everything was mirrored in time. But I&#8217;m mentioning it as it is possible<\/em>.)<\/p>\n<p>Overall, my aim here is to highlight the limitations of asynchronous, background mirroring of a constant stream of events. There will always be some data around the time of the loss of an active region that will not yet have been mirrored. 
<\/p>\n<p>However, it does show that <strong>mirroring is an effective way to enable business continuity in the event of a disaster, where applications can be designed to tolerate re-processing of events<\/strong>.<\/p>\n<h3>How the demo is configured<\/h3>\n<p>The Mirror Maker config can be found here: <a href=\"https:\/\/github.com\/dalelane\/eventstreams-mirrormaker2-demos\/blob\/master\/05-active-passive\/templates\/mm2.yaml\"><code>mm2.yaml<\/code><\/a>.<\/p>\n<p>The spec is commented, so that is the main file to read if you want to see how to configure Mirror Maker for this kind of scenario.<\/p>\n<h3>More scenarios to come<\/h3>\n<p>I&#8217;ve still got some more ideas of scenarios where Mirror Maker is useful, so more posts will come soon.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>This is the fourth in a series of blog posts sharing examples of ways to use Mirror Maker 2 with IBM Event Streams. Using Mirror Maker 2 to aggregate events from multiple regions Using Mirror Maker 2 to broadcast events to multiple regions Using Mirror Maker 2 to share topics across multiple regions Using Mirror 
[&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":5153,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[7],"tags":[593,583,584],"class_list":["post-5152","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-code","tag-apachekafka","tag-ibmeventstreams","tag-kafka"],"_links":{"self":[{"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=\/wp\/v2\/posts\/5152","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=5152"}],"version-history":[{"count":0,"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=\/wp\/v2\/posts\/5152\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=\/wp\/v2\/media\/5153"}],"wp:attachment":[{"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=5152"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=5152"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=5152"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}