{"id":5806,"date":"2026-01-12T16:28:42","date_gmt":"2026-01-12T16:28:42","guid":{"rendered":"https:\/\/dalelane.co.uk\/blog\/?p=5806"},"modified":"2026-03-16T23:14:45","modified_gmt":"2026-03-16T23:14:45","slug":"flink-sql-examples-with-click-tracking-events","status":"publish","type":"post","link":"https:\/\/dalelane.co.uk\/blog\/?p=5806","title":{"rendered":"Flink SQL examples with click tracking events"},"content":{"rendered":"<p><strong>In this post, I introduce a few core <a href=\"https:\/\/nightlies.apache.org\/flink\/flink-docs-master\/docs\/dev\/table\/functions\/systemfunctions\/\">Flink SQL functions<\/a> using worked examples of processing a stream of <a href=\"https:\/\/en.wikipedia.org\/wiki\/Click_tracking\">click tracking<\/a> events from a retail website.<\/strong><\/p>\n<p>I find that a practical, real-world (ish) example can help to explain how to use Flink SQL in a way that abstract descriptions, such as <a href=\"https:\/\/dalelane.co.uk\/blog\/?p=4998\">processing coloured blocks<\/a>, sometimes don&#8217;t quite achieve.<\/p>\n<p>I&#8217;ll use this post to give examples of my most-used Flink SQL functions, in the context of a retail scenario: a stream of events from customers on the website for a clothing retailer.<\/p>\n<p><a href=\"https:\/\/images.dalelane.co.uk\/2026-01-12-clicktracking\/full\/01-ep-overview.png\"><img decoding=\"async\" src=\"https:\/\/images.dalelane.co.uk\/2026-01-12-clicktracking\/01-ep-overview.png?raw=true\" style=\"border: thin black solid; width: 100%; max-width: 550px;\"\/><\/a><\/p>\n<p><em>Note: I used <a href=\"https:\/\/www.ibm.com\/products\/event-automation\/event-processing\">Event Processing<\/a> to create the flows, as the assistants in the canvas helped me create examples quickly. Everything I&#8217;ve created is standard Apache Flink SQL, so you don&#8217;t need to have Event Processing to try these examples. 
<\/em><\/p>\n<ul>\n<li>The examples:\n<ul>\n<li><a href=\"https:\/\/dalelane.co.uk\/blog\/?p=5806#consuming\"><sup style=\"font-size: 0.6em;\">0<\/sup> Consuming Avro<\/a> &#8211; bring click tracking events into Flink<\/li>\n<li><a href=\"https:\/\/dalelane.co.uk\/blog\/?p=5806#transform\"><sup style=\"font-size: 0.6em;\">1<\/sup> Transforming<\/a> &#8211; deriving new properties<\/li>\n<li><a href=\"https:\/\/dalelane.co.uk\/blog\/?p=5806#join\"><sup style=\"font-size: 0.6em;\">2<\/sup> Joining<\/a> &#8211; correlating with related event streams<\/li>\n<li><a href=\"https:\/\/dalelane.co.uk\/blog\/?p=5806#tumble\"><sup style=\"font-size: 0.6em;\">3<\/sup> Aggregating (tumble)<\/a> &#8211; counting in a tumble window<\/li>\n<li><a href=\"https:\/\/dalelane.co.uk\/blog\/?p=5806#session\"><sup style=\"font-size: 0.6em;\">4<\/sup> Aggregating (session)<\/a> &#8211; counting in a session window<\/li>\n<li><a href=\"https:\/\/dalelane.co.uk\/blog\/?p=5806#sessionfilter\"><sup style=\"font-size: 0.6em;\">5<\/sup> Aggregating (session)<\/a> &#8211; collecting in a session window<\/li>\n<\/ul>\n<\/li>\n<li><a href=\"https:\/\/dalelane.co.uk\/blog\/?p=5806#data\">Data<\/a> &#8211; the events I&#8217;m processing in these examples<\/li>\n<li><a href=\"https:\/\/dalelane.co.uk\/blog\/?p=5806#setup\">Setup<\/a> &#8211; how to recreate this if you want to try this for yourself<\/li>\n<\/ul>\n<p><!--more--><\/p>\n<h3 style=\"color: white; background-color: #000099; padding: 5px; margin-top: 25px;\"><a style=\"color: white;\" name=\"consuming\"><sup style=\"font-size: 0.6em;\">0<\/sup> Consuming click tracking events<\/a><\/h3>\n<p><strong>Demonstrating how to consume Avro events and connect to a schema registry to fetch schemas as needed.<\/strong><\/p>\n<p><a href=\"https:\/\/images.dalelane.co.uk\/2026-01-12-clicktracking\/full\/0-setup\/0-nodes.png\"><img decoding=\"async\" src=\"https:\/\/images.dalelane.co.uk\/2026-01-12-clicktracking\/0-setup\/0-nodes.png?raw=true\" 
style=\"border: thin black solid; width: 100%; max-width: 550px;\"\/><\/a><\/p>\n<p>The <a href=\"https:\/\/github.com\/dalelane\/flink-sql-clicktracking-demo\/blob\/main\/ep-flows\/0%20-%20consuming%20click%20tracking%20events.json\">Event Processing project<\/a> for this example is just a single event source node.<\/p>\n<p><a href=\"https:\/\/images.dalelane.co.uk\/2026-01-12-clicktracking\/full\/0-setup\/1-source-config.png\"><img decoding=\"async\" src=\"https:\/\/images.dalelane.co.uk\/2026-01-12-clicktracking\/0-setup\/1-source-config.png?raw=true\" style=\"border: thin black solid; width: 100%; max-width: 550px;\"\/><\/a><\/p>\n<p>I want Flink to dynamically fetch schemas on-demand for the events that it consumes, so I provided connection details for a Confluent-compatible schema registry API.<\/p>\n<p><a href=\"https:\/\/images.dalelane.co.uk\/2026-01-12-clicktracking\/full\/0-setup\/2-source-config.png\"><img decoding=\"async\" src=\"https:\/\/images.dalelane.co.uk\/2026-01-12-clicktracking\/0-setup\/2-source-config.png?raw=true\" style=\"border: thin black solid; width: 100%; max-width: 550px;\"\/><\/a><\/p>\n<p>I still need to define the table for Flink, which Event Processing generated by fetching the current version of the schema for the topic and converting the Avro schema (<a href=\"https:\/\/dalelane.co.uk\/blog\/?p=5806#data\">see below<\/a>) into the equivalent SQL table definition.<\/p>\n<p>I&#8217;ve added a few comments to the <a href=\"https:\/\/github.com\/dalelane\/flink-sql-clicktracking-demo\/blob\/main\/sql\/0-setup.sql\">SQL generated for this example<\/a> for readability.<\/p>\n<p><script src=\"https:\/\/emgithub.com\/embed-v2.js?target=https%3A%2F%2Fgithub.com%2Fdalelane%2Fflink-sql-clicktracking-demo%2Fblob%2Fmain%2Fsql%2F0-setup.sql&#038;style=default&#038;type=code&#038;showBorder=on&#038;showLineNumbers=on&#038;showFileMeta=on&#038;showCopy=on&#038;maxHeight=450\"><\/script><br \/>\n<small><a 
href=\"https:\/\/github.com\/dalelane\/flink-sql-clicktracking-demo\/blob\/main\/sql\/0-setup.sql\">0-setup.sql<\/a><\/small><\/p>\n<p>(<em>If you use this, you&#8217;ll need to <a href=\"https:\/\/github.com\/dalelane\/flink-sql-clicktracking-demo\/blob\/main\/sql\/0-setup.sql#L55\">replace the password<\/a>, but otherwise that SQL will work as-is if you&#8217;ve used <a href=\"https:\/\/github.com\/dalelane\/flink-sql-clicktracking-demo\/blob\/main\/SETUP.md\">the same setup as me<\/a>.<\/em>)<\/p>\n<p><a href=\"https:\/\/images.dalelane.co.uk\/2026-01-12-clicktracking\/full\/0-setup\/3-running.png\"><img decoding=\"async\" src=\"https:\/\/images.dalelane.co.uk\/2026-01-12-clicktracking\/0-setup\/3-running.png?raw=true\" style=\"border: thin black solid; width: 100%; max-width: 550px;\"\/><\/a><\/p>\n<p>Running this SQL isn&#8217;t super exciting, but this is getting us started.<\/p>\n<h3 style=\"color: white; background-color: #000099; padding: 5px; margin-top: 25px;\"><a style=\"color: white;\" name=\"transform\"><sup style=\"font-size: 0.6em;\">1<\/sup> Identifying marketing campaign effectiveness<\/a><\/h3>\n<p><strong>Demonstrating how to transform events by deriving marketing campaign properties from query parameters in click event URLs.<\/strong><\/p>\n<p><a href=\"https:\/\/images.dalelane.co.uk\/2026-01-12-clicktracking\/full\/1-transform\/0-nodes.png\"><img decoding=\"async\" src=\"https:\/\/images.dalelane.co.uk\/2026-01-12-clicktracking\/1-transform\/0-nodes.png?raw=true\" style=\"border: thin black solid; width: 100%; max-width: 550px;\"\/><\/a><\/p>\n<p>The <a href=\"https:\/\/github.com\/dalelane\/flink-sql-clicktracking-demo\/blob\/main\/ep-flows\/1%20-%20identifying%20campaign%20effectiveness.json\">Event Processing project<\/a> for this builds on the <a href=\"https:\/\/dalelane.co.uk\/blog\/?p=5806#consuming\">previous example<\/a> by adding a Transform node.<\/p>\n<p>Click events <a href=\"https:\/\/dalelane.co.uk\/blog\/?p=5806#data\">contain 
the URL that was the origin of the click event<\/a>. This example demonstrates how to use some of <a href=\"https:\/\/nightlies.apache.org\/flink\/flink-docs-master\/docs\/dev\/table\/functions\/systemfunctions\/\">Flink&#8217;s built-in functions<\/a> by chopping up that URL to extract <a href=\"https:\/\/en.wikipedia.org\/wiki\/UTM_parameters\">marketing campaign properties<\/a>.<\/p>\n<p><script src=\"https:\/\/emgithub.com\/embed-v2.js?target=https%3A%2F%2Fgithub.com%2Fdalelane%2Fflink-sql-clicktracking-demo%2Fblob%2Fmain%2Fsql%2F1-transform.sql%23L66-L99&#038;style=default&#038;type=code&#038;showBorder=on&#038;showLineNumbers=on&#038;showFileMeta=on&#038;showCopy=on&#038;maxHeight=500\"><\/script><small><a href=\"https:\/\/github.com\/dalelane\/flink-sql-clicktracking-demo\/blob\/main\/sql\/1-transform.sql\">1-transform.sql<\/a><\/small><\/p>\n<p>The interesting bits are <a href=\"https:\/\/github.com\/dalelane\/flink-sql-clicktracking-demo\/blob\/main\/sql\/1-transform.sql#L91-L96\">lines 91-96<\/a> which are a nice example of how Flink has <a href=\"https:\/\/nightlies.apache.org\/flink\/flink-docs-master\/docs\/dev\/table\/functions\/systemfunctions\/\">functions to solve a wide variety of data parsing<\/a> and processing use cases.<\/p>\n<p><a href=\"https:\/\/images.dalelane.co.uk\/2026-01-12-clicktracking\/full\/1-transform\/1-running.png\"><img decoding=\"async\" src=\"https:\/\/images.dalelane.co.uk\/2026-01-12-clicktracking\/1-transform\/1-running.png?raw=true\" style=\"border: thin black solid; width: 100%; max-width: 550px;\"\/><\/a><\/p>\n<p>The screenshot above from running the flow in Event Processing shows a comparison between the raw URL from the click tracking event with the different <code style=\"background-color: #FFFFC0; color: #770000; padding: 4px; font-weight: 600;\">utm_<\/code> parameters that it has extracted.<\/p>\n<p>(<em>Many of them are null because not all URLs include marketing campaign properties.<\/em>)<\/p>\n<p><a 
href=\"https:\/\/images.dalelane.co.uk\/2026-01-12-clicktracking\/full\/1-transform\/2-running-filtered.png\"><img decoding=\"async\" src=\"https:\/\/images.dalelane.co.uk\/2026-01-12-clicktracking\/1-transform\/2-running-filtered.png?raw=true\" style=\"border: thin black solid; width: 100%; max-width: 550px;\"\/><\/a><\/p>\n<p>Adding an additional filter for <code style=\"background-color: #FFFFC0; color: #770000; padding: 4px; font-weight: 600;\">digital_marketing IS TRUE<\/code> lets me show only the click tracking events that contained a URL with marketing campaign properties.<\/p>\n<h3 style=\"color: white; background-color: #000099; padding: 5px; margin-top: 25px;\"><a style=\"color: white;\" name=\"join\"><sup style=\"font-size: 0.6em;\">2<\/sup> Click tracking activity by new customers<\/a><\/h3>\n<p><strong>Demonstrating an interval join to correlate between related streams of events, by identifying click tracking events that occur within a short time of a new customer registration event for the same user.<\/strong><\/p>\n<p><a href=\"https:\/\/images.dalelane.co.uk\/2026-01-12-clicktracking\/full\/2-interval-join\/0-nodes.png\"><img decoding=\"async\" src=\"https:\/\/images.dalelane.co.uk\/2026-01-12-clicktracking\/2-interval-join\/0-nodes.png?raw=true\" style=\"border: thin black solid; width: 100%; max-width: 550px;\"\/><\/a><\/p>\n<p>The <a href=\"https:\/\/github.com\/dalelane\/flink-sql-clicktracking-demo\/blob\/main\/ep-flows\/2%20-%20click%20activity%20by%20new%20customers.json\">Event Processing project<\/a> contains two event source nodes (one for a click tracking topic, the second for a customer registrations topic) and an Interval Join node to correlate events between them.<\/p>\n<p><a href=\"https:\/\/images.dalelane.co.uk\/2026-01-12-clicktracking\/full\/2-interval-join\/1-join-config.png\"><img decoding=\"async\" src=\"https:\/\/images.dalelane.co.uk\/2026-01-12-clicktracking\/2-interval-join\/1-join-config.png?raw=true\" style=\"border: 
thin black solid; width: 100%; max-width: 550px;\"\/><\/a><\/p>\n<p>The assistant was helpful here to configure the interval join. I needed to decide on the time window to use (how soon after a new customer registers I&#8217;m interested in seeing their click tracking events).<\/p>\n<p><a href=\"https:\/\/images.dalelane.co.uk\/2026-01-12-clicktracking\/full\/2-interval-join\/2-join-config.png\"><img decoding=\"async\" src=\"https:\/\/images.dalelane.co.uk\/2026-01-12-clicktracking\/2-interval-join\/2-join-config.png?raw=true\" style=\"border: thin black solid; width: 100%; max-width: 550px;\"\/><\/a><\/p>\n<p>The assistant also helped me choose an appropriate join type. In this case, I wanted an inner join &#8211; I want to see only click tracking events where a customer registration event was observed within the time window.<\/p>\n<p><a href=\"https:\/\/images.dalelane.co.uk\/2026-01-12-clicktracking\/full\/2-interval-join\/3-output-config.png\"><img decoding=\"async\" src=\"https:\/\/images.dalelane.co.uk\/2026-01-12-clicktracking\/2-interval-join\/3-output-config.png?raw=true\" style=\"border: thin black solid; width: 100%; max-width: 550px;\"\/><\/a><\/p>\n<p>Finally, the assistant helped me define the format for the output I wanted &#8211; choosing (and renaming) the properties from the two different input streams to keep.<\/p>\n<p>As before, I&#8217;ve added comments to the generated SQL to make it more readable.<\/p>\n<p><script src=\"https:\/\/emgithub.com\/embed-v2.js?target=https%3A%2F%2Fgithub.com%2Fdalelane%2Fflink-sql-clicktracking-demo%2Fblob%2Fmain%2Fsql%2F2-interval-join.sql%23L104-L134&#038;style=default&#038;type=code&#038;showBorder=on&#038;showLineNumbers=on&#038;showFileMeta=on&#038;showCopy=on&#038;maxHeight=500\"><\/script><small><a href=\"https:\/\/github.com\/dalelane\/flink-sql-clicktracking-demo\/blob\/main\/sql\/2-interval-join.sql\">2-interval-join.sql<\/a><\/small><\/p>\n<p>The interesting bits are <a 
href=\"https:\/\/github.com\/dalelane\/flink-sql-clicktracking-demo\/blob\/main\/sql\/2-interval-join.sql#L122-L134\">lines 122-134<\/a> which are an example of how to use an interval join to correlate between two streams of events.<\/p>\n<p><a href=\"https:\/\/images.dalelane.co.uk\/2026-01-12-clicktracking\/full\/2-interval-join\/4-running.png\"><img decoding=\"async\" src=\"https:\/\/images.dalelane.co.uk\/2026-01-12-clicktracking\/2-interval-join\/4-running.png?raw=true\" style=\"border: thin black solid; width: 100%; max-width: 550px;\"\/><\/a><\/p>\n<p>The screenshot above from running the flow in Event Processing shows just click tracking events for customers within the first hour after they create their new account.<\/p>\n<h3 style=\"color: white; background-color: #000099; padding: 5px; margin-top: 25px;\"><a style=\"color: white;\" name=\"tumble\"><sup style=\"font-size: 0.6em;\">3<\/sup> Browser and device usage<\/a><\/h3>\n<p><strong>Demonstrating how to aggregate events within a tumble window to determine the types of device (desktop, tablet, mobile) used each hour.<\/strong><\/p>\n<p><strong>Demonstrating how to use a Top-N query to identify the most-used web browser in each hour.<\/strong><\/p>\n<p><a href=\"https:\/\/images.dalelane.co.uk\/2026-01-12-clicktracking\/full\/3-aggregate-tumble\/0-nodes.png\"><img decoding=\"async\" src=\"https:\/\/images.dalelane.co.uk\/2026-01-12-clicktracking\/3-aggregate-tumble\/0-nodes.png?raw=true\" style=\"border: thin black solid; width: 100%; max-width: 550px;\"\/><\/a><\/p>\n<p>The <a href=\"https:\/\/github.com\/dalelane\/flink-sql-clicktracking-demo\/blob\/main\/ep-flows\/3%20-%20browser%20and%20device%20usage.json\">Event Processing project<\/a> for this example contains two different tumble window aggregate nodes &#8211; one to count device types per hour, the second to count browser names per hour. 
The second aggregate also has a Top-N node to identify the most used browser name in each hour.<\/p>\n<p>All <a href=\"https:\/\/dalelane.co.uk\/blog\/?p=5806#data\">click tracking events contain a<\/a> <code style=\"background-color: #FFFFC0; color: #770000; padding: 4px; font-weight: 600;\">sessionid<\/code>. This can be used to correlate the multiple different events from a single user as being part of the same overall user session.<\/p>\n<p><a href=\"https:\/\/images.dalelane.co.uk\/2026-01-12-clicktracking\/full\/3-aggregate-tumble\/1-aggregate-config.png\"><img decoding=\"async\" src=\"https:\/\/images.dalelane.co.uk\/2026-01-12-clicktracking\/3-aggregate-tumble\/1-aggregate-config.png?raw=true\" style=\"border: thin black solid; width: 100%; max-width: 550px;\"\/><\/a><\/p>\n<p>The first aggregate is a tumble window to count the number of different session ids that occur in each hour, grouped by the type of device (i.e. desktop, mobile, tablet).<\/p>\n<p><a href=\"https:\/\/images.dalelane.co.uk\/2026-01-12-clicktracking\/full\/3-aggregate-tumble\/4-running.png\"><img decoding=\"async\" src=\"https:\/\/images.dalelane.co.uk\/2026-01-12-clicktracking\/3-aggregate-tumble\/4-running.png?raw=true\" style=\"border: thin black solid; width: 100%; max-width: 550px;\"\/><\/a><\/p>\n<p>When the flow is running, it will output three events at the end of each hour &#8211; one for each device type (desktop, mobile, tablet) with a count of how many unique sessions have been observed with that device type during the hour.<\/p>\n<p><script src=\"https:\/\/emgithub.com\/embed-v2.js?target=https%3A%2F%2Fgithub.com%2Fdalelane%2Fflink-sql-clicktracking-demo%2Fblob%2Fmain%2Fsql%2F3-aggregate-tumble.sql%23L65-L93&#038;style=default&#038;type=code&#038;showBorder=on&#038;showLineNumbers=on&#038;showFileMeta=on&#038;showCopy=on&#038;maxHeight=500\"><\/script><small><a 
href=\"https:\/\/github.com\/dalelane\/flink-sql-clicktracking-demo\/blob\/main\/sql\/3-aggregate-tumble.sql\">3-aggregate-tumble.sql<\/a><\/small><\/p>\n<p>The first aggregate (number of sessions for each device type) is in <a href=\"https:\/\/github.com\/dalelane\/flink-sql-clicktracking-demo\/blob\/main\/sql\/3-aggregate-tumble.sql#L65-L92\">lines 71-92<\/a>. I&#8217;ve added some comments to explain what it is doing.<\/p>\n<p><a href=\"https:\/\/images.dalelane.co.uk\/2026-01-12-clicktracking\/full\/3-aggregate-tumble\/2-aggregate-config.png\"><img decoding=\"async\" src=\"https:\/\/images.dalelane.co.uk\/2026-01-12-clicktracking\/3-aggregate-tumble\/2-aggregate-config.png?raw=true\" style=\"border: thin black solid; width: 100%; max-width: 550px;\"\/><\/a><\/p>\n<p>The second aggregate is another tumble window to count the number of different session ids that occur in each hour, grouped by the browser name (e.g. Chrome, Firefox, Safari).<\/p>\n<p><a href=\"https:\/\/images.dalelane.co.uk\/2026-01-12-clicktracking\/full\/3-aggregate-tumble\/3-topn-config.png\"><img decoding=\"async\" src=\"https:\/\/images.dalelane.co.uk\/2026-01-12-clicktracking\/3-aggregate-tumble\/3-topn-config.png?raw=true\" style=\"border: thin black solid; width: 100%; max-width: 550px;\"\/><\/a><\/p>\n<p>The Top-N query is added to the second aggregate, and configured so that the three browser names with the highest number of sessions are emitted at the end of each hour window.<\/p>\n<p><a href=\"https:\/\/images.dalelane.co.uk\/2026-01-12-clicktracking\/full\/3-aggregate-tumble\/5-running.png\"><img decoding=\"async\" src=\"https:\/\/images.dalelane.co.uk\/2026-01-12-clicktracking\/3-aggregate-tumble\/5-running.png?raw=true\" style=\"border: thin black solid; width: 100%; max-width: 550px;\"\/><\/a><\/p>\n<p>When the flow is running, it will output three events at the end of each hour &#8211; one for each of the three most-used browsers, with a count of how many unique sessions 
have been observed with that browser during the hour.<\/p>\n<p><script src=\"https:\/\/emgithub.com\/embed-v2.js?target=https%3A%2F%2Fgithub.com%2Fdalelane%2Fflink-sql-clicktracking-demo%2Fblob%2Fmain%2Fsql%2F3-aggregate-tumble.sql%23L96-L159&#038;style=default&#038;type=code&#038;showBorder=on&#038;showLineNumbers=on&#038;showFileMeta=on&#038;showCopy=on&#038;maxHeight=500\"><\/script><small><a href=\"https:\/\/github.com\/dalelane\/flink-sql-clicktracking-demo\/blob\/main\/sql\/3-aggregate-tumble.sql\">3-aggregate-tumble.sql<\/a><\/small><\/p>\n<p>The second aggregate (most used browser for each hour) is in <a href=\"https:\/\/github.com\/dalelane\/flink-sql-clicktracking-demo\/blob\/main\/sql\/3-aggregate-tumble.sql#L96-L158\">lines 102-158<\/a>. I&#8217;ve added some comments to explain what it is doing.<\/p>\n<h3 style=\"color: white; background-color: #000099; padding: 5px; margin-top: 25px;\"><a style=\"color: white;\" name=\"session\"><sup style=\"font-size: 0.6em;\">4<\/sup> User session duration<\/a><\/h3>\n<p><strong>Demonstrating how to aggregate events using a session window to identify click tracking events from the same user as part of the same user session.<\/strong><\/p>\n<p><strong>Demonstrating how to aggregate those user sessions using a tumble window to identify the attributes of an average user session.<\/strong><\/p>\n<p><a href=\"https:\/\/images.dalelane.co.uk\/2026-01-12-clicktracking\/full\/4-aggregate-session\/0-nodes.png\"><img decoding=\"async\" src=\"https:\/\/images.dalelane.co.uk\/2026-01-12-clicktracking\/4-aggregate-session\/0-nodes.png?raw=true\" style=\"border: thin black solid; width: 100%; max-width: 550px;\"\/><\/a><\/p>\n<p>The <a href=\"https:\/\/github.com\/dalelane\/flink-sql-clicktracking-demo\/blob\/main\/ep-flows\/4%20-%20tracking%20session%20duration.json\">Event Processing project<\/a> for this example uses two aggregate nodes.<\/p>\n<p>The first aggregate node collects individual click events into complete user 
sessions &#8211; all of the clicks that a user performed as part of a single active session.<\/p>\n<p>The SQL for this first aggregate is at <a href=\"https:\/\/github.com\/dalelane\/flink-sql-clicktracking-demo\/blob\/main\/sql\/4-aggregate-session.sql#L66-L100\">4-aggregate-session.sql<\/a> (lines 72-100) with comments added to explain what it is doing.<\/p>\n<p><script src=\"https:\/\/emgithub.com\/embed-v2.js?target=https%3A%2F%2Fgithub.com%2Fdalelane%2Fflink-sql-clicktracking-demo%2Fblob%2Fmain%2Fsql%2F4-aggregate-session.sql%23L66-L100&#038;style=default&#038;type=code&#038;showBorder=on&#038;showLineNumbers=on&#038;showFileMeta=on&#038;showCopy=on&#038;maxHeight=500\"><\/script><\/p>\n<p>I&#8217;m calculating the duration of each user session by using a <code style=\"background-color: #FFFFC0; color: #770000; padding: 4px; font-weight: 600;\">TIMESTAMPDIFF<\/code> function to compute the difference between the timestamps for the first and last click tracking event observed with each sessionid.<\/p>\n<p><a href=\"https:\/\/images.dalelane.co.uk\/2026-01-12-clicktracking\/full\/4-aggregate-session\/2-running.png\"><img decoding=\"async\" src=\"https:\/\/images.dalelane.co.uk\/2026-01-12-clicktracking\/4-aggregate-session\/2-running.png?raw=true\" style=\"border: thin black solid; width: 100%; max-width: 550px;\"\/><\/a><\/p>\n<p>The result from this is that every time a session discontinues (which I&#8217;ve defined as &#8220;no events with that session id being received for 15 minutes&#8221;) an event is emitted with the duration of that session.<\/p>\n<p><a href=\"https:\/\/images.dalelane.co.uk\/2026-01-12-clicktracking\/full\/4-aggregate-session\/1-aggregate-config.png\"><img decoding=\"async\" src=\"https:\/\/images.dalelane.co.uk\/2026-01-12-clicktracking\/4-aggregate-session\/1-aggregate-config.png?raw=true\" style=\"border: thin black solid; width: 100%; max-width: 550px;\"\/><\/a><\/p>\n<p>The second aggregate node takes those individual session 
durations, and counts the number of sessions, identifies the longest and shortest session that was recorded during each hour, and calculates the average session duration.<\/p>\n<p>The SQL for this second aggregate is at <a href=\"https:\/\/github.com\/dalelane\/flink-sql-clicktracking-demo\/blob\/main\/sql\/4-aggregate-session.sql#L103-L133\">4-aggregate-session.sql<\/a> (lines 109-133) with comments added to explain what it is doing.<\/p>\n<p><script src=\"https:\/\/emgithub.com\/embed-v2.js?target=https%3A%2F%2Fgithub.com%2Fdalelane%2Fflink-sql-clicktracking-demo%2Fblob%2Fmain%2Fsql%2F4-aggregate-session.sql%23L103-L133&#038;style=default&#038;type=code&#038;showBorder=on&#038;showLineNumbers=on&#038;showFileMeta=on&#038;showCopy=on&#038;maxHeight=500\"><\/script><\/p>\n<p><a href=\"https:\/\/images.dalelane.co.uk\/2026-01-12-clicktracking\/full\/4-aggregate-session\/3-running.png\"><img decoding=\"async\" src=\"https:\/\/images.dalelane.co.uk\/2026-01-12-clicktracking\/4-aggregate-session\/3-running.png?raw=true\" style=\"border: thin black solid; width: 100%; max-width: 550px;\"\/><\/a><\/p>\n<p>The result from this is that at the end of every hour, an event is emitted with the number of user sessions observed in the previous hour of click tracking events, the shortest and longest user session, and the average user session duration.<\/p>\n<h3 style=\"color: white; background-color: #000099; padding: 5px; margin-top: 25px;\"><a style=\"color: white;\" name=\"sessionfilter\"><sup style=\"font-size: 0.6em;\">5<\/sup> Abandoned baskets<\/a><\/h3>\n<p><strong>Demonstrating how to filter sessions that match certain criteria to identify user sessions that included adding products to a shopping cart but that did not result in a purchase.<\/strong><\/p>\n<p><a href=\"https:\/\/images.dalelane.co.uk\/2026-01-12-clicktracking\/full\/5-session-incomplete\/0-nodes.png\"><img decoding=\"async\" 
src=\"https:\/\/images.dalelane.co.uk\/2026-01-12-clicktracking\/5-session-incomplete\/0-nodes.png?raw=true\" style=\"border: thin black solid; width: 100%; max-width: 550px;\"\/><\/a><\/p>\n<p>The <a href=\"https:\/\/github.com\/dalelane\/flink-sql-clicktracking-demo\/blob\/main\/ep-flows\/5%20-%20responding%20to%20abandoned%20baskets.json\">Event Processing project<\/a> for this example collects click tracking events with the same <code style=\"background-color: #FFFFC0; color: #770000; padding: 4px; font-weight: 600;\">sessionid<\/code> into user sessions, and identifies attributes of that session &#8211; such as the number of products added to the shopping cart during the session, and whether or not checkout events were included within the session. It then adds a filter node so that only sessions with an abandoned basket are kept.<\/p>\n<p>The SQL to define the session in this example is at <a href=\"https:\/\/github.com\/dalelane\/flink-sql-clicktracking-demo\/blob\/main\/sql\/5-aggregate-session.sql#L66-L125\">5-aggregate-session.sql<\/a> in lines 72-125.<\/p>\n<p><script src=\"https:\/\/emgithub.com\/embed-v2.js?target=https%3A%2F%2Fgithub.com%2Fdalelane%2Fflink-sql-clicktracking-demo%2Fblob%2Fmain%2Fsql%2F5-aggregate-session.sql%23L66-L125&#038;style=default&#038;type=code&#038;showBorder=on&#038;showLineNumbers=on&#038;showFileMeta=on&#038;showCopy=on&#038;maxHeight=500\"><\/script><small><a href=\"https:\/\/github.com\/dalelane\/flink-sql-clicktracking-demo\/blob\/main\/sql\/5-aggregate-session.sql#L66-L125\">5-aggregate-session.sql<\/a><\/small><\/p>\n<p>The session window let me collect together all individual click tracking events that have the same sessionid value &#8211; that are all part of the same overall user session. 
This aggregate SQL demonstrates a few of <a href=\"https:\/\/nightlies.apache.org\/flink\/flink-docs-master\/docs\/dev\/table\/functions\/systemfunctions\/\">the functions that can be applied to the events in each session<\/a>:<\/p>\n<ul>\n<li><code style=\"background-color: #FFFFC0; color: #770000; padding: 4px; font-weight: 600;\">TIMESTAMPDIFF<\/code> &#8211; to compute the duration of the session<\/li>\n<li><code style=\"background-color: #FFFFC0; color: #770000; padding: 4px; font-weight: 600;\">ARRAY_AGG<\/code> &#8211; to collect a list of the products added to the shopping basket across multiple separate events within the session<\/li>\n<li><code style=\"background-color: #FFFFC0; color: #770000; padding: 4px; font-weight: 600;\">ARRAY_EXCEPT<\/code> &#8211; to remove products that were removed from the shopping basket from that collected list<\/li>\n<li><code style=\"background-color: #FFFFC0; color: #770000; padding: 4px; font-weight: 600;\">MAX<\/code> and <code style=\"background-color: #FFFFC0; color: #770000; padding: 4px; font-weight: 600;\">CASE<\/code> &#8211; to identify whether a checkout-complete event was contained in the session<\/li>\n<\/ul>\n<p>The result from this is that every time a session discontinues (which I&#8217;ve defined as no events with that session id being received for 15 minutes) an event is emitted with details from that user session, such as how long the session lasted, the customer id, the contents of their shopping basket at the end of the session, and whether or not they made a purchase.<\/p>\n<p><a href=\"https:\/\/images.dalelane.co.uk\/2026-01-12-clicktracking\/full\/5-session-incomplete\/1-filter-config.png\"><img decoding=\"async\" src=\"https:\/\/images.dalelane.co.uk\/2026-01-12-clicktracking\/5-session-incomplete\/1-filter-config.png?raw=true\" style=\"border: thin black solid; width: 100%; max-width: 550px;\"\/><\/a><\/p>\n<p>These session properties can be used in a filter to just keep events from the end of a 
session that resulted in an abandoned basket:<\/p>\n<ul>\n<li>duration of at least 60 seconds &#8211; to ignore users who clicked away very quickly<\/li>\n<li>at least one product in their shopping basket<\/li>\n<li>no checkout complete event<\/li>\n<li>logged in user<\/li>\n<\/ul>\n<p>The SQL to do this is at <a href=\"https:\/\/github.com\/dalelane\/flink-sql-clicktracking-demo\/blob\/main\/sql\/5-aggregate-session.sql#L128-L161\">5-aggregate-session.sql<\/a> in lines 143-161.<\/p>\n<p><script src=\"https:\/\/emgithub.com\/embed-v2.js?target=https%3A%2F%2Fgithub.com%2Fdalelane%2Fflink-sql-clicktracking-demo%2Fblob%2Fmain%2Fsql%2F5-aggregate-session.sql%23L128-L162&#038;style=default&#038;type=code&#038;showBorder=on&#038;showLineNumbers=on&#038;showFileMeta=on&#038;showCopy=on&#038;maxHeight=500\"><\/script><small><a href=\"https:\/\/github.com\/dalelane\/flink-sql-clicktracking-demo\/blob\/main\/sql\/5-aggregate-session.sql#L128-L161\">5-aggregate-session.sql<\/a><\/small><\/p>\n<p><a href=\"https:\/\/images.dalelane.co.uk\/2026-01-12-clicktracking\/full\/5-session-incomplete\/2-running.png\"><img decoding=\"async\" src=\"https:\/\/images.dalelane.co.uk\/2026-01-12-clicktracking\/5-session-incomplete\/2-running.png?raw=true\" style=\"border: thin black solid; width: 100%; max-width: 550px;\"\/><\/a><\/p>\n<p>The result from this is that when a logged in user abandons their session (which I&#8217;ve defined as no click events observed for 15 minutes), an event is emitted with their customer id and the contents of their shopping basket.<\/p>\n<p>This could be used to trigger some automated promotional activity for that customer.<\/p>\n<h3 style=\"color: white; background-color: #000099; padding: 5px; margin-top: 25px;\"><a style=\"color: white;\" name=\"matchrecognize\"><sup style=\"font-size: 0.6em;\">6<\/sup> Buying behaviour<\/a><\/h3>\n<p><strong>Demonstrating how to recognize sequences of events that match a defined pattern to identify different shopper 
behaviours in a stream of click tracking events.<\/strong><\/p>\n<p><a href=\"https:\/\/images.dalelane.co.uk\/2026-01-12-clicktracking\/full\/6-match-recognize\/0-nodes.png\"><img decoding=\"async\" src=\"https:\/\/images.dalelane.co.uk\/2026-01-12-clicktracking\/6-match-recognize\/0-nodes.png?raw=true\" style=\"border: thin black solid; width: 100%; max-width: 550px;\"\/><\/a><\/p>\n<p>The <a href=\"https:\/\/github.com\/dalelane\/flink-sql-clicktracking-demo\/blob\/main\/ep-flows\/6%20-%20understanding%20buying%20behaviour.json\">Event Processing project<\/a> for this example contains two different pattern detection nodes:<\/p>\n<ul>\n<li>looking for a sequence of click tracking events that indicates a customer making a purchase following <strong>searching<\/strong> for specific items<\/li>\n<li>looking for a sequence of click tracking events that indicates a customer making a purchase following <strong>browsing<\/strong> lists of products within a category<\/li>\n<\/ul>\n<p>These are then combined into a single stream of purchase events, enriched with a label that describes the type of behaviour suggested by the click tracking events that led to the purchase.<\/p>\n<p>The SQL to recognise searching behaviour is at <a href=\"https:\/\/github.com\/dalelane\/flink-sql-clicktracking-demo\/blob\/main\/sql\/6-match-recognize.sql#L66-L129\">6-match-recognize.sql<\/a> in lines 96-128.<\/p>\n<p>The simplified sequence I&#8217;ve defined is a search, followed by adding something to the shopping basket, followed by completing a purchase.<\/p>\n<p><script src=\"https:\/\/emgithub.com\/embed-v2.js?target=https%3A%2F%2Fgithub.com%2Fdalelane%2Fflink-sql-clicktracking-demo%2Fblob%2Fmain%2Fsql%2F6-match-recognize.sql%23L66-L129&#038;style=default&#038;type=code&#038;showBorder=on&#038;showLineNumbers=on&#038;showFileMeta=on&#038;showCopy=on&#038;maxHeight=500\"><\/script><\/p>\n<p>The SQL to recognize browsing behaviour is at <a 
href=\"https:\/\/github.com\/dalelane\/flink-sql-clicktracking-demo\/blob\/main\/sql\/6-match-recognize.sql#L132-L195\">6-match-recognize.sql<\/a> in lines 162-194.<\/p>\n<p>The simplified sequence I&#8217;ve defined is a browse, followed by adding something to the shopping basket, followed by completing a purchase.<\/p>\n<p><script src=\"https:\/\/emgithub.com\/embed-v2.js?target=https%3A%2F%2Fgithub.com%2Fdalelane%2Fflink-sql-clicktracking-demo%2Fblob%2Fmain%2Fsql%2F6-match-recognize.sql%23L132-L195&#038;style=default&#038;type=code&#038;showBorder=on&#038;showLineNumbers=on&#038;showFileMeta=on&#038;showCopy=on&#038;maxHeight=500\"><\/script><\/p>\n<p>More complex and advanced patterns can be implemented &#8211; these are simple examples to illustrate what is possible.<\/p>\n<p><script src=\"https:\/\/emgithub.com\/embed-v2.js?target=https%3A%2F%2Fgithub.com%2Fdalelane%2Fflink-sql-clicktracking-demo%2Fblob%2Fmain%2Fsql%2F6-match-recognize.sql%23L198-L207&#038;style=default&#038;type=code&#038;showBorder=on&#038;showLineNumbers=on&#038;showFileMeta=on&#038;showCopy=on&#038;maxHeight=500\"><\/script><small><a href=\"https:\/\/github.com\/dalelane\/flink-sql-clicktracking-demo\/blob\/main\/sql\/6-match-recognize.sql#L198-L207\">6-match-recognize.sql<\/a><\/small><\/p>\n<p>The results from each of these are then combined together using a <code style=\"background-color: #FFFFC0; color: #770000; padding: 4px; font-weight: 600;\">UNION<\/code> on line 206.<\/p>\n<p><a href=\"https:\/\/images.dalelane.co.uk\/2026-01-12-clicktracking\/full\/6-match-recognize\/1-running.png\"><img decoding=\"async\" src=\"https:\/\/images.dalelane.co.uk\/2026-01-12-clicktracking\/6-match-recognize\/1-running.png?raw=true\" style=\"border: thin black solid; width: 100%; max-width: 550px;\"\/><\/a><\/p>\n<p>The result is a stream of purchase events, enriched with a label that describes the type of behaviour suggested by the click tracking events that led to the purchase.<\/p>\n<h3 
style=\"color: white; background-color: #000099; padding: 5px; margin-top: 25px;\"><a style=\"color: white;\" name=\"data\">The data<\/a><\/h3>\n<p><a href=\"https:\/\/images.dalelane.co.uk\/2026-01-12-clicktracking\/full\/00-es-overview.png\"><img decoding=\"async\" src=\"https:\/\/images.dalelane.co.uk\/2026-01-12-clicktracking\/00-es-overview.png?raw=true\" style=\"border: thin black solid; width: 100%; max-width: 550px;\"\/><\/a><\/p>\n<p>I&#8217;m processing Avro-encoded events in these examples, so looking at the raw data as in this screenshot isn&#8217;t super helpful.<\/p>\n<p>If you&#8217;re familiar with Avro, you can look at <a href=\"https:\/\/github.com\/IBM\/kafka-connect-loosehangerjeans-source\/blob\/main\/doc\/CLICKTRACKING\/schema.avro\">the Avro schema<\/a> for the click tracking events, which includes a description of each of the fields.<\/p>\n<p><script src=\"https:\/\/emgithub.com\/embed-v2.js?target=https%3A%2F%2Fgithub.com%2FIBM%2Fkafka-connect-loosehangerjeans-source%2Fblob%2Fmain%2Fdoc%2FCLICKTRACKING%2Fschema.avro&#038;style=default&#038;type=code&#038;showBorder=on&#038;showLineNumbers=on&#038;showFileMeta=on&#038;showCopy=on&#038;maxHeight=380\"><\/script><small><a href=\"https:\/\/github.com\/IBM\/kafka-connect-loosehangerjeans-source\/blob\/main\/doc\/CLICKTRACKING\/schema.avro\">schema.avro<\/a><\/small><\/p>\n<p>To make it easier to understand, I ran <code style=\"background-color: #FFFFC0; color: #770000; padding: 4px; font-weight: 600;\">kafka-console-consumer<\/code> on the topic <a href=\"https:\/\/github.com\/IBM\/kafka-avro-formatters\/blob\/main\/README.md#combine-with-jq\">using an Avro formatter with jq<\/a> to prepare these more readable JSON representations.<\/p>\n<pre style=\"font-size: 0.9em; white-space: pre !important; overflow-x: scroll; background-color: #FFFFC0; color: #770000; padding: 4px;\">kafka-console-consumer.sh \\\n  --bootstrap-server  
my-kafka-cluster-bootstrap-event-automation.apps.dale-lane.cp.fyre.ibm.com:443 \\\n  --topic             CLICKTRACKING.REG \\\n  --consumer.config   es.properties \\\n  --formatter         com.ibm.eventautomation.kafka.formatters.ApicurioFormatter \\\n  --formatter-config  es-formatter.properties<\/pre>\n<p>This isn&#8217;t an exhaustive list of what sorts of events are possible (look at <a href=\"https:\/\/github.com\/IBM\/kafka-connect-loosehangerjeans-source\/blob\/main\/doc\/CLICKTRACKING\/schema.avro\">the schema<\/a> to understand that) but rather a few illustrative examples to understand the sort of events I&#8217;ve been processing.<\/p>\n<h4>Page views<\/h4>\n<p>Someone has viewed a page on the Loosehanger Jeans website.<\/p>\n<p>The event includes information about what page they viewed, and what device they&#8217;re using to view it. If the user is logged in, user details will be included. If they&#8217;re not logged in, <code style=\"background-color: #FFFFC0; color: #770000; padding: 4px; font-weight: 600;\">customer<\/code> will be null. If they came to the Loosehanger Jeans website from somewhere else, the referrer URL will be included.<\/p>\n<p>Pages can contain a list of products (e.g. products in a category, search results, etc.) or be static content (e.g. company information). 
Pages for an individual product are identified with a different PRODUCT_TYPE event.<\/p>\n<p><script src=\"https:\/\/emgithub.com\/embed-v2.js?target=https%3A%2F%2Fgithub.com%2Fdalelane%2Fflink-sql-clicktracking-demo%2Fblob%2Fmain%2Fsample-events%2F1-page-view.json&#038;style=default&#038;type=code&#038;showBorder=on&#038;showLineNumbers=on&#038;showFileMeta=on&#038;showCopy=on&#038;maxHeight=300\"><\/script><small><a href=\"https:\/\/github.com\/dalelane\/flink-sql-clicktracking-demo\/blob\/main\/sample-events\/1-page-view.json\">sample page view event 1-page-view.json<\/a><\/small><\/p>\n<h4>Search<\/h4>\n<p>Someone has searched for a product on the Loosehanger Jeans website.<\/p>\n<p>If the user is logged in, user details will be included. If they&#8217;re not logged in, <code style=\"background-color: #FFFFC0; color: #770000; padding: 4px; font-weight: 600;\">customer<\/code> will be null. The event includes information about the device that the user is using.<\/p>\n<p><script src=\"https:\/\/emgithub.com\/embed-v2.js?target=https%3A%2F%2Fgithub.com%2Fdalelane%2Fflink-sql-clicktracking-demo%2Fblob%2Fmain%2Fsample-events%2F2-search.json&#038;style=default&#038;type=code&#038;showBorder=on&#038;showLineNumbers=on&#038;showFileMeta=on&#038;showCopy=on&#038;maxHeight=300\"><\/script><small><a href=\"https:\/\/github.com\/dalelane\/flink-sql-clicktracking-demo\/blob\/main\/sample-events\/2-search.json\">sample search event 2-search.json<\/a><\/small><\/p>\n<h4>Add to shopping cart<\/h4>\n<p>Someone has added a product to their shopping basket.<\/p>\n<p>The event includes a description of the product the user has added. Users do not need to be logged in to do this, they can log in at the point they are ready to check out. 
The event includes information about the device that the user is using.<\/p>\n<p><script src=\"https:\/\/emgithub.com\/embed-v2.js?target=https%3A%2F%2Fgithub.com%2Fdalelane%2Fflink-sql-clicktracking-demo%2Fblob%2Fmain%2Fsample-events%2F3-add-to-cart.json&#038;style=default&#038;type=code&#038;showBorder=on&#038;showLineNumbers=on&#038;showFileMeta=on&#038;showCopy=on&#038;maxHeight=300\"><\/script><br \/>\n<small><a href=\"https:\/\/github.com\/dalelane\/flink-sql-clicktracking-demo\/blob\/main\/sample-events\/3-add-to-cart.json\">sample add to cart event 3-add-to-cart.json<\/a><\/small><\/p>\n<h4>Remove from shopping cart<\/h4>\n<p>Someone has removed a product from their shopping basket.<\/p>\n<p>The event includes a description of the product the user removed. Users do not need to be logged in to do this, they can log in at the point they are ready to check out. The event includes information about the device that the user is using.<\/p>\n<p><script src=\"https:\/\/emgithub.com\/embed-v2.js?target=https%3A%2F%2Fgithub.com%2Fdalelane%2Fflink-sql-clicktracking-demo%2Fblob%2Fmain%2Fsample-events%2F4-remove-from-cart.json&#038;style=default&#038;type=code&#038;showBorder=on&#038;showLineNumbers=on&#038;showFileMeta=on&#038;showCopy=on&#038;maxHeight=300\"><\/script><small><a href=\"https:\/\/github.com\/dalelane\/flink-sql-clicktracking-demo\/blob\/main\/sample-events\/4-remove-from-cart.json\">sample remove from cart event 4-remove-from-cart.json<\/a><\/small><\/p>\n<h4>Login<\/h4>\n<p>A registered user has logged into the Loosehanger Jeans website.<\/p>\n<p>The event includes details about the user who has logged in, and the device that the user is using.<\/p>\n<p><script 
src=\"https:\/\/emgithub.com\/embed-v2.js?target=https%3A%2F%2Fgithub.com%2Fdalelane%2Fflink-sql-clicktracking-demo%2Fblob%2Fmain%2Fsample-events%2F5-login.json&#038;style=default&#038;type=code&#038;showBorder=on&#038;showLineNumbers=on&#038;showFileMeta=on&#038;showCopy=on&#038;maxHeight=300\"><\/script><small><a href=\"https:\/\/github.com\/dalelane\/flink-sql-clicktracking-demo\/blob\/main\/sample-events\/5-login.json\">sample login event 5-login.json<\/a><\/small><\/p>\n<h4>Checkout<\/h4>\n<p>A registered user has completed a purchase on the Loosehanger Jeans website. A series of events occurs when someone makes a purchase (e.g. CART_VIEW, CHECKOUT_START), but CHECKOUT_COMPLETE is the interesting one: it is emitted when a purchase is complete.<\/p>\n<p><script src=\"https:\/\/emgithub.com\/embed-v2.js?target=https%3A%2F%2Fgithub.com%2Fdalelane%2Fflink-sql-clicktracking-demo%2Fblob%2Fmain%2Fsample-events%2F6-checkout.json&#038;style=default&#038;type=code&#038;showBorder=on&#038;showLineNumbers=on&#038;showFileMeta=on&#038;showCopy=on&#038;maxHeight=300\"><\/script><small><a href=\"https:\/\/github.com\/dalelane\/flink-sql-clicktracking-demo\/blob\/main\/sample-events\/6-checkout.json\">sample checkout event 6-checkout.json<\/a><\/small><\/p>\n<h4>Others<\/h4>\n<p>There are other events, but those are enough to give you the idea of the sort of thing that we&#8217;ve got to play with.<\/p>\n<h3 style=\"color: white; background-color: #000099; padding: 5px; margin-top: 25px;\"><a style=\"color: white;\" name=\"setup\">Setup<\/a><\/h3>\n<p>My goal with this post was to inspire &#8211; when introducing teams to Flink, I find that it helps to have tangible, concrete examples of what it can be used for, and how to turn ideas into Flink SQL.<\/p>\n<p>If you&#8217;d like to try these examples for yourself, you can follow <a href=\"https:\/\/github.com\/dalelane\/flink-sql-clicktracking-demo\/blob\/main\/SETUP.md\">these setup instructions<\/a>. 
In summary, I&#8217;m using the <a href=\"https:\/\/github.com\/IBM\/kafka-connect-loosehangerjeans-source\">&#8220;Loosehanger Jeans&#8221; data generator<\/a>, which is a configurable Kafka Connect source connector that generates synthetic events for demo and development projects.<\/p>\n<p>I&#8217;m using it to produce <a href=\"https:\/\/github.com\/IBM\/event-automation-demo\/blob\/main\/install\/eventstreams\/templates\/08-datagen.yaml#L83-L119\">Avro-encoded<\/a> events using a Confluent-compatible schema registry.<\/p>\n<p><script src=\"https:\/\/emgithub.com\/embed-v2.js?target=https%3A%2F%2Fgithub.com%2FIBM%2Fevent-automation-demo%2Fblob%2F494221e7dd958b6f089b793feb7797b29965ee61%2Finstall%2Feventstreams%2Ftemplates%2F08-datagen.yaml%23L83-L119&#038;style=default&#038;type=code&#038;showBorder=on&#038;showLineNumbers=on&#038;showFileMeta=on&#038;showCopy=on&#038;maxHeight=300\"><\/script><small><a href=\"https:\/\/github.com\/IBM\/event-automation-demo\/blob\/494221e7dd958b6f089b793feb7797b29965ee61\/install\/eventstreams\/templates\/08-datagen.yaml#L83-L119\">data generator config that I used<\/a><\/small><\/p>\n<p>I used Avro as I wanted to demonstrate the way to integrate with a schema registry, but you could simplify this by <a href=\"https:\/\/github.com\/IBM\/event-automation-demo\/blob\/main\/install\/eventstreams\/templates\/08-datagen.yaml#L13-L14\">generating JSON events<\/a> and then <a href=\"https:\/\/github.com\/dalelane\/flink-sql-clicktracking-demo\/blob\/main\/sql\/0-setup.sql#L58\">changing the Flink connector format to json<\/a>. In that way, you could even run all of it using OSS Kafka and Flink on your own laptop.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In this post, I introduce a few core Flink SQL functions using worked examples of processing a stream of click tracking events from a retail website. 
I find that a practical, real-world (ish) example can help to explain how to use Flink SQL in a way that abstract descriptions, such as processing coloured blocks sometimes [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[7],"tags":[610],"class_list":["post-5806","post","type-post","status-publish","format-standard","hentry","category-code","tag-flink"],"_links":{"self":[{"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=\/wp\/v2\/posts\/5806","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=5806"}],"version-history":[{"count":13,"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=\/wp\/v2\/posts\/5806\/revisions"}],"predecessor-version":[{"id":5904,"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=\/wp\/v2\/posts\/5806\/revisions\/5904"}],"wp:attachment":[{"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=5806"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=5806"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=5806"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}