{"id":3646,"date":"2018-11-18T02:53:54","date_gmt":"2018-11-18T02:53:54","guid":{"rendered":"http:\/\/dalelane.co.uk\/blog\/?p=3646"},"modified":"2018-12-14T14:23:37","modified_gmt":"2018-12-14T14:23:37","slug":"how-to-create-a-twitter-api-proxy-using-nginx-in-cloud-foundry","status":"publish","type":"post","link":"https:\/\/dalelane.co.uk\/blog\/?p=3646","title":{"rendered":"How to create a Twitter API proxy using nginx in Cloud Foundry"},"content":{"rendered":"<p><strong>In this post, I\u2019ll describe how to run nginx in Cloud Foundry to provide a Twitter API proxy that includes authentication and caching.<\/strong><\/p>\n<p>First, I want to talk a bit about why I wanted this, but if you don\u2019t care about any of that, you can just <a href=\"http:\/\/dalelane.co.uk\/blog\/?p=3646#thecode\">skip to the code<\/a> at the end of the post. \ud83d\ude42<\/p>\n<p>I\u2019ve wanted for a while to enable projects in <a href=\"https:\/\/machinelearningforkids.co.uk\/\">Machine Learning for Kids<\/a> that use tweets. 
Using live tweets is a great way to make text analytics real for students, and a good example of how natural language processing is used in the real world.<\/p>\n<p>The question was how to enable this from <a href=\"https:\/\/scratch.mit.edu\/\">Scratch<\/a> in a way that would be easy for schools to use.<\/p>\n<p>The title of this post gives away the answer I ended up with, but I&#8217;ll describe why.<\/p>\n<p><!--more-->As the <a href=\"https:\/\/developer.twitter.com\/en\/docs\">Twitter API<\/a> doesn\u2019t allow CORS requests, and Scratch is a web application, I wouldn\u2019t be able to make requests directly to Twitter from the Scratch extension.<\/p>\n<p>This introduced <strong>requirement 1: I needed a proxy<\/strong> that could receive API requests from the Scratch extension and forward them to the Twitter API.<\/p>\n<p>The Twitter API requires <a href=\"https:\/\/developer.twitter.com\/en\/docs\/basics\/authentication\/overview\/oauth\">authentication<\/a>.<\/p>\n<p>I could ask students to provide their own log on, but it\u2019d be difficult to do that in a simple way. (And I\u2019d feel uncomfortable asking under-16s to create an account with Twitter just to complete a coding lesson, even if Twitter allows accounts for young people over 13.)<\/p>\n<p>I tried building a mechanism based on asking teachers to provide a log on for use by their class. It\u2019s technically possible. However, it has already been very challenging to get teachers and code club leaders to create the Watson API keys they need for their class, so I have a good idea of how difficult it would be to get teachers to create developer accounts on Twitter. As such, I\u2019d rather avoid this approach for now. 
(But if my current approach doesn\u2019t work out, this will be the last resort fall-back).<\/p>\n<p>This introduced <strong>requirement 2: I needed the proxy to use my Twitter credentials<\/strong> for all API requests without exposing them to clients.<\/p>\n<p>The free Twitter API <a href=\"https:\/\/developer.twitter.com\/en\/docs\/basics\/rate-limiting\">rate limits<\/a> aggressively &#8211; I\u2019ll only get 450 API requests every 15 minutes. I can\u2019t afford to pay to get premium access to the Twitter API. And there are currently tens of thousands of students using Machine Learning for Kids, so I could very quickly burn through that limit.<\/p>\n<p>This introduced <strong>requirement 3: I needed the proxy to cache<\/strong> API responses from Twitter to reduce the number of API calls that it makes.<\/p>\n<p>I\u2019ve written before about <a href=\"http:\/\/dalelane.co.uk\/blog\/?p=3611\">where Machine Learning for Kids is running<\/a>. But to recap, it\u2019s running as a set of Cloud Foundry applications running in <a href=\"https:\/\/www.ibm.com\/cloud\/cloud-foundry\">IBM Cloud<\/a>.<\/p>\n<p>Combine this with the fact that I\u2019m inherently lazy and prefer to avoid reinventing wheels where possible. This all introduced <strong>sort-of-requirement 4: I need a proxy that would be easy to run in Cloud Foundry with minimal code<\/strong> needed.<\/p>\n<p>Before I share how I\u2019ve done it, it\u2019s worth pointing out that I\u2019m obviously not the first person to do something like this.<\/p>\n<p>There is a <a href=\"https:\/\/technoboy10.github.io\/twitter\/\">Twitter extension for ScratchX<\/a> that is using a proxy running in Heroku. It\u2019s also blocking cross-origin requests so I couldn\u2019t use it from machinelearningforkids.co.uk and it only lets you fetch a single tweet, which wouldn\u2019t work for the sort of projects I want to enable. 
I don\u2019t know how it\u2019s implemented, but I\u2019d guess that it isn\u2019t too far from what I need. I\u2019m not sure if it\u2019s still being maintained &#8211; I did try emailing the developer but didn\u2019t get a reply.<\/p>\n<p>This <a href=\"http:\/\/blog.etianen.com\/blog\/2013\/04\/12\/nginx-twitter-api-proxy\/\">blog post<\/a> by <a href=\"https:\/\/twitter.com\/etianen\">Dave Hall<\/a> describes how to use nginx for a Twitter API proxy, and got me 90% of the way to what I needed. What I\u2019ve ended up with is his blog post, tweaked to run in Cloud Foundry and with the caching turned up to 11. So a huge thanks to him for sharing it.<\/p>\n<p><a name=\"thecode\"> <\/a><\/p>\n<h3>Step 1 &#8211; Create a developer account with Twitter<\/h3>\n<p>Go to <a href=\"https:\/\/developer.twitter.com\/\">https:\/\/developer.twitter.com\/<\/a> and follow the instructions.<\/p>\n<p>Create an application and copy the consumer key and consumer secret.<\/p>\n<h3>Step 2 &#8211; Create an access token<\/h3>\n<p>This script in Dave&#8217;s blog post makes that easy:<\/p>\n<pre style=\"border: thin solid silver; background-color: #eeeeee; padding: 0.7em; font-size: 1.1em; overflow: auto;\">$ export CONSUMER_KEY=XXXXXXXXXXXXXXXXXXXXX\r\n$ export CONSUMER_SECRET=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX\r\n$ curl -H \"Authorization: Basic `echo -ne \"$CONSUMER_KEY:$CONSUMER_SECRET\" | base64`\" -d \"grant_type=client_credentials\" https:\/\/api.twitter.com\/oauth2\/token<\/pre>\n<p>Grab the <code>access_token<\/code> from the response.<\/p>\n<h3>Step 3 &#8211; Create a manifest.yml file<\/h3>\n<p>You&#8217;ll need to replace <code>your-proxy-name<\/code>, <code>your-host.com<\/code> and <code>the-access-token-from-step-2<\/code>.<\/p>\n<p><code><strong>manifest.yml<\/strong><\/code><\/p>\n<pre style=\"border: thin solid silver; background-color: #eeeeee; padding: 0.7em; font-size: 1.1em; overflow: auto;\">applications:\r\n- name: your-proxy-name\r\n  instances: 1\r\n  memory: 
256M\r\n  disk_quota: 256M\r\n  path: .\r\n  routes:\r\n    - route: https:\/\/your-host.com\/proxies\/twitter\r\n  buildpack: https:\/\/github.com\/cloudfoundry\/nginx-buildpack.git\r\n  env:\r\n    TWITTER_BEARER_TOKEN: the-access-token-from-step-2\r\n<\/pre>\n<h3>Step 4 &#8211; Create an nginx.conf file<\/h3>\n<p><code><strong>nginx.conf<\/strong><\/code><\/p>\n<pre style=\"border: thin solid silver; background-color: #eeeeee; padding: 0.7em; font-size: 1.1em; overflow: auto;\"># write errors to stderr where Cloud Foundry can grab them\r\nerror_log stderr;\r\n\r\n# leave as default for now\r\nevents { worker_connections 1024; }\r\n\r\nhttp {\r\n  # Defines an on-disk API cache, with a 200 megabyte shared memory\r\n  #  zone (twitter_api_proxy) for the cache keys and metadata\r\n  proxy_cache_path  {{env \"HOME\"}}\/api_cache_space levels=1:2 keys_zone=twitter_api_proxy:200m;\r\n\r\n  server {\r\n    # get the port number from Cloud Foundry\r\n    listen {{port}};\r\n\r\n    # defines the Twitter proxy\r\n    location \/proxies\/twitter\/ {\r\n\r\n      # Cloud Foundry's access log is good enough, so save a little\r\n      # disk space by not asking nginx to create another\r\n      access_log off;\r\n\r\n      # Use the cache zone defined above\r\n      proxy_cache twitter_api_proxy;\r\n\r\n      # Cache API responses (status 200, 302 and 404) for 15 minutes\r\n      #  as the aim is to avoid sending the same request to Twitter\r\n      #  more than once within a rate-limiting request window\r\n      proxy_cache_valid 200 302 404 15m;\r\n\r\n      # Use the cache even after 15 minutes if the connection to\r\n      #  Twitter fails or times out, or while an entry is being updated\r\n      proxy_cache_use_stale error updating timeout;\r\n\r\n      # Ignore and strip the cache headers set by the Twitter API\r\n      proxy_ignore_headers X-Accel-Expires Expires Cache-Control Set-Cookie;\r\n      proxy_hide_header X-Accel-Expires;\r\n      proxy_hide_header Expires;\r\n      proxy_hide_header Cache-Control;\r\n      proxy_hide_header pragma;\r\n      proxy_hide_header set-cookie;\r\n\r\n      # Tells the client to cache this for 15 
minutes\r\n      expires 15m;\r\n\r\n      # Set the correct host name to connect to the Twitter API.\r\n      proxy_set_header Host api.twitter.com;\r\n\r\n      # Get the auth header from manifest.yml\r\n      proxy_set_header Authorization \"Bearer {{env \"TWITTER_BEARER_TOKEN\"}}\";\r\n\r\n      # Location of the Twitter API\r\n      #  (The trailing slash is important for the URL rewriting - don't remove it)\r\n      proxy_pass https:\/\/api.twitter.com\/;\r\n\r\n      # Add a header to the response that tells us if it came from the cache or not\r\n      add_header X-Cache-Status $upstream_cache_status;\r\n    }\r\n\r\n  }\r\n}<\/pre>\n<h3>Step 5 &#8211; Deploy<\/h3>\n<p>That&#8217;s it.<\/p>\n<p>All that&#8217;s left is to deploy it.<\/p>\n<pre style=\"border: thin solid silver; background-color: #eeeeee; padding: 0.7em; font-size: 1.1em; overflow: auto;\">$ cf push<\/pre>\n","protected":false},"excerpt":{"rendered":"<p>In this post, I\u2019ll describe how to run nginx in Cloud Foundry to provide a Twitter API proxy that includes authentication and caching. 
First, I want to talk a bit about why I wanted this, but if you don\u2019t care about any of that, you can just skip to the code at the end of [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[7],"tags":[586,536,151],"class_list":["post-3646","post","type-post","status-publish","format-standard","hentry","category-code","tag-nginx","tag-scratch","tag-twitter"],"_links":{"self":[{"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=\/wp\/v2\/posts\/3646","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=3646"}],"version-history":[{"count":0,"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=\/wp\/v2\/posts\/3646\/revisions"}],"wp:attachment":[{"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=3646"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=3646"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=3646"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}