{"id":825,"date":"2009-07-28T10:10:16","date_gmt":"2009-07-28T10:10:16","guid":{"rendered":"http:\/\/dalelane.co.uk\/blog\/?p=825"},"modified":"2009-07-28T10:10:16","modified_gmt":"2009-07-28T10:10:16","slug":"what-programme-was-on-channel-4","status":"publish","type":"post","link":"https:\/\/dalelane.co.uk\/blog\/?p=825","title":{"rendered":"What programme was on Channel 4&#8230;?"},"content":{"rendered":"<p>I <a href=\"http:\/\/dalelane.co.uk\/blog\/?p=818\">posted yesterday<\/a> about my quick play with the <a href=\"http:\/\/www0.rdthdo.bbc.co.uk\/services\/api\/\" target=\"_blank\">BBC Web API<\/a> for programme schedules. I wanted to be able to programmatically find out what programme was on a particular channel at a given time. <\/p>\n<p>The problem with the <a href=\"http:\/\/dalelane.co.uk\/blog\/?p=818#more-818\">quick code I came up with<\/a> was that it only gets me BBC channels. What if I want to know what was on a non-BBC channel?<\/p>\n<p><a href=\"http:\/\/twitter.com\/awhitehouse\" target=\"_blank\">Andrew<\/a> <a href=\"http:\/\/twitter.com\/awhitehouse\/status\/2601087078\" target=\"_blank\">pointed me<\/a> at the <a href=\"http:\/\/www.radiotimes.com\/\" target=\"_blank\">Radio Times website<\/a>, which makes <a href=\"http:\/\/xmltv.radiotimes.com\/xmltv\/channels.dat\" target=\"_blank\">programme schedule data<\/a> available in <a href=\"http:\/\/wiki.xmltv.org\/index.php\/XMLTVFormat\" target=\"_blank\">XMLTV format<\/a>. <\/p>\n<p>And <a href=\"http:\/\/twitter.com\/oldmanuk\/\" target=\"_blank\">Dom<\/a> <a href=\"http:\/\/twitter.com\/oldmanuk\/status\/2601734175\" target=\"_blank\">pointed me<\/a> at a neat <a href=\"http:\/\/code.google.com\/p\/python-xmltv\/\" target=\"_blank\">Python library for parsing XMLTV data<\/a>.<\/p>\n<p><!--more--><strong>Getting the XMLTV data<\/strong><\/p>\n<p>Radio Times make the XMLTV data available for each channel individually. 
For example, BBC1 programme data is at <a href=\"http:\/\/xmltv.radiotimes.com\/xmltv\/92.dat\" target=\"_blank\">xmltv.radiotimes.com\/xmltv\/92.dat<\/a>.<\/p>\n<p>Rather than download each file individually, I downloaded the XMLTV program to do this for me. There is a Windows exe version available in <a href=\"http:\/\/sourceforge.net\/projects\/xmltv\/files\/\" target=\"_blank\">the XMLTV SourceForge project<\/a>.<\/p>\n<p>Step one is to run the exe with the --configure option:<\/p>\n<pre style=\"overflow: scroll; font-size: 1.1em; border: thin solid silver; background-color: #eeeeee; padding: 0.8em\">&gt; xmltv tv_grab_uk_rt --configure<\/pre>\n<p>The tv_grab_uk_rt argument tells XMLTV to get the data from the Radio Times website.<\/p>\n<p>The --configure option takes you through some config steps to choose which channels you want to download &#8211; either choosing each channel individually, or choosing from a preset group (e.g. all Freeview channels).<\/p>\n<p>Step two is to export all of the data to a single XMLTV file:<\/p>\n<pre style=\"overflow: scroll; font-size: 1.1em; border: thin solid silver; background-color: #eeeeee; padding: 0.8em\">&gt; xmltv tv_grab_uk_rt --output xmltv.out.xml<\/pre>\n<p>This creates an xmltv.out.xml file containing the schedule information for all the channels I chose, in a format that the <a href=\"http:\/\/code.google.com\/p\/python-xmltv\/\" target=\"_blank\">Python library<\/a> can understand.<\/p>\n<p><strong>Using the XMLTV data &#8211; locally<\/strong><\/p>\n<p>A quick Python script lets me recreate what I was <a href=\"http:\/\/dalelane.co.uk\/blog\/?p=818\">doing in Java before<\/a>:<\/p>\n<pre style=\"overflow: scroll; font-size: 1.1em; border: thin solid silver; background-color: #eeeeee; padding: 0.8em\">#\r\n# IMPORTS\r\n# \r\n \r\nimport xmltv\r\nfrom datetime import datetime, timedelta\r\n\r\n#\r\n# INPUTS\r\n# \r\n\r\nfilename = 'C:\\\\location\\\\of\\\\my\\\\xmltv.out.xml'\r\nrequestedChannel = \"channel4.com\"\r\nrequestedStart   = \"20090727190500\"\r\nrequestedFinish  = \"20090727201500\"\r\n\r\n#\r\n# CODE\r\n# \r\n\r\ndateFormat = \"%Y%m%d%H%M%S\"\r\n\r\nxmltv.locale = 'Latin-1'\r\nprogrammes = xmltv.read_programmes(open(filename, 'r'))\r\n\r\n\r\nrequestedStartTime = datetime.strptime(requestedStart, dateFormat)\r\nrequestedEndTime   = datetime.strptime(requestedFinish, dateFormat)\r\n\r\n\r\nfor programme in programmes:\r\n    if programme['channel'] == requestedChannel:\r\n        # drop the trailing timezone offset (e.g. ' +0100') before parsing\r\n        progStartTime = datetime.strptime(programme['start'][:-6], dateFormat)\r\n\r\n        if requestedEndTime &gt;= progStartTime:\r\n            progFinishTime = datetime.strptime(programme['stop'][:-6], dateFormat)\r\n\r\n            if requestedStartTime &lt;= progFinishTime:\r\n                # time watched is the overlap between the programme and the requested window\r\n                duration = min(progFinishTime, requestedEndTime) - max(progStartTime, requestedStartTime)\r\n                if duration != timedelta(0):\r\n                    print programme['title'][0][0]\r\n                    print progStartTime\r\n                    print progFinishTime\r\n                    print duration\r\n                    print '--------------------'<\/pre>\n<p>Running this will output:<\/p>\n<pre style=\"overflow: scroll; font-size: 1.1em; border: thin solid silver; background-color: #eeeeee; padding: 0.8em\">&gt; testtvguide.py\r\nChannel 4 News\r\n2009-07-27 19:00:00\r\n2009-07-27 19:55:00\r\n0:50:00\r\n--------------------\r\n3 Minute Wonder: The Estate\r\n2009-07-27 19:55:00\r\n2009-07-27 20:00:00\r\n0:05:00\r\n--------------------\r\nDispatches\r\n2009-07-27 20:00:00\r\n2009-07-27 21:00:00\r\n0:15:00\r\n--------------------<\/pre>\n<p>It tells me what programmes were on at that time, and how much of each was on in 
the provided time window.<\/p>\n<p><strong>Making this available over the web<\/strong><\/p>\n<p>For what I want to do with this service, I need to be able to access it remotely &#8211; so I wrapped this bit of Python in a web service that I could run in <a href=\"http:\/\/code.google.com\/appengine\/\" target=\"_blank\">Google App Engine<\/a>.<\/p>\n<pre style=\"overflow: scroll; font-size: 1.1em; border: thin solid silver; background-color: #eeeeee; padding: 0.8em\">import cgi\r\nfrom datetime import * \r\nimport wsgiref.handlers\r\nimport xmltv\r\nfrom django.utils import simplejson\r\nfrom google.appengine.ext import webapp\r\n\r\n\r\nclass QueryPage(webapp.RequestHandler):\r\n\r\n    dateFormat = \"%Y%m%d%H%M%S\"\r\n\r\n    def getChannelNameSynonyms(self, channelname):\r\n        channelname = channelname.lower()\r\n        if channelname == '1' or channelname == 'bbc1' or channelname == 'south.bbc1.bbc.co.uk':\r\n            return 'south.bbc1.bbc.co.uk'\r\n        if channelname == '2' or channelname == 'bbc2' or channelname == 'south.bbc2.bbc.co.uk':\r\n            return 'south.bbc2.bbc.co.uk'\r\n        if channelname == 'bbc3' or channelname == 'bbcthree.bbc.co.uk':\r\n            return 'bbcthree.bbc.co.uk'\r\n        if channelname == 'bbc4' or channelname == 'bbcfour' or channelname == 'bbcfour.bbc.co.uk':\r\n            return 'bbcfour.bbc.co.uk'\r\n        if channelname == 'bbc24' or channelname == 'news.bbc.co.uk' or channelname == 'news' or channelname == 'news24':\r\n            return 'news.bbc.co.uk'\r\n        if channelname == 'cbbc' or channelname == 'cbbc.bbc.co.uk':\r\n            return 'cbbc.bbc.co.uk'\r\n        if channelname == 'cbeebies' or channelname == 'cbeebies.bbc.co.uk':\r\n            return 'cbeebies.bbc.co.uk'\r\n        if channelname == '3' or channelname == 'meridian' or channelname == 'itv' or channelname == 'meridian.itv1.itv.co.uk':\r\n            return 'meridian.itv1.itv.co.uk'\r\n        if channelname == '4' or 
channelname == 'ch4' or channelname == 'channel4' or channelname == 'channel4.com':\r\n            return 'channel4.com'\r\n        if channelname == 'e4' or channelname == 'e4.channel4.com':\r\n            return 'e4.channel4.com'\r\n        if channelname == 'film4' or channelname == 'filmfour' or channelname == 'filmfour.channel4.com':\r\n            return 'filmfour.channel4.com'\r\n        if channelname == 'dave' or channelname == 'dave.uktv.co.uk':\r\n            return 'dave.uktv.co.uk'\r\n\r\n\r\n\r\n    def get(self):\r\n        xmltv.locale = 'Latin-1'\r\n        filename = 'xmltv.out.xml'\r\n        filehandle = open(filename, 'r')\r\n        programmes = xmltv.read_programmes(filehandle)\r\n        filehandle.close()\r\n\r\n        requestedChannel = self.getChannelNameSynonyms(self.request.get('channel'))\r\n\r\n        requestedStartTime = datetime.strptime(self.request.get('start'), self.dateFormat)\r\n        requestedEndTime   = datetime.strptime(self.request.get('stop'),  self.dateFormat)\r\n\r\n        matchedProgrammes = []\r\n\r\n        for programme in programmes:\r\n            if programme['channel'] == requestedChannel:\r\n                progStartTime = datetime.strptime(programme['start'][:-6], self.dateFormat)\r\n\r\n                if requestedEndTime &gt;= progStartTime:\r\n                    progFinishTime = datetime.strptime(programme['stop'][:-6], self.dateFormat)\r\n\r\n                    if requestedStartTime &lt;= progFinishTime:\r\n                        # time watched is the overlap between the programme and the requested window\r\n                        duration = min(progFinishTime, requestedEndTime) - max(progStartTime, requestedStartTime)\r\n                        if duration != timedelta(0):\r\n                            matchedProgrammes.append({ 'title'   : programme['title'][0][0],\r\n                                                       'start'   : str(progStartTime),\r\n                                                       'stop'    : str(progFinishTime),\r\n                                                       'watched' : str(duration) })\r\n\r\n        self.response.out.write(simplejson.dumps(matchedProgrammes))\r\n\r\napplication = webapp.WSGIApplication([('\/tvquery',  QueryPage)],\r\n                                     debug=True)\r\ndef main():\r\n    wsgiref.handlers.CGIHandler().run(application)\r\n\r\nif __name__ == \"__main__\":\r\n    main()<\/pre>\n<p>This means that I now have a web service roughly similar to the BBC Web API one, but one which can give me information for non-BBC channels, too.<\/p>\n<p>For example, using <a href=\"http:\/\/www.gnu.org\/software\/wget\/\" target=\"_blank\">wget<\/a> to test it:<\/p>\n<pre style=\"overflow: scroll; font-size: 1.1em; border: thin solid silver; background-color: #eeeeee; padding: 0.8em\">&gt; wget -O - \"http:\/\/not-telling-you\/tvquery?channel=channel4&amp;start=20090727190500&amp;stop=20090727201500\" \r\n\r\n[\r\n    {\r\n        \"start\":   \"2009-07-27 19:00:00\",\r\n        \"stop\":    \"2009-07-27 19:55:00\",\r\n        \"title\":   \"Channel 4 News\",\r\n        \"watched\": \"0:50:00\"\r\n    },\r\n    {\r\n        \"start\":   \"2009-07-27 19:55:00\",\r\n        \"stop\":    \"2009-07-27 20:00:00\",\r\n        \"title\":   \"3 Minute Wonder: The Estate\",\r\n        \"watched\": \"0:05:00\"\r\n    },\r\n    {\r\n        \"start\":   
\"2009-07-27 20:00:00\",\r\n        \"stop\":    \"2009-07-27 21:00:00\",\r\n        \"title\":   \"Dispatches\",\r\n        \"watched\": \"0:15:00\"\r\n    }\r\n]<\/pre>\n<p><strong>The problem<\/strong><\/p>\n<p>You might have noticed that I hid the hostname where I&#8217;ve put this service in the example above. I&#8217;ve done this because not doing so would be breaching the conditions regarding the use of the Radio Times data:<\/p>\n<blockquote><p>In accessing this XML feed, you agree that you will only access its contents for your own personal and non-commercial use and not for any commercial or other purposes, including advertising or selling any goods or services, including any third-party software applications available to the general public.<\/p><\/blockquote>\n<p>I can do this as long as it&#8217;s for my own personal use, but I can&#8217;t make it available as a web service to anyone else. <\/p>\n<p>Shame.<\/p>\n<p>Is there anywhere else I can get XMLTV format data that wouldn&#8217;t have this restriction?<\/p>\n<p>If I stick a TV tuner card in my server and get the data from the digital TV signal that set-top boxes use to produce <a href=\"http:\/\/en.wikipedia.org\/wiki\/Electronic_program_guide\" target=\"_blank\">EPGs<\/a>, would that be okay?<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I posted yesterday about my quick play with the BBC Web API for programme schedules. I wanted to be able to programmatically find out what programme was on a particular channel at a given time. The problem with the quick code I came up with was that it only gets me BBC channels. 
What if [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[7],"tags":[404,231,403,212,401,402,398,400],"class_list":["post-825","post","type-post","status-publish","format-standard","hentry","category-code","tag-gae","tag-google-app-engine","tag-programmes","tag-python","tag-radiotimes","tag-schedules","tag-tv","tag-xmltv"],"_links":{"self":[{"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=\/wp\/v2\/posts\/825","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=825"}],"version-history":[{"count":0,"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=\/wp\/v2\/posts\/825\/revisions"}],"wp:attachment":[{"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=825"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=825"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=825"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}