{"id":1782,"date":"2011-09-11T12:07:33","date_gmt":"2011-09-11T12:07:33","guid":{"rendered":"http:\/\/dalelane.co.uk\/blog\/?p=1782"},"modified":"2011-09-11T12:39:15","modified_gmt":"2011-09-11T12:39:15","slug":"annotating-photos-with-tweets","status":"publish","type":"post","link":"https:\/\/dalelane.co.uk\/blog\/?p=1782","title":{"rendered":"Annotating photos with tweets"},"content":{"rendered":"<p>I have a lot of digital photos. <\/p>\n<p>An insane amount &#8211; something like 40,000 photos that go back over a dozen years since I first got a digital camera at University.<\/p>\n<p>I store them based on the date that they were taken, using a folder structure like this:<\/p>\n<p><a href=\"http:\/\/www.flickr.com\/photos\/dalelane\/6135324063\/\" title=\"screenshot of folder structure where I store photos\"><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/farm7.static.flickr.com\/6170\/6135324063_d7a91d7312.jpg\" width=\"450\" height=\"306\" alt=\"screenshot of folder structure where I store photos\"\/><\/a><\/p>\n<p>For a while, I used to drop a readme.txt text file into some of the folders saying where the photos were taken or what I was doing. This was partly so that when I look at the photos ten years later I&#8217;ve got something to remind me what was going on, but mainly to make it possible for me to search for photos of something when I can&#8217;t remember the date it happened. <\/p>\n<p>But in recent years, I&#8217;ve been too lazy to keep that up, and rarely add a readme file.<\/p>\n<p>I thought that my tweets might be a good alternative. There is a reasonable chance that if I took a photo of something interesting, I also tweeted sometime that day about where I was or what I was doing. 
<\/p>\n<p>I wanted to add a tweets.txt text file to each folder of a day&#8217;s photos, containing the tweets I posted on that day.<\/p>\n<p><!--more-->I signed up for <a href=\"https:\/\/tweetstreamapp.com\/\">Twitter Backup from Tweetstream<\/a> back in March (because I tend to sign up to any new website I hear of in case it ends up being useful!) and it&#8217;s been downloading my tweets ever since. <\/p>\n<p><a href=\"http:\/\/www.flickr.com\/photos\/dalelane\/6135869980\/\" title=\"screenshot of download options provided by tweetstream\"><img loading=\"lazy\" decoding=\"async\" style=\"border: thin black solid;\" src=\"http:\/\/farm7.static.flickr.com\/6170\/6135869980_2d172c0a81.jpg\" width=\"450\" height=\"338\" alt=\"screenshot of download options provided by tweetstream\"\/><\/a><\/p>\n<p>Finally, I found a use for tweetstream &#8211; they let me download my Twitter history (as far back as July 2009, anyway&#8230; because I had <a href=\"http:\/\/blog.downstreamapp.com\/tweet-history-limit\">more than 3,200 tweets<\/a> when I first signed up with them) as a JSON file. <\/p>\n<p>With that, it was just a matter of hacking together a little bit of Python to grab the tweets and copy them into the folders with my photos. 
<\/p>\n<pre style=\"border: thin solid silver; background-color: #eeeeee; padding: 0.7em; font-size: 1.1em; overflow: auto;\">#######################################\r\n# IMPORTS\r\n#######################################\r\n\r\n# lets us parse the twitter data\r\nimport json\r\n\r\n# lets us check for existence of photo folders\r\nimport os\r\n\r\n# lets us parse tweet timestamps\r\nfrom datetime import datetime\r\n\r\n# lets us persist state between runs of the script\r\nimport pickle\r\n\r\n\r\n\r\n#######################################\r\n# CONSTANTS\r\n#######################################\r\n\r\n# the root folder for where my photos are stored\r\nphotos_root = \"\/media\/mybook\/photos\/\"\r\n\r\n# the folder structure for where photos taken on a given date\r\n#  are stored, relative to photos_root\r\nfilesystem_date_format = \"photos_%Y\/photos_%y%m\/photos_%y%m%d\/\"\r\n\r\n# the name of the file used to store tweets\r\ntweets_file_name = \"tweets.txt\"\r\n\r\n# the time string format used in the twitter data being parsed\r\n#\r\n# Notice that I'm ignoring the timezone, as the built-in\r\n#  datetime library makes parsing it a pain (and I'm being\r\n#  lazy).\r\n#\r\n# I can ignore it as I'm in the UK, so this is always \"+0000\"\r\n#  for me anyway. 
You might need to change this to fit your\r\n#  locale.\r\ntwitter_date_format = \"%a %b %d %H:%M:%S +0000 %Y\"\r\n\r\n# for getting a time string from a datetime object\r\nshort_time_format = \"%H:%M:%S\"\r\n\r\n\r\n\r\n#######################################\r\n# STATE\r\n#######################################\r\n\r\n# last tweet stored - so that the next time we run this script\r\n#  with a more recent set of tweets, we don't re-store the same\r\n#  older tweets again\r\n# first time we run it, just leave it as 0\r\n\r\nmost_recent_tweet_id = 0\r\n\r\nif os.path.exists('last_tweet_processed.dat'):\r\n    # pickle files should be opened in binary mode\r\n    with open('last_tweet_processed.dat', 'rb') as state_file:\r\n        most_recent_tweet_id = pickle.load(state_file)\r\n\r\n   \r\n   \r\n#######################################\r\n# SCRIPT\r\n#######################################\r\n\r\n# open the file containing tweets from my timeline and parse\r\n#  it into an array of tweet objects\r\nwith open('timeline.json') as json_data:\r\n    timelinedata = json.load(json_data)\r\n\r\n# for each tweet in the array contained in the json file...\r\n#  (Note that the tweets are stored in reverse chronological\r\n#   order. 
I prefer chronological order, so I go through the\r\n#   array in reverse.)\r\nfor tweet in reversed(timelinedata):\r\n\r\n    # if we've not already looked at this tweet...\r\n    #  (notice that we're assuming tweets in the data will\r\n    #   be stored in id order - but this appears to be a\r\n    #   safe assumption)\r\n    if tweet['id'] > most_recent_tweet_id:\r\n       \r\n        # use the time the tweet was created to work out\r\n        #  the folder where photos on this date will be\r\n        #  stored\r\n        tweetdate = tweet['created_at']\r\n        tweettimestamp = datetime.strptime(tweetdate, twitter_date_format)\r\n        relative_folder_path = tweettimestamp.strftime(filesystem_date_format)\r\n        folder_path = photos_root + relative_folder_path\r\n        photos_exist = os.path.exists(folder_path)\r\n       \r\n        # do we have any photos for this date?\r\n        if photos_exist:\r\n           \r\n            # get the bits of the tweet that we want to use to\r\n            #  annotate the photos - the URL for this tweet,\r\n            #  the time it was sent (we don't need the date as\r\n            #  the location of the file tells us that), and\r\n            #  the tweet text itself\r\n            tweet_url = \"http:\/\/twitter.com\/\" + tweet['user']['screen_name'] + \"\/status\/\" + tweet['id_str']\r\n            tweet_time = tweettimestamp.strftime(short_time_format)\r\n            # some tweets can have unusual characters - drop anything\r\n            #  that isn't ASCII, then decode back to text so that it\r\n            #  can be joined with the other strings\r\n            tweet_text = tweet['text'].encode('ascii','ignore').decode('ascii')\r\n       \r\n            # we also get the human-readable description of\r\n            #  the place that the tweet was sent from, if it\r\n            #  is available (with the same ASCII-only treatment)\r\n            tweet_place = None\r\n            if tweet['place']:\r\n                tweet_place = tweet['place']['full_name'].encode('ascii','ignore').decode('ascii')\r\n\r\n            # append the tweet info to a 'tweets.txt' file in\r\n            #   the photos directory\r\n            
tweets_file_path = folder_path + tweets_file_name\r\n            with open(tweets_file_path, 'a') as tweets_file:\r\n                tweets_file.write(tweet_time + \"\\n\" + tweet_text + \"\\n\" + ((tweet_place + \"\\n\") if tweet_place else \"\") + tweet_url + \"\\n\\n\")\r\n\r\n        # update the ID of the tweet that we have processed\r\n        most_recent_tweet_id = tweet['id']\r\n       \r\n# update the most recent tweet that we processed so that\r\n#  we don't duplicate it\r\n# (pickle files should be opened in binary mode)\r\nwith open('last_tweet_processed.dat', 'wb') as state_file:\r\n    pickle.dump(most_recent_tweet_id, state_file)<\/pre>\n<p>And there we go. <\/p>\n<p>With tweetstream doing most of the hard work, I can use my tweets to annotate my photo collection. Neat \ud83d\ude42<\/p>\n<pre style=\"border: thin solid silver; background-color: #eeeeee; padding: 0.7em; font-size: 1.1em; overflow: auto;\">dale@media-hub:\/media\/mybook\/photos\/photos_2011\/photos_1108\/photos_110828$ ls\r\nDSC02602.JPG  DSC02615.JPG  DSC02630.JPG  DSC02644.JPG  DSC02656.JPG\r\nDSC02609.JPG  DSC02623.JPG  DSC02636.JPG  DSC02650.JPG  DSC02662.JPG\r\nDSC02610.JPG  DSC02624.JPG  DSC02637.JPG  DSC02651.JPG  MOV02664.MPG\r\nDSC02611.JPG  DSC02625.JPG  DSC02638.JPG  DSC02652.JPG  tweets.txt\r\nDSC02612.JPG  DSC02626.JPG  DSC02639.JPG  DSC02653.JPG\r\nDSC02613.JPG  DSC02627.JPG  DSC02641.JPG  DSC02654.JPG\r\n\r\ndale@media-hub:\/media\/mybook\/photos\/photos_2011\/photos_1108\/photos_110828$ more tweets.txt\r\n13:37:15\r\nI've had a productive day, digging holes to put children in. :-&#41; http:\/\/t.co\/4T2Ma16\r\nChristchurch, Dorset\r\nhttp:\/\/twitter.com\/dalelane\/status\/107808966280617984\r\n\r\ndale@media-hub:\/media\/mybook\/photos\/photos_2011\/photos_1108\/photos_110828$ <\/pre>\n","protected":false},"excerpt":{"rendered":"<p>I have a lot of digital photos. An insane amount &#8211; something like 40,000 photos that go back over a dozen years since I first got a digital camera at University. I store them based on the date that they were taken, using a folder structure like this: For a while, I used to drop [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[7],"tags":[116,499,212,151],"class_list":["post-1782","post","type-post","status-publish","format-standard","hentry","category-code","tag-backup","tag-photos","tag-python","tag-twitter"],"_links":{"self":[{"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=\/wp\/v2\/posts\/1782","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1782"}],"version-history":[{"count":0,"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=\/wp\/v2\/posts\/1782\/revisions"}],"wp:attachment":[{"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1782"}],"wp:term":[{"taxonomy":"category","embeddable"
:true,"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1782"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/dalelane.co.uk\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1782"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}