Overview
Counting and visualising which keyboard keys I press the most often
Background
I noticed something yesterday.
The lettering on my left Ctrl button is a lot more faded than the lettering on my right Ctrl button.
I must press the left Ctrl button more often than I do the right one.
That got me looking at the rest of the keys on the keyboard. Some of them are faded, too. Some a little, some a lot.
I must press some of those keys more than I do others.
I’ve played hangman, so I know that there are some letters that occur more frequently in English words than others. So I could have just said that those are the keys I probably use the most and left it at that.
But… I don’t spend that much time writing documents or large chunks of English.
I’m a code monkey. I’m not sure that distribution would necessarily apply to me.
For example, I probably use the semi-colon key quite a lot – at the end of every line when writing in some languages. That wouldn’t be true for people writing English.
So I wondered how much I use each keyboard key in comparison to each other.
Being an obsessive-compulsive geek, I couldn’t leave that as an idle wondering. I had to find out. I had to go and get the data.
Getting the data
I wrote a quick-and-dirty keylogger in Python. It counts how many times I press each key, and every 15 minutes writes the counts to a file.
It’s very hacky – for example, I’ve not added a way to stop the keylogger, so I have to use taskkill when I want to stop it 🙂
Oh, and it’s Windows-specific, as I used pyHook.
import pythoncom, pyHook
from copy import copy
from time import sleep
from threading import Timer

# running totals, keyed by the key name that pyHook reports
currentCounts = {}

def onKeyDown(event):
    keyname = event.GetKey()
    if keyname not in currentCounts:
        currentCounts[keyname] = 1
    else:
        currentCounts[keyname] += 1
    # return True so the key press is passed on to other handlers
    return True

def storeCounts():
    while True:
        # take a snapshot so the hook can keep counting while we write
        countsToStore = copy(currentCounts)
        wordlefile = open("keyswordle.txt", "w")
        wordlefile.write("key\tcount\n")
        for keyname in countsToStore:
            label = keyname
            # strip the "Media_" prefix from media key names
            if keyname.startswith("Media_"):
                label = keyname[6:]
            wordlefile.write(label + "\t" + str(countsToStore[keyname]) + "\n")
        wordlefile.close()
        sleep(900)

# the first write happens after 15 minutes, then every 15 minutes after that
captureThread = Timer(900.0, storeCounts)
captureThread.start()

hookmgr = pyHook.HookManager()
hookmgr.KeyDown = onKeyDown
hookmgr.HookKeyboard()
pythoncom.PumpMessages()
Visualising the data
I used the IBM Word Cloud Generator to turn the counts into a word cloud.
I wrote my Python script so that it would write the key-press counts to a file in a format that the word cloud generator would understand.
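As a rough illustration of that format, here is a small sketch that turns a dictionary of counts into the tab-separated text. The function name format_counts and the sample counts are made up for this example. Sorting by count is only there to make the output predictable; I’m not assuming the generator needs it.

```python
def format_counts(counts):
    """Return tab-separated text with a 'key<TAB>count' heading row."""
    lines = ["key\tcount"]
    # sorted by count (highest first), then by name, purely for predictable output
    for keyname, count in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        label = keyname
        # same "Media_" prefix stripping as in the keylogger
        if keyname.startswith("Media_"):
            label = keyname[len("Media_"):]
        lines.append(label + "\t" + str(count))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # made-up numbers, just to show the shape of the file
    sample = {"Space": 42, "E": 17, "Media_Play_Pause": 3}
    print(format_counts(sample))
```

The heading row is what lets the config refer to the columns by name (wordcolumn: key, weightcolumn: count).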
To run the word cloud generator, I used the following config file:
font: c:/windows/fonts/georgiab.ttf
format: tab
inputencoding: UTF-8
firstline: headings
wordcolumn: key
weightcolumn: count
background: FFFFFF
palette: 2367CD, 3423CD, 8923CD, CD23BC, CD2367, CD3423, CD8923, BCCD23, 67CD23, 23CD34, 23CD89, 23BCCD, 4C88E1, 81ABEA, E1A54C, EAC081
placement: HorizontalCenterLine
shape: SQUARISH
orientation: HORIZONTAL
stripnumbers: false
and the following command:
java -jar ibm-word-cloud.jar -c config.txt -w 800 -h 600 < keyswordle.txt > keyswordle.png
And that’s pretty much it. I had a pretty word cloud to show me how much I press each key on my keyboard compared with the others.
Why did I write my own keylogger?
I’m a little paranoid. Googling for “keylogger” and running the first setup.exe I found didn’t feel like it’d be too safe.
I’m sure there are trustworthy keyloggers out there, but writing a bit of Python only took me ten minutes, which is probably less time than I would have spent researching the ones already out there!
Did you “fix” the data?
I removed ‘Space’ from the visualisation. It turns out I press the space bar a lot – making the rest of the word cloud a little difficult to interpret.
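A minimal sketch of how a key could be dropped from the counts file before it goes to the generator. This isn’t necessarily how it was done for the images here. The name drop_keys is made up for the example, and it assumes the tab-separated layout with the key name in the first column.

```python
def drop_keys(tsv_text, unwanted=("Space",)):
    """Return the tab-separated text without the rows for the unwanted keys."""
    kept = []
    for line in tsv_text.splitlines():
        # the key name is the first tab-separated field on each row
        label = line.split("\t")[0]
        if label not in unwanted:
            kept.append(line)
    return "\n".join(kept) + "\n"


if __name__ == "__main__":
    # made-up counts, only to show the effect
    original = "key\tcount\nSpace\t900\nE\t120\n"
    print(drop_keys(original))
```

The heading row survives because its first field is "key", which isn’t in the unwanted list.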
What have I learnt doing this?
Erm… almost nothing. I think I learnt a depressing little bit about just how easily I can get side-tracked obsessing about trivial and unimportant details. 🙂
Tags: keylogger, python, visualisation, wordle
I don’t see semi-colon in the visualisation at all. Does that mean you were wrong about pressing that a lot?
Nah, it means I was too impatient to post this before letting it run for a while. 🙂
These word clouds are the result of a few hours of logging, and the only code I’ve written since yesterday evening was in Python (no semi-colons).
I’ve got some Java and some JavaScript development I need to do in the next few days, so I’ve left the keylogger running and will come back and update the wordclouds in a week or so.
Very nice – you might like http://iographica.com/ as well 🙂
Mike Knuepfel did a much better visualisation than my word cloud – making an actual 3D model with frequency represented as the height of a column over the keyboard keys.
Keyboard Frequency Sculpture from Michael Knuepfel on Vimeo.
He used the standard distribution of letter frequencies from Wikipedia, rather than tracking his own usage, but as a visualisation it has much more impact.
I’d love to see my own data represented like this.
See his blog for his write-up
(Thanks to Kevin for pointing this project out to me)
Cracking post!! Very true, it is very easy to be motivated to do the things you want to do, rather than the things you should.
Chris
PS I’m supposed to be doing a doc review!
The results looked very different after leaving it to run for a few days…
No wonder my left Ctrl key is faded – I use it a lot