Archive for the ‘code’ Category

Implementing a text box for entering tags in a dojo web app

Sunday, August 19th, 2012

I needed a text box for entering tags on a dojo web app. I ended up making my own – it was only a hundred or so lines of code, but I’m sharing it here as it might be useful to others.

The text box needed to provide auto-complete when you start typing something that matches an existing tag. Dojo already has a text box widget that does auto-complete, dijit.form.ComboBox, so I started from there, modifying its behaviour so that

  • the options it offers are based on the current tag you’re typing (instead of the whole contents of the text box)
  • if you pick one of the options, it only replaces the current tag you’re typing with what you select (instead of replacing the whole contents)
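The two changes above come down to working out which tag the caret is in, and swapping only that tag. Here is a minimal sketch of that logic in plain JavaScript. It is not the widget code itself: the function names are made up for illustration, and it assumes tags are separated by commas.

```javascript
// Hypothetical helpers for illustration; not part of dijit.form.ComboBox.
// Assumption: tags in the text box are separated by commas.
const SEPARATOR = ",";

// Find the start and end offsets of the tag that the caret is inside.
function currentTagRange(text, caret) {
  const start = caret > 0 ? text.lastIndexOf(SEPARATOR, caret - 1) + 1 : 0;
  const next = text.indexOf(SEPARATOR, caret);
  const end = next === -1 ? text.length : next;
  return { start, end };
}

// The text to base the auto-complete options on: just the current tag.
function currentTag(text, caret) {
  const { start, end } = currentTagRange(text, caret);
  return text.slice(start, end).trim();
}

// Replace only the current tag with the chosen option, leaving the rest alone.
function replaceCurrentTag(text, caret, chosen) {
  const { start, end } = currentTagRange(text, caret);
  // Keep a single space after the separator for every tag except the first.
  const prefix = start === 0 ? "" : " ";
  return text.slice(0, start) + prefix + chosen + text.slice(end);
}

// Caret sits just after "luc" (offset 9).
const typed = "java, luc, dojo";
const query = currentTag(typed, 9);
const updated = replaceCurrentTag(typed, 9, "lucene");
```

In a real widget, something like currentTag would feed the store query, and something like replaceCurrentTag would run when the user picks an option from the drop-down.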

See it in action in this short video clip.

Because I’ve based it on dijit.form.ComboBox, I also get a bunch of features for free. For one, the options it offers come from a data store, which can be backed by a REST API.

This supports paging, which means my REST API doesn’t have to return all of the tags – just enough to populate the visible bit of the drop-down list. I’m using Lucene to implement filtering in the REST API, so it can quickly return a subset of tags that matches what the user has started typing. I don’t need to download everything and filter it client-side – it can be smarter and more efficient than that.

That said, this might be overkill for some needs – you can easily create a client-side store in memory, without needing to write a REST API to back it.
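To give a feel for what filtering with paging means here, this is a rough plain-JavaScript sketch with no Dojo dependency. The tag list, the function name and the shape of the result are all assumptions for illustration. A REST API backed by Lucene would do the equivalent work server-side.

```javascript
// Hypothetical in-memory tag list; a real app would load or query these.
const ALL_TAGS = ["dojo", "java", "javascript", "jmx", "jms", "lucene", "uima"];

// Return one page of the tags that start with `prefix` (case-insensitive),
// plus the total number of matches so the caller knows if more pages exist.
function queryTags(tags, prefix, start, count) {
  const needle = prefix.toLowerCase();
  const matches = tags.filter((tag) => tag.toLowerCase().startsWith(needle));
  return { total: matches.length, items: matches.slice(start, start + count) };
}

// First page of two results for a user who has typed "j",
// then the next page when they scroll further down the list.
const firstPage = queryTags(ALL_TAGS, "j", 0, 2);
const secondPage = queryTags(ALL_TAGS, "j", 2, 2);
```

Only the requested page is returned each time, which is what keeps the drop-down cheap to populate even when there are a lot of tags.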

(more…)

Preventing Internet Explorer from using Compatibility View

Wednesday, August 15th, 2012

I’ve had some trouble with Internet Explorer recently.

I was making a new web tool which looked fine in all browsers. Except Internet Explorer, where it looked a bit squiffy.

Internet Explorer has “Compatibility View”. Compatibility View makes IE behave like the older versions of Internet Explorer, the ones before Microsoft started paying more attention to web standards.

It makes sense – there are a lot of websites out there that were written to render well on old versions of Internet Explorer, and Microsoft needed to make the move to standards compliance in a way that doesn’t break all of them.

The problem is, Compatibility View can be a little… insistent.

It kept turning on, even though I didn’t want it, even though my site worked fine in new shiny standards mode, and looked horribly broken in Compatibility View.

You can manually disable it, but I don’t want to have to make users do that. As the web developer, I want to be able to disable it – to tell IE that I want the site to be rendered in standards mode.

It was a bit fiddly. Here’s how I did it.

(more…)

Using JMX to monitor UIMA running in a servlet

Wednesday, August 1st, 2012

Overview

A quick howto for when you’re running UIMA in a servlet and want to monitor the performance of your AEs (analysis engines) using JMX.

Background

I’ve mentioned JMX before. Basically, a Java app can expose information and methods through a standard interface. Tools like jconsole, which come with Java, can then be used to monitor and administer the Java app.

UIMA (Unstructured Information Management Architecture) is an Apache project, providing a standards-based way to perform analytics on unstructured text. It hosts a pipeline of annotators: individual components each performing a specific text analytics task. As a document moves down the pipeline, UIMA runs each of the annotators on the document. Each annotator adds its own annotations for the things it looks for in the text.

UIMA and JMX

UIMA supports JMX. UIMA registers an MBean for each annotator, letting you see the performance info for each annotator. In a pipeline of several annotators, it lets you see (amongst other things) how much time your document is spending in each annotator.

jconsole

In a stand-alone UIMA application, you basically get this for free. Start the application with the standard Java -D property for enabling JMX:

-Dcom.sun.management.jmxremote

The application is then ready to let jconsole connect to it.

(more…)

Has today been a good day?

Monday, April 16th, 2012

Last week, I came up with a quick hack, explained quite neatly by @crouchingbadger:

It was a bit of fun, even if it did seem to convince a group of commenters on engadget that I was a rage-fuelled XBox gamer. 🙂

There’s one big limitation with the hack, though: I don’t spend that much of my day in front of the TV.

It’s interesting to use it to measure my reactions to specific TV programmes or games. But thinking bigger, it’d be cool to try a hack that monitors me throughout the day to measure what kind of day I’m having.

I don’t spend much time in front of the TV, but I do spend a *lot* of time in front of my MacBook. And it has a camera, too!

What if my MacBook could look out for my face, and whenever it can see it, monitor what facial expression I have and whether I’m smiling? And while I’m at it, as I’ve been playing with sentiment analysis recently, add in whether the tweets I post sound positive or neutral.

Add that together, and could I make a reasonable automated estimate as to whether I’m having a good day?

(more…)

Smile!

Tuesday, April 3rd, 2012

The visualisations on this page need Flash and JavaScript. Apologies if that means most of this page doesn’t work for you!

This is my mood (as identified from my facial expressions) over time while watching Never Mind the Buzzcocks.

The green areas are times where I looked happy.

This shows my mood while playing XBox Live. Badly.

The red areas are times where I looked cross.

I smile more while watching comedies than when getting shot in the head. Shocker, eh?

(more…)

Avoiding my Lucene TooManyClauses exceptions

Tuesday, March 20th, 2012

Before I start, I should point out that I’m not a Lucene expert. This post isn’t a definitive “you should do things this way” commandment from a Lucene mage. Think of it more as “I had this problem, and this seemed to work for me. I’m sharing it in case it helps you, too”.

I’m using Lucene to implement searches. Recently, as my Lucene index has grown (a lot), I was getting a lot of these errors when I tried to do a search:

org.apache.lucene.search.BooleanQuery$TooManyClauses: maxClauseCount is set to 1024
    at org.apache.lucene.search.BooleanQuery.add(BooleanQuery.java:163)
    at org.apache.lucene.search.BooleanQuery.add(BooleanQuery.java:154)
    at org.apache.lucene.search.MultiTermQuery.rewrite(MultiTermQuery.java:63)
    at org.apache.lucene.search.WildcardQuery.rewrite(WildcardQuery.java:54)
    at org.apache.lucene.search.BooleanQuery.rewrite(BooleanQuery.java:383)
    at org.apache.lucene.search.BooleanQuery.rewrite(BooleanQuery.java:383)
    at org.apache.lucene.search.IndexSearcher.rewrite(IndexSearcher.java:162)
    at org.apache.lucene.search.Query.weight(Query.java:94)
    at org.apache.lucene.search.Searcher.createWeight(Searcher.java:185)
    at org.apache.lucene.search.Searcher.search(Searcher.java:86)

I’m guessing that TooManyClauses is a common problem for people getting going with Lucene.

It’s mentioned in the FAQ, and there are a few Stack Overflow threads about it.

But I couldn’t find a straightforward “you need to follow these steps to fix it” post anywhere, so I’ll add my experience here.

(more…)

Migrating from ActiveMQ to WebSphere MQ

Tuesday, March 13th, 2012

Overview

A side-project I’ve been playing with in the evening: writing a JMX layer to allow apps written for ActiveMQ to migrate to WebSphere MQ with minimal modifications.

Background

This came out of working on something that uses a JMS messaging provider. It uses it internally to allow components to communicate with each other, even when spread across multiple machines.

It uses Apache ActiveMQ – an open-source implementation of JMS. I wanted to try and get it working using WebSphere MQ – IBM’s implementation of JMS that I used to work on until five years ago.

JMS is a messaging standard, and both ActiveMQ and WebSphere MQ (WMQ) are JMS providers, so the way the app puts and gets messages should just work.

But the JMS standard doesn’t cover administration (how queue managers are created and configured, how they’re started, how queues and topics are created, etc.) or monitoring (getting statistics about how many messages have been put or got, how many messages are on a queue, etc.)

All of this was done in ActiveMQ-specific ways. This was what needed to be ported if I was going to get this to work with WebSphere MQ.

The project I’m porting is actually a bit of a black box. Rather than make a significant rewrite to get it to go from being ActiveMQ-specific to WMQ-specific, I wanted to see what I could add so that as much of the existing code as possible could just work transparently.

I wanted to write a layer to sit between the ActiveMQ-app and WebSphere MQ, so that the app needn’t realise it’s not talking to the ActiveMQ broker it was written for.

(more…)

ETag header missing in HTTP No Content responses in Internet Explorer

Sunday, February 26th, 2012

If you’re one of that exceedingly rare breed who regularly check or subscribe to my blog, you probably want to give this post a miss. This one is more for people who find me through Google. A specific solution to a specific, geeky, problem.

Background

First a little scene setting…

Server side

I have a REST API that uses ETags for, amongst other things, concurrency control. That is, the version of an entity is (opaquely) identified by an ETag. You need to specify that ETag when you try to make any changes to that entity. If someone else changes the entity before you do, your ETag won’t match, so your update will fail, and you won’t unintentionally roll back their change.

The REST API returns no content (HTTP 204) in response to a successful PUT request to edit an entity, and includes the new ETag representing the version of the updated entity.

Client side

I have a Dojo web tool that uses xhr.put to submit edits to the REST API. In order to make subsequent edits to an entity without reloading the page, it stores the ETag that it gets back in the response header after every PUT.
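To make that flow concrete, here is a small in-memory sketch of both sides in plain JavaScript. It is a stand-in rather than my actual REST API: the class and method names are made up, the ETag is just a version counter, and it assumes the usual HTTP convention of rejecting a stale ETag with 412 (Precondition Failed).

```javascript
// Hypothetical in-memory stand-in for the REST API described above.
class EntityStore {
  constructor() {
    this.entities = new Map(); // id -> { body, version }
  }

  // The ETag is opaque to clients; here it is just a quoted version counter.
  static etagFor(version) {
    return `"v${version}"`;
  }

  create(id, body) {
    this.entities.set(id, { body, version: 1 });
    return EntityStore.etagFor(1);
  }

  // Mimics a PUT that carries the ETag the client last saw.
  put(id, body, clientEtag) {
    const current = this.entities.get(id);
    if (!current) return { status: 404 };
    if (clientEtag !== EntityStore.etagFor(current.version)) {
      // Assumption: a stale ETag is rejected with 412 Precondition Failed.
      return { status: 412 };
    }
    const version = current.version + 1;
    this.entities.set(id, { body, version });
    // No content, but the response carries the ETag of the new version.
    return { status: 204, etag: EntityStore.etagFor(version) };
  }
}

// Client side: keep the latest ETag after every successful PUT.
const store = new EntityStore();
let etag = store.create("doc1", "first draft");
const mine = store.put("doc1", "my edit", etag);     // succeeds
const stale = store.put("doc1", "other edit", etag); // fails, ETag is out of date
etag = mine.etag;
const next = store.put("doc1", "second edit", etag); // succeeds again
```

The important bit is the last three lines: without the new ETag from the 204 response, the client has no way to make its next edit without reloading the entity first.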

The problem

In short, Internet Explorer. 🙂

(more…)