MPs in the Media Mash-up

At the end of July, the Guardian held an internal hackday at its offices in King’s Cross. They invited me back, and Chris Lowis, another engineer from BBC Radio’s A&Mi department, came along with me. We teamed up with Leigh Dodds and Ian Davis from Semantic Web specialists Talis to produce an ‘Interactive-MP-Media-Appearance-Timeline’ by mashing up data from BBC Programmes and the Guardian’s website.

Before the event, Talis extracted data about MPs from the Guardian’s Open Platform API and converted it into a linked datastore. For every British MP, this store holds the Guardian articles in which they have appeared, a photo, related links and other data. Talis also provides a SPARQL endpoint for searching the store and extracting data from it.

Coincidentally, the BBC Programmes data is also available as a linked datastore. By crawling this data using the MP name as the search key, we were able to extract information about the TV and radio programmes in which a given MP had appeared. We then created a second datastore by combining these two datasets and pulling in some related data from DBpedia. Using this new datastore, we built a web application containing an embedded visualisation of the data.

We created the web app using Sinatra, a lightweight Ruby web framework. A simple RESTful URL schema provided access to a web page showing basic information about an MP.
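To illustrate what such a URL schema might look like, here is a minimal sketch in plain Ruby. It is not the code we wrote on the day: the `/mps/<slug>` paths, the `appearances.json` suffix and the helper names `slugify` and `resolve` are all assumptions made for the example, and in Sinatra itself the same idea would be written as route blocks rather than a hand-rolled matcher.

```ruby
# Hypothetical URL schema (an assumption for illustration):
#   /mps/<slug>                   -> HTML page about one MP
#   /mps/<slug>/appearances.json  -> that MP's media appearances as JSON

# Turn an MP's name into a URL-friendly slug, e.g. "Nick Clegg" -> "nick-clegg".
def slugify(name)
  name.downcase.gsub(/[^a-z0-9]+/, '-').gsub(/\A-|-\z/, '')
end

# Map a request path onto the resource it identifies.
# Returns nil when the path is not part of the schema.
def resolve(path)
  route = %r{\A/mps/(?<slug>[a-z0-9-]+)(?<json>/appearances\.json)?\z}
  match = route.match(path)
  return nil unless match

  { slug: match[:slug], format: match[:json] ? :json : :html }
end
```

With these helpers, `resolve("/mps/#{slugify('Nick Clegg')}")` identifies the HTML page for one MP, while any path outside the schema resolves to `nil`.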

Nick Clegg: a busy man in 2009


In addition, we queried the datastore for a list of all of an MP’s appearances across Guardian and BBC content. This was returned as a JSON document and passed to an embedded Processing Java applet. A Java applet may seem like an unusual choice in 2009, but Processing is well suited to the rapid development of responsive graphics applications, thanks to its integration with existing Java libraries and its powerful graphics framework.
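As a rough sketch of that step, the Ruby below merges Guardian and BBC appearances into one date-sorted JSON document. The field names (`source`, `date`, `title`, `url`), the shape of the input hashes and the function name `appearances_json` are assumptions for illustration; the actual document our app produced may have looked different.

```ruby
require 'json'
require 'date'

# Merge appearances from both sources into a single JSON document,
# ordered by date. Inputs are arrays of hashes with ISO 8601 date
# strings; this input shape is assumed for the example.
def appearances_json(guardian_articles, bbc_episodes)
  guardian = guardian_articles.map do |article|
    { 'source' => 'guardian', 'date' => article[:date],
      'title' => article[:headline], 'url' => article[:url] }
  end
  bbc = bbc_episodes.map do |episode|
    { 'source' => 'bbc', 'date' => episode[:date],
      'title' => episode[:title], 'url' => episode[:url] }
  end

  merged = (guardian + bbc).sort_by { |item| Date.iso8601(item['date']) }
  JSON.generate({ 'appearances' => merged })
end
```

Normalising both sources to the same keys means the consumer only has to branch on `source` when it picks which logo to draw.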

Leigh at Talis put together a screencast showing the app in action:

The Processing applet shows a month-by-month scrollable timeline. The user can move backwards and forwards in time, at variable speeds, by pressing the mouse button on either side of the applet frame. Each month slot displays a stack of media appearances, each represented by the logo of the relevant BBC brand or, for Guardian articles, the Guardian brand. Moving the mouse over a media appearance reveals the headline or programme description, and clicking it takes the browser to the episode page on /programmes or the article page on guardian.co.uk.
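The month slots amount to bucketing appearances by calendar month. The applet itself was written in Processing, so the Ruby below is only a sketch of the grouping logic, with an assumed input shape (hashes carrying an ISO 8601 `:date`) and a hypothetical function name, `group_by_month`.

```ruby
require 'date'

# Bucket appearances into "YYYY-MM" month slots, returned in
# chronological order so they can be laid out left to right.
def group_by_month(appearances)
  buckets = appearances.group_by do |appearance|
    Date.iso8601(appearance[:date]).strftime('%Y-%m')
  end
  buckets.sort_by { |month, _items| month }.to_h
end
```

Each value in the returned hash is the stack of appearances to draw in that month's slot.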

We demonstrated the application to the hackday audience, and at the prize-giving ceremony we were awarded the ‘Best use of third-party data’ award. We think the application demonstrates some of the ways in which the structured RDF data provided by the BBC’s /programmes website can be used. The project also shows how powerful the linked-data concept is when combined with other data that has been exposed in a similar way. As more media organisations expose their domains in this manner, more interesting and wide-reaching visualisations and web applications can be built.

Thanks to Chris Lowis for contributions to this post. Photos courtesy of Mat Wall.

Now also published on the BBC’s Internet blog