Feb 11

police.uk official crime maps — there should be a law against it

It’s always good when open data makes the headlines, albeit slightly for the wrong reasons today. Nonetheless, too much traffic to our website is a problem we’d all like to have. It shows public interest if nothing else. After all, who wouldn’t want an easy way to find out how much crime is on their street and in their neighbourhood?

But before we fall over ourselves to be grateful for this latest attempt at transparency we should exercise more than a little caution.

This won’t be news to anyone who thinks seriously about data, but a map is a visualisation, not the data itself. It’s one way of representing the underlying data. In as much as the data is accurate, complete and relevant, the police.uk website is simply giving us a single way to look at it that’s already been decided for us. No matter how often we’re reminded that the map is not the territory (and let’s be honest, most people have never heard that saying, let alone considered the issues in any depth), if you’ve only got the map it might as well be the territory. Psychologically, the two become conflated.

Perhaps apocryphally, Stalin said that it’s not who votes that counts but who counts the votes. Likewise, we should be hugely cautious about giving too much weight to official visualisations of data. As the policing minister Nick Herbert wrote today (my emphasis):

We live in the age of accountability and transparency. The public deserve to know what is happening on their streets, and they want action. By opening up this information, and allowing the public to elect Police and Crime Commissioners, we are giving people real power – and strengthening the fight against crime.

So what we’re looking at here isn’t a value-neutral scientific exercise in helping people to live their daily lives a little more easily, it’s an explicitly political attempt to shape the terms of a debate around the most fundamental changes in British policing in our lifetimes.

Transparency isn’t wrong. It’s absolutely vital to make a meaningful contribution to public debate, but we need to distinguish pseudo-transparency from the real thing. Spatial visualisation and analysis is enormously difficult to get right and even thoughtfully designed visualisations require a fair bit of understanding to interpret correctly. “Slap it on a map” works fine when you just want to see where your local recycling centres are, but as soon as you start to classify crimes by type and bound them into streets and neighbourhoods you’re into the realm of professional spatial analysis. You need to know what you’re doing and have access to tools that enable you to shift category and spatial boundaries to account for anomalous effects. To take one simple example of a relevant factor that a street-by-street ranking doesn’t account for: the newspapers that have run lists of the most crime-ridden streets in the country today might want to consider that longer streets will on average have more crime than shorter streets.
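To make the street-length point concrete, here is a minimal sketch in Python. The street names, lengths and crime counts are invented purely for illustration and are not taken from the police.uk data. It only shows how a ranking by raw count can be reversed once crimes are expressed per metre of street.

```python
# Hypothetical streets: (name, length in metres, recorded crimes).
# These figures are made up for illustration; they are not police.uk data.
streets = [
    ("Long Road", 2000, 40),
    ("Middle Street", 800, 24),
    ("Short Lane", 200, 10),
]


def rank_by_count(rows):
    """Rank street names by raw number of recorded crimes, highest first."""
    ordered = sorted(rows, key=lambda row: row[2], reverse=True)
    return [name for name, _length, _crimes in ordered]


def rank_by_rate(rows):
    """Rank street names by crimes per metre of street, highest first."""
    ordered = sorted(rows, key=lambda row: row[2] / row[1], reverse=True)
    return [name for name, _length, _crimes in ordered]


print(rank_by_count(streets))  # the longest street tops the list
print(rank_by_rate(streets))   # the shortest street tops the list
```

Street length is only one possible denominator. Footfall, resident population or number of premises could each give a different ranking again, which is rather the point.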

Whether police.uk is trying to pull a fast one on us or is simply naive about the possibilities for doing something meaningful for a general audience with this data, the result is the same: plenty of heat and very little light. Mark Monmonier’s How to lie with maps provides a good starter text for the myriad ways in which maps can deceive, intentionally and otherwise.

On a more positive note, we’re also getting the data itself to use. This is a good thing, in as much as the data itself is, as stated above, accurate, complete and relevant. Unfortunately, it’s not. It’s derived data that’s already been classified, rounded and lumped together in various ways, with a bit of location anonymising thrown in for good measure. I haven’t had a detailed look at it yet but I would caution against trying to use it for anything serious. A whole set of decisions have already transformed the raw source data (individual crime reports) into this derived dataset and you can’t undo them. You’ll just have to work within those decisions and stay extremely conscious that everything you produce with it will be prefixed, “as far as we can tell”.

£300K for this? There ought to be a law against it. Worse than useless, it’s thoroughly misleading. In future, we need fine-grained datasets for these kinds of applications and a big head start (six months?) between publishing official data and the commissioning of official expensive projects around it to ensure that everyone really understands what can and should be done with it.

Jan 11

TfL’s information doesn’t want to be free

I’m a big fan of London’s Barclays Cycle Hire scheme. I praised it when it was introduced, I created a free API service for developers to help them get live data about bike availability to make useful apps for people, I built a realtime 3D visualisation of bike availability and I even wrote a simulator to help me better understand bike movement patterns. I still think it’s a great system and I’m keen to do what I can to help people use it and to make it work better.

So when Boris announced that the scheme had just passed its one millionth journey milestone it seemed like a good time to ask Transport for London for the journey data. It’s an easy enough job: Just a single database query to fetch the times, origin and destination of each trip. If I could load this data into my simulator I might be able to see where extra bikes and docking stations might be needed. I put in a Freedom of Information Act request, confident that I’d have the data within the 20 working days limit required by law.

That was three months ago on 8 October. I’m still waiting.

The good news is that the data has just been made available in TfL’s developers’ area and some people are already starting to do interesting and useful things with it. But behind that happy fact is another example of a public body deciding to completely ignore its Freedom of Information Act responsibilities and the rights of an applicant in pursuit of its own perceived interests.

Data delayed is data denied

Under the law, public bodies have 20 working days to reply, either with the information requested or by claiming an exemption. The time limit is there for a good and obvious reason: without it, public bodies can string an applicant along indefinitely, and with many requests being time-sensitive the delay can often run past the point where the information would be useful.
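As a rough illustration of what that limit means in practice, the sketch below counts 20 working days forward from the date of a request. It assumes the 8 October request was made in 2010, treats only Saturdays and Sundays as non-working days unless bank holidays are passed in, and starts counting from the day after receipt. The function name `foi_deadline` and its parameters are my own, not anything defined by the Act.

```python
from datetime import date, timedelta


def foi_deadline(received, working_days=20, holidays=frozenset()):
    """Return the date of the Nth working day after `received`.

    Weekends are skipped, as is any date in `holidays` (an optional
    set of bank holidays supplied by the caller).
    """
    current = received
    remaining = working_days
    while remaining > 0:
        current += timedelta(days=1)
        # weekday() is 0-4 for Monday to Friday, 5-6 for the weekend.
        if current.weekday() < 5 and current not in holidays:
            remaining -= 1
    return current


# Assumed request date: Friday 8 October 2010.
deadline = foi_deadline(date(2010, 10, 8))
print(deadline.strftime("%A %d %B %Y"))
```

On those assumptions the reply was due in early November, roughly two months before the data finally appeared.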

Fortunately I didn’t have a specific deadline for using this data but it certainly would have been more useful to me sooner rather than later. I could have been working on it for two months by now. And if TfL had been keen for other developers to use it, they could have had it too. Some developers were keen to get hold of it for the Open Data Hackday on 4 December last year but that came and went without any sign of the data.

So why was the data delayed? I estimate that there would have been less than two hours work to produce it and send it to me, or to put it on an open website where anyone could download the file.

“Your free information is in this locked box. Sign this contract and if we like what you’re doing you can have it.”

The answer lies in TfL’s desire to wrap the data in a complicated contract rather than make it available to me or anyone else directly and legally unencumbered. This might make sense in the context of some data and some data users but it’s directly inimical to the aims and indeed the law of freedom of information. The data in TfL’s developers’ area isn’t open data and it’s not available to everyone. As the site says:

Please complete the registration form below to use our syndication feeds. Before we give permission to use any feeds, we need to know how they will be used, where they will be used and how many people are likely to view them.

So why should anyone have to apply for permission to get access to their freedom of information answer? Why not just send it to the applicant?

The Information Commissioner, who regulates public bodies’ compliance with the Freedom of Information Act, is quite clear that information must be supplied regardless of the identity and motives of the applicant. His guidance (PDF) states:

A request therefore has to be considered on the basis that it could have been made by any person; the identity of that person is not a material consideration when deciding whether or not to release information. It is for this reason that we do recommend as good practice that requests under obvious pseudonyms should normally be considered unless there is reason to think that any of the matters below need to be taken into account.

There follow some general exceptions regarding vexatious requests, people requesting their own personal information and costs issues, none of which apply in this case.

On the issue of the applicant’s motives:

There is also no specific reference in the FOIA to the principle that requests for information must be considered without reference to the motives of the requester.

However, there are no references in the Act indicating that anyone can be asked to provide a reason for requesting information and it is from this absence that the principle [of disregarding the applicant's motives] is drawn.

The Information Commissioner then quotes the Lord Chancellor’s code of practice on freedom of information:

Authorities should be aware that the aim of providing assistance is to clarify the nature of the information sought, not to determine the aims or motivation of the applicant. Care should be taken not to give the applicant the impression that he or she is obliged to disclose the nature of his or her interest as a precondition to exercising the rights of access, or that he or she will be treated differently if he or she does (or does not).

But if I want to get a response to my FOI request from TfL I am asked to enter into a contract with them whose terms include:

2.1.2 [You shall] only use the Transport Data in accordance with these Terms and Conditions and the Syndication Developer Guidelines, and not use such information in any way that causes detriment to TfL or brings TfL into disrepute. The rights granted to You under these Terms and Conditions are limited to accessing and displaying or otherwise making available the Transport Data for the purposes stated by You in Your registration.

So not only is TfL’s contract explicitly asking me to state my motive as a precondition of access, it also constrains me from using the information for any other purpose and arguably prevents me from using that information to criticise TfL, thereby causing it “detriment” or bringing it into “disrepute”. If I don’t agree to this they can deny access altogether, and if in their view I subsequently break the agreement they can revoke access. This is a funny kind of free information.

The Freedom of Information Act is designed to enable scrutiny of government. It’s inevitable that some information requested may cause embarrassment to the public body providing it or even bring it into disrepute. If the law is going to be workable at all, public bodies must consider each application on its merits alone without concerning themselves with the applicant or their motives. To do otherwise would allow public bodies to effectively pick and choose which requests they answered. TfL’s decision to require me to enter into an extremely restrictive contract with them to get a response to my freedom of information request is applicant and motive discrimination by the back door. It’s not something that should be tolerated from TfL much less adopted by other public bodies as a way to weaken FOI applicants’ rights. Free information should not come wrapped in a restrictive contract wall. That’s why I won’t be accepting TfL’s terms and I’ll simply have to leave the analysis of this Cycle Hire data in the very capable hands of others.

Aug 10

London Cycle Hire 3D Visualisation in Google Earth

I’ve used Google Earth and my Boris Bikes API, which serves live data about bike and docking station availability, to create a 3D visualisation that shows the current bike availability across London.

Movie by Andrew Hudson-Smith, Digital Urban/UCL CASA



Aug 10

Sutton pedestrian crossings proposed for removal by TfL

Transport for London are proposing to review and possibly remove 145 traffic lights and pedestrian crossings across London.

Here’s a map I’ve made of the five crossings in the London Borough of Sutton that are under review.

Download the map data as KML for Google Earth etc.

Councillor Lester Holloway is campaigning to retain the crossing at Collingwood Road / Bushey Road as has been reported in the Sutton Guardian and on his blog.

Aug 10

Boris Bikes — A gift to the city


If you’ve ever wanted to whistle up a pair of wheels while walking around London, now you can. Friday’s launch of the Barclays Cycle Hire scheme puts 6000 short-hire bikes at 300 docking stations within a few hundred metres of any point in the centre of the city. No matter where you are, you shouldn’t be more than a few minutes’ walk from a hire bike.
