Developers are great but…

Doing wonderful things with data, whether creating apps that everyone can use to skip seamlessly through their lives or educating and revealing information by linking the data, is always going to be awe-inspiring, useful and needed. We know this, hence the real revolution in the way the developer community is being trusted to help government open data in a useful and appropriate way.

But equally there is another benefit to having people freely playing with data: seeing what they are doing with it, and why.

Take, for example, the fact that two of the apps developed independently of each other at Young Rewired State were for finding safe routes to school. This tells us more than just: oh, there’s a clever app, let’s talk to the IT people and data people to get this live as a government service. It tells us that young people do not feel safe going to school, and that in a group of 50 people aged under 18, two groups chose to give up their weekend to try to develop a solution to this. (That’s quite a high proportion.)

To any business, organisation or government, this is extremely useful information. The solution is not the app, though that might form part of it. What the development of such an app tells us is that there is a fundamental, very clearly defined problem that needs some attention.

I could go on to give countless examples, but I know that you are all brilliant enough to think through the implications of this for yourselves, and to see why I think it is important that those beyond the geek community keep a very close eye on what comes out of making data available.

On that note, I am hoping to get some of the gen on the apps being created behind the closed Beta at data.gov.uk, as I suspect that there are some early lessons we can all take from this. And when they do open it all up, please take time to look through what has been done, and see what clues you can find to making your own businesses better – in and outside of government.

6 responses

  1. Pingback: Stuff I’ve seen November 4th through to November 5th | Podnosh

  2. Opening up data seems an inherently good thing, but what about the risks you’re taking on when using the data, particularly in a context that it wasn’t collected to be used for?

    As Philip Virgo pointed out last week:

    assume random errors rates of up to 10% on original data entry unless the material was entered and checked by those with a vested interest in its accuracy and with the knowledge and authority to ensure that errors were identified and corrected. We were also told to assume that it would subsequently degrade at about 10% per annum unless actively used and updated by those with the knowledge and ability to update the files.

    Is there some sort of implicit assumption that opening up the data will force a general clean-up and improvement in quality? In the meantime, maybe there needs to be a way of marking the relative dodginess of a source of data (e.g. if you know that even 10% of records may be wrong, you can judge what margin to add in when making decisions based on an analysis of the data).

    Kind of tangential to your post, but I’d be interested to hear what you think?

  3. Pingback: #Opendata and poor quality control = trouble? « Spartakan

  4. I was a developer for years at a new media company, and even in a small company I had access to vast amounts of information. I worked primarily on public sector websites.

    I fully understand your point and am happy to discuss further with you – but for the most part developers are too busy getting on with their job to be distracted by the data. It does give you an interesting insight into people’s lives, and in particular the passwords people use.

  5. My guess is that the problems of data quality would just be highlighted in the appropriate way. Accuracy doesn’t come from the automated solutions you build with a dataset; it comes from humans interfacing with the world: humans assess accuracy (comparing the representation to the real world entity), produce accuracy (are the ones who can be expected to do so at the point of capture, as they confront the real world entity being represented), correct for accuracy, etc.

    So we discover that apps built on data not designed for the purpose (and that is certainly an important issue; I speak here only of the attribute of accuracy, which is far and away the single most important one) produce not-so-good results. The outcome is a recognition of the role humans will have to play in producing the information, not a recognition that apps can’t be developed from data not designed for their purpose.

    All the other aspects of data quality do relate to data design and fitness for purpose, but the most important one is the one the nature of which we keep forgetting: accuracy — and confronting that problem has to produce the right understanding: engage humans in the role of producing accurate data. This might be the key characteristic of a new sector of employment in the new economy.

  6. “The data contain too many errors or are too out-of-date for public use.” This becomes a self-fulfilling prophecy. There is no pressure to correct errors in unpublished data, so they remain too inaccurate to publish. The remedy is publication, warts and all. If better data would be valuable, people will press for them.
