Opening the Door on Data: The Importance of Good Numbers – Part 2
Posted On: September 3, 2013 | By សំឡេងទីក្រុង

This is the second part of a three-part article about good data. You can find the first part here: Opening the Door on Data: The Importance of Good Numbers – Part 1

Quality versus Quantity

It is easy to be charmed by the idea of a lot of data. With a lot of data you can do quite a bit. For Urban Voice it means we can add a lot of reports to our map and draw some interesting conclusions from them. For our campaigns, a large number of reports would spell success in most people’s eyes. Take, for example, a fictional campaign about how many houses in Phnom Penh have red doors. Because we only want reports about red doors, and not doors of any other colour, we might presume that every report we receive is about a red door. Why would anyone report a door of any other colour?

Blue and Red Doors. Photo by Michael Osmenda

The problem is that not every report about doors is going to be about red doors. Maybe someone misunderstood and sent a report about a blue door. Or maybe someone just really likes their orange door, so they sent a report about that. Then there might be someone who has a door that is both red and blue, and they reported that. So Urban Voice has a lot of reports, but they might not all be correct. The Urban Voice Team then has to go through all of the data (that is, the reports) to make sure it’s correct. But verification of reports is not always easy. If all a particular report says is “There is a red door at this address,” and no picture is provided, then Urban Voice has little real information to go on, and without physically visiting the place the Team cannot verify the information.

That’s because the quality of the data in the report is poor or incomplete. Quality information is not always easy to identify. In the example of the fictional campaign recording all the red doors in Phnom Penh, a good report includes both the address and a photo of the red door. The best possible data you could provide along with an address would be a photo that’s been correctly located on a map so the Urban Voice Team can accurately locate the door, and thus verify the data.
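
To make the idea concrete, here is a minimal sketch of how reports could be sorted by completeness. It is only an illustration, not a description of how Urban Voice actually processes reports: the `DoorReport` type, its field names, and the three quality labels are all assumptions invented for this example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class DoorReport:
    # All field names are hypothetical, chosen for this illustration only.
    address: Optional[str] = None
    photo_path: Optional[str] = None                    # attached photo, if any
    map_location: Optional[Tuple[float, float]] = None  # (latitude, longitude) of a map pin


def report_quality(report: DoorReport) -> str:
    """Grade a report by how much verifiable information it carries."""
    has_address = bool(report.address and report.address.strip())
    has_photo = bool(report.photo_path)
    has_pin = report.map_location is not None

    if has_address and has_photo and has_pin:
        return "best"        # address plus a photo located on the map
    if has_address and has_photo:
        return "good"        # address plus a photo
    return "incomplete"      # cannot be verified without more information
```

A report with only an address is graded "incomplete", which matches the situation described above: without a photo, the Team would have to visit the place to check it.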

Definitions and Methodologies

As mentioned in the Phnom Penh Post article, “keyword definitions and methodology problems” are persistent in Cambodia. While these are issues that anyone, regardless of location, must overcome when collecting data, they are a particular problem for crowdsourcing and crowdmapping.

Going back to the red door campaign, the keyword definition that people may dispute is the colour red. There are many shades of red. What one person calls orange, another may call red. Someone else might call a door brown, while another considers it red. We all have our own idea of what red is, and that idea determines whether or not the data provided is relevant.
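
One way to see how much a definition matters is to write one down. The sketch below treats "red" as a range of hues around pure red, using Python's standard `colorsys` module. The function name `is_red` and every threshold in it are assumptions picked for illustration; a real campaign would have to agree on its own definition.

```python
import colorsys


def is_red(rgb, max_hue_deg=15.0, min_saturation=0.5, min_value=0.3):
    """Return True if an (R, G, B) colour with 0-255 channels counts as red.

    The thresholds are illustrative assumptions, not an agreed standard.
    """
    r, g, b = (channel / 255.0 for channel in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    hue_deg = h * 360.0
    # Pure red sits at 0 degrees on the hue wheel, so accept hues close to
    # 0 from either side (for example 350 degrees is as close as 10).
    near_red = hue_deg <= max_hue_deg or hue_deg >= 360.0 - max_hue_deg
    return near_red and s >= min_saturation and v >= min_value
```

With the default thresholds, an orange such as `(255, 140, 0)` or a brown such as `(139, 69, 19)` is not red. Widen the definition with `max_hue_deg=35` and both are suddenly counted, which is exactly the kind of disagreement described above.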

Color scales for mapping

Methodology is about what data is collected and how. It is influenced by the keyword definitions, but can also influence them. In the fictional campaign for red doors in Phnom Penh, the methodology could be extremely specific, requiring that only doors viewable from the street be reported, or very open, allowing any red door to be reported. In each case, the result will be significantly different.
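
A small sketch shows how the two methodologies can give different results on the same doors. The list of doors below is made up purely for illustration, and the field names and the `count_red_doors` function are assumptions for this example.

```python
# Made-up sample data for illustration only, not real survey results.
doors = [
    {"id": 1, "colour": "red", "visible_from_street": True},
    {"id": 2, "colour": "red", "visible_from_street": False},
    {"id": 3, "colour": "blue", "visible_from_street": True},
    {"id": 4, "colour": "red", "visible_from_street": False},
]


def count_red_doors(doors, street_only):
    """Count red doors under a strict (street_only=True) or open methodology."""
    return sum(
        1
        for door in doors
        if door["colour"] == "red"
        and (door["visible_from_street"] or not street_only)
    )


strict_count = count_red_doors(doors, street_only=True)   # 1
open_count = count_red_doors(doors, street_only=False)    # 3
```

The same four doors give one red door under the strict methodology and three under the open one, so any conclusion drawn from the count depends on which methodology was used.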

A large part of methodology is also how you record the data and report it. Given that the house numbers on any street in Phnom Penh are not in order, nor are they necessarily unique, simply providing an address is not enough. This means that the collection methodology may need to say that a cross-street should be provided or an indicator placed on the map when submitting a report.

Urban Voice Cambodia
