Making sense of online data: tidy shelves and good coffee
If you’ve ever wandered through a small bookshop filled with old books in stacks that seem to have no order or logic, then you understand some of the challenges of using open data on the web.
Imagine if bookshop owners had hundreds of rare and valuable books, but no shelves. Or unlit stores. Imagine if you could not get to the books because the shop door was stuck and the racks were unmarked and the store was only open three hours per day every three days. The process of getting the books to the people isn’t finished when you’ve stocked your store. People need a way in and a way around.
Book collectors may appreciate the adventure of squatting on the floor hunting through dusty piles to find the right book, but the average web user lacks the tools or the training to turn digital piles of data into a small library of the information that matters most.
The growing community of transparency advocates and practitioners is widening its focus from the “What if…?” of greater government openness to the “What now…?” of making data more usable and easier to share. Until recently, financial or regulatory data secured by transparency legislation was probably available only in paper form, in the office of a clerk in the federal capital, hundreds of miles from the communities whose economic survival depends on the money in question.
As legislation becomes more tech-savvy, disclosure requirements have called for public release of data on the web. This is a big improvement on standard practice, but it is less useful than one would expect if the figures are released only as PDF files over two megabytes, usable only by citizens with easy computer access and Internet connections fast enough to download big files.
Even in a best-case scenario where a government agency does release data–not just onto the web but “into the cloud,” by publishing information in digital form, displayed directly onto web pages and in a format that computers can read and repackage for new uses–most web pages with government data still look like the cluttered bookstore. They lack a well-organized layout for getting around, there is no annotated catalogue organizing the inventory, and neglect has resulted in broken areas and intermittent access.
An important goal of our bridge-building between transparency advocates and technology groups is to promote better standards for data usability, standards that include not only good organization of websites, but good data formats that enable redistribution, good presentation that does not overwhelm users or distract them with extraneous visual elements, and simple explanations of the information provided.
As we collect and generate the data that fuels advocacy, we should maintain a focus on the elements of effective dissemination and persuasion. Some of these fundamental principles for good data distribution include:
Make it easy to use. Picture the difference between a web page of airfares that you can sort by price or departure time and a Microsoft Excel file of airfares that you could use one hundred different ways, if only you knew how. The presentation and structure of your information become the story that your information tells.
Sometimes “usability” means layout, sometimes it means aesthetics, sometimes it means which tools you use. Try to have an interface that is “persuasive” and not “dissuasive” in how it ushers people through the information. (Credit to the team at @demo_cratica for the term “dissuasive interface.”)
To dig deeper, read about Edward Tufte, FlowingData and the problems with infographics, and see this smart guide from John Emerson and Open Society Foundations.
Make it easy to copy and re-use. Too often advocacy groups that succeed in extracting transparency data do not follow through and repackage the data into simple spreadsheets for download, distinct data sets each with their own web URLs, or “widgets” that allow other groups to use the data in other ways. If open government means too little without mechanisms of accountability, then open data falls short without follow-through that enables easy sharing.
To dig deeper, read suggested principles for government transparency from former US CIO Vivek Kundra and the Open Knowledge Foundation. Also see a sample manifesto for open data shared by Andrew Rasiej, and one simple tool for promoting distribution, Facebook’s Like button.
Give people context; words matter. It’s common sense, but the simple step of explaining the data to non-experts is often neglected in data dissemination. This effectively shuts out not only users from affected communities, but also many leaders and advocates with the power to help create change if the story behind the data were made clear to them.
As data proliferates across the Internet, the role of the “data journalist” is increasingly recognized as vital to advocacy, and tools like DocumentCloud have emerged as “infomediary” mechanisms to make big data sets and obscure policy information more comprehensible. Before the New York Times published the WikiLeaks cables, for instance, journalists spent days poring over their contents in order to create the scaffold of rudimentary summaries attached to each cable posted on NYTimes.com.
To dig deeper, also check out some tips on the fundamentals of writing for web sites and emails from the Guardian, O’Reilly Radar, UX Magazine, Free Range Studios, M+R Research Labs and Madeline Stanionis.
Start with what’s relevant. There’s a reason why 89% of people seeking local information in the U.S. want weather information (more than look for breaking news, politics, traffic, or restaurants). People are motivated by what is most relevant to their lives. That’s why Ushahidi (or Craigslist) is so frequently used, and why so many online tools ask for your postal code or ZIP code.
For inspiration on making data personal and local, check out FixMyStreet, WhereDoesMyMoneyGo, IfItWereMyHome and Recovery.gov.
In the end, data doesn’t teach people; people teach people. Amid our continuing struggle to wrest data from secretive or under-resourced governments and get it online somewhere, the open data community can lose track of the importance of “accessibility” in the more figurative sense.
These less technical demands of data management–web design and architecture, methods for republishing and sharing, citizen-friendly explanations and narratives that create relevance–are what turn our open data bookstore from a dimly lit firetrap that smells like cat litter into a neighborhood institution with coffee, comfortable chairs, and friendly clerks: the kind of place you want to visit again, and where you might someday bring your own books to share with the community.
Without proper presentation and “hospitality,” it is much harder to turn data into information, and opportunities are lost.
Originally posted at Bridging Technology and Transparency. Photo: MorBCN/Flickr