CHAMPAIGN, Ill. - President Barack Obama's pledge to make his presidency the most open and accountable administration in history could come true if his administration embraces open data formats to make government information accessible to all, says a University of Illinois expert in information science.
According to Michael Twidale, a professor in the Graduate School of Library and Information Science, Obama's commitment to making more government data available to the public through the Internet than his predecessors did is a "progression of the great experiment of American democracy."
"With the Freedom of Information Act, you request information in writing, and people will physically go and look it up, copy all the relevant information, and then mail it to you," Twidale said. "That's definitely better than nothing, but creating a 'Google for government' by putting all non-secret government data online so anyone can search for what they're looking for is a much faster and more efficient solution."
Increasing the amount of information that's available, and making it available at any hour of the day, any day of the week, will have a considerable effect on government accountability, Twidale said, because the governed will now "be able to kick up a fuss earlier rather than later."
"For example, when government contracts and bidding processes are made available for download with just a few clicks of a mouse, you get more eyeballs looking at these documents, which will lead to someone noticing earlier when things look fishy, which also means something that looks untoward gets kicked up to the mainstream media a lot earlier," Twidale said.
"If you have a free press, the truth eventually comes out, but it can be years after the fact, when it's invariably too late to fix."
One of the simplest and most cost-effective ways that the Obama administration can increase "openness" in government, Twidale said, is to publish all official data and documents in open, machine-readable file formats.
An open file format, Twidale said, is one whose specification is public and fully documented, and that has no patent or copyright restrictions limiting its use. Proprietary file formats, on the other hand, are usually controlled and defined by private commercial interests and are often unreadable to users who don't have the correct operating system or software.
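To illustrate what "machine-readable" can mean in practice, here is a minimal Python sketch. It assumes a hypothetical set of contract records published as CSV (comma-separated values), a plain-text format with a public specification. The agency names, vendors, amounts and the function name `total_by_agency` are all invented for illustration and do not come from any real government dataset. The point is only that such a file can be read with a language's free standard library, without any particular vendor's software.

```python
import csv
import io

# Hypothetical sample data: invented agencies, vendors and amounts.
SAMPLE_CSV = """agency,vendor,amount_usd
Transportation,Vendor A,125000
Transportation,Vendor B,98000
Parks,Vendor A,40000
"""


def total_by_agency(csv_text):
    """Sum the contract amounts for each agency found in the CSV text."""
    totals = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        agency = row["agency"]
        totals[agency] = totals.get(agency, 0) + int(row["amount_usd"])
    return totals


if __name__ == "__main__":
    for agency, total in total_by_agency(SAMPLE_CSV).items():
        print(agency, total)
```

Because the file is plain text in a documented format, the same records could be read just as easily by a spreadsheet, a database or a program written in another language.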
Historically, both business and government have chosen proprietary software and proprietary file formats created by corporations over their open source equivalents (think Microsoft Office versus OpenOffice, for example) for market-based reasons: Corporate behemoths and their wares are seen as ubiquitous and too big to fail, and if bought in bulk, "it's going to be something where the company is invested in that partnership as well," he said.
But for the purposes of public access to data, "you don't want to be in a situation where the people have to keep paying money simply to have access to data that's rightfully theirs," Twidale said.
That's why some European governments are already starting to use open file formats - not just because the formats are potentially more reliable and the cost of ownership may be lower, but because, for many of them, "it's also an ideological battle in that they want the data to be in a free format so that all citizens can have access to it," Twidale said.
Issues of security and national pride can also ratchet up the political pressure for foreign countries to consider open formats, Twidale said.
"Imagine if the U.S. stored its data in software formats controlled by a German or a Chinese company, so that Americans had to buy software from Germany or China in order to access their own data," Twidale said. "We're fortunate that most major software companies are based in the United States, so we don't have to worry about that issue. But things can change fast in the software industry, through foreign takeovers or innovative overseas startups exploding into the market."
An added benefit to saving information in an open, machine-readable format is that it allows anyone with a little bit of programming acumen to create so-called Web "mashups," where raw data from two or more sources are combined, processed and then visualized using various freely available online services and applications such as Google Maps to create a single integrated Web application.
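The combining step at the heart of a mashup can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the method used by any real site: the two data sources, the area names, the coordinates and the function name `build_map_points` are all hypothetical. The sketch counts public reports per area from one source, attaches coordinates from a second source, and produces a list of points that a mapping service could then display.

```python
from collections import Counter

# Hypothetical source 1: public reports, each tagged with an area name.
REPORTS = [
    {"area": "Area A", "kind": "pothole"},
    {"area": "Area A", "kind": "streetlight"},
    {"area": "Area B", "kind": "pothole"},
    {"area": "Area C", "kind": "graffiti"},
]

# Hypothetical source 2: coordinates for each area (invented values).
AREA_COORDS = {
    "Area A": (40.10, -88.20),
    "Area B": (40.12, -88.25),
}


def build_map_points(reports, coords):
    """Join report counts with coordinates, one point per known area."""
    counts = Counter(report["area"] for report in reports)
    points = []
    for area, count in sorted(counts.items()):
        # Skip areas that the second source has no coordinates for.
        if area in coords:
            lat, lon = coords[area]
            points.append({"area": area, "lat": lat, "lon": lon, "reports": count})
    return points


if __name__ == "__main__":
    for point in build_map_points(REPORTS, AREA_COORDS):
        print(point)
```

Neither source says much on its own; the join on the shared area name is what turns two plain lists into something that can be placed on a map.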
Twidale cites Chicago Crime, the precursor to the EveryBlock suite of Web sites, as one such example.
"Chicago Crime is a classic mashup that takes raw data - in this case, crime reports - and combines it with Google Maps to give the data a more powerful visual presentation," Twidale said. "But you could also plug in other sources of data, as the EveryBlock sites do, to create a better picture of what's going on."
When information is freely available in an electronic form, people can "discover it, talk about it, process and analyze it," Twidale said.
"With more sophisticated mashups, you could also say, 'What correlates with crime? Is there surprisingly less crime in certain areas, and why?' There's a potential not just to visualize data, but to combine that data with other data to learn more about the fundamental causes of societal problems, and lead us to theorize about how we might improve the world."
If people have free, unfettered access to data, "they can combine it with other data in new ways that we've never even thought of, which can lead to new innovations and whole new kinds of industries," Twidale said.
"This can lead to new representations and even new uses for the data. And with some of those innovative ideas, the government may say, 'Hey, we really want to adopt that.' So they'll not just be outsourcing information, they'll be outsourcing innovation as well. We've seen this in the corporate sector, but not as much in government."