Authors: Aaron Swartz
April 14, 2008
As you've probably noticed, it's political insanity season in the U.S. I can hardly go outside these days without running into someone complaining about the latest piece of campaign gossip. I've mostly tried to keep it off this blog, but it's hard to not get swept up in the fever. As someone who wants to make a difference in the world, I've long wondered whether there was an effective way for a programmer to get involved in politics, but I've never been able to quite figure it out.
Well, recent events and Larry Lessig got me thinking about it again and I've spent the past few months working with and talking to some amazing people about the problem. I've learned a lot and must have gone through a dozen different project ideas, but I finally think I've found something. It's not so much a finished solution as a direction, where I hope to figure more of it out along the way.
So the site is called
and the plan has three parts. First, pull in data sources from all over (district demographics, votes, lobbying records, campaign finance reports, etc.) and let people explore them in one elegant, unified interface. I want this to be one of the most powerful, compelling interfaces for exploring a large data set out there.
But just giving people information isn't enough; unless you give them an opportunity to do something about it, it will just make them more apathetic. So the second part of the site is building tools to let people take action: write or call your representative, send a note to local papers, post a story about something interesting you've found, generate a scorecard for the next election.
And tying these two pieces together will be a collaborative database of political causes. So on the page about global warming, you'll be able to learn more about the problem and proposed solutions, research the donors and votes on the issue, and see or start a letter-writing campaign.
All of it, of course, is free software and free data. And it's all got a dozen different APIs to make it easy for others to build on what we've done in their own work. The goal is to be a hub, connecting citizens, activists, organizations, politicians, programmers, and everybody else who's interested in politics.
The hope is to make it as interesting and easy as possible to pull people into politics. It's an ambitious goal with many pieces and possibilities, but with all the excitement right now we want to get something up as fast as possible. So we'll be developing live on
, releasing pieces as soon as we finish them. Our first goal is to put up data about every representative and a way to write them.
I've managed to find an amazing group of people willing to help out with building it so far. And the
has encouraged me and graciously agreed to fund it. But we still need many more hands, especially programmers. If you're interested in working on it, whether as a volunteer or for pay, please email me, telling me what you'd like to help with.
July 3, 2012
The open data movement is a hammer which has gathered the support of many nails. There are the curious taxpayers, who feel their annual checks mean they deserve a peek at the interesting facts the government has collected. There are the ambitious business owners, who see an opportunity to privatize profits from work with socialized costs. And there are the self-styled activists, who believe that if we reveal the data on what the government is really doing, we will arrest corruption by exposing it to sunlight.
The coalition is a confusing mix of these very different motivations (as Tom Slee observes), and the benefits of such a tactical alliance have come with the cost of some confusion. So let's be clear about what open data can and cannot do.
If the St. Louis Fed publishes reams of economic data, it can certainly make it easier for Mr. Yglesias to make his fantastic charts. If the MTA makes real-time subway information public, it can certainly let Mr. Ernst
improve his fantastic app
. And, as the talented Mr. Lee pointed out to me, his careful collection of data about members of Congress and the bills they're passing can be an invaluable resource for professional activists.
So, if I got to choose whether the government should share the data it's collected, I'd happily vote yes. In fact, I spent several years of my life using the FOIA laws to force it to do just that. I can't claim my work had any particular impact, but as a curious taxpayer, it was a weirdly enjoyable hobby.
But the open data movement often claims to be much more than
that. They insist open data will not just help a few people with their jobs or a few kids with their hobbies but,
as the Sunlight Foundation puts it
, “make government transparent and accountable.” And that I just don't see.
I've outlined my theory why elsewhere
, but the short version is pretty simple: people hide their crimes. Imagine you learn lots of bribes are exchanged at the top of the Capitol Reflecting Pool. So you lobby Congress hard to set up bright lights and a camera to catch the perpetrators. The video would be live-streamed to the Internet so dedicated watchdogs can name and shame the bribe-taking politicians. Your lobbying succeeds and, on January 1st, the lights go up and the cameras switch on.
But as an engaged citizenry tunes in, there is nothing but disappointment. Nobody seems to be taking bribes; just a couple pieces of litter blowing by the pool.
Was Congress really squeaky clean after all? Of course not: the bribes just moved to the other end of the pool, out of the spotlights.
When you have time to prepare, it's pretty easy to disguise the data. And this is exactly the pattern we've seen. It's always been investigative journalism, not data mining, that's revealed the big scandals about politicians. I, more than anyone, would love to believe that the next great Watergate is just lying in plain sight to be uncovered by a swashbuckling econometrician, but the sad fact is, it simply isn't so.
But it's also worth pausing to ask: what was any of this supposed to achieve? Imagine, for some strange reason, members of Congress didn't bother avoiding the spotlight. Every day, we saw them, in full HD video, taking money from prominent businessmen. Do we really think even this (far-fetched) instance of transparency would change much? After all, most Americans already think Congress is corrupt. Most Americans think money actually buys politicians' votes. Seeing it happen in video might be striking, and maybe make for some good segments on the evening news (or, these days, some viral YouTube videos), but would it really change anything?
After a couple weeks of chatter, and perhaps a few grandstanding legislative proposals, I suspect it'd just fade into the background. More dramatic examples are not exactly what's most missing from
the reform debate;
Lessig's recent book
has enough to last us a couple decades. Structural reforms have failed because of the incompetence of reformers, not because there's a lack of evidence that there's a problem. (Free tip to structural reformers: get state legislators to sign on to your constitutional amendment. They're very susceptible to public pressure, there's a lot of them [so you'll have a constant narrative of progress], and they're the ones you'll ultimately need to actually pass the amendment.)
But maybe open data was supposed to improve politics in other ways. Structural reform is an ambitious goal; maybe the open data proponents wanted something much more modest. But all the more modest stories suffer from a similar excess of naïveté. Whenever geeks turn their eyes to politics, they always have the same reaction: There's so much inefficiency! And they naturally propose the obvious ideas for reducing it. For example: If only it was easier for citizens to read bills, citizens with relevant expertise could assist Congress by sharing their hard-earned wisdom!
The fact is, Congress isn't interested in availing itself of your wisdom any more than the sausagemaker needs your help tidying the floor. Lawmaking is
. It's about blood and war and power, not evidence and argument and policy. (I have one friend who was startled to learn that when members of Congress debate an issue on C-SPAN, they're speaking not to each other but to cameras in a largely empty room.)
I don't want this to sound overly harsh. The truth is, it's really hard to do effective philanthropy. With a little work, you could mount a similar critique of the vast majority of our bumbling efforts to do good. Most ideas for helping people that seem reasonable in the abstract turn out to fall apart upon close confrontation with reality. The real question is what happens then. There's no shame in admitting your mistakes, learning from them, and trying again. Indeed, as my old professor Carol Dweck
, that's the only real route to success. But most of us are too vain or too proud to take that route. We insist that the purity of our intentions reduces the need for careful scrutiny of our effects. Or we try to make ourselves feel better by grasping at any factoid that suggests we had an impact.
I have no particular interest in correcting people's pride or vanity.
This movement is populated by my friends and I respect them enormously and wish them well. Throwing darts at their day jobs has only made my life worse. But this stuff matters: funders and volunteers face tough choices about which causes to pursue. It's important that they know the case for opening up data to hold government accountable simply isn't there. (And that they should
invest in metaresearch, including open scientific data, instead
.) It's nothing personal, just trying to help everyone do their best. I dearly hope that if anyone ever has a similar critique of the causes I pursue, they will
be even more blunt
in pointing out my folly.
The following essay appears in the new O'Reilly book
and attempts to combine and clarify some of the points I made in previous essays. It was written in June 2009.
Transparency is a slippery word; the kind of word that, like “reform,” sounds good and so ends up getting attached to any random political thing that someone wants to promote. But just as it's silly to talk about whether “reform” is useful (it depends on the reform), talking about transparency in general won't get us very far. Everything from holding public hearings to requiring police to videotape interrogations can be called “transparency”; there's not much that's useful to say about such a large category.
In general, you should be skeptical whenever someone tries to sell you on something like “reform” or “transparency.” In general, you should be skeptical. But in particular, reactionary political movements have long had a history of cloaking themselves in nice words. Take the Good Government (goo-goo) movement early in the twentieth century. Funded by prominent major foundations, it claimed that it was going to clean up the corruption and political machines that were hindering city democracy. Instead, the reforms ended up choking democracy itself, a response to the left-wing candidates who were starting to get elected.
The goo-goo reformers moved elections to off-years. They claimed this was to keep city politics distinct from national politics, but the real effect was just to reduce turnout. They stopped paying
politicians a salary. This was supposed to reduce corruption, but it just made sure that only the wealthy could run for office. They made the elections nonpartisan. Supposedly this was because city elections were about local issues, not national politics, but the effect was to increase the power of name recognition and make it harder for voters to tell which candidate was on their side. And they replaced mayors with unelected city managers, so winning elections was no longer enough to effect change.
Of course, the modern transparency movement is very different from the Good Government movement of old. But the story illustrates that we should be wary of kind nonprofits promising to help. I want to focus on one particular strain of transparency thinking and show how it can go awry. It starts with something that's hard to disagree with.
Sharing Documents with the Public
Modern society is made of bureaucracies and modern bureaucracies run on paper: memos, reports, forms, filings. Sharing these internal documents with the public seems obviously good, and indeed, much good has come out of publishing these documents, whether it's the
National Security Archive
, whose Freedom of Information Act (FOIA) requests have revealed decades of government wrongdoing around the globe, or the indefatigable
Carl Malamud and his scanning
, which has put terabytes of useful government documents, from laws to movies, online for everyone to access freely.
I suspect few people would put “publishing government documents on the web” high on their list of political priorities, but it's a fairly cheap project (just throw piles of stuff into scanners) and doesn't seem to have much downside. The biggest concern, privacy, seems mostly taken care of. In the United States, FOIA and the Privacy Act (PA) provide fairly clear guidelines for how to ensure disclosure while protecting people's privacy.
Perhaps even more useful than putting government documents online would be providing access to corporate and nonprofit records. A lot of political action takes place outside the formal government,
and thus outside the scope of the existing FOIA laws. But such things seem totally off the radar of most transparency activists; instead, giant corporations that receive billions of dollars from the government are kept impenetrably secret.
Generating Databases for the Public
Many policy questions are a battle of competing interests: drivers don't want cars that roll over and kill them when they make a turn, but car companies want to keep selling such cars. If you're a member of Congress, choosing between them is difficult. On the one hand are your constituents, who vote for you. But on the other hand are big corporations, which fund your reelection campaigns. You really can't afford to offend either one too badly.
So, there's a tendency for Congress to try a compromise. That's what happened with, for example, the Transportation Recall Enhancement, Accountability, and Documentation (TREAD) Act. Instead of requiring safer cars, Congress simply required car companies to report how likely their cars were to roll over. Transparency wins again!
Or, for a more famous example: after Watergate, people were upset about politicians receiving millions of dollars from large corporations. But, on the other hand, corporations seem to like paying off politicians. So instead of banning the practice, Congress simply required that politicians keep track of everyone who gives them money and file a report on it for public inspection.
I find such practices ridiculous. When you create a regulatory agency, you put together a group of people whose job is to solve some problem. They're given the power to investigate who's breaking the law and the authority to punish them. Transparency, on the other hand, simply shifts the work from the government to the average citizen, who has neither the time nor the ability to investigate these questions in any detail, let alone do anything about it. It's a farce: a way for Congress to look like it has done something on some pressing issue without actually endangering its corporate sponsors.
Interpreting Databases for the Public
Here's where the technologists step in. “Something is too hard for people?” they hear. “We know how to fix that.” So they download a copy of the database and pretty it up for public consumption: generating summary statistics, putting nice pictures around it, and giving it a snazzy search feature and some visualizations. Now inquiring citizens can find out who's funding their politicians and how dangerous their cars are just by going online.
The wonks love this. Still stinging from recent bouts of deregulation and antigovernment zealotry, many are now skeptical about government. “We can't trust the regulators,” they say. “We need to be able to investigate the data for ourselves.” Technology seems to provide the perfect solution. Just put it all online; people can go through the data while trusting no one.
There's just one problem: if you can't trust the regulators, what makes you think you can trust the data?
The problem with generating databases isn't that they're too hard to read; it's the lack of investigation and enforcement power, and websites do nothing to help with that. Since no one's in charge of verifying them, most of the things reported in transparency databases are simply lies. Sometimes they're blatant lies, like how some factories keep two sets of books on workplace injuries: one accurate one, reporting every injury, and one to show the government, reporting just 10% of them. But they can easily be subtler: forms are misfiled or filled with typos, or the malfeasance is changed in such a way that it no longer appears on the form. Making these databases easier to read results only in easier-to-read lies.
Congress's operations are supposedly open to the public, but if you visit the House floor (or if you follow what they're up to on one of these transparency sites) you find that they appear to spend all their time naming post offices. All the real work is passed using emergency provisions and is tucked into subsections of innocuous bills. (The bank bailouts were put in the Paul Wellstone Mental Health Act.) Matt Taibbi's
The Great Derangement
tells the story.
Many of these sites tell you who your elected official is, but what impact does your elected official really have? For 40 years, people in New York thought they were governed by their elected officials: their city council, their mayor, their governor. But as Robert Caro revealed in
The Power Broker
, they were all wrong. Power in New York was controlled by one man, a man who had consistently lost every time he'd tried to run for office, a man nobody thought of as being in charge at all: Parks Commissioner Robert Moses.
Plenty of sites on the Internet will tell you who your representative receives money from, but disclosed contributions are just the tip of the iceberg. As Ken Silverstein points out in
his series of pieces for
(some of which he covers in his book
), being a member of Congress provides for endless ways to get perks and cash while hiding where it comes from.
Fans of transparency try to skirt around this. “OK,” they say, “but surely some of the data will be accurate. And even if it isn't, won't we learn something from how people lie?” Perhaps that's true, although it's hard to think of any good examples. (In fact, it's hard to think of any good examples of transparency work accomplishing anything, except perhaps for more transparency.) But everything has a cost.
Hundreds of millions of dollars have been spent funding transparency projects around the globe. That money doesn't come from the sky. The question isn't whether some transparency is better than none; it's whether transparency is really the best way to spend these resources, whether they would have a bigger impact if spent someplace else.
I tend to think they would. All this money has been spent with the goal of getting a straight answer, not of doing anything about it. Without enforcement power, the most readable database in the world won't accomplish much, even if it's perfectly accurate. So people go online and see that all cars are dangerous and that all politicians are corrupt. What are they supposed to do then?
Sure, perhaps they can make small changes (this politician gets slightly less oil money than that one, so I'll vote for her; on the other hand, maybe she's just a better liar and gets her oil money funneled through PACs or foundations or lobbyists), but unlike the government, they can't solve the bigger issue: a bunch of people reading a website can't force car companies to make a safe car. You've done nothing to solve the real problem; you've only made it seem more hopeless: all politicians are corrupt, all cars are dangerous. What can you do?
What's ironic is that the Internet does provide something you can do. It has made it vastly easier, easier than ever before, to form groups with people and work together on common tasks. And it's through people coming together, not websites analyzing data, that real political progress can be made.
So far we've seen baby steps: people copying what they see elsewhere and trying to apply it to politics. Wikis seem to work well, so you build a political wiki. Everyone loves social networks, so you build a political social network. But these tools worked in their original setting because they were trying to solve particular problems, not because they're magic. To make progress in politics, we need to think about how best to solve its problems, not simply copy technologies that have worked in other fields. Data analysis can be part of it, but it's part of a bigger picture. Imagine a team of people coming together to tackle some issue they care about (food safety, say). You can have technologists poring through safety records, investigative reporters making phone calls and sneaking into buildings, lawyers subpoenaing documents and filing lawsuits, political organizers building support for the project and coordinating volunteers, members of Congress pushing for hearings on your issues and passing laws to address the problems you uncover, and, of course, bloggers and writers to tell your stories as they unfold.
Imagine it: an investigative strike team, taking on an issue, uncovering the truth, and pushing for reform. They'd use technology, of course, but also politics and the law. At best, a transparency law gets you one more database you can look at. But a lawsuit (or congressional investigation)? You get to subpoena all the databases, as well as
the source records behind them, then interview people under oath about what it all means. You get to ask for what you need, instead of trying to predict what you may someday want.
This is where data analysis can be really useful. Not in providing definitive answers over the web to random surfers, but in finding anomalies and patterns and questions that can be seized upon and investigated by others. Not in building finished products, but in engaging in a process of discovery. But this can be done only when members of this investigative strike team work in association with others. They would do what it takes to accomplish their goals, not be hamstrung by arbitrary divisions between “technology” and “journalism” and “politics.”