The latest challenge to Florida’s congressional districts wrapped up last week, and it brought to light a lot of questionable behind-the-scenes behavior on the part of the Florida legislative leadership during the redistricting process. Should the ruling go in favor of the Democrats and the League of Women Voters/Common Cause coalition, the findings produced during the case make it likely that the state senate maps will be challenged (again) afterward.
It’s easy to get cynical about the process, no matter what supposed safeguards are in place, such as Florida’s 2010 “Fair Districts” constitutional amendments. Many come to the same conclusion as Christopher Ingraham at the Washington Post – why not consider letting computers do it?
Like Ingraham says, algorithms for redistricting already exist, and have for over fifty years now. The operations research literature has had quite a bit of discussion on potential methodologies, as have election law journals and random programmers, with proposals ranging widely in sophistication and computational complexity. There are a couple of algorithms in particular that I regularly see posted online. They’re both great at what they set out to achieve, and have made some headway in popularizing the idea of automated redistricting, so I’m a fan of the creators of both. Unfortunately, both algorithms fall on the simplistic side of suggested methods (by design), which has some major drawbacks.
Maximum Compactness and Shortest Split-Line
Brian Olson’s algorithm, as mentioned in Ingraham’s article, seeks to make sure people have the lowest average distance to the center of their district. He’s done a Florida state senate map, and it looks like this:
RangeVoting.org’s Shortest Split-Line algorithm is also somewhat focused on compactness, and draws a district by recursively dividing a state on the shortest line possible. It looks like they’ve only done congressional maps for the 2000 Census, but you get the idea:
The great benefit of both methods is that they are simple. There’s no chance of having politicians meddle with the maps: you just put in the data and out pops a map. The processes are easy to understand. The split-line algorithm has a single best result, and while Olson’s is based on optimization and therefore can’t guarantee that his current maps are the best given his criterion, I’m willing to bet they’re pretty darn close.
The drawback is that they’re not likely to be legal, either on the Fair Districts front or on the Voting Rights Act front. There’s no guarantee that racial minorities will be protected the way they’re required to be, and the maps pay no attention to county or city boundaries – in fact, the split-line algorithm is biased towards splitting cities.
Like I mentioned before, though, there’s a wide range of options for automated redistricting that have been put forth, even if they haven’t all gotten the same attention as Olson’s or RangeVoting.org’s.
Redistricting the Florida State Senate
A few summers ago after taking an introductory GIS course, I wrote an automated redistricting program to help learn scripting in ArcGIS, and by extension, Python, which is the language ArcGIS uses. It turns out that ArcGIS is the wrong tool for the job (the results I present below would literally take years to produce, rather than hours, had I not stripped all the dependence on ArcGIS), but I did end up writing something usable. I’ve improved on it here and there since that time as I’ve delved into the literature on the subject some more.
Almost nothing in my implementation is my own, at least at the theoretical level – ideas were drawn from a long list of scholars, including but not limited to Micah Altman, Burcin Bozkaya, Erhan Erkut, Dan Haight, T.C. Hu, Andrew B. Kahng, Gilbert Laporte, Michael McDonald, Federica Ricca, Andrea Scozzari, Bruno Simeone, Chung-Wen Albert Tsao, and William Vickrey. The function for producing majority-minority districts is my major contribution.
The basic approach is one of optimization: input a starting map, make a minor change in each iteration of the algorithm, check to see how the quality of the map changes, and decide whether to keep that change accordingly. To judge the quality, several factors are examined: population equality, compactness, county splits, and the status of the majority-minority districts in the map. Each factor has a function that produces a score, lower being better, each is then multiplied by a weight, and finally they are added together – the lowest scoring map “wins.” (Non-contiguous districts are not allowed at any point in the process, so that is not an issue.)
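The loop described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the actual program: the names `weighted_score`, `local_search`, `toy_factors`, and `toy_propose` are hypothetical, the "map" is just a cut point that divides six made-up units in a row into two districts, and the keep-only-if-better acceptance rule is an assumption (the description above only says the change is kept or discarded based on how the score moves).

```python
import random

# Made-up populations for six units laid out in a row (illustration only).
POPULATIONS = [10, 20, 30, 40, 50, 60]


def weighted_score(factor_scores, weights):
    """Multiply each factor's score by its weight and add them up (lower is better)."""
    return sum(weights[name] * value for name, value in factor_scores.items())


def local_search(start_map, propose_change, score_factors, weights, iterations, seed=0):
    """Repeatedly try one small change and keep it only if the total score drops."""
    rng = random.Random(seed)
    best_map = start_map
    best_score = weighted_score(score_factors(best_map), weights)
    for _ in range(iterations):
        candidate = propose_change(best_map, rng)
        if candidate is None:  # proposal is not a valid map, so skip it
            continue
        candidate_score = weighted_score(score_factors(candidate), weights)
        if candidate_score < best_score:  # assumed rule: accept strict improvements only
            best_map, best_score = candidate, candidate_score
    return best_map, best_score


def toy_factors(cut):
    """Toy map: units before `cut` form one district, the rest form the other."""
    left, right = sum(POPULATIONS[:cut]), sum(POPULATIONS[cut:])
    return {"population": abs(left - right) / (left + right)}


def toy_propose(cut, rng):
    """Move the boundary one unit left or right; refuse to empty a district."""
    new_cut = cut + rng.choice([-1, 1])
    return new_cut if 1 <= new_cut <= len(POPULATIONS) - 1 else None


best_cut, best_score = local_search(
    1, toy_propose, toy_factors, {"population": 1.0}, iterations=200
)
print(best_cut, round(best_score, 3))
```

In a real run, `score_factors` would return one entry per factor (population equality, compactness, county splits, majority-minority status), and the weights dictionary is where the judgment calls live.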
While it is possible to run my program on the smallest unit of geography available, the Census block, for the sake of speed, I’ve chosen to go with a slightly larger unit, the Census tract, of which there are a bit more than 4000 in the state – compare that with blocks, of which there are nearly half a million. The drawback is that preventing the splitting of cities is not possible at this level, since more than 10% of the tracts cut across two or more cities; however, the other major Fair Districts factors that came up in the recent trial seem to be handled (including not favoring a party or incumbent, since nothing regarding that is included in the scoring of a map).
I created 15 random but contiguous seed maps and ran them through 100,000 iterations of the algorithm, which is about the number of trials I could achieve in a day while doing other work at the same time. The best map is presented below, followed by the state’s current map.
Visually, the districts in the automated map look more compact on the whole than those in the state’s map, and this is backed up by the numbers. Although there are many different ways to measure compactness, I pull from Redistricting the Nation’s descriptions, using one of their dispersion methods (Convex Hull) and one of their indentation methods (Polsby-Popper). On the former, the automated map improves on the state’s from 0.756 to 0.808 – higher is better – and on the latter, 0.343 to 0.412.
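For readers unfamiliar with the two measures, here is a minimal sketch of how they are commonly defined: Polsby-Popper is 4πA/P² (a district's area relative to a circle with the same perimeter), and the convex hull score is the district's area divided by the area of its convex hull. The function names are hypothetical, and the sketch assumes each district is a single simple polygon given as (x, y) vertices in a planar projection; real district shapes with multiple parts or holes would need a proper GIS library.

```python
import math


def polygon_area(vertices):
    """Shoelace formula for the area of a simple polygon."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(vertices, vertices[1:] + vertices[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def polygon_perimeter(vertices):
    """Sum of edge lengths, closing the ring back to the first vertex."""
    return sum(math.dist(a, b) for a, b in zip(vertices, vertices[1:] + vertices[:1]))


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def polsby_popper(vertices):
    """Indentation measure: 4*pi*area / perimeter**2, between 0 and 1."""
    return 4 * math.pi * polygon_area(vertices) / polygon_perimeter(vertices) ** 2


def convex_hull_ratio(vertices):
    """Dispersion measure: district area / convex hull area, between 0 and 1."""
    return polygon_area(vertices) / polygon_area(convex_hull(vertices))


# A made-up L-shaped "district" scores lower than a square on both measures.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
l_shape = [(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)]
print(polsby_popper(square), convex_hull_ratio(square))
print(polsby_popper(l_shape), convex_hull_ratio(l_shape))
```

On both measures, higher is better, and a map-level figure like the ones quoted above would presumably be an average over districts.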
The state’s map does better on other factors. Perhaps due to my greater emphasis on compactness, the state’s map splits fewer counties, with 24 split to the automated map’s 31 (Olson’s map doesn’t include county lines, so it’s hard to get an exact count of his splits, but it looks to be at least 47). The population equality function weighting was a bit lax in the algorithm, resulting in a deviation of 9.8%, which is within the general legal rule-of-thumb maximum of 10%, but the state keeps it to 2.0%. Finally, while both maps have five Hispanic majority-minority districts, the state’s has two African-American MM districts, and the automated map only one – there’s a second in the automated map with 51% African-American total population, but voting-age population is the standard metric, and it only reaches 47%.
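The population and majority-minority figures above are simple arithmetic. The sketch below assumes that "deviation" means the overall range (largest district minus smallest, as a share of the ideal district population), which is the usual basis for the 10% rule of thumb but is not spelled out above. The function names and the numbers are made up for illustration.

```python
def overall_deviation(district_populations):
    """Population range across districts as a fraction of the ideal district size."""
    ideal = sum(district_populations) / len(district_populations)
    return (max(district_populations) - min(district_populations)) / ideal


def is_majority_minority(group_vap, total_vap):
    """True when the group is more than half of the voting-age population."""
    return group_vap / total_vap > 0.5


# Four made-up districts with an ideal population of 100 each.
print(overall_deviation([100, 105, 95, 100]))

# A district can be 51% of total population yet fall short on voting-age population.
print(is_majority_minority(group_vap=47, total_vap=100))
```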
Partisan Performance Comparison
Perhaps the most noteworthy improvement of the automated map is on the partisan performance front, which is one of the major selling points of automated redistricting. Although the automated map didn’t take into account voting results in its generation, comparing its predicted election performance to that of the map passed by the state is informative, and offers evidence of the (denied) partisan intentions of the state legislature.
To do this, I use data created through a fairly standard method (method 3 at the linked page) of taking election results at the precinct level and disaggregating them down to the Census block level based on the voting-age population of each block; I then take these data and reaggregate them back up to the senate district level. To arrive at a figure for each district, I look at an average of the Democrats’ performances in the 2008 presidential and 2010 gubernatorial races. These would have been the two most recent high-profile, closely contested statewide elections in Florida at the time of the redistricting process, and have the advantage of not being skewed by an incumbency advantage. The net statewide results show a nearly 50/50 partisan split, with a slight Democratic tilt. For simplicity, the figures I use disregard third-party votes, so that a 50.1% Democratic performance average means a Democratic district, and a 49.9% performance means a Republican district.
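The disaggregate-then-reaggregate step can be sketched as follows. This is an illustration under assumptions, not the actual data pipeline: the function names, the dictionary layout, and all numbers are hypothetical, and the per-district figure is taken to be the plain average of the two elections' two-party Democratic shares.

```python
from collections import defaultdict


def disaggregate_to_blocks(precinct_votes, blocks):
    """Split each precinct's (dem, rep) votes among its blocks in proportion to VAP."""
    precinct_vap = defaultdict(float)
    for block in blocks:
        precinct_vap[block["precinct"]] += block["vap"]
    block_votes = []
    for block in blocks:
        dem, rep = precinct_votes[block["precinct"]]
        total_vap = precinct_vap[block["precinct"]]
        # A precinct with zero VAP gives no basis for allocation; assign nothing.
        share = block["vap"] / total_vap if total_vap else 0.0
        block_votes.append(
            {"district": block["district"], "dem": dem * share, "rep": rep * share}
        )
    return block_votes


def district_dem_share(block_votes):
    """Reaggregate block votes by district; return the two-party Democratic share."""
    totals = defaultdict(lambda: [0.0, 0.0])
    for bv in block_votes:
        totals[bv["district"]][0] += bv["dem"]
        totals[bv["district"]][1] += bv["rep"]
    return {d: dem / (dem + rep) for d, (dem, rep) in totals.items()}


def average_performance(*share_maps):
    """Average each district's Democratic share across several elections."""
    return {
        d: sum(shares[d] for shares in share_maps) / len(share_maps)
        for d in share_maps[0]
    }


# Made-up example: two precincts, four blocks, two districts, two elections.
blocks = [
    {"precinct": "P1", "district": 1, "vap": 30},
    {"precinct": "P1", "district": 2, "vap": 70},
    {"precinct": "P2", "district": 2, "vap": 50},
    {"precinct": "P2", "district": 1, "vap": 50},
]
election_a = {"P1": (60, 40), "P2": (20, 80)}
election_b = {"P1": (50, 50), "P2": (40, 60)}

shares_a = district_dem_share(disaggregate_to_blocks(election_a, blocks))
shares_b = district_dem_share(disaggregate_to_blocks(election_b, blocks))
performance = average_performance(shares_a, shares_b)
print(performance)
```

Because blocks nest inside districts, the same block-level data can be reaggregated to any candidate map, which is what makes the comparison between the two maps possible.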
The charts below show the Democratic performance for each district, with the districts sorted by performance for clarity’s sake.
Before I get into the analysis, it’d be useful to go over what the strategy behind a partisan gerrymander entails. Obviously, winning the most districts possible is the goal, and to do that, a partisan would pack as many of his opponents’ voters into as few districts as possible, while spreading out his own voters to create a large number of safe, but not overwhelmingly safe districts. Put another way, he is efficient with the distribution of his own voters, and inefficient with his opponents’ – the latter group has a lot of “wasted” votes being cast in lopsided elections.
The first thing to take away from the charts is that the automated map is relatively balanced: 19 districts go Democratic, 21 go Republican. On the other hand, the state’s is less so, with a 15-25 split (the actual election in 2012 went 14-26). But the type of Democratic or Republican district is also important. The 14 most Democratic districts in the state’s map are all more Democratic than their counterparts in the automated map, and especially so for the last five, suggesting a packing strategy. Furthermore, if you define a leaning district as being within 5% of an even split, the state’s map has 11 Republican leaners, but only one Democratic. The automated map, on the other hand, has eight Republican and five Democratic.
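The seat and leaner counts above come from a simple tally over each district's Democratic performance. A minimal sketch, assuming "within 5% of an even split" means a two-party Democratic share between 45% and 55%; the function name and the sample shares are made up, and counting an exact 50% as Republican is an arbitrary tie rule.

```python
def summarize_map(dem_shares, lean_margin=0.05):
    """Count seats and leaning districts from two-party Democratic shares."""
    summary = {"dem_seats": 0, "rep_seats": 0, "dem_leaners": 0, "rep_leaners": 0}
    for share in dem_shares:
        party = "dem" if share > 0.5 else "rep"
        summary[party + "_seats"] += 1
        if abs(share - 0.5) <= lean_margin:  # close enough to even to count as a leaner
            summary[party + "_leaners"] += 1
    return summary


# Six made-up districts, sorted from most to least Democratic.
toy_shares = [0.70, 0.53, 0.51, 0.48, 0.46, 0.40]
print(summarize_map(toy_shares))
```

A packed map shows up in this tally as a few very high shares at the top for one party and a long run of leaners just under 50% for that same party.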
Again, this is the strength of automated redistricting. Even under a constitutional requirement to not benefit a particular party, the state legislature produced a map that looks incredibly similar to a textbook partisan gerrymander, while my algorithm, which completely ignored partisanship, ended up producing a much more neutral result.
Conclusion: Are Computers the Answer?
I’m not one to fully endorse the idea of using automated methods as the beginning and the end of the redistricting process – even looking at the map I produced above, there seem to be some simple, obvious changes that could be made that would make the map better. Granted, I’m not saying my method can’t be improved upon, but these sorts of things get hard to make perfect when you’re working at the scale of a state, with geographies Census tract-size or (ideally) smaller.
The problems run even deeper, however. As I glossed over above, each factor’s score was weighted before being summed together to produce a final rating of each map. The question thus becomes, who decides these weights? Should population deviation be put on equal footing with minimizing county splits, or should we allow for some leeway on the former to improve the latter? How aggressively should majority-minority districts be pursued? This is true even of Olson’s algorithm – he has chosen to weight compactness and population equality over everything else. These choices can have an impact on political outcomes – for example, a map heavy on majority-minority districts can quite often resemble a Republican gerrymander – and as such, they could fall prey to bitter partisan debate.
Additionally, the method above is not deterministic, and does not guarantee an ideal map, even if the weights are agreed upon. Each new running of the algorithm could produce a new best map: at what point do we say we’ve done enough trials?
At the very least, though, automated redistricting makes for a good benchmark to compare against what the legislature considers and passes. If they cannot beat a computer on most of the factors that people judge a map against, it’s a good sign that something fishy is going on. Ingraham’s article, I think, is a confirmation that I’m not alone in thinking this – computers can dig us out of all the debate and post hoc justification and say, “look, this is what’s possible, why are we so far from it?”
Then again, when I get cynical after long hours of thinking about redistricting, my go-to response is “hey, do we even need districts?” But that’s another debate entirely.
If you’re interested in my program, or want to have a laugh at my amateur programming skills, I’ve put up my code at GitHub. Or, if you’d prefer something a little more professional, Micah Altman and Michael McDonald also have an open-source automated redistricting R module called BARD.