Pennsylvania's congressional district maps are almost certainly the result of gerrymandering according to an analysis based on a new mathematical theorem on bias in Markov Chains developed by Carnegie Mellon University and University of Pittsburgh mathematicians. Their findings are published in the Feb. 28 online early edition of the Proceedings of the National Academy of Sciences (PNAS).
Markov chains are algorithms which can generate a random object by starting from a fixed object and evolving in a stepwise fashion, making small random changes at each step. Markov chains have numerous applications, and are used to model things like thermodynamic processes, chemical reactions, economic and financial phenomena, protein folding and DNA sequences.
To evaluate gerrymandering of congressional districts, a Markov Chain can, in principle, be used to compare the characteristics of the current districting map with a typical districting of the same state by generating truly random districtings as points of comparison.
However, one of the limitations of Markov chains is that there is often no way to determine how long the chains need to run in order to achieve a truly random sample. Without knowing the upper limit, researchers must assume that they've run the algorithm long enough for their resulting assumptions to be valid.
In the PNAS paper, University of Pittsburgh Assistant Professor of Computational and Systems Biology Maria Chikina and Carnegie Mellon Professor of Mathematical Sciences Alan Frieze and Assistant Professor of Mathematical Sciences Wesley Pegden prove a theorem that can use a Markov Chain to show that a sample is nonrandom, without generating random samples from the Markov Chain itself. This allows researchers to use the Markov chain to rigorously demonstrate bias in the congressional districting maps of the state of Pennsylvania without having to make unproven assumptions on the time required to generate samples from the Markov Chain.
The researchers began with a current map of Pennsylvania's congressional districts, and applied a Markov chain that incorporated geometric constraints on districts that would be used to create random districting maps. Those factors included ensuring roughly equal populations in each district, border continuity, and constraining the ratio of perimeter to area.
The researchers ran the chain, which changed the map in random steps. Statistical properties of the map were found to change rapidly with small random changes to the initial map, which, according to their theorem, would be extremely unlikely to happen by chance.
"There is no way that this map could have been produced by an unbiased process," said Pegden.
While the new method doesn't provide a new tool for drawing congressional district maps, it does provide a rigorous test to detect that existing maps were created in a biased fashion, and researchers may find applications in the many other fields where Markov Chains are used.
The research was supported by the National Institutes of Health (MH10900901A1, HG00854003), the National Science Foundation (DMS1362785, CCF1522984, DMS1363136), the Simons Foundation and the Sloan Foundation.