On Cathy O’Neil’s Weapons of Math Destruction – Chris Hoofnagle

Few have shed as much light on data science as Cathy O’Neil. The former Barnard math professor, author of Doing Data Science, and hedge fund quant has now published Weapons of Math Destruction (Crown 2016).

Weapons of Math Destruction (WMDs) are perversions of data science that increasingly influence our lives. O’Neil shows how sloppy mathematical processes, designed for efficiency and lacking any consideration of fairness, are being used to sort people. Why is this a problem? WMDs are focused on the poor, while the rich get to rely on old-school methods of reputation and decisionmaking: the letter of recommendation, the personal interview, and so on. Why are WMDs worse than ordinary human decisionmaking, with all of its foibles? O’Neil argues that WMDs lack feedback loops and that WMD users are much more concerned with doing things well enough than with doing them correctly. To demonstrate these points, O’Neil walks the reader through anecdotes including the scoring of teachers based on student exam performance, the pathologies that have arisen from U.S. News & World Report’s rankings of colleges, the online advertising that leads people to subprime loans and for-profit colleges, the use of algorithms to sentence criminals, the use of predictive policing to allocate cops on the beat, the use of information to set personalized insurance rates, and Facebook’s potential to influence our mood and votes.

Our livelihoods increasingly depend on our ability to make our case to machines

O’Neil points out time and again that people learn to game the algorithm. So, why isn’t that enough to solve the problems that O’Neil elucidates? The gaming creates perverse incentives and gross outcomes. Teachers help their students cheat in order to perform well on test-score-based algorithms; honest teachers who refuse to do so get fired. Colleges “hire” highly cited professors on a part-time basis only to list them on their websites in order to improve the school’s ranking.

In other cases, individuals cannot game the system and they suffer for it. Poor neighborhoods with nuisance crimes get more and more police attention, and in turn, more arrests, which feeds into other systems that predict that the poor are more likely to be recidivists. People who do not comparison shop are identified and charged more because companies can. And finally, we face the risk that Facebook will use its platform to shape how we view the world, to encourage us to vote or not, and so on.

O’Neil discusses baseball data science extensively, showing how the reams of information from games can be used to make interesting predictions and alter gameplay strategy. According to O’Neil, baseball analytics are fair for several reasons. The system does not use proxies (indirect measures of player skill) for performance and instead relies on direct evidence such as number of runs hit. The inputs of baseball analytics are transparent: anyone can see and record them. In addition, whatever machine learning is occurring is relatively transparent as well. Finally, the system can incorporate lessons from its predictions and adjust accordingly.

How would one improve on O’Neil’s assessments of WMDs? I would suggest several additional factors. First, taking baseball as an example, there are no pathological incentive conflicts in such analyses. Baseball teams want their players to perform well. O’Neil’s book details just how conflicted incentives are in other contexts, such as when your bank analyzes your value as a customer. Second, to the extent there are conflicts (after all, the baseball teams are in competition), they are in rough equilibrium. Two well-resourced teams are competing, and we can assume that each can anticipate and react to the predictive power of the other. On the other hand, I am at a total disadvantage with respect to my bank’s analytics. In theory, competition among banks protects me, but in reality, transaction costs in switching, the erosion of fiduciary duty to customers, and so on make our relationship with banks a form of competitive conflict.

Weapons of Math Destruction makes a good case that efficiency is not an unqualified good. O’Neil shows how WMDs are increasingly used to create efficiencies for companies that come at the cost of our dignity and of the fairness of our society. She suggests several interventions that would deepen the responsibility the data scientist has for the data subject. But under my suggested framework above (incentive conflict and equilibrium), I think a needed solution is to bring back competition: radical competition. The challenge is to bring back this competition in a society that has bought into the platform, a society that can say with a straight face that ISPs are competitive, and that ignores the obvious transaction costs involved in putatively competitive markets. If we could get away from using a single company for search, email, social networking, online videos, ecommerce, advertising, a browser, an app market, and an operating system, there would be one less company that could so deeply evaluate us and control how we experience the world.