Bayesian networks and criminal defense

By Kristopher A. Nelson
in July 2011



Please note that this post is from 2011. Evaluate with care and in light of later events.

[Figure: A simple Bayesian network. Image via Wikipedia.]

I have begun to consider the utility of formal methods of evidence mapping. David Lagnado has presented Bayesian methodologies to us here in Vienna for the last week. Such an approach tends to be math-intensive in its quantitative form, but it is also powerful in its graphical, non-mathematical form. It is reminiscent of Wigmore’s early-20th-century graphical approach to mapping evidence, but is in many respects less complex and more powerful. Additionally, even without deep mathematical knowledge, the formulas are useful in any presentation of statistics in a courtroom, and can help avoid common reasoning fallacies (like the “prosecutor’s fallacy,” which mistakes the probability of the evidence given innocence for the probability of innocence given the evidence).
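
To make the prosecutor's fallacy concrete, here is a minimal sketch in Python of Bayes' theorem applied to a forensic match. The numbers are purely hypothetical assumptions chosen for illustration (a 1-in-10,000 random match probability and a pool of 100,000 possible sources); they do not come from Lagnado's lectures or from any real case, and the function name `posterior` is my own.

```python
from fractions import Fraction


def posterior(prior, p_evidence_if_true, p_evidence_if_false):
    """Bayes' theorem for one hypothesis H and evidence E: returns P(H | E)."""
    numerator = prior * p_evidence_if_true
    denominator = numerator + (1 - prior) * p_evidence_if_false
    return numerator / denominator


# Hypothetical assumptions, for illustration only.
random_match = Fraction(1, 10_000)   # P(match | defendant is NOT the source)
prior_source = Fraction(1, 100_000)  # P(source) before the match: 1 of 100,000 people

# P(defendant is the source | match), assuming a true source always matches.
p_source_given_match = posterior(prior_source, Fraction(1), random_match)

print(f"P(match | not source) = {float(random_match):.4f}")
print(f"P(source | match)     = {float(p_source_given_match):.4f}")
```

Under these assumed numbers, the fallacy would be to read the 1-in-10,000 random match probability as a 99.99% chance that the defendant is the source. The calculation instead gives a figure of roughly 9%, because about ten other people in the assumed pool would also be expected to match.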

Whether people actually think in Bayesian terms is unclear. What is clearer is that Bayesian tools help lay bare some of the heuristic shortcuts people take when dealing with complex evidence (such as in legal cases). We tend, for example, to over-value high-probability evidence by conflating, say, a fingerprint match with guilt, rather than considering alternative hypotheses (the fingerprint is a match, but was deposited at a different time). We also tend to ignore low-probability evidence completely, collapse multiple variables and possibilities into a single one, leave out weak links, and downplay absent information entirely. All of this is critical knowledge for any trial attorney to keep in mind, especially when dealing with jurors.
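
The fingerprint example can be sketched as a tiny three-node network: guilt and an innocent earlier deposit are two possible causes of a reported match. The Python below computes P(guilty | match) by brute-force enumeration rather than with a Bayesian network library. Every probability in it is a made-up assumption for illustration, and the variable names are my own.

```python
from itertools import product


def posterior_guilt(p_guilty, p_innocent_deposit, p_match):
    """P(guilty | match reported), by enumerating the joint distribution.

    p_match maps (guilty, innocent_deposit) -> P(match reported | those values).
    Guilt and innocent deposit are assumed independent a priori.
    """
    joint = {True: 0.0, False: 0.0}  # P(guilty = g, match reported)
    for guilty, deposit in product([True, False], repeat=2):
        p_g = p_guilty if guilty else 1 - p_guilty
        p_d = p_innocent_deposit if deposit else 1 - p_innocent_deposit
        joint[guilty] += p_g * p_d * p_match[(guilty, deposit)]
    return joint[True] / (joint[True] + joint[False])


# Hypothetical conditional probabilities, for illustration only.
p_match = {
    (True, True): 0.95,     # guilty: print left during the crime
    (True, False): 0.95,
    (False, True): 0.95,    # innocent, but left a print on another occasion
    (False, False): 0.0001, # innocent, no print: false match
}

with_alternative = posterior_guilt(0.01, 0.05, p_match)
without_alternative = posterior_guilt(0.01, 0.0, p_match)

print(f"ignoring the alternative hypothesis: {without_alternative:.2f}")
print(f"including the alternative hypothesis: {with_alternative:.2f}")
```

With these assumed numbers, treating the match as if it could only arise from guilt gives a posterior of about 0.99, while allowing even a 5% chance of an innocent earlier deposit drops it to about 0.17. The specific values matter less than the size of the shift caused by one omitted hypothesis.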

Just always remember that evidence is “irreducibly contextual,” in the words of Hasok Chang. We simply cannot control all the variables, or even imagine all the variables. Losing sight of this leads to many of the logical fallacies that Lagnado discussed when explaining Bayesian approaches to evidence (whether in the public health context, a legal case, or when deciding on the best cafe in Vienna). This means that however effective your Bayesian map may be, it’s easy to leave out key aspects. Do not assume your map is complete.

Relatedly, Bayesian networks (especially when one expects to actually calculate anything, rather than simply graphing) are deeply dependent on the “priors.” Priors represent the probability assigned to an event or hypothesis before the evidence at hand is considered, and generally reflect subjective assessments of experts.

In a sense, needing priors simply pushes the complex and subjective calculations back a step, and this is a major criticism of the approach. How does one calculate how many people smother their children in the U.K. each year (a necessary prior in calculating aspects of the Sally Clark case)? Lagnado has emphasized that, while this is a key problem, the proper Bayesian approach is to lay these subjective factors bare, and to focus not on concealing them, but rather on agreeing on shared assumptions. Bayesian calculations do not show “the truth,” but rather a mathematical truth based on shared assumptions.
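
One way to lay the priors bare is to hold the evidence fixed and show how the conclusion moves as the prior changes. The sketch below uses the odds form of Bayes' theorem (posterior odds = prior odds × likelihood ratio). The likelihood ratio of 1,000 and the three priors are arbitrary assumptions for illustration, not figures from the Sally Clark case or any other.

```python
def posterior_from_likelihood_ratio(prior, likelihood_ratio):
    """Odds form of Bayes' theorem: posterior odds = prior odds * likelihood ratio."""
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)


# Hypothetical: the evidence is 1,000 times more likely if the hypothesis is true.
likelihood_ratio = 1_000

# Three candidate priors that different parties might argue for (assumed values).
priors = [1e-6, 1e-4, 1e-2]
posteriors = [posterior_from_likelihood_ratio(p, likelihood_ratio) for p in priors]

for prior, post in zip(priors, posteriors):
    print(f"prior = {prior:g}  ->  posterior = {post:.4f}")
```

The same evidence yields posteriors of roughly 0.001, 0.09, and 0.91 under these three assumed priors. A table like this makes the disagreement explicit: the argument is about which prior to accept, not about the arithmetic.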

Certainly Bayesian approaches have problems, but I would encourage considering the situations in which they may prove helpful, rather than focusing on attacking the approach’s key problems. Systematizing decision-making may be flawed (certainly we cannot simply replace the jury with a Bayesian calculator), but thinking through a complex web of evidence in Bayesian terms provides critical insights, and in some cases fundamental and powerful truths.

In short, I would highly recommend that any criminal defense attorney consider investigating both the mapping techniques and the basic statistics of Bayesian networks.