Risk assessment: predicting violence
David Crighton
Correspondence to: Department of Psychology, Durham University, University Office, Old Elvet, Durham DH1 3HP; d.a.crighton@dur.ac.uk


Violence in its many and varied forms represents a major public health problem. Home Office figures for England & Wales, based on the British Crime Survey, suggest that there were just over two million incidents of violence between January and December 2010.1 It is therefore not surprising that significant efforts have been made to predict the risk of future violence in at-risk groups. At a fundamental level, the challenges of assessing risk in this area are similar to those in other areas of public health, where uncertainty can be divided into two types: aleatory and epistemological. Aleatory uncertainty refers to what might also be termed true chance or randomness, as in the roll of a die, the toss of a coin or the similar types of events that fill statistics textbooks. Epistemological uncertainty refers to our lack of knowledge about potentially verifiable events. In areas such as screening for a disease, it has been noted that the uncertainty is all epistemological: the disease is either present or absent.2 In making predictions about violence, though, there is a mix of epistemological and aleatory uncertainty. Future events are unknown, and chance events will have unknown impacts on them. Gathering more information may change our predictions, but we will always be faced with a residual level of uncertainty that cannot be eliminated. In common with other areas of prediction, therefore, predicting future violence presents very significant challenges.

Current practice

In addressing these challenges there has, in recent years, been a dramatic growth in the use of structured assessments to try to predict future risk. Most of these assessments have been based on actuarial models of risk and have come to be termed Actuarial Risk Assessment Instruments (ARAIs) or, less formally, risk assessment ‘tools’. These ARAIs have sought to reduce the level of uncertainty about future violence and have been actively marketed and widely used across mental health and forensic settings.

It has frequently been asserted that actuarial or ‘statistical’ approaches to assessing the risk of violence are better than ‘clinical’ assessments. Indeed, this assertion has come to be repeated in a somewhat clichéd manner. However, the evidence base for it is not as extensive or convincing as this might suggest. Much of the debate around ‘statistical’ versus ‘clinical’ assessments of the risk of future violence can be traced back to the work of the distinguished psychologist Paul Meehl3 dating from the 1950s. The key point Meehl and his collaborators made was that the use of ‘statistical’ methods could improve on the generally poor levels of accuracy seen in ‘clinical’ predictions. The arguments made were detailed, subtle and nuanced, but this has tended to get lost over time, replaced by stark claims that equate evidence-based assessment of risk with statistical assessment of risk.

Using ‘statistical’ approaches to predict violence in at-risk groups, it is relatively easy to demonstrate that combinations of relatively small numbers of variables can yield better than chance predictions of future outcomes. Such an approach can, though, be criticised on a number of grounds. These include the fact that assessing the risk of violence involves estimating the odds of particular events occurring by placing individuals into a population of similar people, where the likelihood of the predicted event is known. This has involved placing the individual into a historical sample and, in turn, using statistical methods (primarily regression analysis) to estimate the future probability that the individual will be violent. This has been criticised on the basis that regression techniques were not designed for such use.4 Positive results have, though, been claimed for ARAIs. For example, based on the results of the Violence Risk Appraisal Guide (VRAG), a widely used ARAI, a sample was dichotomised into ‘high-’ and ‘low-risk’ groups. Subsequently, 55% of the ‘high-risk’ group went on to record violent incidents compared with 18% of the ‘low-risk’ group. This finding, along with many similar others, has been favourably contrasted with the results obtained using ‘clinical’ assessments. A typical study of the ‘clinical’ prediction of violence in acute psychiatric patients found that 53% of those who aroused professional concern committed a violent act within 6 months compared with 36% of those who did not elicit concern.5
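One way to put the actuarial and ‘clinical’ findings above on a common footing is to compare the ratio of violence rates between flagged and unflagged groups. A minimal Python sketch, using only the percentages quoted above and deliberately ignoring sample sizes, base rates and follow-up periods:

```python
def relative_risk(rate_flagged, rate_unflagged):
    """Ratio of the violence rate in the flagged vs the unflagged group."""
    return rate_flagged / rate_unflagged

vrag_rr = relative_risk(0.55, 0.18)      # VRAG 'high-risk' vs 'low-risk'
clinical_rr = relative_risk(0.53, 0.36)  # professional concern vs none

print(f"VRAG relative risk:     {vrag_rr:.2f}")      # ~3.06
print(f"Clinical relative risk: {clinical_rr:.2f}")  # ~1.47
```

On this crude yardstick the VRAG dichotomy separates the groups more sharply (a ratio of about 3) than ‘clinical’ concern (about 1.5), though the two studies differ in populations and methods and are not directly comparable.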

In addition, the selection of risk factors underpinning statistical methods is not theoretically driven. Given our lack of theoretical understanding of violence, this is perhaps to be expected, yet in turn it is likely to serve as a limit on accuracy. Most existing ARAIs draw on a constricted range of risk factors, with many potential risk factors discarded because they have little empirical predictive power on a group basis. However, for individuals such factors may be of considerable significance. Such ARAIs will also generally require individuals to be assessed on every risk factor, even where these are not individually relevant, a generally counterintuitive and wasteful approach for practitioners.

A further issue in the prediction of violence concerns the base rate of the behaviour being predicted. In the context of violence, re-arrest or reconviction rates have frequently been used. However, these are known to be poor estimators of actual rates of violence, with low rates of reporting, detection and conviction being common. Paradoxically, this may have made the prediction of violence appear more difficult, as well as introducing poorly understood biases into the process. The importance of base rate information was clearly illustrated in the MacArthur risk assessment study.5 Base rate information, along with information on the nature and context of the violence, was collected from a number of mental health settings across the USA. Officially recorded incidents of violence suggested the base rate was relatively low at 4.5% in the patients studied. However, when multiple sources of information were used to estimate levels of violence, this increased more than sixfold to 27.5%. The implications of these two figures are very different, yet many ARAIs are constructed on the basis of officially recorded violence, such as criminal convictions.
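The effect of the base rate can be made concrete with a small worked example. The calculation below holds a predictor's sensitivity and specificity fixed (the 0.75 values are hypothetical, chosen purely for illustration) and shows how the positive predictive value shifts between the two base rates reported in the MacArthur study:

```python
def ppv(base_rate, sensitivity, specificity):
    """Positive predictive value of a test, via Bayes' rule."""
    true_pos = sensitivity * base_rate
    false_pos = (1 - specificity) * (1 - base_rate)
    return true_pos / (true_pos + false_pos)

# 4.5% and 27.5% are the MacArthur base rates quoted above; the 0.75
# sensitivity and specificity are hypothetical.
for rate in (0.045, 0.275):
    print(f"base rate {rate:.1%}: PPV = {ppv(rate, 0.75, 0.75):.1%}")
```

Under these assumptions a positive result implies roughly a one-in-eight chance of violence at the 4.5% base rate, but better than even odds at 27.5%, which is why the choice of base rate matters so much to how an ARAI's output should be read.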

Another significant difficulty with current ARAIs has recently been highlighted in a review that looked at the accuracy of individual predictions, as opposed to the accuracy achieved on a group basis. Here two commonly used ARAIs, the Risk Matrix 2000 (RM2000) and the VRAG, were used as exemplars of statistical approaches to prediction about individuals. These were evaluated in terms of the 95% CIs for individual assessments, which proved very wide, in fact overlapping across most of the risk groups defined within these ARAIs. This led the authors to suggest that “… at the individual level, the margins of error were so high as to render the test results virtually meaningless” (p S63). Although this research looked at two examples of current ARAIs, there is little reason to believe other similar assessments would perform better in this respect.
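The width of such intervals is easy to illustrate. The sketch below computes a 95% Wilson score interval for an observed group violence rate; the group size and rate are hypothetical, not taken from the review, but they show how wide the interval around a small group's rate can be:

```python
import math

def wilson_ci(successes, n, z=1.96):
    """95% Wilson score interval for an observed proportion successes/n."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return centre - half, centre + half

lo, hi = wilson_ci(11, 20)   # e.g. 55% observed violence in a group of 20
print(f"95% CI: {lo:.2f} to {hi:.2f}")
```

An observed rate of 55% in a group of 20 is compatible, at the 95% level, with true rates anywhere from roughly a third to three-quarters, so intervals around adjacent risk bands can easily overlap.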

Future developments

Alternative approaches to predicting the risk of violence have addressed some of these weaknesses in ARAIs by drawing on different statistical approaches, often those used successfully in other areas of science. An example here is the development of iterative classification tree (ICT) methods as part of the MacArthur study.5 These seek to embed individuals into historical classes in a similar manner to other ARAIs but seek to do this more precisely, with allocations into subclasses being contingent on the results of assessment in previous areas. This is a familiar approach across a range of health settings and has the advantage that it avoids the requirement to complete every item for every patient. As such, the approach has an intuitive appeal to practitioners.

In seeking to improve accuracy, ICT approaches abandoned the dichotomous approach to risk of violence involving the use of a single cut point (‘violent’ vs ‘non-violent’). This was replaced with two cut points leading to three groupings: indistinguishable from the population base rate, lower than population base rate and higher than population base rate. This moved the focus onto the two tails of the distribution, and it was suggested the accuracy of estimates of risk may be improved on this basis.5

Drawing on the base rates for violence in the sample, the ‘high-risk’ grouping was defined operationally as twice the base rate (>37% over 12 months) and ‘low risk’ as half the base rate (<9% over 12 months). Using these cut points, 42.9% of the sample were left unclassified using a regression analysis approach compared with 49.2% using the simple classification tree approach.5 This highlighted a significant weakness of ARAIs: even with the addition of a ‘low’ risk cut point, both approaches succeeded in identifying only slightly over half of those assessed as high or low risk. The remainder of those assessed were not differentiated from the base rate grouping, providing little in the way of added utility from the risk assessment.

In an effort to address this, the MacArthur researchers went on to look at the use of repeated iterations of the classification tree approach, describing this as an ICT model. Repeated (iterative) analyses were undertaken on the group that had not been distinguished from the population base rate. A second iteration allocated 119 of these individuals to either ‘high-’ or ‘low-risk’ groups, a third iteration allocated 63 and a fourth iteration allocated 60. Using this form of recursive partitioning, 77% of the sample could be allocated to the ‘high-’ or ‘low-risk’ groups, representing a significant improvement. As a result, six ‘low-risk’ subgroups (49%), four ‘high-risk’ subgroups (27%) and two base rate risk subgroups (23%) had been identified. Accuracy was reported as similar to that of non-recursive models, but with the ability to classify a significantly greater proportion of cases into more precisely defined ‘high-’ and ‘low-risk’ groupings. The model was subsequently tested using ‘bootstrapping’, a resampling technique in which repeated samples are drawn, with replacement, from the original data and the model refitted to each. The results from this process suggested that the model continued to work effectively.
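The iterative allocation logic described above can be sketched in a few lines. Everything below is invented for illustration (synthetic cases, a crude single-feature split standing in for a real classification tree, and arbitrary thresholds); the point is only to show the mechanics of repeatedly re-partitioning the undifferentiated remainder against cut points set at twice and half the base rate:

```python
import random

random.seed(0)

# Synthetic cases: (three-feature vector, violent outcome 0/1). The way
# the outcome depends on the first two features is arbitrary.
cases = []
for _ in range(400):
    x = [random.random() for _ in range(3)]
    p = 0.05 + 0.5 * x[0] * x[1]
    cases.append((x, 1 if random.random() < p else 0))

base_rate = sum(y for _, y in cases) / len(cases)
HIGH, LOW = 2 * base_rate, base_rate / 2   # the two cut points

def best_split(subset):
    """Crude stand-in for a tree: the single feature/threshold pair
    whose two sides differ most in observed violence rate."""
    best = None
    for f in range(3):
        for t in (0.25, 0.5, 0.75):
            left = [y for x, y in subset if x[f] <= t]
            right = [y for x, y in subset if x[f] > t]
            if not left or not right:
                continue
            gap = abs(sum(left) / len(left) - sum(right) / len(right))
            if best is None or gap > best[0]:
                best = (gap, f, t)
    return best

unallocated = list(cases)
allocated = {"high": 0, "low": 0}
for iteration in range(4):                  # repeated (iterative) passes
    if not unallocated:
        break
    split = best_split(unallocated)
    if split is None:
        break
    _, f, t = split
    remainder = []
    for side in (lambda x: x[f] <= t, lambda x: x[f] > t):
        group = [(x, y) for x, y in unallocated if side(x)]
        rate = sum(y for _, y in group) / len(group)
        if rate >= HIGH:
            allocated["high"] += len(group)   # above twice the base rate
        elif rate <= LOW:
            allocated["low"] += len(group)    # below half the base rate
        else:
            remainder.extend(group)           # still undifferentiated
    unallocated = remainder

print(f"base rate {base_rate:.1%}, allocated {allocated}, "
      f"undifferentiated {len(unallocated)}")
```

Each pass either allocates a subgroup whose observed rate clears one of the two cut points or returns it to the pool for the next iteration, mirroring the way the second, third and fourth MacArthur iterations progressively shrank the undifferentiated group.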

An alternative emergent view of prediction has questioned the value of using statistical models per se.6 Many of the critiques here have drawn from the world of financial economics but have clear application to efforts to predict other types of behaviour. Econometric models designed to predict risk were similar to psychometric models. They were developed using complex mathematical models, often more sophisticated than those seen in ARAIs. Similar claims of accuracy were also made, suggesting that ‘statistical’ models were better than the models used by practitioners. As with ARAIs, the arguments in favour of risk assessment ‘tools’ in economic settings were detailed, subtle and nuanced, and again this tended to get lost in the marketing of tools for the assessment of risk. In particular, such approaches proved to perform poorly in the face of rare but high-impact events. Evidence also emerged that the performance of financial traders was better before such methods were invented, and that the most effective traders would replace these statistical algorithms with their own heuristics,7 which proved to be more accurate and less vulnerable to rare, high-impact events. Here accuracy had developed seemingly through a process of evolution, with performance pressures leading to the development and spread of heuristics that worked and the ‘extinction’ of ineffective ones.8 The replacement of these effective heuristics with statistical models and ‘tools’ could therefore be seen as a retrograde step.

A key aspect in the development of these effective heuristics was almost certainly the prompt and cumulative feedback that successful financial traders receive, allowing them to adjust and hone their heuristics. Risk assessments of future violence, by contrast, have placed little focus on the accuracy of predictions from individual practitioners, teams and institutions. This is curious for at least two reasons. First, it seems likely that wide variation in the accuracy of predictions of violence will be seen across a range of individuals, teams and settings. In this respect, there is little reason to believe that the situation will differ significantly from other areas of health practice. This range is likely to include performance that is better than actuarial approaches as well as predictions less accurate than those delivered by ARAIs. Second, a range of methods for assessing and reporting clinical outcomes exists in many other areas of public health.9 There would be few bars to using such methods to develop better, more evidence-based practice.
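One simple mechanism for the kind of cumulative feedback described above would be to score each probabilistic prediction against the eventual outcome, for example with the Brier score (the mean squared error of the forecasts). The predictions and outcomes below are invented for illustration:

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)

forecasts = [0.9, 0.2, 0.7, 0.1, 0.4]   # predicted probability of violence
outcomes  = [1,   0,   0,   0,   1]     # what actually happened

print(f"Brier score: {brier_score(forecasts, outcomes):.3f}")  # lower is better
```

Tracked over time at the level of an individual practitioner, team or institution, such a score would make variation in predictive accuracy visible in much the way that profit and loss already does for financial traders.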

From the emerging evidence base, it can reasonably be argued that most existing ARAIs have significant limitations. Some of these are fundamental to the nature of the prediction of violence. Whatever approach is adopted, there will always be an irreducible level of uncertainty. Trying to predict the future presents substantial challenges in fields ranging from physics to economics. The limitations of current approaches have led to the development of alternatives, such as the use of repeated ICT and time series analysis models of risk prediction. These hold out the possibility of greater accuracy and utility in the allocation of individuals to risk groupings. Such models, it is suggested, were designed with prediction in mind, in contrast to regression models. More radically, it has been suggested that the rejection of ‘clinical’ approaches, and what might be termed clinical heuristics, may have been premature in seeking to predict violence. We may in fact have thrown the baby out with the bathwater. Moves to the use of statistical ‘tools’ in this area may have been driven by the results of pooling ‘good’ and ‘bad’ assessments based on effective and ineffective heuristics. This is likely to have been compounded by the failure to develop feedback to practitioners on the accuracy of predictions at individual, team and institutional levels.

References