Size of Treatment Effects and Their Importance to Clinical Research and Practice

doi:10.1016/j.biopsych.2005.09.014

Biological Psychiatry

Volume 59, Issue 11, 1 June 2006, Pages 990-996

https://doi.org/10.1016/j.biopsych.2005.09.014

In randomized clinical trials (RCTs), effect sizes seen in earlier studies guide both the choice of the effect size that sets the appropriate threshold of clinical significance and the rationale to believe that the true effect size is above that threshold worth pursuing in an RCT. That threshold is used to determine the necessary sample size for the proposed RCT. Once the RCT is done, the data generated are used to estimate the true effect size and its confidence interval. Clinical significance is assessed by comparing the true effect size to the threshold effect size. In subsequent meta-analysis, this effect size is combined with others, ultimately to determine whether treatment (T) is clinically significantly better than control (C). Thus, effect sizes play an important role both in designing RCTs and in interpreting their results; but specifically which effect size? We review the principles of statistical significance, power, and meta-analysis, and commonly used effect sizes. The commonly used effect sizes are limited in conveying clinical significance. We recommend three equivalent effect sizes: number needed to treat, area under the receiver operating characteristic curve comparing T and C responses, and success rate difference, chosen specifically to convey clinical significance.

Section snippets

Statistical and Clinical Significance, Power, and Meta-Analysis

As statistical hypothesis testing is typically performed, a “statistically significant” result with p < .05 means that the data indicate that something nonrandom is going on. When p < .01, the evidence is more convincing, and p = 10⁻⁶ very convincing indeed. However, the p value is a comment on how convincing the data are against the null hypothesis of randomness; the conclusion is always “something nonrandom is going on.” Such a conclusion gives no clue as to the size or importance of the

Cohen’s d

When an RCT outcome measure is scaled, the most common effect size is Cohen’s d (Cooper and Hedges 1994, Hedges and Olkin 1985), the difference between the T and C group means, divided by the within-group standard deviation. This effect size was designed for the situation in which the responses in T and C have normal distributions with equal standard deviations.
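As a rough illustration of the definition above, the following Python sketch computes Cohen's d from two samples. It assumes that the "within-group standard deviation" is the usual pooled standard deviation; the function name `cohens_d` and the scores are hypothetical and are not taken from the article.

```python
import math
from statistics import mean, variance


def cohens_d(treatment, control):
    """Difference between group means divided by the pooled within-group SD.

    Assumption: the within-group SD is the pooled SD, which weights each
    group's sample variance by its degrees of freedom.
    """
    n_t, n_c = len(treatment), len(control)
    pooled_var = (
        (n_t - 1) * variance(treatment) + (n_c - 1) * variance(control)
    ) / (n_t + n_c - 2)
    return (mean(treatment) - mean(control)) / math.sqrt(pooled_var)


# Hypothetical outcome scores, for illustration only.
t_scores = [14.0, 16.0, 18.0, 20.0, 22.0]
c_scores = [10.0, 12.0, 14.0, 16.0, 18.0]
d = cohens_d(t_scores, c_scores)
print(f"Cohen's d = {d:.3f}")
```

With these made-up scores both groups have the same sample variance, so the pooled SD equals each group's SD, which matches the equal-variance setting the effect size was designed for.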

The population parameter estimated by Cohen’s d ranges across the real line, with zero indicating no difference between T and C,

Number Needed to Treat

The effect size that seems to best reflect clinical significance is one proposed in the context of evidence-based medicine for binary (success/failure) outcomes: the number needed to treat (NNT) (Altman and Andersen 1999, Cook and Sackett 1995). NNT is defined as the number of patients one would expect to treat with T to have one more success (or one fewer failure) than if the same number were treated with C. For a binary outcome (success/failure), the success rate difference (SRD) is defined as
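For a binary outcome, the relationship among the three recommended effect sizes can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' computation: it takes SRD to be the T success rate minus the C success rate, NNT to be the reciprocal of SRD, and AUC to be (SRD + 1) / 2. The function names and the counts are hypothetical.

```python
def success_rate_difference(successes_t, n_t, successes_c, n_c):
    """SRD for a binary outcome (assumed: success rate in T minus that in C)."""
    return successes_t / n_t - successes_c / n_c


def number_needed_to_treat(srd):
    """NNT as the reciprocal of SRD (assumed convention).

    A negative value means C outperforms T; an SRD of zero gives an
    infinite NNT, because no number of patients yields one extra success.
    """
    if srd == 0:
        return float("inf")
    return 1.0 / srd


def auc_from_srd(srd):
    """AUC comparing T and C responses, assuming AUC = (SRD + 1) / 2."""
    return (srd + 1.0) / 2.0


# Hypothetical trial: 60 of 100 succeed on T, 40 of 100 succeed on C.
srd = success_rate_difference(60, 100, 40, 100)
nnt = number_needed_to_treat(srd)
auc = auc_from_srd(srd)
print(f"SRD = {srd:.2f}, NNT = {nnt:.1f}, AUC = {auc:.2f}")
```

Under these assumptions each of the three quantities determines the other two, which is the sense in which they are equivalent.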

Confidence Intervals and Effect Sizes

In every report of an RCT, we recommend that each p value be accompanied by NNT (for interpretability) and SRD with its standard error and confidence interval (for computations). The difficulty is that the correct computation of the confidence interval and the standard error of SRD depends on the distribution of the data underlying that effect size.
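To make the dependence on the underlying distribution concrete, here is a hedged sketch for one specific case only: a binary outcome with two independent groups, using the large-sample (Wald) normal approximation. This is one common choice, not necessarily the computation the authors recommend, and it can be inaccurate for small samples or success rates near 0 or 1. The function name `srd_wald_interval` and the counts are hypothetical.

```python
import math


def srd_wald_interval(successes_t, n_t, successes_c, n_c, z=1.96):
    """SRD, its standard error, and an approximate confidence interval.

    Assumptions: binary outcome, independent groups, and a normal
    approximation to the sampling distribution; z = 1.96 corresponds to
    an approximate 95% interval.
    """
    p_t = successes_t / n_t
    p_c = successes_c / n_c
    srd = p_t - p_c
    # Variances of two independent sample proportions add.
    se = math.sqrt(p_t * (1 - p_t) / n_t + p_c * (1 - p_c) / n_c)
    return srd, se, (srd - z * se, srd + z * se)


# Hypothetical trial: 60 of 100 succeed on T, 40 of 100 succeed on C.
srd, se, (lower, upper) = srd_wald_interval(60, 100, 40, 100)
print(f"SRD = {srd:.3f}, SE = {se:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

For scaled or ordinal outcomes a different computation would be needed, which is the difficulty the paragraph above points to.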

In those circumstances in which Cohen’s d is appropriate (normal distributions, equal variances), the exact distribution of Cohen’s d is known (

Discussion: The Threshold of Clinical Significance

To summarize, we propose that for any RCT, along with reporting the p value comparing T with C, researchers report NNT and SRD, as well as the standard error and a confidence interval for SRD. If effect sizes were so reported, they could then be used to facilitate consideration of what the threshold of clinical significance might be for design of subsequent related studies.

Here we have attempted to take the first major step, recommending an effect size that is clinically interpretable and

References (45)

M. Borenstein. The case for confidence intervals in controlled clinical trials. Control Clin Trials (1994)
M. Borenstein. Hypothesis testing and effect size estimation in clinical trials. Ann Allergy Asthma Immunol (1997)
J.W. Tukey. Tightening the clinical trial: Nonrelevancy of power calculations after the fact (Appendix 1). Control Clin Trials (1993)
Acion L, Peterson JJ, Temple S, Arndt S (in press): Probabilistic index: an intuitive non-parametric approach to...
D.G. Altman et al. Calculating the number needed to treat for trials where the outcome is time to an event. Br Med J (1999)
M. Borenstein. The shift from significance testing to effect size estimation
N. Cliff. Dominance statistics: Ordinal analyses to answer ordinal questions. Psychol Bull (1993)
J. Cohen. The cost of dichotomization. Appl Psychol Measurement (1983)
J. Cohen. Statistical Power Analysis for the Behavioral Sciences (1988)
J. Cohen. The earth is round (p<.05). Am Psychol (1995)
R.J. Cook et al. The number needed to treat: A clinically useful measure of treatment effect. Br Med J (1995)
H. Cooper et al. The Handbook of Research Synthesis (1994)
H.M. Cooper et al. Statistical versus traditional procedures for summarizing research. Psychol Bull (1980)
J. Cornfield. A method of estimating comparative rates from clinical data. Applications to cancer of the lung, breast and cervix. J Natl Cancer Inst (1951)
J. Cornfield. A statistical problem arising from retrospective studies
R. Dar et al. Misuse of statistical tests in three decades of psychotherapy research. J Consult Clin Psychol (1994)
B. Efron et al. A leisurely look at the bootstrap, the jackknife, and cross-validation. Am Statistician (1983)
B. Efron et al. Computer-Intensive Statistical Methods (Technical Report 174) (1995)
J.L. Fleiss. On the asserted invariance of the odds ratio. Br J Prev Soc Med (1970)
R.J. Grissom et al. Effect Sizes for Research (2005)
L.V. Hedges et al. Statistical Methods for Meta-Analysis (1985)
L.M. Hsu. Biases of success rate differences shown in binomial effect size displays. Psychol Bull (2004)

