Use of the experience sampling method in the context of clinical trials
  1. Simone J W Verhagen,
  2. Laila Hasmi,
  3. Marjan Drukker,
  4. J van Os,
  5. Philippe A E G Delespaul
  1. Department of Psychiatry and Neuropsychology, Maastricht University, Maastricht, The Netherlands
  1. Correspondence to Professor Philippe A EG Delespaul; Ph.delespaul{at}


Objective The experience sampling method (ESM) is a structured diary technique to appraise subjective experiences in daily life. It is applied in psychiatric patients, as well as in patients with somatic illness. Despite the potential of ESM assessment, the improved logistics and its increased administration in research, its use in clinical trials remains limited. This paper introduces ESM for clinical trials in psychiatry and beyond.

Methods ESM is an ecologically valid method that yields a comprehensive view of an individual's daily life. It allows the assessment of various constructs (eg, quality of life, psychopathology) and psychological mechanisms (eg, stress-sensitivity, coping). These constructs are difficult to assess using cross-sectional questionnaires. ESM can be applied in treatment monitoring, as an ecological momentary intervention, in clinical trials, or in single case clinical trials. Technological advances (eg, smartphone applications) make its implementation easier.

Results Advantages of ESM are highlighted and disadvantages are discussed. Furthermore, the ecological nature of ESM data and its consequences are explored, including the potential pitfalls of ambiguously formulated research questions and the specificities of ESM in statistical analyses. The last section focuses on ESM in relation to clinical trials and discusses its future use in optimising clinical decision-making.

Conclusions ESM can be a valuable asset in clinical trial research and should be used more often to study the benefits of treatment in psychiatry and somatic health.


This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See:


In clinical trials, the effectiveness of a psychiatric intervention is often assessed retrospectively, by asking patients to complete self-report questionnaires or by administering clinical interviews that cover the past days or weeks.1 These methods offer little insight into momentary processes and context sensitivity, both of which are needed to assess stress-reactivity, a notion central to psychopathology. Therefore, the use of momentary assessment techniques is proposed in mental health research to tap into daily life symptom experiences.2,3 Momentary assessment techniques, such as the experience sampling method (ESM), have a long tradition.4–6 This paper aims to demonstrate the usefulness of these methods in clinical trial research, in the field of psychiatry and beyond.

What is ESM?

ESM is an umbrella term for a family of momentary assessment techniques that use signals to trigger data collection in daily life.5 Alternative terms are ecological momentary assessment,6 ambulatory assessment,7 event sampling,8 beeper studies,9 the structured diary method,10 intensive longitudinal assessment11 and real-time data capture studies.12 ESM assesses sampled experiences and behaviours as well as moment-to-moment changes in mental states, embedded in (normal) daily life. It is an empirically validated structured diary technique. Typically, participants are asked to repeatedly complete short questionnaires (lasting no more than 2 min) in response to beep prompts. The questionnaires cover current mood, cognitions, perceptions, behaviours and descriptors of the momentary context (eg, location, company, activity).4–6 ESM is useful in psychiatry, as well as in patients with somatic illness. The method focuses on symptoms (ill health), as well as on adaptive functioning (well-being), aiming to map (normal) daily psychological functioning.13 Originally, paper diaries were used in combination with pagers or electronic wristwatches.4 As technology advanced, data collection logistics and reliability were improved by the use of personal digital assistants13 and smartphone applications14 (eg, the PsyMate, ExperienceSampler or apps by Invivo, lifeData). Typically, assessments are collected using an unpredictable (random) time sampling protocol. Assessments can also be triggered by an event (event sampling).8 Questionnaires are designed for quick and easy data collection and use open-ended questions, visual analogue scales, checklists or self-report Likert scales.4,13

A typical ESM data form

ESM assessments usually comprise a morning questionnaire, a beep questionnaire completed repeatedly within a day and an evening questionnaire. The content of the items depends on the assessment theme. Questions are short and can be rated quickly. An example of such an assessment form is included in table 1.

Table 1

Beep-level assessment sheet from the PsyMate standard assessment protocol (

Why is ESM used in research?

ESM typically has a number of advantages, but its implementation varies.

  • ESM has high ecological validity because assessments are made in the natural flow of real life.13,15

  • ESM reduces memory strain and avoids retrospective aggregation because only the current moment is assessed, repeatedly over time. This increases accuracy and makes responding comparatively easier.15

  • The repeated assessments are collected in different situations (contextualised), which allows researchers to disentangle and understand the variability in mental states and psychological constructs.16

  • ESM yields a rich data set covering information on mental state, quality of life, mobility, social network and more, which reduces the need to use separate questionnaires measuring different constructs.13

  • Assessment error is reduced by repeated measures over time. This improves the validity, reliability, and transparency of individual pattern assessments, which is helpful in clinical practice.12

  • Sensitivity to detect change increases due to the collection of data at multiple time points.11

  • ESM makes it easier for patients during feedback sessions to acknowledge and translate the findings to daily life practice, because they have rated the items (eg, anxiety) repeatedly. This avoids mystification. Involvement and collaboration in care are improved and feelings of empowerment in the treatment process are heightened.17

A possible drawback of ESM is that the method is perceived as time-consuming and demanding.16 Ideally, assessments are kept as brief as possible, preferably <1 min, with a maximum of 2 min.4 A study protocol of 10 assessments per day for 6 consecutive days (60 assessments of at most 2 min each) results in a cumulative time investment of at most 120 min. The set of questionnaires needed in cross-sectional studies easily exceeds this time investment. Selection bias is a second concern. Not all patients are willing to participate or comply with the protocol, and those who do possibly function better than the non-responders.3 However, previous research has shown that the method is feasible in a wide variety of patients, including those suffering from severe mental illness.2,4 Most studies have reported good compliance.18 Patients might also miss assessments in response to their current mood, which could likewise lead to selection bias. However, extensive analysis of missed responses demonstrates that beeps are predominantly missed at logical moments, such as the morning and evening (when participants are most likely sleeping), and lag analysis found no mood-related dropped responses.19 Third, ESM can induce reactivity.16 For instance, participants report increased awareness during ESM data collection.18 Monitoring is also used as a behavioural intervention, for instance, to reduce cigarette smoking. In this case, behavioural change is maximised by single target assessment of unwanted (automatic) behaviours. ESM, in contrast, is an open exploration of daily life using multiple targets. This open exploration reduces reactivity. Furthermore, repeated (exclusive) assessments of negative mood states can be confrontational and induce negative feelings. To avoid this, a careful and balanced construction of the questionnaire is needed; for example, ordering the questions so that positive and negative momentary mood items are mixed.
Finally, the complexity of ESM data analyses could form a drawback for both researchers and clinicians.11 Standard statistical methods, such as linear regression analysis and analysis of variance, cannot be applied directly to the repeated observations.20 Aggregating the data per participant removes the multilevel structure and allows simple analyses. In addition, most statistical packages now have easily accessible multilevel regression tools. The present paper provides guidelines on how to analyse the data properly, allowing a reliable assessment of the richness of the ESM data (see below).20

Method and statistics

ESM has advantages over classical assessment methods in clinical trials. However, ESM is a complex assessment method and several considerations should be taken into account when using this innovative approach.

The nature of ESM data

The nature of ESM data is appealing but its complexity often challenges researchers and clinicians. The data collection parameters (the actual design) that define the nature of the data can be more important than statistical techniques. Aspects to consider are item selection, item order, the time frame (eg, number of days), the intensity (eg, number of beeps per day; number of questions within a beep), the need for additional information (eg, sleep quality assessed in a morning questionnaire), the signalling algorithm (eg, random, fixed, beep-free periods, anticipation), the addition of event recording (eg, of stressors or panic attacks), application type and data storage.

Items from cross-sectional questionnaires are often unsuitable for repeated assessment in daily life. In one-time questionnaires, a reliable assessment of a construct is achieved by redundancy (multiple items in a sum score). Answering similar items repeatedly (up to 10 times a day) is frustrating. Often, metaphors are used, but these lack variation within the day.21 ESM information, such as current mood, activity and company, is assessed with single items, which can be combined to improve reliability: different aspects at one moment or the same items over time. ESM data are ecologically valid but correlational in nature.22 The current activity can be a cause as well as an effect of momentary mood states. Furthermore, different mental states have different natural flows. Anxiety, for example, fluctuates more and is more contextually reactive than depression. With an ESM sampling frequency of 10 times a day, highly variable states will not be adequately represented. In that case, the process is under-sampled. Other, slowly changing states are often over-sampled. The actual ESM protocol is usually a compromise.

Event monitoring is often added to ESM protocols. For example, participants are asked to complete a questionnaire when a certain stressor is present, or when the participant has a panic attack. This requires continual prospective monitoring and results in a high workload for participants. The recorded event initiates a questionnaire. Events should be discrete (have a clear beginning and end) and often require coding instructions (what is a panic attack or a social interaction?). There is no correction for rating misinterpretation. Feasibility limits the number of (different) events that can be reported. Having to respond to additional questions after reporting an event can act as a ‘punishment’ and results in an extinction of the report (not the actual event) under some (stressful) circumstances. The same ‘learning curve’ can occur in branching when different answers lead to different workloads. Often, time sampling is preferred because it is less burdensome, more reliable, allows the (non-exhaustive) assessment of a larger set of events and reports events as well as non-events (eg, when a participant smokes, as well as when a participant does not smoke).

When beeps are programmed at fixed times, predictability increases reactivity and this may induce behavioural changes (eg, postponing shopping or showering). More generally, when beep-free periods can be anticipated (eg, no beep expected within an hour after a beep), reactivity increases. Random sampling avoids this reactivity. Perhaps counterintuitively, fully random schedules (eg, 3 beeps on 1 day and 15 on another) are not ideal, because long periods with no beeps can result in some participants staying at home (so as not to miss a beep). This behaviour disrupts normal daily life. Therefore, a stratified random schedule with restricted intervals is advised.4
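A stratified random schedule of the kind advised above could be sketched as follows. This is only an illustration under stated assumptions: the waking window (07:30 to 22:30), the 10 beeps per day, the 15 min minimum gap and the function name are hypothetical choices, not parameters taken from the cited protocol.

```python
import random


def stratified_beep_schedule(start_min=450, end_min=1350, n_beeps=10,
                             min_gap=15, rng=None):
    """Return beep times in minutes after midnight, one per equal time block.

    All defaults are illustrative assumptions (07:30-22:30, 10 beeps,
    at least 15 min between consecutive beeps).
    """
    rng = rng or random.Random()
    block = (end_min - start_min) // n_beeps  # block length in minutes
    if min_gap > block:
        raise ValueError("min_gap must not exceed the block length")
    times = []
    for i in range(n_beeps):
        lo = start_min + i * block
        hi = lo + block - 1  # last minute that still belongs to this block
        if times:
            # Restrict the interval: never beep too soon after the last beep.
            lo = max(lo, times[-1] + min_gap)
        times.append(rng.randint(lo, hi))
    return times


if __name__ == "__main__":
    schedule = stratified_beep_schedule(rng=random.Random(2024))
    print([f"{t // 60:02d}:{t % 60:02d}" for t in schedule])
```

Because each block contains exactly one beep, every day has the same number of beeps and the longest possible silence is just under two block lengths, while the exact moment within a block remains unpredictable.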

This list of design modalities is not exhaustive and the actual choices depend on the research question and study population.

Statistical analyses of ESM data

ESM samples daily life and can be used to compute time budgets, for example, the proportion of time spent alone or doing sports. Similarly, the average anxiety level (or anxiety while alone) can be computed, resulting in one statistic per participant as the input for standard statistical analyses. However, in most situations, statisticians prefer to use all available data. In that case, analyses have to account for the fact that in ESM, multiple participants answer a set of questions multiple times, resulting in a multilevel data set. Hence, ESM observations are not independent and standard linear and logistic regression analysis techniques cannot be used.20 Multilevel linear and logistic regression is indicated and is now available in most statistical analysis packages (eg, Stata, R, SPSS). Since observations are nested in participants, the simplest multilevel regression models for ESM data include a random intercept that allows appropriate control for the between-participant variability.20 These models therefore include a second error term at the participant level, in addition to the regular error term (at the assessment level) used in unilevel regression. Even when trends over time are not the primary interest, it is advisable to remove the potentially confounding effects of time in ESM data by adding a time variable (detrending23). This is effective when values increase or decrease linearly over the time of the study.
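The random-intercept model with a linear time trend described above can be written out as a sketch. The notation is an assumption introduced here for illustration (it does not appear in the source): $y_{ij}$ is the outcome at assessment $i$ of participant $j$, $x_{ij}$ a momentary predictor and $t_{ij}$ the time variable used for detrending.

```latex
y_{ij} = \beta_0 + \beta_1 x_{ij} + \beta_2 t_{ij} + u_{0j} + e_{ij},
\qquad u_{0j} \sim N(0, \sigma_u^2), \quad e_{ij} \sim N(0, \sigma_e^2)
```

Here $u_{0j}$ is the additional participant-level error term (the random intercept) and $e_{ij}$ the regular assessment-level error term; setting $u_{0j} = 0$ for all participants gives back the unilevel regression model.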
Furthermore, since adjacent assessments tend to be more similar to each other than to assessments further apart in time (autocorrelation), ESM analyses also need more sophisticated adjustment methods.11 For example, autocorrelation can be taken into account by using appropriate correlation structures for the residuals (eg, AR, MA, ARMA).11 Care must be exercised when using such models, since observations are not evenly spaced due to randomisation and data collection usually proceeds over multiple days, resulting in a large time gap between the last observation of one day and the first observation of the next. Therefore, continuous-time structures are usually preferable, although their availability differs across software packages. It is also highly recommended to allow random slopes in the model where applicable.20 This yields more realistic estimates of SEs, which otherwise might be underestimated. A random slope means that the regression coefficient is not fixed, but is allowed to vary between participants. This variation is realised by adding an extra error term for each predictor. In these models, a simple variance–covariance matrix of the random effects (eg, one which assumes that all covariances are zero: the default in some statistical software packages) is not realistic. Therefore, an unstructured variance–covariance matrix, where all variances and covariances are allowed to be different, is preferable. However, this can lead to rather complex and time-consuming models. Sometimes, the statistical program may even fail to provide output. Alternatively, one can consider only using random slopes for the main independent variable(s). If the analysis includes multiple main independent variables, or if the analysis still does not yield valid results after reducing the number of random slopes, it may be necessary to remove essential random slopes. In that case, resampling methods (eg, permutation tests, bootstrapping) can be used to obtain valid tests and CIs.24
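To make the resampling idea concrete, the sketch below bootstraps at the participant level: each participant's observations are first reduced to one statistic (here the mean), and whole participants are then resampled with replacement to obtain a percentile CI. This is a deliberately simplified illustration and not the procedure of the cited reference; the function names, the choice of the mean as statistic, and the defaults (2000 resamples, 95% CI) are assumptions.

```python
import random
from statistics import mean


def participant_means(records):
    """Reduce (participant_id, value) records to one mean per participant."""
    by_person = {}
    for pid, value in records:
        by_person.setdefault(pid, []).append(value)
    return {pid: mean(vals) for pid, vals in by_person.items()}


def participant_bootstrap_ci(records, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the average of the participant means.

    Participants, not single beeps, are the resampling unit, so the
    dependence of observations within a participant is respected.
    """
    stats = list(participant_means(records).values())
    rng = random.Random(seed)
    boots = sorted(
        mean(rng.choices(stats, k=len(stats))) for _ in range(n_boot)
    )
    lower = boots[int((alpha / 2) * n_boot)]
    upper = boots[int((1 - alpha / 2) * n_boot) - 1]
    return mean(stats), lower, upper


if __name__ == "__main__":
    # Hypothetical anxiety ratings: (participant, rating on a 1-7 scale).
    data = [("p1", 2), ("p1", 3), ("p2", 5), ("p2", 6),
            ("p3", 1), ("p3", 2), ("p4", 4), ("p4", 4)]
    print(participant_bootstrap_ci(data))
```

Resampling individual beeps instead of participants would treat the observations as independent and would tend to give intervals that are too narrow.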

Clarifying ESM questions

ESM allows the study of a broad range of research questions. The epidemiological finding that the mental health of people in urban environments is poor25 (a between-subject question) can be explored by assessing whether people in crowded environments feel worse (a within-subject question). ESM allows the exploration or ‘unpacking’ of relevant factors.26 The multilevel data structure of ESM can confuse researchers. Consider, for example, a study that explores the relation between depression and loneliness. What should be studied? Are depressed persons often alone, do lonely people become depressed, does depression make people lonely, are people in general more depressed when alone, and does this relation hold for depressed people? ESM data analyses require more specific research questions. The pitfall lies in the conflation of person-related questions and situation-related questions, both of which are represented in ESM data and can lead to divergent conclusions.22 In addition, results from ESM studies are easily misinterpreted. The conclusion that agoraphobia is a bogus diagnosis because ESM data show that people with this diagnosis feel worse when at home (and better when out) disregards the fact that people may choose to go out on the days they feel best. Because ESM data are correlational, causal attributions should be avoided.
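One common way to keep person-related and situation-related questions apart is to split each momentary predictor into a participant mean (between-subject part) and a deviation from that mean (within-subject part). The source does not prescribe this technique; the sketch below, including its function name and example data, is an assumption offered only to illustrate the distinction.

```python
from collections import defaultdict
from statistics import mean


def split_between_within(records):
    """Split each value into a person mean and a within-person deviation.

    records: list of (participant_id, value) tuples.
    Returns a list of (participant_id, person_mean, deviation) tuples
    in the same order as the input.
    """
    by_person = defaultdict(list)
    for pid, value in records:
        by_person[pid].append(value)
    person_mean = {pid: mean(vals) for pid, vals in by_person.items()}
    return [(pid, person_mean[pid], value - person_mean[pid])
            for pid, value in records]


if __name__ == "__main__":
    # Hypothetical data: 1 = alone at the beep, 0 = in company.
    alone = [("p1", 1), ("p1", 0), ("p1", 1), ("p2", 0), ("p2", 0), ("p2", 1)]
    for row in split_between_within(alone):
        print(row)
```

The person mean answers a question such as "are people who are often alone more depressed?", whereas the deviation answers "is a given person more depressed at moments when he or she is alone?". Entering both as separate predictors lets the two relations differ.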


ESM was developed to sample moments from daily life. Since then, the method has yielded valuable results to better understand mental illness.16 ESM yields a comprehensive view of the participant's daily life. This comprehensive view is valuable in clinical trials because the same instrument can measure various constructs, such as quality of life, psychopathology, social networks or productive activities. This broad measurement is done with high sensitivity to change.11 Moreover, ESM allows assessment of mechanisms that are difficult to assess with cross-sectional questionnaires, such as stress-sensitivity, coping, reward-sensitivity and resilience.16 ESM has the potential to replace a larger body of instruments and sensitively assesses new and relevant parameters. It is, however, not yet the gold standard in clinical trials. This is mainly due to the heterogeneity in ESM applications.27 Nevertheless, it has the potential to become the de facto gold standard.

ESM monitoring in clinical trials

ESM data are ideal to inform and optimise personalised treatment in mental health. Feedback modules have been developed that translate ESM data into understandable graphs and figures that help promote patient engagement in treatment. Kramer et al28 assessed the benefits of an ESM feedback-intervention in a randomised controlled clinical trial. The ESM feedback-intervention was studied in addition to treatment as usual in patients diagnosed with depression. They demonstrated that ESM with weekly feedback moments reduced depression more than treatment as usual. ESM monitoring without feedback was associated with a similar reduction, but this reduction did not last. The majority of ESM studies have short assessment periods, ranging from several days to a few weeks.12 In clinical practice, it is important to follow a patient for months or years. To reduce the burden, clinicians experiment with interrupted periodic ESM assessments. Psychiatrists and patients can mutually agree to customise ESM to monitor the effectiveness of treatment and update interventions accordingly. There is growing insight into what might be relevant to individual patients. For example, a patient who used ESM for a year showed stronger associations between symptoms in periods of exacerbation (personal communication, Dr. M. Bak). The patient's subjective feeling of relapse coincided with changes in ESM towards higher severity. Monitoring patients this way provides both the clinician and the patient with additional information that can improve a personalised treatment process.

Ecological momentary intervention (EMI) has been coined as an umbrella term for mobile interventions in the daily living environment of participants. Unlike ESM, EMI aims not to observe, but to intervene. Only recently has mobile technology made it possible to bridge the gap between the therapist's office and situations in daily life, where acquired skills should generalise and be autonomously applied. Several studies have demonstrated that EMI is feasible, acceptable and can be successful in mental health treatment.29

Single case clinical trials

The translation of evidence from clinical trials assumes that patients are a homogeneous group. This is often not true. A large variability exists in treatment responses between subjects. Single case designs can be used to customise treatment to individual needs and to build on individual strengths. Since treatment effect (or outcome optimisation) can be difficult to detect, single case trials can add (experimental) control to clinical decision-making.30,31 This control requires that the timing of conditions is allocated randomly (independent from the clinical effect). Guidelines are available for the randomisation of single case trials,32 and statistical software packages33 facilitate the analyses. An interesting innovation is a user-initiated program, which implements ESM to monitor the effects of gradual dose reductions in the user's antidepressant medication.34 Over the course of a year, the user collected over a thousand ESM observations. Meanwhile, his clinician teamed up with a pharmacist to blindly implement a dose-reduction (tapering) schedule. From the data, early warning signs were detected for mood changes that predicted relapse. They could relate the mood change patterns to the medication reduction and were able to find the optimal, safe medication dosage.34 The use of ESM in classical clinical trials is less challenging because only aggregated group findings are pursued. Global constructs can be used to comprehensively describe the mental state of groups of individuals. In single-case studies, the data are highly individual and can benefit from the individual selection of items that maximally reflects the user's variability.
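The requirement that the timing of conditions is allocated randomly can be illustrated with a randomisation test for a simple two-phase (baseline, then treatment) single case design. This is a minimal sketch under assumptions: the two-phase design, the mean difference as test statistic, the function names and the example scores are all hypothetical and are not taken from the cited guidelines or software.

```python
import random
from statistics import mean


def choose_start(n_obs, min_phase, rng):
    """Randomly pick the index at which the treatment phase begins.

    Every admissible start leaves at least min_phase observations per phase.
    """
    return rng.randint(min_phase, n_obs - min_phase)


def randomisation_test(scores, actual_start, min_phase):
    """Compare the observed phase difference with all admissible start points.

    Returns (observed difference, two-sided p value).
    """
    def phase_difference(start):
        return mean(scores[start:]) - mean(scores[:start])

    observed = phase_difference(actual_start)
    candidates = range(min_phase, len(scores) - min_phase + 1)
    diffs = [phase_difference(s) for s in candidates]
    as_extreme = sum(1 for d in diffs if abs(d) >= abs(observed))
    return observed, as_extreme / len(diffs)


if __name__ == "__main__":
    rng = random.Random(7)
    n_days = 20
    start = choose_start(n_days, min_phase=5, rng=rng)  # fixed before data collection
    # Hypothetical daily mean symptom scores gathered with ESM.
    scores = [6 if day < start else 3 for day in range(n_days)]
    print(start, randomisation_test(scores, start, min_phase=5))
```

The smallest attainable p value equals one divided by the number of admissible start points, so the randomisation window has to be wide enough for the test to be informative.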

Future of ESM in clinical trials

The advantages of single case trials to optimise treatment are increasingly recognised.31 In addition, there is increasing recognition of how mobile phones can be used as medical instruments in the treatment of mental illness.35 The benefits of combining mobile technologies with single case designs are also recognised.36 Dissemination in regular practice requires freely available user-friendly technology and software that reduces the logistical burden and need for statistical skills. By making this technology available, treatment efficiency can be improved. Most participants are already used to carrying mobile phones, so the threshold for using them to collect ESM data via an app is greatly reduced. ESM progresses in parallel with evolutions in mobile technology. The logistical burden of data collection is reduced by innovations in tools and programs, as well as by cultural changes related to the use of mobile devices. Furthermore, innovation in mobile tools facilitates the general acceptance of momentary assessment. For broader clinical dissemination, substantial investment is needed in software that allows on-the-fly comprehensive feedback. More research should focus on the way data become information. This knowledge is important for academically trained professionals, as well as for clinicians who are more reluctant to use and less able to understand statistical analyses. In addition, the patients themselves should be able to comprehend the data. Ideally, treatment planning, implementation and evaluation should be a collaborative process. ESM can prove its utility both in large-scale studies and at the level of clinical practice. The latter is especially important, as ESM can be used to optimise individual treatment and to help in decision-making. In the future, assessment strategies that are more sensitive in detecting change could be used in all prescriptions of pain or psychotropic medication. Medication should be continued only when efficacy is determined in a specific patient.
Sensitive assessment strategies can also be used in the domain of optimal dose finding. Furthermore, recovery oriented care in psychiatry is not restricted to ill health and symptoms, but extends to well-being. Daily life assessment strategies can inform patients and clinicians how to improve resilience by maximising strengths in patients.



  • Competing interests None declared.

  • Provenance and peer review Not commissioned; externally peer reviewed.