*© 2021, American Psychological Association. This paper is not the copy of record and may not exactly replicate the final, authoritative version of the article. Please do not copy or cite without authors' permission. The final article will be available, upon publication, via its DOI: 10.1037/xge0001052*

**No Evidence for Loss Aversion Disappearance and Reversal in Walasek and Stewart (2015)**

Quentin André\*

Bart de Langhe

(*Forthcoming, Journal of Experimental Psychology: General*)

Quentin André (quentin.andre@colorado.edu) is an assistant professor of marketing at the Leeds School of Business, University of Colorado Boulder. He was previously an assistant professor of marketing at the Rotterdam School of Management, Erasmus University. Bart de Langhe (bart.delanghe@esade.edu) is an associate professor of marketing at ESADE, Universitat Ramon Llull. Please address all correspondence to Quentin André. We are grateful to Lukasz Walasek and Neil Stewart for making their methods and data transparent and publicly available. We thank Lukasz Walasek, Neil Stewart, Tim Mullett, David Gal, and two other anonymous reviewers for their helpful comments and suggestions. Finally, we thank Bram van den Bergh and Uri Simonsohn for helpful comments on previous versions of this manuscript. The code and files needed to reproduce all the analyses reported in this manuscript are accessible on the [OSF repository](https://osf.io/67ng8/) of the project.

# ABSTRACT

Loss aversion—the idea that losses loom larger than equivalent gains—is one of the most important ideas in Behavioral Economics. In an influential article published in the *Journal of Experimental Psychology: General*, Walasek and Stewart (2015) test an implication of decision by sampling theory: Loss aversion can disappear, and even reverse, depending on the distribution of gains and losses people have encountered. In this manuscript, we show that the pattern of results reported in Walasek and Stewart (2015) should not be taken as evidence that loss aversion can disappear and reverse, or that decision by sampling is the origin of loss aversion. It emerges because the estimates of loss aversion are computed on different lotteries in different conditions. In other words, the experimental paradigm violates measurement invariance, and is thus invalid. We show that analyzing only the subset of lotteries that are common across conditions eliminates the pattern of results. We note that other recently published articles use similar experimental designs, and we discuss general implications for empirical examinations of utility functions.

# INTRODUCTION

Loss aversion, the idea that losses loom larger than equivalent gains, is one of the most important ideas in Behavioral Economics. A central feature of Prospect Theory (Kahneman & Tversky, 1979), it has inspired fundamental and applied research in psychology, economics, and management. However, a growing body of research now questions whether loss aversion is indeed a fundamental law of human decision-making (e.g., Ert & Erev, 2013; Gal & Rucker, 2018; Yechiam, 2018).

Decision by sampling theory (Stewart et al., 2006) suggests that the utility of a gain is derived from a series of ordinal comparisons with other gains in memory, and the utility of a loss is similarly derived from a series of ordinal comparisons with other losses in memory. In an influential article published in the *Journal of Experimental Psychology: General*, Walasek and Stewart (2015; hereafter W&S) test an implication of decision by sampling theory: Loss aversion can disappear, and even reverse, depending on the gains and losses that participants have encountered. Multiple articles in high-impact journals have explored extensions and boundary conditions of this prediction (e.g., Alempaki et al., 2019; Bhui & Gershman, 2018; Olivola & Chater, 2017; Rigoli, 2019; Stewart, Reimers, & Harris, 2015; Walasek & Stewart, 2018), making W&S one of the most-cited articles published in *JEP: General* since 2015.[^1]

In the current paper, we show that the differences in loss aversion *λ* reported in W&S should not be taken as evidence that loss aversion can disappear and reverse, or that decision by sampling is the origin of loss aversion. The mistake in W&S is that the experiments conflate the treatment phase and the measurement phase. As a consequence, the estimates of loss aversion *λ* are computed on different lotteries in different conditions.

We proceed in four steps. First, we describe the experimental paradigm and the pattern of results in W&S. Second, we explain why the experimental paradigm is invalid, and show that the differences in *λ* reported in W&S will emerge without decision by sampling. Third, we show that analyzing the subset of lotteries that are common across conditions eliminates the pattern of results. Fourth, we discuss the implications of our findings for past and future investigations of loss aversion and decision by sampling.

# THE EXPERIMENTAL PARADIGM IN W&S

According to decision by sampling theory (Stewart et al., 2006), the (dis)utility of a gain (loss) is determined by its ordinal position in a set of previously encountered gains (losses). As such, the utility of a unit gain and the disutility of a unit loss will depend on the set of gains and losses that people have encountered. For instance, the difference between a \$12 gain and a \$20 gain appears larger in the context of small gains (e.g., \$6, \$8, \$10, **\$12**, \$14, \$16, \$18, **\$20**) than in the context of large gains (e.g., **\$12**, \$16, **\$20**, \$24, \$28, \$32, \$36, \$40), and thus the utility of a unit gain is larger when gains are small than when the gains are large. Similarly, the difference between a \$12 loss and a \$20 loss appears larger in the context of small losses (e.g., \$6, \$8, \$10, **\$12**, \$14, \$16, \$18, **\$20**) than in the context of large losses (e.g., **\$12**, \$16, **\$20**, \$24, \$28, \$32, \$36, \$40), and thus the disutility of a unit loss is larger when losses are small than when losses are large. Decision by sampling therefore predicts loss aversion when gains are large and losses are small (because the utility of a unit gain appears relatively small and the disutility of a unit loss appears relatively large), but a reversal of loss aversion when gains are small and losses are large (because the utility of a unit gain appears relatively large and the disutility of a unit loss appears relatively small).
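The ordinal-comparison argument above can be made concrete with a small sketch. It assumes, for illustration only, that subjective value is the relative rank of an amount within the set of amounts encountered; the function names `relative_rank` and `rank_gap` are hypothetical and are not taken from W&S or from Stewart et al. (2006).

```python
# Illustrative sketch: subjective value as relative rank within a context.
# The rank formula and helper names are assumptions made for this example.
SMALL = [6, 8, 10, 12, 14, 16, 18, 20]
LARGE = [12, 16, 20, 24, 28, 32, 36, 40]


def relative_rank(amount, context):
    """Share of the other amounts in `context` that are smaller than `amount`."""
    smaller = sum(1 for other in context if other < amount)
    return smaller / (len(context) - 1)


def rank_gap(low, high, context):
    """Rank-based distance between two amounts within a given context."""
    return relative_rank(high, context) - relative_rank(low, context)


gap_small = rank_gap(12, 20, SMALL)  # $12 and $20 are four ranks apart
gap_large = rank_gap(12, 20, LARGE)  # $12 and $20 are two ranks apart
print(gap_small, gap_large)
```

Under this assumption, the same \$8 difference spans four of seven rank steps in the small context but only two of seven in the large context, which is the sense in which a unit gain (or loss) "appears larger" when amounts are small.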

To test this prediction, participants in W&S make repeated decisions to either accept or reject lotteries with a 50:50 chance of winning versus losing different amounts. The distributions of the gains and losses presented to participants are manipulated between subjects. In Experiment 1, for instance, participants are randomly assigned to one of four conditions: They make decisions about 64 lotteries that feature (i) small gains and small losses (\$6 to \$20, in increments of \$2), (ii) large gains and large losses (\$12 to \$40, in increments of \$4), (iii) large gains and small losses, or (iv) small gains and large losses.[^2] This experimental design is summarized in Table 1.

<table>
<colgroup>
<col style="width: 33%" />
<col style="width: 33%" />
<col style="width: 33%" />
</colgroup>
<thead>
<tr>
<th style="text-align: center;"></th>
<th style="text-align: center;"><p>Small gains</p>
<p>($6, $8, $10, $12,</p>
<p>$14, $16, $18, $20)</p></th>
<th style="text-align: center;"><p>Large gains</p>
<p>($12, $16, $20, $24,</p>
<p>$28, $32, $36, $40)</p></th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: center;"><p>Small losses</p>
<p>($6, $8, $10, $12,</p>
<p>$14, $16, $18, $20)</p></td>
<td style="text-align: center;">Loss Neutral</td>
<td style="text-align: center;">Loss Averse</td>
</tr>
<tr>
<td style="text-align: center;"><p>Large losses</p>
<p>($12, $16, $20, $24,</p>
<p>$28, $32, $36, $40)</p></td>
<td style="text-align: center;">Loss Seeking</td>
<td style="text-align: center;">Loss Neutral</td>
</tr>
</tbody>
</table>

Table 1. Experimental conditions and expected pattern of results in Experiment 1 in W&S. Participants see all pairwise combinations of gains and losses, for a total of 64 lotteries.

The 64 decisions made by each participant are then analyzed using the following logistic regression:

``` math
\log\left( \frac{P(Accept)}{1 - P(Accept)} \right) = \beta_{Bias} + \beta_{G}Gain + \beta_{L}Loss
```

where $`\beta_{G}`$ reflects the utility of a unit increase in gains, $`\beta_{L}`$ reflects the utility of a unit increase in losses, and $`\beta_{Bias}`$ reflects the utility of accepting a lottery regardless of its gain or loss. The degree of loss aversion of each participant is then operationalized as $`\lambda = \frac{- \beta_{L}}{\beta_{G}}`$, so that *λ* \> 1 indicates loss aversion, *λ* = 1 indicates loss neutrality, and *λ* \< 1 indicates a reversal of loss aversion.

W&S report that *λ* = 1 when gains and losses have similar magnitudes (i.e., both small or both large), *λ* \> 1 when gains are large and losses are small, and *λ* \< 1 when gains are small and losses are large. See Figure 1. They take this pattern of results to suggest that loss aversion can disappear and reverse, and that decision by sampling is the origin of loss aversion.

![Figure 1. Median estimates of λ across the four experimental conditions of Experiment 1.](https://quentinandre.net/publications/reanalysis-ws/media/image1.png)

Figure 1. Median estimates of *λ* obtained in each of the four experimental conditions of Experiment 1 (1a + 1b). We use the condition labels from W&S. Error bars are bootstrapped 95% confidence intervals of the medians (hence the asymmetry).

*Figure 1 (visual description): Point-range plot. The y-axis is loss aversion λ = −β_L/β_G (~0.8–1.8) with a horizontal reference line at λ = 1; the x-axis lists the four conditions 20G-20L, 20G-40L, 40G-20L, 40G-40L (gain/loss ranges). Median λ with bootstrapped 95% CIs is ≈1.00 (20G-20L), ≈0.86 (20G-40L), ≈1.73 (40G-20L — well above 1), and ≈1.02 (40G-40L). Estimated loss aversion thus swings sharply across conditions, peaking in 40G-20L — the pattern W&S interpreted as loss aversion disappearing and reversing.*
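As an aside on the error bars, the sketch below shows one way a bootstrapped confidence interval of a median could be computed, and why such an interval need not be symmetric around the point estimate. It assumes a simple percentile bootstrap; the text does not state which bootstrap variant was used, and the function name, the number of resamples, and the toy values are all hypothetical.

```python
import random
import statistics


def bootstrap_median_ci(values, n_resamples=2000, level=0.95, seed=0):
    """Percentile-bootstrap confidence interval for the median (an assumed variant)."""
    rng = random.Random(seed)
    # Resample with replacement and record the median of each resample.
    medians = sorted(
        statistics.median(rng.choices(values, k=len(values)))
        for _ in range(n_resamples)
    )
    tail = (1.0 - level) / 2.0
    lower = medians[int(tail * n_resamples)]
    upper = medians[int((1.0 - tail) * n_resamples) - 1]
    return lower, upper


# Hypothetical, right-skewed participant-level estimates (not data from W&S).
toy_lambdas = [0.6, 0.8, 0.9, 1.0, 1.0, 1.1, 1.3, 1.7, 2.4, 3.9]
lower, upper = bootstrap_median_ci(toy_lambdas)
print(statistics.median(toy_lambdas), lower, upper)
```

Because the resampled medians inherit the skew of the underlying estimates, the two bounds can sit at different distances from the sample median.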

# λ WILL DIFFER WITHOUT DECISION BY SAMPLING

In W&S, the utility of a gain and the disutility of a loss are computed from participants’ willingness to accept different lotteries in different conditions. This experimental design is invalid because it violates measurement invariance. As a consequence, the null hypothesis is unspecified: We cannot know how *λ* would differ between conditions in the absence of decision by sampling.

Below, we show how various plausible decision rules that do not assume memory for previously seen outcomes (i.e., no decision by sampling) will produce differences in *λ* between conditions consistent with those reported in W&S.

## Evidence Through Simulation

To demonstrate that differences in *λ* will emerge without memory for previously seen outcomes (and therefore without decision by sampling), we created 10,000 simulated “participants,” which we randomly assigned to the four conditions of Experiment 1 in W&S. All participants decided to accept or reject each of the 64 lotteries based on the value of the prospective gain and the value of the prospective loss offered by the lottery:

``` math
\log\left( \frac{P(Accept)}{1 - P(Accept)} \right) = U(Gain,\ Loss)
```

We then estimated *λ* on those 64 simulated choices as in W&S:

1.  For each participant, we estimate a logistic regression model:

``` math
\log\left( \frac{P(Accept)}{1 - P(Accept)} \right) = \beta_{Bias} + \beta_{G}Gain + \beta_{L}Loss
```

2.  We construct a participant-level estimate of loss aversion: $`\lambda = \ \frac{- \beta_{L}}{\beta_{G}}`$.

3.  We remove the 5% of participants with the highest deviance, and any participant with a negative estimate of *λ*.

4.  We compare the median estimate of *λ* across conditions.
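Steps 2 to 4 above can be sketched as follows. This is a minimal illustration that takes already-fitted participant-level coefficients as input; the dictionary keys, the function name `median_lambda`, the rounding rule used to decide how many participants make up "5%", and the toy numbers are assumptions made for this example.

```python
import math
import statistics


def median_lambda(estimates, drop_share=0.05):
    """Median loss aversion after the exclusions described in steps 2 to 4.

    `estimates` is a list of dicts with keys `beta_gain`, `beta_loss`, and
    `deviance` (hypothetical names), one dict per participant.
    """
    # Step 3a: drop the share of participants with the highest deviance.
    # Rounding down is an assumption; the text does not specify the rule.
    n_drop = math.floor(drop_share * len(estimates))
    by_fit = sorted(estimates, key=lambda e: e["deviance"])
    kept = by_fit[: len(by_fit) - n_drop]
    # Step 2: participant-level lambda = -beta_loss / beta_gain.
    lambdas = [-e["beta_loss"] / e["beta_gain"] for e in kept]
    # Step 3b: drop negative estimates of lambda.
    lambdas = [lam for lam in lambdas if lam >= 0]
    # Step 4: summarize the condition by its median.
    return statistics.median(lambdas)


# Toy input: 20 participants with lambda from 0.50 to 1.45, plus one
# participant whose coefficients imply a negative lambda.
toy = [
    {"beta_gain": 0.2, "beta_loss": -0.2 * (0.5 + 0.05 * i), "deviance": 10 + i}
    for i in range(20)
]
toy.append({"beta_gain": 0.2, "beta_loss": 0.1, "deviance": 5})
result = median_lambda(toy)
print(round(result, 2))
```

The sketch does not guard against a gain coefficient of exactly zero, which would need separate handling in a real analysis.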

We repeat this simulation and analysis for three different decision rules:

1.  Log-Linear: The utility of a lottery is determined by the value of the potential gain and the value of the potential loss, with diminishing marginal sensitivity to the magnitudes of gains and losses (e.g., Kahneman & Tversky, 1979):  
    ``` math
    \text{U}(\text{Gain},\ \text{Loss}) = \log(\text{Gain}) - \log(\text{Loss})
    ```

2.  Gain-Loss Ratio: The utility of a lottery is determined by the magnitude of the potential gain divided by the magnitude of the potential loss (e.g., De Langhe & Puntoni, 2014):  
    ``` math
    \text{U}(\text{Gain},\ \text{Loss}) = \frac{\text{Gain}}{\text{Loss}} - 1
    ```

3.  Discontinuous Expected Value: The utility of a lottery is determined by its expected value, with discontinuities at zero (reflecting that lotteries with positive, neutral and negative expected value feel qualitatively different; e.g., Diecidue & Van De Ven, 2008; Payne et al., 1980):  
    ``` math
    \text{U}(\text{Gain},\ \text{Loss}) = \text{Gain} - \text{Loss} + \mathbb{1}_{(\text{Gain} > \text{Loss})} - \mathbb{1}_{(\text{Gain} < \text{Loss})}
    ```
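For readers who prefer code to notation, the three decision rules can be written as plain functions. This is a sketch with hypothetical function names, not the code from the OSF repository; `p_accept` applies the logistic link from the simulation equation above.

```python
import math


def log_linear(gain, loss):
    """Diminishing sensitivity to the magnitudes of gains and losses."""
    return math.log(gain) - math.log(loss)


def gain_loss_ratio(gain, loss):
    """Ratio of the potential gain to the potential loss, centered at zero."""
    return gain / loss - 1


def discontinuous_ev(gain, loss):
    """Gain minus loss, with a jump of one unit on either side of zero."""
    # Booleans count as 0 or 1, so the comparisons act as indicator functions.
    return gain - loss + (gain > loss) - (gain < loss)


def p_accept(utility):
    """Probability of accepting a lottery, given its utility on the logit scale."""
    return 1 / (1 + math.exp(-utility))


for rule in (log_linear, gain_loss_ratio, discontinuous_ev):
    print(rule.__name__, p_accept(rule(20, 20)))
```

All three rules assign a utility of zero, and hence an acceptance probability of one half, to any lottery in which the gain equals the loss. None of them uses information about previously seen lotteries.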

Figure 2 compares the median estimates obtained on this simulated data with the estimates obtained on the pooled data from Experiments 1a and 1b in W&S. While simulated participants evaluated lotteries without memory for previously seen outcomes, and they followed the same decision rule across conditions, we find the same pattern of results as in W&S.
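The logic of this result can be illustrated with a simplified, deterministic sketch for the Log-Linear rule. It departs from the simulation described above in several labeled ways: instead of drawing random choices for 10,000 simulated participants, it fits the logistic regression to the expected acceptance probability of each of the 64 lotteries (a large-sample shortcut), it applies no exclusions, and it uses a hand-written Newton-Raphson fitter. All names are hypothetical; this is not the code from the OSF repository.

```python
import math

SMALL = [6, 8, 10, 12, 14, 16, 18, 20]
LARGE = [12, 16, 20, 24, 28, 32, 36, 40]
CONDITIONS = {  # condition label -> (gains, losses)
    "20G-20L": (SMALL, SMALL),
    "20G-40L": (SMALL, LARGE),
    "40G-20L": (LARGE, SMALL),
    "40G-40L": (LARGE, LARGE),
}


def log_linear(gain, loss):
    return math.log(gain) - math.log(loss)


def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= factor * m[col][c]
    x = [0.0, 0.0, 0.0]
    for i in reversed(range(3)):
        x[i] = (m[i][3] - sum(m[i][j] * x[j] for j in range(i + 1, 3))) / m[i][i]
    return x


def fit_logit(rows, targets, iterations=25):
    """Newton-Raphson fit of logit(p) = b0 + bG*Gain + bL*Loss.

    `targets` may be probabilities rather than 0/1 choices.
    """
    beta = [0.0, 0.0, 0.0]
    for _ in range(iterations):
        grad = [0.0] * 3
        hess = [[0.0] * 3 for _ in range(3)]
        for x, y in zip(rows, targets):
            mu = 1 / (1 + math.exp(-sum(b * xi for b, xi in zip(beta, x))))
            for i in range(3):
                grad[i] += (y - mu) * x[i]
                for j in range(3):
                    hess[i][j] += mu * (1 - mu) * x[i] * x[j]
        beta = [b + s for b, s in zip(beta, solve3(hess, grad))]
    return beta


def expected_lambda(utility, gains, losses):
    """Lambda recovered by the linear logit model when choices follow `utility`."""
    rows = [(1.0, float(g), float(l)) for g in gains for l in losses]
    probs = [1 / (1 + math.exp(-utility(g, l))) for g in gains for l in losses]
    _, beta_gain, beta_loss = fit_logit(rows, probs)
    return -beta_loss / beta_gain


lambdas = {
    name: expected_lambda(log_linear, gains, losses)
    for name, (gains, losses) in CONDITIONS.items()
}
for name, value in lambdas.items():
    print(f"{name}: lambda = {value:.2f}")
```

The decision rule is identical in all four calls; only the set of lotteries entering the regression changes. Other rules could in principle be passed to `expected_lambda`, although very steep rules may require a more robust optimizer than this bare Newton-Raphson loop.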

![Figure 2. Median λ from W&S data versus three simulated decision rules.](https://quentinandre.net/publications/reanalysis-ws/media/image2.png)

Figure 2. Comparison of the median estimates of *λ* obtained on data from W&S (circles) and data simulated with the three different decision rules (diamonds, squares and triangles). Error bars are bootstrapped 95% confidence intervals of the medians (hence the asymmetry).

*Figure 2 (visual description): Same four conditions on the x-axis; λ on the y-axis (~0.5–2.0) with a reference line at 1. Four series: Original W&S data (blue circles) and three simulated decision rules that contain no loss aversion — Log-Linear (orange diamonds), Gain-Loss Ratio (green squares), Discontinuous EV (red triangles); medians with 95% CIs. Annotated λ values per condition: 20G-20L — λ_O = 1.00, λ_LL = 1.00, λ_GLR = 1.06, λ_DEV = 1.00; 20G-40L — λ_O = 0.86, λ_LL = 0.52, λ_GLR = 0.50, λ_DEV = 0.79; 40G-20L — λ_O = 1.73, λ_LL = 2.00, λ_GLR = 2.00, λ_DEV = 1.24; 40G-40L — λ_O = 1.02, λ_LL = 1.00, λ_GLR = 1.08, λ_DEV = 1.00. The loss-aversion-free simulated rules reproduce the same up-and-down λ pattern as the real data, showing the pattern does not require loss aversion.*
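A heuristic scaling argument suggests why the Log-Linear rule yields estimates close to 2 and 0.5 in the asymmetric conditions. It is offered as a first-order approximation rather than an exact derivation. In the 40G-20L condition, every gain is twice a gain from the small range, so we can write Gain = 2*g*, with *g* taking the values \$6 to \$20. Let *b* denote the slope that the linear model recovers for amounts in the small range (an auxiliary symbol introduced here):

``` math
\begin{aligned}
\text{U}(\text{Gain},\ \text{Loss}) &= \log(2g) - \log(\text{Loss}) = \log(2) + \log(g) - \log(\text{Loss}) \\
\beta_{G} \approx \frac{b}{2}, \quad \beta_{L} \approx -b \quad &\Rightarrow \quad \lambda = \frac{- \beta_{L}}{\beta_{G}} \approx 2
\end{aligned}
```

The constant log(2) is absorbed by $`\beta_{Bias}`$, and a one-dollar change in Gain corresponds to only half a dollar on the *g* scale, which halves the per-dollar gain coefficient. The mirror-image argument for the 20G-40L condition gives *λ* ≈ 0.5. Both values are in line with the simulated Log-Linear estimates reported in Figure 2 (2.00 and 0.52).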

## Evidence in Data from W&S

We have shown that the differences in *λ* reported *between* conditions will emerge without decision by sampling. Next, we show that a similar pattern of results emerges *within* conditions (where no differences are to be expected based on decision by sampling) when *λ* is estimated on different lotteries.

Specifically, we focus on four subsets of lotteries taken from the 20G-20L and the 40G-40L conditions: Lotteries with small gains and small losses, lotteries with large gains and large losses, lotteries with large gains and small losses, and lotteries with small gains and large losses. We define “small” gains (losses) as the lower four values: \[\$6, \$8, \$10, \$12\] in the 20G-20L condition, and \[\$12, \$16, \$20, \$24\] in the 40G-40L condition. Similarly, we define “large” gains (losses) as the upper four values: \[\$14, \$16, \$18, \$20\] in the 20G-20L condition, and \[\$28, \$32, \$36, \$40\] in the 40G-40L condition.
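The partition described above can be sketched as follows for the 20G-20L condition. The function name and the subset labels are hypothetical (the labels mirror those used in Figure 3); each lottery is represented as a (gain, loss) pair.

```python
from itertools import product

VALUES_20 = [6, 8, 10, 12, 14, 16, 18, 20]  # gains and losses in 20G-20L


def split_subsets(values):
    """Partition all gain-loss pairs into four subsets by lower/upper half."""
    half = len(values) // 2
    small, large = values[:half], values[half:]
    return {
        "Small G/Small L": list(product(small, small)),
        "Small G/Large L": list(product(small, large)),
        "Large G/Small L": list(product(large, small)),
        "Large G/Large L": list(product(large, large)),
    }


subsets = split_subsets(VALUES_20)
for label, lotteries in subsets.items():
    print(label, len(lotteries))
```

Each subset contains 16 of the 64 lotteries, and the four subsets together cover every lottery exactly once. Passing the 40G-40L values to the same function yields the corresponding partition for that condition.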

If the results were uniquely driven by decision by sampling, then we should not find any differences in *λ* (since all participants belonged to the same condition, and thus encountered the same distribution of gains and losses). If, on the other hand, the results stem from analyzing lotteries with different outcome magnitudes, we should observe differences in *λ* *within* conditions similar to the differences in *λ* *between* conditions reported in W&S.

Analyzing a smaller subset of lotteries decreases statistical power, which limits our ability to precisely estimate *λ* at the participant-level. It also increases the likelihood that a participant accepted (or rejected) all lotteries in the subset, which would yield a singular choice matrix, and exclude the participant from the analysis. For those reasons, we instead use a pooled model:

``` math
\log\left( \frac{P(Accept)}{1 - P(Accept)} \right) = (\beta_{Bias} + \beta_{G}Gain + \beta_{L}Loss)*C(Subsets),
```

where C(Subsets) is a vector of dummies identifying the subset of lotteries under consideration. We then recover the subset-level estimates $`\beta_{G}`$ and $`\beta_{L}`$ to construct subset-level estimates of $`\lambda = \ \frac{- \beta_{L}}{\beta_{G}}\ `$.[^3]
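Written out, the interaction with the subset dummies amounts to a separate set of coefficients for each subset *s*, estimated in a single model. The subscript *s* is notation added here for illustration:

``` math
\log\left( \frac{P(Accept)}{1 - P(Accept)} \right) = \beta_{Bias,s} + \beta_{G,s}Gain + \beta_{L,s}Loss, \qquad \lambda_{s} = \frac{- \beta_{L,s}}{\beta_{G,s}}
```

All participants' choices within a subset therefore contribute to one pair of slopes, which is what makes the estimate feasible even when individual participants accepted or rejected every lottery in that subset.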

Figure 3 presents the estimates of *λ* for the four subsets of lotteries. We find similar estimates of *λ* for lotteries with small gains and small losses, and for lotteries with large gains and large losses. We find a higher estimate of *λ* when gains are large and losses are small, and a directionally lower estimate of *λ* when gains are small and losses are large.

![Figure 3. Estimates of λ for four lottery subsets within the 20G-20L and 40G-40L conditions.](https://quentinandre.net/publications/reanalysis-ws/media/image3.png)

Figure 3. Estimates of *λ* for four subsets of lotteries presented in the 20G-20L condition and the 40G-40L condition of Experiments 1a and 1b. Error bars are bootstrapped 95% confidence intervals (hence the asymmetry).

*Figure 3 (visual description): Two stacked panels, "20G-20L" (top) and "40G-40L" (bottom), each with λ on the y-axis (~−1 to 3) and a reference line at 1. Within each panel λ is estimated separately for four lottery subsets on the x-axis: Small G/Small L, Small G/Large L, Large G/Small L, Large G/Large L (points with 95% CIs). Although the experimental condition is held fixed within a panel, λ varies across subsets — notably higher for "Large G/Small L" (≈1.8 top, ≈1.3 bottom) and lower for "Small G/Large L" (≈0.5) — demonstrating that the λ estimate depends on which lotteries are measured (a measurement-invariance violation).*

# *λ* IS THE SAME WHEN ANALYZING COMMON LOTTERIES

To ensure that parameters are estimated on lotteries with similar outcome magnitudes, we can restrict the analysis to the *subset* of lotteries that are common across conditions. In Experiments 1a and 1b, for instance, 9 lotteries (out of 64) were shown to all participants regardless of their condition. Again, because analyzing a smaller subset of lotteries limits our ability to estimate *λ* at the participant-level, we estimate a pooled model:

``` math
\log\left( \frac{P(Accept)}{1 - P(Accept)} \right) = (\beta_{Bias} + \beta_{G}Gain + \beta_{L}Loss)*C(Condition)
```

where C(Condition) is a vector of dummies identifying the experimental conditions. We then recover the condition-level $`\beta_{G}`$ and $`\beta_{L}`$ to construct condition-level estimates of $`\lambda`$. As mentioned earlier, restricting the analysis to a small subset of lotteries comes at a cost in terms of statistical power. To assess the impact of this decrease in statistical power, we also estimate the same model for subsets of 9 lotteries picked at random for each participant.
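The set of common lotteries, and a random comparison subset of the same size, can be derived directly from the design of Experiment 1. The sketch below uses hypothetical variable names and an arbitrary seed; it is not the code from the OSF repository.

```python
import random
from itertools import product

SMALL = [6, 8, 10, 12, 14, 16, 18, 20]
LARGE = [12, 16, 20, 24, 28, 32, 36, 40]
CONDITIONS = {  # condition label -> (gains, losses)
    "20G-20L": (SMALL, SMALL),
    "20G-40L": (SMALL, LARGE),
    "40G-20L": (LARGE, SMALL),
    "40G-40L": (LARGE, LARGE),
}

# Every (gain, loss) pair shown in each condition.
lotteries = {
    name: set(product(gains, losses))
    for name, (gains, losses) in CONDITIONS.items()
}

# Lotteries that appear in all four conditions.
common = sorted(set.intersection(*lotteries.values()))
print(len(common), common)

# A random subset of the same size for one participant in one condition,
# used to gauge the loss of statistical power (the seed is arbitrary).
rng = random.Random(0)
random_subset = rng.sample(sorted(lotteries["40G-20L"]), k=len(common))
print(random_subset)
```

The common lotteries are the pairs formed from the three amounts that belong to both ranges (\$12, \$16, and \$20), which gives the 9 lotteries mentioned above.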

Figure 4 compares the original *λ* estimates to the ones obtained for the 9 common lotteries, and for the 9 random lotteries. When considering the subset of lotteries that are common across conditions, we do not find evidence for decision by sampling: Estimates of *λ* are not statistically different from each other, and the confidence intervals of our estimates (while wide) do not contain the original estimates of *λ* reported in W&S (conditions 20G-40L and 40G-20L). Finally, for the 9 lotteries that were selected at random for each participant, estimates of *λ* were consistent with those reported in W&S. This suggests that the non-significant differences in *λ* for the common lotteries cannot be attributed to a drop in statistical power alone.

![Figure 4. Estimates of λ for all lotteries, random subsets, and common lotteries.](https://quentinandre.net/publications/reanalysis-ws/media/image4.png)

Figure 4. Estimates of *λ* for all lotteries (circles), random subsets of 9 lotteries (squares), and common lotteries (diamonds). Error bars are bootstrapped 95% confidence intervals (hence the asymmetry).

*Figure 4 (visual description): Three stacked panels — "Experiments 1a + 1b", "Experiment 1a", "Experiment 1b" — each with λ on the y-axis (~0.6–2.6), a reference line at 1, and the four conditions (20G-20L, 20G-40L, 40G-20L, 40G-40L) on the x-axis. Three series per panel: All Lotteries (blue circles), Random Subset of 9 (orange squares), and Common Lotteries (green diamonds). All-Lotteries and Random-Subset reproduce the shifting pattern (λ spikes to ≈1.8 in 40G-20L and dips in 20G-40L), but Common Lotteries stays flat near λ = 1 across all conditions — restricting to the lotteries common to every condition eliminates the apparent shifts in loss aversion.*

# GENERAL DISCUSSION

Our analyses suggest that W&S should not be taken as evidence that loss aversion can disappear and reverse, or that decision by sampling is the origin of loss aversion. The differences in *λ* emerge because *λ* is computed on different lotteries in different conditions. We demonstrated that these differences will emerge without decision by sampling, and that there are no differences when *λ* is computed on similar lotteries in different conditions.

**Implication for Tests of Decision by Sampling Theory**

We highlight that other recently published papers make a similar error. On the topic of loss aversion, a recent article in *Journal of Experimental Psychology: Learning, Memory, and Cognition* reports that the skewness of gains and losses encountered in the environment shapes loss aversion (Walasek & Stewart, 2018). However, loss aversion is again estimated on different lotteries in different conditions.

More generally, a recent article in *Management Science* reports that the shapes of utility and probability weighting functions are influenced by the distribution of outcomes that people have encountered in the environment (Stewart et al., 2015). However, utility function parameters are again estimated on different outcomes in different conditions. A “quasi-adversarial collaboration” by Alempaki and colleagues (2019) found no evidence consistent with decision by sampling after re-analyzing the gambles that are common across conditions.

It is important to emphasize that more advanced statistical modeling is not an antidote to violations of measurement invariance in experiments. In the same way that one cannot determine whether controlling for observable characteristics of participants solves a violation of random assignment (e.g., Gordon et al., 2019), modeling a more complex utility function may or may not solve a violation of measurement invariance. It is a problem of experimental design, not one of statistical analysis.

Finally, we highlight that even if *λ* had been estimated on similar lotteries across conditions, the experiments in W&S would not provide a sufficiently specific test of decision by sampling theory (i.e., that the utility of an outcome is determined by its ordinal rank in the set of other outcomes sampled from memory). For instance, the minimum and maximum amounts that participants encountered in each condition were different, so the reference point against which people evaluate gain amounts and loss amounts might differ across conditions. As a consequence, other reference-dependent models of utility (e.g., Kőszegi & Rabin, 2006; Loomes & Sugden, 1982) would predict similar changes in *λ*. A discriminant test of decision by sampling should present a result that is not, or less parsimoniously, predicted by other models.

**Implication for Measurements of Utility Function Parameters**

Measurement invariance is not only important for making within-study comparisons. It is critical also for making between-study comparisons, because the utility function parameters that researchers obtain in any study are *local* estimates, contingent on the outcomes they were estimated on. For instance, for the three decision rules we considered in our simulation, researchers will obtain higher estimates of loss aversion if they present lotteries with large gains and small losses (as in Tom et al., 2007; or Tversky & Kahneman, 1992), and lower estimates if they present lotteries with gains and losses of similar magnitudes, or lotteries with small gains and large losses.

This has implications for how to interpret meta-analyses of utility function parameters (e.g., Kühberger et al., 1999; Neumann & Böckenholt, 2014; Walasek et al., 2018). First, a meta-analysis that reveals significant heterogeneity across studies may be taken as evidence of hidden moderators (e.g., cultural differences), while it may in fact reflect that parameters are estimated on different outcomes in different studies. Second, meta-analytical averages of utility function parameters may not be informative. Even if meta-analyses included a broad set of studies spanning all reasonable outcomes, the average of the *local* parameter estimates found across studies may not converge to the *true* parameter value (i.e., the one that would be found in a single study including all reasonable outcomes). Moreover, the meta-analytical average would not allow making good predictions for how people might behave in a specific context.

# DATA AND METHODS

The code and files needed to reproduce all the analyses reported in this manuscript have been stored on an [OSF repository](https://osf.io/67ng8/).

# APPENDIX

## Validation of the Pooled Model

Table 2 compares the estimates obtained from the individual-level approach (as described in W&S) to the estimates obtained from the pooled model that we propose (estimated by inferential and Bayesian methods). Our results closely match the original results.[^4]

<table>
<colgroup>
<col style="width: 11%" />
<col style="width: 11%" />
<col style="width: 8%" />
<col style="width: 9%" />
<col style="width: 11%" />
<col style="width: 8%" />
<col style="width: 9%" />
<col style="width: 11%" />
<col style="width: 8%" />
<col style="width: 9%" />
</colgroup>
<thead>
<tr>
<th rowspan="2" style="text-align: right;">Condition</th>
<th colspan="3" style="text-align: center;">Experiments 1a + 1b</th>
<th colspan="3" style="text-align: center;">Experiment 1a</th>
<th colspan="3" style="text-align: center;">Experiment 1b</th>
</tr>
<tr>
<th style="text-align: center;">Median <span class="math inline"><em>λ</em><sub><em>i</em></sub></span></th>
<th style="text-align: center;"><span class="math inline"><em>λ</em></span> Logit</th>
<th style="text-align: center;"><span class="math inline"><em>λ</em></span> Bayes</th>
<th style="text-align: center;">Median <span class="math inline"><em>λ</em><sub><em>i</em></sub></span></th>
<th style="text-align: center;"><span class="math inline"><em>λ</em></span> Logit</th>
<th style="text-align: center;"><span class="math inline"><em>λ</em></span> Bayes</th>
<th style="text-align: center;">Median <span class="math inline"><em>λ</em><sub><em>i</em></sub></span></th>
<th style="text-align: center;"><span class="math inline"><em>λ</em></span> Logit</th>
<th style="text-align: center;"><span class="math inline"><em>λ</em></span> Bayes</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: right;">20G-20L</td>
<td style="text-align: center;">1.00</td>
<td style="text-align: center;">1.04</td>
<td style="text-align: center;">1.04</td>
<td style="text-align: center;">1.01</td>
<td style="text-align: center;">1.06</td>
<td style="text-align: center;">1.06</td>
<td style="text-align: center;">1.01</td>
<td style="text-align: center;">1.03</td>
<td style="text-align: center;">1.03</td>
</tr>
<tr>
<td style="text-align: right;">20G-40L</td>
<td style="text-align: center;">0.86</td>
<td style="text-align: center;">0.71</td>
<td style="text-align: center;">0.72</td>
<td style="text-align: center;">0.88</td>
<td style="text-align: center;">0.73</td>
<td style="text-align: center;">0.75</td>
<td style="text-align: center;">0.81</td>
<td style="text-align: center;">0.70</td>
<td style="text-align: center;">0.71</td>
</tr>
<tr>
<td style="text-align: right;">40G-20L</td>
<td style="text-align: center;">1.73</td>
<td style="text-align: center;">1.84</td>
<td style="text-align: center;">1.83</td>
<td style="text-align: center;">1.77</td>
<td style="text-align: center;">1.94</td>
<td style="text-align: center;">1.94</td>
<td style="text-align: center;">1.59</td>
<td style="text-align: center;">1.77</td>
<td style="text-align: center;">1.76</td>
</tr>
<tr>
<td style="text-align: right;">40G-40L</td>
<td style="text-align: center;">1.02</td>
<td style="text-align: center;">1.07</td>
<td style="text-align: center;">1.07</td>
<td style="text-align: center;">1.02</td>
<td style="text-align: center;">1.08</td>
<td style="text-align: center;">1.08</td>
<td style="text-align: center;">1.01</td>
<td style="text-align: center;">1.07</td>
<td style="text-align: center;">1.07</td>
</tr>
</tbody>
</table>

Table 2. Comparison of *λ* obtained from the individual-level (vs. pooled) statistical model.

## Detailed Results from Figure 4

Table 3 presents the estimates reported in Figure 4 in tabular format. While the estimates obtained on a random subset of lotteries are close to the original estimates, we do not find differences between conditions when restricting the analysis to the lotteries that were presented in all conditions.

<table style="width:100%;">
<colgroup>
<col style="width: 11%" />
<col style="width: 9%" />
<col style="width: 9%" />
<col style="width: 9%" />
<col style="width: 9%" />
<col style="width: 9%" />
<col style="width: 9%" />
<col style="width: 9%" />
<col style="width: 9%" />
<col style="width: 9%" />
</colgroup>
<thead>
<tr>
<th rowspan="2" style="text-align: right;">Condition</th>
<th colspan="3" style="text-align: center;">Experiments 1a + 1b</th>
<th colspan="3" style="text-align: center;">Experiment 1a</th>
<th colspan="3" style="text-align: center;">Experiment 1b</th>
</tr>
<tr>
<th style="text-align: center;">All</th>
<th style="text-align: center;">Random</th>
<th style="text-align: center;">Common</th>
<th style="text-align: center;">All</th>
<th style="text-align: center;">Random</th>
<th style="text-align: center;">Common</th>
<th style="text-align: center;">All</th>
<th style="text-align: center;">Random</th>
<th style="text-align: center;">Common</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: right;">20G-20L</td>
<td style="text-align: center;">1.04</td>
<td style="text-align: center;">1.08</td>
<td style="text-align: center;">0.96</td>
<td style="text-align: center;">1.06</td>
<td style="text-align: center;">1.05</td>
<td style="text-align: center;">0.99</td>
<td style="text-align: center;">1.03</td>
<td style="text-align: center;">1.06</td>
<td style="text-align: center;">0.99</td>
</tr>
<tr>
<td style="text-align: right;">20G-40L</td>
<td style="text-align: center;">0.71</td>
<td style="text-align: center;">0.70</td>
<td style="text-align: center;">0.96</td>
<td style="text-align: center;">0.73</td>
<td style="text-align: center;">0.69</td>
<td style="text-align: center;">1.00</td>
<td style="text-align: center;">0.70</td>
<td style="text-align: center;">0.66</td>
<td style="text-align: center;">1.00</td>
</tr>
<tr>
<td style="text-align: right;">40G-20L</td>
<td style="text-align: center;">1.84</td>
<td style="text-align: center;">1.74</td>
<td style="text-align: center;">1.03</td>
<td style="text-align: center;">1.94</td>
<td style="text-align: center;">1.85</td>
<td style="text-align: center;">1.08</td>
<td style="text-align: center;">1.77</td>
<td style="text-align: center;">1.73</td>
<td style="text-align: center;">1.08</td>
</tr>
<tr>
<td style="text-align: right;">40G-40L</td>
<td style="text-align: center;">1.07</td>
<td style="text-align: center;">1.10</td>
<td style="text-align: center;">1.01</td>
<td style="text-align: center;">1.08</td>
<td style="text-align: center;">1.14</td>
<td style="text-align: center;">1.03</td>
<td style="text-align: center;">1.07</td>
<td style="text-align: center;">1.08</td>
<td style="text-align: center;">1.03</td>
</tr>
</tbody>
</table>

Table 3. Estimates of *λ* in Experiments 1a and 1b when computed on all lotteries, random subsets of nine lotteries, and the nine common lotteries.

## Unsuccessful Re-Analysis of Experiment 2

W&S describe the stimuli of this experiment as follows: *“We used two distributions for gains and losses, one ranging from \$6 to \$20 (in \$2 increments) and one three times larger, ranging from \$18 to \$60 (in \$6 increments). We only tested the asymmetric cases. Unlike in Experiments 1a and 1b, the possible gains and losses were randomly drawn and paired from the distributions to produce 64 pairs.”* This description does not match the values of gains and losses we found in the data. Gains and losses appear to be drawn at random from the range \[0, 19\] or from the range \[0, 59\], in increments of \$1. The authors confirmed that the range of values reported in the paper was indeed inaccurate.
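A check of this kind can be sketched as a comparison between the values observed in the data and the grids described in the paper. The sketch below is illustrative only: the `observed` list is made up and is not the actual data, and the function name `off_grid` is hypothetical.

```python
def off_grid(values, start, stop, step):
    """Return the sorted distinct values that are not on the grid start, start+step, ..., stop."""
    grid = set(range(start, stop + 1, step))
    return sorted(set(values) - grid)

# Grids as described by W&S for Experiment 2
SMALL = (6, 20, 2)    # $6 to $20 in $2 increments
LARGE = (18, 60, 6)   # $18 to $60 in $6 increments

# Made-up values for illustration (NOT the actual data)
observed = [0, 3, 7, 12, 19]

# Any value returned here contradicts the described design
print(off_grid(observed, *SMALL))  # -> [0, 3, 7, 19]
```

An empty list would indicate that every observed value is consistent with the described grid.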

## Re-Analysis of Experiment 3

Experiment 3 includes three conditions: One in which the gains (losses) are sampled from the \[5, 20\] (\[10, 40\]) range, one in which the gains (losses) are sampled from the \[10, 40\] (\[5, 20\]) range, and one in which both gains and losses are sampled from the \[10, 40\] range. All possible combinations of gains and losses were considered to create 256 lotteries in each condition, with 36 lotteries common to all conditions. Participants indicated their willingness to accept each lottery on a four-point scale anchored at “Strongly Reject” and “Strongly Accept.”
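The lottery counts can be checked with a short sketch. It assumes step sizes that are not stated above: steps of 1 in the \[5, 20\] range and steps of 2 in the \[10, 40\] range, which give the 16 values per range that 256 lotteries per condition imply. All variable and function names are illustrative.

```python
from itertools import product

# Assumed step sizes (not stated in the text): 16 values per range, 16 x 16 = 256
LOW = list(range(5, 21))        # 5, 6, ..., 20
HIGH = list(range(10, 41, 2))   # 10, 12, ..., 40

# (gains, losses) for each condition of Experiment 3
CONDITIONS = {
    "20G-40L": (LOW, HIGH),
    "40G-40L": (HIGH, HIGH),
    "40G-20L": (HIGH, LOW),
}

def lotteries(gains, losses):
    """All (gain, loss) pairs for one condition."""
    return set(product(gains, losses))

lottery_sets = {name: lotteries(g, l) for name, (g, l) in CONDITIONS.items()}

# Lotteries that appear in every condition
common = set.intersection(*lottery_sets.values())

for name, pairs in lottery_sets.items():
    print(name, len(pairs))      # 256 in each condition
print("common", len(common))     # 36
```

Under these assumptions, the common lotteries are those in which both the gain and the loss take one of the six values 10, 12, 14, 16, 18, or 20.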

Decision by sampling theory predicts that *λ* will be larger in the 40G-20L condition than in the 40G-40L condition. Analyzing only the lotteries that are common across conditions does not provide compelling support for decision by sampling: On these lotteries, the point estimate of *λ* is 1.12 in 40G-20L and 1.31 in 40G-40L (see Table 4 and Figure 5).

<table style="width:57%;">
<colgroup>
<col style="width: 12%" />
<col style="width: 14%" />
<col style="width: 14%" />
<col style="width: 14%" />
</colgroup>
<thead>
<tr>
<th rowspan="2" style="text-align: right;">Condition</th>
<th colspan="3" style="text-align: center;">Experiment 3</th>
</tr>
<tr>
<th style="text-align: center;">All</th>
<th style="text-align: center;">Random</th>
<th style="text-align: center;">Common</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: right;">20G-40L</td>
<td style="text-align: center;">0.73</td>
<td style="text-align: center;">0.74</td>
<td style="text-align: center;">0.89</td>
</tr>
<tr>
<td style="text-align: right;">40G-40L</td>
<td style="text-align: center;">1.24</td>
<td style="text-align: center;">1.40</td>
<td style="text-align: center;">1.31</td>
</tr>
<tr>
<td style="text-align: right;">40G-20L</td>
<td style="text-align: center;">1.86</td>
<td style="text-align: center;">2.24</td>
<td style="text-align: center;">1.12</td>
</tr>
</tbody>
</table>

Table 4. Estimates of *λ* in Experiment 3 when computed on all lotteries, random subsets of 36 lotteries, and the 36 common lotteries.

![Figure 5. Estimates of λ for all lotteries, a random 36-lottery subset, and the common 36 lotteries.](https://quentinandre.net/publications/reanalysis-ws/media/image5.png)

Figure 5. Estimates of *λ* obtained on all lotteries (circles), a subset of 36 lotteries chosen at random for each participant (squares), and the subset of 36 common lotteries (diamonds).

*Figure 5 (visual description): Point-range plot for Experiment 3. λ on the y-axis (~0.5–3.0) with a reference line at 1; the x-axis lists three conditions ordered 20G-40L, 40G-40L, 40G-20L. Three series: All Lotteries (blue circles), Random Subset (orange squares), Common Lotteries (green diamonds), with 95% CIs. All-Lotteries and Random-Subset rise steeply from ≈0.7 (20G-40L) to ≈1.85–2.25 (40G-20L), whereas Common Lotteries stays comparatively flat (≈0.9 to ≈1.1–1.3); again, the apparent variation in loss aversion disappears once only common lotteries are analyzed.*

# REFERENCES

Alempaki, D., Canic, E., Mullett, T. L., Skylark, W. J., Starmer, C., Stewart, N., & Tufano, F. (2019). Reexamining How Utility and Weighting Functions Get Their Shapes: A Quasi-Adversarial Collaboration Providing a New Interpretation. *Management Science*, *65*(10), 4841–4862. https://doi.org/10.1287/mnsc.2018.3170

André, Q., & De Langhe, B. (2019, November 11). No Evidence of Loss Aversion Disappearance and Reversal in Walasek and Stewart (2015). Retrieved from https://osf.io/67ng8/

De Langhe, B., & Puntoni, S. (2014). Bang for the buck: Gain-loss ratio as a driver of judgment and choice. *Management Science*, *61*(5), 1137–1163.

Diecidue, E., & Van De Ven, J. (2008). Aspiration level, probability of success and failure, and expected utility. *International Economic Review*, *49*(2), 683–700.

Ert, E., & Erev, I. (2013). On the descriptive value of loss aversion in decisions under risk: Six clarifications. *Judgment and Decision Making*, *8*(3), 214–235.

Gal, D., & Rucker, D. D. (2018). The Loss of Loss Aversion: Will It Loom Larger Than Its Gain? *Journal of Consumer Psychology*, *28*(3), 497–516. https://doi.org/10.1002/jcpy.1047

Gordon, B. R., Zettelmeyer, F., Bhargava, N., & Chapsky, D. (2019). A comparison of approaches to advertising measurement: Evidence from big field experiments at Facebook. *Marketing Science*, *38*(2), 193–225.

Kahneman, D., & Tversky, A. (1979). Prospect Theory: An Analysis of Decision under Risk. *Econometrica*, *47*(2), 263–291. https://doi.org/10.2307/1914185

Kőszegi, B., & Rabin, M. (2006). A Model of Reference-Dependent Preferences. *The Quarterly Journal of Economics*, *121*(4), 1133–1165. https://doi.org/10.1093/qje/121.4.1133

Kühberger, A., Schulte-Mecklenbeck, M., & Perner, J. (1999). The Effects of Framing, Reflection, Probability, and Payoff on Risk Preference in Choice Tasks. *Organizational Behavior and Human Decision Processes*, *78*(3), 204–231. https://doi.org/10.1006/obhd.1999.2830

Loomes, G., & Sugden, R. (1982). Regret Theory: An Alternative Theory of Rational Choice under Uncertainty. *Economic Journal*, *92*(368), 805–824.

Neumann, N., & Böckenholt, U. (2014). A meta-analysis of loss aversion in product choice. *Journal of Retailing*, *90*(2), 182–197.

Payne, J. W., Laughhunn, D. J., & Crum, R. (1980). Translation of Gambles and Aspiration Level Effects in Risky Choice Behavior. *Management Science*, *26*(10), 1039–1060.

Stewart, N., Chater, N., & Brown, G. D. A. (2006). Decision by sampling. *Cognitive Psychology*, *53*(1), 1–26. https://doi.org/10.1016/j.cogpsych.2005.10.003

Stewart, N., Reimers, S., & Harris, A. J. L. (2015). On the Origin of Utility, Weighting, and Discounting Functions: How They Get Their Shapes and How to Change Their Shapes. *Management Science*, *61*(3), 687–705. https://doi.org/10.1287/mnsc.2013.1853

Tom, S. M., Fox, C. R., Trepel, C., & Poldrack, R. A. (2007). The neural basis of loss aversion in decision-making under risk. *Science*, *315*(5811), 515–518.

Tversky, A., & Kahneman, D. (1992). Advances in prospect theory: Cumulative representation of uncertainty. *Journal of Risk and Uncertainty*, *5*(4), 297–323. https://doi.org/10.1007/BF00122574

Walasek, L., Mullett, T. L., & Stewart, N. (2018). A meta-analysis of loss aversion in risky contexts. *Available at SSRN 3189088*.

Walasek, L., & Stewart, N. (2015). How to make loss aversion disappear and reverse: Tests of the decision by sampling origin of loss aversion. *Journal of Experimental Psychology: General*, *144*(1), 7–11. https://doi.org/10.1037/xge0000039

Walasek, L., & Stewart, N. (2018). Context-dependent sensitivity to losses: Range and skew manipulations. *Journal of Experimental Psychology: Learning, Memory, and Cognition*, *45*(6), 957. https://doi.org/10.1037/xlm0000629

Yechiam, E. (2018). Acceptable losses: The debatable origins of loss aversion. *Psychological Research*, 1–13.

[^1]: With 118 citations, it is the 13th most-cited article since 2015 (citation metrics from Google Scholar, retrieved on March 10<sup>th</sup>, 2020).

[^2]: Throughout this manuscript, we refer to the range of gains and losses used in Experiments 1a and 1b, and we use a logistic choice rule (as in Experiments 1a, 1b and 2). This is done without loss of generality: Our observations and results remain valid regardless of the specific values of gains and losses considered, and also apply when participants rate (rather than choose) lotteries, as in Experiment 3 (cf. appendix).

[^3]: This model can be estimated using frequentist techniques (using a logistic regression, in which case the 95% confidence intervals for $`\lambda`$ are constructed by bootstrapping) or a Bayesian model (in which case the 95% confidence intervals for $`\lambda`$ are constructed by drawing from the posterior distribution). While this pooled analysis ignores individual-level heterogeneity, it yields estimates of $`\lambda`$ that closely match those reported by the authors when applied to the full data (see appendix for model comparison).

[^4]: The data and code were recovered from [Neil Stewart’s website](https://web.archive.org/web/20190925083408/https:/www.stewart.warwick.ac.uk/publications/loss_aversion/), and are also available on our [OSF repository](https://osf.io/67ng8/).
