```
library(tidyverse)
library(truncnorm)
library(extraDistr)
library(EnvStats)
library(janitor)
library(kde1d)
library(kableExtra)
set.seed(123)
<-100000 n
```

# Set-up and data preparation

We begin by uploading packages and writing functions that we will reuse. By convention, whenever we sample distributions, we will use 100,000 draws. We set the randomization seed to 123 for replicability.

We create a function to sample from the beta distribution, which we use for all prevalence estimates. To sample from the beta distribution we need to be able to find a good standard deviation, based on our estimated ranges and mean. We also write a function to perform a binary search to find a standard deviation.

```
# sample from the beta distribution by specifying the mean proportion an its standard deviation
<- function(mean_val, sd_val) {
sample_beta <- 100000
n # calculate alpha and beta parameters
<- sd_val^2
var_val <- ((1 - mean_val) / var_val - 1 / mean_val) * mean_val^2
alpha <- alpha * (1 / mean_val - 1)
beta # Check and adjust alpha and beta to ensure they are valid for the beta distribution
if (alpha <= 0 | beta <= 0) {
stop("Invalid shape parameters: alpha and beta must be greater than 0.")
}# sample from beta distribution
return(rbeta(n, alpha, beta))
}
# find a good standard deviation for beta distributions for which we have mean and range estimates by using binary search
<- function(mean_val,
find_good_sd_binary
fifth_percentile, ninety_fifth_percentile,tol=0.001, sd_val=-1) {
# Set start value
if (sd_val == -1){
=0.2
sd_val
}# Check if the provided parameters are valid
if (sd_val^2 >= mean_val * (1 - mean_val)) {
stop("Invalid sd_val: sd_val^2 must be less than mean_val * (1 - mean_val).")
}
# Initialize the search range
<- 0
lower_bound <- sqrt(mean_val * (1 - mean_val))
upper_bound
# Perform binary search to find the optimal sd_val
while (abs(upper_bound - lower_bound) > tol) {
<- (lower_bound + upper_bound) / 2
sd_val <- sample_beta(mean_val, sd_val)
samples <- sum(
prop_between_given_percentiles > fifth_percentile & samples < ninety_fifth_percentile) / length(samples)
samples
if (abs(prop_between_given_percentiles - 0.9) < tol) {
return(sd_val)
else if (prop_between_given_percentiles < 0.9) {
} <- sd_val
upper_bound else {
} <- sd_val
lower_bound
}
}return((lower_bound + upper_bound) / 2)
}
```

Next, we write functions to sample from the dirichlet distribution and the truncated log-normal distribution, which we use for estimating pain intensity and some durations of pain, respectively.

```
# sample from Dirichlet distribution
<-function(a=0, b=0, x=0, y=0){
sample_dirichletif (sum(c(a, b, x, y)) != 100){
print("Arguments do not sum to 100! Their sum is: ")
print(sum(c(a, b, x, y)))
else {
} <- 100000
n = c(a, b, x, y)[c(a, b, x, y) != 0]
non_zero_args return(rdirichlet(n, non_zero_args))
}
}
# sample from truncated log-normal distribution
<- function(n_value, mean, sd, min_value, max_value){
sample_trunclogn <-rlnormTrunc(n = n_value, meanlog = log(mean^2 / sqrt(mean^2 + sd^2)),
trunclogn_distsdlog = sqrt(log(1 + (sd^2/mean^2))),
min=min_value, max=max_value)
return(trunclogn_dist)
}
```

Finally, we make a function to print tables of our results:

```
=function(x){
show_tablekable(x, table.attr = 'data-quarto-disable-processing="true"') %>%
::kable_styling(full_width=FALSE, position="center", font_size=12,
kableExtrabootstrap_options = c("condensed"))
}
```

# Prevalence

There are four broad categories of shrimp farms: extensive, semi-intensive, intensive, and super-intensive.

Based on 2018 data from Boyd et al. (2021), and accounting for increasing intensification, we estimate the percentage of penaeid shrimp raised in different production systems as:

- Extensive: ~11.2%
- Semi-intensive: ~16.4%
- Intensive: ~71.4%
- Super-intensive: ~1%

Although we assume these estimates are credible, we add in a modest amount of uncertainty to account for potential weaknesses in Boyd et al. (2018), changes that may have occurred since the survey was conducted, and because these proportions relate to the shrimp produced at the end of production—they do not account for individuals who die pre-slaughter.

```
# Define the expected proportions for each practice
<- 0.112
ext_mean <- 0.164
semi_mean <- 0.714
int_mean <- 0.01
super_mean
sum(c(ext_mean,semi_mean,int_mean,super_mean))
```

`[1] 1`

```
# Sample from the Dirichlet distribution
<-data.frame(sample_dirichlet(ext_mean*100, semi_mean*100,
prop_sample*100, super_mean*100))
int_mean# we multiply by 100 to make the distributions less noisy
colnames(prop_sample)<-c("Ext", "Semi", "Int", "Super")
<-round(rbind((quantile(x =prop_sample[,1], probs = c(.05, .50, .95))),
production_summarydistquantile(x =prop_sample[,2], probs = c(.05, .50, .95))),
(quantile(x =prop_sample[,3], probs = c(.05, .50, .95))),
(quantile(x =prop_sample[,4], probs = c(.05, .50, .95)))), 2)
(row.names(production_summarydist)<-c("Extensive","Semi-Intensive","Intensive", "Super-Intensive")
production_summarydist
```

```
5% 50% 95%
Extensive 0.07 0.11 0.17
Semi-Intensive 0.11 0.16 0.23
Intensive 0.64 0.72 0.79
Super-Intensive 0.00 0.01 0.03
```

We then estimate prevalence ranges and means, based on data where possible, and draw from the beta distribution. Where no data on the mean is available, we often set the mean to be the mid point of the estimated range. Where no data on the range is available we use the rough, subjective likelihood categories below:

- Very unlikely: 10%
- Unlikely: 25%
- Unsure how likely: 50%
- Likely: 75%
- Very likely: 90%

# Duration (pre-slaughter mortality)

For time, we always use hours as the unit. We base pain duration on the scientific literature where possible.

The average time spent in pain from any given welfare threat has to account for shrimp that die prematurely from that or other welfare threats, because this reduces how long shrimp can suffer from that threat in the aggregate.

If a threat is a one-off event in a shrimp’s life (e.g., slaughter), we just model estimated duration. Otherwise, if it occurs regularly, we often model duration as the pain caused to a shrimp *per day*, and then weight by an estimate of the average number of days lived by a shrimp (including those that die pre-slaughter). We can then also estimate the proportion of shrimp experiencing welfare threats that are life-stage specific (e.g., slaughter).

We use the estimates presented in Waldhorn and Autric’s (2023) Guesstimate model. Specifically, we use the mortality rate of each life stage multiplied by the mean number of days a shrimp lives given that they die in a certain life stage, and then weight these by the proportion of farmed shrimp each species accounts for.

**Note:** the guesstimate model re-samples each time it is opened so the figures you see may differ slightly from those used here.

To best sample from the Guesstimate data (especially the tails of the distributions) and to remove errors created by Guesstimate (e.g., negative values), we generated 100,000 synthetic samples for each of these, based off the original 5000 guesstimate samples. To do this, we use the `kde1d`

function, which estimates a kernel density function from the 5000 samples, and the `rkde1d`

function, which generates n (in our case 100,000) samples from the estimated kernel density.

## Number of shrimp dying on farms

```
# loading die on farms samples, taken from Guesstimate model
<-read.csv("../data/die_on_farm_samples.csv", header=TRUE, sep=",")
allspecies_dof_samp
<-allspecies_dof_samp$vannamei[allspecies_dof_samp$vannamei>0] %>% # Cell GZ in the Guesstimate model
vannamei_dof# removing negative values created by Guesstimate
kde1d(xmin=0) %>% # running kde1d function and ensuring no values below 0 are created
rkde1d(100000,.) # creating 100,000 samples from the data
# repeat this for other taxa
<-allspecies_dof_samp$monodon[allspecies_dof_samp$monodon>0] %>% # Cell ES in the Guesstimate model
monodon_dofkde1d(xmin=0) %>%
rkde1d(100000,.)
<-allspecies_dof_samp$other_pen[allspecies_dof_samp$other_pen>0] %>% # Cell NN in the Guesstimate model
otherpen_dofkde1d(xmin=0) %>%
rkde1d(100000,.)
```

Now we can calculate the proportion of farmed shrimp accounted for by each species for each of our 100,000 samples.

```
<-cbind(vannamei_dof, monodon_dof, otherpen_dof) # combining the 100,000 samples for each species into a data frame
allspecies_dof
<-as.data.frame(allspecies_dof) %>%
prop_allspeciesrowwise() %>%
mutate(van_prop=vannamei_dof/sum(vannamei_dof, monodon_dof, otherpen_dof), # calculating each sample as a proportion of the total shrimp, row-wise
mon_prop=monodon_dof/sum(vannamei_dof, monodon_dof, otherpen_dof),
otherpen_prop=otherpen_dof/sum(vannamei_dof, monodon_dof, otherpen_dof))
<-as.data.frame(prop_allspecies[,4:6]) # keeping only the proportions
prop_allspecies_dof
saveRDS(allspecies_dof, file="../data/allspecies_dof.RData")
saveRDS(prop_allspecies_dof, file="../data/prop_allspecies_dof.RData")
```

## Days lived by life stage

Next we load the data for the number of days lived by a shrimp, given that it dies within a certain life stage. The three life stages under consideration here are postlarval, juvenile-adult, and the days lived before slaughter, which we call the total farmed days.

```
# First load the data
<-read.csv("../data/vannamei_days_lived.csv", header=TRUE,sep=",") # Cells TE, VO, XM, and QV in the Guesstimate model
vannamei_days_samp<-read.csv("../data/monodon_days_lived.csv", header=TRUE, sep=",") # Cells MF, HI, OZ, and LW in the Guesstimate model
monodon_days_samp<-read.csv("../data/otherpen_days_lived.csv", header=TRUE, sep=",") # Cells NQ, XV, DQ, and YY in the Guesstimate model
otherpen_days_samp
# Next make the 100,000 samples
# create empty data frame
<-data.frame(postlarval=rep(NA, 100000),
vannamei_days_livedjuvenile.adult=rep(NA, 100000),
total.farmed=rep(NA, 100000))
# fill empty data frame with data simulated from Guesstimate data using kde1d function
$postlarval<-vannamei_days_samp$postlarval %>%
vannamei_days_livedkde1d() %>%
rkde1d(100000,.)
$juvenile.adult<-vannamei_days_samp$juvenile.adult %>%
vannamei_days_livedkde1d() %>%
rkde1d(100000,.)
# repeat for other species
<-data.frame(postlarval=rep(NA, 100000),
monodon_days_livedjuvenile.adult=rep(NA, 100000),
total.farmed=rep(NA, 100000))
$postlarval<-monodon_days_samp$postlarval %>%
monodon_days_livedkde1d() %>%
rkde1d(100000,.)
$juvenile.adult<-monodon_days_samp$juvenile.adult[monodon_days_samp$juvenile.adult<214] %>%
monodon_days_livedkde1d(xmax=213.99) %>% # guesstimate produced some large samples over what we think is the maximum ongrowing length (see below) so we restrict to one day prior to the maximum ongrowing period (as this is the maximum age a juvenile--adult can be while still dying pre-slaughter)
rkde1d(100000,.)
<-data.frame(postlarval=rep(NA, 100000),
otherpen_days_livedjuvenile.adult=rep(NA, 100000),
total.farmed=rep(NA, 100000))
$postlarval<-otherpen_days_samp$postlarval %>%
otherpen_days_livedkde1d() %>%
rkde1d(100000,.)
$juvenile.adult<-otherpen_days_samp$juvenile.adult[otherpen_days_samp$juvenile.adult<153] %>%
otherpen_days_livedkde1d(xmax=152.99) %>% # guesstimate produced some large samples over what we think is the maximum ongrowing length (see below) so we restrict to just prior to the maximum ongrowing period (as this is the maximum age a juvenile--adult can be while still dying pre-slaughter)
rkde1d(100000,.)
```

Guesstimate produced a few large numbers of over one year for the total days farmed, which seems incorrect, so we remove these and restrict to:

- 183 days for
*P. vannamei*(FAO (2009a) suggests the maximum cycle is 6 months in ongrowing 6 months = ~183 days) - 214 days for
*P. monodon*(FAO (2009b) suggests the maximum cycle is >6 months in ongrowing, so we assume ~7 months, which = ~214 days) - 153 days for other penaeids (Table 4.6c–d in Wickens and Lee (2002) suggests the maximum cycle is 4 months of ongrowing. As other penaeids can refer to many different species, we are more uncertain about this value, so we increase the maximum to 5 months. 5 months = ~153 days)

The `kde1d`

function will still produce a few samples just above this estimation.

```
<-vannamei_days_samp$total_farmed_d[vannamei_days_samp$total_farmed_d<183]
vannamei_days_lived_total$total.farmed<-vannamei_days_lived_total %>%
vannamei_days_livedkde1d() %>%
rkde1d(100000,.)
<-monodon_days_samp$total_farmed_d[monodon_days_samp$total_farmed_d<214]
monodon_days_lived_total$total.farmed<-monodon_days_lived_total %>%
monodon_days_livedkde1d() %>%
rkde1d(100000,.)
<-otherpen_days_samp$total_farmed_d[otherpen_days_samp$total_farmed_d<153]
otherpen_days_lived_total$total.farmed<-otherpen_days_lived_total %>%
otherpen_days_livedkde1d() %>%
rkde1d(100000,.)
```

## Mortality rates by life stage

Now we load the data for the mortality rates samples, then make the 100,000 samples.

```
<-read.csv("../data/vannamei_mortality_rates.csv",header=TRUE, sep=",") # Cells UY, UJ, and TF in the Guesstimate model
vannamei_mort_samp<-read.csv("../data/monodon_mortality_rates.csv", header=TRUE, sep=",") # Cells ZA, XZ, and YR in the Guesstimate model
monodon_mort_samp<-read.csv("../data/otherpen_mortality_rates.csv", header=TRUE,sep=",")
otherpen_mort_samp# Cells MU, UQ, and XJ in the Guesstimate model
<-data.frame(postlarval=rep(NA, 100000),
vannamei_stage_probjuvenile.adult=rep(NA, 100000))
# fill empty data frame with data simulated from Guesstimate data using kde1d function
$postlarval<-vannamei_mort_samp$postlarval %>%
vannamei_stage_probkde1d(xmin = 0,
xmax = 1) %>%
rkde1d(100000,.)
$juvenile.adult<-vannamei_mort_samp$juvenile.adult[vannamei_mort_samp$juvenile.adult<1] %>%
vannamei_stage_probkde1d(xmin = 0, # The juvenile-adult samples from Guesstimate sampled some instances over 1 so we get rid of these first, or else the xmin and xmax functions won't work
xmax = 1) %>%
rkde1d(100000,.)
# repeat for other taxa
<-data.frame(postlarval=rep(NA, 100000),
monodon_stage_probjuvenile.adult=rep(NA, 100000))
$postlarval<-monodon_mort_samp$postlarval %>%
monodon_stage_probkde1d(xmin = 0,
xmax = 1) %>%
rkde1d(100000,.)
$juvenile.adult<-monodon_mort_samp$juvenile.adult %>%
monodon_stage_probkde1d(xmin = 0,
xmax = 1) %>%
rkde1d(100000,.)
<-data.frame(postlarval=rep(NA, 100000),
otherpen_stage_probjuvenile.adult=rep(NA, 100000))
$postlarval<-otherpen_mort_samp$postlarval %>%
otherpen_stage_probkde1d(xmin = 0,
xmax = 1) %>%
rkde1d(100000,.)
$juvenile.adult<-otherpen_mort_samp$juvenile.adult %>%
otherpen_stage_probkde1d(xmin = 0,
xmax = 1) %>%
rkde1d(100000,.)
```

Right now we have the probability that a shrimp dies in each stage, conditional on the shrimp having survived previous stages. However, we want the cumulative probability of a shrimp dying in the ongrowing period—for example, the probability of dying in the juvenile–adult stage is

\(1-p(dying\ in\ postlarval\ stage)*p(dying\ in\ juvenile--adult\ stage)\).

We also calculate the probability of a shrimp surviving to slaughter age, which means it has survived both postlarval and juvenile–adult stages.

```
# create new columns with the total probability they make it to each stage
<-vannamei_stage_prob %>%
vannamei_stage_prob_totalrowwise() %>%
mutate(postlarval.total=postlarval,
juvenile.adult.total=(1-postlarval)*juvenile.adult,
slaughter.age.total=(1-postlarval)*(1-juvenile.adult))
<-as.data.frame(vannamei_stage_prob_total) %>%
vannamei_stage_prob_totalselect(ends_with("total")) # keep only the totals
# repeat for other species
<-monodon_stage_prob %>%
monodon_stage_prob_totalrowwise() %>%
mutate(postlarval.total=postlarval,
juvenile.adult.total=(1-postlarval)*juvenile.adult,
slaughter.age.total=(1-postlarval)*(1-juvenile.adult))
<-as.data.frame(monodon_stage_prob_total)%>%
monodon_stage_prob_totalselect(ends_with("total"))
<-otherpen_stage_prob %>%
otherpen_stage_prob_totalrowwise() %>%
mutate(postlarval.total=postlarval,
juvenile.adult.total=(1-postlarval)*juvenile.adult,
slaughter.age.total=(1-postlarval)*(1-juvenile.adult))
<-as.data.frame(otherpen_stage_prob_total) %>%
otherpen_stage_prob_totalselect(ends_with("total"))
```

## Calculating pre-slaughter mortality rates

Now we can calculate the pre-slaughter mortality rate across the production cycle. This will give us estimates of the average number of days lived by a shrimp.

To get a weighted average mortality rate we need to weight the life stage rates by the probability a shrimp is from a given species. We start by creating vector of species chosen according to probabilities from proportions calculated above (see Number of shrimp dying on farms).

```
= vapply(
species_chosen 1:100000,
function(i) {sample.int(3, size=1, prob=prop_allspecies_dof[i, ])},
integer(1))
# Check that the proportions look correct
print(data.frame(vannamei.prop=length(species_chosen[species_chosen==1])/length(species_chosen),
monodon.prop=length(species_chosen[species_chosen==2])/length(species_chosen),
otherpen.prop=length(species_chosen[species_chosen==3])/length(species_chosen)))
```

```
vannamei.prop monodon.prop otherpen.prop
1 0.82507 0.0832 0.09173
```

Now, we make a data frame of 100,000 stage probabilities, weighted by the proportion of farmed shrimp that comes from each species.

```
= matrix(NA, nrow=100000, ncol=3)
stage_probabilities for (i in 1:100000){
if (species_chosen[i] == 1){
= as.matrix(vannamei_stage_prob_total[i,])
stage_probabilities[i, ]
}if (species_chosen[i] == 2){
= as.matrix(monodon_stage_prob_total[i,])
stage_probabilities[i, ]
}if (species_chosen[i] == 3){
= as.matrix(otherpen_stage_prob_total[i,])
stage_probabilities[i, ]
}
}head(stage_probabilities)
```

```
[,1] [,2] [,3]
[1,] 0.2279207 0.02892378 0.7431556
[2,] 0.1233640 0.05429026 0.8223458
[3,] 0.1248379 0.21950486 0.6556573
[4,] 0.1639346 0.12880184 0.7072636
[5,] 0.1902008 0.08136774 0.7284314
[6,] 0.1596227 0.11999293 0.7203843
```

`saveRDS(stage_probabilities, file="../data/stage_probabilities.RData")`

Now we can calculate the weighted days lived for each life stage of each species.

```
= rep(NA, 100000)
average_days_lived
for (i in 1:100000){
<- sample.int(3, size=1, prob=stage_probabilities[i, ])
stage_chosen # P. vannamei
if (species_chosen[i] == 1){
if (stage_chosen == 1){
= vannamei_days_lived$postlarval[i]
average_days_lived[i]
}if (stage_chosen == 2){
= vannamei_days_lived$juvenile.adult[i]
average_days_lived[i]
}if (stage_chosen == 3){
= vannamei_days_lived$total.farmed[i]
average_days_lived[i]
}
}# P. monodon
if (species_chosen[i] == 2){
if (stage_chosen == 1){
= monodon_days_lived$postlarval[i]
average_days_lived[i]
}if (stage_chosen == 2){
= monodon_days_lived$juvenile.adult[i]
average_days_lived[i]
}if (stage_chosen == 3){
= monodon_days_lived$total.farmed[i]
average_days_lived[i]
}
}# other penaeids
if (species_chosen[i] == 3){
if (stage_chosen == 1){
= otherpen_days_lived$postlarval[i]
average_days_lived[i]
}if (stage_chosen == 2){
= otherpen_days_lived$juvenile.adult[i]
average_days_lived[i]
}if (stage_chosen == 3){
= otherpen_days_lived$total.farmed[i]
average_days_lived[i]
}
}
}
head(average_days_lived)
```

`[1] 92.828757 140.129031 85.592464 122.142815 111.202064 9.701857`

`quantile(average_days_lived, probs = c(0.01, .05, .50, .95))`

```
1% 5% 50% 95%
11.46242 13.92991 133.03242 175.05771
```

`mean(average_days_lived)`

`[1] 114.8756`

`plot(density(average_days_lived), main= "Average mortality rate of farmed penaeid shrimp", xlab="Days")`

`saveRDS(average_days_lived, file="../data/average_days_lived.RData")`

# Intensity

We use the Pain-Track framework (Alonso & Schuck-Paim, 2021a) developed by Welfare Footprint to compare the overall burden of different welfare threats affecting farmed shrimp. The pain categories are: annoying, hurtful, disabling, and excruciating, with each worse than the last.

`<-c("Excruciating", "Disabling", "Hurtful", "Annoying") paincategories`

Welfare Footprint’s pain track categories are based on functional differences between qualitatively different types of pain. Thus, they are, strictly speaking, ordinal, not cardinal representations of pain severity.

However, leaving the pain categories disaggregated does not allow for an overall comparison of welfare threats because, for instance, some threats may have a long duration exclusively in mild pain categories, while others may have briefer threats in more severe categories. Here, we use Disabling as the reference category and assign weights to the other severity categories that are relative to it. We draw from uniform distributions for the pain category weightings as we have high uncertainty about how much worse or better different pain categories are, or whether long-lasting, mild pains should be prioritized over shorter, more severe pains or vice versa. In particular, we are uncertain about whether excruciating pain is a few or many orders of magnitude worse than disabling pain. For more discussion of this issue, see McAuliffe and Shriver (2023).

Our lower bounds for excruciating pain and upper bounds for annoying and disabling pain are weights favored by Duffy (2023, p.18) and Šimčikas (2022). The weights were therefore as follows:

- Annoying: 0.002 to 0.01
- Hurtful: 0.02 to 0.2
- Disabling: 1
- Excruciating: 5 to 1000

You can change the input below if you disagree with the relative weights. Moreover, all of the inputs below can be changed to reflect your personal beliefs about the intensity, duration, and prevalence of each welfare harm.

```
<-runif(n = 100000, min = (1/500), max = (1/100))
Annoying_Weight<-runif(n = 100000, min = (1/50), max = (1/5))
Hurtful_Weight<- 1
Disabling_Weight <-runif(n = 100000, min = 5, max = 1000) Excruciating_Weight
```