2 Donations

Note: This is largely meant as a year-independent template. ‘…’ indicates a place we may want to fill in more details1 See our discussion of what should go in each chapter here. Note that we aim to keep the commentary here very limited. The EA forum post will do a bit more commenting… feel free to insert, but hide any notes on this output that we may want to integrate into the EA forum post. We will try to move all ‘generic’ statistical and methods discussions to our methods discussion book, and link this here.

This report is summarized and discussed in the EA forum post …

Linked to forum post: Link; EA Survey 2020 Series: Donation Data

# Folder to save plots in plots_folder <- here("analysis", "plots", "donations_20")

I’d love your feedback :

Show

I’d love to get your feedback on this report. (cut …)

Ways of leaving feedback.

Connection to EA forum post described…

2.1 Introduction and summary

Very brief: Statement about the importance of charitable giving and the data collection here. (cut …), what we present, … linked outline here, e.g.,

Our modeling work work considers how donations (total, share-of-income, and ‘donated over 1000 USD’) jointly relates to a range of characteristics. We present:

In the narrative below, we simply refer to “donations” rather than “reported donations” for brevity. Unless otherwise mentioned, all figures simply add, average, or otherwise summarize individual responses from the EA Survey years mentioned.

# Construct charity-specific aggregations # TODO: do for eas_all, move to build side eas_new <- eas_new %>% mutate( num_named_dons = rowSums(!is.na(select(., one_of(all_chars)))), dev_don = rowSums(across(all_of(dev_health_chars)), na.rm = TRUE), d_dev_don = dev_don > 0, animal_don = rowSums(across(all_of(animal_chars)), na.rm = TRUE), d_animal_don = animal_don>0, ea_meta_don = rowSums(across(all_of(ea_meta_chars)), na.rm = TRUE), d_ea_meta_don = ea_meta_don>0, lt_ai_don = rowSums(across(all_of(lt_ai_chars)), na.rm = TRUE), d_lt_ai_don = lt_ai_don>0, other_don = rowSums(across(all_of(other_chars)), na.rm = TRUE), d_other_don = other_don>0 ) %>% mutate_at(.vars =where_don_vars, funs(ifelse(num_named_dons==0, NA, .)) ) eas_new %<>% labelled::set_variable_labels(.labels = as.list(all_char_labels), .strict=FALSE)
pct_tot <- function(x, df=eas_new) { x/NROW(df)*100 } num_don <- sum(eas_new$donation_c>0, na.rm=TRUE) num_na_don <- sum(is.na(eas_new$donation_c)) zero_don <- sum(eas_new$donation_c==0, na.rm=TRUE) tot_don <- sum(eas_new$donation_c, na.rm=TRUE) #for all years, for USA nonstudents only tot_don_all_usa <- sum(eas_all$donation_c[eas_all$d_live_usa==1 & eas_all$d_student==0], na.rm=TRUE) tot_inc_all_usa <- sum(eas_all$income_c_imp_bc5k[eas_all$d_live_usa==1 & eas_all$d_student==0], na.rm=TRUE) tot_share_don_us_nonstudent <- tot_don_all_usa/tot_inc_all_usa tot_don_dev <- sum(eas_new$dev_don, na.rm=TRUE) tot_don_animal <- sum(eas_new$animal_don, na.rm=TRUE) tot_don_ea_meta <- sum(eas_new$ea_meta_don, na.rm=TRUE) tot_don_lt_ai <- sum(eas_new$lt_ai_don, na.rm=TRUE) med_don <- median(eas_new$donation_c, na.rm=TRUE) mean_don <- mean(eas_new$donation_c, na.rm=TRUE) mean_don_not_new <- mean(eas_new$donation_c[eas_new$year_involved_n!=year_s], na.rm=TRUE) mean_don_ly <- mean(eas_all$donation_c[eas_all$year==(year_n-1)], na.rm=TRUE) mean_don_ly_not_new <- mean( eas_all$donation_c[eas_all$year==(year_n-1) & as.numeric(eas_all$year_involved)!=(year_n-1)], na.rm=TRUE) plan_donate_ly_c <- filter(eas_all, year == year_n-1) %>% pull(donation_plan_c) mean_plan_ly <- mean(plan_donate_ly_c, na.rm=TRUE) #plan in 2018 for 2019 med_plan_ly <- median(plan_donate_ly_c, na.rm=TRUE) med_not_new <- median(eas_new$donation_c[eas_new$year_involved_n!=year_s], na.rm=TRUE) top_1p3don <- eas_new %>% select(donation_c) %>% slice_max(donation_c, prop =.013) %>% sum() top_1p3share <- top_1p3don/tot_don

2.1.1 Summary, key results and numbers

Responses, total and average donation

  • 55.5% of EAs in the 2020 survey reported making a charitable donation in 2019

  • 13.7% reported making zero donations, and

  • 30.8% did not respond to this question.

    • Thus, of those who responded, 80.3% reported making a donation in the prior year.
  • Participants reported total donations of 10,695,926 USD in 2019 (cf 16.1M USD in 2018).


Trends

  • The number of survey participants went from 0 in 2019 (0 of whom answered the donation question) to 12841 (7288 answering the donation question) in 2020.

  • Over the past years, we see … (trend in median or mean donation amounts reported).2

  • Median annual donation in 2019: 528 USD

    • Cf prior year …
  • Mean (reported) annual donation for 2019 was 7,516 USD (cf 9,370 for 2018)

  • or 8,607 USD excluding those who joined in 2020

  • (cf 10,246 USD for 2018 excluding those who joined in 2019).

  • Median annual donation in 2019 excluding those who joined EA in 2020 was 761 USD (cf. 990 USD for the comparable median for 2018/2019 and 832 USD for 2017/2018). (See ’donation and income trends in EA’ for more details).


Donation as shares of income, distribution of donations

  • In 2019 1.3% of donors accounted for $6,437,404 in donations or 60% of the survey total.

    • (Cf …)
med_don_share <- median(eas_new$don_share_inc_imp, na.rm = TRUE) med_don_share_imp_bc <- median(eas_new$don_share_inc_imp_bc5k, na.rm = TRUE) med_don_share_ly <- median(eas_lastyr$don_share_inc_imp, na.rm = TRUE) med_don_share_imp_bc_ly <- median(eas_lastyr$don_share_inc_imp_bc5k, na.rm = TRUE) earn_filter <- quos(d_student==0, income_c>10000) med_don_share_imp_ns_10k <- eas_new %>% filter(!!!earn_filter) %>% summarise(med=median(don_share_inc_imp, na.rm = TRUE)) tot_inc <- sum(eas_new$income_c, na.rm=TRUE) tot_inc_imp_bc <- sum(eas_new$income_c_imp_bc5k, na.rm=TRUE) share_don_gt_10pct <- sum(eas_new$don_share_inc_imp>=.1, na.rm = TRUE)/sum(!is.na(eas_new$don_share_inc_imp)) share_don_gt_10pct_imp <- sum(eas_new$don_share_inc_imp_bc5k>=.1, na.rm = TRUE)/sum(!is.na(eas_new$don_share_inc_imp_bc5k)) share_don_gt_5pct_imp <- sum(eas_new$don_share_inc_imp_bc5k>=.05, na.rm = TRUE)/sum(!is.na(eas_new$don_share_inc_imp_bc5k)) share_don_gt_10pct_earn <- eas_new %>% filter(!!!earn_filter) %>% transmute(share_don_gt_10pct = sum(don_share_inc_imp>=.1, na.rm = TRUE)/sum(!is.na(don_share_inc_imp)) ) %>% unlist %>% .[1] #don gt 10pct ... by gender #eas_new %>% # mutate(d_don_gte10_imp = don_share_inc_imp>=.1) %>% # tabyl(gender_manual, d_don_gte10_imp) %>% tabylstuff()
  • The median percentage of income donated in 2019 was 2.14% (cf 2.61% in 2018).

  • However, if we impute “0 and missing incomes” at “group medians for student-status and country,”3 the median percentage of income donated was 2% for 2019.

  • Mean share of total (imputed) income donated was 9.4% (imputing income where below 5k or missing) or 12.5% without imputation.

  • 19.9% of EAs who answered the donation question reported donating 10% or more of their income in 2019 (if we impute income; otherwise 21.4% without imputation; this compares to …, without imputation).

  • The median percent of income donated by full-time-employed non-students who earned more than $10,000 was 2.92%, and of this group 23.9% donated 10% of their income or more in 2018 (cf 3.38% and 24% in 2018).

  • Overall, those taking the EA survey tend to report donating a substantially greater share of income than those in the general US population – (web link).

don_tot_freq <- eas_all %>% dplyr::filter(year==max(year)) %>% summarise(across(c(all_of(donate_charity_names)), ~sum(as.numeric(.x) > 0, na.rm = TRUE))) %>% slice(1) %>% unlist(., use.names=TRUE) pct_don <- function(x) { sum(don_tot_freq[x])/sum(don_tot_freq)*100 } pct_ddon <- function(x) { op( sum(x != 0, na.rm=TRUE)/sum(notNA(x), na.rm=TRUE)*100 ) }
don_stats <- eas_new %>% filter(num_named_dons>0) %>% select(all_of(where_don_vars)) %>% vtable::sumtable( summ=c('notNA(x)', 'sum(x != 0)', 'sum(x != 0)/notNA(x)', 'mean(x)', 'sd(x)', 'pctile(x)[50]', 'pctile(x)[90]'), summ.names = c('Number of Responses', 'Number reporting donation to cause', 'Share of reporters donating to cause', "Mean donation of reporters (including 0's)", 'Sd', "Median", "90th pct"), digits=c(0,0,2,0,0,0,0), simple.kable = TRUE, labels = all_char_labels2, #it's a horrible workaround but we need to have the order of these the same as the table order ... I think it's a flaw of sumtable title = "Donations by category (where indicated)", out="kable") %>% kable_styling() #todo (low-priority) -- replace with .summ hijacked command n_rep_char <- sum(eas_new$num_named_dons>0, na.rm=TRUE)
don_stats_by_gwwc <- eas_new %>% mutate(`GWWC Pledge` = case_when( action_gwwc==1 ~ "Yes", action_gwwc==0 ~ "No" )) %>% filter(num_named_dons>0) %>% select(all_of(where_don_vars), `GWWC Pledge`) %>% vtable::sumtable(group = "GWWC Pledge", group.test=TRUE, summ=c('notNA(x)','sum(x != 0)/notNA(x)', 'mean(x)', 'sqrt(var(x)/length(x))', 'pctile(x)[50]'), summ.names = c('N Responses', 'Share positive', 'Mean', "Median"), digits=c(0,2, 0,0,0), simple.kable = TRUE, labels = all_char_labels2, #it's a horrible workaround but we need to have the order of these the same as the table order ... I think it's a flaw of sumtable title = "Donations by category (where indicated), by GWWC", out="kable") %>% row_spec(1:1, bold = TRUE) %>% kable_styling()
ddon_stats_by_gwwc <- eas_new %>% mutate(`GWWC Pledge` = case_when( action_gwwc==1 ~ "Yes", action_gwwc==0 ~ "No" )) %>% filter(num_named_dons>0) %>% select(all_of(where_don_dummies), `GWWC Pledge`) %>% vtable::sumtable(group = "GWWC Pledge", group.test=TRUE, summ=c('notNA(x)','sum(x != 0)/notNA(x)'), summ.names = c('N Responses', 'Donated to... ?'), digits=c(0,2), simple.kable = TRUE, labels = all_char_labels2, #it's a horrible workaround but we need to have the order of these the same as the table order ... I think it's a flaw of sumtable title = "Binary: Indicated donating to category, by GWWC", out="kable") %>% row_spec(1:1, bold = TRUE) %>% kable_styling() # .kable() %>% # .kable_styling("striped") #todo (low-priority) -- replace with .summ hijacked command


Where EAs donated

  • While 69.2% of respondents answered the donation question, only 20.9% answered at least one question about where they donated.
  • Among these, the charity that the most EAs stated that they donated to was (…) with 121 reported donations (out of a total of 715 reported donations).
  • Global Poverty charities continue to attract (…) the largest counts and amounts of donations. 62% of those who answered the relevant question reported donating to this category. 55.1% of the total ‘where donated’ reports were to global poverty charities. We sum 1,703,870 USD in total donations reported as specifically going to global poverty charities.

    • This compares to 27.3% reporting donating, 21.4% of donations and \(\$\) 645,086 total donated for animal charities,

      • 17.2%, 11.9% and \(\$\) 330,910 for EA movement/meta charities,
      • and 18.2%, 11.5% and \(\$\) 418,403 for long term and AI charities, respectively.


Donations vs plans

  • Evidence is mixed on whether EAs’ donations in a year tend to exceed or fall short of the amount they planned to donate (as they reported in previous surveys).


Descriptive and predictive models

  • … model performance, brief note about technical modeling choice changes if any

Link: Why does the EA Survey ask about donations?

See discussion in 2020 EA Forum post.

2.2 Total EA donations, magnitudes in context

Considering the magnitude of the donations…

The $10,695,926 USD in donations reported above seems likely to be a small share of total EA-affiliated giving, …

See our fairly lengthy discussion and benchmarking of this in the 2020 EA Forum post

2.3 Career paths: Earning-to-give

[Caveat about changes in question text (mention and link …)]

Changes for students and nonstudents (bearing in mind the above caveats):

etg_rates_all <- eas_all %>% filter(year>2014) %>% group_by(year) %>% summarise( "Count" = n(), "Share ETG" = mean(as.numeric(d_career_etg)) ) etg_rates_ns <- eas_all %>% filter(year>2014) %>% filter(d_student==0) %>% group_by(year) %>% summarise( "Count" = n(), "Share ETG" = mean(as.numeric(d_career_etg)) ) ( etg_rates_tab <- bind_cols(etg_rates_all, etg_rates_ns[-1]) %>% magrittr::set_names(c("Year", "All responses", "Share EtG", "Nonstudents", "Nonstudents: Share EtG")) %>% kable(caption = "Rates of 'Earning-to-give' by year and student status (see caveats)", digits=3) %>% .kable_styling() )
Table 2.1: Rates of ‘Earning-to-give’ by year and student status (see caveats)
Year All responses Share EtG Nonstudents Nonstudents: Share EtG
2015 2362 0.217 980 0.322
2017 1845 0.220 671 0.380
2018 2599 0.303 1791 0.336
2019 2509 0.283 1898 0.262
2020 2056 0.151 1035 0.232
#todo - medium priority: combine the above tables into a single table: overall, just for students with just n,
(etg_rates_plot <- eas_all %>% group_by(year, d_student) %>% filter(year>2014) %>% filter(!is.na(d_student)) %>% #@oska (low-med priority todo): we should functionalize these mutations for computing se and CIs (or find someone who has done). We do it again and again, and the code is bulky #maybe incorporate my se_bin function #@oska todo ... also functionalize or otherwise preserve a good version of this graph # Calculate standard error, confidence bands and change student factor levels summarise( m_etg = mean(as.numeric(d_career_etg)), se = se_bin(d_career_etg)) %>% mutate( etg_low = m_etg - 1.96*se, etg_high = m_etg + 1.96*se, d_student = as.factor(if_else(d_student == 0, "Non-student", "Student")), year = as.factor(year)) %>% ggplot(aes(x=year, y=m_etg, colour = d_student, group = d_student)) + geom_pointrange(aes(ymin = etg_low, ymax = etg_high), position = position_dodge(width=0.5)) + # Ensure that bars don't overlap geom_line(position = position_dodge(width=0.5)) + xlab("Mean (and 95% CI) response share in 'Earning-to-give'") + ylab("Share of sample") + scale_color_discrete("") + # Remove legend title scale_y_continuous(labels = scales::percent_format(accuracy = 1L), limits=c(0,NA), oob = scales::squish) + # Change y-axis to percentages theme(legend.position = c(0.9, 0.95), #legend.background = element_rect(fill=alpha('blue', 0.001)), legend.key = element_blank()) )

The decline in ETG is less dramatic among non-students (over 23% of non-student respondents still report ETG as their ‘current career’), but it nonetheless appears to be fairly strong and consistent from 2017-present.4

2.4 Donation totals: descriptives

5

Overall donations, totals by groups

Below, we present a histogram of positive reported 2018 donations by all respondents. Note that:

  • the horizontal axis is on a logarithmic scale,
  • 13.7% of the 2,056 total respondents reported donating zero, and
  • 30.8% of the total respondents did not report their donation amount.
donation_c <- eas_new$donation_c require(scales) don_breaks <- c(50, 100, 200, 300, 500, 1000, 2500, 5000, 10000, 25000, 50000, 100000, 250000, 500000, 1000000, 2500000) eas_new %<>% rowwise() %>% mutate(donation_c_50 = max(donation_c, 50)) %>% ungroup ( donhist_tyly <- eas_new %>% hist_plot_lscale(eas_new$donation_c_50, breaks = don_breaks ) + geom_vline_mean(donation_c) + geom_vline_med(donation_c) + geom_vline_90(donation_c) + labs(title="Histogram of prior year's Donations (latest survey)", x="last year's $ Donations (bottom-coded at 50)", y = "Number of respondents") )

# Todo (medium importance): Overlay a display of 'overall percentage shares' ... so we know where the 80th and 90th percentile are, etc.


In 2018 we reported [link/quote] (Or make this into a concise bulleted comparison)

We compare the results for 2020 (for 2019 donations):

  • Median donation (of those reporting): 528 USD
  • Donation of $1000 puts you in the 59.5th percentile.
  • Being in the top 10% requires donating 9,972
  • Being in the top 1% means donating 89,560 USD.


require(treemapify) geom_treemap_opts <- list(treemapify::geom_treemap(alpha = 0.7), geom_treemap_text(fontface = "italic", colour = "white", place = "centre", grow = TRUE, min.size = 1 ), theme(legend.position = "none", plot.title = element_text(hjust = 0.5)) ) ( don_share_by_size <- eas_new %>% geom_tree_tot_by_split(donation_c, donation_c_split, title = "Share of total last year's donation amount, by donation size") )

… discussion or cut


‘Which career paths are driving donation totals?’


Some caveats:

Show

Caveats: The figures below exclude 486 participants who provided no response to the career question (… doublecheck this), 0.236 of the sample. These participants reported a total of $$2,766,310 in donations which makes up 25.9% of the total reported donations for 2019 .

  • Totals
  • Overall shares
#library(treemapify) don_by_career_XXX <- eas_new %>% geom_tree_tot_by_split(donation_c, career_, title= "Share of last year's donations by career path") ( don_by_career <- eas_new %>% select(career_, donation_c) %>% filter(!is.na(career_)) %>% group_by(career_) %>% filter(!is.na(career_)) %>% summarise(total_don = sum(donation_c, na.rm=TRUE), n = n()) %>% mutate(don_share = round(total_don/sum(total_don)*100), freq = n/sum(!is.na(eas_new$career_)) ) %>% ggplot(aes(area = total_don , fill=freq, # Include percentage of total donation label = paste(career_, paste0(don_share, "%"), paste0("(Pop:", round(freq*100) , "%)"), sep = "\n"))) + geom_treemap_opts + # theme(legend.position = "bottom") + #todo -- add title to legend explaining that it's the survey pop; get better colors for this scale_fill_continuous(name = "Frequency", label = scales::percent, trans = "reverse") + labs(title= "Share of last year's donations by career path", subtitle = "(Share of survey population in parentheses; darker = larger share)") )

( career_tab <- eas_new %>% mutate(Career = na_if(career_, "na")) %>% filter(!is.na(Career)) %>% tabyl_ow(Career) %>% adorn_totals() %>% rename_with(snakecase::to_title_case) %>% .kable(caption = "Shares in each career path", padding = 0, digits=c(0,0,2)) %>% .kable_styling() )
Table 2.2: Shares in each career path
Career N Percent
Academia 242 15.41
For Profit Earn to Give 237 15.10
Building Capital 203 12.93
For Profit 183 11.66
Still Deciding 167 10.64
Other 166 10.57
Non Profit Ea 130 8.28
Government 96 6.11
Think Tank Lobby 90 5.73
Non Profit 56 3.57
Total 1,570 100.00


We put this in perspective, considering that income levels are different between these career paths:

grp_sum <- function(df, xvar, yvar, groupvar) { df %>% dplyr::select({{xvar}}, {{yvar}}, {{groupvar}}) %>% group_by({{groupvar}}) %>% drop_na({{xvar}}, {{yvar}}, {{groupvar}}) %>% summarise( mn_y = mean({{yvar}}), mn_x = mean({{xvar}}), med_y = median({{yvar}}), med_x = median({{xvar}}), se_y = sd({{yvar}}, na.rm=TRUE)/sqrt(length({{yvar}})), se_x = sd({{xvar}}, na.rm=TRUE)/sqrt(length({{xvar}})) ) %>% group_by({{groupvar}}) %>% # Calculate confidence intervals mutate( lower = max(0, mn_y - 1.96*se_y), upper = mn_y + 1.96*se_y ) } plot_grp <- function(df, groupvar, labsize=4, labangle=90, force = 1, fp = 1, mo=10, bp=1, arrow=NULL) { df %>% ggplot(aes(x=mn_x, y=mn_y, label = {{groupvar}})) + geom_point() + geom_abline(intercept = 0, slope = 0.1, colour="violetred1") + geom_smooth(method=lm, alpha=0.7) + geom_errorbar(aes(ymin = lower, ymax = upper), alpha=0.7) + scale_y_continuous( oob = scales::squish) + scale_x_continuous( oob = scales::squish) + ggrepel::geom_text_repel( size = labsize, angle = labangle, max.overlaps=mo, force=1, force_pull = fp, box.padding = bp, arrow = arrow, color="brown", alpha=0.75) }
( don_inc_career_plot <- eas_new %>% mutate(Career = na_if(career_, "na")) %>% filter(!is.na(Career)) %>% grp_sum(income_c_imp_bc5k, donation_c, Career) %>% plot_grp(Career, labsize=3) + xlab("Mean income in USD (imputed if <5k/missing)") + ylab("Mean donations, CIs") + scale_y_continuous(limits=c(-10000, 30000), oob = scales::squish) )

The plot above depicts mean income and mean donations by ‘career group,’ with 95% CI’s for the latter. We superimpose a ‘line of best fit’ (blue, with smoothed 95% intervals for this rough fit) and a ‘10% of income donation’ line (red). Unsurprisingly, for-profit ‘not-EtG’ are below the fitted line, and ‘for-profit EtG’ above this line, although 95% CIs are fairly wide. We also note that among people in non-profit careers, there are similar average incomes whether or not the non-profit is EA-aligned, but the non-profit EA people seem to donate somewhat more (although the CI’s do overlap).


Next, we present reported donation amounts by income groupings.6

#p_load(treemapify) ( don_share_by_income <- eas_new %>% select(donation_c, income_c_imp_bc_k, income_c_imp_split) %>% filter(!is.na(income_c_imp_bc_k)) %>% group_by(income_c_imp_split) %>% summarise(total_don = sum(donation_c, na.rm=TRUE), n = n()) %>% mutate(don_share = round(total_don/sum(total_don)*100), freq = n/sum(!is.na(eas_new$income_c_imp_split))) %>% ggplot(aes(area = total_don, fill= freq, # Include percentage of total donation label = paste(income_c_imp_split, paste0(don_share, "%"), paste0("(Pop:", (round(freq*100, 1)) , "%)"), sep = "\n"))) + geom_treemap_opts + scale_fill_continuous(name = "Frequency", label = scales::percent, trans = "reverse") + labs(title= "Share of last year's donations by income groups", subtitle = "(Share of survey population in parentheses; darker = larger share)") )

earn_tab <- eas_new %>% tabyl_ow_plus(income_c_imp_split)

Compare the above graph to the ‘donations by donations size’ graph…


Finally, we report donation totals by country.

First for 2019 donations alone:

#p_load(treemapify) ( don_share_country <- eas_new %>% select(donation_c, country_big) %>% group_by(country_big) %>% summarise(total_don = sum(donation_c, na.rm=TRUE), n = n()) %>% mutate(don_share = round(total_don/sum(total_don)*100), freq = n/sum(!is.na(eas_new$country))) %>% ungroup() %>% filter(don_share != 0 & !is.na(country_big)) %>% ggplot(aes(area = total_don, fill= freq, # Include percentage of total donation label = paste(country_big, paste0(don_share, "%"), paste0("(Pop:", op(round(freq*100, 0)) , "%)"), sep = "\n"))) + geom_treemap_opts + #scale_fill_continuous(name = "Frequency", label = scales::percent, trans = "reverse") + scale_fill_continuous(name = "Frequency", label = scales::percent, trans = "reverse") + labs(title= "Share of last year's donations by country", subtitle = "(Share of survey population in parentheses; darker = larger share)") )

#; darker = larger share


Next, pooling across all years of the EA survey (without any weighting or adjustment):

( don_share_country_all_years <- eas_all %>% select(donation_c, country, year) %>% filter(!is.na(country)) %>% group_by(country) %>% summarise(total_don = sum(donation_c, na.rm=TRUE), n = n()) %>% ungroup() %>% mutate(don_share = round(total_don/sum(total_don)*100), freq = n/sum(!is.na(eas_all$country))) %>% filter(don_share > 0.1) %>% mutate(country = snakecase::to_title_case(country)) %>% ggplot(aes(area = total_don, fill= freq, # Include percentage of total donation label = paste(country, paste0(don_share, "%"), paste0("(Pop:", op(round(freq*100, 0)) , "%)"), sep = "\n"))) + geom_treemap_opts + scale_fill_continuous(name = "Frequency", label = scales::percent, trans = "reverse") + theme(legend.position = "none", plot.title = element_text(hjust = 0.5)) + labs(title= "Share of total (all years) donation amounts by country", subtitle = "(Share of survey population in parentheses; darker = larger share)") )


And again, ‘Winsorizing’ donations at 100K USD (setting larger donations at this value), to reduce the impact of outliers:

( don_share_country_all_years_w <- eas_all %>% select(donation_c, country, year) %>% filter(!is.na(country)) %>% rowwise() %>% mutate(donation_c_w = min(donation_c, 100000)) %>% ungroup() %>% group_by(country) %>% summarise(total_don_w = sum(donation_c_w, na.rm=TRUE), n = n()) %>% ungroup() %>% mutate(don_share = round(total_don_w/sum(total_don_w)*100), freq = n/sum(!is.na(eas_all$country))) %>% filter(don_share > 0.1) %>% mutate(country = snakecase::to_title_case(country)) %>% ggplot(aes(area = total_don_w, fill= freq, # Include percentage of total donation label = paste(country, paste0(don_share, "%"), paste0("(Pop:", op(round(freq*100, 0)) , "%)"), sep = "\n"))) + geom_treemap_opts + scale_fill_continuous(name = "Frequency", label = scales::percent, trans = "reverse") + theme(legend.position = "none", plot.title = element_text(hjust = 0.5)) + labs(title= "Share of 100k-Winsorised donations by country; all years", subtitle = "(Share of survey population in parentheses; darker = larger share)") )

#TODO - @oska -- UK and USA in all-caps above


#TODO - @oska -- capitalization below #TODO - @oska -- sort by shares below ( country_tab <- eas_all %>% group_by(country_big) %>% filter(year>2014) %>% mutate( year_2020 = case_when( year==2020 ~ "2019 share.", TRUE ~ "pre-2019 share." ), `Country` = str_to_title(country_big), ) %>% tabyl(`Country`, year_2020) %>% adorn_percentages("col") %>% .kable(digits=2, caption="Shares (0-1) of survey population by country; larger countries only", label=TRUE) %>% .kable_styling() )
Table 2.3: Shares (0-1) of survey population by country; larger countries only
Country 2019 share. pre-2019 share.
Australia 0.04 0.04
Belgium 0.01 0.00
Brazil 0.00 0.01
Canada 0.03 0.03
Czech Republic 0.02 0.01
Denmark 0.00 0.00
France 0.02 0.01
Germany 0.06 0.05
Netherlands 0.02 0.01
New Zealand 0.02 0.01
Norway 0.02 0.01
Other 0.09 0.06
Poland 0.00 0.00
Sweden 0.01 0.01
Switzerland 0.02 0.01
Uk 0.11 0.12
Usa 0.30 0.29
NA 0.22 0.32


#d_anim <- "Y" #library(gganimate) anim_filename <- here(plots_folder, "animated_tree_plot.gif") if (exists("d_anim")) { if (d_anim == "Y") { animated_dons_country <- eas_all %>% select(year, donation_c, country_big) %>% group_by(year, country_big) %>% filter(year>2014) %>% summarise(total_don = sum(donation_c, na.rm=TRUE)) %>% mutate(don_share = round(total_don/sum(total_don)*100)) %>% ggplot(aes(area = total_don, fill= country_big, # Include percentage of total donation label = paste(country_big, paste0(don_share, "%"), sep = "\n"))) + geom_treemap_opts + ggtitle("Share of total last year's reported donation amounts by country") anim <- animated_dons_country + transition_states(year, state_length = 3) + ggtitle("Share of total {closest_state} reported donation amounts by country") gganimate::anim_save(anim_filename, anim) anim } else{ knitr::include_graphics(anim_filename) } }

if (!exists("d_anim")){ knitr::include_graphics(anim_filename) } #Todo (medium importance): slo

Summarize or skip discussion of this here….

We return to presenting a ‘controlled descriptive picture’ in our modeling work.7

2.4.1 Donation (shares) vs. income and GWWC pledge

Quick notes or recap … (or skip)

The histograms beloware first only for those with positive reported incomes, and next with the previously-discussed income imputation. The blue vertical line depicts the share of total (imputed) income donated by all respondents, with the green line depicting the median and the red line the 90th percentile. …

scale_x_set <- list(scale_x_continuous(limits=c(0,0.35), n.breaks=20)) ( don_share_inc_imp_hist <- eas_new %>% hist_plot(don_share_inc_imp) + geom_vline_med(eas_new$don_share_inc_imp, tgap=0.01) + geom_vline_mean(tot_don/tot_inc, tgap=0.01, label = "Overall share") + geom_vline_90(eas_new$don_share_inc_imp, tgap=0.005) + scale_x_set + labs(title="2019 Donations/Income (no imputing)", x="2019 Donations/income", y="Number of respondents") + ylim(0, 300) )

##Todo -- Medium priority: mean is missing # todo -- low priority: make the above histogram bigger, it's smaller than the rest don_share_inc_imp_hist_imp <- eas_new %>% hist_plot(don_share_inc_imp_bc5k) + geom_vline_mean(tot_don/tot_inc_imp_bc, tgap=0.01, label = "Overall share") + geom_vline_med(eas_new$don_share_inc_imp_bc5k, tgap=0.005) + geom_vline_90(eas_new$don_share_inc_imp_bc5k, tgap=0.005) + scale_x_set + labs(title="2019 Donations/Income (with imputing)", x="2019 Donations/income (with imputing)", y = "Number of respondents") + ylim(0, 300) don_share_inc_imp_hist_imp

#Todo -- Medium priority(@oska): convert to 'share of respondents', add cumulative plot
don_share_inc_imp_hist_imp %>% ggplotly()


The noticeable spike at 10% likely reflects the GWWC pledge (we return to this further below). As noted above, 19.9% of EAs reported a donation at or above 10% of their (imputed) income in 2019. 36.1% reported an amount at or above 5%.


#Donations and donation shares -- scatterplots by income and GWWC 'action' p_load(ggpubr) op_ax <- function(x) round(as.numeric(x), digits=2) scale_y_don <- scale_y_log10( name = "Donation amount (bottom-coded at $50)", # labels = scales::dollar, labels = scales::label_number_si(prefix = "$"), n.breaks = 10, limits = c(50, NA) ) don_income_gwwc_sp <- eas_all %>% filter(year==2020) %>% ggpubr::ggscatter( x = "income_c_imp_bc_k", y = "donation_c_min50", color = "d_gwwc_ever", size = 0.8, xlab = "Income in $1k USD (imputed where missing or lt 5k)", repel = TRUE, palette = "jco", yscale = "log10", xscale = "log10", add = "loess", add.params = list(color = "black", fill = "lightgray"), conf.int = TRUE ) + labs(title = "Donations by income (log scales)") + scale_x_log10(name="Income in $1K USD (imputed if <5k/missing)", labels = op_ax, n.breaks=5, limits=(c(5,5000))) + labs(colour = "Mentioned taking GWWC pledge") + scale_y_don + theme(axis.text.x = element_text( angle = 90, vjust = 0.5, hjust = 1 )) don_income_gwwc_sp_gwwc <- eas_all %>% filter(year==2020) %>% ggplot(aes(x = income_c_imp_bc_k, y = donation_c_min50, color = d_gwwc_ever)) + geom_point(size = 1, alpha = 0.7) + # draw the points geom_smooth(aes(method = 'loess', fill = d_gwwc_ever)) + # @Oska -- note I am using local smoothing here. scale_x_log10(name = "Income in $1K USD (imputed if below 5k/missing)", n.breaks = 5, limits = c(5, 5000)) + scale_y_log10( name = "Donation amount (bottom-coded at $50)", # labels = scales::dollar, labels = scales::label_number_si(prefix = "$"), n.breaks = 10, limits = c(50, NA) ) + scale_color_discrete(name = "GWWC pledge") + scale_fill_discrete(guide = "none") + theme(axis.text.x = element_text( angle = 90, vjust = 0.5, hjust = 1 ), legend.position = c(.87,.15), legend.background = element_rect(fill=alpha('blue', 0.01))) ##Todo -- Medium priority - clean up the above a bit more... get the axes better so that we can really see the 'large mass in the middle a bit better. Maybe slightly smaller dots and bolder smoothed lines, perhaps different colors for the CI shading for each # - perhaps use geom_pointdensity with different shapes to indicate regions of "larger mass" # #TODO -- Add some layer to better capture the masses *exactly at* 10pct # REVIEW # We should note that this doesn't include those who donate nothing due to the log scale (pseudo log scale is a bit weird here as well)
require(ggpointdensity) don_share_income_by_X <- eas_all %>% filter(year==2020) %>% mutate(income_c_imp_bc5k_k = income_c_imp_bc5k/1000) %>% rowwise() %>% mutate(don_share_inc_imp_bc5k = min(don_share_inc_imp_bc5k, 0.4)) %>% ungroup() %>% group_by(d_gwwc_ever_0) %>% mutate(med_gwwc = median(don_share_inc_imp_bc5k, na.rm=TRUE)) %>% ungroup() %>% group_by(engage_high_n) %>% mutate(med_eng = median(don_share_inc_imp_bc5k, na.rm=TRUE)) %>% ggplot(aes(x = income_c_imp_bc5k_k, y = don_share_inc_imp_bc5k)) + ggpointdensity::geom_pointdensity(adjust=0.25) + geom_smooth(method = "loess") + #geom_hline_med(y) + geom_hline(yintercept=0.1, linetype="dashed", size=0.5, color = "red") + scale_y_continuous(labels = scales::label_percent(accuracy = 1L)) + scale_x_log10(breaks = scales::log_breaks(n=7)) + scale_color_viridis_c("density of respondents") + xlab("Income in $1K USD (imputed if missing, bottom-code at 5k)") + theme(axis.title.x = element_text(size = 10)) + ylab("Donations/Income (top-code at 40%)") don_share_income_by_engage_sp <- don_share_income_by_X + geom_hline(aes(yintercept=med_eng), linetype="dashed", size=0.5, color = "blue") + facet_wrap(~engage_high_n, nrow=3) + ylab("Donations/Income (top-coded at 50%)") + labs(title="By 'High-engagement': last year's 'Don. shares of income' by income (w. imputing)") don_share_income_by_gwwc_sp <- don_share_income_by_X + geom_hline(aes(yintercept=med_gwwc), linetype="dashed", size=0.5, color = "blue") + facet_wrap(~d_gwwc_ever_0) + labs(title="By GWWC: last year's 'Don. share of income' by income (w/ imputing)")


How do donations relate to income, and does this relationship differ between those who mention that they took the Giving What We Can (10%) pledge?

don_income_gwwc_sp


We give a scatterplot of reported donations against income, faceted by GWWC pledge, with separate locally-smoothed conditional means (and 95% confidence intervals for these conditional means). (The figure below is for 2019 donations only.)

don_income_gwwc_sp_gwwc


Next we plot donations as shares of income against income for non-GWWC pledgers (combined with non-responders) and GWWC pledgers. The median for each group is given by the dashed blue line, and the dashed red line represents 10 percent of income.

don_share_income_by_gwwc_sp

( tab_don_by_year_pledge <- eas_all %>% filter(!is.na(d_don_10pct_bc5k) & year>=2015) %>% mutate(`Survey year` = year, d_don_plan_10pct = as.numeric(donation_plan_c/income_c_imp_bc5k >=0.1), d_don_plan_10pct = if_else(year<2018, NaN, d_don_plan_10pct)) %>% group_by(d_gwwc_ever_0, `Survey year`) %>% summarise(n = n(), "Donated 10% of income" = mean(d_don_10pct_bc5k), "Donated 10% of income (plan)" = mean(d_don_plan_10pct, na.rm=TRUE) ) %>% rename("Ever GWWC pledge" = d_gwwc_ever_0) %>% adorn_rounding(digits = 2) %>% kable(caption = "GWWC pledgers: Don. 10%+ of income by survey year (exclusions: see text)", label=TRUE) %>% .kable_styling() )
Table 2.3: GWWC pledgers: Don. 10%+ of income by survey year (exclusions: see text)
Ever GWWC pledge Survey year n Donated 10% of income Donated 10% of income (plan)
No/NA 2015 819 0.13 NaN
No/NA 2017 673 0.16 NaN
No/NA 2018 1221 0.13 0.19
No/NA 2019 1125 0.12 0.16
No/NA 2020 959 0.11 0.19
Yes 2015 352 0.36 NaN
Yes 2017 354 0.36 NaN
Yes 2018 668 0.40 0.53
Yes 2019 579 0.37 0.48
Yes 2020 463 0.39 0.48

Among those who report having ever taken a GWWC pledge (and who report donations, and excluding those reporting incomes below 5000 USD) [… share who report donating over 10%, discussion and caveats]

Further discussion (or link), relation to previous posts and figures, linkk to supplements…

2.4.2 Employment and student status

We present income and donation statistics for those “statuses” with more than 50 respondents in the forest plot below (a full table of statistics for each group can be found in the appendix).8 In each of the forest plots in this subsection, the blue line presents a simple linear best-fit of these points, and the red line represents a 10% donation rate.

se <- function(x) sqrt(var(x)/length(x)) sumstatvec <- c("{median}", "{p10}-{p90}", "{mean} [{se}] ({sd})")
doninclabs <- list(income_k_c ~ "Income in $1000 USD", donation_c ~ "Last year's donation (in USD)", donation_plan_c ~ "Latest planned donation")
don_inc_by_student <- eas_new %>% group_by(status_) %>% mutate( status_ = as.character(status_), large_group = case_when( n()<50 ~ "Other", TRUE ~ status_) ) %>% ungroup() %>% dplyr::select(income_k_c, donation_c, donation_plan_c, large_group) %>% tbl_summary(by = large_group, type = c(all_continuous()) ~ "continuous2", statistic = list(all_continuous() ~ sumstatvec), label = doninclabs, missing = c("no") ) %>% bold_labels() %>% add_n() %>% add_overall() #TODO: High -- fix the column labels #todo (low) -- we use this several times and it's a good format; let's functionalise it #Todo (medium): Bootstrapping the SE of the median would be nice, see, e.g., https://clayford.github.io/dwir/dwr_12_generating_data.html
library(ggrepel) # # 1.summarize donation and income (mean and 95pct CI for each) by status_ # 2. plot median (and mean) donation by income for each group (income lowest to highest) # 3. fit a line/curve of donation by income for each group (do for ) -- replace with the regression line based on the population not the groups # 4. Add error bars (for donations, not income) -- hard to do for median, though #TODO -- High Priority: Make this nice in the ways discussed (@oska it seems you have already started this) # why are the error bars not surrounding the point? # make it pretty (use your judgment), fix labels, add median colored dot, ( don_inc_status_plot <- eas_new %>% mutate( status_ = str_replace_all( status_, c("_" = " ") ) ) %>% grp_sum(income_c_imp_bc5k, donation_c, status_) %>% plot_grp(status_, labsize=3, fp=0.3, force=5, mo=20, bp=1.5, arrow = arrow(length = unit(0.02, "npc")) ) + xlab("Mean income in USD (imputed if < 5k/missing)") + ylab("Mean donations, 95% CIs") + scale_y_continuous(limits=c(-10000, 30000), oob = scales::squish) )

# Todo (low): Plot regression line for full pop # Todo: HIGH -- get this to look nicer, label it better, add better axis breaks (every 5k for donation, every 20k for income) #Todo (Medium) -- add plots for the medians #Todo

Donations generally track income by this aggregation, with some groups possibly ‘under-performing’ or ‘over-performing’; we return to this in our descriptive modeling.9

2.4.3 Donations by country

Donations and income by country

We report similar income and donation statistics for all countries with more than 50 respondents:

( don_income_by_ctry <- eas_new %>% dplyr::select(income_k_c, donation_c, donation_plan_c, country_big) %>% tbl_summary( by = country_big, sort = all_categorical() ~ "frequency", #reverse this ordering or maybe reverse sort by average income type = c(all_continuous()) ~ "continuous2", statistic = list(all_continuous() ~ sumstatvec), label = doninclabs, missing = c("no") ) %>% bold_labels() %>% add_n() %>% add_overall() )
Characteristic N Overall, N = 1,607 australia, N = 82 belgium, N = 13 brazil, N = 9 canada, N = 59 czech republic, N = 32 denmark, N = 10 france, N = 51 germany, N = 123 netherlands, N = 51 new zealand, N = 37 norway, N = 48 Other, N = 179 poland, N = 10 sweden, N = 16 switzerland, N = 45 uk, N = 221 usa, N = 621
Income in $1000 USD 1,384
Median 33 36 31 7 27 17 17 13 17 15 44 29 18 9 41 25 36 50
10%-90% 1-130 7-109 3-50 4-54 3-86 2-54 9-71 0-42 2-70 0-64 0-120 1-72 0-102 5-24 2-59 0-77 3-82 2-192
Mean [se] (SD) 60 [4] (136) 49 [5] (46) 29 [5] (18) 20 [9] (24) 41 [6] (42) 25 [4] (23) 33 [11] (35) 18 [3] (17) 28 [3] (29) 31 [6] (38) 52 [9] (50) 34 [5] (31) 40 [5] (56) 13 [3] (9) 34 [8] (27) 39 [6] (37) 69 [17] (242) 85 [6] (146)
Last year's donation (in USD) 1,397
Median 533 872 237 183 231 100 626 178 355 237 1,026 654 296 119 805 484 660 1,000
10%-90% 0-9,577 0-7,271 0-5,468 0-2,015 0-6,948 0-1,796 46-7,124 0-3,551 0-4,971 0-2,367 14-12,317 0-5,453 0-5,367 4-861 25-5,339 55-12,049 0-6,600 0-16,180
Mean [se] (SD) 7,348 [1,956] (73,113) 3,647 [971] (8,411) 1,301 [679] (2,351) 759 [500] (1,322) 2,753 [930] (6,769) 589 [180] (933) 2,672 [1,230] (3,889) 1,178 [283] (1,919) 1,582 [273] (2,850) 925 [226] (1,536) 5,157 [2,492] (14,315) 2,025 [385] (2,523) 2,041 [402] (4,354) 337 [195] (552) 2,267 [914] (3,167) 16,645 [13,326] (85,328) 20,727 [12,512] (181,320) 7,211 [864] (20,215)
Latest planned donation 1,377
Median 1,000 1,636 473 733 761 398 528 473 947 395 1,369 1,091 414 172 1,769 548 1,320 2,000
10%-90% 0-12,020 11-12,360 44-5,848 165-2,840 0-8,676 0-4,030 0-6,054 0-4,660 17-5,823 0-4,438 192-6,843 109-6,380 0-7,640 55-1,399 12-13,002 55-12,049 0-7,788 0-21,000
Mean [se] (SD) 9,831 [2,001] (74,249) 4,852 [964] (8,404) 2,115 [960] (3,461) 1,165 [501] (1,325) 2,909 [844] (6,142) 1,320 [459] (2,342) 1,716 [876] (2,628) 2,026 [514] (3,450) 2,216 [360] (3,692) 1,358 [280] (1,901) 5,233 [2,465] (14,162) 2,434 [439] (2,915) 3,996 [1,287] (13,798) 630 [406] (1,149) 6,071 [3,311] (11,468) 4,522 [1,618] (10,358) 24,592 [11,797] (169,316) 12,033 [2,345] (54,382)
#todo (medium?): make a stem-leaf thing here #todo (High): add *medians* to the above


# don_inc_status_plot <- eas_new %>% # dplyr::select(status_, donation_c, income_k_c) %>% # group_by(status_) %>% # drop_na(status_, donation_c, income_k_c) %>% # summarise(across(c(donation_c, income_k_c), # list(mean=mean, # median=median, # se = ~sd(.x)/sqrt(length(.x))))) %>% # group_by(status_) %>% p_load(ggimage) country_codes <- tibble(country = c("Australia", "Canada", "France", "Germany", "Netherlands", "Other", "United Kingdom", "USA"), code = c("ac", "ca", "fr", "de", "nl", "yt", "gb", "us")) ( don_inc_country_plot <- eas_new %>% grp_sum(income_c_imp_bc5k, donation_c, country_big) %>% left_join(., country_codes, by = c("country_big" = "country")) %>% plot_grp(country_big) + xlab("Mean income in USD (imputed if <5k/missing)") + ylab("Mean donations, CIs") + scale_y_continuous(limits=c(-3000, 30000), oob = scales::squish) )

#Note -- plotly seems to destroy country labels here #+ggimage::geom_flag()

Above, we plot donations and income by country of residence for the countries with the largest number of EA respondents. We fit a simple best-fit (least-squares) line in blue, and add a red line depicting a 10% donation rate. Again, donations generally track income, with some under and over-performers (see later modeling).10

( don_by_country_viol_ly <- eas_new %>% plot_box_pt_viol(donation_c, country_big, notch=TRUE) + labs(title = "Donation amounts by country (last year)") ) ( don_by_country_viol_all <- eas_all %>% plot_box_pt_viol(donation_c, where_live_cat, notch=TRUE) + labs(title = "Donation amounts by country grouping (2013-2019)") ) ( don_by_yr_viol_all <- eas_all %>% mutate(year=as.factor(year)) %>% plot_box_pt_viol(donation_c, year) + labs(title = "Donation amounts by year") )

Donations, age and years in EA

Next, we consider how donations may increase or decrease with ‘time-in-EA’ (i.e., ‘tenure’) as well as age.11

Note on Plotly12

don_by_tenure_facet_age <- eas_new %>% filter(!is.na(age_ranges)) %>% ggplot() + aes(x = tenure, y = donation_c_min50) + geom_point(size = 0.15, colour = "#0c4c8a", position = position_jitter(seed = 42, width = 0.1, height = 0.001)) + geom_smooth(span = 0.75) + scatter_theme + facet_grid(vars(), vars(age_ranges), scales = "free") + labs(title = "Last year's donation by time in EA", subtitle = "Faceted by Age ranges") + labs(x = get_label(eas_new$tenure)) + scale_y_don
don_by_tenure_facet_age

We next report the comparable chart for donation as a share of income:

donshare_by_tenure_facet_age <- eas_new %>% filter(!is.na(age_approx_ranges)) %>% ggplot() + aes(x = tenure, y = don_share_inc_imp_bc5k) + geom_point(size = 0.15, colour = "#0c4c8a", position = position_jitter(seed = 42, width = 0.1, height = 0.001)) + geom_smooth(span = 0.75) + scatter_theme + facet_grid(vars(), vars(age_approx_ranges), scales = "free") + labs(title = "Last year's donation as share of (imputed) income by time in EA", subtitle = "Faceted by Age ranges") + labs(x = get_label(eas_new$tenure)) + ylab(element_blank()) + ylim(0, 0.3)
donshare_by_tenure_facet_age

13

By referrer

Next, we consider how donations vary by ‘which referrer’ (i.e., which link) took an individual to the EA survey. Again, the blue line gives linear fit (for group means), and the red line the slope for donating 10% of income.

( don_inc_referrer_plot <- eas_new %>% grp_sum(income_c_imp_bc5k, donation_c, referrer_cat) %>% plot_grp(referrer_cat) + scale_y_continuous(limits=c(0, 15000), oob = scales::squish) + xlab("Mean income by group in USD (imputed if <5k/missing)") + ylab("Mean donations by group, CIs") + ggtitle("Donation by income and referrer") )

# (Todo?) I wonder if we should get rid of the blue line and gray line for this … or replace it with one from an individual-based regression

2.6 Which charities (causes and categories) are EAs donating to?

Only a small share of respondents report where they are donating. We group this into several categories summarized below, reporting for only those 429 respondents who indicated at least one category of donations.

#don-statistics-category-show, #TODO - HIGH: add better cause labels to this, visualise it in a way that conveys the aggregate shares of donations counts and amounts #created near the top of this file don_stats
(#tab:don_stats)Donations by category (where indicated)
Variable Number of Responses Number reporting donation to cause Share of reporters donating to cause Mean donation of reporters (including 0’s) Sd Median 90th pct
Global health + development 429 266 0.62 3971.7 21405.3 153.9 6017.6
Animal welfare 429 117 0.27 1503.7 11025.4 0 964.5
EA meta and organization 429 74 0.17 771.4 7106.9 0 261.7
Long term & AI 429 78 0.18 975.3 9963.9 0 345.6
Other 429 47 0.11 874.4 12863.9 0 25.3

Below, we depict the amounts and density of donations for each category.


Details:

Show

The vertical axis is on a logarithmic scale. The width of the violin plot depicts the smoothed density. In the box, the horizontal lines represent medians for each, lower and upper margins of the box 25th and 75th percentiles, “whisker” lines extends from the box to to the largest (smallest) value no further than 1.5 \(\times\) the inter-quartile range, and large dots represent outlying points beyond the edge of the whiskers.

##TODO -- #sort by reverse frequency of donations to a cause #TODO: bottom code and change the scale on this, time permitted ( don_by_cause_viol <- eas_new %>% filter(num_named_dons>0) %>% select(where_don_vars, action_gwwc_f) %>% gather(cause, don, -action_gwwc_f) %>% ggplot() + aes(cause, don) + geom_violin() + geom_boxplot() + ylab("Donation amount") + geom_point(size = 0.30, colour = "#0c4c8a", position = position_jitter(seed = 42, width = 0.3, height = 0.01)) + scatter_theme + scale_y_log10(labels = scales::label_number_si(prefix = "$"), n.breaks = 10) + scale_x_discrete(labels = function(x) str_wrap(all_char_labels, width = 10)) + labs(title = "Donation amounts by category: see description above") )

#Todo (Low to medium) ... @oska: if it's easy-ish, maybe gganimate this one across years? #@David: Kinda difficult to do this as the variables in where_don_vars don't seem to align with eas_all #@oska -- it is there, in variables like `donate_[charity]_year` but it would require considerable data cleaning work. Will ask/see if it's worth it.


We also check whether donations to each cause (incidence and amounts) vary by whether the person (ever) took a GWWC pledge.

#TODO -- High Priority (@oska): clean up the below to be more readable, add the mean and a CI for the mean ( don_by_cause_viol_gwwc <- eas_new %>% filter(num_named_dons>0 & !is.na(action_gwwc_f)) %>% select(where_don_vars, action_gwwc_f) %>% gather(cause, don, -action_gwwc_f) %>% ggplot() + aes(cause, don, color=action_gwwc_f) + scale_color_discrete(name="GWWC pledge", labels=c("No", "Yes")) + geom_violin() + geom_boxplot(notch=TRUE) + geom_point(size = 0.30, colour = "#0c4c8a", position = position_jitter(seed = 42, width = 0.3, height = 0.01)) + scatter_theme + scale_y_log10() + scale_x_discrete(labels = function(x) str_wrap(all_char_labels, width = 10)) )


Difference from previous plots :

Show

Comparing this to the “Donation amounts by category…” plot, Here, the lower and upper margins of the (now ‘notched’) box present an estimate of 95% confidence interval for medians (for those reporting at least one category of donations and reporting GWWC status).


Below, we tabulate donations for each cause, by group.15

don_stats_by_gwwc
Table 2.4: Donations by category (where indicated), by GWWC
Variable N Responses Share positive Mean Median N Responses Share positive Mean Median Test
GWWC Pledge No Yes
Global health + development 241 0.59 2832.7 1144.1 187 0.66 5450.2 1854.9 F=1.573
Animal welfare 241 0.22 1480.4 867.1 187 0.34 1531 495.5 F=0.002
EA meta and organization 241 0.12 696.4 553.1 187 0.24 872.1 335.7 F=0.064
Long term & AI 241 0.13 1144.7 830 187 0.25 762.1 274.8 F=0.155
Other 241 0.06 1289.5 1103.3 187 0.17 344.1 96.8 F=0.567


Next, as above, but for donation incidence:16

ddon_stats_by_gwwc
Table 2.5: Binary: Indicated donating to category, by GWWC
Variable N Responses Donated to… ? N Responses Donated to… ? Test
GWWC Pledge No Yes
Global health + development 241 187 X2=2.399
… No 100 41% 63 34%
… Yes 141 59% 124 66%
Animal welfare 241 187 X2=7.897***
… No 189 78% 123 66%
… Yes 52 22% 64 34%
EA meta and organization 241 187 X2=8.283***
… No 211 88% 143 76%
… Yes 30 12% 44 24%
Long term & AI 241 187 X2=9.831***
… No 210 87% 140 75%
… Yes 31 13% 47 25%
Other 241 187 X2=11.681***
… No 226 94% 155 83%
… Yes 15 6% 32 17%
# #TODO -- High Priority (@oska): -- the below is a mess... we want both the frequency table and test for each of these ... but how to do it. I feel like I've done this before. maybe the function in rstuff `fisherme` would help? # TODO (high-medium): Once we get it to work, do similar plots and tests for different 'which cause' comparisons ... fisher_cats <- eas_new %>% filter(num_named_dons>0) %>% dplyr::select(all_of(where_don_vars)) %>% lapply(janitor::fisher.test, y = eas_new$action_gwwc_f[eas_new$num_named_dons>0], simulate.p.value=TRUE)


Discussion of results above:

Show

… As suggested in the first of the two tables above, among those who report a charity category, those who took the GWWC pledge tend to give …

2.7 Donations: plans/aspirations vs. actual (reported) amounts

To consider:17

#filtering and shaping functions f_don_plan_by_year <- function(df=eas_all, years=latest_years) { #adjusting for comparing planned and actual donation for same year in question (but not always for 'same individuals') {df} %>% select(year, donation_c, donation_plan_c) %>% gather(donation_type, value, -year) %>% mutate(year = if_else(donation_type == "donation_plan_c", year, year-1)) %>% mutate(year = fct_rev(as.factor(year)), donation_type = fct_recode(donation_type, "Planned Donation" = "donation_plan_c", "Donation" = "donation_c")) %>% filter(year %in% years) } f_don_last_3 <- function(df=eas_all, years=last_3_years) { #this is for comparing to 'planned donation' (next year) df %>% dplyr::filter(year %in% years) %>% group_by(year) %>% select(year, donation_c, donation_plan_c) %>% gather(donation_type, value, -year) } f_next_d_don <- function(df=eas_all) { #same as f_don_last_3, but instead of gather it constructs a differenced variable `next_d_don` df %>% dplyr::filter(year %in% last_3_years) %>% select(year, donation_c, donation_plan_c) %>% transmute(next_d_don = donation_plan_c - donation_c) }
#Construct key tibbles to use in comparing planned and actual for 'this year' demographics <- c('age', 'gender', 'country', 'employ_status') # Filtering for those present in both datasets planned_actual_ly_ty <- eas_all %>% filter(year %in% c(year_n, year_n-1) & !is.na(ea_id)) %>% select(ea_id, donation_c, donation_plan_c, year) %>% distinct() %>% group_by(ea_id) %>% filter(n() == 2) %>% # Filter for those appearing in both years pivot_wider(names_from = "year", values_from = c("donation_c", "donation_plan_c")) %>% # Remove unnecessary columns select(-donation_plan_c_2020, -donation_c_2019) %>% # drop_na() %>% # Ensure that each participant had planned donation from 2019 and actual donation from 2020 # TODO - fix, this is dropping everything rename(donation_last_survey_year = donation_c_2020, planned_donation_prior_survey_year = donation_plan_c_2019) %>% #2022: renamed from 'donation_ly' and 'planned_donation_ly' for greater clarity # Add demographic information left_join(., select(eas_new, all_of(demographics), ea_id, action_gwwc, start_date, end_date, income_c), by = "ea_id") #Convert to long format for ggplot planned_actual_ly_ty_l <- planned_actual_ly_ty %>% group_by(ea_id) %>% gather(donation_type, value, donation_last_survey_year, planned_donation_prior_survey_year)
## helper functions f_ly_hyp <- function(df) { #2019 data for donation difference df %>% filter(donation_last_survey_year > 0 & planned_donation_prior_survey_year>0) %>% #positive don in each year transmute(don_diff = donation_last_survey_year - planned_donation_prior_survey_year) #only the difference is used; this adds an 'attribute' to this object }
#test_rep_don_diff_mn_19 #point hypothesis of 0 mean (+attribute) #1000 replications of the relevant 'data' #test_rep_don_diff_med_19: as above but for median #test_rep_next_d_don_mn_18_20 #for actual vs *next* year's plan (means) #test_rep_next_d_don_med_18_20 ... (medians) #### Linked tests: New Purr testing framework ##### # ...Alternate between testing mean and median = 0 #### mean_zero_hyp <- list(null = "point", mu = 0) med_zero_hyp <- list(null = "point", med = 0) hyps <- list(mean_zero_hyp, med_zero_hyp) # ...Stats to calculate ##### stat_mean <- list(list(stat = "mean")) stat_median <- list(list(stat = "median")) bs_1000 <- list(reps = 1000, type = "bootstrap") #dataframes for testing 'current less next donation' and 'actual less planned donation' df_next_don <- eas_all %>% f_next_d_don df_don_diff <- planned_actual_ly_ty %>% f_ly_hyp n <- 4 # Total number of tests ... mean and median for each dataframe (better to softcode this?) responses <- c( rep("don_diff", n/2), rep("next_d_don", n/2)) p_value_directions <- rep("two_sided", n) # ... Functionalize #### dfs <- list( rep(list(df_don_diff), n/2), rep(list(df_next_don), n/2)) linked_df_labels <- c(rep("Last year-this year linked responses", 2), rep("Last 3 years all responses", 2)) linked_test_var_type <- c(rep("Actual vs Planned", 2), rep("'Next year' vs Current", 2))
linked_tests_df <- tibble(df = do.call(c, dfs), # Dataframes (needs tidying) # Stats to calculate stat = rep(c(stat_mean, stat_median), n/2), # Hypotheses to test hypothesis = rep(hyps, n/2), # Samples to generate gen = rep(list(bs_1000), n), # Outcome variables response = responses, # Direction for p-value calculation p_val_dir = rep("two_sided", n)) # .... actually run tests and collect pvalues etc #### linked_tests_df <- linked_tests_df %>% mutate(results = pmap(., htest_infer_sim)) #htest_infer_sim was defined in `hypothesis_test.R`; it runs the steps in the Infer testing package with options selected based on the content of the arguments. linked_tests_results <- extract_hyp_results(linked_tests_df) %>% #extract and label key results for reporting and plotting mutate(data_label = linked_df_labels, data_type = linked_test_var_type)
# ... make a tibble of the relevant dataframes and 'test formula elements' #### unlinked_tests_df <- tibble(df = do.call(c, unlinked_data), formula = rep(list(unlinked_formula), n), hypothesis = rep(hyp_unlinked, n), gen = rep(list(perm_200), n), stat = c(rep(d_order_diff_means, n-1), d_order_next_diff_means), p_val_dir = rep("two_sided", n))
# Column labels rename_test_results <- c("Statistic" = "stat", "Null type" = "null", "Null value" = "null_value", "Point estimate" = "point_estimate", "CI Lower" = "lower_ci", "CI Upper" = "upper_ci", "P-value" = "p_value", "Sample" = "data_label") # This can be used for plotting full_test_results <- dplyr::bind_rows(linked_tests_results, unlinked_diff_in_means_results, unlinked_diff_in_medians_results) %>% select(-c(order)) %>% mutate(across(c(stat, null, p_val_dir), ~ snakecase::to_sentence_case(.x))) # This forms the basis for tables/displaying stats full_test_results_clean <- full_test_results %>% select(-c(reps, type, formula, p_val_dir, response)) %>% rename(!!rename_test_results) %>% mutate(Statistic = str_replace_all(Statistic, c("means" = "Mean", "medians" = "Median", "Diff" = "Difference")))
#making tables #For linked tests: current_next_test_results_clean <- full_test_results_clean %>% filter(data_type == "'Next year' vs Current") %>% select(-c(data_type, null_dist)) planned_v_actual_test_results_clean <- full_test_results_clean %>% filter(data_type == "Actual vs Planned") %>% select(-c(data_type, null_dist)) planned_v_actual_test_table <- planned_v_actual_test_results_clean %>% select(-c(`Null value`, `Null type`, `Sample`)) %>% kable(caption = "Actual minus planned donations for last year, linked participants (this year and last year)", digits=c(0,0,0,3)) %>% kable_styling() current_next_test_table <- current_next_test_results_clean %>% select(-c(`Null value`, `Null type`, `Sample`)) %>% kable(digits=c(0,0,0,0,3), caption = "Planned minus last year's donation, 2018-20, all participants who report donations") %>% kable_styling()
#making tables for UNLINKED tests: planned_actual_unlinked_results_table <- full_test_results_clean %>% arrange(match(Sample, c("Full sample (Prior donation years)", "Involved before last year (Prior-year don)", "GwwC only (Prior-year don)", "'Matched individuals'"))) %>% filter(`Null type` == "Independence" & data_type == "Actual - Planned" ) %>% select(-c(data_type, null_dist, `Null value`, `Null type`)) %>% select(Sample, Statistic, everything()) %>% kable(caption = "Actual versus Planned donation distributions: permutation tests", digits=c(0,0,0,0,0,3)) %>% kable_styling() next_current_unlinked_results_table <- full_test_results_clean %>% filter(`Null type` == "Independence" & data_type == "'Next year' - 'this year'" ) %>% select(-c(data_type, null_dist, `Null value`, `Null type`)) %>% select(Sample, Statistic, everything()) %>% kable(caption = "'Next year (plan)' - 'this year' donation distributions: permutation tests", digits=c(1,1,1,3)) %>% kable_styling()

Do people meet or exceed the amount they intended or planned to donate for the next year?

In recent surveys, we have asked “In [current year] how much do you currently plan to donate?” We also ask “in [previous year], roughly how much money did you donate?”

Note: timings of surveys:

Show

The EA surveys have been released at various points in the year:

In 2017, the survey was released in April; thus the ‘plan’ was reported only about 1/3 of the way through the year (or slightly later, depending on response time).

In 2018, the survey was released in May.

In 2019, it was released in August, about 3/4 of the way throughout the year.

Thus, for each of these years, the year-to-year comparison may tell us something about whether people lived up to their plans.

2019 Planned vs. actual: Individuals present in both surveys

We first consider those 441 respondents who can be matched across the 2019 and 2020 surveys.18

The plots below cover only respondents who appear in both samples and provide planned and actual donation values. These individuals make up 22.9% of the total respondents that appear in the 2020 sample and 15% of the total respondents across 2019 and 2020.19

# Create plots for planned and actual donations matched across 2019 scales_point_density_min50 <- list(limits = c(50, 500000), trans = scales::pseudo_log_trans(base=10),