19 Other suggested sections
19.1 ‘Conjoint analysis’
See end of this Slack thread
19.2 (Open and robust science: RP attitudes, discussions, resources)
Integrate from:
Code to do robustness checks as a ‘specification chart’, and sensitivity analysis
Source: https://pbs.twimg.com/media/ESyDHGjUYAYHfva?format=jpg&name=medium
19.3 (Meta-analysis)
Incorporate and consolidate from Reinstein’s meta notes and more
Related: ‘aggregating expert judgment’
Testing out the aggreCAT package on the ACX prediction contest
Install devtools (needed for the GitHub install below)
install.packages("devtools")
Installing devtools [2.4.5] ...
OK [linked cache]
Install aggreCAT
#install.packages("sessioninfo", dependencies = TRUE)
devtools::install_github("metamelb-repliCATS/aggreCAT")
#Note: this failed initially because dependent packages needed updating. `update.packages()` may solve it. Nope, it didn't; that update threw all sorts of errors.
library(aggreCAT)
Input the ACX prediction data
acx <- read_csv(here("sample_data", "acx_2023blindmode_predictions.csv"))
#Improve: source from a hosted file or API
Identify my own predictions
acx %<>% mutate(
  d_reinstein = ifelse(
    `@1.WillVladimirPutinbePresidentofRussia` == 82 & `@2.WillUkrainecontrolthecityofSevastopol` == 54,
    TRUE, FALSE)
)
What can aggreCAT do for us?
First, let's test this package with its built-in data.
Below we demonstrate the simplest and most commonly implemented aggregation method, ArMean, which takes the arithmetic mean of participant Best Estimates. We first use a small subset of 5 participants for a single claim (claim 28).
Code
data(data_ratings)
set.seed(1234)

participant_subset <- data_ratings %>% # just 5 users with names
  distinct(user_name) %>%
  sample_n(5) %>%
  mutate(participant_name = paste("participant", rep(1:n()))) # numbers rather than id codenames

single_claim <- data_ratings %>%
  filter(paper_id == "28") %>% # all data on a single claim
  right_join(participant_subset, by = "user_name") # join to the above 5 users (keeping only these 5)

# note this stat only uses the rows with element == `three_point_best`
single_claim %>%
  filter(element == "three_point_best") %>%
  AverageWAgg(expert_judgements = ., type = "ArMean")
method | paper_id | cs | n_experts |
---|---|---|---|
ArMean | 28 | 70.8 | 5 |
Can we do the same across multiple claims with this package?
Code
all_claims <- data_ratings %>%
  right_join(participant_subset, by = "user_name") # join to the above 5 users (keeping only these 5)

all_claims %>%
  filter(element == "three_point_best") %>%
  AverageWAgg(expert_judgements = ., type = "ArMean")
method | paper_id | cs | n_experts |
---|---|---|---|
ArMean | 100 | 75 | 5 |
ArMean | 102 | 34 | 5 |
ArMean | 103 | 59.4 | 5 |
ArMean | 104 | 53.4 | 5 |
ArMean | 106 | 45 | 5 |
ArMean | 108 | 74 | 5 |
ArMean | 109 | 68 | 5 |
ArMean | 116 | 61 | 5 |
ArMean | 118 | 60.8 | 5 |
ArMean | 133 | 58.6 | 5 |
ArMean | 137 | 61 | 5 |
ArMean | 138 | 68.6 | 5 |
ArMean | 145 | 67 | 5 |
ArMean | 168 | 35 | 5 |
ArMean | 176 | 46 | 5 |
ArMean | 186 | 57.6 | 5 |
ArMean | 20 | 70 | 5 |
ArMean | 203 | 41 | 5 |
ArMean | 21 | 60 | 5 |
ArMean | 215 | 45 | 5 |
ArMean | 24 | 22.2 | 5 |
ArMean | 26 | 63 | 5 |
ArMean | 28 | 70.8 | 5 |
ArMean | 38 | 33.4 | 5 |
ArMean | 79 | 31 | 5 |
This seems to work, at least with the data structured as they have it. Now what if we want multiple aggregation methods?
Code
method | n | percent |
---|---|---|
ArMean | 25 | 0.333 |
IntWAgg | 25 | 0.333 |
ShiftWAgg | 25 | 0.333 |
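The code behind that fold isn’t reproduced here. As a minimal sketch, a table like this could be produced by mapping a few of aggreCAT’s wrapper functions (with their default `type` settings) over the built-in `data_ratings` and counting the resulting `method` labels:

library(dplyr)
library(purrr)
library(aggreCAT)

# run three aggregators over the built-in ratings data and tabulate
# how many claim-level aggregates each method produces
list(AverageWAgg, IntervalWAgg, ShiftingWAgg) %>%
  map_dfr(~ .x(expert_judgements = data_ratings)) %>%
  count(method) %>%
  mutate(percent = n / sum(n))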
Now let’s see what can be done with our ACX dataset. As ACX doesn’t ask for CIs (and doesn’t have multiple rounds of expert revisions), aggreCAT may be of limited use here.^[I’d have to estimate these bounds arbitrarily (e.g., make them larger for people with less experience).] Only a subset of their measures may be informative. First I’ll check which ones might work, using their own data for all the aggregations (with baseline settings for each), but keeping only the ‘best prediction’ element for each.
Code
all_aggsOK <- list(AverageWAgg, IntervalWAgg, ShiftingWAgg, ExtremisationWAgg)
all_aggs <- list(LinearWAgg, AverageWAgg, BayesianWAgg, IntervalWAgg, ShiftingWAgg,
                 ReasoningWAgg, DistributionWAgg, ExtremisationWAgg)
three_aggs <- list(LinearWAgg, AverageWAgg, BayesianWAgg)

data_ratings %>%
  filter(element == "three_point_best") %>%
  purrr::map_dfr(.x = all_aggsOK,
                 .f = ~ .x(data_ratings)) %>% # NB: the lambda uses the full `data_ratings`, so the filter above never reaches the aggregators
  tabsums(cs, method)
method | nonmissing | mean | sd |
---|---|---|---|
ArMean | 25 | 51.7 | 17.3 |
BetaArMean | 25 | 1 | 0 |
IntWAgg | 25 | 48.6 | 23.4 |
ShiftWAgg | 25 | 50.2 | 17.1 |
All of the aggregators seemed computable even without elements other than ‘three_point_best’. What’s going on here? E.g., IntervalWAgg says ‘the weights are dependent on the lower and upper bounds of three-point elicitation’… but I left those out. ShiftingWAgg is meant to depend on discussion, but here I have no indicators of that. Yet these all seem to come up with different means. (A likely explanation: the lambda `~ .x(data_ratings)` above hands the full, unfiltered `data_ratings` to each aggregator, so the `three_point_best` filter never actually reaches them.) BetaArMean, which is meant to do some extremisation, works rather strangely, shifting everything to 1; perhaps its beta transform expects probabilities on [0, 1] while these best estimates are on a 0–100 scale.
Focusing in on some key interesting methods… I’ve been told geometric means are good in these sorts of situations.
Code
data_ratings %>%
  filter(element == "three_point_best") %>%
  AverageWAgg(expert_judgements = ., type = "GeoMean")
method | paper_id | cs | n_experts |
---|---|---|---|
GeoMean | 100 | 67.3 | 25 |
GeoMean | 102 | 26.4 | 25 |
GeoMean | 103 | 60.2 | 25 |
GeoMean | 104 | 40.4 | 25 |
GeoMean | 106 | 28.8 | 25 |
GeoMean | 108 | 71.2 | 25 |
GeoMean | 109 | 71.3 | 25 |
GeoMean | 116 | 60 | 25 |
GeoMean | 118 | 52 | 25 |
GeoMean | 133 | 56.2 | 25 |
GeoMean | 137 | 60.5 | 25 |
GeoMean | 138 | 74.8 | 25 |
GeoMean | 145 | 61.3 | 25 |
GeoMean | 168 | 23.7 | 25 |
GeoMean | 176 | 21.7 | 25 |
GeoMean | 186 | 48.3 | 25 |
GeoMean | 20 | 67.9 | 25 |
GeoMean | 203 | 42.7 | 25 |
GeoMean | 21 | 52.1 | 25 |
GeoMean | 215 | 35.8 | 25 |
GeoMean | 24 | 15.7 | 25 |
GeoMean | 26 | 62.5 | 25 |
GeoMean | 28 | 61.2 | 25 |
GeoMean | 38 | 23.7 | 25 |
GeoMean | 79 | 23.4 | 25 |
LinearWAgg seems interesting, either supplying some sort of weights by Participant (e.g., downweighting those who haven’t done forecasting), or with the supplied GranWAgg option, “granularity of best estimates”.
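Here is a minimal sketch of the Participant-weights route on the built-in data. It assumes, as in my usage further below, that `weights` takes a two-column tibble of `user_name` and `weight`; the `toy_weights` here are made up purely for illustration.

# hypothetical weights: double the weight of three arbitrary raters, weight 1 otherwise
toy_weights <- data_ratings %>%
  distinct(user_name) %>%
  mutate(weight = ifelse(row_number() <= 3, 2, 1))

data_ratings %>%
  filter(element == "three_point_best") %>%
  LinearWAgg(expert_judgements = .,
             type = "Participant",
             weights = toy_weights)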
Let’s move on to trying these on the ACX data. First I adjust the data to make it more compatible with the aggreCAT package.
Upweighting and downweighting people – ad-hoc.
Now, I have little time left, so I’ll do something lame and ad-hoc: everyone starts out with weight 1, then
- Anyone with no forecasting experience is downweighted fivefold (×0.2)
- Superforecasters are upweighted ×20
- SAT math score uprated if above average (unless already a superforecaster)
- PhD holders upweighted ×1.5
- Downweight ×0.1 if you put a 99+ or below-5% chance of Putin being in power
- Downweight by quantile of the share of ‘exactly 50%’ predictions (up to about 83% downweighting, given the 1/1.2 factor below)
Also, replace all ‘50%’ predictions with the question’s arithmetic average (done later).
Code
# get column names starting with "@" (the prediction questions)
pred_cols <- colnames(acx)[startsWith(colnames(acx), "@")]

acx_for_ag_wts <- acx
# subset data based on column names and calculate row sums: share of a respondent's predictions that are exactly 50 (there are 50 questions)
acx_for_ag_wts$share50 <- rowSums(acx[pred_cols] == 50, na.rm = TRUE) / 50

# build the ad-hoc weights described above
acx_for_ag_wts %<>%
  mutate(
    user_name = row_number()
  ) %>%
  mutate(
    SATscoremath = case_when(
      (SATscoremath > 800 | SATscoremath < 100) ~ NA_real_, # drop implausible SAT math scores
      TRUE ~ SATscoremath),
    SAT_math_rank = percent_rank(SATscoremath),
    SAT_math_rank = case_when(
      is.na(SAT_math_rank) ~ 0.5, # impute the middle rank where the score is missing
      TRUE ~ SAT_math_rank),
    share50_qtl = percent_rank(share50)
  ) %>%
  mutate(across(c(ForecastingExperience, Superforecaster, Degree), ~ replace(., is.na(.), "No"))) %>%
  mutate(
    weight =
      (ifelse(ForecastingExperience == "No", 0.2, 1)) *
      (ifelse(Superforecaster == "Yes", 20, 1)) *
      (1 + ((SAT_math_rank > 0.5) * (Superforecaster != "Yes") * (SAT_math_rank - 0.5)) * 4) *
      (1 + (Degree == "Ph D.") * 0.5) *
      (ifelse(`@1.WillVladimirPutinbePresidentofRussia` >= 99, 0.1, 1)) *
      (ifelse(`@1.WillVladimirPutinbePresidentofRussia` < 5, 0.1, 1)) *
      (1 - (share50_qtl) / 1.2)
  )
# the code above is kind of crappy; better to convert it to logs to allow simple addition?

acx_for_ag_wts_only <- acx_for_ag_wts %>%
  select(user_name, weight)
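A minimal sketch of the ‘convert it to logs’ idea from the comment above (a hypothetical refactor showing only a few of the factors; the full version would add one log term per adjustment):

acx_for_ag_wts %>%
  mutate(
    # each adjustment becomes an additive log-multiplier
    log_wt = log(ifelse(ForecastingExperience == "No", 0.2, 1)) +
      log(ifelse(Superforecaster == "Yes", 20, 1)) +
      log(1 + (Degree == "Ph D.") * 0.5),
    weight_alt = exp(log_wt)  # exponentiating recovers the multiplicative weight
  )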
Code
acx_for_ag_wts_long <- acx_for_ag_wts %>%
  select(where(~ n_distinct(.) > 1)) %>% # delete columns that are identical everywhere
  pivot_longer( # each 'question' needs its own row for each 'user'
    contains("@"),
    names_to = "paper_id",
    values_to = "value"
  )

acx_for_ag_wts_long %<>%
  select(user_name, paper_id, value, everything()) %>% # make all predictions (for an individual) into a row
  mutate(element = "three_point_best",
         round = "round_2") %>% # aggreCAT needs these columns
  group_by(paper_id) %>%
  mutate(
    value = case_when(
      is.na(value) ~ mean(value, na.rm = TRUE), # Sadly the package can't deal with NAs, so I cheaply set them to the arithmetic average
      value == 50 ~ mean(value, na.rm = TRUE),  # replace 'exactly 50' predictions with the question average (predictions are on a 0-100 scale)
      TRUE ~ value
    )
  ) %>%
  ungroup()
Let’s see if it works
Code
method | paper_id | cs | n_experts |
---|---|---|---|
GeoMean | @1.WillVladimirPutinbePresidentofRussia | 99 | 154 |
Code
method | paper_id | cs | n_experts |
---|---|---|---|
GeoMean | @1.WillVladimirPutinbePresidentofRussia | Inf | 155 |
Strangely, this only seems to work here with fewer than 155 rows!
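A likely explanation, assuming the geometric mean is being computed as `prod(x)^(1/n)`: the running product of ~155 predictions near 99 exceeds the largest representable double (about 1.8e308) and overflows to `Inf`, whereas the same quantity computed in log space is fine. A minimal demonstration:

prod(rep(99, 155))^(1 / 155)   # Inf: the product itself overflows
exp(mean(log(rep(99, 155))))   # 99: identical quantity computed in log space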
Now to do some funny calculations and blend them with my own predictions
acx_for_ag_wts_results: the weighted arithmetic means
Code
acx_for_ag_wts_results <- acx_for_ag_wts_long %>%
  LinearWAgg(expert_judgements = .,
             type = "Participant",
             weights = acx_for_ag_wts_only) %>%
  mutate(agg = "linear_wt",
         value = cs)
Now (weighted?) geometric means
Code
# weighted geometric mean, first written 'directly'...
weighted.geomean <- function(x, w, ...) {
  return(prod(x^w, ...)^(1 / sum(w, na.rm = TRUE)))
}
# ...then overwritten with a log-space version, which avoids overflowing the product
weighted.geomean <- function(x, w, ...) exp(weighted.mean(log(x), w, ...))

# total weight of the upweighted respondents (not used below; the per-question sum is recomputed later)
sumwt <- sum(acx_for_ag_wts_long$weight[acx_for_ag_wts_long$weight > 1], na.rm = TRUE)

# plain (unweighted) geometric mean; also not used below
reg.geomean <- function(x, ...) {
  return(prod(x, ...)^(1 / length(x)))
}

acx_for_ag_geom_wt <- acx_for_ag_wts_long %>%
  filter(!is.na(weight) & !is.na(value)) %>%
  group_by(paper_id) %>%
  mutate(
    sumwt = sum(weight, na.rm = TRUE)
  ) %>%
  summarize(
    weighted_geomean = weighted.geomean(value, weight)
  ) %>%
  mutate(agg = "geom_wt",
         value = weighted_geomean)
OK, it looks like I forgot about the prediction market data on Metaculus HERE on the same questions.
I’ll make a tibble of the weighted linear mean, weighted geometric mean, Metaculus data, and my own predictions.
Code
# p_load(DataEditR)
# metaculus_data <- acx_for_ag_wts_results %>%
#   select(paper_id) %>%
#   mutate(value = 50)
# data_edit(metaculus_data)

metaculus_data <- read_csv(here("sample_data", "metaculus.csv")) %>%
  mutate(agg = "metaculus")

# and me
reinstein_pred <- acx %>%
  filter(d_reinstein) %>%
  select(starts_with("@")) %>%
  pivot_longer(cols = everything()) %>%
  rename(paper_id = name) %>%
  mutate(agg = 'reinstein_pred')

aggregates <- bind_rows(reinstein_pred, metaculus_data, acx_for_ag_geom_wt, acx_for_ag_wts_results) %>%
  select(-method, -cs, -n_experts, -weighted_geomean) %>%
  pivot_wider(names_from = agg, id_cols = paper_id, values_from = value) %>%
  mutate(guided_weight = 0.2 * reinstein_pred + 0.4 * metaculus + 0.2 * linear_wt + 0.2 * geom_wt,
         final_impulsive_choice = guided_weight)

write_csv(aggregates, here("sample_data", "aggregates.csv"))
# data_edit(aggregates)
Finally, I fixed some things at the end, cleaned it up, and did some ad-hoc extremizing around a few correlated events.
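For reference, one common recipe for that kind of extremizing (illustrative only, not necessarily the adjustment I actually made) is to raise the odds to a power k > 1:

# illustrative extremizer: push a probability (on the 0-100 scale used here) away from 50 by powering its odds
extremize <- function(p, k = 1.5) {
  odds <- (p / 100) / (1 - p / 100)
  100 * odds^k / (1 + odds^k)
}
extremize(70)  # about 78 with k = 1.5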