We begin by cleaning the data, removing responses where participants chose not to include a generalization. A power analysis based on pilot data suggested that we would need 90 participants to detect differences with 80% power. Following our pre-registered analysis plan, we iteratively collected data and excluded datasets based on poor performance. Overall, we recruited 97 participants. Per our pre-registered exclusion criteria, one participant was excluded for reporting the same confidence on all trials, and six were excluded because the majority of their generalizations were not about the data they saw, leaving 90 participants.
We subset the data from 1941 to 1743 generalizations, removing 198 that correspond to trials where a participant did not provide a generalization for the presented stimuli or where the generalization misinterpreted the presented stimuli.
d <- read.csv("./data/[E1]N=1000-Full-Cleaned.tsv", sep="\t")
d$confidence <- suppressWarnings(as.numeric(as.character(paste(d$confidence))))
d$initSliderValue <- suppressWarnings(as.numeric(as.character(paste(d$initSliderValue))))
sapply(d, class)
df <- subset(d, d$correct!="NA")
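The filter above compares the `correct` column against the string "NA". As a minimal sketch on invented data (the toy data frame below is hypothetical, not the study data), this only drops missing rows because the comparison itself evaluates to `NA` and `subset()` discards rows whose condition is `NA`; `!is.na()` states the same intent directly.

```r
# Toy stand-in for the coded data; the values are invented for illustration.
d_toy <- data.frame(correct = c(TRUE, FALSE, NA, TRUE))

# Comparing against the string "NA": the missing entry yields NA, not FALSE.
comparison <- d_toy$correct != "NA"

# subset() treats an NA condition as "do not keep", so the row is dropped.
kept_string_compare <- subset(d_toy, d_toy$correct != "NA")

# The explicit form keeps the same rows and does not rely on that side effect.
kept_is_na <- subset(d_toy, !is.na(correct))

nrow(kept_string_compare)
nrow(kept_is_na)
```

Both filters keep the same rows here; the explicit form is easier to read and does not depend on how the column was parsed.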
We look at overall summaries of the aggregation strategy. In addition, we coded each generalization into a class as specified by our pre-registration, and summarize the results.
There are fewer generalizations in the mean aggregation condition (528, versus 607 and 608 in the two disaggregated conditions). The most common generalizations fell into the mean and shape classes, followed by correlation and rank. There were extremely few variance generalizations, likely because of the strict coding for this class: participants had to explicitly mention the variance of data in a view. On average per participant we coded 3.14 ± 1.76 correlation, 6.92 ± 4.24 mean, 2.44 ± 2.13 rank, 6.81 ± 5.12 shape, and 0.04 ± 0 variance generalizations.
To see this better, we plot the distribution of generalizations across generalization classes. Participants made the most shape generalizations in the disaggregation condition, the most rank generalizations in the mean condition, and the most mean generalizations in the disaggregation with mean condition.
summary(df$aggStrat)
## disagg disagg+mean mean
## 607 608 528
summary(df$insightClass)
## correlation mean misinterpretation rank
## 283 623 0 220
## shape variance
## 613 4
suppressWarnings(ddply(df, c("aggStrat"), summarise,
n=nrow(df),
k=sum(df$aggStrat == aggStrat),
pbar = k/n,
se = sqrt(pbar*(1 - pbar)/n)))
## aggStrat n k pbar se
## 1 disagg 1743 607 0.3482501 0.01141136
## 2 disagg+mean 1743 608 0.3488239 0.01141573
## 3 mean 1743 528 0.3029260 0.01100675
suppressWarnings(ddply(df, c("insightClass", "workerId"), summarise,
n=nrow(df),
k=sum(df$insightClass == insightClass & df$workerId == workerId))) %>%
ddply(c("insightClass"), summarise,
generalizationClassTotal=sum(k),
percentTotal=sum(k)/nrow(df),
avgNumberPerParticipant=sum(k) / 90,
sd=sd(k))
## insightClass generalizationClassTotal percentTotal avgNumberPerParticipant
## 1 correlation 283 0.162363741 3.14444444
## 2 mean 623 0.357429719 6.92222222
## 3 rank 220 0.126219162 2.44444444
## 4 shape 613 0.351692484 6.81111111
## 5 variance 4 0.002294894 0.04444444
## sd
## 1 1.755742
## 2 4.235293
## 3 2.130628
## 4 5.122532
## 5 0.000000
suppressWarnings(ddply(df, c("aggStrat", "insightClass"), summarise,
n=nrow(df[df$aggStrat == aggStrat,]),
k=sum(df$aggStrat == aggStrat & df$insightClass == insightClass),
pbar = k/n,
se = sqrt(pbar*(1 - pbar)/n),
min=pbar-1.96*se,
max=pbar+1.96*se))
## aggStrat insightClass n k pbar se min
## 1 disagg correlation 607 80 0.131795717 0.013729897 0.104885119
## 2 disagg mean 607 152 0.250411862 0.017585084 0.215945096
## 3 disagg rank 607 57 0.093904448 0.011839565 0.070698901
## 4 disagg shape 607 317 0.522240527 0.020274287 0.482502924
## 5 disagg variance 607 1 0.001647446 0.001646089 -0.001578888
## 6 disagg+mean correlation 608 96 0.157894737 0.014788197 0.128909871
## 7 disagg+mean mean 608 239 0.393092105 0.019808736 0.354266983
## 8 disagg+mean rank 608 69 0.113486842 0.012863631 0.088274126
## 9 disagg+mean shape 608 202 0.332236842 0.019102198 0.294796535
## 10 disagg+mean variance 608 2 0.003289474 0.002322180 -0.001262000
## 11 mean correlation 528 107 0.202651515 0.017493715 0.168363833
## 12 mean mean 528 232 0.439393939 0.021599265 0.397059381
## 13 mean rank 528 94 0.178030303 0.016647841 0.145400536
## 14 mean shape 528 94 0.178030303 0.016647841 0.145400536
## 15 mean variance 528 1 0.001893939 0.001892145 -0.001814665
## max
## 1 0.158706314
## 2 0.284878627
## 3 0.117109995
## 4 0.561978130
## 5 0.004873781
## 6 0.186879603
## 7 0.431917228
## 8 0.138699558
## 9 0.369677149
## 10 0.007840947
## 11 0.236939197
## 12 0.481728498
## 13 0.210660071
## 14 0.210660071
## 15 0.005602544
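The `se`, `min`, and `max` columns above follow the normal-approximation (Wald) interval for a proportion, pbar ± 1.96 * sqrt(pbar * (1 - pbar) / n). The sketch below reproduces one row (shape generalizations in the disagg condition) and shows why the variance rows have negative lower bounds. The Wilson interval from `prop.test()` is offered only as one possible alternative for rare classes; it is not part of the analysis above.

```r
# Counts copied from the table above: shape generalizations in the disagg condition.
k <- 317
n <- 607
p_hat <- k / n
se <- sqrt(p_hat * (1 - p_hat) / n)
wald <- c(lower = p_hat - 1.96 * se, upper = p_hat + 1.96 * se)

# Variance generalizations in the disagg condition: 1 out of 607.
k_rare <- 1
p_rare <- k_rare / n
se_rare <- sqrt(p_rare * (1 - p_rare) / n)
wald_rare <- c(lower = p_rare - 1.96 * se_rare, upper = p_rare + 1.96 * se_rare)

# Wilson score interval (prop.test without continuity correction) stays within [0, 1].
# suppressWarnings() hides the small-count warning that prop.test raises here.
wilson_rare <- suppressWarnings(prop.test(k_rare, n, correct = FALSE)$conf.int)

round(wald, 4)
round(wald_rare, 4)
round(as.numeric(wilson_rare), 4)
```

For very small counts the Wald lower bound falls below zero, which is not a valid proportion, so the variance-class intervals in the table are best read as "indistinguishable from zero".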
We calculate summary statistics for accuracy under each aggregation strategy to get a better sense of our data. Although the mean aggregation condition produced fewer generalizations, there is no significant difference in accuracy between aggregation strategies.
df %>%
ddply(~aggStrat, summarise,
Correct=sum(correct==TRUE),
Incorrect=sum(correct==FALSE),
Accuracy=Correct/(Incorrect+Correct),
Total=Incorrect+Correct)
## aggStrat Correct Incorrect Accuracy Total
## 1 disagg 399 208 0.6573311 607
## 2 disagg+mean 405 203 0.6661184 608
## 3 mean 354 174 0.6704545 528
# Accounting for differences in workers
df_agg_accuracy <- df %>%
ddply(.(aggStrat, workerId), summarise,
Correct=sum(correct==TRUE),
Incorrect=sum(correct==FALSE),
PercCorrect=Correct/(Incorrect+Correct),
Total=Incorrect+Correct) %>%
ddply(~aggStrat, summarise,
N = sum((Total)),
meanAcc = mean(PercCorrect),
sd = sd(PercCorrect),
se = sd / sqrt(N))
df_agg_accuracy
## aggStrat N meanAcc sd se
## 1 disagg 607 0.6563772 0.2049842 0.008320052
## 2 disagg+mean 608 0.6591138 0.2238778 0.009079443
## 3 mean 528 0.7116326 0.2779547 0.012096425
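The worker-level summary above averages one accuracy value per participant but divides the standard deviation by the square root of `N`, the total number of trials. A common alternative, sketched below on invented data, divides by the square root of the number of participants being averaged, which gives a larger standard error. The toy data frame and the helper `summarise_condition` are hypothetical; only the column names mirror the ones used above.

```r
# Toy per-trial data; the values are invented for illustration.
toy <- data.frame(
  workerId = rep(c("w1", "w2", "w3", "w4"), each = 4),
  aggStrat = rep(c("disagg", "disagg", "mean", "mean"), each = 4),
  correct  = c(TRUE, TRUE, FALSE, TRUE,
               TRUE, FALSE, FALSE, TRUE,
               TRUE, TRUE, TRUE, TRUE,
               FALSE, TRUE, TRUE, FALSE)
)

# Step 1: accuracy per participant within each condition.
per_worker <- aggregate(correct ~ aggStrat + workerId, data = toy, FUN = mean)
names(per_worker)[names(per_worker) == "correct"] <- "accuracy"

# Step 2: summarise across participants; n counts participants, not trials.
summarise_condition <- function(acc) {
  n <- length(acc)
  c(nWorkers = n, meanAcc = mean(acc), sd = sd(acc), se = sd(acc) / sqrt(n))
}
per_condition <- t(sapply(split(per_worker$accuracy, per_worker$aggStrat),
                          summarise_condition))
per_condition
```

Which divisor is appropriate depends on what the error bar is meant to convey; the participant count matches the between-participant averaging done in the second step.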
Next, we look at the accuracy of each aggregation strategy, faceted by generalization class. We also look at the breakdown in percentages per aggregation condition (PercTotalofAggStrat). We plot the results as a visual aid.
## aggStrat insightClass Correct Incorrect Total Accuracy
## 1 disagg correlation 56 24 80 0.7000000
## 2 disagg mean 99 53 152 0.6513158
## 3 disagg rank 20 37 57 0.3508772
## 4 disagg shape 223 94 317 0.7034700
## 5 disagg variance 1 0 1 1.0000000
## 6 disagg+mean correlation 76 20 96 0.7916667
## 7 disagg+mean mean 173 66 239 0.7238494
## 8 disagg+mean rank 21 48 69 0.3043478
## 9 disagg+mean shape 135 67 202 0.6683168
## 10 disagg+mean variance 0 2 2 0.0000000
## 11 mean correlation 92 15 107 0.8598131
## 12 mean mean 154 78 232 0.6637931
## 13 mean rank 56 38 94 0.5957447
## 14 mean shape 51 43 94 0.5425532
## 15 mean variance 1 0 1 1.0000000
## PercTotalofAggStrat
## 1 0.131795717
## 2 0.250411862
## 3 0.093904448
## 4 0.522240527
## 5 0.001647446
## 6 0.157894737
## 7 0.393092105
## 8 0.113486842
## 9 0.332236842
## 10 0.003289474
## 11 0.202651515
## 12 0.439393939
## 13 0.178030303
## 14 0.178030303
## 15 0.001893939
We analyze accuracy with respect to data type combination (univariate, 1 quantitative x 1 nominal, 2 quantitative), first overall and then split by aggregation strategy.
## dataTypeCombination Correct Incorrect Accuracy Total
## 1 nominalBivariate 293 358 0.4500768 651
## 2 quantBivariate 461 149 0.7557377 610
## 3 univariate 404 78 0.8381743 482
## dataTypeCombination aggStrat Correct Incorrect Accuracy Total
## 1 nominalBivariate disagg 80 141 0.3619910 221
## 2 nominalBivariate disagg+mean 86 137 0.3856502 223
## 3 nominalBivariate mean 127 80 0.6135266 207
## 4 quantBivariate disagg 171 40 0.8104265 211
## 5 quantBivariate disagg+mean 158 44 0.7821782 202
## 6 quantBivariate mean 132 65 0.6700508 197
## 7 univariate disagg 148 27 0.8457143 175
## 8 univariate disagg+mean 161 22 0.8797814 183
## 9 univariate mean 95 29 0.7661290 124
## dataTypeCombination N meanAcc sd se
## 1 nominalBivariate 651 0.4517140 0.2567745 0.010063784
## 2 quantBivariate 610 0.7966920 0.2291487 0.009277960
## 3 univariate 482 0.8289268 0.1961107 0.008932596
## dataTypeCombination aggStrat N meanAcc sd se
## 1 nominalBivariate disagg 221 0.3225760 0.3456768 0.02325274
## 2 nominalBivariate disagg+mean 223 0.3554995 0.3736702 0.02502281
## 3 nominalBivariate mean 207 0.5817370 0.4627837 0.03216569
## 4 quantBivariate disagg 211 0.8509070 0.3012627 0.02073978
## 5 quantBivariate disagg+mean 202 0.8064374 0.3252616 0.02288533
## 6 quantBivariate mean 197 0.7357242 0.3919885 0.02792802
## 7 univariate disagg 175 0.8460034 0.2832715 0.02141332
## 8 univariate disagg+mean 183 0.8437500 0.3130358 0.02314027
## 9 univariate mean 124 0.7785088 0.3583792 0.03218340
We’ll run some Bayesian regressions. First let’s set up the data for the modeling.
We run a hierarchical logistic regression model to evaluate the impact of aggregation strategy on accuracy, as per our pre-registration. We report the posterior mean estimates for the effects of both aggregation strategies and trial, and the standard deviations of the varying intercepts for participant ID and view ID. We find no evidence of an effect of aggregation strategy: the 95% intervals for both aggregation coefficients include 0.
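As a reading aid, the parameter names in the model summary (`a`, `bnoagg`, `bmean`, `btrial`, `a_worker`, `a_spec`, `sigma_worker`, `sigma_spec`) suggest a specification along the following lines. This is a sketch inferred from those names, not the fitted model code: the indicator variables, the zero-centered normal form of the varying intercepts, and the omitted priors on the fixed effects and scale parameters are all assumptions.

```latex
\begin{align*}
\text{correct}_i &\sim \operatorname{Bernoulli}(p_i) \\
\operatorname{logit}(p_i) &= a + a^{\text{worker}}_{w[i]} + a^{\text{spec}}_{s[i]}
  + b_{\text{noagg}}\, x^{\text{noagg}}_i
  + b_{\text{mean}}\, x^{\text{mean}}_i
  + b_{\text{trial}}\, \text{trial}_i \\
a^{\text{worker}}_{w} &\sim \operatorname{Normal}(0, \sigma_{\text{worker}}), \quad w = 1, \dots, 90 \\
a^{\text{spec}}_{s} &\sim \operatorname{Normal}(0, \sigma_{\text{spec}}), \quad s = 1, \dots, 15
\end{align*}
```

The 90 participant intercepts and 15 view intercepts together account for the 105 vector parameters that the summary output omits by default.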
Let’s plot results
## 105 vector or matrix parameters omitted in display. Use depth=2 to show them.
## Mean StdDev lower 0.95 upper 0.95 n_eff Rhat
## a 0.84 0.33 0.18 1.47 1111 1
## bnoagg -0.17 0.15 -0.46 0.11 6668 1
## bmean -0.12 0.15 -0.40 0.17 7553 1
## btrial 0.03 0.01 0.00 0.06 7535 1
## sigma_worker 0.75 0.10 0.57 0.95 2120 1
## sigma_spec 1.07 0.22 0.70 1.51 4401 1
## 105 vector or matrix parameters omitted in display. Use depth=2 to show them.
## a_worker[1] a_worker[2] a_worker[3] a_worker[4] a_worker[5] a_worker[6]
## 0.5254999 0.7769337 1.6731374 0.5668260 0.4996277 0.2949677
## a_worker[7] a_worker[8] a_worker[9] a_worker[10] a_worker[11] a_worker[12]
## 1.4273466 0.6718786 0.8964898 2.4798704 0.7435681 1.3452745
## a_worker[13] a_worker[14] a_worker[15] a_worker[16] a_worker[17] a_worker[18]
## 1.3181030 0.4738298 1.3181567 1.8415908 1.2444036 0.6405744
## a_worker[19] a_worker[20] a_worker[21] a_worker[22] a_worker[23] a_worker[24]
## 1.5664999 0.2694058 1.4292765 0.6687203 5.6909185 0.9203934
## a_worker[25] a_worker[26] a_worker[27] a_worker[28] a_worker[29] a_worker[30]
## 2.0560940 0.6487823 0.6625672 1.9433761 0.6666385 1.1978313
## a_worker[31] a_worker[32] a_worker[33] a_worker[34] a_worker[35] a_worker[36]
## 2.1247549 1.1695404 1.3004574 1.1687971 1.4174942 1.1834207
## a_worker[37] a_worker[38] a_worker[39] a_worker[40] a_worker[41] a_worker[42]
## 0.9357123 1.9023398 2.8517602 0.8298821 1.2389077 0.3091390
## a_worker[43] a_worker[44] a_worker[45] a_worker[46] a_worker[47] a_worker[48]
## 0.5282128 1.1590770 0.9560430 0.7209489 0.7350940 1.9702879
## a_worker[49] a_worker[50] a_worker[51] a_worker[52] a_worker[53] a_worker[54]
## 1.7839562 1.1848855 1.1273089 1.7748883 1.9493087 1.3430219
## a_worker[55] a_worker[56] a_worker[57] a_worker[58] a_worker[59] a_worker[60]
## 1.0684683 1.4726984 1.0029128 1.0444521 2.8043671 1.0777341
## a_worker[61] a_worker[62] a_worker[63] a_worker[64] a_worker[65] a_worker[66]
## 1.4013273 0.6941974 2.7046391 1.0277860 0.5880429 1.0809941
## a_worker[67] a_worker[68] a_worker[69] a_worker[70] a_worker[71] a_worker[72]
## 0.6892810 0.7090654 0.5057780 0.9942274 0.8862223 0.4488369
## a_worker[73] a_worker[74] a_worker[75] a_worker[76] a_worker[77] a_worker[78]
## 0.7454162 0.8917071 0.9691866 0.4101154 0.4798583 1.7754696
## a_worker[79] a_worker[80] a_worker[81] a_worker[82] a_worker[83] a_worker[84]
## 0.7139392 1.7638034 0.7092945 1.5871459 0.6551029 1.7813546
## a_worker[85] a_worker[86] a_worker[87] a_worker[88] a_worker[89] a_worker[90]
## 0.9054145 0.2446092 0.5979112 1.5323368 0.2730194 1.8444700
## a_spec[1] a_spec[2] a_spec[3] a_spec[4] a_spec[5] a_spec[6]
## 2.4588018 3.0270586 2.1946182 2.8655694 1.3354744 0.6084346
## a_spec[7] a_spec[8] a_spec[9] a_spec[10] a_spec[11] a_spec[12]
## 0.3891960 0.1099008 0.6119618 0.1988181 0.9471993 1.6829296
## a_spec[13] a_spec[14] a_spec[15] a bnoagg bmean
## 1.4535131 1.0741174 1.9884529 2.3090839 0.8398876 0.8881068
## btrial sigma_worker sigma_spec
## 1.0315892 2.1182386 2.9074778
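The unlabeled values printed above appear to be the posterior means after exponentiation, since exp(0.84) is roughly 2.3 and exp(-0.17) is roughly 0.84, in line with the entries for `a` and `bnoagg`. The sketch below shows how log-odds coefficients from a logistic model can be read as odds ratios and probabilities. The inputs are the rounded means from the summary table, so results differ slightly from the printed values, and the interpretation of `a` as a reference-condition intercept is an assumption.

```r
# Posterior means on the log-odds scale, copied from the rounded summary table.
log_odds <- c(a = 0.84, bnoagg = -0.17, bmean = -0.12, btrial = 0.03)

# exp() turns a log-odds coefficient into an odds ratio.
odds_ratios <- exp(log_odds)

# plogis() is the inverse logit; it maps log-odds to a probability.
# Assumption: a is the log-odds of a correct response in the reference condition
# at trial 0 for an average participant and view.
p_reference <- plogis(log_odds[["a"]])
p_with_bnoagg <- plogis(log_odds[["a"]] + log_odds[["bnoagg"]])

round(odds_ratios, 2)
round(c(reference = p_reference, with_bnoagg = p_with_bnoagg), 2)
```

An odds ratio below 1 corresponds to a negative coefficient, and an interval on the log-odds scale that includes 0 corresponds to an odds-ratio interval that includes 1.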
It is possible that there is a learning effect, or that our choice of mark type led participants to focus on one stimulus over another. We therefore analyze accuracy by aggregation strategy and generalization class for the first trial of each participant to better understand the extent of this effect. There appears to be a small difference between the disaggregated and aggregated conditions, although the results are not reliably different.
df[df$trial==1,] %>%
ddply(c("aggStrat", "insightClass"), summarise,
Correct=sum(correct==TRUE),
Incorrect=sum(correct==FALSE),
Accuracy=Correct/(Incorrect+Correct),
Total=Incorrect+Correct)
## aggStrat insightClass Correct Incorrect Accuracy Total
## 1 disagg correlation 1 0 1.0000000 1
## 2 disagg mean 2 1 0.6666667 3
## 3 disagg rank 0 3 0.0000000 3
## 4 disagg shape 31 7 0.8157895 38
## 5 disagg+mean correlation 2 0 1.0000000 2
## 6 disagg+mean mean 13 7 0.6500000 20
## 7 disagg+mean rank 0 1 0.0000000 1
## 8 disagg+mean shape 15 3 0.8333333 18
## 9 mean correlation 3 4 0.4285714 7
## 10 mean mean 11 6 0.6470588 17
## 11 mean shape 1 0 1.0000000 1
# Gives us 80 observations, since 10 participants didn't make observations on the first trial
df_firstTrial_facetWorker_stats <- df[df$trial==1,] %>%
ddply(.(aggStrat, workerId), summarise,
Correct=sum(correct==TRUE),
Incorrect=sum(correct==FALSE),
PercCorrect=Correct/(Incorrect+Correct),
Total=Incorrect+Correct) %>%
ddply(~aggStrat, summarise,
N = sum((Total)),
meanAcc = mean(PercCorrect),
sd = sd(PercCorrect),
se = sd / sqrt(N))
df_firstTrial_facetWorker_stats
## aggStrat N meanAcc sd se
## 1 disagg 45 0.7380952 0.4112076 0.06129921
## 2 disagg+mean 41 0.6964286 0.4582431 0.07156555
## 3 mean 25 0.6041667 0.4885464 0.09770927
We summarize the confidence reported by participants. We first compute general summary statistics for confidence, taking into account individual differences between participants. Between aggregation strategies, there are small differences in overall mean confidence (disaggregation: 67%, disaggregation+mean: 69%, mean: 72%).
df %>%
ddply(.(aggStrat, workerId), summarise, ## to get confidence per participant
N = sum(correct==TRUE, na.rm=TRUE) + sum(correct==FALSE, na.rm=TRUE),
correct = sum(correct==TRUE),
meanConf = mean(confidence, na.rm=TRUE),
sd = sd(confidence),
se = sd / sqrt(N)) %>%
ddply(~aggStrat, summarise, ## then to average the average confidence per participant
Total = sum(N),
correct = sum(correct, na.rm=TRUE),
meanTotalConf = mean(meanConf, na.rm=TRUE),
sdTotal = sd(meanConf),
seTotal = sdTotal / sqrt(Total))
## aggStrat Total correct meanTotalConf sdTotal seTotal
## 1 disagg 607 399 67.27951 18.54913 0.7528859
## 2 disagg+mean 608 405 68.63591 20.08251 0.8144535
## 3 mean 528 354 71.76873 18.00758 0.7836794
## Warning: Removed 2 rows containing missing values (geom_errorbar).
We examine how aggregation strategy impacts confidence by again running a Bayesian hierarchical model. We find an effect of the mean aggregation condition on confidence relative to the disaggregated conditions: the 95% interval for bnoagg (-5.88 to -0.87) excludes 0, although the interval for bmean (-4.55 to 0.59) does not.
## Mean StdDev lower 0.95 upper 0.95 n_eff Rhat
## bnoagg -3.543878 1.285270 -5.878993 -0.8698803 2963.221 0.9999075
## bmean -1.928507 1.305901 -4.550995 0.5861019 2741.627 0.9997666
## 105 vector or matrix parameters omitted in display. Use depth=2 to show them.
One potential confound for the effect that we observe in confidence is the initial confidence slider value each participant saw when asked to record their confidence. For each trial, we randomized the initial slider position, since a default value could influence the confidence a participant gave. However, random sampling could by chance over-represent a range of initial values in a given aggregation condition, which could confound the confidence effect. We investigate this by plotting the distribution of initial slider positions, using a bin size of 4 (25 bins) to view the data at a fairly high resolution. The histograms for each aggregation strategy show that, although there is some variation, the initial slider values are distributed similarly across conditions, with means near 50 in all three.
df %>%
ddply(.(aggStrat), summarise,
total = sum(correct==TRUE) + sum(correct==FALSE),
meanInitialSliderValue = mean(initSliderValue, na.rm=TRUE),
sd = sd(initSliderValue),
se = sd / sqrt(total))
## aggStrat total meanInitialSliderValue sd se
## 1 disagg 607 50.36079 29.32739 1.190362
## 2 disagg+mean 608 50.33717 29.71545 1.205121
## 3 mean 528 50.10227 29.41093 1.279946
We perform an exploratory analysis to investigate the relationship between accuracy, confidence, and aggregation strategy. We plot the average confidence of each worker per aggregation strategy to get a better understanding of the distribution of confidence. Plotting all trials and their confidence, we see that generalizations marked as correct consistently tend to have higher confidence.
Per our preregistration, we report how overall accuracy changes with respect to confidence. Accuracy stays relatively consistent as we raise the confidence threshold, except when we consider only generalizations where participants reported a confidence of 100. However, it is unclear whether this is a reliable effect, as higher thresholds reduce the sample size (e.g. the subset of generalizations with a reported confidence of 0 or greater is larger than the subset with a reported confidence of 100). We investigate this difference by calculating the point-biserial correlation between accuracy and confidence.
Because participants may differ in how they use the confidence scale, we first compute the point-biserial correlation between confidence and accuracy for each participant-aggregation strategy pair. We then average these correlations across participants for each aggregation strategy. When a worker gets all of their observations correct (biserial.cor gives NaN), we treat the correlation as 0; the summary that follows drops rows with a correlation of exactly 0. The average correlations between accuracy and confidence are not reliably different across aggregation strategies.
ddply(df_worker_CorConf[df_worker_CorConf$biserial!=0,], ~aggStrat, summarize,
avgConf=mean(conf),
avgBiserial=mean(biserial),
sdConf=sd(conf),
sdCorr=sd(biserial),
seConf=sdConf/sqrt(length(aggStrat)),
seCorr=sdCorr/sqrt(length(aggStrat)))
## aggStrat avgConf avgBiserial sdConf sdCorr seConf seCorr
## 1 mean 69.74438 -0.10411885 18.50268 0.5455777 2.472524 0.07290589
## 2 disagg 66.24506 -0.03499143 18.81899 0.4774175 2.117302 0.05371367
## 3 disagg+mean 69.73120 -0.04704603 16.86296 0.5531112 1.987319 0.06518478
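The per-participant correlation step can be sketched in base R, since the point-biserial correlation is the Pearson correlation between a 0/1 variable and a continuous one. The toy data and the helper `point_biserial` below are hypothetical and stand in for the `biserial.cor` call used in the analysis. Note that the sign of a point-biserial correlation depends on which outcome is coded as 1; some implementations let the caller choose the reference level, which flips the sign.

```r
# Toy data: two workers, five trials each; the values are invented.
toy <- data.frame(
  workerId   = rep(c("w1", "w2"), each = 5),
  correct    = c(TRUE, TRUE, FALSE, TRUE, FALSE,   # w1: mixed accuracy
                 TRUE, TRUE, TRUE, TRUE, TRUE),    # w2: all correct
  confidence = c(80, 90, 40, 70, 55,
                 60, 75, 80, 65, 90)
)

point_biserial <- function(binary, continuous) {
  # With no variation in the binary variable the correlation is undefined.
  if (length(unique(binary)) < 2) return(NA_real_)
  # Here correct = TRUE is coded as 1, so a positive value means
  # higher confidence on correct trials.
  cor(as.numeric(binary), continuous)
}

r_by_worker <- sapply(split(toy, toy$workerId), function(w) {
  point_biserial(w$correct, w$confidence)
})

# Mirror the convention described above: undefined correlations become 0.
r_zero_filled <- ifelse(is.na(r_by_worker), 0, r_by_worker)
r_zero_filled
```

Whether undefined correlations are set to 0 or dropped changes the averages, so it is worth stating which convention a given table uses.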
As per our pre-registration, we analyze first-trial confidence, recorded before participants are aware they will be asked for confidence, in order to compare confidence between aggregation strategies. On the first trial, participants in the mean aggregation condition are slightly less accurate and less confident in their generalizations. We plot these results as a visual aid.
Now let’s look at the two new codes for effect magnitude estimates (EM) and quantitative predictions (QP). There are 211 EM generalizations and 991 QP generalizations. There is a greater number of EM generalizations in the aggregation by default condition. Looking at the distributions of both, we find that there are more QP generalizations for the shape class. This is likely because shape generalizations included those where the participant noted the shape of a distribution or the size of a bin (e.g. “Ages range from 18 to 68.” or “The majority of purchases are between 75 and 175 dollars.”).
df %>%
ddply(~aggStrat, summarise,
es = sum(effectSizeMagnitudeRemoveNulls==TRUE, na.rm=TRUE),
qp = sum(quantitativePrediction==TRUE, na.rm=TRUE),
es_null = sum(effectSizeMagnitude==TRUE & effectSizeMagnitudeRemoveNulls==FALSE, na.rm=TRUE),
total = sum(effectSizeMagnitude==TRUE | effectSizeMagnitude==FALSE, na.rm=TRUE),
PercES = es/total,
PercQP = qp/total,
PercES_null = es_null/total,
PercES_null_ofES = es_null/es)
## aggStrat es qp es_null total PercES PercQP PercES_null
## 1 disagg 18 374 22 607 0.02965404 0.6161450 0.03624382
## 2 disagg+mean 27 353 45 607 0.04448105 0.5815486 0.07413509
## 3 mean 22 264 77 528 0.04166667 0.5000000 0.14583333
## PercES_null_ofES
## 1 1.222222
## 2 1.666667
## 3 3.500000
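The totals quoted in the text can be recovered from this table. The sketch below assumes that the 211 EM generalizations are the sum of the `es` and `es_null` columns, which matches the arithmetic but is not stated explicitly. It also shows that `PercES_null_ofES` is a ratio of null to non-null estimates rather than a percentage, which is why it can exceed 1.

```r
# Counts copied from the table above, in the order disagg, disagg+mean, mean.
es      <- c(disagg = 18, "disagg+mean" = 27, mean = 22)
es_null <- c(disagg = 22, "disagg+mean" = 45, mean = 77)
qp      <- c(disagg = 374, "disagg+mean" = 353, mean = 264)

total_em <- sum(es) + sum(es_null)  # 67 + 144
total_qp <- sum(qp)

# Ratio of null to non-null effect magnitude estimates per condition.
ratio_null_to_es <- es_null / es

total_em
total_qp
round(ratio_null_to_es, 2)
```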
## Using aggStrat as id variables
Per our preregistration, we run models for both QP and EM to see whether aggregation condition affects either code. We see a slight effect in the effect magnitude estimates model, where the disagg condition in particular appears to result in fewer effect magnitude estimates; however, the 95% interval slightly crosses 0, so this effect is not entirely reliable. Similarly, we find no reliable difference between aggregation conditions for the quantitative prediction code.
## Mean StdDev lower 0.95 upper 0.95 n_eff Rhat
## bnoagg 0.8221094 0.1851149 0.4477505 1.1569307 3063.505 0.9998044
## bmean 0.4402947 0.1816723 0.0747635 0.7822615 3684.267 1.0000511
## a_worker[1] a_worker[2] a_worker[3] a_worker[4] a_worker[5] a_worker[6]
## 30.78829127 0.96019737 1.95213155 1.01920260 0.88433542 36.36157391
## a_worker[7] a_worker[8] a_worker[9] a_worker[10] a_worker[11] a_worker[12]
## 0.48305934 0.50551285 0.30587174 1.37984098 0.64069629 3.00512664
## a_worker[13] a_worker[14] a_worker[15] a_worker[16] a_worker[17] a_worker[18]
## 0.36025403 0.27918094 0.29617222 0.64930723 3.05985434 1.09990007
## a_worker[19] a_worker[20] a_worker[21] a_worker[22] a_worker[23] a_worker[24]
## 0.46346892 2.63761826 2.23523377 3.60846655 0.94353618 0.16654778
## a_worker[25] a_worker[26] a_worker[27] a_worker[28] a_worker[29] a_worker[30]
## 1.65855073 0.65551287 0.59022110 0.61422105 0.38599907 0.81313153
## a_worker[31] a_worker[32] a_worker[33] a_worker[34] a_worker[35] a_worker[36]
## 0.22427363 0.89675095 0.75492025 0.36229353 0.22066419 0.19255487
## a_worker[37] a_worker[38] a_worker[39] a_worker[40] a_worker[41] a_worker[42]
## 5.15856020 1.81980332 0.41265509 0.39140692 0.51509545 76.65077420
## a_worker[43] a_worker[44] a_worker[45] a_worker[46] a_worker[47] a_worker[48]
## 8.55109972 1.66947701 1.59515203 0.42143869 1.91153296 0.43313010
## a_worker[49] a_worker[50] a_worker[51] a_worker[52] a_worker[53] a_worker[54]
## 3.69067995 2.18063619 1.05267098 0.18117429 3.73991471 0.32364518
## a_worker[55] a_worker[56] a_worker[57] a_worker[58] a_worker[59] a_worker[60]
## 0.31624826 1.80140514 0.85356970 0.33055260 0.93875658 1.08006222
## a_worker[61] a_worker[62] a_worker[63] a_worker[64] a_worker[65] a_worker[66]
## 0.80989807 2.21856457 1.52191852 0.58872443 1.82593072 0.45575077
## a_worker[67] a_worker[68] a_worker[69] a_worker[70] a_worker[71] a_worker[72]
## 0.52774399 1.11527696 0.49238907 1.13818393 0.79374572 1.41150616
## a_worker[73] a_worker[74] a_worker[75] a_worker[76] a_worker[77] a_worker[78]
## 2.69248913 0.24991897 0.22176391 1.57730855 0.82619165 0.26533352
## a_worker[79] a_worker[80] a_worker[81] a_worker[82] a_worker[83] a_worker[84]
## 3.60174079 0.23944439 6.20088790 1.03860351 0.11962059 0.42049225
## a_worker[85] a_worker[86] a_worker[87] a_worker[88] a_worker[89] a_worker[90]
## 0.32021280 44.97205923 0.42180136 2.19589181 3.67855880 2.42468668
## a_spec[1] a_spec[2] a_spec[3] a_spec[4] a_spec[5] a_spec[6]
## 60.21403035 241.25743895 58.61470721 4.82421634 13.90000351 0.02951658
## a_spec[7] a_spec[8] a_spec[9] a_spec[10] a_spec[11] a_spec[12]
## 0.08802007 0.04133725 0.07159606 0.06116567 1.77998753 0.31799608
## a_spec[13] a_spec[14] a_spec[15] a bnoagg bmean
## 0.66913714 0.53785165 0.41353185 2.03622793 2.27529424 1.55316480
## btrial sigma_worker sigma_spec
## 0.98216444 4.08243150 21.21822003
## 105 vector or matrix parameters omitted in display. Use depth=2 to show them.
## Mean StdDev lower 0.95 upper 0.95 n_eff Rhat
## bnoagg -1.916758 0.2726352 -2.469291 -1.4124389 3172.284 1.0001260
## bmean -1.012913 0.2271087 -1.474882 -0.5725651 4554.330 0.9998404
## a_worker[1] a_worker[2] a_worker[3] a_worker[4] a_worker[5] a_worker[6]
## 0.10721523 0.99512706 1.93761449 0.31589697 1.96322801 0.38753877
## a_worker[7] a_worker[8] a_worker[9] a_worker[10] a_worker[11] a_worker[12]
## 4.47259449 0.90529033 3.15834227 1.10713591 0.30697576 0.33119925
## a_worker[13] a_worker[14] a_worker[15] a_worker[16] a_worker[17] a_worker[18]
## 0.90604338 0.71459864 1.75510620 3.33801419 2.58656870 0.14779381
## a_worker[19] a_worker[20] a_worker[21] a_worker[22] a_worker[23] a_worker[24]
## 3.87312664 0.19073328 2.15796172 0.19275717 1.36740082 1.02122427
## a_worker[25] a_worker[26] a_worker[27] a_worker[28] a_worker[29] a_worker[30]
## 2.14104380 0.81440122 1.02864986 3.05995626 0.26599871 0.70594167
## a_worker[31] a_worker[32] a_worker[33] a_worker[34] a_worker[35] a_worker[36]
## 2.65431465 1.15923871 0.72542791 2.23180328 4.78808983 3.30965247
## a_worker[37] a_worker[38] a_worker[39] a_worker[40] a_worker[41] a_worker[42]
## 0.30907558 0.44647343 16.93376309 2.89172385 1.70107775 0.11832887
## a_worker[43] a_worker[44] a_worker[45] a_worker[46] a_worker[47] a_worker[48]
## 0.43986727 2.60080792 0.95445962 1.32375822 0.85287681 0.19416983
## a_worker[49] a_worker[50] a_worker[51] a_worker[52] a_worker[53] a_worker[54]
## 0.21851091 0.16874253 0.21733452 15.53169471 0.22770791 1.95964554
## a_worker[55] a_worker[56] a_worker[57] a_worker[58] a_worker[59] a_worker[60]
## 3.49330827 0.70523671 0.92682892 0.82010816 1.55142042 1.42228401
## a_worker[61] a_worker[62] a_worker[63] a_worker[64] a_worker[65] a_worker[66]
## 0.58781296 0.19325708 1.39562222 13.15986971 0.19578397 1.46902741
## a_worker[67] a_worker[68] a_worker[69] a_worker[70] a_worker[71] a_worker[72]
## 1.55948454 0.93743453 1.53113519 1.19046553 0.89703989 1.80261179
## a_worker[73] a_worker[74] a_worker[75] a_worker[76] a_worker[77] a_worker[78]
## 3.60882459 2.45456538 2.02406362 0.18030708 1.06091358 2.44796396
## a_worker[79] a_worker[80] a_worker[81] a_worker[82] a_worker[83] a_worker[84]
## 0.71513935 6.38835220 0.45183420 1.07027765 2.54363855 0.65632782
## a_worker[85] a_worker[86] a_worker[87] a_worker[88] a_worker[89] a_worker[90]
## 1.59095977 0.21251424 5.77584845 0.67267812 0.15218842 0.38723519
## a_spec[1] a_spec[2] a_spec[3] a_spec[4] a_spec[5] a_spec[6]
## 0.02898528 0.02902509 0.03165066 0.02661199 0.03213437 0.60104659
## a_spec[7] a_spec[8] a_spec[9] a_spec[10] a_spec[11] a_spec[12]
## 77.78575033 6.94200811 3.27045207 13.43827881 10.43981490 0.71629881
## a_spec[13] a_spec[14] a_spec[15] a bnoagg bmean
## 12.16674834 24.97323128 1.02865896 0.03867311 0.14708296 0.36315944
## btrial sigma_worker sigma_spec
## 1.01935323 4.05029496 22.75104265
## 105 vector or matrix parameters omitted in display. Use depth=2 to show them.
Results
## Mean StdDev lower 0.95 upper 0.95 n_eff Rhat
## bnoagg -0.48345160 0.3693295 -1.241617 0.2316175 4210.976 1.000095
## bmean -0.01623524 0.3352617 -0.692010 0.6330092 4159.446 1.000071
## a_worker[1] a_worker[2] a_worker[3] a_worker[4] a_worker[5] a_worker[6]
## 0.50100249 1.44086284 1.61635801 1.02495007 0.79328642 1.18890139
## a_worker[7] a_worker[8] a_worker[9] a_worker[10] a_worker[11] a_worker[12]
## 2.12352479 0.59385158 2.08078343 0.61591522 0.75314360 1.23568505
## a_worker[13] a_worker[14] a_worker[15] a_worker[16] a_worker[17] a_worker[18]
## 0.63226914 1.46245002 0.72050738 2.00725339 1.21727331 0.55873673
## a_worker[19] a_worker[20] a_worker[21] a_worker[22] a_worker[23] a_worker[24]
## 0.97644795 0.60508587 4.07672760 0.53540105 0.96487614 0.61369462
## a_worker[25] a_worker[26] a_worker[27] a_worker[28] a_worker[29] a_worker[30]
## 0.64977870 1.99899764 0.73319962 0.79424685 0.57202028 1.47555333
## a_worker[31] a_worker[32] a_worker[33] a_worker[34] a_worker[35] a_worker[36]
## 0.79178937 1.47725603 1.25251508 0.76339749 0.62649538 0.60162545
## a_worker[37] a_worker[38] a_worker[39] a_worker[40] a_worker[41] a_worker[42]
## 0.49966347 1.08235997 5.16548289 2.46415650 0.56524445 0.40781018
## a_worker[43] a_worker[44] a_worker[45] a_worker[46] a_worker[47] a_worker[48]
## 0.53647838 3.80617028 1.63321090 0.81389465 0.64732810 0.58657883
## a_worker[49] a_worker[50] a_worker[51] a_worker[52] a_worker[53] a_worker[54]
## 0.54527507 0.53798643 0.65254359 1.37777698 0.55488911 1.07810943
## a_worker[55] a_worker[56] a_worker[57] a_worker[58] a_worker[59] a_worker[60]
## 2.37975170 1.49125014 0.62743463 0.57539195 0.62151526 2.79384795
## a_worker[61] a_worker[62] a_worker[63] a_worker[64] a_worker[65] a_worker[66]
## 0.62722810 0.60989141 0.46824021 3.04380184 0.65025406 3.81356283
## a_worker[67] a_worker[68] a_worker[69] a_worker[70] a_worker[71] a_worker[72]
## 1.43840790 0.65130152 2.91434238 0.58431946 0.59300159 1.64593718
## a_worker[73] a_worker[74] a_worker[75] a_worker[76] a_worker[77] a_worker[78]
## 3.17821848 3.39199590 0.58051115 0.52222142 1.23274006 0.62966732
## a_worker[79] a_worker[80] a_worker[81] a_worker[82] a_worker[83] a_worker[84]
## 1.36212026 0.63697347 0.76210187 1.29843804 1.75182981 0.64535902
## a_worker[85] a_worker[86] a_worker[87] a_worker[88] a_worker[89] a_worker[90]
## 0.59760086 0.55192005 9.09946231 0.98441798 0.54646360 0.73130402
## a_spec[1] a_spec[2] a_spec[3] a_spec[4] a_spec[5] a_spec[6]
## 0.18761129 0.17619858 0.20312334 0.15997216 0.19231273 0.71540897
## a_spec[7] a_spec[8] a_spec[9] a_spec[10] a_spec[11] a_spec[12]
## 18.44706909 4.98426814 0.69231102 4.62222992 3.83070138 1.70602665
## a_spec[13] a_spec[14] a_spec[15] a bnoagg bmean
## 0.48747877 3.32762361 1.61169149 0.01913411 0.61665128 0.98389585
## btrial sigma_worker sigma_spec
## 0.93758693 2.98347777 5.92455170
Plot
## 105 vector or matrix parameters omitted in display. Use depth=2 to show them.
We do an exploratory analysis of a code for generalizations that do not include a magnitude of effect, but state whether or not there is an effect (e.g. “Number of time on sites is pretty unrelated to the number of visits” or “Ad Campaign B resulted in the greatest number of purchases”). Of the 618 generalizations coded as dichotomous, the mean aggregation strategy accounts for the largest share at 38%, versus 32% for disaggregation with means and 30% for disaggregation. This is consistent with our hypothesis that there would be a higher rate of generalizations coded as dichotomous under the mean aggregation condition.
df %>%
subset(dichotomous == TRUE) %>% # rows where dichotomous is NA are dropped by subset()
ddply(.(aggStrat), summarise,
total=sum(correct==TRUE)+sum(correct==FALSE),
percentOfTotal=total/618, # 618 is hard coded from nrow of the dichotomous subset
Correct=sum(correct==TRUE),
acc=Correct/total)
## aggStrat total percentOfTotal Correct acc
## 1 disagg 184 0.2977346 103 0.5597826
## 2 disagg+mean 200 0.3236246 121 0.6050000
## 3 mean 234 0.3786408 180 0.7692308
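The `percentOfTotal` column is each condition's share of all 618 dichotomous generalizations. Because the conditions contain different numbers of generalizations overall, it can also help to look at the rate within each condition. The sketch below computes both from counts already reported in this document (the per-condition totals of 607, 608, and 528 come from the aggregation strategy summary); it is an illustrative calculation rather than part of the pre-registered analysis.

```r
# Dichotomous generalizations per condition (from the table above).
dichotomous <- c(disagg = 184, "disagg+mean" = 200, mean = 234)

# All generalizations per condition (from the aggregation strategy summary).
all_generalizations <- c(disagg = 607, "disagg+mean" = 608, mean = 528)

# Share of the dichotomous subset, as in percentOfTotal.
share_of_dichotomous <- dichotomous / sum(dichotomous)

# Rate of dichotomous generalizations within each condition.
rate_within_condition <- dichotomous / all_generalizations

round(rbind(share_of_dichotomous, rate_within_condition), 3)
```

The ordering of the conditions is the same under both measures, with the mean condition highest.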
## Mean accuracy with standard error, taking into account differences between workers
df %>%
subset(df$dichotomous == TRUE, na.rm=TRUE) %>%
ddply(.(workerId, aggStrat), summarise,
TotalIsEffect=sum(effectSizeMagnitude==TRUE),
total=sum(correct==TRUE)+sum(correct==FALSE),
correct=sum(correct==TRUE),
accuracy=correct/total) %>%
ddply(~aggStrat, summarise,
N = sum((total)),
meanAccuracy = mean(accuracy),
sd = sd(accuracy),
se = sd / sqrt(N))
## aggStrat N meanAccuracy sd se
## 1 disagg 184 0.5472944 0.3866051 0.02850091
## 2 disagg+mean 200 0.5801282 0.3823680 0.02703750
## 3 mean 234 0.7849307 0.3143897 0.02055230
## Using aggStrat as id variables
## 105 vector or matrix parameters omitted in display. Use depth=2 to show them.
## Mean StdDev lower 0.95 upper 0.95 n_eff Rhat
## bnoagg -0.8566891 0.1581353 -1.1685537 -0.5562622 4608.113 1.000128
## bmean -0.5644486 0.1577958 -0.8762769 -0.2564761 5174.422 1.000734
## Mean StdDev lower 0.95 upper 0.95 n_eff Rhat
## bnoagg 0.4245654 1.171325 0.3108162 0.5733481 Inf 2.718630
## bmean 0.5686736 1.170927 0.4163301 0.7737735 Inf 2.720279
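The second table above appears to be the first one with `exp()` applied to every column: the `Mean` and interval columns become odds ratios, but the `StdDev`, `n_eff`, and `Rhat` columns are no longer meaningful (for example, exp(1.000128) is about 2.7186, and exp of an effective sample size in the thousands overflows to `Inf`). A minimal sketch that exponentiates only the columns that have an odds-ratio interpretation, using values copied from the first table:

```r
# Posterior means and 95% interval bounds on the log-odds scale.
log_scale <- data.frame(
  Mean  = c(-0.8566891, -0.5644486),
  lower = c(-1.1685537, -0.8762769),
  upper = c(-0.5562622, -0.2564761),
  row.names = c("bnoagg", "bmean")
)

# exp() applies elementwise to an all-numeric data frame.
odds_scale <- exp(log_scale)
round(odds_scale, 4)

# Diagnostics such as Rhat should be read from the original table;
# exponentiating an Rhat of about 1 just gives a value near e.
exp(1.000128)
```

On the odds-ratio scale, an interval that excludes 1 corresponds to a log-odds interval that excludes 0.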
## a_worker[1] a_worker[2] a_worker[3] a_worker[4] a_worker[5] a_worker[6]
## 0.14157342 2.84402149 2.02056983 0.82161308 3.80030043 0.42449920
## a_worker[7] a_worker[8] a_worker[9] a_worker[10] a_worker[11] a_worker[12]
## 2.21518137 1.30905073 1.90027121 0.74562103 0.79227399 0.95354742
## a_worker[13] a_worker[14] a_worker[15] a_worker[16] a_worker[17] a_worker[18]
## 2.33272060 3.37762828 2.29886750 1.16813066 2.67251928 5.73893486
## a_worker[19] a_worker[20] a_worker[21] a_worker[22] a_worker[23] a_worker[24]
## 4.96133868 0.49007779 0.88957351 0.91469013 1.03435029 1.13001261
## a_worker[25] a_worker[26] a_worker[27] a_worker[28] a_worker[29] a_worker[30]
## 1.09561114 1.60376669 1.48539833 1.78881737 2.78552001 0.43352650
## a_worker[31] a_worker[32] a_worker[33] a_worker[34] a_worker[35] a_worker[36]
## 1.33906455 1.24983534 0.72334710 1.41242162 2.09764342 2.73002110
## a_worker[37] a_worker[38] a_worker[39] a_worker[40] a_worker[41] a_worker[42]
## 0.60396873 0.23053052 1.13512473 1.27548046 0.82286178 0.06027365
## a_worker[43] a_worker[44] a_worker[45] a_worker[46] a_worker[47] a_worker[48]
## 0.14854118 0.50511196 1.77080170 0.94569199 1.77011396 1.50826733
## a_worker[49] a_worker[50] a_worker[51] a_worker[52] a_worker[53] a_worker[54]
## 0.67860945 0.96179969 0.43404275 2.59343987 0.75074608 1.34063041
## a_worker[55] a_worker[56] a_worker[57] a_worker[58] a_worker[59] a_worker[60]
## 0.82993285 0.36169230 2.91050182 0.85130740 1.38948189 0.40885941
## a_worker[61] a_worker[62] a_worker[63] a_worker[64] a_worker[65] a_worker[66]
## 1.25015453 0.24077538 0.88079930 1.51756259 0.72081710 0.20436293
## a_worker[67] a_worker[68] a_worker[69] a_worker[70] a_worker[71] a_worker[72]
## 1.71889707 0.49528387 0.62774952 1.07180450 1.10820912 0.94002904
## a_worker[73] a_worker[74] a_worker[75] a_worker[76] a_worker[77] a_worker[78]
## 0.41168817 1.63532014 3.09004940 0.74483042 1.79618374 3.84255086
## a_worker[79] a_worker[80] a_worker[81] a_worker[82] a_worker[83] a_worker[84]
## 0.55338873 2.74246807 0.33935102 0.73121980 0.84268166 3.07434946
## a_worker[85] a_worker[86] a_worker[87] a_worker[88] a_worker[89] a_worker[90]
## 2.32832524 0.06677319 0.53647950 1.25349825 0.37563455 0.49321930
## a_spec[1] a_spec[2] a_spec[3] a_spec[4] a_spec[5] a_spec[6]
## 0.03861059 0.03769095 0.20156934 0.10810241 0.15730483 4.48918121
## a_spec[7] a_spec[8] a_spec[9] a_spec[10] a_spec[11] a_spec[12]
## 4.33769020 6.40467838 4.66579165 4.16231966 1.04367312 4.59556069
## a_spec[13] a_spec[14] a_spec[15] a bnoagg bmean
## 2.33362494 3.10906211 3.58113876 0.30243252 0.42456544 0.56867360
## btrial sigma_worker sigma_spec
## 1.04608106 2.81358175 7.50134184
## 105 vector or matrix parameters omitted in display. Use depth=2 to show them.
## aggStrat N meanAcc sd se
## 1 disagg 184 0.5472944 0.3866051 0.02850091
## 2 disagg+mean 200 0.5801282 0.3823680 0.02703750
## 3 mean 234 0.7849307 0.3143897 0.02055230