Power and Sample Size Simulation: Survival Processing Usefulness
Necdet Gürkan & Yangyang Yu
2024-11-10
roer_vignette.Rmd
Project/Data Description:
The data come from a conceptual replication study on the survival processing effect. The survival processing effect refers to the finding that rating words according to their relevance in a survival-related scenario leads to better retention than processing words in a number of other fictional scenarios. Participants were randomly assigned to one of the rating scenarios (survival, afterlife, moving). The to-be-rated words were presented individually in a random order on the computer screen. Each word remained on the screen for five seconds. Participants rated the words by clicking on a 5-point scale that ranged from completely useless (1) to very useful (5), which was displayed right below the word.
Methods Description:
Participants were students at Heinrich Heine University Düsseldorf, Germany, who were paid for participating or received course credit. Their ages ranged from 18 to 55 years. The to-be-rated words consisted of 30 typical members of 30 categories drawn from the updated Battig and Montague norms (Van Overschelde, Rawson, & Dunlosky, 2004).
Data Location:
The data are included within this vignette. We drop the Scenario column because the mean and standard deviation of the item ratings were identical across scenarios. We also add a participant identifier column to keep this script consistent with the other vignettes.
# packages used throughout this vignette
library(rio)
library(dplyr)
library(tidyr)
library(flextable)
library(semanticprimeR)

DF <- import("data/roer_data.xlsx")

# drop the Scenario column
drops <- c("Scenario")
DF <- DF[ , !(names(DF) %in% drops)]

# add a participant identifier
DF <- cbind(Participant_Number = 1:nrow(DF), DF)
str(DF)
#> 'data.frame': 218 obs. of 31 variables:
#> $ Participant_Number: int 1 2 3 4 5 6 7 8 9 10 ...
#> $ Item_1 : num 1 3 3 2 3 3 3 1 4 1 ...
#> $ Item_2 : num 2 5 4 1 4 5 4 3 5 3 ...
#> $ Item_3 : num 5 3 1 5 2 2 5 2 3 4 ...
#> $ Item_4 : num 1 4 2 0 3 2 2 1 4 2 ...
#> $ Item_5 : num 3 5 1 5 1 3 5 2 0 4 ...
#> $ Item_6 : num 1 3 2 1 4 1 5 2 2 1 ...
#> $ Item_7 : num 4 2 4 1 4 4 4 1 5 3 ...
#> $ Item_8 : num 3 5 5 4 4 2 5 5 2 3 ...
#> $ Item_9 : num 1 5 4 5 4 2 4 1 3 3 ...
#> $ Item_10 : num 0 5 5 5 0 5 5 4 0 4 ...
#> $ Item_11 : num 3 5 4 5 3 1 5 5 2 1 ...
#> $ Item_12 : num 3 5 5 5 5 1 3 5 1 1 ...
#> $ Item_13 : num 1 1 3 1 2 1 1 4 1 1 ...
#> $ Item_14 : num 1 2 1 4 1 2 2 1 2 5 ...
#> $ Item_15 : num 1 4 2 1 5 1 3 2 2 1 ...
#> $ Item_16 : num 4 5 4 2 3 3 4 3 3 4 ...
#> $ Item_17 : num 3 5 0 1 3 1 4 3 3 3 ...
#> $ Item_18 : num 2 5 4 1 5 5 5 2 5 4 ...
#> $ Item_19 : num 2 1 1 4 1 3 4 2 4 3 ...
#> $ Item_20 : num 5 3 4 5 2 4 5 3 3 2 ...
#> $ Item_21 : num 3 0 4 5 3 4 1 1 4 2 ...
#> $ Item_22 : num 3 5 1 1 2 3 5 4 3 4 ...
#> $ Item_23 : num 1 1 2 1 1 4 3 1 5 2 ...
#> $ Item_24 : num 1 4 4 2 3 2 4 5 2 0 ...
#> $ Item_25 : num 2 5 5 3 5 1 5 4 3 4 ...
#> $ Item_26 : num 1 4 3 1 3 1 2 1 2 3 ...
#> $ Item_27 : num 1 5 1 1 5 2 4 4 3 4 ...
#> $ Item_28 : num 1 5 1 1 4 2 3 5 2 3 ...
#> $ Item_29 : num 1 5 2 3 2 1 4 3 1 3 ...
#> $ Item_30 : num 3 5 4 1 1 2 5 4 4 5 ...
Dataset Citation:
Röer, J. P., Bell, R., & Buchner, A. (2013). Is the survival-processing memory advantage due to richness of encoding? Journal of Experimental Psychology: Learning, Memory, and Cognition, 39, 1294-1302.
Column Metadata:
Variable Name | Variable Description | Type (numeric, character, logical, etc.)
---|---|---
Items | Item ratings for Item_1 to Item_30 | Numeric
Scenario | Categorical scenarios: 1, 2, 3 | Numeric
AIPE Analysis:
DF_long <- pivot_longer(DF, cols = -c(Participant_Number)) %>%
  dplyr::rename(item = name, score = value)

flextable(head(DF_long)) %>% autofit()
Participant_Number | item | score
---|---|---
1 | Item_1 | 1
1 | Item_2 | 2
1 | Item_3 | 5
1 | Item_4 | 1
1 | Item_5 | 3
1 | Item_6 | 1
Stopping Rule
What is the typical standard error for these data that we could use for our stopping rule, taking the 40th percentile of the item standard errors?
# individual SEs
SE <- tapply(DF_long$score, DF_long$item, function (x) { sd(x)/sqrt(length(x)) })
SE
#> Item_1 Item_10 Item_11 Item_12 Item_13 Item_14 Item_15
#> 0.08860091 0.09743922 0.10419876 0.10725788 0.07167329 0.11502337 0.09746261
#> Item_16 Item_17 Item_18 Item_19 Item_2 Item_20 Item_21
#> 0.09425595 0.08981453 0.08968434 0.09817041 0.10421504 0.10768420 0.10195369
#> Item_22 Item_23 Item_24 Item_25 Item_26 Item_27 Item_28
#> 0.09891678 0.11403914 0.08751293 0.10279479 0.08015419 0.09612067 0.09697587
#> Item_29 Item_3 Item_30 Item_4 Item_5 Item_6 Item_7
#> 0.08758437 0.10963101 0.11319421 0.07944272 0.11514133 0.08852481 0.10163361
#> Item_8 Item_9
#> 0.09615698 0.09118122
cutoff <- quantile(SE, probs = .40)
cutoff
#> 40%
#> 0.09614246
# we could also use the cutoff score function in semanticprimeR
cutoff_score <- calculate_cutoff(population = DF_long,
                                 grouping_items = "item",
                                 score = "score",
                                 minimum = min(DF_long$score),
                                 maximum = max(DF_long$score))
cutoff_score$cutoff
#> 40%
#> 0.09614246
Using the 40th percentile as a guide, we find that 0.096 is our target standard error for an accurately measured item.
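To make the cutoff concrete, the short sketch below (an addition, not part of the original analysis) translates the target standard error into the approximate 95% confidence-interval half-width it implies for an item mean, using a normal approximation.
# Sketch (assumes the cutoff object computed above): approximate 95% CI
# half-width for an item mean implied by the target standard error
qnorm(.975) * cutoff
# about 0.19 on the rating scale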
Minimum Sample Size
To estimate the minimum sample size, we should figure out how many participants it would take for 80%, 85%, 90%, and 95% of the item standard errors to fall below our critical value of 0.096.
# number of simulations (kept small for CRAN; increase for a real analysis)
nsim <- 10

# sequence of sample sizes to try
samplesize_values <- seq(20, 500, 5)

# create a blank table for us to save the values in
sim_table <- matrix(NA,
                    nrow = length(samplesize_values)*nsim,
                    ncol = length(unique(DF_long$item)))

# make it a data frame
sim_table <- as.data.frame(sim_table)

# add a place for sample size values
sim_table$sample_size <- NA
iterate <- 1
for (p in 1:nsim){
  # loop over sample sizes
  for (i in 1:length(samplesize_values)){
    # temp dataframe that samples and summarizes
    temp <- DF_long %>%
      group_by(item) %>%
      sample_n(samplesize_values[i], replace = T) %>%
      summarize(se = sd(score)/sqrt(length(score)))

    colnames(sim_table)[1:length(unique(DF_long$item))] <- temp$item
    sim_table[iterate, 1:length(unique(DF_long$item))] <- temp$se
    sim_table[iterate, "sample_size"] <- samplesize_values[i]
    sim_table[iterate, "nsim"] <- p

    iterate <- iterate + 1
  }
}
final_sample <-
  sim_table %>%
  pivot_longer(cols = -c(sample_size, nsim)) %>%
  dplyr::rename(item = name, se = value) %>%
  group_by(sample_size, nsim) %>%
  summarize(percent_below = sum(se <= cutoff)/length(unique(DF_long$item))) %>%
  ungroup() %>%
  # then average the proportions across simulations
  dplyr::group_by(sample_size) %>%
  summarize(percent_below = mean(percent_below)) %>%
  dplyr::arrange(percent_below) %>%
  ungroup()
#> `summarise()` has grouped output by 'sample_size'. You can override using the
#> `.groups` argument.
flextable(final_sample %>% head()) %>% autofit()
sample_size | percent_below
---|---
20 | 0
25 | 0
30 | 0
35 | 0
40 | 0
45 | 0
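Before correcting for the pilot sample size, a quick filter (a sketch added here, not part of the original output) shows the smallest simulated sample size at which each target proportion of item standard errors falls below the cutoff; note that percent_below is on the 0 to 1 proportion scale at this stage.
# Sketch (assumes final_sample and cutoff from above; proportions are 0-1 here):
# smallest simulated sample size reaching each target proportion
targets <- c(.80, .85, .90, .95)
sapply(targets, function(target) {
  min(final_sample$sample_size[final_sample$percent_below >= target])
})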
final_table <- calculate_correction(
  proportion_summary = final_sample,
  pilot_sample_size = DF_long %>% group_by(item) %>%
    summarize(sample_size = n()) %>% ungroup() %>%
    summarize(avg_sample = mean(sample_size)) %>% pull(avg_sample),
  proportion_variability = cutoff_score$prop_var
)

flextable(final_table) %>%
  autofit()
percent_below | sample_size | corrected_sample_size
---|---|---
80.33333 | 270 | 60.18641
85.00000 | 285 | 67.54308
92.33333 | 300 | 75.18211
95.33333 | 315 | 82.85322
Based on these simulations, our minimum sample size is likely close to 60 participants, the corrected estimate at which at least 80% of the items reach the target standard error; roughly 83 participants would be needed to cover 95% of the items.
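As a convenience, the corrected sample size for a chosen accuracy level can also be pulled directly from final_table; the snippet below is a sketch added here, not part of the original script.
# Sketch (assumes final_table from calculate_correction above): corrected
# sample size at the first accuracy level covering at least 80% of items
final_table %>%
  dplyr::filter(percent_below >= 80) %>%
  dplyr::slice_min(percent_below) %>%
  dplyr::pull(corrected_sample_size)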