Power and Sample Size Simulation: Typicality, goodness, imageability, and familiarity of stimuli across 16 categories
Necdet Gürkan & Yangyang Yu
2024-11-10
Source: vanpaemel_vignette.Rmd
Project/Data Title:
Exemplar by feature applicability matrices and other Dutch normative data for semantic concepts
Data provided by: Wolf Vanpaemel
Project/Data Description:
This dataset provides extensive exemplar by feature applicability matrices covering 15 or 16 different categories (birds, fish, insects, mammals, amphibians/reptiles, clothing, kitchen utensils, musical instruments, tools, vehicles, weapons, fruit, vegetables, professions, and sports), as well as two large semantic domains (animals and artifacts). For all exemplars of the semantic categories, typicality ratings, goodness ratings, goodness rank order, generation frequency, exemplar associative strength, category associative strength, estimated age of acquisition, word frequency, familiarity ratings, imageability ratings, and pairwise similarity ratings are provided as well. The structure of the dataset is not easily machine-readable. Here, we only consider typicality.
Methods Description:
The typicality data were collected as part of a larger data collection. Here we describe the typicality data collection only. The data collection took place in a large classroom where all the participants were present at the same time. The participants received a booklet with instructions on the first page, followed by four sheets with a semantic category label printed in bold on top. Each of the category labels was followed by a list of 5–33 items belonging to that category, referring to exemplars. The participants were asked to indicate, for every item in the list, how typical it was for the category printed on top of the page. They used a Likert-type rating scale, ranging from 1 for very atypical items to 20 for very typical items. If they encountered an exemplar they did not know, they were asked to circle it. Every participant completed typicality ratings for four different categories. The assignment of categories to participants was randomized. For every category, four different random permutations of the exemplars were used, and each of these permutations was distributed with an equal frequency among the participants. All the exemplars of a category were rated by 28 different participants.
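As a sketch of the design described above (lists of 5–33 exemplars, a 1–20 rating scale, 28 raters per category), here is what one category's rating data might look like in long format; the exemplar names and scores are simulated placeholders, not the real data:

```r
set.seed(42)  # for reproducibility

# Hypothetical booklet page: 5 exemplars rated on a 1-20 typicality
# scale by 28 participants (names and scores are made up)
exemplars <- paste0("exemplar_", 1:5)
n_raters <- 28

ratings <- expand.grid(item = exemplars,
                       participant = paste0("X.", 1:n_raters))
ratings$score <- sample(1:20, nrow(ratings), replace = TRUE)

# each exemplar gets exactly 28 ratings, as in the real design
table(ratings$item)
```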
Data Location:
https://static-content.springer.com/esm/art%3A10.3758%2FBRM.40.4.1030/MediaObjects/DeDeyne-BRM-2008b.zip and included here.
### for typicality data -- cleaning and processing
library(dplyr)
library(tidyr)
library(flextable)
library(semanticprimeR) # for calculate_cutoff() / calculate_correction()

# read in all 16 category files
typicality_fnames <- list.files(path = "data/vanpaemel_data",
                                full.names = TRUE)
typicality_dfs <- lapply(typicality_fnames, read.csv)

# tag each data frame with its category ID before combining
ID <- 1:16
typicality_dfs <- mapply(cbind, typicality_dfs, "SampleID" = ID,
                         SIMPLIFY = FALSE)
typicality_all_df <- bind_rows(typicality_dfs)

typicality_all_df_v2 <- typicality_all_df %>%
  # combine the Dutch and English item labels into one column
  unite("comp_group", X:X.1, remove = TRUE) %>%
  # drop empty trailing columns and rows with missing ratings
  select(-c(30:34)) %>%
  drop_na(2:29) %>%
  filter_all(any_vars(!is.na(.))) %>%
  dplyr::rename(compType = SampleID)
# typicality_all_df_v2
typicality_all_df_v3 <- typicality_all_df_v2 %>%
  select(starts_with("X"), compType, comp_group) %>%
  pivot_longer(cols = starts_with("X"),
               names_to = "participant",
               values_to = "score")
head(typicality_all_df_v3)
#> # A tibble: 6 × 4
#> compType comp_group participant score
#> <int> <chr> <chr> <int>
#> 1 1 kikker_frog X.2 18
#> 2 1 kikker_frog X.3 20
#> 3 1 kikker_frog X.4 19
#> 4 1 kikker_frog X.5 12
#> 5 1 kikker_frog X.6 20
#> 6 1 kikker_frog X.7 15
Dataset Citation:
De Deyne, S., Verheyen, S., Ameel, E. et al. Exemplar by feature applicability matrices and other Dutch normative data for semantic concepts. Behavior Research Methods 40, 1030–1048 (2008). https://doi.org/10.3758/BRM.40.4.1030
Column Metadata:
| Variable Name | Variable Description | Type (numeric, character, logical, etc.) |
|---|---|---|
| compType | Comparison type for typicality rating | Character |
| comp_group | Individual items within compType | Character |
| participant | Participant number | Character |
| score | Typicality: how typical is the item for the category? | Numeric |
AIPE (Accuracy in Parameter Estimation) Analysis:
Stopping Rule
In this example, we pick one comparison type and use the items within it to estimate the required sample size. This choice is arbitrary!
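The stopping rule rests on the fact that the standard error of a mean shrinks with sample size; a minimal sketch with simulated ratings (the data here are made up):

```r
# The stopping-rule logic in miniature: the standard error of a mean,
# SE = sd(x) / sqrt(n), shrinks as n grows, so we can ask how large n
# must be before SE drops below a chosen accuracy cutoff.
se_of_mean <- function(x) sd(x) / sqrt(length(x))

# made-up ratings on the 1-20 typicality scale
set.seed(1)
scores <- sample(1:20, 1000, replace = TRUE)

# SE at a few candidate sample sizes
sapply(c(10, 50, 200), function(n) se_of_mean(scores[1:n]))
```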
# individual SEs among the different comparison types
SE <- tapply(typicality_all_df_v3$score, typicality_all_df_v3$compType,
             function(x) sd(x) / sqrt(length(x)))
SE
#> 1 2 3 4 5 6 7 8
#> 0.4847915 0.1868793 0.1894860 0.2326625 0.1862387 0.2310363 0.1433243 0.1751163
#> 9 10 11 12 14 16
#> 0.1888044 0.1563060 0.2512611 0.1945454 0.2042343 0.2520606
min(SE)
#> [1] 0.1433243
max(SE)
#> [1] 0.4847915
# comparison type 1: amphibians
typicality_data_gp1_sub <- subset(typicality_all_df_v3, compType == 1)
# individual SEs for comparison type 1
SE1 <- tapply(typicality_data_gp1_sub$score, typicality_data_gp1_sub$comp_group,
              function(x) sd(x) / sqrt(length(x)))
SE1
#> kikker_frog krokodil_crocodile pad_toad
#> 0.4836714 1.1085074 0.7368140
#> salamander_salamander schildpad_tortoise
#> 0.7531742 1.6330366
# sequence of sample sizes to try
nsim <- 10 # small for CRAN
samplesize_values <- seq(5, 200, 5)

# create a blank table to save the simulated SEs in
sim_table <- matrix(NA,
                    nrow = length(samplesize_values) * nsim,
                    ncol = length(unique(typicality_data_gp1_sub$comp_group)))

# make it a data frame
sim_table <- as.data.frame(sim_table)

# add a place for sample size values
sim_table$sample_size <- NA
sim_table$var <- "score"

iterate <- 1
for (p in 1:nsim){
  # loop over sample sizes for comparison type
  for (i in seq_along(samplesize_values)){
    # temp data frame for comparison type 1 that samples and summarizes
    temp1 <- typicality_data_gp1_sub %>%
      dplyr::group_by(comp_group) %>%
      dplyr::sample_n(samplesize_values[i], replace = TRUE) %>%
      dplyr::summarize(se2 = sd(score) / sqrt(length(score)))

    # add to table
    colnames(sim_table)[1:length(unique(typicality_data_gp1_sub$comp_group))] <- temp1$comp_group
    sim_table[iterate, 1:length(unique(typicality_data_gp1_sub$comp_group))] <- temp1$se2
    sim_table[iterate, "sample_size"] <- samplesize_values[i]
    sim_table[iterate, "nsim"] <- p
    iterate <- iterate + 1
  }
}
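For readers who prefer a functional style, the nested loop above can be restated compactly; this is an equivalent sketch of the same resampling logic, not the vignette's canonical code (`simulate_se`, `pilot`, and the item names are placeholders):

```r
# A compact base-R restatement of the simulation loop above: for each
# candidate sample size n, resample every item's scores with
# replacement and compute the SE of the resampled mean. `pilot` is a
# placeholder data frame standing in for typicality_data_gp1_sub.
simulate_se <- function(pilot, n) {
  ses <- tapply(pilot$score, pilot$comp_group, function(x) {
    resampled <- sample(x, n, replace = TRUE)
    sd(resampled) / sqrt(n)
  })
  data.frame(comp_group = names(ses),
             se = as.numeric(ses),
             sample_size = n)
}

# example run with made-up data
set.seed(7)
pilot <- data.frame(comp_group = rep(c("item_a", "item_b"), each = 30),
                    score = sample(1:20, 60, replace = TRUE))
simulate_se(pilot, 25)
```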
Calculate the cutoff score with information necessary for correction.
cutoff <- calculate_cutoff(population = typicality_data_gp1_sub,
                           grouping_items = "comp_group",
                           score = "score",
                           minimum = min(typicality_data_gp1_sub$score),
                           maximum = max(typicality_data_gp1_sub$score))
cutoff$cutoff
#> 40%
#> 0.7466301
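The reported value appears to be the 40% quantile of the pilot item-level SEs; assuming that is what calculate_cutoff computes internally (an assumption about its internals, not documented behavior here), it can be reproduced with base R:

```r
# Pilot item-level SEs for comparison type 1, copied from SE1 above
se1 <- c(kikker_frog = 0.4836714, krokodil_crocodile = 1.1085074,
         pad_toad = 0.7368140, salamander_salamander = 0.7531742,
         schildpad_tortoise = 1.6330366)

# The reported cutoff matches the 40% quantile of these SEs
# (assumed to mirror calculate_cutoff's internal computation)
quantile(se1, probs = 0.40)
#>       40%
#> 0.7466301
```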
### for response outputs
# figure out cutoff
final_sample <- sim_table %>%
  pivot_longer(cols = -c(sample_size, var, nsim)) %>%
  dplyr::rename(item = name, se = value) %>%
  dplyr::group_by(sample_size, var, nsim) %>%
  dplyr::summarize(percent_below =
                     sum(se <= cutoff$cutoff) / length(unique(typicality_data_gp1_sub$comp_group))) %>%
  ungroup() %>%
  # then summarize down, averaging the percents across simulations
  dplyr::group_by(sample_size, var) %>%
  dplyr::summarize(percent_below = mean(percent_below)) %>%
  dplyr::arrange(percent_below) %>%
  ungroup()
#> `summarise()` has grouped output by 'sample_size', 'var'. You can override
#> using the `.groups` argument.
#> `summarise()` has grouped output by 'sample_size'. You can override using the
#> `.groups` argument.
flextable(final_sample) %>% autofit()
| sample_size | var | percent_below |
|---|---|---|
| 10 | score | 0.12 |
| 5 | score | 0.16 |
| 15 | score | 0.18 |
| 20 | score | 0.30 |
| 25 | score | 0.38 |
| 30 | score | 0.52 |
| 35 | score | 0.56 |
| 40 | score | 0.56 |
| 45 | score | 0.62 |
| 50 | score | 0.62 |
| 55 | score | 0.66 |
| 60 | score | 0.68 |
| 65 | score | 0.78 |
| 70 | score | 0.80 |
| 75 | score | 0.82 |
| 80 | score | 0.82 |
| 85 | score | 0.82 |
| 95 | score | 0.82 |
| 110 | score | 0.84 |
| 115 | score | 0.84 |
| 100 | score | 0.86 |
| 90 | score | 0.88 |
| 105 | score | 0.88 |
| 125 | score | 0.88 |
| 135 | score | 0.88 |
| 130 | score | 0.90 |
| 150 | score | 0.92 |
| 120 | score | 0.94 |
| 140 | score | 0.94 |
| 155 | score | 0.94 |
| 195 | score | 0.96 |
| 160 | score | 0.98 |
| 170 | score | 0.98 |
| 175 | score | 0.98 |
| 145 | score | 1.00 |
| 165 | score | 1.00 |
| 180 | score | 1.00 |
| 185 | score | 1.00 |
| 190 | score | 1.00 |
| 200 | score | 1.00 |
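One way to read this table off programmatically is to take the smallest sample size at which a target share of items (say, 80%) falls below the cutoff; with the values above that is 70. The data frame below copies the relevant rows from the table:

```r
# percent_below values copied from the rows of the table above
final_sample <- data.frame(
  sample_size = c(10, 5, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70),
  percent_below = c(0.12, 0.16, 0.18, 0.30, 0.38, 0.52, 0.56, 0.56,
                    0.62, 0.62, 0.66, 0.68, 0.78, 0.80)
)

# smallest sample size where at least 80% of item SEs fall below the cutoff
min(final_sample$sample_size[final_sample$percent_below >= 0.80])
#> [1] 70
```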
Calculate the final corrected scores:
final_scores <- calculate_correction(proportion_summary = final_sample,
                                     pilot_sample_size = length(unique(typicality_data_gp1_sub$participant)),
                                     proportion_variability = cutoff$prop_var)
flextable(final_scores) %>% autofit()
| percent_below | sample_size | corrected_sample_size |
|---|---|---|
| 80 | 70 | 69.05319 |
| 88 | 90 | 86.00674 |
| 94 | 120 | 108.71054 |
| 100 | 145 | 126.05262 |