Power and Sample Size Simulation: Attractiveness Ratings Example
Erin M. Buchanan
2024-11-10
batres_vignette.Rmd
Vignette Setup:
knitr::opts_chunk$set(echo = TRUE)
# Set a random seed
set.seed(5989320)
# Libraries necessary for this vignette
library(rio)
library(flextable)
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(tidyr)
library(psych)
library(semanticprimeR)
#>
#> Attaching package: 'semanticprimeR'
#> The following object is masked from 'package:dplyr':
#>
#> top_n
# Function for simulation
item_power <- function(data, # name of data frame
                       dv_col, # name of DV column as a character
                       item_col, # name of the item column as a character
                       nsim = 10, # small for cran
                       sample_start = 20,
                       sample_stop = 200,
                       sample_increase = 5,
                       decile = .5){

  DF <- cbind.data.frame(
    "dv" = data[ , dv_col],
    "items" = data[ , item_col]
  )

  # just in case
  colnames(DF) <- c("dv", "items")

  # figure out the "sufficiently narrow" ci value
  SE <- tapply(DF$dv, DF$items, function (x) { sd(x)/sqrt(length(x)) })
  cutoff <- quantile(SE, probs = decile)

  # sequence of sample sizes to try
  samplesize_values <- seq(sample_start, sample_stop, sample_increase)

  # create a blank table for us to save the values in
  sim_table <- matrix(NA,
                      nrow = length(samplesize_values)*nsim,
                      ncol = length(unique(DF$items)))

  # make it a data frame
  sim_table <- as.data.frame(sim_table)

  # add a place for sample size values
  sim_table$sample_size <- NA

  iterate <- 1

  for (p in 1:nsim){
    # loop over sample sizes
    for (i in 1:length(samplesize_values)){

      # resample with replacement and summarize the SE for each item
      temp <- DF %>%
        group_by(items) %>%
        sample_n(samplesize_values[i], replace = TRUE) %>%
        summarize(se = sd(dv)/sqrt(length(dv)))

      # save the SE for each item at this sample size
      colnames(sim_table)[1:length(unique(DF$items))] <- temp$items
      sim_table[iterate, 1:length(unique(DF$items))] <- temp$se
      sim_table[iterate, "sample_size"] <- samplesize_values[i]
      sim_table[iterate, "nsim"] <- p

      iterate <- iterate + 1
    }
  }

  # proportion of items below the cutoff at each sample size and simulation
  final_sample <- sim_table %>%
    pivot_longer(cols = -c(sample_size, nsim)) %>%
    dplyr::rename(item = name, se = value) %>%
    group_by(sample_size, nsim) %>%
    summarize(percent_below = sum(se <= cutoff)/length(unique(DF$items))) %>%
    ungroup() %>%
    # then average those proportions across simulations
    dplyr::group_by(sample_size) %>%
    summarize(percent_below = mean(percent_below)) %>%
    dplyr::arrange(percent_below) %>%
    ungroup()

  return(list(
    SE = SE,
    cutoff = cutoff,
    DF = DF,
    sim_table = sim_table,
    final_sample = final_sample
  ))
}
Project/Data Description:
This dataset contains ratings of 20 faces on attractiveness from 200 participants. Ethical approval was received from the Franklin and Marshall Institutional Review Board, and each participant provided informed consent. All participants were located in the United States. Participants were told that they would view several faces, photographed facing forward under constant camera and lighting conditions, with neutral expressions and closed mouths, and that they would rate the attractiveness of each face. Specifically, participants were asked “How attractive is this face?”, where 1 = “Not at all attractive” and 7 = “Very attractive”. Participants rated each face individually, in random order, and with no time limit. Upon completion, participants were compensated for their participation.
Data Location:
Included with the vignette.
DF <- import("data/batres_data.sav")
str(DF)
#> 'data.frame': 200 obs. of 21 variables:
#> $ Participant_Number: num 1 2 3 4 5 6 7 8 9 10 ...
#> ..- attr(*, "label")= chr "Unique number assigned to each participant"
#> ..- attr(*, "format.spss")= chr "F3.0"
#> ..- attr(*, "display_width")= int 12
#> $ Face_1 : num 1 2 5 2 3 1 2 2 1 2 ...
#> ..- attr(*, "label")= chr "Attractiveness rating for face #1"
#> ..- attr(*, "format.spss")= chr "F1.0"
#> ..- attr(*, "display_width")= int 12
#> $ Face_2 : num 1 6 5 2 3 1 3 2 2 2 ...
#> ..- attr(*, "label")= chr "Attractiveness rating for face #2"
#> ..- attr(*, "format.spss")= chr "F1.0"
#> ..- attr(*, "display_width")= int 12
#> $ Face_3 : num 3 6 7 7 4 3 5 4 4 4 ...
#> ..- attr(*, "label")= chr "Attractiveness rating for face #3"
#> ..- attr(*, "format.spss")= chr "F1.0"
#> ..- attr(*, "display_width")= int 12
#> $ Face_4 : num 3 7 5 3 4 3 3 3 4 3 ...
#> ..- attr(*, "label")= chr "Attractiveness rating for face #4"
#> ..- attr(*, "format.spss")= chr "F1.0"
#> ..- attr(*, "display_width")= int 12
#> $ Face_5 : num 5 7 7 5 5 6 3 3 3 3 ...
#> ..- attr(*, "label")= chr "Attractiveness rating for face #5"
#> ..- attr(*, "format.spss")= chr "F1.0"
#> ..- attr(*, "display_width")= int 12
#> $ Face_6 : num 5 5 4 5 6 5 4 4 5 3 ...
#> ..- attr(*, "label")= chr "Attractiveness rating for face #6"
#> ..- attr(*, "format.spss")= chr "F1.0"
#> ..- attr(*, "display_width")= int 12
#> $ Face_7 : num 5 7 7 7 4 5 4 4 5 4 ...
#> ..- attr(*, "label")= chr "Attractiveness rating for face #7"
#> ..- attr(*, "format.spss")= chr "F1.0"
#> ..- attr(*, "display_width")= int 12
#> $ Face_8 : num 4 1 5 3 4 4 4 4 2 4 ...
#> ..- attr(*, "label")= chr "Attractiveness rating for face #8"
#> ..- attr(*, "format.spss")= chr "F1.0"
#> ..- attr(*, "display_width")= int 12
#> $ Face_9 : num 3 5 4 4 3 1 2 2 2 2 ...
#> ..- attr(*, "label")= chr "Attractiveness rating for face #9"
#> ..- attr(*, "format.spss")= chr "F1.0"
#> ..- attr(*, "display_width")= int 12
#> $ Face_10 : num 4 4 7 2 3 3 3 3 5 4 ...
#> ..- attr(*, "label")= chr "Attractiveness rating for face #10"
#> ..- attr(*, "format.spss")= chr "F1.0"
#> ..- attr(*, "display_width")= int 12
#> $ Face_11 : num 2 3 5 4 3 2 3 3 4 2 ...
#> ..- attr(*, "label")= chr "Attractiveness rating for face #11"
#> ..- attr(*, "format.spss")= chr "F1.0"
#> ..- attr(*, "display_width")= int 12
#> $ Face_12 : num 4 7 5 4 4 4 3 3 6 1 ...
#> ..- attr(*, "label")= chr "Attractiveness rating for face #12"
#> ..- attr(*, "format.spss")= chr "F1.0"
#> ..- attr(*, "display_width")= int 12
#> $ Face_13 : num 3 3 4 5 4 3 3 3 3 2 ...
#> ..- attr(*, "label")= chr "Attractiveness rating for face #13"
#> ..- attr(*, "format.spss")= chr "F1.0"
#> ..- attr(*, "display_width")= int 12
#> $ Face_14 : num 5 7 5 5 3 5 5 5 4 2 ...
#> ..- attr(*, "label")= chr "Attractiveness rating for face #14"
#> ..- attr(*, "format.spss")= chr "F1.0"
#> ..- attr(*, "display_width")= int 12
#> $ Face_15 : num 3 7 6 3 4 6 3 3 4 4 ...
#> ..- attr(*, "label")= chr "Attractiveness rating for face #15"
#> ..- attr(*, "format.spss")= chr "F1.0"
#> ..- attr(*, "display_width")= int 12
#> $ Face_16 : num 4 7 5 5 5 4 4 3 5 3 ...
#> ..- attr(*, "label")= chr "Attractiveness rating for face #16"
#> ..- attr(*, "format.spss")= chr "F1.0"
#> ..- attr(*, "display_width")= int 12
#> $ Face_17 : num 4 4 5 3 5 4 3 2 4 2 ...
#> ..- attr(*, "label")= chr "Attractiveness rating for face #17"
#> ..- attr(*, "format.spss")= chr "F1.0"
#> ..- attr(*, "display_width")= int 12
#> $ Face_18 : num 3 5 4 6 4 5 4 5 4 2 ...
#> ..- attr(*, "label")= chr "Attractiveness rating for face #18"
#> ..- attr(*, "format.spss")= chr "F1.0"
#> ..- attr(*, "display_width")= int 12
#> $ Face_19 : num 3 4 5 6 4 4 4 3 3 4 ...
#> ..- attr(*, "label")= chr "Attractiveness rating for face #19"
#> ..- attr(*, "format.spss")= chr "F1.0"
#> ..- attr(*, "display_width")= int 12
#> $ Face_20 : num 4 6 6 3 6 4 3 3 4 3 ...
#> ..- attr(*, "label")= chr "Attractiveness rating for face #20"
#> ..- attr(*, "format.spss")= chr "F1.0"
#> ..- attr(*, "display_width")= int 12
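As a quick check on this structure, the psych package loaded in the setup can summarize the 20 face-rating columns. This is a side check rather than part of the original analysis, and it assumes the Face_1 through Face_20 column names shown above:
# descriptive statistics (means, SDs, range) for the 20 face columns on the 1-7 scale
describe(DF[ , paste0("Face_", 1:20)])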
Column Metadata:
Variable Name | Variable Description | Type (numeric, character, logical, etc.)
---|---|---
Participant_Number | Unique number assigned to each participant | Numeric
Face_1 | Attractiveness rating for face #1 | Numeric
Face_2 | Attractiveness rating for face #2 | Numeric
Face_3 | Attractiveness rating for face #3 | Numeric
Face_4 | Attractiveness rating for face #4 | Numeric
Face_5 | Attractiveness rating for face #5 | Numeric
Face_6 | Attractiveness rating for face #6 | Numeric
Face_7 | Attractiveness rating for face #7 | Numeric
Face_8 | Attractiveness rating for face #8 | Numeric
Face_9 | Attractiveness rating for face #9 | Numeric
Face_10 | Attractiveness rating for face #10 | Numeric
Face_11 | Attractiveness rating for face #11 | Numeric
Face_12 | Attractiveness rating for face #12 | Numeric
Face_13 | Attractiveness rating for face #13 | Numeric
Face_14 | Attractiveness rating for face #14 | Numeric
Face_15 | Attractiveness rating for face #15 | Numeric
Face_16 | Attractiveness rating for face #16 | Numeric
Face_17 | Attractiveness rating for face #17 | Numeric
Face_18 | Attractiveness rating for face #18 | Numeric
Face_19 | Attractiveness rating for face #19 | Numeric
Face_20 | Attractiveness rating for face #20 | Numeric
AIPE Analysis:
The data should be in long format with each rating on one row of data.
# Reformat the data
DF_long <- pivot_longer(DF, cols = -c(Participant_Number)) %>%
  dplyr::rename(item = name, score = value)

flextable(head(DF_long)) %>% autofit()
Participant_Number | item | score
---|---|---
1 | Face_1 | 1
1 | Face_2 | 1
1 | Face_3 | 3
1 | Face_4 | 3
1 | Face_5 | 5
1 | Face_6 | 5
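Before running the simulation, it can help to confirm the reshape: with 200 participants each rating 20 faces, the long data should contain 4,000 rows and 200 ratings per item. A small sketch of that check, not part of the original analysis:
# expect 200 participants x 20 faces = 4000 rows
nrow(DF_long)
# expect 200 ratings for each face
dplyr::count(DF_long, item)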
Stopping Rule
# Run the simulation function
var1 <- item_power(data = DF_long, # name of data frame
                   dv_col = "score", # name of DV column as a character
                   item_col = "item", # name of the item column as a character
                   nsim = 10,
                   sample_start = 20,
                   sample_stop = 300,
                   sample_increase = 5,
                   decile = .4)
#> `summarise()` has grouped output by 'sample_size'. You can override using the
#> `.groups` argument.
What is the usual standard error for these data that could be used for our stopping rule, based on the 40% decile?
# individual SEs
var1$SE
#> Face_1 Face_10 Face_11 Face_12 Face_13 Face_14 Face_15
#> 0.09117808 0.09064190 0.10007472 0.09739767 0.08562437 0.08767230 0.09331351
#> Face_16 Face_17 Face_18 Face_19 Face_2 Face_20 Face_3
#> 0.10262632 0.09082536 0.09530433 0.09123386 0.08818665 0.09799754 0.08644573
#> Face_4 Face_5 Face_6 Face_7 Face_8 Face_9
#> 0.08915009 0.09127172 0.09078109 0.09968796 0.08977638 0.09481468
var1$cutoff
#> 40%
#> 0.09080765
Using our 40% decile as a guide, we find that 0.091 is our target standard error for an accurately measured item.
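This cutoff is simply the 40% quantile of the per-item standard errors, so it can be reproduced directly from the long data, mirroring the calculation inside item_power() (the item_se name below is just a local label for this sketch):
# per-item standard errors, then the 40% decile as the cutoff
item_se <- tapply(DF_long$score, DF_long$item, function(x) { sd(x)/sqrt(length(x)) })
quantile(item_se, probs = .4)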
Minimum Sample Size
To estimate the minimum sample size, we determine how many participants would be needed for 80%, 85%, 90%, and 95% of the item SEs to fall below our critical value of 0.091.
cutoff <- calculate_cutoff(population = DF_long,
                           grouping_items = "item",
                           score = "score",
                           minimum = 1,
                           maximum = 7)

# this should match the cutoff calculated by hand with item_power() above
cutoff$cutoff
#> 40%
#> 0.09080765
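Before applying the correction described next, it can be useful to glance at the uncorrected simulation summary returned by item_power(); percent_below is stored as a proportion at this stage, so the sketch below filters at .80:
# simulated sample sizes where at least 80% of item SEs were at or below the cutoff
var1$final_sample %>%
  dplyr::filter(percent_below >= .80) %>%
  dplyr::arrange(sample_size) %>%
  head()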
Please note that you will always need to simulate sample sizes larger than the pilot sample size to obtain these starting estimates; we correct them below. As shown in our manuscript, we need to correct for the overestimation of sample size that results from simulating with the original pilot data. Given that the pilot sample is large (n = 200), this correction is especially useful. The correction is built into our function.
final_table <- calculate_correction(
  proportion_summary = var1$final_sample,
  pilot_sample_size = nrow(DF),
  proportion_variability = cutoff$prop_var
)

flextable(final_table) %>%
  autofit()
percent_below | sample_size | corrected_sample_size
---|---|---
80.0 | 230 | 54.60714
90.5 | 245 | 62.29201
90.5 | 245 | 62.29201
97.0 | 255 | 68.01402
Our minimum suggested sample size does not occur at exactly 80% of items; instead, we take the first corrected estimate at or above 80%, which gives a minimum of about n = 55.
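To pull that value programmatically rather than reading it off the table, something like the following sketch works; note that percent_below is on the 0-100 scale after the correction:
# first corrected sample size at or above 80% of items, rounded up
final_table %>%
  dplyr::filter(percent_below >= 80) %>%
  dplyr::slice_min(sample_size) %>%
  dplyr::pull(corrected_sample_size) %>%
  ceiling()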
Maximum Sample Size
While there are many considerations for a maximum sample size (time, effort, resources), the simulation suggests that approximately 68 participants would ensure that nearly all items meet the cutoff criterion.
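A matching sketch for the maximum takes the largest corrected estimate in the table; whether to round or round up is a judgment call:
# largest corrected sample size across the simulated proportions
round(max(final_table$corrected_sample_size))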
Final Sample Size
In any estimate of sample size, you should also consider the potential for missing or otherwise unusable data due to the exclusion criteria in your study (e.g., failed attention or catch questions, speeding). In this study, we expect all participants to see and rate all items; therefore, the final target could be the minimum sample size, the point at which all items reach our SE criterion, or the maximum sample size.
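If you do anticipate exclusions, one simple way to pad the target is to divide by the expected retention rate. The 5% exclusion rate below is a hypothetical value for illustration, not something estimated from these data:
# hypothetical: pad the minimum sample size for an assumed 5% exclusion rate
minimum_n <- 55
ceiling(minimum_n / (1 - .05))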