Signal detection theory (SDT) is one of the most popular modeling frameworks for behavioral experiments. This post uses a data set from a previously published paper to demonstrate how to apply SDT to behavioral data in R.
There are (at least) two common approaches to applying SDT (and subsequent statistics) in R:
1. Calculate d’ in each condition and then perform a repeated-measures ANOVA;
2. Fit a generalized linear mixed model with a probit link.
In this post, we focus on the first (and easier) approach. If you are interested in the second approach, you may refer to another paper, whose analysis code is available online.
Getting data
Data used here are a subset of Experiment 1b in Jin et al. (2022). You can get the data by installing the custom package with `remotes::install_github("haiyangjin/psychr")` and loading `psychr::jin2022noncon`. (Alternatively, you may download the data directly from GitHub.)
# to install psychr package used here:
# remotes::install_github("haiyangjin/psychr")
df_clean <- psychr::jin2022noncon
head(df_clean, 10)
There are three independent variables:
- `Congruency`: `con` vs. `inc`;
- `Alignment`: `ali` vs. `mis`;
- `SD`: `same` vs. `diff` (we will treat `same` as signal and `diff` as noise).
The dependent variables are:
- `isSame`: whether the response was “same” (`1`) or “different” (`0`);
- `Correct`: whether the response was “correct” (`1`) or “incorrect” (`0`);
- `RT`: response time (this DV will not be used in this post).
Apply signal detection theory in R
In this application, we treat `same` trials as “signal” and `diff` trials as “noise”. The general steps for calculating the sensitivity d’ are:

1. Calculate the hit rate (`hit`) and the false alarm rate (`fa`);
2. Apply corrections if needed (e.g., when a rate is exactly 0 or 1);
3. Convert the rates (proportions) to standard (Z) values (`z_hit` and `z_fa`);
4. Get the sensitivity d’ as `z_hit - z_fa`.
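The steps above can be sketched in base R. The counts below are made up purely for illustration:

```r
# Made-up counts: 20 signal trials with 20 hits (a perfect hit rate,
# which needs correcting) and 20 noise trials with 5 false alarms.
n_signal <- 20; n_hit <- 20
n_noise  <- 20; n_fa  <- 5

# Step 1: hit and false alarm rates
hit <- n_hit / n_signal   # 1.00
fa  <- n_fa / n_noise     # 0.25

# Step 2: correct extreme rates (1 -> (2N-1)/(2N); 0 -> 1/(2N))
if (hit == 1) hit <- (2 * n_signal - 1) / (2 * n_signal)
if (fa == 0)  fa  <- 1 / (2 * n_noise)

# Steps 3-4: convert to standard Z values and take the difference
d <- qnorm(hit) - qnorm(fa)
round(d, 2)   # 2.63
```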
Note that we may use either `isSame` or `Correct` to calculate the hit and false alarm rates: on `same` trials `Correct` equals `isSame`, while on `diff` trials `Correct` equals `1 - isSame`. We will try both approaches here and compare them.
# load packages
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.4
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ ggplot2 3.5.1 ✔ tibble 3.2.1
## ✔ lubridate 1.9.3 ✔ tidyr 1.3.1
## ✔ purrr 1.0.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
With `Correct`

It is probably more common to calculate the hit and false alarm rates with `Correct`, i.e., whether the response was correct on each trial. The specific steps of calculating d’ with `Correct` are as follows:
df_sdt_correct <- df_clean %>%
group_by(SubjID, Congruency, Alignment, SD) %>%
summarize(acc = mean(Correct), # hit and correct rejection
count = n(), # number of trials in each condition
.groups = "drop") %>%
mutate(acc = if_else(SD=="diff", 1-acc, acc), # hit and false alarm (Step 1)
acc = if_else(SD=="same"&acc==1, (2*count-1)/(2*count), acc), # correction for hit
acc = if_else(SD=="diff"&acc==0, 1/(2*count), acc), # correction for false alarm
z = qnorm(acc)) %>% # convert to standard Z value
pivot_wider(id_cols = c(SubjID, Congruency, Alignment), # make hit and false alarm into two columns
names_from = SD,
values_from = z) %>%
mutate(d = same - diff, # calculate d from hit and false alarm
d = round(d,2)) %>%
select(-c(same, diff))
# display results as wide format
df_sdt_correct %>%
pivot_wider(names_from = c(Congruency, Alignment),
values_from = d)
With `isSame`

The other approach is to use `isSame`, i.e., whether the response was “same” on each trial. Note that the `1 - acc` step is no longer needed here: on `diff` trials the mean of `isSame` is already the false alarm rate.
df_sdt_issame <- df_clean %>%
group_by(SubjID, Congruency, Alignment, SD) %>%
summarize(acc = mean(isSame), # hit and false alarm
count = n(), # number of trials in each condition
.groups = "drop") %>%
mutate(acc = if_else(SD=="same"&acc==1, (2*count-1)/(2*count), acc), # correction for hit
acc = if_else(SD=="diff"&acc==0, 1/(2*count), acc), # correction for false alarm
z = qnorm(acc)) %>% # convert to standard Z value
pivot_wider(id_cols = c(SubjID, Congruency, Alignment), # make hit and false alarm into two columns
names_from = SD,
values_from = z) %>%
mutate(d = same - diff, # calculate d from hit and false alarm
d = round(d,2)) %>%
select(-c(same, diff))
df_sdt_issame %>%
pivot_wider(names_from = c(Congruency, Alignment),
values_from = d)
`isSame` vs. `Correct`

We may compare the data frames obtained from the two approaches:
all.equal(df_sdt_correct, df_sdt_issame)
## [1] TRUE
The data frames obtained from the two approaches are identical. For those who wonder why the second approach is worth discussing here: it is more consistent with the case where a `probit` link is used in generalized linear mixed-effects modeling to apply SDT. For more, please see this paper and its open code.
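To see the connection, here is a minimal illustration with toy counts (not from the data set above): for a single subject in a single condition, the slope of a probit regression of the “same” response on a signal indicator reproduces `z(hit) - z(fa)`, i.e., d’:

```r
# Toy data for one subject in one condition: 20 signal trials with
# 17 "same" responses, 20 noise trials with 4 "same" responses.
resp <- data.frame(
  isSignal = rep(c(1, 0), each = 20),
  isSame   = c(rep(1, 17), rep(0, 3),    # hits / misses
               rep(1, 4),  rep(0, 16))   # false alarms / correct rejections
)

fit <- glm(isSame ~ isSignal, family = binomial(link = "probit"), data = resp)

# The slope equals z(hit) - z(fa), i.e., d'
unname(coef(fit)["isSignal"])
qnorm(17/20) - qnorm(4/20)    # same value
```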
Use `library(psychr)`

In addition to calculating d’ manually, I also made an R function (`psychr::sdt()`) to calculate d’.
# to install psychr package used here:
# remotes::install_github("haiyangjin/psychr")
library(psychr)
An example with the same data set is:
sdt_clean <- psychr::sdt(df_clean,
SN = "SD", # the column containing the signal/noise information
isSignal = "isSame", # the column indicating "same" responses
SubjID = "SubjID",
group_other = c("Congruency", "Alignment"),
signal = "same") # which level in SN is signal
sdt_clean$df %>%
mutate(d = round(d,2)) %>%
pivot_wider(names_from = c(Congruency, Alignment),
values_from = d)
all.equal(df_sdt_correct, sdt_clean$df)
## [1] "Component \"d\": Mean relative difference: 0.001039788"
all.equal(df_sdt_issame, sdt_clean$df)
## [1] "Component \"d\": Mean relative difference: 0.001039788"
The d’ values from `psychr::sdt()` match those from the first two approaches up to rounding (we rounded d’ to two decimals in the manual calculations, which explains the tiny mean relative difference above). So you may use either custom code or `psychr::sdt()`
to apply SDT in R!
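With the d’ values in hand, the first approach would conclude with a repeated-measures ANOVA. Below is a minimal sketch with base R’s `aov()` on a toy data frame that mirrors the columns of `df_sdt_correct` (the toy d’ values are simulated; `afex::aov_ez()` or `ez::ezANOVA()` are common alternatives):

```r
set.seed(1)
# Toy d' values with the same structure as df_sdt_correct
toy <- expand.grid(SubjID     = factor(1:10),
                   Congruency = c("con", "inc"),
                   Alignment  = c("ali", "mis"))
toy$d <- rnorm(nrow(toy), mean = 1.5, sd = 0.5)

# 2 (Congruency) x 2 (Alignment) within-subject ANOVA on d'
fit_aov <- aov(d ~ Congruency * Alignment +
                 Error(SubjID / (Congruency * Alignment)),
               data = toy)
summary(fit_aov)
```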