**Note that due to version-related issues the code in this post is displayed but is not excecuted/run.**

#### Background

Research in classrooms and schools can be complex because of all of the factors that matter. A question that often comes up when we say that we observed some pattern in data is, *but did you control for X*?

In the context of working on an approach to find out how impactful an omitted variable would need to be to invalidate an inference, we had to modify a function that worked for a single sensitivity analysis to work for many and to be easier to use.

#### The initial version of the function

In the context of programming (and mathematics!), many functions take inputs and then transform them into output.

Imagine we have a function that takes two values used to calculate a *t*-test, the *t* statistic (the coefficient divided by its standard error) and the degrees of freedom of the *t* distribution. it then returns a bunch of output about how sensitive an inference based on the *t*-test is to bias or to an omitted confounding variable.

Here is a function that outputs values about the sensitivity of a *t*-test as a row in a row of a data frame (actually an R `data.frame`

modified to be a bit easier to work with, an `R`

tibble).

It’s a bit of a whopper:

```
core_sensitivity_mkonfound <- function(t, df, alpha = .05, tails = 2) {
critical_t <- stats::qt(1 - (alpha / tails), df)
critical_r <- critical_t / sqrt((critical_t ^ 2) + df)
obs_r <- abs(t / sqrt(df + (t ^ 2)))
# for replacement of cases framework
if (abs(obs_r) > abs(critical_r)) {
action <- "to_invalidate"
inference <- "reject_null"
pct_bias <- 100 * (1 - (critical_r / obs_r)) }
else if (abs(obs_r) < abs(critical_r)) {
action <- "to_sustain"
inference <- "fail_to_reject_null"
pct_bias <- 100 * (1 - (obs_r / critical_r)) }
else if (obs_r == critical_r) {
action <- NA
inference <- NA
pct_bias <- NA
}
if ((abs(obs_r) > abs(critical_r)) & ((obs_r * critical_r) > 0)) {
mp <- -1
} else {
mp <- 1
}
# for correlation based framework
itcv <- (obs_r - critical_r) / (1 + mp * abs(critical_r))
r_con <- round(sqrt(abs(itcv)), 3)
out <- dplyr::data_frame(t, df, action, inference, pct_bias, itcv, r_con)
names(out) <- c("t", "df", "action", "inference", "pct_bias_to_change_inference", "itcv", "r_con")
out$pct_bias_to_change_inference <- round(out$pct_bias_to_change_inference, 3)
out$itcv <- round(out$itcv, 3)
out$action <- as.character(out$action)
out$inference <- as.character(out$inference)
return(out)
}
```

Let’s test it out. Imagine we have a *t* statistic of `3`

for a hypothesis test associated with `100`

degrees of freedom.

`core_sensitivity_mkonfound(3, 100)`

Works.

It looks like in order to invalidate the inference, around `32`

% of the effect would be need to be due to bias; or, an committed variable would need to be correlated with both the predictor of interest and the dependent variable at `.339`

in order to invalidate the inference. You can read more about sensitivity analysis and an in-development R package on the approach to sensitivity analysis with Ran Xu and Ken Frank here.

#### The second version of the function

How could we write a function to provide output not only for one *t* and its associated *df*, but rather many values?

We can write a simple function to iterate through multiple values and to bind them together. The key is the `map()`

function (from the `tidyverse`

package `purrr`

; if you are familiar with `R`

, it is similar to many of the `apply()`

functions). Specifically, because we:

- Have two variables that we are iterating through
- Want the output in
`data.frame`

form

We use `map2_dfr()`

. Check out this helpful chapter of R for Data Science for more on iteration using approaches such as for and while loops as well as the useful `apply()`

/ `map()`

families of functions.

Here is what a function could look like:

```
library(purrr)
mkonfound <- function(t, df, alpha = .05, tails = 2) {
map2_dfr(.x = t, .y = df, .f = core_sensitivity_mkonfound)
}
```

Simple! But does it work? :)

Instead of passing a single *t* and *df*, as we did above with the `core_sensitivity_monfound()`

function, we can pass vectors of *t* and *df* values:

```
mkonfound(t = c(3, 2, 2.5),
d = c(100, 200, 150))
```

We could also do something like binding *t* and *df* together into a small `data.frame`

:

```
d <- data.frame(t = c(3, 2, 2.5),
df = c(100, 200, 150))
d
```

And then `mkonfound()`

could work like this:

`mkonfound(d$t, d$df)`

#### The third version of the function

Still seems to work fine. Those of you familiar with the tidyverse may sense another possible improvement. Namely, the function could be written to both input and output a `data.frame`

, and be a bit more intuitive to use via non-standard evaluation.

The goal is to add an additional argument for the `data.frame`

(`d`

), and then use non-standard evaluation to capture *and then later evaluate in the context of the data.frame* the names of the *t* and *df* columns

```
library(rlang)
library(dplyr)
mkonfound <- function(d, t, df, alpha = .05, tails = 2) {
t_enquo <- enquo(t)
df_enquo <- enquo(df)
t = pull(select(d, !!t_enquo))
df = pull(select(d, !!df_enquo))
map2_dfr(.x = t, .y = df, .f = core_sensitivity_mkonfound)
}
```

But does it work? Now, the first argument is the name of the `data.frame`

, the second is the unquoted name of the column with the *t* statistics, and the third is the same as the second, but for the *df* associated with the *t*’s.

`mkonfound(d, t, df)`

If we have an entire spreadsheet, read in R as a `data.frame`

using the `read.csv()`

(or, `read_csv()`

from the very useful `readr`

package) function, then we can easily compute output for all of the statistics in the spreadsheet. Here is a spreadsheet from a website (from Ken’s):

```
spreadsheet_of_vals <- read.csv("https://msu.edu/~kenfrank/example%20dataset%20for%20mkonfound.csv")
head(spreadsheet_of_vals)
```

We would use it the same way as above but with `d`

replaced with what we named the `data.frame`

we read from the website, `spreadsheet_of_vals`

:

`mkonfound(spreadsheet_of_vals, t, df)`

Since the output is in a `data.frame`

, we can, for example, easily plot output:

`results_df <- mkonfound(spreadsheet_of_vals, t, df)`

```
library(ggplot2)
results_df$action <- dplyr::case_when(
results_df$action == "to_invalidate" ~ "To Invalidate",
results_df$action == "to_sustain" ~ "To Sustain"
)
ggplot(results_df, aes(x = pct_bias_to_change_inference, fill = action)) +
geom_histogram() +
scale_fill_manual("", values = c("#1F78B4", "#A6CEE3")) +
theme_bw() +
ggtitle("Histogram of Percent Bias") +
facet_grid(~ action) +
theme(legend.position = "none") +
ylab("Count") +
xlab("Percent Bias")
```

Like many functions in R, this could be written many different ways, and this post shows just one approach to writing a function.

In some cases, non-standard evaluation makes the function a bit harder to use - particularly in cases in which we are interested in the output from only a single study.

In that case, we would want to go back to the function we initially wrote (`core_sensitivity_mkonfound()`

) or would have to write something a bit like:

```
single_study <- data.frame(t = 3, df = 100)
mkonfound(single_study, t, df)
```

So, this is one approach that is useful in one use - for the in-development package for sensitivity analysis with a number of functions, with a version of this `mkonfound()`

function for meta-analyses that make use of the approach.

Oh, and if you are interested in sensitivity analysis, please check out the `konfound`

package this is for here.