
# Mediation Toolbox

The Multilevel Mediation and Moderation (M3) Toolbox is a joint project of Tor D. Wager, currently Assoc. Professor of Psychology at the University of Colorado, and Martin A. Lindquist, currently Assoc. Professor of Biostatistics at Johns Hopkins.

Dr. Wager heads the Cognitive and Affective Neuroscience Lab at the University of Colorado, Boulder.

A) Installation and requirements: What do I need to run the toolbox?

B) Using the M3 toolbox to run analyses: A 3-variable analysis

- A single-level example: One observation per subject
- A multi-level example: t observations on each of N subjects

C) Running a whole-brain analysis

- Single-level
- Multi-level

D) Thresholding and visualizing results

E) Index of functions

## A) Installation and requirements

You will need to download two SCNlab packages, each with several subdirectories, and place them on your Matlab path. You will also need SPM installed and on your Matlab path. The most extensive testing of the current M3 toolbox version has been done with SPM5. Most or all of it should work with SPM99 or SPM2 as well.

The first SCNlab toolbox is the mediation_toolbox package. The second is the SCN_Core_Support package. Both are available on Tor's lab website.

Optional: If you want to take full advantage of the results visualization of the SCNlab tools, you will also need the 3DheadUtility folder (not on the web currently; ask the authors if interested).

To install these, place them in a folder on your hard drive, and type pathtool at the Matlab command prompt. Select each of the folders named above, and select “Add with Subfolders”.

##### Use of SPM and other toolboxes

The robust regression toolbox is not an SPM toolbox per se. It uses SPM image manipulation (data I/O) functions, but does not rely on any SPM functions for statistics. It does, however, use the Matlab Statistics Toolbox, so you will need that as well.

There is an SPM5 GUI, which you can use to set up analyses. However, at this point, the GUI is “very” beta…actually, we all just run things from the command line and I've never used the GUI. I'd like to make the GUI work well, but it might take a bit of time. In the meantime, the command line functions are very easy to run.

## B) Using the M3 toolbox to run analyses: A 3-variable analysis

NOTE: This is not intended as a comprehensive guide to mediation analysis, but as a way to provide at least a minimal description of the M3 toolbox functions. We hope that this will develop and expand over time into more comprehensive documentation, and we hope that you will contribute to this effort!

### Performing a single mediation test on data from any source

mediation.m is a function that will run single- or multi-level mediation analyses on any kind of data. You first need to define variables in Matlab that contain the data. Type >> help mediation at the command line for a list of options.

You need to enter 3 variables:

- X, the initial variable
- Y, the outcome variable
- M, a potential mediator

You can also specify:

- Covariates, which are controlled for in all regressions in the mediation
- Additional mediators, which are controlled for only in M→Y regressions
- Moderators (but not yet! COMING SOON, we hope.)
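For orientation, the standard three-variable mediation model can be written as three regressions. This is the textbook formulation, given here as a reading aid rather than as a description of the toolbox internals; the intercepts $i_k$ and error terms $e_k$ are our notation, while a, b, c, c' and ab match the labels that appear in the toolbox output.

$$
\begin{aligned}
M &= i_1 + aX + e_1 \\
Y &= i_2 + c'X + bM + e_2 \\
Y &= i_3 + cX + e_3
\end{aligned}
$$

In a single-level model fit by ordinary least squares, the total effect decomposes as $c = c' + ab$, so the product $ab$ is the part of the X→Y relationship that passes through M.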

Here are some examples using brain imaging data, though the tools are generic and can be applied to any kind of data.

#### Single-level example

For a single-level analysis, you might begin with data from a contrast of interest, so you have one image per subject. You might then extract average data for each subject from a brain region of interest (ROI). This becomes the X variable. If you have 20 subjects, you will have X = a [20 x 1] vector of average ROI scores.

You might then specify a performance variable as the Y variable. It would be one number per subject representing the outcome of interest. So Y = a [20 x 1] vector of average performance scores.

M might be data from another ROI that might mediate the relationship between X and Y. M = a [20 x 1] vector of average contrast values from ROI 2.

In the example below, we'll run mediation.m on simulated data with 20 “subjects,” using a bootstrap test.

    X = rand(20, 1);
    M = X + rand(20, 1);
    Y = M + rand(20, 1);
    [paths, stats] = mediation(X, Y, M, 'plots', 'verbose', 'boot', 'bootsamples', 10000)

With plotting options on, the function also draws figures on screen (the figure is not reproduced here).

The output I get in the command window shows me the stats for each relationship:

    Mediation analysis

    Observations: 20, Replications: 1
    Predictor (X): X, Outcome (Y): Y: Mediator (M): M

    Covariates: No

    Single-level analysis.
    Options:
        Plots: Yes
        Bootstrap: Yes
        Robust: No
        Bootstrap or sign perm samples: 10000

    Bootstrapping: Min p-value is 0.000100. Adding 0 samples
    Done in 2 (s)
    Done.
    Mediation scatterplots: Replications: 1
    Covariates controlled for in all regressions: 0 predictors
    Additional mediators controlled for in outcome predictions: 1 predictors

    Single-level model
              a        b        c'       c        ab
    Coeff     1.18     1.03     -0.01    1.19     1.20
    STE       0.18     0.24     0.35     0.24     0.29
    t (~N)    6.65     4.29     -0.04    4.88     4.11
    Z         3.58     3.67     -0.11    3.52     3.71
    p         0.0003   0.0002   0.9132   0.0004   0.0002

    Total time: 3 s

The output is tab-delimited, so you can paste it into Excel or a similar program.
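If you want to see where numbers like these come from, here is a minimal sketch that computes the a, b, c', c and ab coefficients with plain least squares and then bootstraps ab. It uses only base Matlab and is not the toolbox's code. In particular, it uses a simple percentile bootstrap rather than the routines mediation.m uses, and names such as coefM and ab_boot are made up for this illustration.

```matlab
% Sketch only: base-Matlab OLS paths plus a simple percentile bootstrap of a*b.
% This is NOT the toolbox's code; mediation.m uses its own bootstrap routines.
n = 20;
X = rand(n, 1);
M = X + rand(n, 1);
Y = M + rand(n, 1);
one = ones(n, 1);                      % intercept column

coefM = [one X] \ M;                   % M = i1 + a*X + e1
a = coefM(2);
coefY = [one X M] \ Y;                 % Y = i2 + c'*X + b*M + e2
cprime = coefY(2);
b = coefY(3);
coefT = [one X] \ Y;                   % Y = i3 + c*X + e3
c = coefT(2);
ab = a * b;                            % mediated (indirect) effect

% Percentile bootstrap: resample subjects with replacement, re-estimate a*b
nboot = 2000;
ab_boot = zeros(nboot, 1);
for i = 1:nboot
    idx = randi(n, n, 1);              % indices drawn with replacement
    Xb = X(idx); Mb = M(idx); Yb = Y(idx);
    cm = [one Xb] \ Mb;
    cy = [one Xb Mb] \ Yb;
    ab_boot(i) = cm(2) * cy(3);
end
ab_sorted = sort(ab_boot);
ci = ab_sorted([round(0.025 * nboot), round(0.975 * nboot)]);  % 95% interval

fprintf('a = %.2f, b = %.2f, c'' = %.2f, c = %.2f, ab = %.2f\n', a, b, cprime, c, ab);
fprintf('95%% percentile bootstrap CI for ab: [%.2f, %.2f]\n', ci(1), ci(2));
```

Because the data are random, your numbers will differ from run to run, but c should always equal c' + ab up to rounding error.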

#### Multi-level example

For a multi-level analysis, let's stick with the same example. We have X = ROI1, Y = performance, M = ROI2. But now, consider the case where I want to analyze relationships across trials, within subjects. Now, X, Y, and M will be time-series data. Let's say you have 200 observations across time for each of 20 participants. You'll need to define X, Y, and M vectors for EACH SUBJECT that are each 200 x 1 elements long.

NOTE: if Y is performance data from a series of trials or events, you'll need trial-by-trial data for X and M as well. Another example where you'd be most likely to have this kind of data readily available from fMRI is when you're trying to predict scan-to-scan timecourses of physiological responses (e.g., heart rate) with brain ROIs.

For a multi-level analysis, you enter data in cells. X{1} is X data for subject 1, M{1} is M data for subject 1, Y{1} is Y data for subject 1, X{2} is X data for subject 2, and so on. You should have X = a 1 x 20 cell array, where each cell is a [200 x 1] vector. Same for M and Y.

You can enter these variables into mediation.m in the same way as for the single-level model.
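As a rough sketch of that data layout, the snippet below builds simulated X, M, and Y cell arrays for 20 subjects with 200 observations each. The simulated relationships are arbitrary and only meant to give the cells the right shape. The call to mediation is left commented out because it needs the toolbox on your path; it simply mirrors the single-level example.

```matlab
% Sketch: lay out simulated multi-level data as 1 x N cell arrays,
% with one [t x 1] vector per subject.
N = 20;      % number of subjects
t = 200;     % observations per subject
X = cell(1, N);
M = cell(1, N);
Y = cell(1, N);
for s = 1:N
    X{s} = rand(t, 1);
    M{s} = X{s} + rand(t, 1);
    Y{s} = M{s} + rand(t, 1);
end

% With the toolbox on the path, the call mirrors the single-level example:
% [paths, stats] = mediation(X, Y, M, 'plots', 'verbose', 'boot', 'bootsamples', 10000);
```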

## C) Running a whole-brain analysis

Sample data and a PowerPoint walkthrough are available in the “Mediation_example_data” subfolder in the Mediation Toolbox. The walkthrough is a step-by-step guide to running an analysis and getting the results.

The idea of a whole-brain mediation analysis (or a search through mask or ROI voxels) is to search for regions that show statistical mediation of the relationship between a known initial variable (X) and a known outcome variable (Y). You can search for voxels that satisfy the multiple constraints that make the strongest case for mediation, or search for maps of each effect in the 3-variable X-M-Y mediation equation.

In a search, you specify data for 2 of the 3 variables (X, M, Y) and a set of images as the other variable. So, for a single-level analysis, X could be a 20 x 1 vector of data from the prefrontal cortex, Y could be a 20 x 1 vector of outcome scores, and M could be a string matrix of 20 image names (one subject's image per row, one image per subject).

For a multi-level analysis, let's say you have t = 200 observations on each of 20 subjects. X is a 1 x 20 cell array, and each cell contains a 200 x 1 vector of observations on one subject. Ditto for Y. M is a 1 x 20 cell array, where each cell contains a string matrix of 200 image names for one subject (4-D files are OK as well; the key is to have 200 volumes for each subject).
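Here is a small sketch of how those image-name inputs could be assembled in base Matlab. The file names are hypothetical placeholders, not files that ship with the toolbox, and you would substitute the paths to your own images. See the help for each search function for how to pass these in.

```matlab
% Sketch: build image-name inputs for a whole-brain search.
% File names are hypothetical placeholders for your own images.
nsub = 20;
names = cell(nsub, 1);
for s = 1:nsub
    names{s} = sprintf('sub%02d_contrast.img', s);
end
M_single = char(names);      % single-level: 20-row string matrix, one name per row

% Multi-level: a 1 x 20 cell array, each cell a string matrix of t image names
t = 200;
M_multi = cell(1, nsub);
for s = 1:nsub
    vols = cell(t, 1);
    for k = 1:t
        vols{k} = sprintf('sub%02d_vol%04d.img', s, k);
    end
    M_multi{s} = char(vols); % char pads rows to equal length if needed
end
```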

The main functions are below. All of them write out images in the current directory and write a mediation_SETUP.mat file. There are many options for how to use them, which are documented in their individual help files.

**mediation_brain** – Search using a set of images for either X, M, or Y.

**mediation_brain_multilevel** - Search using a set of images for M, using a two-level model of observations nested within subjects

**igls** - Two-level hierarchical regression with variance component estimation using both restricted and normal Iterative Generalized Least Squares

NOTE: right now, you have to enter X, M, and Y data or images. There is not yet a facility for handling interposed design matrices. If you want to work with “betas”, you're best off running a first-level analysis on each subject and then doing a single-level mediation on betas or contrast images, one per subject.

You might want to use the multi-level search option if you:

- a) have a series of trial-by-trial response estimates, or
- b) are using timeseries data and can specify the timeseries observations for X, M, and Y.

An example of where you DON'T satisfy (b) is if you have 100 trials and 500 images per subject, and you're interested in a performance measure assessed on each trial (t = 100), but you have 500 fMRI time points. Getting a trial-by-trial estimate would mean estimating responses for each of the 100 trials. An example of where you DO satisfy (b) is if you're looking for brain-physiology relationships where both are sampled at the same time resolution.

## D) Thresholding and visualizing results

Once you've estimated a model with one of the functions above, the next step is to visualize the results. The main gateway function for this is:

**mediation_brain_results**

Using this, you can display results from mediation_brain and other SCNlab toolboxes on brain “orthoviews”, extract data, and save clusters (blobs) with useful info for visualization and ROI analysis.

mediation_brain_results will display orthviews with one or more effects in separate viewer panes. There are many options. For example, if you enter 'all', it will show you the 'a' (X→M), 'b' (M→Y), 'c' (X→Y), 'c-prime' (direct) and 'ab' (mediation) effects from a mediation search analysis.

It also returns clusters, a structure with a special format that is widely used in our lab. Clusters are a series of structures in a vector, each element of which contains information about a contiguous blob. By default (if output variables are requested and it can find valid images), mediation_brain_results returns clusters for both positive and negative effects for the *last* image in the series of effects shown with image data extracted for each region.

Once you have a clusters structure, you can do lots of things with it:

- Extract data from any images you want and do ROI analysis: see extracting_timeseries_data
- Plot blobs on orthviews, with many options for control
- Make a montage with montage_clusters
- Make a table of coordinates, etc. with cluster_table or mediation_brain_print_tables
- Make a surface image plot with cluster_surf
- Plot on flexible, custom surfaces by combining use of addbrain and cluster_surf

Other pages on our WIKI cover results visualization with mediation_brain_results and other tools.

### Small volume correction

The mediation toolbox contains a function for small-volume correction based on a permutation test.

The function corrects for multiple comparisons within a region of continuous voxels (e.g., an ROI or mask).

The key function to run is mediation_permutation_svc_fwe.m

However, this works only for a single-level mediation model. (And, right now, it probably only works if you're searching for mediators).

For a multi-level mediation, nothing exists yet, so you would have to create your own function.

For nonparametric multiple comparisons correction in robust_regression, use: robust_reg_nonparam.m and related functions, and see: Robust_nonparam_flowchart.ppt, which is in the robust regression directory in the “nonparam” subfolder.

#### Example code: mediation_permutation_svc_fwe

For notes on what mediation_permutation_svc_fwe does, type (in Matlab)

    >> help mediation_permutation_svc_fwe

This code runs an analysis in an amygdala region of interest:

    % go to the directory that holds the results of the mediation search
    cd('/Users/tor/Documents/Tor_Documents/CurrentExperiments/Lab_Emotion/Pre-appraisal/Spontaneous_vs_voluntary_reg/mediation_RVLPFC_final')
    load mediation_SETUP

    % define clusters from the amygdala mask and extract data from the images in SETUP.M
    cl = mask2clusters('mask_spm2_amy_new.img');
    cl = tor_extract_rois(SETUP.M, cl);
    dat = cl(2).all_data;
    cl(2).mm_center

    % log command-window output to a text file while the test runs
    diary FWEsim_R_amy_ROI_boot.txt
    MC_FWE = mediation_permutation_svc_fwe(SETUP.X, SETUP.Y, dat, 1000, 'boot');
    save FWEsim_R_amy_ROI_boot MC_FWE
    diary off

## E) Index of functions

#### Top-level functions many users will want to run from the command line

mediation – Examines 3 timeseries to determine if one of them acts as a mediator between the other two. Works for both single-level and multi-level (multiple subjects/observations) data

mediation_brain – Given two variables of the mediation formula, searches over functional MRI data for candidates for the third variable

igls - Two-level hierarchical regression with variance component estimation using both restricted and normal Iterative Generalized Least Squares

igls_mediation – Use (R)IGLS to compute mediation paths NOTE: NEEDS TO BE UPDATED BEFORE READY FOR USE

mediation_brain_results – Display results from mediation_brain and other SCNlab toolboxes on brain “orthoviews”, extract data and save clusters (blobs) with useful info for visualization and ROI analysis

#### Plotting and output support functions

These are called within the main top-level functions when plotting options are turned on. Many of them can be called from the command line with output from top-level functions as input, so you can display plots for analyses you've already run.

igls_plot_slopes – Display (R)IGLS results

mediation_brain_print_tables – Print out info on clusters found by a mediation search across the brain

mediation_brain_results_detail – Display detailed results for a particular cluster found by mediation_brain

mediation_brain_surface_figs – Display visually clusters found by mediation_brain on a 3-D rendering of the brain

mediation_path_diagram – Displays a traditional mediation plot of the path results

mediation_scatterplots – Display scatterplots of mediation output

mediation_sim_output_figs – Display simulation results

#### Computation support functions: High-level

These are functions that are called within top-level functions that provide computational support. Like all toolbox functions, they can also be used as stand-alone tools from the command line. These, however, are ones that I think are likely to be useful to run as stand-alone tools.

matrix_direct_effects – Given a set of pairs of correlated data, examines all possible mediators amongst the set.

mediation_X_search, mediation_M_search, mediation_Y_search, mediation_search – Holds two variables (X, Y, or M) of the mediation formula constant while comparing a set of possibilities for the third to see how well it fits

mediation_region_stepwise – Examines a set of ROIs to see if they are mediators

#### Computation support functions: Low-level

These are functions that are called within top-level functions that provide computational support. It's unlikely that you would want to run these on your own unless you're programming your own tools.

bootbca_ci – Computes confidence intervals from the bias-corrected and accelerated (BCa) percentile bootstrap

bootbca_pval – Computes p-values from the bias-corrected and accelerated (BCa) percentile bootstrap

mediation_X_search_mask – Generates a mask based on mediation path p-vals less than .05.

mediation_latent – compute mediation on data after deconvolving the hemodynamic response function, to compare purported neural activity rather than BOLD activity

mediation_latent_sse – computes OLS mediation on HRF-deconvolved data and returns sums of squared errors (SSE)

mediation_path_coefficients - Returns single-level mediation path coefficients and standard errors using subfunctions optimized for efficiency

mediation_results_p2z – Backend function for converting images from p-values to z-scores

mediation_shift – Compute mediation paths on X, Y, and M while shifting them back and forth in time for better fits

mediation_shift_sse - Compute mediation paths using OLS only on X, Y, and M while shifting them back and forth in time for better fits and returning sums of squared errors (SSE)

optimal_delay_mediation_search – Compute optimal shift for mediation on a given set of timeseries

#### Simulation and performance evaluation scripts

These are sort of haphazard at the moment. We haven't yet tried to provide a comprehensive set of tools for testing these functions within this package itself.

mediation_dcm_sim1 – Runs a DCM simulation designed to compare DCM as much as possible with mediation

mediation_power – Tor's simulation to determine mediation power; not well documented or developed

mediation_problem1 – script to illustrate possible conceptual flaws in mediation

mediation_sim1 - Simulation for power and false positive rates for mediation analysis

mediation_sim2_igls - Simulation for power and false positive rates for mediation analysis using (R)IGLS

#### Other functions of ambiguous status in the current toolbox

mediation_results – Threshold mediation_search results and write out thresholded images

#### Three-path mediation analysis

Reference: Taylor, MacKinnon, & Tein (2007). Tests of the Three-Path Mediated Effect. Organizational Research Methods, 11(2), 241-269

##### Functions:

- mediation_threepaths.m (main function)
- mediation_path_coefficients_threepaths.m

Three-path mediation analysis includes two mediators, M1 and M2, which mediate the effect of X on Y in sequence: X → M1 → M2 → Y.

As of now, this function works only with the multilevel and bootstrap options. Each subject's data must be in a cell for each of the variables X, Y, M1, and M2. Usage example:

    for i = 1:30
        X{i} = rand(50,1);
        M1{i} = rand(50,1);
        M2{i} = rand(50, 1);
        Y{i} = rand(50,1);
        cov{i} = rand(50,2);
    end

    [paths, stats] = mediation_threepaths(X, Y, M1, M2, 'covs', cov, 'verbose', 'plots', 'bootsamples', 10000);
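To give a feel for what the three-path effect is, here is a minimal single-level sketch in base Matlab. It is not what mediation_threepaths does internally (that function is multilevel and bootstrap-based). It just estimates the three links of the chain by ordinary least squares and multiplies them, following the product-of-three-coefficients idea in the reference above. The names b1, b2, b3 and the simulated effect sizes are chosen for this illustration only.

```matlab
% Sketch only: single-level OLS version of the chain X -> M1 -> M2 -> Y.
% Not the toolbox implementation; b1, b2, b3 are illustrative names.
n  = 5000;
X  = randn(n, 1);
M1 = 0.5 * X  + randn(n, 1);        % true X  -> M1 effect: 0.5
M2 = 0.6 * M1 + randn(n, 1);        % true M1 -> M2 effect: 0.6
Y  = 0.7 * M2 + randn(n, 1);        % true M2 -> Y  effect: 0.7
one = ones(n, 1);                   % intercept column

r1 = [one X] \ M1;
b1 = r1(2);                         % X -> M1
r2 = [one X M1] \ M2;
b2 = r2(3);                         % M1 -> M2, controlling for X
r3 = [one X M1 M2] \ Y;
b3 = r3(4);                         % M2 -> Y, controlling for X and M1
threepath = b1 * b2 * b3;           % effect passing through both mediators

fprintf('b1 = %.2f, b2 = %.2f, b3 = %.2f, b1*b2*b3 = %.3f\n', b1, b2, b3, threepath);
```

With this many simulated observations, the product should land near 0.5 × 0.6 × 0.7 = 0.21.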