---
title: 'Study-Level Analysis: From Single Subjects to Group Studies'
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Study-Level Analysis: From Single Subjects to Group Studies}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
params:
  family: red
  preset: homage
css: albers.css
resource_files:
  - albers.css
  - albers.js
includes:
  in_header: |-
---
```{r setup, include=FALSE}
if (requireNamespace("ggplot2", quietly = TRUE) && requireNamespace("albersdown", quietly = TRUE)) {
  ggplot2::theme_set(
    albersdown::theme_albers(family = params$family, preset = params$preset)
  )
}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  eval = TRUE,
  warning = FALSE,
  message = FALSE
)
library(fmridataset)
```
```{r make-subjects, include=FALSE}
set.seed(42)
# Build one synthetic subject: 200 timepoints (two runs of 100) x n_voxels.
make_subject <- function(id, n_voxels = 500) {
  # Seed from the digits in the subject ID so each subject is reproducible.
  set.seed(as.integer(gsub("\\D", "", id)))
  matrix_dataset(
    datamat = matrix(rnorm(200 * n_voxels), nrow = 200, ncol = n_voxels),
    TR = 2.0,
    run_length = c(100, 100)
  )
}
subject_ids <- c("sub-01", "sub-02", "sub-03", "sub-04", "sub-05")
subject_list <- setNames(lapply(subject_ids, make_subject), subject_ids)
```
Multi-subject fMRI studies require a unified interface that scales from a
single participant to a full cohort without rewriting analysis code.
`fmridataset` provides `fmri_study_dataset()` for exactly this purpose:
wrap per-subject datasets into one object, then apply the same methods you
already use on single subjects.
## Create per-subject datasets
Each subject is an ordinary `matrix_dataset` (or a dataset on any other
backend: NIfTI, HDF5, Zarr). Create each dataset once and collect the results
in a named list keyed by subject ID.
```{r per-subject}
ds_01 <- matrix_dataset(
  datamat = matrix(rnorm(200 * 500), nrow = 200, ncol = 500),
  TR = 2.0,
  run_length = c(100, 100)
)
ds_02 <- matrix_dataset(
  datamat = matrix(rnorm(200 * 500), nrow = 200, ncol = 500),
  TR = 2.0,
  run_length = c(100, 100)
)
```
## Compose into fmri_study_dataset()
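Pass a named list of datasets and the matching subject identifiers. Here
`subject_list` holds five datasets, named `sub-01` through `sub-05`, each
built the same way as the two above.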
```{r compose-study}
study <- fmri_study_dataset(
  datasets = subject_list,
  subject_ids = subject_ids
)
print(study)
```
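The study object stores a lazy reference to each subject's dataset rather
than a merged copy. With file-backed backends, nothing is loaded into memory
until you explicitly request it; the `matrix_dataset` objects used here
already live in memory.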
## Access data
Retrieve the full concatenated matrix, or pull one subject at a time.
```{r access-all}
all_data <- get_data_matrix(study)
dim(all_data) # (total timepoints) x (voxels)
```
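To make the shape of the concatenated matrix concrete, the sketch below uses plain base-R matrices as stand-ins for subject data. It assumes subjects are stacked along the time axis in the order they were supplied; that layout matches the dimensions described above but is not checked here against the package internals. The names `toy_subjects`, `toy_all`, and `row_ranges` are illustrative only.

```{r sketch-row-layout}
# Stand-in data: 3 hypothetical subjects, 4 timepoints x 2 voxels each.
toy_subjects <- list(
  "sub-01" = matrix(1, nrow = 4, ncol = 2),
  "sub-02" = matrix(2, nrow = 4, ncol = 2),
  "sub-03" = matrix(3, nrow = 4, ncol = 2)
)

# Stack along the time axis (the assumed layout of the concatenated matrix).
toy_all <- do.call(rbind, toy_subjects)
dim(toy_all)

# Row range occupied by each subject in the stacked matrix.
n_time <- unname(vapply(toy_subjects, nrow, integer(1)))
row_end <- cumsum(n_time)
row_ranges <- data.frame(
  subject = names(toy_subjects),
  first = row_end - n_time + 1L,
  last = row_end
)
row_ranges
```

Keeping a table like `row_ranges` alongside a concatenated matrix makes it easy to recover one subject's block by row index when needed.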
```{r access-one}
s01 <- get_data_matrix(study, subject_id = "sub-01")
dim(s01) # timepoints for sub-01 x voxels
```
Subject-specific access is the most memory-efficient pattern for large
cohorts: load, compute, discard, repeat.
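The load, compute, discard pattern can be sketched in base R as follows. `load_subject()` is a hypothetical stand-in for a per-subject call such as `get_data_matrix(study, subject_id = id)`; it fabricates a small deterministic matrix so the block runs on its own, and the accumulated statistic (a mean of per-voxel means) is chosen only for illustration.

```{r sketch-stream-subjects}
# Hypothetical loader standing in for get_data_matrix(study, subject_id = id).
# It returns a deterministic 6 x 3 (timepoints x voxels) matrix per subject.
load_subject <- function(id) {
  offset <- as.integer(gsub("\\D", "", id))
  matrix(seq_len(18) + offset, nrow = 6, ncol = 3)
}

toy_ids <- c("sub-01", "sub-02", "sub-03")

# Accumulate a running sum of per-voxel means. Each subject's matrix is
# removed before the next one is loaded, so only one is held at a time.
voxel_mean_sum <- numeric(3)
for (toy_id in toy_ids) {
  x <- load_subject(toy_id)                       # load
  voxel_mean_sum <- voxel_mean_sum + colMeans(x)  # compute
  rm(x)                                           # discard
}
group_voxel_means <- voxel_mean_sum / length(toy_ids)
group_voxel_means
```

Only the small accumulator survives the loop, so peak memory is governed by the largest single subject rather than by the cohort size.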
## Chunked iteration
`data_chunks()` returns an iterator that yields successive voxel blocks.
Use it when the full data matrix does not fit in memory.
```{r data-chunks}
chunks <- data_chunks(study, nchunks = 4)
repeat {
  # nextElem() signals an error when the iterator is exhausted; map it to NULL.
  ch <- tryCatch(iterators::nextElem(chunks), error = function(e) NULL)
  if (is.null(ch)) break
  cat("chunk", ch$chunk_num, ":", ncol(ch$data), "voxels\n")
}
```
Each chunk element exposes `$data` (timepoints x voxels), `$voxel_ind` (the
column indices this chunk corresponds to), and `$chunk_num` (the chunk's
position in the sequence).
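One way to use `$voxel_ind` is to write each chunk's result back into a full-length output vector. The sketch below builds stand-in chunks by hand as plain lists with `data` and `voxel_ind` fields rather than calling `data_chunks()`, so the chunk layout it assumes (contiguous, equally sized voxel blocks) is illustrative only.

```{r sketch-chunk-assemble}
# Stand-in for a full data matrix: 10 timepoints x 8 voxels.
toy_mat <- matrix(seq_len(80), nrow = 10, ncol = 8)

# Split the voxel (column) indices into 4 contiguous blocks.
voxel_blocks <- split(seq_len(ncol(toy_mat)), rep(1:4, each = 2))

# Hand-built chunks mirroring the `$data` / `$voxel_ind` fields.
toy_chunks <- lapply(voxel_blocks, function(idx) {
  list(data = toy_mat[, idx, drop = FALSE], voxel_ind = idx)
})

# Compute a per-voxel statistic chunk by chunk and place it by voxel index.
voxel_sd <- numeric(ncol(toy_mat))
for (toy_ch in toy_chunks) {
  voxel_sd[toy_ch$voxel_ind] <- apply(toy_ch$data, 2, sd)
}

# The chunked result matches the all-at-once computation.
all.equal(voxel_sd, apply(toy_mat, 2, sd))
```

Because results are placed by index rather than appended, the output stays correct even if chunks arrive in a different order.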
## Group operations with fmri_group()
`fmri_group()` wraps a data frame with one row per subject and a list
column of datasets. It is the entry point for group-level modelling and
participant-level metadata.
```{r fmri-group}
subject_table <- data.frame(
  sub = subject_ids,
  age = c(24, 28, 31, 25, 29),
  dataset = I(lapply(subject_ids, function(id) list(subject_list[[id]]))),
  stringsAsFactors = FALSE
)
grp <- fmri_group(subject_table, id = "sub", dataset_col = "dataset")
print(grp)
```
With `grp` in hand you can attach demographic covariates, filter subjects
with `filter_subjects()`, and pass the group object to modelling functions
in downstream packages such as **fmrireg**.
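The `I()` wrapper in the subject table is what keeps a list as a single column instead of letting `data.frame()` spread it across several columns. A minimal base-R sketch of that idea follows, using character placeholders in place of real dataset objects. The row subsetting at the end is only a rough base-R analogue of subject filtering; it does not call `filter_subjects()`, and `toy_table` and `older` are illustrative names.

```{r sketch-list-column}
toy_table <- data.frame(
  sub = c("sub-01", "sub-02", "sub-03"),
  age = c(24, 28, 31),
  # I() protects the list so it is stored as one list column.
  dataset = I(list("placeholder-A", "placeholder-B", "placeholder-C")),
  stringsAsFactors = FALSE
)

# One row per subject, with one list element per row.
nrow(toy_table)
is.list(toy_table$dataset)

# Base-R row subsetting keeps the list column aligned with the covariates.
older <- toy_table[toy_table$age > 25, ]
older$sub
```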
## Next steps
- `vignette("bids-h5-archive")` — build a compressed, chunked HDF5 archive
for an entire BIDS study, then read it back through the same interface
shown here.
- `?fmri_study_dataset` — full argument reference and constructor options.
- `?data_chunks` — chunk size strategies, run-wise vs. voxel-wise iteration.
- `?fmri_group` — group-level metadata and subject filtering.
```{r session-info, include=FALSE}
sessionInfo()
```