---
title: 'Study-Level Analysis: From Single Subjects to Group Studies'
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Study-Level Analysis: From Single Subjects to Group Studies}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
params:
  family: red
  preset: homage
css: albers.css
resource_files:
  - albers.css
  - albers.js
includes:
  in_header: |-
---

```{r setup, include=FALSE}
if (requireNamespace("ggplot2", quietly = TRUE) &&
    requireNamespace("albersdown", quietly = TRUE)) {
  ggplot2::theme_set(
    albersdown::theme_albers(family = params$family, preset = params$preset)
  )
}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  eval = TRUE,
  warning = FALSE,
  message = FALSE
)
library(fmridataset)
```

```{r make-subjects, include=FALSE}
set.seed(42)

# Build one synthetic subject: 200 timepoints (two runs of 100) x n_voxels.
make_subject <- function(id, n_voxels = 500) {
  # Seed from the digits in the subject id so each subject is reproducible.
  set.seed(as.integer(gsub("\\D", "", id)))
  matrix_dataset(
    datamat = matrix(rnorm(200 * n_voxels), nrow = 200, ncol = n_voxels),
    TR = 2.0,
    run_length = c(100, 100)
  )
}

subject_ids <- c("sub-01", "sub-02", "sub-03", "sub-04", "sub-05")
subject_list <- setNames(lapply(subject_ids, make_subject), subject_ids)
```

Multi-subject fMRI studies require a unified interface that scales from a single participant to a full cohort without rewriting analysis code. `fmridataset` provides `fmri_study_dataset()` for exactly this purpose: wrap per-subject datasets into one object, then apply the same methods you already use on single subjects.

## Create per-subject datasets

Each subject is an ordinary `matrix_dataset` (or any other backend, such as NIfTI, HDF5, or Zarr). Create them once and collect them in a named list.

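The named-list pattern itself needs nothing beyond base R. The chunk below is a minimal sketch in which plain zero-filled matrices stand in for datasets; `toy_ids`, `make_toy_matrix()`, and `toy_list` are hypothetical names used only for illustration and are not part of `fmridataset`.

```{r named-list-sketch}
# Hypothetical subject identifiers, for illustration only.
toy_ids <- c("sub-01", "sub-02", "sub-03")

# Stand-in for a dataset constructor: a 200 x 50 matrix of zeros.
# (Assumption: a real workflow would return a dataset object here.)
make_toy_matrix <- function(id) {
  matrix(0, nrow = 200, ncol = 50)
}

# Build one object per subject and name each element by its identifier.
toy_list <- setNames(lapply(toy_ids, make_toy_matrix), toy_ids)

names(toy_list)
dim(toy_list[["sub-02"]])
```

Naming the elements by subject identifier means a subject can be looked up by name rather than by position, which stays correct if the list is later reordered or subset.
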
```{r per-subject}
ds_01 <- matrix_dataset(
  datamat = matrix(rnorm(200 * 500), nrow = 200, ncol = 500),
  TR = 2.0,
  run_length = c(100, 100)
)
ds_02 <- matrix_dataset(
  datamat = matrix(rnorm(200 * 500), nrow = 200, ncol = 500),
  TR = 2.0,
  run_length = c(100, 100)
)
```

## Compose into fmri_study_dataset()

Pass a named list of datasets and matching subject identifiers.

```{r compose-study}
study <- fmri_study_dataset(
  datasets = subject_list,
  subject_ids = subject_ids
)
print(study)
```

The study object stores a lazy reference to each subject's data; nothing is loaded into memory until you explicitly request it.

## Access data

Retrieve the full concatenated matrix, or pull one subject at a time.

```{r access-all}
all_data <- get_data_matrix(study)
dim(all_data)  # (total timepoints) x (voxels)
```

```{r access-one}
s01 <- get_data_matrix(study, subject_id = "sub-01")
dim(s01)  # timepoints for sub-01 x voxels
```

Subject-specific access is the most memory-efficient pattern for large cohorts: load, compute, discard, repeat.

## Chunked iteration

`data_chunks()` returns an iterator that yields successive voxel blocks. Use it when the full data matrix does not fit in memory.

```{r data-chunks}
chunks <- data_chunks(study, nchunks = 4)
repeat {
  # nextElem() signals an error when the iterator is exhausted; map that to NULL.
  ch <- tryCatch(iterators::nextElem(chunks), error = function(e) NULL)
  if (is.null(ch)) break
  cat("chunk", ch$chunk_num, ":", ncol(ch$data), "voxels\n")
}
```

Each chunk element exposes `$data` (timepoints x voxels) and `$voxel_ind` (the column indices this chunk corresponds to).

## Group operations with fmri_group()

`fmri_group()` wraps a data frame with one row per subject and a list column of datasets. It is the entry point for group-level modelling and participant-level metadata.

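A list column is an ordinary base R construct, so its shape can be sketched without any `fmridataset` objects. In the chunk below, `toy_table`, its `payload` column, and `older` are hypothetical names, and short integer vectors stand in for datasets. `I()` stops `data.frame()` from trying to spread the list across several columns.

```{r list-column-sketch}
# One row per subject; `payload` holds one arbitrary R object per row.
toy_table <- data.frame(
  sub = c("sub-01", "sub-02", "sub-03"),
  age = c(24, 28, 31),
  payload = I(list(1:3, 4:6, 7:9)),  # stand-ins for per-subject datasets
  stringsAsFactors = FALSE
)

nrow(toy_table)
is.list(toy_table$payload)

# Row filtering keeps each payload aligned with its subject's metadata.
older <- toy_table[toy_table$age > 25, ]
older$sub
older$payload[[1]]
```

Because metadata and data objects share a row, subsetting on a covariate such as `age` carries the matching objects along with it.
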
```{r fmri-group}
subject_table <- data.frame(
  sub = subject_ids,
  age = c(24, 28, 31, 25, 29),
  dataset = I(lapply(subject_ids, function(id) list(subject_list[[id]]))),
  stringsAsFactors = FALSE
)
grp <- fmri_group(subject_table, id = "sub", dataset_col = "dataset")
print(grp)
```

With `grp` in hand you can attach demographic covariates, filter subjects with `filter_subjects()`, and pass the group object to modelling functions in downstream packages such as **fmrireg**.

## Next steps

- `vignette("bids-h5-archive")`: build a compressed, chunked HDF5 archive for an entire BIDS study, then read it back through the same interface shown here.
- `?fmri_study_dataset`: full argument reference and constructor options.
- `?data_chunks`: chunk size strategies, run-wise vs. voxel-wise iteration.
- `?fmri_group`: group-level metadata and subject filtering.

```{r session-info, include=FALSE}
sessionInfo()
```