fMRI analyses involve heterogeneous sources: NIfTI files, BIDS datasets, preprocessed matrices, and HDF5 archives. fmridataset provides a unified interface that abstracts format differences so the same functions work regardless of how data is stored.
The simplest starting point is a matrix dataset, which wraps an in-memory matrix with temporal metadata.
set.seed(42)
mat <- matrix(rnorm(150 * 1000), nrow = 150, ncol = 1000)
ds <- matrix_dataset(
  datamat = mat,
  TR = 2.0,
  run_length = c(75, 75)
)
print(ds)
#>
#> === fMRI Dataset ===
#>
#> ** Dimensions:
#> - Timepoints: 150
#> - Runs: 2
#> - Matrix: 150 x 1000 (timepoints x voxels)
#> - Voxels in mask: (lazy)
#>
#> ** Temporal Structure:
#> - TR: 2 seconds
#> - Run lengths: 75, 75
#>
#> ** Event Table:
#> - Empty event table
The dataset holds 150 timepoints split into two runs of 75, with a 2-second TR.
get_data_matrix() returns a standard
timepoints-by-voxels matrix. You can retrieve all runs or a single run
by index.
all_data <- get_data_matrix(ds)
cat("Full dimensions:", dim(all_data), "\n")
#> Full dimensions: 150 1000
run1 <- get_data_matrix(ds, run_id = 1)
cat("Run 1 dimensions:", dim(run1), "\n")
#> Run 1 dimensions: 150 1000
The returned matrix is always in timepoints x voxels orientation.
Every dataset contains a sampling_frame that records run
boundaries, TR, and duration.
sf <- ds$sampling_frame
cat("TR:", get_TR(sf), "seconds\n")
#> TR: 2 seconds
cat("Runs:", n_runs(sf), "\n")
#> Runs: 2
cat("Run lengths:", get_run_lengths(sf), "timepoints\n")
#> Run lengths: 75 75 timepoints
cat("Total duration:", get_total_duration(sf), "seconds\n")
#> Total duration: 300 seconds
# First six values returned by samples()
head(samples(sf))
#> [1] 1 2 3 4 5 6
blockids() maps each timepoint to its run index, which is useful for run-specific indexing.
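The quantities a sampling frame reports follow directly from the run lengths and the TR. The base-R sketch below reproduces that arithmetic by hand. It illustrates the idea and is not the package's implementation; the variable names (`block_ids`, `run_rows`) are made up for this example.

```r
# Base-R illustration only; none of these names come from fmridataset.
TR <- 2
run_length <- c(75, 75)

# One run index per timepoint, analogous to what blockids() is described as returning.
block_ids <- rep(seq_along(run_length), times = run_length)

# Total scan time: number of timepoints times TR.
total_duration <- sum(run_length) * TR

# Row indices belonging to run 2, e.g. for subsetting a timepoints-by-voxels matrix.
run_rows <- which(block_ids == 2)

table(block_ids)
total_duration
range(run_rows)
```

A logical test such as `block_ids == 2` can be used directly as a row index into a timepoints-by-voxels matrix to pull out a single run.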
Experimental design attaches to the dataset as a data frame in
$event_table.
events <- data.frame(
  onset = c(10, 30, 50, 70, 110, 130, 150, 170),
  duration = 2,
  trial_type = rep(c("faces", "houses"), 4),
  run = c(1, 1, 1, 1, 2, 2, 2, 2)
)
ds$event_table <- events
head(ds$event_table)
#>   onset duration trial_type run
#> 1    10        2      faces   1
#> 2    30        2     houses   1
#> 3    50        2      faces   1
#> 4    70        2     houses   1
#> 5   110        2      faces   2
#> 6   130        2     houses   2
Event onsets are in seconds and align with the sampling frame’s time axis.
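Because onsets are in seconds, relating an event to a row of the data matrix means dividing by the TR. The sketch below shows one way to do that conversion in base R. `onset_to_scan()` is a hypothetical helper, not a package function, and it assumes the first volume starts at 0 s and that each onset is measured from the start of the same time axis.

```r
# Hypothetical helper, not part of fmridataset.
# Assumes volume k covers the interval [(k - 1) * TR, k * TR).
onset_to_scan <- function(onset, TR) {
  floor(onset / TR) + 1
}

TR <- 2
onsets <- c(10, 30, 50, 70)   # the four run-1 onsets from the event table
scan_index <- onset_to_scan(onsets, TR)
scan_index
```

If acquisition times are instead defined at the middle of each TR, or onsets are stored relative to each run, the offset has to be adjusted accordingly.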
For large datasets, data_chunks() partitions voxels into
memory-manageable pieces without loading the entire matrix at once.
chunks <- data_chunks(ds, nchunks = 4)
results <- lapply(chunks, function(chunk) {
  colMeans(chunk$data)
})
voxel_means <- do.call(c, results)
cat("Computed means for", length(voxel_means), "voxels\n")
#> Computed means for 1000 voxels
Each element of results corresponds to one chunk; do.call(c, ...) reassembles them in voxel order.
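To make the chunk-and-reassemble pattern concrete, here is a base-R sketch that splits the columns of an ordinary matrix into four contiguous groups, processes each group separately, and checks the result against a one-shot computation. It mimics the idea behind voxel chunking and is not how data_chunks() is implemented; `chunk_id` and `col_groups` are names invented for this example.

```r
# Base-R illustration of voxel-wise chunking; not the package's internals.
set.seed(1)
mat <- matrix(rnorm(150 * 1000), nrow = 150, ncol = 1000)
n_chunks <- 4

# Assign each voxel (column) to one of n_chunks contiguous groups.
chunk_id <- ceiling(seq_len(ncol(mat)) * n_chunks / ncol(mat))
col_groups <- split(seq_len(ncol(mat)), chunk_id)

# Process one block of columns at a time, then concatenate in chunk order.
chunk_means <- lapply(col_groups, function(cols) {
  colMeans(mat[, cols, drop = FALSE])
})
voxel_means <- unlist(chunk_means, use.names = FALSE)

# The chunked result matches the one-shot computation.
all.equal(voxel_means, colMeans(mat))
```

Column-wise statistics are a natural fit for this pattern because each voxel's result depends only on its own column, so chunks can be processed independently.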
Set runwise = TRUE to get one chunk per run. This is
appropriate for analyses that must respect run boundaries such as
detrending or temporal filtering.
run_chunks <- data_chunks(ds, runwise = TRUE)
run_stats <- lapply(run_chunks, function(chunk) {
  cat("Run", chunk$chunk_num, ":", nrow(chunk$data), "timepoints\n")
  rowMeans(chunk$data)
})
#> Run 1 : 75 timepoints
#> Run 2 : 75 timepoints
cat("Processed", length(run_stats), "runs\n")
#> Processed 2 runs
Each chunk’s $data contains only that run’s timepoints.
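As an illustration of why run boundaries matter, the sketch below removes a linear trend separately within each run of a plain matrix. It uses only base R and stands in for what a run-wise analysis might do with each chunk; it is an assumption-laden example, not the package's detrending routine, and the 20-voxel matrix is synthetic.

```r
# Base-R illustration of run-wise processing; not the package's internals.
set.seed(1)
run_length <- c(75, 75)
mat <- matrix(rnorm(sum(run_length) * 20), ncol = 20)   # timepoints x voxels
block_ids <- rep(seq_along(run_length), times = run_length)

detrended <- mat
for (rows in split(seq_len(nrow(mat)), block_ids)) {
  t_run <- seq_along(rows)                       # time index restarts in each run
  fit <- lm(mat[rows, , drop = FALSE] ~ t_run)   # one linear fit per voxel
  detrended[rows, ] <- residuals(fit)
}

# Within each run, every voxel now has (numerically) zero mean.
max(abs(colMeans(detrended[block_ids == 1, ])))
```

Fitting the trend across both runs at once would instead force a single slope through a discontinuity at the run boundary.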
When working with NIfTI files, use fmri_file_dataset().
Data remains on disk until explicitly accessed.
ds_files <- fmri_file_dataset(
  scans = c("/path/to/run1.nii.gz", "/path/to/run2.nii.gz"),
  mask = "/path/to/mask.nii.gz",
  TR = 2.0,
  run_length = c(180, 180)
)
# Inspect metadata without loading data
print(ds_files)
# Load one run only
run1 <- get_data_matrix(ds_files, run_id = 1)
Use run_id to load only the run you need, which keeps peak memory usage low.
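A back-of-the-envelope estimate shows how much loading a single run can save. The sketch below assumes the loaded matrix is a dense double-precision R matrix (8 bytes per value) and uses a made-up mask size of 50,000 voxels; `estimate_bytes()` is a hypothetical helper, not a package function.

```r
# Hypothetical helper: rough size in bytes of a dense double-precision matrix.
estimate_bytes <- function(n_timepoints, n_voxels) {
  n_timepoints * n_voxels * 8   # 8 bytes per double
}

n_voxels <- 50000            # assumed mask size, for illustration only
run_length <- c(180, 180)    # run lengths from the file-based example

one_run  <- estimate_bytes(run_length[1], n_voxels)
all_runs <- estimate_bytes(sum(run_length), n_voxels)

cat("One run: ", one_run / 1e6, "MB\n")
cat("All runs:", all_runs / 1e6, "MB\n")
```

The real footprint depends on the mask and on any intermediate copies made while reading, so treat these figures as a lower bound.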
vignette("architecture-overview") - Design principles and backend extensibility
vignette("h5-backend-usage") - Efficient HDF5 storage for large datasets
vignette("study-level-analysis") - Multi-subject studies and group analyses
sessionInfo()
#> R version 4.6.0 (2026-04-24)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.4 LTS
#>
#> Matrix products: default
#> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
#> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
#> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: Etc/UTC
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] fmridataset_0.8.9 rmarkdown_2.31
#>
#> loaded via a namespace (and not attached):
#> [1] gtable_0.3.6 xfun_0.58 bslib_0.11.0
#> [4] ggplot2_4.0.3 lattice_0.22-9 bigassertr_0.2.0
#> [7] numDeriv_2016.8-1.1 vctrs_0.7.3 tools_4.6.0
#> [10] generics_0.1.4 stats4_4.6.0 parallel_4.6.0
#> [13] tibble_3.3.1 pkgconfig_2.0.3 Matrix_1.7-5
#> [16] RColorBrewer_1.1-3 bigstatsr_1.6.2 S4Vectors_0.51.3
#> [19] S7_0.2.2 RcppParallel_5.1.11-2 assertthat_0.2.1
#> [22] lifecycle_1.0.5 compiler_4.6.0 neuroim2_0.16.0
#> [25] farver_2.1.2 stringr_1.6.0 RNifti_1.9.0
#> [28] bigparallelr_0.3.2 codetools_0.2-20 htmltools_0.5.9
#> [31] sys_3.4.3 buildtools_1.0.0 sass_0.4.10
#> [34] yaml_2.3.12 deflist_0.2.0 pillar_1.11.1
#> [37] jquerylib_0.1.4 RNiftyReg_2.8.5 cachem_1.1.0
#> [40] DelayedArray_0.39.3 dbscan_1.2.5 iterators_1.0.14
#> [43] abind_1.4-8 foreach_1.5.2 tidyselect_1.2.1
#> [46] digest_0.6.39 stringi_1.8.7 dplyr_1.2.1
#> [49] purrr_1.2.2 maketools_1.3.2 splines_4.6.0
#> [52] cowplot_1.2.0 fastmap_1.2.0 grid_4.6.0
#> [55] mmap_0.6-26 SparseArray_1.13.2 cli_3.6.6
#> [58] magrittr_2.0.5 S4Arrays_1.13.0 fmrihrf_0.3.1
#> [61] scales_1.4.0 XVector_0.53.0 albersdown_1.0.1
#> [64] matrixStats_1.5.0 rmio_0.4.0 otel_0.2.0
#> [67] memoise_2.0.1 evaluate_1.0.5 knitr_1.51
#> [70] IRanges_2.47.2 doParallel_1.0.17 rlang_1.2.0
#> [73] Rcpp_1.1.1-1.1 glue_1.8.1 BiocGenerics_0.59.7
#> [76] jsonlite_2.0.0 R6_2.6.1 MatrixGenerics_1.25.0
#> [79] fs_2.1.0 flock_0.7