Explicit vs decoder-backed latents

library(fmrilatent)

fmrilatent ships two latent object types that share a common interface but store data very differently. Knowing which one you have, and which one a given encoder returns, is the most useful orientation for reading the rest of the docs.

The two tiers

┌─────────────────────────────────────────┐
│  Explicit:  basis × loadings + offset   │   LatentNeuroVec  (S4)
│             matrices, on disk           │
├─────────────────────────────────────────┤
│  Decoder-backed:  coeff + decoder()     │   ImplicitLatent  (S3)
│             closure that materializes   │
│             on demand                   │
└─────────────────────────────────────────┘

Property	Explicit (`LatentNeuroVec`)	Decoder-backed (`ImplicitLatent`)
Class system	S4, inherits `neuroim2::NeuroVec`	S3, plain list
Storage	`@basis`, `@loadings`, `@offset` matrices (or lazy handles)	`$coeff`, `$decoder`, `$meta`, `$mask`
Reconstruction	`as.matrix(x)`, `series(x, …)`	`predict(x, time_idx, roi_mask)`
Latent factors	`basis(x)`, `loadings(x)`	`x$coeff` (contents vary by family)
Saved to disk	Matrix bytes	Closure (captures its environment)
Typical use	Compact storage of pre-computed factorization	External solver / non-separable codec
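Because the two tiers are reconstructed through different entry points, code that has to accept either one can branch on the class. The helper below is a minimal sketch: `materialize_latent` is a hypothetical name, not an fmrilatent function, and it relies only on the class name and the `as.matrix()` / `predict()` calls listed in the table.

```r
# Hypothetical helper (not part of fmrilatent): return the dense
# time x voxel matrix regardless of which tier `x` belongs to.
materialize_latent <- function(x) {
  if (inherits(x, "ImplicitLatent")) {
    # Decoder-backed tier: reconstruction goes through predict().
    predict(x)
  } else {
    # Explicit tier, or anything else that has an as.matrix() method.
    as.matrix(x)
  }
}
```

With the `lvec` and `ilat` objects built later in this document, `materialize_latent(lvec)` and `materialize_latent(ilat)` should both come back as 20 x 64 matrices, since that is what `as.matrix(lvec)` and `predict(ilat)` return there.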

Which encoders return which?

Encoder family	Returns
`spec_time_dct` / `spec_time_slepian` / `spec_time_bspline`	Explicit `LatentNeuroVec`
`spec_space_slepian` / `spec_space_pca` / `spec_space_heat` / `spec_space_hrbf` / `spec_space_wavelet_active`	Explicit `LatentNeuroVec`
`spec_space_parcel` (with `parcel_basis_template`)	Explicit `LatentNeuroVec`
`spec_st(time = …, space = …)` (separable spatiotemporal)	Decoder-backed `ImplicitLatent`
`spec_hierarchical_template`	Explicit `LatentNeuroVec`
`encode_transport(...)`	Decoder-backed `ImplicitLatent`
`encode_awpt(...)`	Decoder-backed `ImplicitLatent`
`encode_operator(...)`	Decoder-backed `ImplicitLatent`
`haar_latent(...)`	Decoder-backed `ImplicitLatent` (subclass `HaarLatent`)

The rule of thumb: if the representation can be written down as a basis matrix with fewer columns than there are time points (for temporal bases) or voxels (for spatial bases), the encoder produces an explicit object. If the underlying contract requires a non-trivial decoder (separable Kronecker structure, operator transport, lifted wavelets, learned codecs), the encoder produces a decoder-backed object.
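To make the explicit case concrete, here is a base-R sketch of a rank-6 temporal factorization with the same shapes used later in this document (20 time points, 64 voxels). It does not call fmrilatent. The DCT-II construction, the use of the temporal mean as the offset, and the reconstruction convention `basis %*% t(loadings) + offset` are assumptions chosen to match the "basis × loadings + offset" description, not a statement of how the package computes them.

```r
set.seed(1)
n_time <- 20; n_vox <- 64; k <- 6
X <- matrix(rnorm(n_time * n_vox), nrow = n_time)

# Orthonormal DCT-II basis: n_time rows, k columns (k < n_time).
t_idx <- seq_len(n_time) - 0.5
basis <- sapply(0:(k - 1), function(j) cos(pi * j * t_idx / n_time))
basis <- sweep(basis, 2, sqrt(colSums(basis^2)), "/")

# Per-voxel offset (assumed here: the temporal mean) and loadings.
offset   <- colMeans(X)
Xc       <- sweep(X, 2, offset, "-")
loadings <- t(crossprod(basis, Xc))          # n_vox x k

# Reconstruction from the stored factors.
recon <- sweep(basis %*% t(loadings), 2, offset, "+")

dim(basis)     # n_time x k
dim(loadings)  # n_vox  x k
dim(recon)     # n_time x n_vox

# Numbers stored by the factorization vs. the dense matrix.
c(factors = length(basis) + length(loadings) + length(offset),
  dense   = length(X))
```

The point of the last line is the storage argument behind the rule of thumb: the factors hold far fewer numbers than the dense time x voxel matrix whenever `k` is small relative to the axis being compressed.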

Working with explicit latents

mask  <- array(TRUE, dim = c(4, 4, 4))
mask_vol <- neuroim2::LogicalNeuroVol(mask, neuroim2::NeuroSpace(dim(mask)))
set.seed(7)
X     <- matrix(rnorm(20 * sum(mask)), nrow = 20)

lvec  <- encode(X, spec_time_dct(k = 6), mask = mask_vol, materialize = "matrix")
class(lvec)
#> [1] "LatentNeuroVec"
#> attr(,"package")
#> [1] "fmrilatent"
isS4(lvec)
#> [1] TRUE

# Direct factor access:
dim(basis(lvec))     # 20 x 6
#> [1] 20  6
dim(loadings(lvec))  # 64 x 6
#> [1] 64  6

# Reconstruction:
recon <- as.matrix(lvec)
dim(recon)
#> [1] 20 64

# Slicing — same as a NeuroVec:
ts1 <- series(lvec, 1L)
length(ts1)
#> [1] 20

LatentNeuroVec is a subclass of neuroim2::NeuroVec, so the standard neuroim2 operations work — dim(), series(), as.array(), [, [[. basis() and loadings() give you the latent matrices directly.
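One reason slicing is cheap on the explicit tier is that a single voxel's time series only needs that voxel's row of the loadings. The base-R sketch below illustrates the arithmetic with stand-in matrices; `voxel_series` is a hypothetical helper, and whether `series()` is implemented this way internally is not something this document confirms.

```r
set.seed(2)
n_time <- 20; n_vox <- 64; k <- 6

# Stand-in factors with the same shapes as basis(lvec) and loadings(lvec).
basis    <- qr.Q(qr(matrix(rnorm(n_time * k), n_time, k)))  # 20 x 6
loadings <- matrix(rnorm(n_vox * k), n_vox, k)              # 64 x 6
offset   <- rnorm(n_vox)

# Time series for a set of voxels, touching only their loadings rows.
voxel_series <- function(basis, loadings, offset, vox) {
  out <- basis %*% t(loadings[vox, , drop = FALSE])
  sweep(out, 2, offset[vox], "+")
}

ts1  <- voxel_series(basis, loadings, offset, 1L)

# Same numbers as reconstructing everything and taking column 1.
full <- sweep(basis %*% t(loadings), 2, offset, "+")
all.equal(drop(ts1), full[, 1])
```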

Working with decoder-backed latents

spec_separable <- spec_st(
  time  = spec_time_dct(k = 4),
  space = spec_space_hrbf(params = list(sigma0 = 2, levels = 0,
                                        radius_factor = 2.5))
)
ilat <- encode(X, spec_separable, mask = mask_vol)
class(ilat)
#> [1] "ImplicitLatent"
isS4(ilat)
#> [1] FALSE

# Coefficients + decoder, not basis × loadings:
names(ilat)
#> [1] "coeff"   "decoder" "meta"    "mask"    "domain"  "support"
str(ilat$coeff, max.level = 1)
#> List of 3
#>  $ core: num [1:4, 1:64] 1.98 -1.23 -2.04 2.02 0.44 ...
#>  $ B_t : num [1:20, 1:4] 0.224 0.224 0.224 0.224 0.224 ...
#>  $ L_s : num [1:64, 1:64] 1 0 0 0 0 0 0 0 0 0 ...
ilat$meta$family
#> [1] "st_separable"

# Reconstruction goes through predict():
recon_full <- predict(ilat)
dim(recon_full)         # n_time x n_voxels
#> [1] 20 64

# Partial decode — only the first 5 time points:
recon_part <- predict(ilat, time_idx = 1:5)
dim(recon_part)
#> [1]  5 64

predict() is the universal decoder API for the implicit tier. It accepts time_idx, roi_mask, and family-specific arguments (levels_keep for haar, etc.), and only materializes the slice you ask for.
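The coefficient-plus-closure pattern can be mimicked in a few lines of base R. The sketch below is a toy, not the package's implementation: the class name `ToyImplicit` is hypothetical, the decode rule `B_t %*% core %*% t(L_s)` is an assumption inferred from the shapes printed by `str(ilat$coeff)` above, and `roi_mask` is simplified to a logical vector over voxels.

```r
set.seed(3)
n_time <- 20; n_vox <- 64; k_t <- 4
B_t  <- qr.Q(qr(matrix(rnorm(n_time * k_t), n_time, k_t)))  # 20 x 4
L_s  <- diag(n_vox)                                         # 64 x 64
core <- matrix(rnorm(k_t * n_vox), k_t, n_vox)              # 4 x 64

# Toy stand-in for a decoder-backed object.
toy <- structure(
  list(
    coeff   = list(core = core, B_t = B_t, L_s = L_s),
    # The decoder only multiplies the rows it was asked for.
    decoder = function(coeff, time_idx, vox_idx) {
      coeff$B_t[time_idx, , drop = FALSE] %*% coeff$core %*%
        t(coeff$L_s[vox_idx, , drop = FALSE])
    }
  ),
  class = "ToyImplicit"
)

predict.ToyImplicit <- function(object, time_idx = NULL, roi_mask = NULL, ...) {
  n_t <- nrow(object$coeff$B_t)
  n_v <- nrow(object$coeff$L_s)
  if (is.null(time_idx)) time_idx <- seq_len(n_t)
  vox_idx <- if (is.null(roi_mask)) seq_len(n_v) else which(roi_mask)
  object$decoder(object$coeff, time_idx, vox_idx)
}

dim(predict(toy))                                   # all time points, all voxels
dim(predict(toy, time_idx = 1:5))                   # first 5 time points
roi <- rep(c(TRUE, FALSE), length.out = n_vox)
dim(predict(toy, time_idx = 1:5, roi_mask = roi))   # 5 time points, 32 voxels
```

The design point is that the subsetting happens before the matrix products, so a partial decode never builds the full time x voxel matrix.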

Serialization implications

Serialization is the most common gotcha. Both tiers can be saved with saveRDS(), but the cost and reproducibility characteristics differ.

# Explicit: matrices serialize natively. With handle-backed slots
# (e.g. dct_basis_handle), the @id + @spec are saved and the basis is
# rematerialized on first access in the new session.
lvec_path <- tempfile(fileext = ".rds")
saveRDS(lvec, lvec_path)
lvec2 <- readRDS(lvec_path)
identical(as.matrix(basis(lvec)), as.matrix(basis(lvec2)))  # TRUE

# Decoder-backed: $decoder is a closure. saveRDS captures its
# environment — including any data the closure references. This means:
#   - Self-contained decoders (haar, st-separable) round-trip cleanly.
#   - Decoders that reference large external assets (subject field
#     operators) save a copy of the asset by default.
ilat_path <- tempfile(fileext = ".rds")
saveRDS(ilat, ilat_path)
ilat2 <- readRDS(ilat_path)
identical(predict(ilat), predict(ilat2))                     # TRUE

When in doubt, round-trip through tempfile() and check that predict() (or as.matrix()) returns the same numbers. The package test suite has dedicated coverage for this on the explicit side (test-latent_serialization.R); the implicit decoders are exercised indirectly through their family-specific tests.
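The round-trip check and the closure-capture cost can both be demonstrated without the package. In the base-R sketch below, `roundtrip_equal` and `make_decoder` are hypothetical names introduced for illustration; the only behaviour relied on is that R serializes a closure together with its enclosing environment.

```r
# Round-trip an object through a temporary .rds file and compare what it
# reconstructs to before and after. `materialize` is the function that
# turns the object into numbers (as.matrix, predict, ...).
roundtrip_equal <- function(obj, materialize) {
  path <- tempfile(fileext = ".rds")
  on.exit(unlink(path), add = TRUE)
  saveRDS(obj, path)
  identical(materialize(obj), materialize(readRDS(path)))
}

# Toy decoder factory: the returned closure keeps `asset` alive in its
# enclosing environment, so serializing the closure serializes `asset` too.
make_decoder <- function(asset) {
  force(asset)  # evaluate now, so the closure holds the data itself
  function(idx) asset[idx]
}

small <- make_decoder(numeric(10))
large <- make_decoder(numeric(1e5))

# The closure survives the round trip and decodes the same values.
roundtrip_equal(small, function(f) f(1:3))

# Serialized size in bytes: the captured asset dominates.
c(small = length(serialize(small, NULL)),
  large = length(serialize(large, NULL)))
```

For package objects the same helper would be called as `roundtrip_equal(lvec, as.matrix)` or `roundtrip_equal(ilat, predict)`. Comparing serialized sizes this way is a quick check on whether a decoder is dragging a large asset along with it.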

Shared structure is orthogonal

Both tiers can participate in the shared structure protocol (R/shared_structure.R), which lets multiple objects reference the same heavy data — a template basis, a parcel atlas, a precomputed graph — instead of each carrying its own copy. The protocol works through dictionary handles (BasisHandle / LoadingsHandle for the explicit tier, decoder-side asset references for the implicit tier) plus an in-session shared-reference registry. Use shared structures when:

You’re encoding many subjects against a common template.
Multiple LatentNeuroVecs in the same session would otherwise duplicate the same dictionary in memory.
You’re building a benchmark and want strict equality of the basis across runs.

See vignette("shared-spatial-dictionaries") for the parcel-template walkthrough, and ?fmrilatent_registry_enable for the in-session cache controls.
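The idea behind handles plus a registry can be sketched in base R. Everything named below (`.registry`, `register_basis`, `resolve_basis`, the `toy_basis_handle` class, the `"template_v1"` id) is hypothetical and only illustrates the pattern; it is not the protocol implemented in R/shared_structure.R.

```r
# Hypothetical in-session registry: an environment keyed by dictionary id.
.registry <- new.env(parent = emptyenv())

register_basis <- function(id, basis) {
  assign(id, basis, envir = .registry)
  # The handle carries only the id, not the matrix.
  structure(list(id = id), class = "toy_basis_handle")
}

resolve_basis <- function(handle) {
  if (!exists(handle$id, envir = .registry, inherits = FALSE)) {
    stop("basis '", handle$id, "' is not registered in this session")
  }
  get(handle$id, envir = .registry, inherits = FALSE)
}

set.seed(4)
template <- matrix(rnorm(64 * 10), nrow = 64)   # shared 64 x 10 dictionary
h <- register_basis("template_v1", template)

# Two subjects keep their own coefficients but point at the same dictionary.
subj_a <- list(handle = h, coeff = matrix(rnorm(20 * 10), nrow = 20))
subj_b <- list(handle = h, coeff = matrix(rnorm(20 * 10), nrow = 20))

recon_a <- subj_a$coeff %*% t(resolve_basis(subj_a$handle))   # 20 x 64

# Both subjects resolve to the identical basis.
identical(resolve_basis(subj_a$handle), resolve_basis(subj_b$handle))
```

Because each subject object stores an id rather than the matrix, the dictionary exists once per session, and equality of the basis across subjects holds by construction.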

Choosing between tiers

Choose explicit if …	Choose decoder-backed if …
You want fast, predictable matrix access	You need partial decoding or operator transport
You’ll be slicing voxels or time often	The basis is non-separable or learned
You want to inspect / plot the basis directly	The decoder captures domain knowledge (Haar lifting, AWPT)
You’re storing many objects to disk and want straightforward bytes	You’re in a coefficient-space modeling pipeline

In practice, most users start with explicit (DCT or B-spline temporal encoding) and only reach for the implicit tier when they hit spec_st, the transport pipeline, or a wavelet codec.