Package 'multivarious' reference manual

Title:	Extensible Data Structures for Multivariate Analysis
Description:	Provides a set of basic and extensible data structures and functions for multivariate analysis, including dimensionality reduction techniques, projection methods, and preprocessing functions. The aim of this package is to offer a flexible and user-friendly framework for multivariate analysis that can be easily extended for custom requirements and specific data analysis tasks.
Authors:	Bradley Buchsbaum [aut, cre]
Maintainer:	Bradley Buchsbaum
License:	MIT + file LICENSE
Version:	0.2.0
Built:	2025-03-20 20:41:58 UTC
Source:	https://github.com/bbuchsbaum/multivarious

add a pre-processing stage

Description

add a pre-processing stage

Usage

add_node(x, step, ...)

Arguments

`x`	the processing pipeline
`step`	the pre-processing step to add
`...`	extra args

Value

a new pre-processing pipeline with the added step

Add a pre-processing node to a pipeline

Description

Add a pre-processing node to a pipeline

Usage

## S3 method for class 'prepper'
add_node(x, step, ...)

Arguments

`x`	A `prepper` pipeline
`step`	The pre-processing step to add
`...`	Additional arguments

Apply rotation

Description

Apply a specified rotation to the fitted model

Usage

apply_rotation(x, rotation_matrix, ...)

Arguments

`x`	A model object, possibly created using the `pca()` function.
`rotation_matrix`	A `matrix` representing the rotation.
`...`	extra args

Value

A modified object with updated components and scores after applying the specified rotation.
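
As a rough illustration of what rotating a fitted model involves, the base R sketch below rotates a loading matrix and its scores by the same orthogonal matrix and checks that the reconstruction of the data is unchanged. The variable names (`v`, `s`, `R`) are illustrative assumptions, and this is not the package's implementation of apply_rotation.

```r
set.seed(1)
X <- scale(matrix(rnorm(50 * 4), 50, 4), center = TRUE, scale = FALSE)
sv <- svd(X)
v <- sv$v[, 1:2]            # loadings (variables x components)
s <- X %*% v                # scores (samples x components)

# A 2 x 2 orthogonal rotation by 30 degrees (hypothetical choice)
theta <- pi / 6
R <- matrix(c(cos(theta), sin(theta),
              -sin(theta), cos(theta)), 2, 2)

v_rot <- v %*% R            # rotated loadings
s_rot <- s %*% R            # rotated scores

# The rank-2 reconstruction is the same before and after rotation
max_diff <- max(abs(s %*% t(v) - s_rot %*% t(v_rot)))
```

Because `R` is orthogonal, the rotated loadings stay orthonormal; only the individual components change, not the subspace they span.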

apply a pre-processing transform

Description

apply a pre-processing transform

Usage

apply_transform(x, X, colind, ...)

Arguments

`x`	the pre_processor
`X`	the data matrix
`colind`	column indices
`...`	extra args

Value

the transformed data

Construct a bi_projector instance

Description

A bi_projector offers a two-way mapping from samples (rows) to scores and from variables (columns) to components. Thus, one can project samples from the D-dimensional input space onto a d-dimensional subspace, and one can project variables (via project_vars) from the n-dimensional variable space onto the same d-dimensional component space. The singular value decomposition is a canonical example of such a two-way mapping.
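
The two-way mapping can be sketched with a plain SVD in base R. The code below is only an illustration under the assumption that rows are samples and columns are variables; it does not call the package, and the object names are hypothetical.

```r
set.seed(42)
X <- matrix(rnorm(10 * 20), 10, 20)   # 10 samples (rows), 20 variables (columns)
sv <- svd(X)
v <- sv$v                             # 20 x 10 components (variables x components)
s <- sv$u %*% diag(sv$d)              # 10 x 10 scores (samples x components)

# Row direction: projecting the samples onto the components gives the scores
scores_from_rows <- X %*% v

# Column direction: projecting the variables onto the normalised scores
# recovers the components
u <- s %*% diag(1 / sv$d)
comps_from_cols <- t(X) %*% u %*% diag(1 / sv$d)
```

Both directions use the same d-dimensional component space, which is what makes the SVD a natural example of a two-way projector.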

Usage

bi_projector(v, s, sdev, preproc = prep(pass()), classes = NULL, ...)

Arguments

`v`	A matrix of coefficients with dimensions `nrow(v)` by `ncol(v)` (number of columns = number of components)
`s`	The score matrix
`sdev`	The standard deviations of the score matrix
`preproc`	(optional) A pre-processing pipeline, default is prep(pass())
`classes`	(optional) A character vector specifying the class attributes of the object, default is NULL
`...`	Extra arguments to be stored in the `projector` object.

Value

A bi_projector object

Examples

X <- matrix(rnorm(200), 10, 20)
svdfit <- svd(X)

p <- bi_projector(svdfit$v, s = svdfit$u %*% diag(svdfit$d), sdev=svdfit$d)

A Union of Concatenated `bi_projector` Fits

Description

This function combines a set of bi_projector fits into a single bi_projector instance. The new instance's weights and associated scores are obtained by concatenating the weights and scores of the input fits.

Usage

bi_projector_union(fits, outer_block_indices = NULL)

Arguments

`fits`	A list of `bi_projector` instances with the same row space. These instances will be combined to create a new `bi_projector` instance.
`outer_block_indices`	An optional list of indices for the outer blocks. If not provided, the function will compute the indices based on the dimensions of the input fits.

Value

A new bi_projector instance with concatenated weights, scores, and other properties from the input bi_projector instances.

Examples


X1 <- matrix(rnorm(5*5), 5, 5)
X2 <- matrix(rnorm(5*5), 5, 5)

bpu <- bi_projector_union(list(pca(X1), pca(X2)))

get block_indices

Description

extract the list of indices associated with each block in a multiblock object

Usage

block_indices(x, ...)

Arguments

`x`	the object
`...`	extra args

Value

a list of block indices

Extract the Block Indices from a Multiblock Projector

Description

Extract the Block Indices from a Multiblock Projector

Usage

## S3 method for class 'multiblock_projector'
block_indices(x, i, ...)

Arguments

`x`	A `multiblock_projector` object.
`i`	Ignored.
`...`	Ignored.

Value

The list of block indices.

get block_lengths

Description

extract the lengths of each block in a multiblock object

Usage

block_lengths(x)

Arguments

`x`	the object

Value

the block lengths
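
Block lengths and block indices carry the same information for contiguous blocks. The base R sketch below derives a list of index vectors from a vector of lengths; the lengths used are made up for illustration, and the package may store or compute these differently.

```r
block_lens <- c(5, 3, 4)              # hypothetical number of variables per block

ends   <- cumsum(block_lens)          # last column of each block
starts <- ends - block_lens + 1       # first column of each block
block_idx <- Map(seq, starts, ends)   # list of index vectors, one per block

# Going back: the lengths are recovered from the index list
recovered_lens <- lengths(block_idx)
```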

Bootstrap Resampling for Multivariate Models

Description

Perform bootstrap resampling on a multivariate model to estimate the variability of components and scores.

Usage

bootstrap(x, nboot, ...)

Arguments

`x`	A fitted model object, such as a `projector`, that has been fit to a training dataset.
`nboot`	An integer specifying the number of bootstrap resamples to perform.
`...`	Additional arguments to be passed to the specific model implementation of `bootstrap`.

Value

A list containing the bootstrap resampled components and scores for the model.

PCA Bootstrap Resampling

Description

Perform bootstrap resampling for Principal Component Analysis (PCA) to estimate component and score variability.

Usage

## S3 method for class 'pca'
bootstrap(x, nboot = 100, k = ncomp(x), ...)

Arguments

`x`	A fitted PCA model object.
`nboot`	The number of bootstrap resamples (default: 100).
`k`	The number of components to bootstrap (default: all components in the fitted PCA model).
`...`	Additional arguments to be passed to the specific model implementation of `bootstrap`.

Value

A list containing bootstrap z-scores for the loadings (zboot_loadings) and scores (zboot_scores).
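
To make the idea of bootstrap z-scores concrete, here is a naive base R sketch that resamples rows, refits the loadings, and divides the bootstrap mean by the bootstrap standard deviation. It is an assumption-laden illustration: it refits a full SVD per resample rather than using the fast method of the reference below, and the helper `fit_loadings` and the z-score definition are hypothetical, not taken from the package.

```r
set.seed(123)
n <- 40; p <- 6; k <- 2; nboot <- 50
# Two dominant directions so the first two components are well defined
X <- matrix(rnorm(n * p), n, p) %*% diag(c(3, 2, 1, 1, 1, 1))

# Hypothetical helper: centred PCA loadings (p x k) via SVD
fit_loadings <- function(M, k) {
  Mc <- scale(M, center = TRUE, scale = FALSE)
  svd(Mc, nu = 0, nv = k)$v
}
v_ref <- fit_loadings(X, k)

boot <- array(NA_real_, dim = c(p, k, nboot))
for (b in seq_len(nboot)) {
  idx <- sample.int(n, replace = TRUE)        # resample rows with replacement
  v_b <- fit_loadings(X[idx, , drop = FALSE], k)
  # Components are defined only up to sign: align each one with the reference
  sgn <- sign(colSums(v_b * v_ref))
  sgn[sgn == 0] <- 1
  boot[, , b] <- sweep(v_b, 2, sgn, `*`)
}

boot_mean  <- apply(boot, c(1, 2), mean)
boot_sd    <- apply(boot, c(1, 2), sd)
z_loadings <- boot_mean / boot_sd             # one z-score per loading
```

Large absolute z-scores indicate loadings that are stable across resamples.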

References

Fisher, Aaron, Brian Caffo, Brian Schwartz, and Vadim Zipunnikov. 2016. "Fast, Exact Bootstrap Principal Component Analysis for P > 1 Million." Journal of the American Statistical Association 111 (514): 846-60.

Examples

X <- matrix(rnorm(10*100), 10, 100)
x <- pca(X, ncomp=9)
bootstrap_results <- bootstrap(x)

center a data matrix

Description

subtract the column mean from each column of the matrix

Usage

center(preproc = prepper(), cmeans = NULL)

Arguments

`preproc`	the pre-processing pipeline
`cmeans`	optional vector of precomputed column means

Value

a prepper list
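
The reason a centering step stores its column means is that new data should be centered with the means learned from the training data. The base R sketch below shows that idea with hand-made numbers; it does not use the package's pipeline objects.

```r
train <- rbind(1:4, 3:6, 5:8)
cmeans <- colMeans(train)                  # learned once on the training data

train_c <- sweep(train, 2, cmeans, `-`)    # centred training data

# New data are centred with the stored training means, not their own
new_x <- matrix(10, nrow = 1, ncol = 4)
new_c <- sweep(new_x, 2, cmeans, `-`)

# Adding the stored means back undoes the step
back <- sweep(new_c, 2, cmeans, `+`)
```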

Construct a Classifier

Description

Create a classifier from a given model object (e.g., projector). This classifier can generate predictions for new data points.

Usage

classifier(x, colind, ...)

Arguments

`x`	A model object, such as a `projector`, that has been fit to a training dataset.
`colind`	Optional vector of column indices used for prediction. If not provided, all columns will be used.
`...`	Additional arguments to be passed to the specific model implementation of `classifier`.

Value

A classifier function that can be used to make predictions on new data points.
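
A classifier built on a projector typically compares new points with training points in the reduced space. The following base R sketch shows a 1-nearest-neighbour rule on PCA scores; the function `predict_1nn` and the simulated data are assumptions for illustration and do not reproduce the package's classifier objects.

```r
set.seed(7)
n_per <- 20
X <- rbind(matrix(rnorm(n_per * 5, mean = 0), n_per, 5),
           matrix(rnorm(n_per * 5, mean = 4), n_per, 5))
labels <- factor(rep(c("a", "b"), each = n_per))

# Fit a 2-component projection on the centred training data
mu <- colMeans(X)
v <- svd(sweep(X, 2, mu), nu = 0, nv = 2)$v
train_scores <- sweep(X, 2, mu) %*% v

# Hypothetical helper: label of the nearest training point in score space
predict_1nn <- function(new_data) {
  new_scores <- sweep(new_data, 2, mu) %*% v
  apply(new_scores, 1, function(z) {
    d2 <- rowSums(sweep(train_scores, 2, z)^2)   # squared Euclidean distances
    as.character(labels[which.min(d2)])
  })
}

new_data <- rbind(rep(0, 5), rep(4, 5))          # one point near each class centre
pred <- predict_1nn(new_data)
```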

Create a k-NN classifier for a discriminant projector

Description

Create a k-NN classifier for a discriminant projector

Usage

## S3 method for class 'discriminant_projector'
classifier(x, colind = NULL, knn = 1, ...)

Arguments

`x`	the discriminant projector object
`colind`	an optional vector specifying the column indices of the components
`knn`	the number of nearest neighbors (default=1)
`...`	extra arguments

Value

a classifier object

Multiblock Bi-Projector Classifier

Description

Constructs a classifier for a multiblock bi-projector model object. Either global or partial scores can be used. If colind or block are provided and global_scores=FALSE, partial projection is performed. Otherwise, global projection is used.

Usage

## S3 method for class 'multiblock_biprojector'
classifier(
  x,
  colind = NULL,
  labels,
  new_data = NULL,
  block = NULL,
  global_scores = TRUE,
  knn = 1,
  ...
)

Arguments

`x`	A fitted multiblock bi-projector model object.
`colind`	An optional vector of column indices used for prediction (default: NULL).
`labels`	A factor or vector of class labels for the training data.
`new_data`	An optional data matrix for which to generate predictions (default: NULL).
`block`	An optional block index for prediction (default: NULL).
`global_scores`	Whether to use the global scores or the partial scores for reference space (default: TRUE).
`knn`	The number of nearest neighbors to consider in the classifier (default: 1).
`...`	Additional arguments.

Value

A multiblock classifier object.

create classifier from a projector

Description

create classifier from a projector

Usage

## S3 method for class 'projector'
classifier(
  x,
  colind = NULL,
  labels,
  new_data,
  knn = 1,
  global_scores = TRUE,
  ...
)

Arguments

`x`	projector
`colind`	...
`labels`	...
`new_data`	...
`knn`	...
`global_scores`	...
`...`	extra args

Extract coefficients from a cross_projector object

Description

Extract coefficients from a cross_projector object

Usage

## S3 method for class 'cross_projector'
coef(object, source = c("X", "Y"), ...)

Arguments

`object`	the model fit
`source`	the source of the data (X or Y block), either "X" or "Y"
`...`	extra args

Value

the coefficients

Coefficients for a Multiblock Projector

Description

Extracts the components (loadings) for a given block or the entire projector.

Usage

## S3 method for class 'multiblock_projector'
coef(object, block, ...)

Arguments

`object`	A `multiblock_projector` object.
`block`	Optional block index. If missing, returns loadings for all variables.
`...`	Additional arguments.

Value

A matrix of loadings.

scale a data matrix

Description

normalize each column by a scale factor.

Usage

colscale(preproc = prepper(), type = c("unit", "z", "weights"), weights = NULL)

Arguments

`preproc`	the pre-processing pipeline
`type`	the kind of scaling, `unit` norm, `z`-scoring, or precomputed `weights`
`weights`	optional precomputed weights

Value

a prepper list

get the components

Description

Extract the component matrix of a fit.

Usage

components(x, ...)

Arguments

`x`	the model fit
`...`	extra args

Value

the component matrix

Compose Multiple Partial Projectors

Description

Creates a composed_partial_projector object that applies partial projections sequentially. If multiple projectors are composed, the column indices (colind) used at each stage must be considered.

Usage

compose_partial_projector(...)

Arguments

`...`	A sequence of projectors that implement partial_project().

Value

A composed_partial_projector object.

Examples

# Suppose pca1 and pca2 support partial_project().
# cpartial <- compose_partial_projector(pca1, pca2)
# partial_project(cpartial, new_data, colind=1:5)

Compose Two Projectors

Description

Combine two projector models into a single projector by sequentially applying the first projector and then the second projector.

Usage

compose_projector(x, y, ...)

Arguments

`x`	A fitted model object (e.g., `projector`) that has been fit to a dataset and will be applied first in the composition.
`y`	A second fitted model object (e.g., `projector`) that has been fit to a dataset and will be applied after the first projector.
`...`	Additional arguments to be passed to the specific model implementation of `compose_projector`.

Value

A new projector object representing the composed projector, which can be used to project data onto the combined subspace.
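
For purely linear projectors, sequential application amounts to multiplying the coefficient matrices. The base R sketch below checks this on simulated data. It ignores any pre-processing attached to either projector, which is a simplifying assumption, and it is not the package's implementation.

```r
set.seed(3)
X <- matrix(rnorm(30 * 8), 30, 8)

v1 <- svd(X, nu = 0, nv = 5)$v        # first projector: 8 variables -> 5 components
S1 <- X %*% v1
v2 <- svd(S1, nu = 0, nv = 2)$v       # second projector: 5 components -> 2

sequential <- (X %*% v1) %*% v2       # apply first, then second

v12 <- v1 %*% v2                      # composed coefficients (8 x 2)
composed <- X %*% v12                 # single projection with the composed matrix
```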

bind together blockwise pre-processors

Description

concatenate a sequence of pre-processors, each applied to a block of data.

Usage

concat_pre_processors(preprocs, block_indices)

Arguments

`preprocs`	a list of initialized `pre_processor` objects
`block_indices`	a list of integer vectors specifying the global column indices for each block

Value

a new pre_processor object that applies the correct transformations blockwise

Examples


p1 <- center() |> prep()
p2 <- center() |> prep()

x1 <- rbind(1:10, 2:11)
x2 <- rbind(1:10, 2:11)

p1a <- init_transform(p1,x1)
p2a <- init_transform(p2,x2)

clist <- concat_pre_processors(list(p1,p2), list(1:10, 11:20))
t1 <- apply_transform(clist, cbind(x1,x2))

t2 <- apply_transform(clist, cbind(x1,x2[,1:5]), colind=1:15)

Transfer data from one input domain to another via common latent space

Description

Convert between data representations in a multiblock decomposition/alignment by projecting the input data onto a common latent space and then reconstructing it in the target domain.
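
The project-then-reconstruct idea can be sketched in base R with a shared set of components split into two blocks. The example assumes an exactly low-rank data matrix and a least-squares estimate of the latent scores from the source block; the block layout and all object names are hypothetical, and the package's methods may differ.

```r
set.seed(11)
n <- 25; d <- 2
S0 <- matrix(rnorm(n * d), n, d)
V0 <- matrix(rnorm(10 * d), 10, d)
X  <- S0 %*% t(V0)                       # exactly rank-2 data with 10 variables
blocks <- list(1:6, 7:10)                # hypothetical source block i and target block j

v   <- svd(X, nu = 0, nv = d)$v          # shared components (10 x 2)
v_i <- v[blocks[[1]], , drop = FALSE]    # rows of v belonging to block i
v_j <- v[blocks[[2]], , drop = FALSE]    # rows of v belonging to block j

X_i <- X[, blocks[[1]], drop = FALSE]

# Step 1: least-squares estimate of the latent scores from block i alone
latent <- X_i %*% v_i %*% solve(crossprod(v_i))

# Step 2: reconstruct block j from the latent scores
X_j_hat <- latent %*% t(v_j)

err <- max(abs(X_j_hat - X[, blocks[[2]], drop = FALSE]))
```

The transfer is exact here only because the data are exactly rank 2; with noisy data the reconstruction is an approximation.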

Usage

convert_domain(x, new_data, i, j, comp, rowind, colind, ...)

Arguments

`x`	The model fit, typically an object of a class that implements a `convert_domain` method
`new_data`	The data to transfer, with the same number of rows as the source data block
`i`	The index of the source data block
`j`	The index of the destination data block
`comp`	A vector of component indices to use in the reconstruction
`rowind`	Optional set of row indices to transfer (default: all rows)
`colind`	Optional set of column indices to transfer (default: all columns)
`...`	Additional arguments passed to the underlying `convert_domain` method

Value

A matrix or data frame representing the transferred data in the target domain

Contrastive PCA (cPCA) with Adaptive Computation Methods

Description

Contrastive PCA (cPCA) finds directions that capture the variation in a "foreground" dataset $X_f$ that is not present (or less present) in a "background" dataset $X_b$. This function adaptively chooses how to solve the generalized eigenvalue problem based on the dataset sizes and the chosen method; the cases are listed under Details.

Usage

cPCA(
  X_f,
  X_b,
  ncomp = min(dim(X_f)[2]),
  preproc = center(),
  lambda = 0,
  method = c("geigen", "primme", "sdiag", "corpcor"),
  allow_transpose = TRUE,
  ...
)

Arguments

`X_f`	A numeric matrix representing the foreground dataset, with dimensions (samples x features).
`X_b`	A numeric matrix representing the background dataset, with dimensions (samples x features).
`ncomp`	Number of components to estimate. Defaults to `min(ncol(X_f))`.
`preproc`	A pre-processing function (default: `center()`), applied to both `X_f` and `X_b` before analysis.
`lambda`	Shrinkage parameter for covariance estimation. Defaults to 0. Used by `corpcor::cov.shrink` or `crossprod.powcor.shrink`.
`method`	A character string specifying the computation method. One of: "geigen": use `geneig` for the generalized eigenvalue problem (default); "primme": use `geneig` with the PRIMME library for potentially more efficient solvers; "sdiag": use a spectral decomposition method for symmetric matrices in `geneig`; "corpcor": use a corpcor-based whitening approach followed by PCA.
`...`	Additional arguments passed to underlying functions such as `geneig` or covariance estimation.

Details

method = "corpcor": Uses a corpcor-based whitening approach (crossprod.powcor.shrink) to transform the data, then performs a standard PCA on the transformed foreground data.
method in {"geigen", "primme", "sdiag"} and a moderate number of features (D): Directly forms covariance matrices and uses geneig to solve the generalized eigenvalue problem.
method in {"geigen", "primme", "sdiag"} and a large number of features (D >> N): Uses an SVD-based reduction on the background data to avoid forming large $D \times D$ matrices. This reduces the problem to $N \times N$ space.

Adaptive Strategy:

If method = "corpcor", no large covariance matrices are formed. Instead, the background data is used to "whiten" the foreground, followed by a simple PCA.
If method != "corpcor" and the number of features D is manageable (e.g., D <= max(N_f, N_b)), the function forms covariance matrices and directly solves the generalized eigenproblem.
If method != "corpcor" and D is large (e.g., tens of thousands, D > max(N_f, N_b)), it computes the SVD of the background data X_b to derive a smaller N x N eigenproblem, thereby avoiding the costly computation of $D \times D$ covariance matrices.

Note: If lambda != 0 and D is very large, the current implementation does not fully integrate shrinkage into the large-D SVD-based approach and will issue a warning.
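
The whitening strategy can be sketched in base R as follows. This is a simplified illustration that uses plain sample covariances (no shrinkage, so corpcor is not involved) and assumes the target problem is $C_f v = \lambda C_b v$; it is not the code path taken by cPCA.

```r
set.seed(123)
X_f <- matrix(rnorm(100 * 6), 100, 6)
X_f[, 1] <- X_f[, 1] * 3                    # extra variance present only in the foreground
X_b <- matrix(rnorm(100 * 6), 100, 6)

Xf_c <- scale(X_f, center = TRUE, scale = FALSE)
Xb_c <- scale(X_b, center = TRUE, scale = FALSE)
C_f <- crossprod(Xf_c) / (nrow(Xf_c) - 1)   # foreground covariance
C_b <- crossprod(Xb_c) / (nrow(Xb_c) - 1)   # background covariance

# Whitening matrix from the background: W = C_b^(-1/2)
eb <- eigen(C_b, symmetric = TRUE)
W  <- eb$vectors %*% diag(1 / sqrt(eb$values)) %*% t(eb$vectors)

# Ordinary PCA (via SVD) of the whitened foreground
pc <- svd(Xf_c %*% W, nu = 0, nv = 2)
v  <- W %*% pc$v                            # loadings mapped back to feature space
lambda <- pc$d[1:2]^2 / (nrow(Xf_c) - 1)    # contrastive eigenvalues

# The result satisfies the generalized eigenproblem C_f v = lambda C_b v
resid <- max(abs(C_f %*% v - C_b %*% v %*% diag(lambda)))
```

The residual check shows why whitening followed by PCA and the direct generalized eigenvalue route lead to the same directions when no shrinkage is applied.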

Value

A bi_projector object containing:

v: A (features x ncomp) matrix of eigenvectors (loadings).
s: A (samples x ncomp) matrix of scores, i.e., projections of X_f onto the eigenvectors.
sdev: A vector of length ncomp giving the square-root of the eigenvalues.
preproc: The pre-processing object used.

Examples

set.seed(123)
X_f <- matrix(rnorm(2000), nrow=100, ncol=20) # Foreground: 100 samples, 20 features
X_b <- matrix(rnorm(2000), nrow=100, ncol=20) # Background: same size
# Default method (geigen), small dimension scenario
res <- cPCA(X_f, X_b, ncomp=5)
plot(res$s[,1], res$s[,2], main="cPCA scores (component 1 vs 2)")

Two-way (cross) projection to latent components

Description

A projector that reduces two blocks of data, X and Y, yielding a pair of weights for each component. This structure can be used, for example, to store weights derived from canonical correlation analysis.

Usage

cross_projector(
  vx,
  vy,
  preproc_x = prep(pass()),
  preproc_y = prep(pass()),
  ...,
  classes = NULL
)

Arguments

`vx`	the X coefficients
`vy`	the Y coefficients
`preproc_x`	the X pre-processor
`preproc_y`	the Y pre-processor
`...`	extra parameters or results to store
`classes`	additional class names

Details

This class extends projector and therefore basic operations such as project, shape, reprocess, and coef work, but by default, it is assumed that the X block is primary. To access Y block operations, an additional argument source must be supplied to the relevant functions, e.g., coef(fit, source = "Y")

Value

a cross_projector object

Examples

# Create two scaled matrices X and Y
X <- scale(matrix(rnorm(10 * 5), 10, 5))
Y <- scale(matrix(rnorm(10 * 5), 10, 5))

# Perform canonical correlation analysis on X and Y
cres <- cancor(X, Y)
sx <- X %*% cres$xcoef
sy <- Y %*% cres$ycoef

# Create a cross_projector object using the canonical correlation analysis results
canfit <- cross_projector(cres$xcoef, cres$ycoef, cor = cres$cor,
                          sx = sx, sy = sy, classes = "cancor")

Construct a Discriminant Projector

Description

A discriminant_projector is an instance that extends bi_projector with a projection that maximizes class separation. This can be useful for dimensionality reduction techniques that take class labels into account, such as Linear Discriminant Analysis (LDA).

Usage

discriminant_projector(
  v,
  s,
  sdev,
  preproc = prep(pass()),
  labels,
  classes = NULL,
  ...
)

Arguments

`v`	A matrix of coefficients with dimensions `nrow(v)` by `ncol(v)` (number of columns = number of components)
`s`	The score matrix
`sdev`	The standard deviations of the score matrix
`preproc`	(optional) A pre-processing pipeline, default is prep(pass())
`labels`	A factor or character vector of class labels corresponding to the rows of the score matrix `s`.
`classes`	(optional) A character vector specifying the class attributes of the object, default is NULL
`...`	Extra arguments to be stored in the `projector` object.

Value

A discriminant_projector object.

Examples

# Simulate data and labels
set.seed(123)
X <- matrix(rnorm(100 * 10), 100, 10)
labels <- factor(rep(1:2, each = 50))

# Perform LDA and create a discriminant projector
lda_fit <- MASS::lda(X, labels)

dp <- discriminant_projector(lda_fit$scaling, X %*% lda_fit$scaling, sdev = lda_fit$svd, 
labels = labels)

Evaluate feature importance

Description

Calculate the importance of features in a model

Usage

feature_importance(x, ...)

Arguments

`x`	the model fit
`...`	extra args

Value

the feature importance scores

Evaluate Feature Importance

Description

Uses "marginal" or "standalone" approaches:

marginal: remove block and see change in accuracy
standalone: use only that block and measure accuracy

Usage

## S3 method for class 'classifier'
feature_importance(
  x,
  new_data,
  ncomp = NULL,
  blocks = NULL,
  metric = c("cosine", "euclidean", "ejaccard"),
  fun = rank_score,
  normalize_probs = FALSE,
  approach = c("marginal", "standalone"),
  ...
)

Arguments

`x`	classifier
`new_data`	new data
`ncomp`	...
`blocks`	a list of feature indices
`metric`	...
`fun`	a function to compute accuracy (default rank_score)
`normalize_probs`	logical
`approach`	"marginal" or "standalone"
`...`	args to projection

Value

a data.frame with block and importance

Get a fresh pre-processing node cleared of any cached data

Description

Get a fresh pre-processing node cleared of any cached data

Usage

fresh(x, ...)

Arguments

`x`	the processing pipeline
`...`	extra args

Value

a fresh pre-processing pipeline

Create a fresh pipeline from an existing prepper

Description

Recreates the pipeline structure without any learned parameters.

Usage

## S3 method for class 'prepper'
fresh(x, ...)

Generalized Eigenvalue Decomposition

Description

Computes the generalized eigenvalues and eigenvectors for the problem: A x = λ B x. Various methods are available and differ in their assumptions about A and B.

Usage

geneig(A, B, ncomp, method = c("robust", "sdiag", "geigen", "primme"), ...)

Arguments

`A`	The left-hand side square matrix.
`B`	The right-hand side square matrix, same dimension as A.
`ncomp`	Number of eigenpairs to return.
`method`	Method to compute the eigenvalues and eigenvectors. One of: "robust": uses a stable decomposition via a whitening transform (requires B to be symmetric positive-definite); "sdiag": uses a spectral decomposition of B and transforms the problem (works when B is symmetric positive-definite); "geigen": uses the `geigen` package for a general solution; "primme": uses the `PRIMME` package for large sparse matrices.
`...`	Additional arguments passed to the underlying methods.

Value

An object of class projector with eigenvalues stored in values and standard deviations in sdev = sqrt(values).
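
For symmetric A and symmetric positive-definite B, the problem A x = λ B x can be reduced to an ordinary symmetric eigenproblem through the inverse square root of B. The base R sketch below illustrates that reduction on the matrices from the package example; it is one reading of the spectral-decomposition idea, not the code behind any particular `method` value.

```r
A <- matrix(c(14, 10, 12, 10, 12, 13, 12, 13, 14), nrow = 3, byrow = TRUE)
B <- matrix(c(48, 17, 26, 17, 33, 32, 26, 32, 34), nrow = 3, byrow = TRUE)

# Spectral decomposition of B; the reduction needs all eigenvalues positive
eB <- eigen(B, symmetric = TRUE)
stopifnot(all(eB$values > 0))
B_inv_half <- eB$vectors %*% diag(1 / sqrt(eB$values)) %*% t(eB$vectors)

# Ordinary symmetric eigenproblem for B^(-1/2) A B^(-1/2)
M  <- B_inv_half %*% A %*% B_inv_half
eM <- eigen((M + t(M)) / 2, symmetric = TRUE)   # symmetrise against round-off

values  <- eM$values                    # generalized eigenvalues
vectors <- B_inv_half %*% eM$vectors    # generalized eigenvectors

# Check A x = lambda B x for every eigenpair
resid <- max(abs(A %*% vectors - B %*% vectors %*% diag(values)))
```

The eigenvectors obtained this way are B-orthonormal, that is, `t(vectors) %*% B %*% vectors` is the identity up to round-off.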

Examples

if (requireNamespace("geigen", quietly = TRUE)) {
  A <- matrix(c(14, 10, 12, 10, 12, 13, 12, 13, 14), nrow=3, byrow=TRUE)
  B <- matrix(c(48, 17, 26, 17, 33, 32, 26, 32, 34), nrow=3, byrow=TRUE)
  res <- geneig(A, B, ncomp=3, method="geigen")
  # res$values and coefficients(res)
}

Compute column-wise mean in X for each factor level of Y

Description

This function computes group means for each factor level of Y in the provided data matrix X.

Usage

group_means(Y, X)

Arguments

`Y`	a vector of labels to compute means over disjoint sets
`X`	a data matrix from which to compute means

Value

a matrix with row names corresponding to factor levels of Y and column-wise means for each factor level
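
For comparison, the same quantity can be computed in base R by summing rows within each group and dividing by the group sizes. This sketch only illustrates the expected shape of the result (one row per factor level, one column per variable) and is not the package's implementation.

```r
set.seed(5)
X <- matrix(rnorm(50), 10, 5)
Y <- factor(rep(1:2, each = 5))

# rowsum() adds up the rows of X within each level of Y;
# dividing by the group sizes turns the sums into means
gm <- rowsum(X, Y) / as.vector(table(Y))
```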

Examples

# Example data
X <- matrix(rnorm(50), 10, 5)
Y <- factor(rep(1:2, each = 5))

# Compute group means
gm <- group_means(Y, X)

Inverse of the Component Matrix

Description

Return the inverse projection matrix, which can be used to map back to data space. If the component matrix is orthogonal, then the inverse projection is the transpose of the component matrix.

Usage

inverse_projection(x, ...)

Arguments

`x`	The model fit.
`...`	Extra arguments.

Value

The inverse projection matrix.
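
The distinction between the orthogonal and the general case can be illustrated in base R. The sketch assumes a component matrix with full column rank and uses the Moore-Penrose pseudo-inverse for the non-orthogonal case; the package may compute the inverse differently.

```r
set.seed(9)
X <- matrix(rnorm(20 * 6), 20, 6)

# Orthonormal components: the inverse projection is simply the transpose
v <- svd(X, nu = 0, nv = 3)$v              # 6 x 3
inv_orth <- t(v)

# Non-orthogonal components spanning the same subspace
mix <- matrix(c(2, 1, 0, 0, 1, 0, 0, 0, 3), 3, 3)   # invertible mixing matrix
v2 <- v %*% mix
inv_gen <- solve(crossprod(v2), t(v2))     # (v2' v2)^(-1) v2', a left inverse of v2

# Mapping scores back to data space gives the same reconstruction either way
recon_orth <- (X %*% v)  %*% inv_orth
recon_gen  <- (X %*% v2) %*% inv_gen
```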

is it orthogonal

Description

test whether components are orthogonal

Usage

is_orthogonal(x)

Arguments

`x`	the object

Value

a logical value indicating whether the transformation is orthogonal

Create a Multiblock Bi-Projector

Description

Constructs a multiblock bi-projector using the given component matrix (v), score matrix (s), singular values (sdev), a preprocessing function, and a list of block indices. This allows for two-way mapping with multiblock data.

Usage

multiblock_biprojector(
  v,
  s,
  sdev,
  preproc = prep(pass()),
  ...,
  block_indices,
  classes = NULL
)

Arguments

`v`	A matrix of components (nrow = number of variables, ncol = number of components).
`s`	A matrix of scores (nrow = samples, ncol = components).
`sdev`	A numeric vector of singular values or standard deviations.
`preproc`	A pre-processing object (default: `prep(pass())`).
`...`	Extra arguments.
`block_indices`	A list of numeric vectors specifying data block variable indices.
`classes`	Additional class attributes (default NULL).

Value

A multiblock_biprojector object.

Create a Multiblock Projector

Description

Constructs a multiblock projector using the given component matrix (v), a preprocessing function, and a list of block indices. This allows for the projection of multiblock data, where each block represents a different set of variables or features.

Usage

multiblock_projector(
  v,
  preproc = prep(pass()),
  ...,
  block_indices,
  classes = NULL
)

Arguments

`v`	A matrix of components with dimensions `nrow(v)` by `ncol(v)` (columns = number of components).
`preproc`	A pre-processing function for the data (default: `prep(pass())`).
`...`	Extra arguments.
`block_indices`	A list of numeric vectors specifying the indices of each data block.
`classes`	(optional) A character vector specifying additional class attributes of the object, default is NULL.

Value

A multiblock_projector object.

Examples

# Generate some example data
X1 <- matrix(rnorm(10 * 5), 10, 5)
X2 <- matrix(rnorm(10 * 5), 10, 5)
X <- cbind(X1, X2)

# Compute PCA on the combined data
pc <- pca(X, ncomp = 8)

# Create a multiblock projector using PCA components and block indices
mb_proj <- multiblock_projector(pc$v, block_indices = list(1:5, 6:10))

# Project multiblock data using the multiblock projector
mb_scores <- project(mb_proj, X)

get the number of blocks

Description

The number of data blocks in a multiblock element

Usage

nblocks(x)

Arguments

`x`	the object

Value

the number of blocks

Get the number of components

Description

This function returns the total number of components in the fitted model.

Usage

ncomp(x)

Arguments

`x`	A fitted model object.

Value

The number of components in the fitted model.

Examples

# Example using the svd_wrapper function
data(iris)
X <- iris[, 1:4]
fit <- svd_wrapper(X, ncomp = 3, preproc = center(), method = "base")
ncomp(fit) # Should return 3

Nyström approximation for kernel-based decomposition (Unified Version)

Description

Approximate the eigen-decomposition of a large kernel matrix using either the standard Nyström method or the Double Nyström method.

Usage

nystrom_approx(
  X,
  kernel_func = NULL,
  ncomp = min(dim(X)),
  landmarks = NULL,
  nlandmarks = 10,
  preproc = pass(),
  method = c("standard", "double"),
  l = NULL,
  use_RSpectra = TRUE,
  ...
)

Arguments

`X`	A numeric matrix or data frame of size (N x D), where N is number of samples.
`kernel_func`	A kernel function with signature `kernel_func(X, Y, ...)`. If NULL, defaults to a linear kernel: X %*% t(Y).
`ncomp`	Number of components (eigenvectors/eigenvalues) to return.
`landmarks`	A vector of row indices (of X) specifying the landmark points. If NULL, `nlandmarks` points are sampled uniformly at random.
`nlandmarks`	The number of landmark points to sample if `landmarks` is NULL. Default is 10.
`preproc`	A pre-processing pipeline (default `pass()`) to apply before computing the kernel.
`method`	Either "standard" (the classic single-stage Nyström) or "double" (the two-stage Double Nyström method).
`l`	Intermediate rank for the double Nyström method. Ignored if `method="standard"`. Typically, `l < length(landmarks)` to reduce complexity.
`use_RSpectra`	Logical. If TRUE, use `RSpectra::svds` for partial SVD. Recommended for large problems.
`...`	Additional arguments passed to `kernel_func`.

Details

The Double Nyström method introduces an intermediate step that reduces the size of the decomposition problem, potentially improving efficiency and scalability.

Value

A bi_projector object with fields:

v: The eigenvectors (N x ncomp) approximating the kernel eigenbasis.
s: The scores (N x ncomp), computed as `v %*% diag(sdev)`, analogous to principal component scores.
sdev: The square roots of the eigenvalues.
preproc: The pre-processing pipeline used.

Examples

set.seed(123)
X <- matrix(rnorm(1000*1000), 1000, 1000)
# Standard Nyström
res_std <- nystrom_approx(X, ncomp=5, nlandmarks=20, method="standard")
# Double Nyström
res_db <- nystrom_approx(X, ncomp=5, nlandmarks=20, method="double", l=10)
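
The standard (single-stage) method described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: the linear kernel follows the documented default, while the names `kern`, `K_mm`, `K_nm`, the choice `k = D`, and the eigenvector scaling are choices made here for illustration only.

```r
set.seed(1)
N <- 200; D <- 5; m <- 40
k <- D                                   # components to keep (assumed for this sketch)
X <- matrix(rnorm(N * D), N, D)

# Linear kernel, matching the documented default X %*% t(Y)
kern <- function(A, B) A %*% t(B)

idx  <- sample(N, m)                     # landmark rows, sampled uniformly at random
K_mm <- kern(X[idx, , drop = FALSE], X[idx, , drop = FALSE])  # m x m landmark kernel
K_nm <- kern(X, X[idx, , drop = FALSE])                       # N x m cross kernel

# Eigen-decomposition of the small landmark kernel only
eg  <- eigen(K_mm, symmetric = TRUE)
lam <- eg$values[seq_len(k)]
U_m <- eg$vectors[, seq_len(k), drop = FALSE]

# Nystrom extension: approximate eigenpairs of the full N x N kernel
U_approx   <- sqrt(m / N) * K_nm %*% U_m %*% diag(1 / lam, k)
lam_approx <- (N / m) * lam

# Implied rank-k approximation of the full kernel
K_approx <- U_approx %*% diag(lam_approx, k) %*% t(U_approx)
```

Because a linear kernel on D columns has rank D, keeping k = D components reproduces the full kernel in this toy case; with nonlinear kernels or smaller k the result is only an approximation.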

Partial Inverse Projection of a Columnwise Subset of Component Matrix

Description

Compute the inverse projection of a columnwise subset of the component matrix (e.g., a sub-block). Even when the full component matrix is orthogonal, there is no guarantee that the partial component matrix is orthogonal.

Usage

partial_inverse_projection(x, colind, ...)

Arguments

`x`	A fitted model object, such as a `projector`, that has been fit to a dataset.
`colind`	A numeric vector specifying the column indices of the component matrix to consider for the partial inverse projection.
`...`	Additional arguments to be passed to the specific model implementation of `partial_inverse_projection`.

Value

A matrix representing the partial inverse projection.
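
The loss of orthogonality mentioned in the description can be checked directly in base R. The sketch below is illustrative only: it builds an orthonormal component matrix `v`, takes the rows in a hypothetical `colind`, and forms a Moore-Penrose pseudo-inverse of the sub-block via its SVD. Whether the package computes the partial inverse in exactly this way is an assumption.

```r
set.seed(7)
p <- 12; d <- 3
v <- qr.Q(qr(matrix(rnorm(p * d), p, d)))   # p x d with orthonormal columns

colind <- 1:6
v_sub  <- v[colind, , drop = FALSE]          # rows for the selected variables

full_gram <- crossprod(v)       # identity: the full matrix is orthonormal
sub_gram  <- crossprod(v_sub)   # generally not the identity

# Moore-Penrose pseudo-inverse of the sub-block via its SVD
sv         <- svd(v_sub)
v_sub_pinv <- sv$v %*% diag(1 / sv$d, d) %*% t(sv$u)   # d x length(colind)
```

Since `sub_gram` is not the identity, the transpose of `v_sub` is no longer an inverse, which is why a pseudo-inverse is used in this sketch.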

Partially project a new sample onto subspace

Description

Project a selected subset of column indices onto the subspace. This function allows for the projection of new data onto a lower-dimensional space using only a subset of the variables, as specified by the column indices.

Usage

partial_project(x, new_data, colind)

Arguments

`x`	The model fit, typically an object of class `bi_projector` or any other class that implements a `partial_project` method
`new_data`	A matrix or vector of new observations containing only the selected variables, so that its number of columns equals the length of `colind`. Rows represent observations and columns represent variables
`colind`	A numeric vector of column indices to select in the projection matrix. These indices correspond to the variables used for the partial projection

Value

A matrix or vector of the partially projected observations, where rows represent observations and columns represent the lower-dimensional space

Examples

# Example with the bi_projector class
X <- matrix(rnorm(10*20), 10, 20)
svdfit <- svd(X)
p <- bi_projector(svdfit$v, s = svdfit$u %*% diag(svdfit$d), sdev=svdfit$d)

# Partially project new_data onto the same subspace as the original data 
# using only the first 10 variables
new_data <- matrix(rnorm(5*20), 5, 20)
colind <- 1:10
partially_projected_data <- partial_project(p, new_data[,colind], colind)
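
One common way to project with only a subset of variables is a least-squares fit against the matching rows of the loading matrix. The base R sketch below shows that idea; it is an assumption about how a partial projection can be computed, not a description of the package's internals, and the names `v_sub` and `s_partial` are illustrative.

```r
set.seed(42)
p <- 20; d <- 3
v <- qr.Q(qr(matrix(rnorm(p * d), p, d)))   # orthonormal loadings, p x d

s_true   <- matrix(rnorm(5 * d), 5, d)      # known scores for 5 observations
new_data <- s_true %*% t(v)                 # 5 x p, lies exactly in the subspace

colind <- 1:10
v_sub  <- v[colind, , drop = FALSE]

# Least squares: minimise || new_data[, colind] - s %*% t(v_sub) || over s
s_partial <- new_data[, colind] %*% v_sub %*% solve(crossprod(v_sub))
```

When the new observations lie exactly in the subspace and `v_sub` has full column rank, the scores are recovered from the selected variables alone; with noisy data the result is a least-squares estimate.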

Partial Project Through a Composed Partial Projector

Description

Applies partial_project() through each projector in the composition. If colind is a single vector, it applies to the first projector only; subsequent projectors use all of their input columns. If colind is a list, each element specifies the colind for the corresponding projector in the chain.

Usage

## S3 method for class 'composed_partial_projector'
partial_project(x, new_data, colind, ...)

Arguments

`x`	A `composed_partial_projector` object.
`new_data`	The input data matrix or vector.
`colind`	A numeric vector or a list of numeric vectors. If a single vector, applies to the first projector. If a list, its length must match the number of projectors in `x`.
`...`	Additional arguments passed to `partial_project()` methods.

Value

The partially projected data after all projectors are applied.

Construct a partial projector

Description

Create a new projector instance restricted to a subset of input columns. This function allows for the generation of a new projection object that focuses only on the specified columns, enabling the projection of data using a limited set of variables.

Usage

partial_projector(x, colind, ...)

Arguments

`x`	The original `projector` instance, typically an object of class `bi_projector` or any other class that implements a `partial_projector` method
`colind`	A numeric vector of column indices to select in the projection matrix. These indices correspond to the variables used for the partial projector
`...`	Additional arguments passed to the underlying `partial_projector` method

Value

A new projector instance, with the same class as the original object, that is restricted to the specified subset of input columns

Examples

# Example with the bi_projector class
X <- matrix(rnorm(10*20), 10, 20)
svdfit <- svd(X)
p <- bi_projector(svdfit$v, s = svdfit$u %*% diag(svdfit$d), sdev=svdfit$d)

# Create a partial projector using only the first 10 variables
colind <- 1:10
partial_p <- partial_projector(p, colind)

construct a partial_projector from a `projector` instance

Description

construct a partial_projector from a projector instance

Usage

## S3 method for class 'projector'
partial_projector(x, colind, ...)

Arguments

`x`	The original `projector` instance, typically an object of class `bi_projector` or any other class that implements a `partial_projector` method
`colind`	A numeric vector of column indices to select in the projection matrix. These indices correspond to the variables used for the partial projector
`...`	Additional arguments passed to the underlying `partial_projector` method

Value

A partial_projector instance

Examples


# Assuming pfit is a projector with many components:
# pp <- partial_projector(pfit, 1:5)

a no-op pre-processing step

Description

`pass` is a no-op step: it passes data through the chain unchanged

Usage

pass(preproc = prepper())

Arguments

`preproc`	the pre-processing pipeline

Value

a prepper list

Principal Components Analysis (PCA)

Description

Compute the directions of maximal variance in a data matrix using the Singular Value Decomposition (SVD).

Usage

pca(
  X,
  ncomp = min(dim(X)),
  preproc = center(),
  method = c("fast", "base", "irlba", "propack", "rsvd", "svds"),
  ...
)

Arguments

`X`	The data matrix.
`ncomp`	The number of requested components to estimate (default is the minimum dimension of the data matrix).
`preproc`	The pre-processing function to apply to the data matrix (default is centering).
`method`	The SVD method to use, passed to `svd_wrapper` (default is "fast").
`...`	Extra arguments to send to `svd_wrapper`.

Value

A bi_projector object containing the PCA results.

Examples

data(iris)
X <- as.matrix(iris[, 1:4])
res <- pca(X, ncomp = 4)
tres <- truncate(res, 3)
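
For orientation, the SVD route to PCA named in the description can be written out in base R. This is a sketch rather than the package's code: the variable names are illustrative, and the `n - 1` scaling of `sdev` follows `stats::prcomp`; whether `pca()` uses the same scaling is an assumption.

```r
X  <- as.matrix(iris[, 1:4])
Xc <- sweep(X, 2, colMeans(X))        # centering, as with preproc = center()

sv     <- svd(Xc)
v      <- sv$v                         # loadings: directions of maximal variance
scores <- sv$u %*% diag(sv$d)          # component scores, equal to Xc %*% v
sdev   <- sv$d / sqrt(nrow(X) - 1)     # component standard deviations (prcomp convention)
```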

Permutation Confidence Intervals

Description

Estimate confidence intervals for model parameters using permutation testing.

Usage

perm_ci(x, X, nperm, ...)

Arguments

`x`	A model fit object.
`X`	The original data matrix used to fit the model.
`nperm`	The number of permutations to perform for the confidence interval estimation.
`...`	Additional arguments to be passed to the specific model implementation of `perm_ci`.

Value

A list containing the estimated lower and upper bounds of the confidence intervals for model parameters.

Permutation-Based Confidence Intervals for PCA Components

Description

Perform a permutation test to assess the significance of variance explained by PCA components.

Usage

## S3 method for class 'pca'
perm_ci(x, X, nperm = 100, k = 4, distr = "gamma", parallel = FALSE, ...)

Arguments

`x`	A PCA object from `pca()`.
`X`	The original data matrix used for PCA.
`nperm`	Number of permutations.
`k`	Number of components (beyond the first) to test. Default tests up to `min(Q-1, k)`.
`distr`	Distribution to fit to the permutation results ("gamma", "norm", or "empirical").
`parallel`	Logical, whether to use parallel processing for permutations.
`...`	Additional arguments passed to `fitdistrplus::fitdist` or parallelization.

Details

The function computes a statistic F_a for each component a, representing the fraction of variance explained relative to the remaining components. It then uses permutations of the preprocessed data to generate a null distribution. The first component uses the full data; subsequent components are tested by partialing out previously identified components and permuting the residuals.

By default, a gamma distribution is fit to the permuted values to derive CIs and p-values. If distr="empirical", it uses empirical quantiles instead.

Value

A list containing:

observed: The observed F_a values for tested components.
perm_values: A matrix of permuted F-values. Each column corresponds to a component.
fit: A list of fitted distribution objects, or NULL if `distr = "empirical"`.
ci: Computed confidence intervals for each component.
p: p-values for each component.
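
The permutation logic for the first component can be sketched in base R as follows. This is a simplified illustration using the empirical route only. The helper `frac_var`, the independent shuffling of each column, and the p-value formula are assumptions made here; the package's exact permutation scheme, its handling of later components, and its fitted-distribution intervals are not reproduced.

```r
set.seed(123)
X  <- as.matrix(iris[, 1:4])
Xc <- sweep(X, 2, colMeans(X))           # preprocessed (centered) data

# Fraction of variance carried by component a relative to components a, a+1, ...
frac_var <- function(M, a = 1) {
  ev <- svd(M)$d^2
  ev[a] / sum(ev[a:length(ev)])
}

F_obs <- frac_var(Xc, 1)

nperm  <- 100
F_perm <- replicate(nperm, {
  Xp <- apply(Xc, 2, sample)             # shuffle each column independently (assumed scheme)
  frac_var(Xp, 1)
})

# Empirical p-value and a 95% interval of the null distribution
p_val   <- (1 + sum(F_perm >= F_obs)) / (nperm + 1)
null_ci <- quantile(F_perm, c(0.025, 0.975))
```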

predict with a classifier object

Description

predict with a classifier object

Usage

## S3 method for class 'classifier'
predict(
  object,
  new_data,
  ncomp = NULL,
  colind = NULL,
  metric = c("euclidean", "cosine", "ejaccard"),
  normalize_probs = FALSE,
  ...
)

Arguments

`object`	classifier
`new_data`	new data
`ncomp`	number of components
`colind`	column indices
`metric`	similarity metric
`normalize_probs`	logical
`...`	extra args

Value

list with class and prob

prepare a dataset by applying a pre-processing pipeline

Description

prepare a dataset by applying a pre-processing pipeline

Usage

prep(x, ...)

Arguments

`x`	the pipeline
`...`	extra args

Value

the pre-processed data

finalize a prepper pipeline

Description

Prepares a pre-processing pipeline for application by creating init, transform, and reverse_transform functions.

Usage

## S3 method for class 'prepper'
prep(x, ...)

Compute principal angles for a set of subspaces

Description

This function calculates the principal angles between subspaces derived from a list of bi_projector instances.

Usage

prinang(fits)

Arguments

`fits`	a list of bi_projector instances

Value

a numeric vector of principal angles with length equal to the minimum dimension of input subspaces

Examples


data(iris)
X <- as.matrix(iris[, 1:4])
res <- pca(X, ncomp = 4)
fits_list <- list(res,res,res)
principal_angles <- prinang(fits_list)
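
For two subspaces, principal angles are commonly obtained from the singular values of the product of orthonormal bases. The base R sketch below shows that textbook construction with a hypothetical helper `principal_angles_sketch`; how `prinang()` combines more than two fits is not stated in the source and is not modelled here.

```r
# Principal angles between the column spaces of A and B
principal_angles_sketch <- function(A, B) {
  Qa <- qr.Q(qr(A))                      # orthonormal basis for span(A)
  Qb <- qr.Q(qr(B))                      # orthonormal basis for span(B)
  sv <- svd(crossprod(Qa, Qb))$d         # cosines of the principal angles
  acos(pmin(pmax(sv, -1), 1))            # clip to [-1, 1] before acos
}

A <- diag(3)[, c(1, 2)]                  # span(e1, e2)
B <- diag(3)[, c(1, 3)]                  # span(e1, e3)
ang <- principal_angles_sketch(A, B)
```

The two planes share the direction e1 (angle 0) and are otherwise perpendicular (angle pi/2), so the result has length 2, the smaller of the two subspace dimensions.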

Pretty Print S3 Method for bi_projector Class

Description

Pretty Print S3 Method for bi_projector Class

Usage

## S3 method for class 'bi_projector'
print(x, ...)

Arguments

`x`	A `bi_projector` object
`...`	Additional arguments passed to the print function

Value

Invisible bi_projector object

Pretty Print S3 Method for bi_projector_union Class

Description

Pretty Print S3 Method for bi_projector_union Class

Usage

## S3 method for class 'bi_projector_union'
print(x, ...)

Arguments

`x`	A `bi_projector_union` object
`...`	Additional arguments passed to the print function

Value

Invisible bi_projector_union object

Pretty Print Method for `classifier` Objects

Description

Display a human-readable summary of a classifier object.

Usage

## S3 method for class 'classifier'
print(x, ...)

Arguments

`x`	A `classifier` object.
`...`	Additional arguments.

Value

classifier object.

Print a concat_pre_processor object

Description

Print a concat_pre_processor object

Usage

## S3 method for class 'concat_pre_processor'
print(x, ...)

Arguments

`x`	A `concat_pre_processor` object.
`...`	Additional arguments (ignored).

Pretty Print Method for `multiblock_biprojector` Objects

Description

Display a summary of a multiblock_biprojector object.

Usage

## S3 method for class 'multiblock_biprojector'
print(x, ...)

Arguments

`x`	A `multiblock_biprojector` object.
`...`	Additional arguments passed to `print()`.

Value

Invisible multiblock_biprojector object.

Print a pre_processor object

Description

Display information about a pre_processor using crayon-based formatting.

Usage

## S3 method for class 'pre_processor'
print(x, ...)

Arguments

`x`	A `pre_processor` object.
`...`	Additional arguments (ignored).

Print a prepper pipeline

Description

Uses crayon to produce a colorful and readable representation of the pipeline steps.

Usage

## S3 method for class 'prepper'
print(x, ...)

Arguments

`x`	A `prepper` object.
`...`	Additional arguments (ignored).

Pretty Print Method for `projector` Objects

Description

Display a human-readable summary of a projector object using crayon formatting, including information about the dimensions of the projection matrix and the pre-processing pipeline.

Usage

## S3 method for class 'projector'
print(x, ...)

Arguments

`x`	A `projector` object.
`...`	Additional arguments passed to `print()`.

Examples

X <- matrix(rnorm(10*10), 10, 10)
svdfit <- svd(X)
p <- projector(svdfit$v)
print(p)

Pretty Print Method for `regress` Objects

Description

Display a human-readable summary of a regress object using crayon formatting, including information about the method and dimensions.

Usage

## S3 method for class 'regress'
print(x, ...)

Arguments

`x`	A `regress` object (a bi_projector with regression info).
`...`	Additional arguments passed to `print()`.

New sample projection

Description

Project one or more samples onto a subspace. This function takes a model fit and new observations, and projects them onto the subspace defined by the model. This allows for the transformation of new data into the same lower-dimensional space as the original data.

Usage

project(x, new_data, ...)

Arguments

`x`	The model fit, typically an object of class bi_projector or any other class that implements a project method
`new_data`	A matrix or vector of new observations with the same number of columns as the original data. Rows represent observations and columns represent variables
`...`	Extra arguments to be passed to the specific project method for the object's class

Value

A matrix or vector of the projected observations, where rows represent observations and columns represent the lower-dimensional space

Examples

# Example with the bi_projector class
X <- matrix(rnorm(10*20), 10, 20)
svdfit <- svd(X)
p <- bi_projector(svdfit$v, s = svdfit$u %*% diag(svdfit$d), sdev=svdfit$d)

# Project new_data onto the same subspace as the original data
new_data <- matrix(rnorm(5*20), 5, 20)
projected_data <- project(p, new_data)

Project a single "block" of data onto the subspace

Description

When observations are concatenated into "blocks", it may be useful to project one block from the set. This function facilitates the projection of a specific block of data onto a subspace. It is a convenience method for multi-block fits and is equivalent to a "partial projection" where the column indices are associated with a given block.

Usage

project_block(x, new_data, block, ...)

Arguments

`x`	The model fit, typically an object of a class that implements a `project_block` method
`new_data`	A matrix or vector of new observation(s) with the same number of columns as the original data
`block`	An integer representing the block ID to select in the block projection matrix. This ID corresponds to the specific block of data to be projected
`...`	Additional arguments passed to the underlying `project_block` method

Value

A matrix or vector of the projected data for the specified block

Project Data onto a Specific Block

Description

Projects the new data onto the subspace defined by a specific block of variables.

Usage

## S3 method for class 'multiblock_projector'
project_block(x, new_data, block, ...)

Arguments

`x`	A `multiblock_projector` object.
`new_data`	The new data to be projected.
`block`	The block index (1-based) to project onto.
`...`	Additional arguments passed to `partial_project`.

Value

The projected scores for the specified block.

Project one or more variables onto a subspace

Description

This function projects one or more variables onto a subspace. It is often called supplementary variable projection and can be computed for a biorthogonal decomposition, such as Singular Value Decomposition (SVD).

Usage

project_vars(x, new_data, ...)

Arguments

`x`	The model fit, typically an object of a class that implements a `project_vars` method
`new_data`	A matrix or vector of new observation(s) with the same number of rows as the original data
`...`	Additional arguments passed to the underlying `project_vars` method

Value

A matrix or vector of the projected variables in the subspace
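
For an SVD X = U D V', one standard formula for supplementary variable projection maps a new column z to t(z) U D^{-1}. The base R sketch below uses that formula with a hypothetical helper `project_vars_sketch`; the scaling convention used by the package's `project_vars` methods may differ and is an assumption here.

```r
set.seed(4)
X  <- matrix(rnorm(30 * 6), 30, 6)
sv <- svd(X)                                  # X = U D V'

# Coordinates of variables (columns of Z) in component space: t(Z) %*% U %*% D^{-1}
project_vars_sketch <- function(Z, sv) {
  crossprod(Z, sv$u) %*% diag(1 / sv$d, length(sv$d))
}

# A supplementary variable measured on the same 30 rows
z_new  <- X[, 1] + rnorm(30, sd = 0.1)
coords <- project_vars_sketch(matrix(z_new, ncol = 1), sv)   # 1 x 6

# Sanity check: the original variables map back onto the loadings V
V_back <- project_vars_sketch(X, sv)
```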

project a cross_projector instance

Description

project a cross_projector instance

Usage

## S3 method for class 'cross_projector'
project(x, new_data, source = c("X", "Y"), ...)

Arguments

`x`	The model fit, typically an object of class bi_projector or any other class that implements a project method
`new_data`	A matrix or vector of new observations with the same number of columns as the original data. Rows represent observations and columns represent variables
`source`	the source of the data (X or Y block)
`...`	Extra arguments to be passed to the specific project method for the object's class

Value

the projected data

Construct a `projector` instance

Description

A projector maps a matrix from an N-dimensional space to a d-dimensional space, where d may be less than N. The projection matrix, v, is not necessarily orthogonal. This function constructs a projector instance that can be used for dimensionality reduction techniques such as PCA or LDA.

Usage

projector(v, preproc = prep(pass()), ..., classes = NULL)

Arguments

`v`	A matrix of coefficients with dimensions `nrow(v)` by `ncol(v)` (number of columns = number of components)
`preproc`	A prepped pre-processing object. Default is the no-processing `pass()` preprocessor.
`...`	Extra arguments to be stored in the `projector` object.
`classes`	Additional class information used for creating subtypes of `projector`. Default is NULL.

Value

An instance of type projector.

Examples

X <- matrix(rnorm(10*10), 10, 10)
svdfit <- svd(X)
p <- projector(svdfit$v)
proj <- project(p, X)

Calculate Rank Score for Predictions

Description

Calculate Rank Score for Predictions

Usage

rank_score(prob, observed)

Arguments

`prob`	matrix of predicted probabilities (observations x classes)
`observed`	vector of observed class labels

Value

data.frame with prank and observed

Reconstruct the data

Description

Reconstruct a data set from its (possibly) low-rank representation. This can be useful when analyzing the impact of dimensionality reduction or when visualizing approximations of the original data.

Usage

reconstruct(x, comp, rowind, colind, ...)

Arguments

`x`	The model fit, typically an object of a class that implements a `reconstruct` method
`comp`	A vector of component indices to use in the reconstruction
`rowind`	The row indices to reconstruct (optional). If not provided, all rows are used.
`colind`	The column indices to reconstruct (optional). If not provided, all columns are used.
`...`	Additional arguments passed to the underlying `reconstruct` method

Value

A reconstructed data set based on the selected components, rows, and columns
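
The roles of `comp`, `rowind`, and `colind` can be illustrated with a base R sketch of reconstruction from an SVD-based fit. The helper `reconstruct_sketch` is hypothetical and only mirrors the documented argument names; it assumes centering as the only pre-processing step.

```r
set.seed(3)
X  <- matrix(rnorm(30 * 8), 30, 8)
mu <- colMeans(X)
sv <- svd(sweep(X, 2, mu))
v  <- sv$v                        # loadings
s  <- sv$u %*% diag(sv$d)         # scores

# Hypothetical helper mirroring the documented arguments
reconstruct_sketch <- function(s, v, mu, comp = seq_len(ncol(v)),
                               rowind = seq_len(nrow(s)),
                               colind = seq_len(nrow(v))) {
  Xhat <- s[rowind, comp, drop = FALSE] %*% t(v[colind, comp, drop = FALSE])
  sweep(Xhat, 2, mu[colind], "+")  # undo the centering
}

X_full <- reconstruct_sketch(s, v, mu)                    # all components: exact
X_low  <- reconstruct_sketch(s, v, mu, comp = 1:3)        # rank-3 approximation
X_part <- reconstruct_sketch(s, v, mu, comp = 1:3, rowind = 1:5, colind = c(2, 4))
```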

refit a model

Description

refit a model given new data or new parameter(s)

Usage

refit(x, new_data, ...)

Arguments

`x`	the original model fit object
`new_data`	the new data to process
`...`	extra args

Value

a refit model object

Multi-output linear regression

Description

Fit a multivariate regression model for a matrix of basis functions, X, and a response matrix Y. The goal is to find a projection matrix that can be used for mapping and reconstruction.

Usage

regress(
  X,
  Y,
  preproc = NULL,
  method = c("lm", "enet", "mridge", "pls"),
  intercept = FALSE,
  lambda = 0.001,
  alpha = 0,
  ncomp = ceiling(ncol(X)/2),
  ...
)

Arguments

`X`	the set of independent (basis) variables
`Y`	the response matrix
`preproc`	the pre-processor (currently unused)
`method`	the regression method: `lm`, `enet`, `mridge`, or `pls`
`intercept`	whether to include an intercept term
`lambda`	ridge shrinkage parameter (for methods `mridge` and `enet`)
`alpha`	the elastic net mixing parameter if method is `enet`
`ncomp`	number of PLS components if method is `pls`
`...`	extra arguments sent to the underlying fitting function

Value

a bi-projector of type regress

Examples

# Generate synthetic data
Y <- matrix(rnorm(100 * 10), 10, 100)
X <- matrix(rnorm(10 * 9), 10, 9)
# Fit regression models and reconstruct the response matrix
r_lm <- regress(X, Y, intercept = FALSE, method = "lm")
recon_lm <- reconstruct(r_lm)
r_mridge <- regress(X, Y, intercept = TRUE, method = "mridge", lambda = 0.001)
recon_mridge <- reconstruct(r_mridge)
r_enet <- regress(X, Y, intercept = TRUE, method = "enet", lambda = 0.001, alpha = 0.5)
recon_enet <- reconstruct(r_enet)
r_pls <- regress(X, Y, intercept = TRUE, method = "pls", ncomp = 5)
recon_pls <- reconstruct(r_pls)
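
As background for the `lm` and ridge-type options, the closed-form multi-output solutions can be written in base R. This is a sketch of the textbook estimators without an intercept; the names `B_ols` and `B_ridge` are illustrative, and whether the `mridge` method uses exactly this penalty form is an assumption.

```r
set.seed(5)
n <- 50; p <- 4; q <- 3
X <- matrix(rnorm(n * p), n, p)
Y <- X %*% matrix(rnorm(p * q), p, q) + 0.1 * matrix(rnorm(n * q), n, q)

# Ordinary least squares for all q responses at once (no intercept)
B_ols <- solve(crossprod(X), crossprod(X, Y))          # p x q coefficient matrix

# Ridge variant with shrinkage parameter lambda
lambda  <- 0.5
B_ridge <- solve(crossprod(X) + lambda * diag(p), crossprod(X, Y))

Y_hat <- X %*% B_ols                                   # reconstructed responses
```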

Relative Eigenanalysis with Ecosystem Integration

Description

Perform a relative eigenanalysis between two groups, fully integrated with the pre-processing and projector ecosystem. The function computes the directions that maximize the variance ratio between two groups and returns a bi_projector object.

Usage

relative_eigen(
  XA,
  XB,
  ncomp = NULL,
  preproc = center(),
  reg_param = 1e-05,
  threshold = 2000,
  ...
)

Arguments

`XA`	A numeric matrix or data frame of observations for group A (n_A x p).
`XB`	A numeric matrix or data frame of observations for group B (n_B x p).
`ncomp`	The number of components to compute. If NULL (default), computes up to `min(n_A, n_B, p) - 1`.
`preproc`	A pre-processing pipeline created with `prepper()`. Defaults to `center()`.
`reg_param`	A small regularization parameter to ensure numerical stability. Defaults to 1e-5.
`threshold`	An integer specifying the dimension threshold to switch between direct and iterative solvers. Defaults to 2000.
`...`	Additional arguments passed to lower-level functions.

Details

This function computes the leading eigenvalues and eigenvectors of the generalized eigenvalue problem $\Sigma_A v = \lambda \Sigma_B v$. It uses a direct solver when the number of variables $p$ is less than or equal to `threshold`, and switches to an iterative method when $p$ is greater than `threshold`.

Value

A bi_projector object containing the components, scores, and other relevant information.

Examples

# Simulate data for two groups
set.seed(123)
n_A <- 100
n_B <- 80
p <- 500  # Number of variables
XA <- matrix(rnorm(n_A * p), nrow = n_A, ncol = p)
XB <- matrix(rnorm(n_B * p), nrow = n_B, ncol = p)
# Perform relative eigenanalysis
res <- relative_eigen(XA, XB, ncomp = 5)
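
For small p, the generalized eigenvalue problem from the Details section can be solved directly in base R by reducing it to a symmetric problem with a Cholesky factor. The sketch below is one such direct approach and is not the package's solver; adding `reg` to the diagonal of the group B covariance is an assumed placement of the regularization.

```r
set.seed(123)
n_A <- 100; n_B <- 80; p <- 5
XA <- matrix(rnorm(n_A * p), n_A, p)
XB <- matrix(rnorm(n_B * p), n_B, p)

reg <- 1e-5
SA <- cov(XA)                          # cov() centers each group
SB <- cov(XB) + reg * diag(p)          # small ridge for numerical stability (assumed placement)

R    <- chol(SB)                       # SB = t(R) %*% R
Rinv <- solve(R)
M    <- t(Rinv) %*% SA %*% Rinv        # symmetric matrix with the same eigenvalues
eg   <- eigen((M + t(M)) / 2, symmetric = TRUE)

lambda <- eg$values                    # variance ratios, largest first
V      <- Rinv %*% eg$vectors          # generalized eigenvectors
```

Each column of `V` satisfies SA v = lambda SB v up to rounding error, which can be used as a quick check on any solver for this problem.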

apply pre-processing parameters to a new data matrix

Description

Given a new dataset, process it in the same way the original data was processed (e.g. centering or scaling)

Usage

reprocess(x, new_data, colind, ...)

Arguments

`x`	the model fit object
`new_data`	the new data to process
`colind`	the column indices of the new data
`...`	extra args

Value

the reprocessed data
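
The idea of reusing stored pre-processing parameters, including for a column subset, can be sketched in base R with centering as the only step. The helpers `reprocess_sketch` and `reverse_sketch` are hypothetical stand-ins and do not reflect the package's internal representation of a pipeline.

```r
set.seed(9)
train <- matrix(rnorm(20 * 6, mean = 5), 20, 6)
mu    <- colMeans(train)                     # parameters learned from the original data

# Hypothetical helpers: apply / undo centering, optionally for a column subset
reprocess_sketch <- function(new_data, mu, colind = seq_along(mu)) {
  sweep(new_data, 2, mu[colind], "-")
}
reverse_sketch <- function(Z, mu, colind = seq_along(mu)) {
  sweep(Z, 2, mu[colind], "+")
}

new_data <- matrix(rnorm(3 * 6, mean = 5), 3, 6)
colind   <- c(2, 5)
Z    <- reprocess_sketch(new_data[, colind], mu, colind)  # uses the means of columns 2 and 5
back <- reverse_sketch(Z, mu, colind)                     # recovers the original values
```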

reprocess a cross_projector instance

Description

reprocess a cross_projector instance

Usage

## S3 method for class 'cross_projector'
reprocess(x, new_data, colind = NULL, source = c("X", "Y"), ...)

Arguments

`x`	the model fit object
`new_data`	the new data to process
`colind`	the column indices of the new data
`source`	the source of the data (X or Y block)
`...`	extra args

Value

the re(pre-)processed data

Compute a regression model for each column in a matrix and return residual matrix

Description

Compute a regression model for each column in a matrix and return residual matrix

Usage

residualize(form, X, design, intercept = FALSE)

Arguments

`form`	the formula defining the model to fit for residuals
`X`	the response matrix
`design`	the `data.frame` containing the design variables specified in `form` argument.
`intercept`	add an intercept term (default is FALSE)

Value

a matrix of residuals

Examples


X <- matrix(rnorm(20*10), 20, 10)
des <- data.frame(a=rep(letters[1:4], 5), b=factor(rep(1:5, each=4)))
xresid <- residualize(~ a+b, X, design=des)

## design is saturated, so residuals should be zero up to floating-point error
xresid2 <- residualize(~ a*b, X, design=des)
all(abs(xresid2) < 1e-8)
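
The same column-wise regression can be sketched in base R using `model.matrix` and a QR decomposition. The helper `residualize_sketch` is a hypothetical stand-in, not the package function; in particular, `model.matrix` adds an intercept column by default, which may differ from the package's handling of `intercept = FALSE`.

```r
set.seed(2)
X   <- matrix(rnorm(20 * 10), 20, 10)
des <- data.frame(a = rep(letters[1:4], 5), b = factor(rep(1:5, each = 4)))

# Hypothetical stand-in: regress every column of X on the design, keep residuals
residualize_sketch <- function(form, X, design) {
  mm <- model.matrix(form, data = design)   # note: includes an intercept column
  qr.resid(qr(mm), X)
}

res_add <- residualize_sketch(~ a + b, X, des)
res_sat <- residualize_sketch(~ a * b, X, des)   # one observation per cell
```

The additive residuals are orthogonal to every design column, and the saturated model leaves residuals that are zero up to rounding.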

Obtain residuals of a component model fit

Description

Calculate the residuals of a model after removing the effect of the first ncomp components. This function is useful to assess the quality of the fit or to identify patterns that are not captured by the model.

Usage

residuals(x, ncomp, xorig, ...)

Arguments

`x`	The model fit object.
`ncomp`	The number of components to factor out before calculating residuals.
`xorig`	The original data matrix (X) used to fit the model.
`...`	Additional arguments passed to the method.

Value

A matrix of residuals, with the same dimensions as the original data matrix.
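
For an SVD-based fit, removing the first `ncomp` components amounts to subtracting the rank-`ncomp` approximation. The base R sketch below works on centered data; whether the package reports residuals in the pre-processed or the original space is not stated in the source, so that choice is an assumption.

```r
set.seed(11)
X  <- matrix(rnorm(40 * 6), 40, 6)
Xc <- sweep(X, 2, colMeans(X))
sv <- svd(Xc)

ncomp <- 2
keep  <- seq_len(ncomp)

# Part of the data explained by the first ncomp components
fitted_part <- sv$u[, keep, drop = FALSE] %*% diag(sv$d[keep], ncomp) %*%
  t(sv$v[, keep, drop = FALSE])
res <- Xc - fitted_part                     # same dimensions as the data

# Residual sum of squares equals the sum of the discarded squared singular values
rss <- sum(res^2)
```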

reverse a pre-processing transform

Description

reverse a pre-processing transform

Usage

reverse_transform(x, X, colind, ...)

Arguments

`x`	the pre_processor
`X`	the data matrix
`colind`	column indices
`...`	extra args

Value

the reverse-transformed data

construct a random forest wrapper classifier

Description

Given a model object (e.g. a `projector`), construct a random forest classifier that can generate predictions for new data points.

Usage

rf_classifier(x, colind, ...)

Arguments

`x`	the model object
`colind`	the (optional) column indices used for prediction
`...`	extra arguments to `randomForest` function

Value

a random forest classifier

Create a random forest classifier

Description

Uses randomForest to train a random forest on the provided scores and labels.

Usage

## S3 method for class 'projector'
rf_classifier(x, colind = NULL, labels, scores, ...)

Arguments

`x`	a projector object
`colind`	optional col indices
`labels`	class labels
`scores`	reference scores
`...`	passed to `randomForest`

Value

a rf_classifier object with rfres (rf model), labels, scores

Rotate a Component Solution

Description

Perform a rotation of the component loadings to improve interpretability.

Usage

rotate(x, ncomp, type)

Arguments

`x`	The model fit, typically a result from a dimensionality reduction method like PCA.
`ncomp`	The number of components to rotate.
`type`	The type of rotation to apply (e.g., "varimax", "quartimax", "promax").

Value

A modified model fit with the rotated components.

Rotate PCA Loadings

Description

Apply a specified rotation to the component loadings of a PCA model. This function leverages the GPArotation package to apply orthogonal or oblique rotations.

Usage

## S3 method for class 'pca'
rotate(
  x,
  ncomp,
  type = c("varimax", "quartimax", "promax"),
  loadings_type = c("pattern", "structure"),
  score_method = c("auto", "recompute", "original"),
  ...
)

Arguments

`x`	A PCA model object, typically created using the `pca()` function.
`ncomp`	The number of components to rotate. Must be <= ncomp(x).
`type`	The type of rotation to apply. Supported rotation types: "varimax" (orthogonal Varimax rotation), "quartimax" (orthogonal Quartimax rotation), "promax" (oblique Promax rotation).
`...`	Additional arguments passed to GPArotation functions.

Value

A modified PCA object with class rotated_pca and additional fields:

`v`	Rotated loadings
`s`	Rotated scores
`sdev`	Updated standard deviations of rotated components
`explained_variance`	Proportion of explained variance for each rotated component
`rotation`	A list with rotation details: type, R (orth) or Phi (oblique), and loadings_type

Examples

# Perform PCA on iris dataset
data(iris)
X <- as.matrix(iris[,1:4])
res <- pca(X, ncomp=4)

# Apply varimax rotation to the first 3 components
rotated_res <- rotate(res, ncomp=3, type="varimax")
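
For readers who want to see what an orthogonal rotation does to loadings and scores, here is a base-R sketch. It is an assumption-laden stand-in: it uses `stats::prcomp()` and `stats::varimax()` rather than this package's `pca()` and `rotate()` (which, per the description above, rely on GPArotation), so the numbers need not match the package output.

```r
# Base-R sketch of a varimax rotation on PCA loadings.
# stats::varimax is swapped in for the package's rotation machinery.
X <- as.matrix(iris[, 1:4])
pc <- prcomp(X, center = TRUE, scale. = FALSE)

ncomp <- 3
L <- pc$rotation[, 1:ncomp]           # unrotated loadings (4 x 3)
vm <- varimax(L, normalize = FALSE)   # orthogonal varimax rotation

# Rotated loadings equal L %*% vm$rotmat when normalize = FALSE
rotated_loadings <- unclass(vm$loadings)

# For an orthogonal rotation, scores are rotated by the same matrix
rotated_scores <- pc$x[, 1:ncomp] %*% vm$rotmat
```

Because the rotation matrix is orthogonal, the rank-3 reconstruction `rotated_scores %*% t(rotated_loadings)` is unchanged by the rotation; only the way variance is distributed across the components changes. This does not carry over directly to oblique rotations such as promax.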

Retrieve the component scores

Description

Extract the factor score matrix from a fitted model. The factor scores represent the projections of the data onto the components, which can be used for further analysis or visualization.

Usage

scores(x, ...)

Arguments

`x`	The model fit object.
`...`	Additional arguments passed to the method.

Value

A matrix of factor scores, with rows corresponding to samples and columns to components.

standard deviations

Description

The standard deviations of the projected data matrix

Usage

sdev(x)

Arguments

`x`	the model fit

Value

the standard deviations
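
As a rough illustration of what "standard deviations of the projected data" means, the base-R sketch below assumes a column-centered SVD and the usual `n - 1` normalization. Whether the package's `sdev()` uses exactly this normalization is not stated here, so treat it as an assumption.

```r
# Base-R sketch: standard deviations of data projected onto SVD components.
X <- as.matrix(iris[, 1:4])
Xc <- scale(X, center = TRUE, scale = FALSE)
sv <- svd(Xc)

proj <- Xc %*% sv$v                    # projected data (scores)
sdev_proj <- apply(proj, 2, sd)        # column-wise standard deviations

# Equivalent closed form from the singular values
sdev_svd <- sv$d / sqrt(nrow(X) - 1)
```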

Shape of the Projector

Description

Get the input/output shape of the projector.

Usage

shape(x, ...)

Arguments

`x`	The model fit.
`...`	Extra arguments.

Details

This function retrieves the dimensions of the sample loadings matrix v in the form of a vector with two elements. The first element is the number of rows in the v matrix, and the second element is the number of columns.

Value

A vector containing the dimensions of the sample loadings matrix v (number of rows and columns).

shape of a cross_projector instance

Description

shape of a cross_projector instance

Usage

## S3 method for class 'cross_projector'
shape(x, source = c("X", "Y"), ...)

Arguments

`x`	The model fit.
`source`	the source of the data (X or Y block)
`...`	Extra arguments.

Value

the shape of the data

center and scale each vector of a matrix

Description

center and scale each vector of a matrix

Usage

standardize(preproc = prepper(), cmeans = NULL, sds = NULL)

Arguments

`preproc`	the pre-processing pipeline
`cmeans`	an optional vector of column means
`sds`	an optional vector of sds

Value

a prepper list

Compute standardized component scores

Description

Calculate standardized factor scores from a fitted model. Standardized scores are useful for comparing the contributions of different components on the same scale, which can help in interpreting the results.

Usage

std_scores(x, ...)

Arguments

`x`	The model fit object.
`...`	Additional arguments passed to the method.

Value

A matrix of standardized factor scores, with rows corresponding to samples and columns to components.
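
One common way to standardize component scores is to divide each score column by its standard deviation, so that every component has unit variance. The base-R sketch below shows that definition using `stats::prcomp()`; it is an assumption that the package's `std_scores()` follows the same convention.

```r
# Base-R sketch: standardized scores as raw scores divided by
# the per-component standard deviation.
X <- as.matrix(iris[, 1:4])
pc <- prcomp(X, center = TRUE, scale. = FALSE)

raw_scores <- pc$x                              # samples x components
std <- sweep(raw_scores, 2, pc$sdev, "/")       # unit-variance columns
```

After this scaling, all columns of `std` share the same scale, which is what makes components directly comparable.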

Singular Value Decomposition (SVD) Wrapper

Description

Computes the singular value decomposition of a matrix using one of the specified methods. It is designed to be an easy-to-use wrapper for various SVD methods available in R.

Usage

svd_wrapper(
  X,
  ncomp = min(dim(X)),
  preproc = pass(),
  method = c("fast", "base", "irlba", "propack", "rsvd", "svds"),
  q = 2,
  p = 10,
  tol = .Machine$double.eps,
  ...
)

Arguments

`X`	the input matrix
`ncomp`	the number of components to estimate (default: min(dim(X)))
`preproc`	the pre-processor to apply on the input matrix (e.g., `center()`, `standardize()`, `pass()`)
`method`	the SVD method to use: 'base', 'fast', 'irlba', 'propack', 'rsvd', or 'svds'
`q`	parameter passed to method `rsvd` (default: 2)
`p`	parameter passed to method `rsvd` (default: 10)
`tol`	minimum eigenvalue magnitude; components below this threshold are dropped (default: .Machine$double.eps)
`...`	extra arguments passed to the selected SVD function

Value

an SVD object that extends projector

Examples

# Load iris dataset and select the first four columns
data(iris)
X <- as.matrix(iris[, 1:4])

# Compute SVD using the base method and 3 components
fit <- svd_wrapper(X, ncomp = 3, preproc = center(), method = "base")
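
The roles of `ncomp`, the centering pre-processor, and `tol` can be sketched with base R's `svd()`. This is only an illustration of the idea: it does not call `svd_wrapper()`, and it applies the threshold to the singular values, which is an assumption about how `tol` is interpreted.

```r
# Base-R sketch: centered, truncated SVD with a tolerance filter.
X <- as.matrix(iris[, 1:4])
Xc <- sweep(X, 2, colMeans(X), "-")   # centering step

ncomp <- 3
tol <- .Machine$double.eps

sv <- svd(Xc)

# Keep components above the tolerance, then at most `ncomp` of them
keep <- which(sv$d > tol)
keep <- keep[seq_len(min(ncomp, length(keep)))]

u <- sv$u[, keep, drop = FALSE]   # left singular vectors
d <- sv$d[keep]                   # singular values
v <- sv$v[, keep, drop = FALSE]   # right singular vectors (loadings)
```

With the retained pieces, `Xc %*% v` gives the component scores, and `u %*% diag(d, nrow = length(d)) %*% t(v)` gives the low-rank approximation of the centered data.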

Transpose a model

Description

This function transposes a model by switching coefficients and scores. It is useful when you want to reverse the roles of samples and variables in a model, especially in the context of dimensionality reduction methods.

Usage

transpose(x, ...)

Arguments

`x`	The model fit, typically an object of a class that implements a `transpose` method
`...`	Additional arguments passed to the underlying `transpose` method

Value

A transposed model with coefficients and scores switched

truncate a component fit

Description

take the first `ncomp` components of a decomposition

Usage

truncate(x, ncomp)

Arguments

`x`	the object to truncate
`ncomp`	number of components to retain

Value

a truncated object (e.g. PCA with 'ncomp' components)
