--- title: "Getting started with backend-aware matrices in amatrix" output: rmarkdown::html_vignette: toc: yes toc_depth: 2 css: albers.css includes: in_header: albers-header.html params: family: red preset: homage resource_files: - albers.css - albers.js - albers-header.html vignette: > %\VignetteIndexEntry{Getting started with backend-aware matrices in amatrix} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r setup, include = FALSE} if (requireNamespace("ggplot2", quietly = TRUE)) { ggplot2::theme_set(ggplot2::theme_minimal()) } knitr::opts_chunk$set( collapse = TRUE, comment = "#>", message = FALSE, warning = FALSE ) ``` ```{r albers-classes, echo=FALSE, results='asis'} cat(sprintf( paste0( '' ), params$family, params$preset )) ``` ```{r load-amatrix} library(amatrix) ``` `amatrix` is for matrix-heavy code that already fits the `Matrix` idiom but needs a cleaner path to accelerated execution. You keep ordinary matrix inputs, wrap them once, and then use Matrix-compatible objects that carry backend preferences into linear algebra and model fitting. This vignette uses one common workload: one design matrix, many response columns. By the end, you will have a backend-aware design matrix, a shared-design QR fit from `many_lm()`, and a compact way to inspect where `amatrix` plans to run each operation. ## What do your inputs look like? ```{r example-data, include = FALSE} set.seed(42) n <- 120L p <- 6L q <- 8L X <- scale( matrix(rnorm(n * p), nrow = n, ncol = p), center = TRUE, scale = FALSE ) colnames(X) <- paste0("feature_", seq_len(p)) beta <- matrix(rnorm(p * q, sd = 0.7), nrow = p, ncol = q) noise <- matrix(rnorm(n * q, sd = 0.2), nrow = n, ncol = q) Y <- X %*% beta + noise colnames(Y) <- paste0("response_", seq_len(q)) ``` ```{r data-shape} c( observations = nrow(X), predictors = ncol(X), responses = ncol(Y) ) ``` The design matrix `X` is an ordinary dense R matrix with one row per observation and one column per predictor. The response matrix `Y` has the same number of rows, but each column is a separate regression target that shares the same design. ## What is the quickest end-to-end path? ```{r first-fit} X_am <- adgeMatrix(X) fit <- many_lm(X_am, Y, method = "qr", include_fitted = TRUE, include_residuals = TRUE) dim(coef(fit)) ``` `adgeMatrix()` is the main dense-matrix constructor. `many_lm()` then fits all response columns against the same design in one call and returns a coefficient matrix with one column per response. ```{r first-fit-checks, include = FALSE} qr_ref <- qr.coef(qr(X), Y) coef_err <- max(abs(as.matrix(coef(fit)) - qr_ref)) r2 <- 1 - fit$rss / colSums(Y^2) stopifnot(is.finite(coef_err), coef_err < 1e-8) stopifnot(all(is.finite(fit$rss)), all(is.finite(fit$sigma2))) stopifnot(all(fit$rss > 0), all(fit$sigma2 > 0)) stopifnot(all(is.finite(r2)), all(r2 > 0.85), all(r2 < 1)) ``` ## How does `amatrix` decide where to run? ```{r backend-status} amatrix_backend_status()[, c("name", "available", "precision_modes", "residency_capable")] ``` The status table tells you which backends are registered on the current machine and whether they are usable right now. A backend can be registered but unavailable, in which case the same code still has a predictable CPU fallback. ```{r backend-plan} plan <- amatrix_backend_plan(X_am, "qr") plan[c("chosen", "chosen_path", "requested_precision")] ``` This plan is the compact view of a dispatch decision. You can see which backend was chosen, whether the operation will run as a cold or resident path, and which precision contract the object is requesting. ```{r backend-plan-checks, include = FALSE} status <- amatrix_backend_status() stopifnot(plan$chosen %in% status$name) stopifnot(plan$requested_precision %in% c("strict", "fast")) ``` ## When does shared-design caching help? The main reason to use `amatrix` for repeated regression is that the design-matrix factorization can be reused across calls when `X` stays the same. That matters when you fit many batches of responses against one shared design. ```{r cache-demo} X_cache_am <- adgeMatrix(X) fit_a <- many_lm(X_cache_am, Y[, 1:4, drop = FALSE], method = "qr", cache = TRUE) fit_b <- many_lm(X_cache_am, Y[, 5:8, drop = FALSE], method = "qr", cache = TRUE) c(first_call = fit_a$cache_reused, second_call = fit_b$cache_reused) ``` With a fresh backend-aware copy of `X`, the first call computes and stores the QR factorization for that design. The second call reuses it because the design matrix is unchanged. ```{r cache-demo-checks, include = FALSE} coef_a_ref <- qr.coef(qr(X), Y[, 1:4, drop = FALSE]) coef_b_ref <- qr.coef(qr(X), Y[, 5:8, drop = FALSE]) stopifnot(isFALSE(fit_a$cache_reused)) stopifnot(isTRUE(fit_b$cache_reused)) stopifnot(max(abs(as.matrix(coef(fit_a)) - coef_a_ref)) < 1e-8) stopifnot(max(abs(as.matrix(coef(fit_b)) - coef_b_ref)) < 1e-8) ``` ## What should you inspect after a fit? ```{r fit-summary} summary_tbl <- data.frame( response = colnames(Y), rss = round(fit$rss, 3), sigma2 = round(fit$sigma2, 4), r2 = round(r2, 3) ) knitr::kable(summary_tbl, align = "lrrr") ``` `rss` and `sigma2` tell you how much unexplained variation remains in each response. Here the fitted models recover most of the signal because the responses were generated from a shared linear design with modest noise. ```{r fitted-response-plot, echo = FALSE, fig.height = 4, fig.width = 5} plot( as.matrix(fitted(fit))[, 1], Y[, 1], xlab = "Fitted values", ylab = "Observed response_1", main = "Shared-design fit for one response", pch = 19, col = "#1B5E20" ) abline(0, 1, lty = 2, col = "grey40") ``` The points fall close to the identity line, which matches the high `r2` values from the hidden checks and the small residual variances in the summary table. ## How do you ask for a fast backend? The CPU path above is the safest default because it runs everywhere. On a machine with an available accelerator backend, the code shape stays the same and you only change the constructor metadata: ```{r fast-backend-example, eval = FALSE} X_fast <- adgeMatrix(X, mode = "fast") fit_fast <- many_lm(X_fast, Y, method = "qr", cache = TRUE) coef(fit_fast) ``` If you are writing library code, the default path is still to wrap once at the boundary and keep the rest of the code generic: ```{r fast-library-path, eval = FALSE} X_am <- as_adgeMatrix(X, mode = "fast") fit <- many_lm(X_am, Y, method = "qr", cache = TRUE) ``` That per-object path does not require session-global setters or a hardcoded backend name. `amatrix` will prefer an available fast-capable accelerator automatically and fall back to CPU when none is available. If a caller wants to flip defaults for one local block instead of one object, use `with_amatrix()`: ```{r with-amatrix-example, eval = FALSE} with_amatrix(policy = "auto", precision = "fast", { X_am <- as_adgeMatrix(X) fit <- many_lm(X_am, Y, method = "qr", cache = TRUE) }) ``` Use `amatrix_bind_resident()` only when you need to optimize a particularly hot repeated path. ## Where next? The next functions to explore are `adgeMatrix()`, `many_lm()`, `amatrix_backend_status()`, and `amatrix_explain()`. If you want to audit a single operation more deeply, start with `amatrix_backend_plan()` and then compare that compact plan to the printed explanation from `amatrix_explain()`.