---
title: "Getting Started with delarr"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Getting Started with delarr}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  message = FALSE,
  warning = FALSE
)
```

```{r load-delarr}
library(delarr)
```

## What problem does `delarr` solve?

`delarr` lets you write matrix pipelines as if everything were already in
memory while deferring the actual work until `collect()`. That matters when the
source matrix lives on disk, when you want to avoid intermediate allocations,
or when you need to stream the result directly into another backend.

This vignette covers one small lazy pipeline, one streaming write to HDF5, and
one custom backend. For chunk planning, profiling, and optional shared-memory
workers, see `vignette("advanced", package = "delarr")`.

## What does a lazy pipeline look like?

```{r build-lazy-pipeline}
set.seed(1)
mat <- matrix(
  rnorm(24),
  nrow = 6,
  ncol = 4,
  dimnames = list(paste0("sample_", 1:6), paste0("feature_", 1:4))
)

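# Each verb records a step; nothing is computed until collect().
lazy_mean <- delarr(mat) |>
  d_center(dim = "rows") |>      # subtract each row's mean
  d_map(~ .x * 0.5) |>           # scale every element
  d_reduce(mean, dim = "rows")   # one mean per row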

lazy_mean
```

Nothing has been materialized yet. The object is still a `delarr`, and the work
is only a recorded plan.
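
To make "recorded plan" concrete, here is a minimal base-R sketch of deferred evaluation. It is a conceptual illustration only: `toy_lazy()`, `toy_then()`, and `toy_collect()` are hypothetical names, and nothing here is meant to describe how `delarr` is implemented internally.

```{r sketch-deferred-plan}
# Conceptual sketch only: a "lazy" object is a source plus a list of recorded
# steps. None of these names come from delarr.
toy_lazy <- function(x) list(source = x, steps = list())

toy_then <- function(lazy, f) {
  lazy$steps <- c(lazy$steps, list(f))  # record the step, do not run it
  lazy
}

toy_collect <- function(lazy) {
  # Apply the recorded steps in order, starting from the source.
  Reduce(function(value, f) f(value), lazy$steps, lazy$source)
}

toy_source <- matrix(1:6, nrow = 2)
toy_plan <- toy_lazy(toy_source) |>
  toy_then(function(m) m - rowMeans(m)) |>
  toy_then(function(m) m * 0.5)

length(toy_plan$steps)   # steps are recorded, nothing has been computed
toy_collect(toy_plan)    # the work happens here
```

Building `toy_plan` is cheap regardless of the size of the source, because only the functions are stored. The cost is paid once, when the plan is collected.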

```{r collect-lazy-pipeline}
row_summary <- collect(lazy_mean, chunk_size = 2L)
row_summary
stopifnot(all(abs(row_summary) < 1e-10))
```

After row-centering, every row has mean zero, and scaling by 0.5 preserves
that. The `stopifnot()` turns the claim into a check that fails if the pipeline
ever stops producing zero row means.
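
The zero-mean property can be cross-checked in base R without any lazy machinery. The sketch below uses its own small matrix (`check_mat`, an illustrative name) so that it stands alone. It mirrors the arithmetic of the pipeline, not `delarr`'s execution.

```{r sketch-row-centering-check}
check_mat <- matrix(c(2, 4, 6, 8, 1, 3, 5, 7, 9, 11, 13, 15), nrow = 3)

centered <- check_mat - rowMeans(check_mat)  # subtract each row's mean
scaled <- centered * 0.5                     # same scaling as in the pipeline

# The mean is linear, so halving centered rows keeps their means at zero.
max(abs(rowMeans(scaled)))
stopifnot(all(abs(rowMeans(scaled)) < 1e-10))
```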

## How do row and column vectors broadcast?

Arithmetic with scalars and vectors stays lazy too. `delarr` infers the
broadcast direction from the vector's length: a vector of length `nrow(x)`
contributes one value per row, and a vector of length `ncol(x)` contributes one
value per column.

```{r broadcast-vectors}
row_bias <- c(-1, 0, 1, 2, 3, 4)
col_scale <- c(1, 0.5, 2, 1.5)

broadcasted <- collect((delarr(mat) + row_bias) * col_scale, chunk_size = 2L)
expected <- sweep(sweep(mat, 1L, row_bias, "+"), 2L, col_scale, "*")

stopifnot(isTRUE(all.equal(broadcasted, expected)))
broadcasted[1:3, , drop = FALSE]
```
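
The length-based rule can be sketched in base R with `sweep()`. The helper `toy_broadcast()` below is hypothetical and is not part of `delarr`. In particular, it simply stops when a vector's length matches both dimensions (a square matrix). How `delarr` itself treats that case is not covered in this vignette.

```{r sketch-broadcast-by-length}
# Hypothetical helper, not part of delarr: pick the sweep margin from length.
toy_broadcast <- function(m, v, op) {
  if (length(v) == 1L) return(op(m, v))  # scalars apply to every element
  by_row <- length(v) == nrow(m)
  by_col <- length(v) == ncol(m)
  if (by_row && by_col) stop("ambiguous: vector length matches both dimensions")
  if (!by_row && !by_col) stop("vector length matches neither dimension")
  sweep(m, if (by_row) 1L else 2L, v, op)
}

toy_m <- matrix(1:6, nrow = 2)          # 2 rows, 3 columns
toy_broadcast(toy_m, c(10, 20), `+`)    # length 2: one value per row
toy_broadcast(toy_m, c(1, 2, 3), `*`)   # length 3: one value per column
```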

## How do you stream a result to HDF5?

`delarr_hdf5()` reads a dataset lazily, and `hdf5_writer()` lets you stream the
transformed result back to disk without materializing the full output matrix in
R.

```{r stream-hdf5}
tf_in <- tempfile(fileext = ".h5")
tf_out <- tempfile(fileext = ".h5")

input <- matrix(runif(30), 5, 6)
write_hdf5(input, tf_in, "X")

X <- delarr_hdf5(tf_in, "X")
writer <- hdf5_writer(tf_out, "X_z", ncol = ncol(X), chunk = c(5L, 3L))

collect(X |> d_zscore(dim = "cols"), into = writer, chunk_size = 3L)

z <- read_hdf5(tf_out, "X_z")
round(colMeans(z), 6)
stopifnot(all(abs(colMeans(z)) < 1e-8))

unlink(c(tf_in, tf_out))
```
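
The idea behind streaming into a writer can be sketched in base R. The sketch assumes, purely for illustration, that the work is split into column blocks and that the "writer" is a closure over a preallocated matrix. `toy_writer()` is a hypothetical stand-in, and neither assumption is a statement about how `delarr` or `hdf5_writer()` work internally.

```{r sketch-chunked-write}
# Conceptual sketch: the "writer" is a closure over a preallocated matrix.
# toy_writer() and the column-block chunking are assumptions for illustration.
toy_writer <- function(nrow, ncol) {
  store <- matrix(NA_real_, nrow, ncol)
  list(
    write = function(block, cols) store[, cols] <<- block,
    result = function() store
  )
}

toy_input <- matrix(c(1, 2, 4, 3, 5, 10, 0, 1, 5, 2, 2, 8), nrow = 3)
toy_w <- toy_writer(nrow(toy_input), ncol(toy_input))

block_starts <- seq(1L, ncol(toy_input), by = 2L)  # column blocks of width 2
for (start in block_starts) {
  cols <- start:min(start + 1L, ncol(toy_input))
  block <- scale(toy_input[, cols, drop = FALSE])  # z-score within the block
  toy_w$write(block, cols)                         # hand the block to the writer
}

# Block-wise output matches z-scoring the whole matrix at once.
stopifnot(isTRUE(all.equal(toy_w$result(), scale(toy_input),
                           check.attributes = FALSE)))
```

A column z-score depends only on values in that column, so each column block can be finished and written independently. Only one transformed block needs to exist in memory at a time.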

## How do you wrap your own storage layer?

A custom backend only needs matrix dimensions and a `pull` function that can
return arbitrary row and column slices. In the example, a `NULL` index means
all rows or all columns. Here the backing store is just another matrix, but the
same pattern works for databases, APIs, or memory-mapped files.

```{r custom-backend}
source_mat <- matrix(
  seq_len(60),
  nrow = 10,
  ncol = 6,
  dimnames = list(paste0("row_", 1:10), paste0("col_", 1:6))
)

seed <- delarr_seed(
  nrow = nrow(source_mat),
  ncol = ncol(source_mat),
  pull = function(rows = NULL, cols = NULL) {
    if (is.null(rows)) rows <- seq_len(nrow(source_mat))
    if (is.null(cols)) cols <- seq_len(ncol(source_mat))
    source_mat[rows, cols, drop = FALSE]
  },
  dimnames = dimnames(source_mat)
)

custom_result <- delarr(seed)[1:4, 2:5] |>
  d_map(~ .x^2) |>
  collect(chunk_size = 2L)

stopifnot(isTRUE(all.equal(custom_result, source_mat[1:4, 2:5]^2)))
custom_result
```
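
The backing store does not have to exist as a matrix at all. As a sketch of that idea, the hypothetical `virtual_pull()` below computes entries on demand, with entry `(i, j)` defined as `i * 10 + j`. The names `virtual_nrow`, `virtual_ncol`, and `virtual_pull` are illustrative, and the function is exercised on its own here, outside of `delarr`.

```{r sketch-virtual-pull}
# Hypothetical on-demand source: entry (i, j) is i * 10 + j, computed when
# requested. No full matrix is ever stored.
virtual_nrow <- 1000L
virtual_ncol <- 8L

virtual_pull <- function(rows = NULL, cols = NULL) {
  if (is.null(rows)) rows <- seq_len(virtual_nrow)
  if (is.null(cols)) cols <- seq_len(virtual_ncol)
  outer(rows * 10, cols, `+`)  # always a matrix, even for a single row
}

virtual_pull(rows = 1:3, cols = c(2L, 5L))
```

A function with this shape could be supplied as the `pull` argument of `delarr_seed()`, together with `nrow = virtual_nrow` and `ncol = virtual_ncol`, following the call shown above.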

## Where should you go next?

Use `?collect` when you want to control chunk size or stream into a writer,
`?delarr_seed` when you need a custom backend, and
`vignette("advanced", package = "delarr")` for execution plans, streamed
multi-reducer summaries, block-wise workflows, delayed matrix products, and
optional shared-memory workers.