Getting Started with citestR

Overview

citestR is a lightweight R client for the citest Python package. It lets you run the conditional-independence-of-missingness test without using reticulate at runtime: all communication happens over a local HTTP connection to a FastAPI server that wraps the Python package.
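
Because the data travel over HTTP rather than through reticulate, a data frame has to be serialized before it reaches the server. The wire format citestR actually uses is not described in this guide, so the following is only a sketch of one plausible encoding, assuming the jsonlite package and a column-oriented JSON layout. The name df_small is made up for the illustration.

```r
library(jsonlite)

# A tiny data frame with one missing value (illustrative only).
df_small <- data.frame(Y = c(0.1, 0.2, 0.3), X2 = c(1.5, NA, 2.5))

# Column-oriented JSON; NA is written as null so the receiver can
# still tell which cells are missing. digits = NA keeps full precision.
payload <- toJSON(df_small, dataframe = "columns", na = "null", digits = NA)
payload

# Round trip: null is read back as NA.
back <- fromJSON(payload)
is.na(back$X2)
```

The point of the sketch is that missing values need an explicit representation in the request body; whatever format the real client uses, NA must survive the trip to Python.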

1. Install the Python backend

If you don’t already have a Python environment with citest installed, citestR provides a helper that sets one up:

library(citestR)

# Creates a virtualenv called "citest_env" and installs the citest API backend
install_backend(method = "pip")

You only need to do this once.
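
Since the helper creates a virtualenv, it presumably needs a Python interpreter to start from. This guide does not say how install_backend() locates Python, so the check below is just a base-R sanity test you could run beforehand, on the assumption that the interpreter is expected on the PATH.

```r
# Look for a Python interpreter on the PATH.
# Sys.which() returns an empty string for commands it cannot find.
py <- Sys.which(c("python3", "python"))
py

# TRUE if at least one of the two names resolved to an executable.
python_found <- any(nzchar(py))
python_found
```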

2. Run a test

library(citestR)

# Example data frame with some missing values
set.seed(42)
n <- 500
df <- data.frame(
  Y  = rnorm(n),
  X1 = rnorm(n),
  X2 = rnorm(n),
  X3 = rnorm(n)
)
# Introduce MAR missingness on X2
df$X2[df$X1 > 0.5] <- NA

# Run the CI test (server starts automatically)
result <- ci_test(
  data       = df,
  y          = "Y",
  imputer    = "iterative",
  classifier = "rf",
  m          = 5L,
  n_folds    = 5L
)

result$results

The first call starts the Python server in the background; subsequent calls reuse the running process.
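
Before running the test it can help to confirm that the missingness in the example data looks the way you intended. The following base-R check rebuilds the same data frame and shows that X2 is missing exactly where X1 exceeds 0.5, which is what makes the mechanism depend only on an observed variable. The name miss_by_x1 is introduced here for illustration.

```r
# Rebuild the example data from step 2.
set.seed(42)
n <- 500
df <- data.frame(Y = rnorm(n), X1 = rnorm(n), X2 = rnorm(n), X3 = rnorm(n))
df$X2[df$X1 > 0.5] <- NA

# Share of missing values in each column (only X2 should be non-zero).
colMeans(is.na(df))

# Missingness rate of X2 on each side of the X1 threshold.
miss_by_x1 <- tapply(is.na(df$X2), df$X1 > 0.5, mean)
miss_by_x1
```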

3. Retrieve a summary

summary_info <- get_summary(result$model_id)
summary_info

4. Imputer diagnostics

r2 <- imputer_r2(result$model_id, mask_frac = 0.2, m_eval = 1L)
r2$mean_r2
r2$per_variable
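
Judging only by the argument names, mask_frac suggests that a fraction of the observed cells is hidden, imputed, and then compared with the true values. That reading is an assumption, not something this guide states. The sketch below illustrates the general idea of a masked R² in base R, with a plain linear regression standing in for the imputer; it is not the package's implementation.

```r
set.seed(1)
n <- 500
z <- rnorm(n)
x <- 2 * z + rnorm(n, sd = 0.5)   # x is well predicted by z

# Hide 20% of the observed x values (the role mask_frac plausibly plays).
mask_frac <- 0.2
masked <- sample(n, size = round(mask_frac * n))
x_obs <- x
x_obs[masked] <- NA

# Stand-in imputer: linear regression fitted on the unmasked rows.
fit <- lm(x_obs ~ z)              # lm() drops rows with NA by default
x_hat <- predict(fit, newdata = data.frame(z = z[masked]))

# R^2 on the held-out cells: 1 - SSE / SST.
sse <- sum((x[masked] - x_hat)^2)
sst <- sum((x[masked] - mean(x[masked]))^2)
r2_masked <- 1 - sse / sst
r2_masked
```

A value close to 1 means the hidden cells were recovered well; a value near 0 means the imputer did no better than predicting the mean.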

5. Sensitivity calibration

# Single kappa value
compute_kappa(r2_x_z = 0.5, beta_yx = 0.3, gamma_x = 0.2)

# Full calibration table
cal <- kappa_calibration_table()
head(cal)

# Pivot for a fixed beta
calibration_pivot(beta_yx = 0.3)

6. Simulated datasets

sim <- simulate_data("single_mar", n = 300, ci = TRUE)
sim$dataset_id
sim$pct_missing

7. Stopping the server

The server shuts down automatically when the R session ends. To stop it manually:

stop_server()
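
The lifecycle described in this guide (start on first use, reuse afterwards, stop manually or at session end) is a common pattern for R packages that manage a background process. How citestR implements it is not shown here; the sketch below is a hypothetical base-R version in which a fake launcher replaces the real Python process, and every name in it is invented for the example.

```r
# Hypothetical lifecycle sketch; none of these names come from citestR.
.state <- new.env(parent = emptyenv())
.state$handle <- NULL
.state$starts <- 0L

# Stand-in for launching a real background process.
launch_fake_server <- function() {
  .state$starts <- .state$starts + 1L
  list(pid = Sys.getpid(), started_at = Sys.time())
}

# Start on first use, reuse the cached handle afterwards.
ensure_server_sketch <- function() {
  if (is.null(.state$handle)) {
    .state$handle <- launch_fake_server()
  }
  invisible(.state$handle)
}

# Manual shutdown: drop the cached handle.
stop_server_sketch <- function() {
  .state$handle <- NULL
  invisible(TRUE)
}

# Ask R to run the same cleanup when the session exits.
reg.finalizer(.state, function(env) env$handle <- NULL, onexit = TRUE)

ensure_server_sketch()
ensure_server_sketch()
.state$starts            # the launcher ran only once
stop_server_sketch()
is.null(.state$handle)   # the handle is gone after a manual stop
```

Keeping the handle in a package-level environment is what lets repeated calls reuse one process, and reg.finalizer() with onexit = TRUE is one base-R way to hook cleanup into session shutdown.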