Evaluate Out-of-Sample Genomic Prediction

Description

Computes prediction accuracy metrics for test individuals from their predicted GEBV and observed phenotypes. Designed to complement predict.masreml() for out-of-sample evaluation, but accepts GEBV from any source (masreml, masbayes, or other models).

Usage

evaluate_prediction(gebv, y, h2 = NULL, tbv = NULL, fitted_prob = NULL)

Arguments

`gebv`	numeric vector of predicted GEBV for test individuals
`y`	numeric vector of observed phenotypes for test individuals
`h2`	numeric, heritability estimate from the fitted model, used to compute r_MG. Typically `fit$varcomp$h2` (single component) or one element of it for a specific component, e.g. `fit$varcomp$h2["g"]`. If NULL, r_MG is returned as NA.
`tbv`	numeric vector of true breeding values (simulation only). If provided, `r_test_g = cor(gebv, tbv)` is computed directly without requiring h2. If NULL, r_test_g is NA.
`fitted_prob`	numeric vector of fitted probabilities P(y=1) for a binary trait, typically `pred$fitted` from `predict.masreml()`. If NULL, AUC is returned as NA.

Value

data.frame with columns:

r_test_y: predictive ability. Continuous: cor(gebv, y). Binary: cor(fitted_prob, y) on the observed (probability) scale.
r_test_g: cor(gebv, tbv), the accuracy against true breeding values on the genetic-value scale (uses gebv for both continuous and binary traits). NA if tbv is NULL.
bias: regression slope. Continuous: slope of lm(y ~ gebv); binary: slope of lm(y ~ fitted_prob), i.e. the calibration slope. In both cases 1.0 indicates unbiased / well-calibrated predictions, <1 indicates over-dispersed predictions, and >1 indicates under-dispersed predictions.
r_MG: cor(gebv, y) / sqrt(h2), the heritability-adjusted accuracy on the GEBV scale. Uses gebv rather than fitted_prob even for binary traits, so that the correlation is on the same scale as h2. NA if h2 is NULL.
AUC: area under the ROC curve, computed from fitted_prob. AUC depends only on the ranks of the predictions, so it is unaffected by the monotone inverse-link transformation. NA if fitted_prob is NULL.
RMSE: root mean squared error. Continuous: sqrt(mean((gebv - y)^2)); binary: sqrt(mean((fitted_prob - y)^2)), i.e. the square root of the Brier score.
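
The continuous-trait formulas listed above can be reproduced in base R. The following is a minimal sketch on simulated data, not the package's implementation: the sample size, the heritability of 0.4, and the way `gebv` is generated are all illustrative assumptions, and no masreml function is called.

```r
# Simulated test set; every value below is an illustrative assumption.
set.seed(1)
n    <- 500
h2   <- 0.4                                  # assumed heritability
tbv  <- rnorm(n, sd = sqrt(h2))              # true breeding values
y    <- tbv + rnorm(n, sd = sqrt(1 - h2))    # phenotype = TBV + residual
gebv <- 0.6 * tbv + rnorm(n, sd = 0.3)       # noisy stand-in for model predictions

r_test_y <- cor(gebv, y)                     # predictive ability
r_test_g <- cor(gebv, tbv)                   # accuracy against true breeding values
bias     <- unname(coef(lm(y ~ gebv))[2])    # slope of y regressed on gebv
r_MG     <- r_test_y / sqrt(h2)              # heritability-adjusted accuracy
RMSE     <- sqrt(mean((gebv - y)^2))         # root mean squared error

metrics <- data.frame(r_test_y, r_test_g, bias, r_MG, RMSE)
print(metrics)
```

Because r_MG divides by sqrt(h2), it is always larger in magnitude than r_test_y when h2 < 1, and it can exceed 1 when h2 is underestimated.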

Examples

library("masreml")

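# Assumes `fit` is a fitted masreml model and that G_full, train_ids,
# test_ids, y_test and tbv_test already exist in the workspace.
pred <- predict(fit, G_full = list(g = G_full),
                train_ids = train_ids, test_ids = test_ids)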

# Continuous trait
evaluate_prediction(
  gebv = pred$total_gebv,
  y    = y_test,
  h2   = fit$varcomp$h2["g"],
  tbv  = tbv_test
)

# Binary trait
evaluate_prediction(
  gebv        = pred$total_gebv,
  y           = y_test,
  h2          = fit$varcomp$h2["g"],
  fitted_prob = pred$fitted
)
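
The binary-trait quantities can also be checked by hand without a fitted model. The sketch below is illustrative only: `auc_rank` is a hypothetical helper (not part of masreml) that computes AUC from the Mann-Whitney rank-sum identity, and the logistic inverse link and simulated data are assumptions made for the example.

```r
# Hypothetical helper, not part of masreml:
# AUC from the rank-sum (Mann-Whitney) identity.
auc_rank <- function(score, y) {
  n1 <- sum(y == 1)
  n0 <- sum(y == 0)
  r  <- rank(score)                          # average ranks handle ties
  (sum(r[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

# Simulated binary test set; the logistic link is an assumption.
set.seed(2)
n           <- 400
eta         <- rnorm(n)                      # predictor on the link scale
fitted_prob <- plogis(eta)                   # inverse logit -> P(y = 1)
y           <- rbinom(n, 1, fitted_prob)

# AUC is identical on the link scale and the probability scale,
# because plogis() is strictly increasing and preserves ranks.
auc_prob <- auc_rank(fitted_prob, y)
auc_eta  <- auc_rank(eta, y)
print(all.equal(auc_prob, auc_eta))

# RMSE on the probability scale is the square root of the Brier score.
brier <- mean((fitted_prob - y)^2)
rmse  <- sqrt(brier)

# Calibration slope: regression of the 0/1 outcome on fitted_prob.
calib_slope <- unname(coef(lm(y ~ fitted_prob))[2])
print(c(AUC = auc_prob, RMSE = rmse, bias = calib_slope))
```

The rank-based form makes the invariance explicit: any strictly increasing transformation of the scores leaves `rank(score)` unchanged, and therefore the AUC as well.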