Cross-Validation for Genomic Prediction Models
Description
Performs k-fold or leave-one-out (LOO) cross-validation to estimate prediction accuracy of genomic models fitted with masreml(). Individuals are split into training and validation sets; the model is trained on each training set and used to predict the held-out validation individuals.
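The train/predict cycle described above can be sketched in base R. This is only an illustration of k-fold cross-validation in general: lm() stands in for the genomic model, the data are simulated, and none of it reflects how cv_masreml() is implemented internally.

```r
# Minimal k-fold cross-validation sketch in base R.
# Assumption: lm() is a stand-in model, NOT the model fitted by masreml().
set.seed(1)
n <- 100
x <- rnorm(n)
y <- 2 * x + rnorm(n)
dat <- data.frame(y = y, x = x)

k <- 5L
# Randomly assign each individual to one of k folds of near-equal size
fold_id <- sample(rep(seq_len(k), length.out = n))

pred <- rep(NA_real_, n)
for (f in seq_len(k)) {
  val <- fold_id == f
  fit <- lm(y ~ x, data = dat[!val, ])             # train on the other k - 1 folds
  pred[val] <- predict(fit, newdata = dat[val, ])  # predict the held-out individuals
}

# Correlation between predicted and observed values within each fold
acc_fold <- vapply(
  seq_len(k),
  function(f) cor(pred[fold_id == f], y[fold_id == f]),
  numeric(1)
)
mean(acc_fold)
```

Each individual is predicted exactly once, by a model that never saw its phenotype, which is what makes the resulting accuracy an out-of-sample estimate.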
Usage
cv_masreml(
  y,
  X = NULL,
  markers = NULL,
  G = NULL,
  folds = 5L,
  scheme = "random",
  method = "auto",
  solver = "auto",
  max_iter = 100L,
  tol = 1e-08,
  n_threads = NULL,
  seed = NULL
)
Arguments
| Argument | Description |
|---|---|
| y | numeric vector of phenotypes (length n) |
| X | fixed effects matrix (n x c). NULL = intercept only |
| markers | list of raw marker inputs (see masreml) |
| G | list of pre-built G matrices (see masreml) |
| folds | integer, number of CV folds (default 5). Use folds = length(y) for leave-one-out (LOO). Larger values give less biased but more variable estimates. |
| scheme | character, fold assignment scheme (default "random") |
| method | character, REML method (see masreml) |
| solver | character, EBV solver (see masreml) |
| max_iter | integer, max REML iterations |
| tol | numeric, convergence tolerance |
| n_threads | integer, number of threads |
| seed | integer, random seed for reproducibility |
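To make the interplay of folds and seed concrete, here is a small sketch of random fold assignment. The helper assign_folds() is hypothetical and written only for this illustration; it is not part of the package, and the package's own "random" scheme may differ in detail.

```r
# Hypothetical helper (not a package function): random fold assignment.
assign_folds <- function(n, folds = 5L, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  # Repeat fold labels 1..folds up to length n, then shuffle them
  sample(rep(seq_len(folds), length.out = n))
}

y <- rnorm(23)

a <- assign_folds(length(y), folds = 5L, seed = 42L)
b <- assign_folds(length(y), folds = 5L, seed = 42L)
identical(a, b)   # the same seed reproduces the same assignment
table(a)          # fold sizes differ by at most one individual

# Leave-one-out: one fold per individual
loo <- assign_folds(length(y), folds = length(y))
all(table(loo) == 1)
```

Setting a seed matters most when comparing models: using the same fold assignment for each candidate model removes partition-to-partition noise from the comparison.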
Value
Object of class “masreml_cv” with elements:
- accuracy: mean prediction accuracy (r_MG)
- accuracy_fold: accuracy per fold
- bias: mean regression slope (bias check)
- bias_fold: slope per fold
- gebv_all: GEBV for all individuals
- fold_assignments: fold ID per individual
- folds: number of folds used
- scheme: fold scheme used
- call: matched call
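The per-fold and mean summaries can be illustrated with simulated data. The sketch below assumes that accuracy is the Pearson correlation between predicted and observed values in each validation fold, and that bias is the slope from regressing observed values on predictions; these are common conventions, but the exact definitions used by the package (including how r_MG is computed) are not stated here, so treat this as an assumption.

```r
# Sketch under assumed definitions; simulated data, not package output.
set.seed(7)
n <- 60
gebv <- rnorm(n)                    # simulated predictions
y <- gebv + rnorm(n, sd = 0.5)      # simulated phenotypes
fold_id <- rep(1:3, each = 20)      # three validation folds of 20 individuals

per_fold <- t(sapply(split(seq_len(n), fold_id), function(idx) {
  c(
    accuracy = cor(gebv[idx], y[idx]),                   # correlation within the fold
    bias = unname(coef(lm(y[idx] ~ gebv[idx]))[2])       # slope of observed on predicted
  )
}))

per_fold            # one row per fold
colMeans(per_fold)  # means across folds
```

Under this convention a slope close to 1 suggests predictions that are neither inflated nor deflated. Note that a within-fold correlation is undefined when a fold holds a single individual, as in leave-one-out, so such designs need predictions pooled across folds before any correlation is computed.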
See Also
masreml, compute_accuracy