library("masreml")
# ── Continuous trait ─────────────────────────────────────
# SNP additive GBLUP
fit <- masreml(y = y, markers = list(snp_add = W))
summary(fit)
# MH additive only
fit <- masreml(y = y, markers = list(mh_add = mh_list))
# Combined SNP additive + dominance
fit <- masreml(
y = y,
markers = list(snp_add = W, snp_dom = W),
method = "AI"
)
# Pre-built G matrix
fit <- masreml(y = y, G = list(snp_add = G))
# ── Binary trait ─────────────────────────────────────────
# Binary GBLUP (logit link, default HE)
fit <- masreml(
y = y_binary,
markers = list(snp_add = W),
trait = "binary"
)
summary(fit)
# Binary with probit link
fit <- masreml(
y = y_binary,
markers = list(snp_add = W),
trait = "binary",
link = "probit"
)
# Binary with EM-REML for variance components
fit <- masreml(
y = y_binary,
markers = list(snp_add = W),
trait = "binary",
method = "EM"
)Universal REML-BLUP for SNP and Microhaplotype Genomic Prediction
Universal REML-BLUP for SNP and Microhaplotype Genomic Prediction
Description
BLUP-based model for genomic prediction using SNP or microhaplotype (MH) markers. Estimates variance components via REML and predicts genomic estimated breeding values (GEBV) via BLUP. Supports SNP additive, SNP dominance, and MH additive relationship matrices, for both continuous and binary traits. For GWAS-assisted prediction, see run_gwas and gwablup.
Usage
masreml(
y,
X = NULL,
markers = NULL,
G = NULL,
method = "auto",
solver = "auto",
max_iter = 100L,
tol = 1e-06,
n_threads = NULL,
trait = "continuous",
link = "logit",
verbose = TRUE,
save_rds = FALSE,
save_path = NULL
)
Arguments
y
|
numeric vector of phenotypes (length n). Named vector recommended for ID alignment. For binary trait, must contain only 0 and 1. |
X
|
fixed effects design matrix (n x c). If NULL, intercept only is used. |
markers
|
list of raw marker inputs (optional):
|
G
|
list of pre-built relationship matrices (optional):
|
method
|
character, REML algorithm for variance component estimation:
“HE” is recommended for speed. Any method can be used but “AI” may be slow on working response.
|
solver
|
character, EBV solver:
|
max_iter
|
integer, maximum REML iterations (default 100) |
tol
|
numeric, convergence tolerance (default 1e-6) |
n_threads
|
integer, number of threads (default: all physical cores) |
trait
|
character, trait distribution:
|
link
|
character, link function for binary trait:
|
verbose
|
logical, if TRUE (default) prints a brief post-fit summary banner (algorithm, runtime, convergence, variance components, h2, GEBV summary) when masreml() returns. The existing progress messages (“Building…”, “Running REML…”, “Solving EBV…”) are unaffected by this flag.
|
save_rds
|
logical, if TRUE the fit is also written to disk as an RDS file. Default FALSE (differs from masbayes::run_bayesr()) because masreml() is often called inside cv_masreml loops where auto-saving would clobber per-fold output. Set to TRUE for one-off full-data fits you want to persist.
|
save_path
|
optional explicit RDS path. If NULL (default), defaults to “results_masreml.Rds” in the current working directory. Ignored unless save_rds = TRUE.
|
Details
Auto-summary. On completion, masreml() prints a brief banner mirroring masbayes::run_bayesr()/ masbayes::run_bayesa(). Set verbose = FALSE to suppress. The full summary.masreml method is unchanged and is still the canonical detailed report.
Runtime field. The returned object now contains a runtime field (elapsed seconds for REML + BLUP), populated even when verbose = FALSE.
Value
Object of class “masreml” with elements:
-
gebv: named list of GEBV vectors per component -
total_gebv: total GEBV (sum across components) -
fixed_effects: fixed effect estimates -
varcomp: list withsigma2andh2 -
loglik: restricted log-likelihood -
algorithm: REML algorithm used -
solver: EBV solver used -
converged: logical, convergence status -
n_iter: number of REML iterations -
n: number of individuals -
runtime: elapsed seconds for REML + BLUP -
rds_path: path of saved RDS file, orNULLwhensave_rds = FALSE -
training_metrics: list of training-set fit metrics. Continuous:R2,RMSE,accuracy(Pearson r),bias(slope of \(y \sim \hat{y}\)). Binary: same fields plusAUC, computed on the observed (probability) scale (y_01vsmu_hat) so thatbiasis the calibration slope (1.0 = perfectly calibrated) andRMSE^2approximates the Brier score.AUCis rank-invariant under inverse-link transformation. -
call: matched call
For binary trait (trait = “binary”), additional binary slot contains:
-
link: link function used -
prevalence: proportion of y = 1 -
h2_liability: h2 on liability scale -
h2_observed: h2 on observed (0/1) scale via Dempster-Falconer transformation -
auc: Area Under ROC Curve (Wilcoxon-MWW) -
fitted: fitted probabilities P(y = 1) -
converged: always TRUE for single-step
References
VanRaden (2008) Efficient methods to compute genomic predictions. J. Dairy Sci. 91:4414-4423.
Da (2015) Multi-allelic haplotype model for genomic prediction. BMC Genet. 16:144.
Dempster & Falconer (1950) Interpretation of high and low liability. Ann. Hum. Genet. 31:195-203.
Johnson & Thompson (1995) Restricted maximum likelihood estimation of variance components. J. Dairy Sci. 78:449-456.
See Also
run_gwas, gwablup, cv_masreml, compute_accuracy