| Title: | Interpretable Survival Machine Learning Framework |
|---|---|
| Description: | A modular toolkit for interpretable survival machine learning with a unified interface for fitting, prediction, evaluation, and interpretation. It includes semiparametric, parametric, tree-based, ensemble, boosting, kernel, and deep-learning survival learners, together with benchmarking, scoring, calibration, and model-agnostic interpretation utilities. Representative methodological anchors include Cox (1972) <doi:10.1111/j.2517-6161.1972.tb00899.x>, Royston and Parmar (2002) <doi:10.1002/sim.1203>, Ishwaran et al. (2008) <doi:10.1214/08-AOAS169>, Jaeger et al. (2019) <doi:10.1214/19-AOAS1261>, Harrell et al. (1982) <doi:10.1001/jama.1982.03320430047030>, Graf et al. (1999) <doi:10.1002/(SICI)1097-0258(19990915/30)18:17/18%3C2529::AID-SIM274%3E3.0.CO;2-5>, Friedman (2001) <doi:10.1214/aos/1013203451>, Apley and Zhu (2020) <doi:10.1111/rssb.12377>, and Lundberg and Lee (2017) <https://papers.nips.cc/paper/7062-a-unified-approach-to-interpreting-model-predictions>, and other related methods for survival modeling, prediction, and interpretation. |
| Authors: | Imad El Badisy [aut, cre] |
| Maintainer: | Imad El Badisy <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.7.1 |
| Built: | 2026-05-24 08:22:19 UTC |
| Source: | https://github.com/ielbadisy/survalis |
Computes a cumulative/dynamic time-dependent AUC using predicted survival
probabilities at a specified time point (or the last column if t_star
is NULL). Cases are subjects with an observed event by t_star;
controls are subjects known to survive beyond t_star. Subjects
censored before t_star are handled through IPCW weighting.
auc_survmat(object, predicted, t_star = NULL)
object |
A |
predicted |
An |
t_star |
Optional numeric time at which to evaluate AUC; if omitted,
the rightmost column of predicted is used. |
Risk scores are defined as 1 - S(t) at the chosen time. The AUC is
computed over case-control pairs using inverse-probability-of-censoring
weights for cases and partial credit (0.5) for ties.
A named numeric scalar: "auc".
y <- survival::Surv(time = veteran$time, event = veteran$status)
sp <- matrix(
  stats::plogis(scale(veteran$karno)),
  ncol = 1,
  dimnames = list(NULL, "t=100")
)
auc_survmat(y, predicted = sp, t_star = 100)
Runs cv_survlearner() for a set of learner names (e.g., "ranger",
"coxph") by dynamically dispatching fit_<learner> and
predict_<learner> functions. Returns the row‑bound CV results across
all requested learners.
benchmark_default_survlearners(
  formula, data, learners, times,
  metrics = c("cindex", "ibs"),
  folds = 5, seed = 123, ncores = 1,
  verbose = FALSE, suppress_errors = TRUE, ...
)
formula |
A survival formula of the form |
data |
A data frame containing the variables in formula. |
learners |
Character vector of learner ids (without prefixes), e.g.
|
times |
Numeric vector of evaluation time points for survival predictions. |
metrics |
Character vector of metrics to compute in CV (e.g., |
folds |
Integer number of CV folds (default 5). |
seed |
Integer random seed for fold generation. |
ncores |
Integer; number of CPU cores passed to |
verbose |
Logical; if |
suppress_errors |
Logical; if |
... |
Additional arguments forwarded to each learner's |
Learners are run independently using identical CV splits and scoring settings.
Any learner whose fit_*() or predict_*() function is missing will
be skipped with a warning. At least one learner must complete successfully or an
error is raised.
A data frame of CV results (as returned by cv_survlearner())
with an extra column learner identifying the source learner.
cv_survlearner(), plot_benchmark(), summarise_benchmark()
res <- benchmark_default_survlearners(
  Surv(time, status) ~ age + karno + trt,
  data = veteran,
  learners = c("coxph", "rpart"),
  times = c(80, 160),
  metrics = c("cindex", "ibs"),
  folds = 2,
  seed = 1
)
head(res)
Runs nested cross-validation for one or more learners that expose
tune_<learner>(), fit_<learner>(), and predict_<learner>().
The outer folds estimate performance. Within each outer fold, hyperparameters
are tuned with the learner's existing tune_*() implementation, using only the
outer training data.
benchmark_tuned_survlearners(
  formula, data, learners, times,
  metrics = c("cindex", "ibs"),
  outer_folds = 5, inner_folds = 5, seed = 123, inner_ncores = 1,
  learner_args = list(), refit_final = FALSE,
  verbose = FALSE, suppress_errors = TRUE, ...
)
formula |
A survival formula of the form |
data |
A data frame containing the variables in formula. |
learners |
Character vector of learner ids (without prefixes), e.g. |
times |
Numeric vector of evaluation time points for survival predictions. |
metrics |
Character vector of metrics to optimize and report. |
outer_folds |
Integer number of outer CV folds used for performance estimation. |
inner_folds |
Integer number of inner CV folds used by each learner's tuning routine. |
seed |
Integer random seed used for outer and inner resampling. |
inner_ncores |
Integer; number of CPU cores passed to each learner's inner
|
learner_args |
Optional named list of learner-specific arguments. Each entry can be either a list of tuning arguments passed to |
refit_final |
Logical; if |
verbose |
Logical; if |
suppress_errors |
Logical; if |
... |
Additional arguments passed to each learner's |
A list of class "nested_surv_benchmark" with components outer_results,
outer_summary, selected_params, final_models, and settings.
benchmark_default_survlearners, cv_survlearner
res <- benchmark_tuned_survlearners(
  Surv(time, status) ~ age + karno + trt,
  data = veteran,
  learners = c("ranger", "glmnet"),
  times = c(100, 200),
  outer_folds = 3,
  inner_folds = 2
)
res$outer_summary
Extracts the top‑performing learner(s) under a chosen metric from benchmark results, using the average value across folds.
best_survlearner(benchmark_results, metric, maximize = NULL)
benchmark_results |
A data frame with columns |
metric |
Character name of the metric to optimize (e.g., |
maximize |
Logical; whether to maximize the metric. If |
A tibble with columns learner, metric, and the selected
average value for the best learner(s). Ties are returned as multiple rows.
benchmark_default_survlearners(), summarise_benchmark()
res <- tibble::tibble(
  learner = c("coxph", "coxph", "rpart", "rpart"),
  metric = c("cindex", "ibs", "cindex", "ibs"),
  value = c(0.64, 0.19, 0.60, 0.23)
)
best_survlearner(res, metric = "cindex")
best_survlearner(res, metric = "ibs")
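The selection rule described above (average per learner, then pick the optimum, keeping ties) can be sketched in base R. The helper pick_best() is hypothetical and not part of the package; in particular, the rule used to infer the direction when maximize is NULL is an assumption made for illustration.

```r
res <- data.frame(
  learner = c("coxph", "coxph", "rpart", "rpart"),
  metric  = c("cindex", "ibs", "cindex", "ibs"),
  value   = c(0.64, 0.19, 0.60, 0.23)
)

# Hypothetical helper: average the metric per learner and keep the best row(s)
pick_best <- function(results, metric, maximize = NULL) {
  rows <- results[results$metric == metric, ]
  avg <- aggregate(value ~ learner, data = rows, FUN = mean)
  # Assumption: error-type metrics are minimized, everything else is maximized
  if (is.null(maximize)) maximize <- !(metric %in% c("ibs", "brier"))
  target <- if (maximize) max(avg$value) else min(avg$value)
  avg[avg$value == target, , drop = FALSE]  # ties come back as several rows
}

best_c   <- pick_best(res, "cindex")
best_ibs <- pick_best(res, "ibs")
```

Passing maximize explicitly avoids relying on any name-based guess about the metric's direction.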
Computes the inverse-probability-of-censoring weighted (IPCW) Brier score
at a single time t_star.
brier(object, pre_sp, t_star)
object |
A |
pre_sp |
Numeric vector of predicted survival probabilities |
t_star |
Numeric evaluation time. |
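The censoring distribution $\hat{G}$ is estimated via Kaplan-Meier on
1 - status. Writing $\hat{S}_i(t^*)$ for the predicted survival probability
of subject $i$ (pre_sp) and $T_i$ for its observed time, observed events
before t_star contribute $\hat{S}_i(t^*)^2 / \hat{G}(T_i)$; those at risk at
t_star contribute $(1 - \hat{S}_i(t^*))^2 / \hat{G}(t^*)$. Returns NA if
$\hat{G}(t^*)$ is undefined or zero.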
A named numeric scalar: "brier".
y <- survival::Surv(time = veteran$time, event = veteran$status)
pre_sp <- stats::plogis(scale(veteran$karno))
brier(y, pre_sp = pre_sp, t_star = 100)
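For readers who want to see the weighting spelled out, the following is a minimal sketch of an IPCW Brier score at a single time, written with the survival package only. The helper ipcw_brier() is hypothetical and is not the package's implementation; it assumes the censoring survival function is evaluated at the observed event time (right-continuous Kaplan-Meier), which may differ from the convention used inside brier().

```r
library(survival)

# Hypothetical helper: IPCW Brier score at t_star
ipcw_brier <- function(time, status, surv_prob, t_star) {
  # Kaplan-Meier for the censoring distribution (censoring plays the role of the event)
  km_cens <- survfit(Surv(time, 1 - status) ~ 1)
  G <- stepfun(km_cens$time, c(1, km_cens$surv))  # right-continuous step function
  G_t <- G(t_star)
  if (is.na(G_t) || G_t <= 0) return(c(brier = NA_real_))

  event_before <- time <= t_star & status == 1  # observed event by t_star
  at_risk      <- time > t_star                 # still event-free at t_star

  # Subjects censored before t_star keep a contribution of zero
  contrib <- numeric(length(time))
  contrib[event_before] <- surv_prob[event_before]^2 / G(time[event_before])
  contrib[at_risk]      <- (1 - surv_prob[at_risk])^2 / G_t
  c(brier = mean(contrib))
}

pre_sp <- as.numeric(plogis(scale(veteran$karno)))
bs_100 <- ipcw_brier(veteran$time, veteran$status, pre_sp, t_star = 100)
```

Without censoring the weights are all 1 and the score reduces to the mean squared difference between the survival indicator and the predicted probability.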
Computes Harrell's concordance index using predicted survival probabilities
at a specified time point (or the last column if t_star is NULL).
cindex_survmat(object, predicted, t_star = NULL)
object |
A |
predicted |
An |
t_star |
Optional numeric time at which to evaluate the c-index; if omitted,
the rightmost column of predicted is used. |
Risk scores are defined as 1 - S(t) at the chosen time. Ties receive
partial credit (0.5). Pairs not comparable due to censoring are excluded.
A named numeric scalar: "C index".
y <- survival::Surv(time = veteran$time, event = veteran$status)
sp <- matrix(
  stats::plogis(scale(veteran$karno)),
  ncol = 1,
  dimnames = list(NULL, "t=100")
)
cindex_survmat(y, predicted = sp, t_star = 100)
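The pair-counting rule described above (risk = 1 - S(t), half credit for tied risks, non-comparable pairs dropped) can be illustrated with a small base R sketch. The function harrell_c() is a hypothetical stand-in written for clarity, not the package's code, and its quadratic loop is only suitable for small samples.

```r
# Hypothetical helper: Harrell's C from a risk score
harrell_c <- function(time, status, risk) {
  concordant <- 0
  comparable <- 0
  n <- length(time)
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      # A pair is comparable when subject i has an observed event before time_j
      if (status[i] == 1 && time[i] < time[j]) {
        comparable <- comparable + 1
        if (risk[i] > risk[j]) {
          concordant <- concordant + 1      # earlier event, higher risk
        } else if (risk[i] == risk[j]) {
          concordant <- concordant + 0.5    # tie in risk: partial credit
        }
      }
    }
  }
  c("C index" = concordant / comparable)
}

time   <- c(1, 2, 3, 4)
status <- c(1, 1, 0, 1)
surv   <- c(0.2, 0.4, 0.6, 0.8)   # predicted S(t) at the chosen time
c_perfect <- harrell_c(time, status, risk = 1 - surv)
```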
Computes ALE curves for a numeric (continuous) feature with respect to survival probabilities at one or more evaluation times. ALE summarizes the average local effect of changing a feature within small intervals, is robust to correlated features, and is centered to have mean zero.
compute_ale(model, newdata, feature, times, grid.size = 20)
model |
An |
newdata |
Data frame used to compute ALE (typically the training set or a representative sample). |
feature |
Single numeric/continuous feature name for which to compute ALE. Categorical features are not supported here (use PDP/ICE). |
times |
Numeric vector of time points at which to evaluate survival probabilities. |
grid.size |
Integer number of quantile cut points used to build the ALE
grid (default 20). The algorithm uses quantiles of |
For consecutive quantile bins $[z_{k-1}, z_k]$ of the target feature,
ALE integrates the local change in the model prediction when moving the
feature from $z_{k-1}$ to $z_k$ while holding all other features at
their observed values, and then accumulates these differences across bins.
For survival models, predictions are survival probabilities at times.
The returned ALE curves are centered (mean zero across the grid) per time.
A list with:
A data frame with columns feature_value and one column per
time ("t=<time>") containing centered ALE effects.
If multiple times were provided, a data frame with
columns feature_value and integrated_ale equal to the mean of per-time
ALE effects across times; otherwise NULL.
mod <- fit_coxph(Surv(time, status) ~ age + karno + celltype, data = veteran)
ale_res <- compute_ale(
  model = mod,
  newdata = veteran,
  feature = "karno",
  times = c(80, 160),
  grid.size = 8
)
head(ale_res$ale)
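The accumulate-and-center procedure can be sketched for a single time point in base R. Everything here is illustrative: ale_1d() and toy_pred() are hypothetical names, toy_pred() stands in for a predict_*() call returning survival probabilities at one time, and the sketch uses grid.size + 1 quantile edges, which may not match how compute_ale() interprets grid.size.

```r
# Hypothetical helper: first-order ALE for one numeric feature
ale_1d <- function(pred_fun, data, feature, grid.size = 20) {
  x <- data[[feature]]
  # Quantile-based bin edges; duplicated edges are dropped
  z <- unique(quantile(x, probs = seq(0, 1, length.out = grid.size + 1), names = FALSE))
  bin <- cut(x, breaks = z, include.lowest = TRUE, labels = FALSE)
  local <- numeric(length(z) - 1)
  for (k in seq_along(local)) {
    rows <- data[bin == k, , drop = FALSE]
    if (nrow(rows) == 0) next
    lo <- rows
    hi <- rows
    lo[[feature]] <- z[k]        # move the feature to the lower edge
    hi[[feature]] <- z[k + 1]    # and to the upper edge, other columns untouched
    local[k] <- mean(pred_fun(hi) - pred_fun(lo))
  }
  ale <- c(0, cumsum(local))                         # accumulate across bins
  data.frame(feature_value = z, ale = ale - mean(ale))  # center across the grid
}

set.seed(1)
toy <- data.frame(x1 = rnorm(200))
toy$x2 <- toy$x1 + rnorm(200, sd = 0.3)   # correlated with x1
toy_pred <- function(d) 0.5 - 0.02 * d$x1 + 0.01 * d$x2
res <- ale_1d(toy_pred, toy, "x1", grid.size = 8)
```

Because only rows that actually fall in a bin are perturbed, and only within that bin, the estimate avoids evaluating the model at unrealistic combinations of x1 and x2.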
Computes a nonparametric calibration curve for a survival model at one evaluation time by binning predicted survival probabilities and comparing bin-wise means to Kaplan-Meier-based observed survival, with bootstrap CIs.
compute_calibration(
  model, data, time, status, eval_time,
  n_bins = 10, n_boot = 100, seed = 123, learner_name = NULL
)
model |
An |
data |
A data frame with predictors and survival outcome columns. |
time |
Survival time; either a numeric vector of the same length as
|
status |
Event indicator; either a numeric/logical vector or a single
string giving the column name in |
eval_time |
Single numeric time at which to assess calibration. |
n_bins |
Integer number of quantile-based bins used to group predictions. |
n_boot |
Integer number of bootstrap resamples for confidence intervals. |
seed |
Integer seed for reproducibility of binning/bootstrap. |
learner_name |
Optional character override for labeling the learner in
downstream plots (defaults to model$learner). |
Predicted survival at eval_time is obtained from the appropriate
predict_<learner>(). Predictions are split into n_bins quantile bins.
For each bin, the function reports:
mean predicted survival, observed survival at eval_time from a Kaplan-Meier
fit on the bin's rows, and bootstrap percentile (2.5%, 97.5%) CIs on the
observed survival computed by resampling rows with replacement.
A list with components:
A data frame with columns bin, mean_pred_surv,
observed_surv, lower_ci, upper_ci.
The evaluation time used.
Number of bins.
Number of bootstrap resamples.
The learner label (from learner_name or model$learner).
mod <- fit_coxph(Surv(time, status) ~ age + karno + celltype, data = veteran)
calib <- compute_calibration(
  model = mod,
  data = veteran,
  time = "time",
  status = "status",
  eval_time = 80,
  n_bins = 4,
  n_boot = 5,
  seed = 1
)
head(calib$calibration_table)
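The binning and bootstrap steps can be sketched with the survival package alone. The helper calib_sketch() is hypothetical, and the predicted probabilities below are a synthetic transformation of karno rather than output from a fitted learner, so the numbers only illustrate the mechanics.

```r
library(survival)

# Hypothetical helper: quantile-binned calibration table at one evaluation time
calib_sketch <- function(pred_surv, time, status, eval_time,
                         n_bins = 4, n_boot = 20, seed = 1) {
  set.seed(seed)
  # Kaplan-Meier survival at eval_time for the rows in idx
  km_at <- function(idx) {
    fit <- survfit(Surv(time[idx], status[idx]) ~ 1)
    summary(fit, times = eval_time, extend = TRUE)$surv
  }
  breaks <- unique(quantile(pred_surv, seq(0, 1, length.out = n_bins + 1)))
  bin <- cut(pred_surv, breaks, include.lowest = TRUE, labels = FALSE)
  do.call(rbind, lapply(sort(unique(bin)), function(b) {
    idx <- which(bin == b)
    # Resample rows of the bin with replacement and recompute observed survival
    boot <- replicate(n_boot, km_at(idx[sample.int(length(idx), replace = TRUE)]))
    data.frame(
      bin = b,
      mean_pred_surv = mean(pred_surv[idx]),
      observed_surv = km_at(idx),
      lower_ci = unname(quantile(boot, 0.025)),
      upper_ci = unname(quantile(boot, 0.975))
    )
  }))
}

pred <- as.numeric(plogis(scale(veteran$karno)))  # synthetic predictions
tab <- calib_sketch(pred, veteran$time, veteran$status, eval_time = 80)
```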
For a single individual (one-row newdata), propose feature changes
that maximize survival probability at a target time, subject to optional
per-feature change costs and bounds inferred from the training data in
model$data.
compute_counterfactual(
  model, newdata, times, target_time,
  features_to_change = NULL, grid.size = 100,
  max.change = NULL, cost_penalty = 0.01
)
model |
A fitted survival model (e.g., an |
newdata |
A data frame with exactly one row representing the individual. |
times |
Numeric vector of time points used for prediction. Required unless
the model's predict function can infer times; used together with |
target_time |
Numeric scalar time at which to optimize survival. If missing
and |
features_to_change |
Optional character vector of feature names allowed to change. Defaults to all predictors (non-outcome columns). |
grid.size |
Integer grid size for numeric features (default 100). |
max.change |
Optional named list of numeric bounds for per-feature absolute change,
e.g., |
cost_penalty |
Numeric penalty weight applied to magnitude of change (default 0.01). |
For each candidate feature, the function sweeps over plausible values
(numeric grid between observed min/max; all other levels for categorical),
predicts survival at target_time, and reports the best penalized gain
relative to the original value. Survival predictions are obtained via a
corresponding predict_* function inferred from model$learner
(e.g., predict_coxph for learner = "coxph").
A data.frame with one row per feature considered and columns:
feature, original_value, suggested_value,
survival_gain, change_cost, penalized_gain.
df <- veteran
df$A <- df$trt
mod <- fit_coxph(survival::Surv(time, status) ~ A + age + karno, data = df)
cf <- compute_counterfactual(
  model = mod,
  newdata = df[1, , drop = FALSE],
  times = c(50, 100, 150),
  target_time = 100,
  features_to_change = c("A", "age", "karno"),
  grid.size = 10,
  cost_penalty = 0.02
)
head(cf)
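A one-feature version of the sweep can be sketched as follows. The helper sweep_feature() and the toy prediction function are hypothetical, and two details are assumptions rather than documented behavior: the change cost is taken to be the absolute change in the feature, and the penalized gain is taken to be the survival gain minus cost_penalty times that cost.

```r
# Hypothetical helper: sweep one numeric feature and keep the best penalized gain
sweep_feature <- function(pred_fun, x, feature, grid, cost_penalty = 0.01) {
  base <- pred_fun(x)                          # survival at the original value
  cand <- x[rep(1, length(grid)), , drop = FALSE]
  cand[[feature]] <- grid                      # one candidate row per grid value
  gain <- pred_fun(cand) - base
  cost <- abs(grid - x[[feature]])             # assumption: absolute change
  pen  <- gain - cost_penalty * cost           # assumption: linear penalty
  best <- which.max(pen)
  data.frame(
    feature = feature,
    original_value = x[[feature]],
    suggested_value = grid[best],
    survival_gain = gain[best],
    change_cost = cost[best],
    penalized_gain = pen[best]
  )
}

# Toy stand-in for predicted survival at the target time
toy_pred <- function(d) plogis(-2 + 0.04 * d$karno - 0.01 * d$age)
x <- data.frame(age = 60, karno = 50)
cheap  <- sweep_feature(toy_pred, x, "karno", seq(30, 90, by = 10), cost_penalty = 0.002)
costly <- sweep_feature(toy_pred, x, "karno", seq(30, 90, by = 10), cost_penalty = 1)
```

With a small penalty the largest admissible karno is suggested; with a large penalty the original value wins because every change costs more than it gains.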
Estimates global and time-varying interaction strengths of model predictions,
using a Friedman-H style decomposition adapted to survival partial dependence.
Works with any mlsurv_model that has a matching predict_*() method returning
survival probabilities.
compute_interactions(
  model, data, times,
  target_time = NULL, features = NULL,
  type = c("1way", "heatmap", "time"),
  grid.size = 30, batch.size = 100
)
model |
An |
data |
Data frame used to probe the model (typically training data). |
times |
Numeric vector of evaluation times used for prediction. |
target_time |
Single time at which to quantify interactions for
|
features |
Optional character vector of feature names to evaluate.
Defaults to all predictors in |
type |
One of "1way", "heatmap", or "time". |
grid.size |
Integer; number of random grid values / replicates used for Monte Carlo marginalization (default 30). |
batch.size |
Reserved for future batching support (currently unused). |
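For a target time $t^*$, let $f(x) = \hat{S}(t^* \mid x)$ be the predicted survival probability.
For feature $j$, we approximate a decomposition:
$f(x) \approx f_j(x_j) + f_{-j}(x_{-j})$
by Monte Carlo marginalization over subsets of features using random sampling from the
empirical distribution in data. The reported interaction strength is:
$H_j = \operatorname{Var}\left[f(x) - f_j(x_j) - f_{-j}(x_{-j})\right] / \operatorname{Var}\left[f(x)\right]$,
clipped to [0, 1]. Pairwise heatmaps are computed analogously for feature pairs $(j, k)$.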
Larger values indicate stronger non-additivity (interaction) involving the feature(s).
The "time" mode repeats the computation across all times to show dynamics.
A data frame whose structure depends on type:
"1way": columns feature, interaction.
"heatmap": columns feature1, feature2, interaction
(symmetric with zeros on the diagonal).
"time": columns feature, time, interaction.
Friedman, J. H., and Popescu, B. E. (2008). Predictive learning via rule ensembles.
Annals of Applied Statistics. (Friedman's interaction measure.)
mod <- fit_coxph(Surv(time, status) ~ age + karno + trt, data = veteran)
times <- c(80, 160)
compute_interactions(
  model = mod,
  data = veteran,
  times = times,
  target_time = 80,
  features = c("age", "karno"),
  type = "1way",
  grid.size = 6
)
Computes partial dependence (PDP) and/or individual conditional expectation (ICE)
curves of predicted survival probabilities for a single feature at one or more
evaluation times. Works with any learner fitted via fit_*() that exposes a
matching predict_*() method returning survival probabilities.
compute_pdp(model, data, feature, times, method = "pdp+ice", grid.size = 20)
model |
An |
data |
A data frame used to construct PDP/ICE profiles (typically the training data). |
feature |
Character scalar; the feature name to analyze (numeric or categorical). |
times |
Numeric vector of evaluation times at which survival probabilities are computed. |
method |
One of |
grid.size |
Integer number of grid points for numeric features (default 20). Ignored for categorical features (levels are used). |
For numeric features, a regular grid over the observed range is used; for
categorical features, all observed levels are used. For each grid value,
predictions are made for every row of data with the feature forced to the
grid value, yielding ICE curves per row and PDP as the average across rows.
If multiple times are supplied, outputs are stacked in long format with a
time column; an additional integrated PDP is computed via trapezoidal rule
when more than one time is provided.
A list with elements:
Long data frame with columns:
surv_prob, time, type (pdp/ice),
.id (row id for ICE), and the analyzed feature.
(Optional) Data frame with feature and
integrated_surv (time-integrated PDP), present only if
length(times) > 1 and PDP was requested.
mod <- fit_coxph(Surv(time, status) ~ age + karno + trt, data = veteran)
pdp_age <- compute_pdp(
  model = mod,
  data = veteran,
  feature = "age",
  times = c(80, 160),
  method = "pdp+ice",
  grid.size = 8
)
head(pdp_age$results)
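The relationship between ICE and PDP (force the feature to a grid value for every row, then average) can be sketched for one time point in base R. The helper pdp_ice() and the toy prediction function are hypothetical; the column named value here plays the role of the analyzed feature column in the package output.

```r
# Hypothetical helper: ICE curves per row and their average (PDP) for a numeric feature
pdp_ice <- function(pred_fun, data, feature, grid.size = 20) {
  grid <- seq(min(data[[feature]]), max(data[[feature]]), length.out = grid.size)
  ice <- do.call(rbind, lapply(grid, function(g) {
    d <- data
    d[[feature]] <- g   # force the feature to the grid value for every row
    data.frame(.id = seq_len(nrow(d)), value = g,
               surv_prob = pred_fun(d), type = "ice")
  }))
  pdp <- aggregate(surv_prob ~ value, data = ice, FUN = mean)  # average over rows
  pdp$.id <- NA
  pdp$type <- "pdp"
  rbind(ice, pdp[, names(ice)])
}

toy <- data.frame(age = c(45, 52, 60, 68, 75), karno = c(80, 60, 70, 40, 50))
toy_pred <- function(d) plogis(-1 + 0.04 * d$karno - 0.02 * d$age)
out <- pdp_ice(toy_pred, toy, "age", grid.size = 4)
```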
Estimates per-feature contributions (à la SHAP) to the predicted survival probability for a single observation at one or more time points. For each time, the method samples random feature orders, marginalizes future features with values from a baseline dataset, and accumulates the marginal effects of adding each feature back.
compute_shap(
  model, newdata, baseline_data, times,
  sample.size = 100, aggregate = FALSE,
  method = c("meanabs", "integral")
)
model |
A fitted survival model produced by |
newdata |
A data frame with exactly one row (the instance to explain). |
baseline_data |
A data frame to sample background values from (typically
the training data used to fit |
times |
Numeric vector of evaluation times (same scale as the outcome). |
sample.size |
Integer, number of random feature orderings to sample per
time (default 100). |
aggregate |
Logical; if |
method |
Character; aggregation method if |
The prediction function is inferred from model$learner as
predict_<learner> and called with signature
predict_fun(model, newdata, times = times). Factor levels in
newdata are harmonized to those in model$data.
If aggregate = FALSE: a data frame with columns feature, phi,
and time (one row per feature per time). If aggregate = TRUE:
a data frame with columns feature and phi (one row per feature),
with attribute "shap_method" set to the aggregation used.
plot_shap(), the various predict_* methods (e.g. predict_coxph())
mod <- fit_coxph(survival::Surv(time, status) ~ age + karno + celltype, data = veteran)
shap_td <- compute_shap(
  model = mod,
  newdata = veteran[100, , drop = FALSE],
  baseline_data = veteran,
  times = c(100, 200),
  sample.size = 5,
  aggregate = FALSE
)
head(shap_td)
shap_meanabs <- compute_shap(
  model = mod,
  newdata = veteran[100, , drop = FALSE],
  baseline_data = veteran,
  times = c(100, 200),
  sample.size = 5,
  aggregate = TRUE,
  method = "meanabs"
)
head(shap_meanabs)
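The sampling scheme (random feature order, background values for features not yet added, accumulate the change as each feature is restored) can be sketched for a single time point. The helper sample_shap() and the toy prediction function are hypothetical; the sketch draws one background row per sampled ordering, which is one of several reasonable ways to marginalize and may differ from compute_shap().

```r
# Hypothetical helper: permutation-sampling Shapley estimates for one instance
sample_shap <- function(pred_fun, x, baseline, sample.size = 100) {
  feats <- names(x)
  phi <- setNames(numeric(length(feats)), feats)
  for (s in seq_len(sample.size)) {
    # Start from a random background row
    z <- baseline[sample.int(nrow(baseline), 1), feats, drop = FALSE]
    prev <- pred_fun(z)
    for (f in sample(feats)) {     # random feature order
      z[[f]] <- x[[f]]             # restore feature f from the instance
      cur <- pred_fun(z)
      phi[f] <- phi[f] + (cur - prev)   # marginal effect of adding f
      prev <- cur
    }
  }
  phi / sample.size
}

set.seed(1)
toy_pred <- function(d) 0.5 + 0.01 * d$a - 0.02 * d$b
background <- data.frame(a = rnorm(50, mean = 10), b = rnorm(50, mean = 5))
x <- data.frame(a = 12, b = 3)
phi <- sample_shap(toy_pred, x, background, sample.size = 50)
```

Within each sampled ordering the contributions sum to the difference between the prediction for the instance and the prediction for the background row, so the averaged values inherit the usual additivity property.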
Builds a sparse, locally weighted linear surrogate model to explain a fitted
survival model's prediction at a specific target time for a single instance.
Categorical features are binarized relative to the instance of interest; local
weights are computed from feature-space proximity (Gower by default); and a
penalized (lasso) or unpenalized linear model is fit to approximate the model's
predicted survival probability at target_time.
compute_surrogate(
  model, newdata, baseline_data, times, target_time,
  k = 5, dist.fun = "gower", gower.power = 5,
  kernel.width = NULL, penalized = TRUE, exclude = NULL
)
model |
An |
newdata |
A one-row data frame: the instance to explain (must have the same
predictors as |
baseline_data |
A data frame used to define the local neighborhood and to fit the surrogate (typically the model's training data). |
times |
Numeric vector of times passed to the prediction function;
must include |
target_time |
Numeric time at which to explain the prediction. |
k |
Desired number of non-zero coefficients for the penalized surrogate
(used only when penalized = TRUE). |
dist.fun |
Distance function for locality weighting. Default "gower". |
gower.power |
Power applied to |
kernel.width |
Positive numeric bandwidth used for non-Gower kernels (Gaussian weighting on pairwise distances). |
penalized |
Logical; if |
exclude |
Optional character vector of column names to exclude from the surrogate (in addition to survival outcome columns). |
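Target: The surrogate approximates $\hat{S}(t^* \mid x)$, the survival probability returned by the
underlying model at target_time. The nearest column in times is used.
Weights: Locality weights are computed from distances $d_i$ between
baseline_data rows and newdata. With "gower", weights are
$w_i = (1 - d_i)^{\text{gower.power}}$; otherwise a Gaussian kernel in $d_i$
with bandwidth kernel.width is used.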
Sparsity: When penalized=TRUE, a glmnet path is fit and the
solution with degrees of freedom closest to k is selected (preferring
exactly k if available).
A data frame with one row per selected feature containing:
Feature name (after recoding).
Value of the feature in newdata.
Local contribution at target_time.
The explanation time (copied for plotting).
Rows are ordered by decreasing absolute local contribution.
mod <- fit_coxph(Surv(time, status) ~ age + karno + celltype, data = veteran)
local_expl <- compute_surrogate(
  model = mod,
  newdata = veteran[2, , drop = FALSE],
  baseline_data = veteran,
  times = c(80, 160),
  target_time = 80,
  k = 3
)
head(local_expl)
Fits decision tree surrogate models to approximate the predictions of a fitted survival model at one or more evaluation times. This allows users to gain interpretable, rule-based approximations of complex survival models.
compute_tree_surrogate(model, data, times, minsplit = 10, cp = 0.01)
model |
A fitted survival model object created with a |
data |
A data frame containing predictor variables (and optional survival outcome columns). |
times |
A numeric vector of evaluation times at which to approximate model predictions. Must contain at least one value. |
minsplit |
Minimum number of observations required to attempt a split in the surrogate tree.
Passed to |
cp |
Complexity parameter for the surrogate tree. Passed to |
For each evaluation time, the function:
Predicts survival probabilities from the fitted model.
Excludes survival outcome columns (time, status, event) from the predictors.
Fits a decision tree to approximate the predicted probabilities.
Computes the R² between the model predictions and the surrogate predictions.
Counts the number of splits per feature.
If multiple times are provided, results are stored for each time point.
An object of class "tree_surrogate", containing:
times: the evaluation times.
results: a list with one element per time, each containing:
tree: the fitted rpart object.
r_squared: the R² of the surrogate model vs. the original predictions.
split_count: a table of feature split counts.
dynamic: logical indicating if more than one time was used.
mod_ranger <- fit_ranger(Surv(time, status) ~ age + karno + celltype, data = veteran)
tree_ranger <- compute_tree_surrogate(
  model = mod_ranger,
  data = veteran,
  times = c(100, 200, 300)
)
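The per-time steps listed above can be sketched with rpart directly. In this illustration the black-box predictions are simulated by a toy formula rather than produced by a fitted learner, and the variable names are chosen for the example only.

```r
library(rpart)

set.seed(1)
d <- data.frame(age = runif(200, 40, 80), karno = runif(200, 30, 90))
# Toy stand-in for a model's predicted survival probability at one time point
d$pred <- plogis(-1 + 0.04 * d$karno - 0.02 * d$age)

# Regression tree fitted to the predictions (not to the survival outcome)
tree <- rpart(pred ~ age + karno, data = d, method = "anova",
              control = rpart.control(minsplit = 10, cp = 0.01))

# Fidelity of the surrogate: R-squared between tree output and model predictions
fitted_tree <- predict(tree, d)
r_squared <- 1 - sum((d$pred - fitted_tree)^2) / sum((d$pred - mean(d$pred))^2)

# Number of splits per feature: internal nodes of the tree frame
vars <- as.character(tree$frame$var)
split_count <- table(vars[vars != "<leaf>"])
```

A low R² signals that the tree's rules should not be read as a faithful summary of the underlying model at that time point.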
Estimates feature importance by measuring the change in a survival metric after permuting each feature.
compute_varimp(
  model, times,
  metric = "ibs", n_repetitions = 10, seed = NULL,
  subset = NULL, importance_type = c("delta", "mean")
)
model |
An |
times |
Numeric vector of evaluation times. |
metric |
Character string, e.g. |
n_repetitions |
Integer; number of permutations per feature. |
seed |
Optional integer seed for reproducibility. |
subset |
Optional row indices or logical vector to subset |
importance_type |
One of |
For each feature, rows are permuted n_repetitions times, predictions are recomputed,
and the chosen metric is compared to the baseline (unpermuted) value. The
scaled_importance rescales values to sum to 100%.
A data.frame with columns:
feature: feature name,
value: importance value (change in metric),
scaled_importance: percent-scaled importance (see Details).
mod <- fit_coxph(survival::Surv(time, status) ~ age + karno + celltype, data = veteran)
imp <- compute_varimp(
  model = mod,
  times = 80,
  metric = "brier",
  n_repetitions = 3,
  seed = 1,
  subset = 40
)
head(imp)
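The permute-and-compare loop can be sketched in base R with a generic loss. The helper perm_importance(), the toy prediction function, and the uncensored squared-error loss are all hypothetical simplifications; the sketch reports the increase in loss over the baseline, which corresponds to a delta-style importance for an error metric.

```r
# Hypothetical helper: permutation importance as change in a loss
perm_importance <- function(pred_fun, data, features, metric_fun,
                            n_repetitions = 10, seed = 1) {
  set.seed(seed)
  baseline <- metric_fun(pred_fun(data))   # loss on unpermuted data
  value <- sapply(features, function(f) {
    permuted <- replicate(n_repetitions, {
      d <- data
      d[[f]] <- sample(d[[f]])             # break the link between f and the outcome
      metric_fun(pred_fun(d))
    })
    mean(permuted) - baseline
  })
  data.frame(feature = features, value = unname(value),
             scaled_importance = 100 * unname(value) / sum(value))
}

set.seed(42)
toy <- data.frame(karno = runif(300, 30, 90), noise = rnorm(300))
toy_pred <- function(d) plogis(0.1 * (d$karno - 60))   # ignores `noise`
alive <- rbinom(300, 1, toy_pred(toy))                 # simulated status at one time
sq_err <- function(p) mean((alive - p)^2)              # uncensored Brier-type loss
imp <- perm_importance(toy_pred, toy, c("karno", "noise"), sq_err, n_repetitions = 5)
```

A feature the model never uses gets an importance of zero, because permuting it leaves every prediction unchanged.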
Visualizes per-fold metric values from cv_survlearner using a
boxplot with jittered points.
cv_plot(cv_results)
cv_results |
A tibble/data frame as returned by cv_survlearner(). |
A ggplot2 object.
cv_results <- tibble::tibble(
  metric = c("cindex", "cindex", "ibs", "ibs"),
  value = c(0.62, 0.66, 0.19, 0.21)
)
cv_plot(cv_results)
Produces mean, standard deviation, standard error, and 95% confidence intervals
for each metric returned by cv_survlearner.
cv_summary(cv_results)
cv_results |
A tibble/data frame as returned by cv_survlearner(). |
A tibble with columns: metric, mean, sd, n,
se, lower, upper.
cv_results <- tibble::tibble(
  metric = c("cindex", "cindex", "ibs", "ibs"),
  value = c(0.62, 0.66, 0.19, 0.21)
)
cv_summary(cv_results)
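The summary columns can be reproduced by hand in base R. The helper summarise_cv() is hypothetical, and the interval shown is a normal approximation (mean ± 1.96 × se); whether cv_summary() uses a normal or a t quantile is not stated here, so treat that choice as an assumption.

```r
cv_results <- data.frame(
  metric = c("cindex", "cindex", "ibs", "ibs"),
  value  = c(0.62, 0.66, 0.19, 0.21)
)

# Hypothetical helper: per-metric mean, sd, n, se and an approximate 95% interval
summarise_cv <- function(cv_results, z = 1.96) {
  out <- do.call(rbind, lapply(split(cv_results$value, cv_results$metric), function(v) {
    se <- sd(v) / sqrt(length(v))
    data.frame(mean = mean(v), sd = sd(v), n = length(v), se = se,
               lower = mean(v) - z * se, upper = mean(v) + z * se)
  }))
  out$metric <- rownames(out)   # split() names become the metric column
  rownames(out) <- NULL
  out[, c("metric", "mean", "sd", "n", "se", "lower", "upper")]
}

cv_tab <- summarise_cv(cv_results)
```

With only a handful of folds the interval is rough and is best read as an indication of fold-to-fold variability.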
Runs k-fold cross-validation for any pair of fit_fun/pred_fun
that follow the package's learner contracts, and returns tidy per-fold metric values.
Fold iteration is handled by functionals::fmapn() with optional parallel
execution and a progress bar.
cv_survlearner(
  formula, data, fit_fun, pred_fun, times,
  metrics = c("cindex", "ibs"),
  folds = 5, seed = 123, verbose = FALSE,
  ncores = 1, pb = interactive(), ...
)
formula |
A survival formula |
data |
A data frame containing all variables in formula. |
fit_fun |
Function with signature |
pred_fun |
Function with signature |
times |
Numeric vector of evaluation times (passed to |
metrics |
Character vector of metrics to compute. Supported:
|
folds |
Integer; number of folds (default 5). |
seed |
Integer random seed for reproducibility (default 123). |
verbose |
Logical; print row-dropping due to missingness (default FALSE). |
ncores |
Integer; number of CPU cores for |
pb |
Logical; show a progress bar during fold mapping (default
interactive()). |
... |
Additional arguments forwarded to |
The routine:
Validates Surv(...) on the LHS and warns against using . in formulas.
Drops rows with missing values in any variables referenced by formula.
Supports Surv(time, status == k) by recoding the status to 0/1.
Builds stratified v-folds on the status indicator (rsample).
For each fold: fits on the analysis set, predicts on the assessment set, and computes metrics.
Fold iteration is performed via functionals::fmapn(), which preserves
per-fold identifiers (id, fold) and returns a list ready for
dplyr::bind_rows().
A tibble with columns: splits (rsample split object),
id, fold, metric, and value.
cv_results <- cv_survlearner(
  formula = Surv(time, status) ~ age + karno,
  data = veteran,
  fit_fun = fit_coxph,
  pred_fun = predict_coxph,
  times = c(80, 160),
  metrics = c("cindex", "ibs"),
  folds = 2,
  seed = 1
)
Performs v‑fold cross‑validation for the NNLS stacking meta‑learner over a fixed set of base learners and their predictions. On each fold, stacking weights are learned on the analysis set and evaluated on the assessment set.
cv_survmetalearner(
  formula, data, times, base_models, base_preds,
  folds = 5, metrics = c("cindex", "ibs"), seed = 123, verbose = TRUE
)
| Argument | Description |
|---|---|
| formula | A survival formula. |
| data | A data frame containing all variables referenced in formula. |
| times | Numeric vector of evaluation times for stacking and scoring. |
| base_models | A named list of fitted base learner models (used to predict on assessment folds and for the final refit on full data). |
| base_preds | A named list of training-set prediction matrices (rows align with the rows of data). |
| folds | Integer; number of CV folds (default 5). |
| metrics | Character vector of metrics to compute (default c("cindex", "ibs")). |
| seed | Integer random seed (default 123). |
| verbose | Logical; whether to report progress (default TRUE). |
For each fold: (1) subset base_preds to the analysis indices;
(2) learn time‑specific NNLS weights with fit_survmetalearner;
(3) predict stacked survival on the assessment set with predict_survmetalearner;
(4) compute requested metrics. After CV, a final meta‑learner is fit on the full data.
An object of class "cv_survmetalearner_result" with components:
Final "survmetalearner" fit on all data.
Per‑fold metric values (tibble).
Fold‑aggregated mean and sd by metric (tibble).
CV settings.
fit_survmetalearner, predict_survmetalearner,
plot_survmetalearner_weights
form <- Surv(time, status) ~ age + karno + trt
times <- c(80, 160)
mod_cox <- fit_coxph(form, data = veteran)
mod_rpart <- fit_rpart(form, data = veteran)
base_models <- list(coxph = mod_cox, rpart = mod_rpart)
base_preds <- list(
  coxph = predict_coxph(mod_cox, veteran, times),
  rpart = predict_rpart(mod_rpart, veteran, times)
)
cv_res <- cv_survmetalearner(
  formula = form, data = veteran, times = times,
  base_models = base_models, base_preds = base_preds,
  folds = 2, metrics = c("cindex", "ibs"), seed = 1, verbose = FALSE
)
cv_res$summary
plot_survmetalearner_weights(cv_res$model)
Computes a binned Expected Calibration Error at time t_star, using
quantile-based bins of predicted survival probabilities.
ece_survmat(object, sp_matrix, t_star, n_bins = 10L, p = 1, weighted = TRUE)
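| Argument | Description |
|---|---|
| object | A Surv object holding the observed times and event indicators. |
| sp_matrix | Matrix or data frame of survival probabilities with rows = subjects and (optionally) columns named "t=<time>". |
| t_star | Numeric evaluation time (must be a single value). |
| n_bins | Integer number of quantile bins (default 10). |
| p | Power for the L^p error (default 1). |
| weighted | Logical; if TRUE, each bin's error is weighted (default TRUE). |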
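Let $\bar{S}_b$ be the mean predicted survival in bin $b$ and
$\hat{S}^{\mathrm{KM}}_b(t^*)$ the Kaplan-Meier survival at $t^*$ in that bin.
The (weighted) ECE is
$$\mathrm{ECE} = \sum_b w_b \left| \bar{S}_b - \hat{S}^{\mathrm{KM}}_b(t^*) \right|^p$$
with $w_b = n_b / n$, where $n_b$ is the number of subjects in bin $b$ and $n$ is the total number of subjects.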
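To make the binned calculation concrete, here is a minimal sketch that uses the survival package. It is not the package's implementation: ece_sketch() is a hypothetical helper, and it assumes quantile bins on the predicted survival, a Kaplan-Meier estimate within each bin, bin weights proportional to bin size, and no final 1/p root.

```r
library(survival)

# Hypothetical re-implementation of a binned calibration error at one time.
# ASSUMPTIONS: quantile bins, per-bin Kaplan-Meier, weights n_b / n, no 1/p root.
ece_sketch <- function(time, event, surv_prob, t_star, n_bins = 4, p = 1) {
  breaks <- unique(stats::quantile(surv_prob,
                                   probs = seq(0, 1, length.out = n_bins + 1)))
  bin <- cut(surv_prob, breaks = breaks, include.lowest = TRUE)
  per_bin <- vapply(split(seq_along(time), bin, drop = TRUE), function(idx) {
    d  <- data.frame(time = time[idx], event = event[idx])
    km <- survfit(Surv(time, event) ~ 1, data = d)
    km_at_t <- summary(km, times = t_star, extend = TRUE)$surv
    c(gap = abs(mean(surv_prob[idx]) - km_at_t)^p, n = length(idx))
  }, c(gap = 0, n = 0))
  # Weighted average of the per-bin gaps
  sum(per_bin["n", ] / sum(per_bin["n", ]) * per_bin["gap", ])
}

time  <- c(1, 2, 3, 4, 6, 7, 8, 9)
event <- c(1, 1, 0, 1, 0, 1, 1, 0)
sp    <- c(0.15, 0.20, 0.35, 0.40, 0.55, 0.65, 0.75, 0.80)
ece_value <- ece_sketch(time, event, sp, t_star = 5, n_bins = 4)
ece_value
```

With two subjects per bin the Kaplan-Meier estimates are either 0 or 1, so the value here mainly illustrates the mechanics; realistic use needs enough subjects per bin for a stable estimate.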
A named numeric scalar: "ece".
y <- survival::Surv(
  time = c(1, 2, 3, 4, 6, 7, 8, 9),
  event = c(1, 1, 0, 1, 0, 1, 1, 0)
)
sp <- cbind("t=5" = c(0.15, 0.20, 0.35, 0.40, 0.55, 0.65, 0.75, 0.80))
ece_survmat(y, sp_matrix = sp, t_star = 5, n_bins = 4)
Fits an additive hazards regression model using timereg's
aalen and returns an mlsurv_model compatible
with the survalis workflow.
fit_aalen(formula, data, max.time = NULL, n.sim = 0, resample.iid = 1)
| Argument | Description |
|---|---|
| formula | A survival formula. |
| data | A data frame containing the variables in formula. |
| max.time | Optional maximum follow-up time used by the fitting routine. |
| n.sim | Integer; number of simulations for variance estimation (default 0). |
| resample.iid | Integer; indicator for iid resampling (passed to timereg::aalen; default 1). |
The Aalen model assumes an additive hazard:
$$\lambda(t \mid X) = \beta_0(t) + \beta_1(t) X_1 + \dots + \beta_p(t) X_p$$
with nonparametric cumulative coefficient functions.
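For orientation, the standard link between hazard and survival shows how cumulative coefficient functions turn into survival probabilities. For an additive hazard $\lambda(t \mid X) = \beta_0(t) + \sum_j \beta_j(t) X_j$, and writing $B_j(t)$ for the cumulative coefficients (notation introduced here for illustration):

```latex
S(t \mid X)
  = \exp\!\left\{ -\int_0^t \lambda(s \mid X)\, ds \right\}
  = \exp\!\left\{ -\Big( B_0(t) + \sum_{j=1}^{p} X_j\, B_j(t) \Big) \right\},
\qquad
B_j(t) = \int_0^t \beta_j(s)\, ds .
```

Because the estimated cumulative coefficients are not constrained to be increasing, the implied survival curve is not guaranteed to be monotone; whether and how predict_aalen corrects for this is not stated in this section.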
A list of class "mlsurv_model" with elements:
model, learner="aalen", engine="timereg", formula,
data, time, status.
mod_aalen <- fit_aalen(
  Surv(time, status) ~ trt + karno + age,
  data = veteran
)
head(predict_aalen(mod_aalen, newdata = veteran[1:5, ], times = c(50, 100, 150)))
Fits an accelerated failure time (AFT) model via aftgee, which implements
GEE methodology for censored survival data. Returns an mlsurv_model-compatible
object used throughout the package.
fit_aftgee(formula, data, corstr = "independence")
| Argument | Description |
|---|---|
| formula | A survival formula of the form Surv(time, status) ~ predictors. |
| data | A data.frame containing the variables in the model. |
| corstr | Working correlation structure (default "independence"). |
The AFT model assumes $\log T = X\beta + \varepsilon$, where $T$ is survival time,
$X$ is the covariate matrix, and $\varepsilon$ is an error term whose distribution
determines the baseline survival. aftgee estimates $\beta$ using GEE.
In this wrapper we set id = NULL, assuming one observation per subject.
An object of class mlsurv_model with components:
model: the fitted aftgee model.
learner: "aftgee".
formula: the survival formula.
data: the training data.
Jin, Z., Lin, D. Y., Wei, L. J., & Ying, Z. (2003). Rank-based inference for the accelerated failure time model. Biometrika, 90(2), 341-353.
mod <- fit_aftgee(
  Surv(time, status) ~ trt + karno + age,
  data = veteran
)
head(predict_aftgee(mod, newdata = veteran[1:5, ], times = c(20, 60, 120)))
Fits a right‐censored survival model using BART's
surv.bart and returns an mlsurv_model compatible with
the survalis pipeline.
fit_bart(formula, data, K = 3, ...)
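| Argument | Description |
|---|---|
| formula | A survival formula of the form Surv(time, status) ~ predictors. |
| data | A data frame. |
| K | Integer; number of internal time grid intervals used by the BART survival engine (default 3). |
| ... | Further arguments passed to BART::surv.bart. |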
The response must be of class Surv(..., type = "right"). The fitted
object stores the engine's internal evaluation times in $eval_times
(from bart_fit$times) for downstream prediction alignment.
An object of class mlsurv_model with elements:
model: Fitted BART::surv.bart object.
learner: "bart".
formula, data: Original inputs.
eval_times: Engine's internal time grid used for prediction.
ex_data <- veteran[1:40, c("time", "status", "age", "karno", "celltype")]
mod_bart <- fit_bart(
  Surv(time, status) ~ age + karno + celltype,
  data = ex_data,
  K = 1, ntree = 5, ndpost = 20, nskip = 5, mc.cores = 1, seed = 42
)
Fits a survival boosting model using mboost's blackboost with
a Cox proportional hazards loss (mboost::CoxPH()) and shallow tree
base-learners (partykit). Returns an mlsurv_model compatible with the
survalis pipeline.
fit_blackboost(
  formula, data, weights = NULL, mstop = 100, nu = 0.1,
  minsplit = 10, minbucket = 4, maxdepth = 2, ...
)
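| Argument | Description |
|---|---|
| formula | A survival formula of the form Surv(time, status) ~ predictors. |
| data | A data frame. |
| weights | Optional case weights passed to mboost::blackboost. |
| mstop | Integer; number of boosting iterations (stopping iteration). Default 100. |
| nu | Learning rate/shrinkage (default 0.1). |
| minsplit | Minimum node size required to attempt a split (default 10). Like minbucket and maxdepth, a tree control parameter passed via partykit::ctree_control(). |
| minbucket | Minimum bucket size per tree node (default 4). |
| maxdepth | Maximum tree depth (default 2). |
| ... | Additional arguments forwarded to mboost::blackboost. |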
The base-learner is a conditional inference tree controlled by
partykit::ctree_control(). The loss is the partial likelihood for Cox PH
via mboost::CoxPH(). Use mstop and nu to control complexity.
An object of class mlsurv_model with elements:
model: Fitted mboost model.
learner: "blackboost".
formula, data: Original inputs/metadata.
Bühlmann P, Hothorn T (2007). Boosting algorithms: regularization, prediction and model fitting. Statistical Science.
mod <- fit_blackboost(
  Surv(time, status) ~ age + karno + celltype,
  data = veteran
)
head(predict_blackboost(mod, newdata = veteran[1:5, ], times = c(100, 200)))
Fits an ensemble of k-nearest neighbour survival learners via
bnnSurvival. The fitted object is standardized to the
mlsurv_model contract used in survalis.
fit_bnnsurv(
  formula, data, k = 5, num_base_learners = 10,
  num_features_per_base_learner = NULL, metric = "mahalanobis",
  weighting_function = function(x) x * 0 + 1,
  replace = TRUE, sample_fraction = NULL
)
| Argument | Description |
|---|---|
| formula | A survival formula of the form Surv(time, status) ~ predictors. |
| data | A data frame containing all variables referenced in formula. |
| k | Integer, number of neighbours for each base learner. Default 5. |
| num_base_learners | Integer, number of base learners in the ensemble. Default 10. |
| num_features_per_base_learner | Integer or NULL; number of features used by each base learner (default NULL). |
| metric | Character distance metric for neighbour search (for example, "mahalanobis", the default). |
| weighting_function | Function used to weight neighbours. Defaults to a constant weighting, function(x) x * 0 + 1. |
| replace | Logical; sample with replacement when drawing observations for a base learner (default TRUE). Passed to the engine. |
| sample_fraction | Optional numeric fraction of observations drawn for each base learner (default NULL). |
The native engine returns full survival curves on an internal time grid.
See predict_bnnsurv for how these are post-processed and
(optionally) interpolated to user-requested times.
An object of class "mlsurv_model" with elements:
model: The underlying bnnSurvival fit.
learner: Scalar "bnnsurv".
engine: Scalar "bnnSurvival".
formula, data: Inputs preserved for downstream use.
time, status: Character names of the survival outcome fields.
Uses bnnSurvival::bnnSurvival. This wrapper calls the engine via
requireNamespace("bnnSurvival", quietly = TRUE) and stores the native
model in $model.
predict_bnnsurv(), tune_bnnsurv()
mod <- fit_bnnsurv(
  Surv(time, status) ~ age + karno + diagtime + prior,
  data = veteran
)
head(predict_bnnsurv(mod, newdata = veteran[1:5, ], times = c(50, 100)))
Fits a survival forest model using the party::cforest() implementation
of conditional inference trees for right-censored survival data.
fit_cforest(
  formula, data, teststat = "quad", testtype = "Univariate",
  mincriterion = 0, ntree = 500, mtry = 5,
  replace = TRUE, fraction = 0.632, ...
)
| Argument | Description |
|---|---|
| formula | A survival formula. |
| data | A data frame containing the variables in the model. |
| teststat | Character string specifying the test statistic to use (default = "quad"). |
| testtype | Character string specifying the type of test (default = "Univariate"). |
| mincriterion | Numeric, the value of the test statistic that must be exceeded for a split to be performed (default = 0). |
| ntree | Integer, number of trees to grow (default = 500). |
| mtry | Integer, number of variables randomly selected at each node (default = 5). |
| replace | Logical, whether sampling of cases is with replacement (default = TRUE). |
| fraction | Proportion of samples to draw if replace = FALSE (default = 0.632). |
| ... | Additional arguments passed to the underlying party engine. |
This function wraps party::cforest() to fit a conditional inference
forest for survival analysis, returning an mlsurv_model object compatible
with predict_cforest() and the survalis framework.
An object of class "mlsurv_model", containing:
| Element | Description |
|---|---|
| model | The fitted party::cforest object. |
| learner | Character string "cforest". |
| formula | The model formula. |
| data | The training data. |
mod <- fit_cforest(Surv(time, status) ~ age + celltype + karno, data = veteran)
head(predict_cforest(mod, newdata = veteran[1:5, ], times = c(100, 200)))
Fits a Cox proportional hazards regression model using the
survival::coxph() function, and returns an object compatible
with the mlsurv_model interface.
fit_coxph(formula, data, ...)
| Argument | Description |
|---|---|
| formula | A survival formula of the form Surv(time, status) ~ predictors. |
| data | A data frame containing the variables in the model. |
| ... | Additional arguments passed to survival::coxph(). |
The fitted object is stored along with metadata such as the
learner name ("coxph"), original formula, data, and names
of the time and status variables. The function requires the
survival and pec packages.
An object of class "mlsurv_model" containing:
model – the fitted coxph object
learner – the string "coxph"
formula – the survival formula used
data – the training dataset
time, status – names of the survival outcome variables
mod_cox <- fit_coxph(Surv(time, status) ~ age + karno + celltype, data = veteran)
summary(mod_cox)
This function fits a fully parametric survival regression model using the
flexsurv package. It supports a variety of parametric distributions
(e.g., Weibull, exponential, log-normal) and returns a model object compatible
with the mlsurv_model interface for downstream survival predictions.
fit_flexsurvreg(formula, data, dist = "weibull", ...)
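| Argument | Description |
|---|---|
| formula | A survival formula. |
| data | A data frame containing the variables in the model. |
| dist | Character string specifying the parametric distribution to use (default = "weibull"). |
| ... | Additional arguments passed to flexsurv::flexsurvreg(). |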
An object of class mlsurv_model containing:
model - the fitted flexsurvreg model
learner - string identifier "flexsurvreg"
formula - the model formula
data - the training data
time - the name of the time-to-event column
status - the name of the event indicator column
mod_flex <- fit_flexsurvreg(
  Surv(time, status) ~ age + celltype + karno,
  data = veteran, dist = "weibull"
)
Fits a Cox proportional hazards model with elastic net regularization
using glmnet. The function wraps cv.glmnet to
select the optimal penalty parameter by cross-validation.
fit_glmnet(formula, data, alpha = 1, ...)
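| Argument | Description |
|---|---|
| formula | A survival formula of the form Surv(time, status) ~ predictors. |
| data | A data frame. |
| alpha | Numeric elastic net mixing parameter in [0, 1]; 1 (the default) gives the lasso and 0 gives ridge. |
| ... | Additional arguments passed to glmnet::cv.glmnet(). |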
An object of class "mlsurv_model" containing:
model – the fitted cv.glmnet object
learner – the string "glmnet"
formula, data, time, and status metadata
mod_glmnet <- fit_glmnet(
  Surv(time, status) ~ age + karno + celltype,
  data = veteran
)
summary(mod_glmnet$model)
Fits an Oblique Random Survival Forest using the aorsf package. This method builds an ensemble of oblique decision trees for survival analysis, where splits are based on linear combinations of features, allowing for improved performance in high-dimensional or correlated feature settings.
fit_orsf(formula, data, ...)
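| Argument | Description |
|---|---|
| formula | A survival formula of the form Surv(time, status) ~ predictors. |
| data | A data frame containing the variables specified in formula. |
| ... | Additional arguments passed to aorsf::orsf(). |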
ORSF models extend traditional Random Survival Forests by allowing oblique splits,
which can improve prediction accuracy when predictors are correlated.
Missing data are omitted by default (na_action = "omit").
An object of class "mlsurv_model" containing:
model: The fitted aorsf ORSF model object.
learner: Character string, always "orsf".
formula: The survival formula used.
data: The training dataset.
time: Name of the survival time variable.
status: Name of the event indicator variable.
Jaeger BC, Long DL, Long DM, Sims M, Szychowski JM, Min YI, Bandyopadhyay D. Oblique random survival forests. Annals of Applied Statistics. 2019;13(3):1847-1883. doi:10.1214/19-AOAS1261
mod <- fit_orsf(Surv(time, status) ~ age + karno, data = veteran)
summary(mod)
This function fits a survival random forest model using the
ranger package and returns an object compatible with the
mlsurv_model class.
fit_ranger(formula, data, ...)
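| Argument | Description |
|---|---|
| formula | A survival formula of the form Surv(time, status) ~ predictors. |
| data | A data frame. |
| ... | Additional arguments passed to ranger::ranger(). |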
This function wraps ranger for survival analysis and
stores the result in a standardized mlsurv_model object.
An object of class "mlsurv_model" containing:
model - the fitted ranger model
learner - character string "ranger"
formula - the model formula
data - training data used to fit the model
predict_ranger, tune_ranger, ranger
mod <- fit_ranger(
  Surv(time, status) ~ age + karno + celltype,
  data = veteran, num.trees = 25
)
summary(mod)
Fits a survival tree using the rpart package with an exponential splitting rule.
This learner is compatible with the mlsurv_model interface and can be used with
the cv_survlearner() and tune_*() functions in the survalis framework.
fit_rpart(
  formula, data, minsplit = 20, minbucket = round(minsplit/3), cp = 0.01,
  maxcompete = 4, maxsurrogate = 5, usesurrogate = 2, xval = 10,
  surrogatestyle = 0, maxdepth = 30
)
| Argument | Description |
|---|---|
| formula | A survival formula of the form Surv(time, status) ~ predictors. |
| data | A data frame. |
| minsplit | Minimum number of observations that must exist in a node in order for a split to be attempted. Default is 20. |
| minbucket | Minimum number of observations in any terminal node. Default is round(minsplit/3). |
| cp | Complexity parameter for pruning. Default is 0.01. |
| maxcompete | Number of competitor splits retained in the output. Default is 4. |
| maxsurrogate | Number of surrogate splits retained in the output. Default is 5. |
| usesurrogate | How surrogates are used in the splitting process. Default is 2. |
| xval | Number of cross-validations to perform in rpart. Default is 10. |
| surrogatestyle | Controls selection of surrogate splits. Default is 0. |
| maxdepth | Maximum depth of any node of the final tree. Default is 30. |
This function fixes the method = "exp" argument to fit an exponential
splitting survival tree, which models the hazard function assuming
exponential survival within terminal nodes.
An object of class "mlsurv_model" containing:
model – fitted rpart survival tree object
learner – "rpart"
formula, data, time, and status
Therneau TM, Atkinson EJ. (2019). An Introduction to Recursive Partitioning Using the RPART Routines. Mayo Clinic.
mod_rpart <- fit_rpart(Surv(time, status) ~ age + karno + celltype, data = veteran)
pred_rpart <- predict_rpart(mod_rpart, newdata = veteran[1:5, ], times = c(100, 200, 300))
head(pred_rpart)
Fits a Random Survival Forest for right-censored time-to-event data using
randomForestSRC and returns a standardized mlsurv_model compatible
with the survalis framework.
fit_rsf(formula, data, ntree = 500, mtry = NULL, nodesize = 15, ...)
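| Argument | Description |
|---|---|
| formula | A survival formula of the form Surv(time, status) ~ predictors. |
| data | A data frame. |
| ntree | Integer; number of trees to grow (default: 500). |
| mtry | Integer or NULL; number of candidate variables per split (default: NULL). |
| nodesize | Integer; minimum terminal node size (default: 15). |
| ... | Additional arguments forwarded to randomForestSRC::rfsrc(). |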
RSF extends random forests to survival data by growing an ensemble of survival trees on bootstrap samples and aggregating survival functions across trees.
An object of class mlsurv_model, a named list with elements:
model: The fitted randomForestSRC::rfsrc object.
learner: Character scalar identifying the learner ("rsf").
engine: Character scalar naming the engine ("randomForestSRC").
formula: The original survival formula.
data: The training dataset (or a minimal subset needed for prediction).
time: Name of the survival time variable.
status: Name of the event indicator (1 = event, 0 = censored).
Ishwaran H, Kogalur UB, Blackstone EH, Lauer MS (2008). Random survival forests. Annals of Applied Statistics, 2(3):841-860.
mod_rsf <- fit_rsf(Surv(time, status) ~ age + celltype + karno, data = veteran, ntree = 200)
times <- c(100, 200, 300)
pred_probs <- predict_rsf(mod_rsf, newdata = veteran[1:5, ], times = times)
print(round(pred_probs, 3))
Fits a Cox proportional hazards model with automated predictor selection
using pec's selectCox(), returning an mlsurv_model object
compatible with the survalis evaluation and cross‑validation pipeline.
fit_selectcox(formula, data, rule = "aic")
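| Argument | Description |
|---|---|
| formula | A survival formula of the form Surv(time, status) ~ predictors. |
| data | A data frame. |
| rule | Selection rule passed to pec::selectCox (default "aic"). |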
This wrapper standardizes the return object to the mlsurv_model contract so
downstream prediction and evaluation behave consistently across learners.
An object of class mlsurv_model, a named list with elements:
model: The fitted pec::selectCox model.
learner: Character scalar identifying the learner ("selectcox").
engine: Character scalar naming the engine ("selectCox").
formula: The original survival formula.
data: The training dataset (or a minimal subset needed for prediction).
rule: The selection rule used.
Uses pec::selectCox. Selection can be driven by Akaike's Information
Criterion (rule = "aic") or by p‑value thresholds (rule = "p"),
as implemented by pec.
Mogensen UB, Ishwaran H, Gerds TA (2012). Evaluating Random Forests for Survival Analysis using pec. Gerds TA, et al. pec: Prediction Error Curves for Survival Models.
predict_selectcox(), tune_selectcox(), pec::selectCox()
mod <- fit_selectcox(Surv(time, status) ~ age + celltype + karno, data = veteran)
Fits a parametric/penalised generalised survival model using
rstpm2's stpm2(), returning an mlsurv_model object that
integrates with the survalis evaluation and cross‑validation pipeline.
The baseline hazard is modeled with restricted cubic splines controlled by
the degrees of freedom.
fit_stpm2(formula, data, df = 4, ...)
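| Argument | Description |
|---|---|
| formula | A survival formula of the form Surv(time, status) ~ predictors. |
| data | A data frame. |
| df | Integer degrees of freedom for the restricted cubic spline baseline. Default is 4. |
| ... | Additional arguments forwarded to rstpm2::stpm2(). |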
This wrapper standardizes the model object to the mlsurv_model contract so
downstream prediction and evaluation behave consistently across learners.
An object of class mlsurv_model, a named list with elements:
model: The fitted rstpm2::stpm2 object.
learner: Character scalar identifying the learner ("stpm2").
engine: Character scalar naming the engine ("rstpm2").
formula: The original survival formula.
data: The training dataset (or a minimal subset needed for prediction).
time: Name of the survival time variable.
status: Name of the event indicator (1 = event, 0 = censored).
Uses rstpm2::stpm2. Spline complexity is governed by df;
larger values allow more flexibility in the baseline hazard. Additional
engine arguments can be passed via ....
Royston P, Parmar MKB (2002). Flexible parametric proportional‑hazards and proportional‑odds models for censored survival data. Statistics in Medicine. Lambert PC, Royston P, Crowther MJ (2009). Restricted cubic splines for non‑proportional hazards. Statistics in Medicine.
mod_stpm2 <- fit_stpm2(Surv(time, status) ~ age + karno + celltype, data = veteran, df = 4)
predict_stpm2(mod_stpm2, newdata = veteran[1:5, ], times = c(100, 200, 300))
summary(mod_stpm2)
Fits a deep neural network survival model using survdnn (backed by
torch). Supports the losses currently exposed by survdnn
("cox", "cox_l2", "aft", and "coxtime"). Returns an
mlsurv_model object that integrates with the survalis evaluation and
cross-validation pipeline.
fit_survdnn(
  formula, data, loss = "cox", hidden = c(32L, 32L, 16L),
  activation = "relu", lr = 1e-04, epochs = 300L, optimizer = "adam",
  optim_args = list(), verbose = FALSE, dropout = 0.3, batch_norm = TRUE,
  callbacks = NULL, .seed = NULL, .device = "auto", na_action = "omit", ...
)
| Argument | Description |
|---|---|
| formula | A survival formula of the form Surv(time, status) ~ predictors. |
| data | A data frame. |
| loss | Loss function name understood by survdnn. One of "cox", "cox_l2", "aft", or "coxtime" (default "cox"). |
| hidden | Integer vector of hidden layer sizes (e.g., c(32L, 32L, 16L), the default). |
| activation | Activation function name (default "relu"). Supported options depend on the installed survdnn version. |
| lr | Learning rate (default 1e-04). |
| epochs | Number of training epochs (default 300L). |
| optimizer | Optimizer name (default "adam"). |
| optim_args | Optional named list of extra optimizer arguments passed to torch via survdnn. |
| verbose | Logical; print training progress (default FALSE). |
| dropout | Numeric dropout rate (default 0.3). |
| batch_norm | Logical; whether to use batch normalization in hidden layers (default TRUE). |
| callbacks | Optional list of callback functions used by survdnn. |
| .seed | Optional integer random seed passed through to survdnn. |
| .device | Computation device (default "auto"). |
| na_action | How to handle missing values (default "omit"). |
| ... | Additional arguments forwarded to the underlying engine. |
Design contract. All fit_*() functions in survalis:
(i) return a named list with model, learner, engine, formula,
data, time, and status; and (ii) retain information required by
predict_*() to build a consistent design for new data.
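As an illustration of this contract, the sketch below wraps a covariate-free Kaplan-Meier estimate from the survival package. The functions fit_km() and predict_km() are hypothetical and are not part of the package. The prediction layout (one row per subject, one column per time, columns named "t=<time>") is an assumption based on the column naming used elsewhere in this manual.

```r
library(survival)

# Hypothetical learner that returns the elements named in the contract:
# model, learner, engine, formula, data, time, status.
fit_km <- function(formula, data, ...) {
  vars <- all.vars(formula[[2]])               # names inside Surv(time, status)
  km_formula <- stats::update(formula, . ~ 1)  # drop covariates for Kaplan-Meier
  structure(
    list(
      model   = survfit(km_formula, data = data),
      learner = "km",
      engine  = "survival",
      formula = formula,
      data    = data,
      time    = vars[1],
      status  = vars[2]
    ),
    class = "mlsurv_model"
  )
}

# Hypothetical predictor: every subject receives the same marginal curve.
# ASSUMPTION: output is a subjects x times matrix with "t=<time>" column names.
predict_km <- function(object, newdata, times, ...) {
  s <- summary(object$model, times = times, extend = TRUE)$surv
  matrix(rep(s, each = nrow(newdata)), nrow = nrow(newdata),
         dimnames = list(NULL, paste0("t=", times)))
}

mod  <- fit_km(Surv(time, status) ~ age + karno, data = survival::veteran)
pred <- predict_km(mod, newdata = survival::veteran[1:3, ], times = c(80, 160))
pred
```

Storing the names of the time and status variables alongside the formula is what lets downstream scoring code recover the observed outcome from new data without re-parsing the formula.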
An object of class mlsurv_model, a named list with elements:
model: The underlying fitted survdnn model.
learner: Character scalar identifying the learner ("survdnn").
engine: Character scalar naming the engine ("survdnn").
formula: The original survival formula.
data: The training dataset (or a minimal subset needed for prediction).
time: Name of the survival time variable.
status: Name of the event indicator (1 = event, 0 = censored).
Uses survdnn::survdnn with torch. The wrapper exposes the core training arguments used by the engine, including optimizer choice, dropout, batch normalization, callbacks, device selection, and missing-value handling.
survdnn documentation; torch for deep learning in R.
predict_survdnn(), tune_survdnn()
if (requireNamespace("survdnn", quietly = TRUE) &&
    requireNamespace("torch", quietly = TRUE) &&
    torch::torch_is_installed()) {
  mod <- fit_survdnn(
    Surv(time, status) ~ age + karno + celltype,
    data = veteran, loss = "cox", epochs = 50, verbose = FALSE
  )
  pred <- predict_survdnn(mod, newdata = veteran[1:5, ], times = c(30, 90, 180))
  print(pred)
}
Learns nonnegative, time‑specific stacking weights over a set of base survival
learners by solving a nonnegative least squares (NNLS) problem at each
requested time $t$. For each $t$, the target is the indicator
$I(T > t)$; the features are the base learners' predicted survival
probabilities $\hat{S}_\ell(t \mid x)$. Weights are constrained to be
nonnegative and are normalized to sum to 1 per time point.
fit_survmetalearner(base_preds, time, status, times, base_models, formula, data)
| Argument | Description |
|---|---|
| base_preds | A named list of matrices/data frames, one per base learner, each with one row per subject and one column per evaluation time in times. |
| time | Numeric vector of observed event/censoring times (one value per subject). |
| status | Numeric/binary vector of event indicators (1 = event, 0 = censored), one value per subject. |
| times | Numeric vector of evaluation times at which to learn weights. |
| base_models | A named list of fitted base learner objects; names must match names(base_preds). |
| formula | A survival formula. |
| data | The training data frame used for the base models (stored for metadata). |
For each t in times, this function fits
$$\hat{w}(t) = \arg\min_{w \ge 0} \lVert y - P w \rVert_2^2$$
where $y_i = I(T_i > t)$ and $P$ is the matrix of base survival
probabilities at $t$. The NNLS solution from nnls is renormalized
to sum to 1 (if the solution is all zeros, weights remain NA for that time).
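The per-time weight fit can be sketched for a single evaluation time. To stay self-contained, this example does not call the nnls package: it substitutes base R's optim() with method "L-BFGS-B" and a lower bound of zero, which solves the same nonnegative least squares problem on small inputs. The toy predictions and the names learner_a and learner_b are made up for illustration.

```r
# Toy training data: observed times and two base learners' predicted
# survival probabilities at a single evaluation time t = 5.
obs_time <- c(1, 2, 3, 4, 6, 7, 8, 9)
t_eval   <- 5
y <- as.numeric(obs_time > t_eval)              # target indicator I(T > t)
P <- cbind(
  learner_a = c(0.10, 0.20, 0.30, 0.40, 0.70, 0.80, 0.85, 0.90),
  learner_b = c(0.50, 0.45, 0.55, 0.50, 0.50, 0.55, 0.45, 0.50)
)

# Squared-error objective and its gradient
rss      <- function(w) sum((y - P %*% w)^2)
rss_grad <- function(w) as.vector(-2 * crossprod(P, y - P %*% w))

# ASSUMPTION: box-constrained L-BFGS-B stands in for the nnls solver.
fit <- stats::optim(par = rep(0.5, ncol(P)), fn = rss, gr = rss_grad,
                    method = "L-BFGS-B", lower = 0)
w_raw <- stats::setNames(fit$par, colnames(P))

# Renormalise to sum to 1; leave NA when every weight is zero
w <- if (sum(w_raw) > 0) w_raw / sum(w_raw) else w_raw * NA_real_
w
```

Here learner_b barely separates early from late failures, so the nonnegativity constraint pushes its weight to the boundary and the stacked prediction leans on learner_a.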
An object of class c("mlsurv_model","survmetalearner") with elements:
Matrix L x T of nonnegative stacking weights, rows=learners, cols="t=<time>".
Named list of base learner fits.
Named list of base prediction matrices on training data.
Character vector of learner names (from names(base_preds)).
Training metadata for scoring/reporting.
The string "survmetalearner" (for predict_* dispatch).
predict_survmetalearner, plot_survmetalearner_weights,
cv_survmetalearner
form <- Surv(time, status) ~ age + karno + trt
times <- c(80, 160)
mod_cox <- fit_coxph(form, data = veteran)
mod_rpart <- fit_rpart(form, data = veteran)
base_models <- list(coxph = mod_cox, rpart = mod_rpart)
base_preds <- list(
  coxph = predict_coxph(mod_cox, newdata = veteran, times = times),
  rpart = predict_rpart(mod_rpart, newdata = veteran, times = times)
)
meta_model <- fit_survmetalearner(
  base_preds = base_preds,
  time = veteran$time,
  status = veteran$status,
  times = times,
  base_models = base_models,
  formula = form,
  data = veteran
)
meta_model$weights
Fits a survival support vector machine using survivalsvm. The default
setup uses the regression-type loss with quadratic programming optimization
and an additive or linear kernel, but all survivalsvm options are exposed.
Returns an mlsurv_model object that integrates with the survalis
evaluation and cross-validation pipeline.
fit_survsvm(
  formula,
  data,
  type = "regression",
  gamma.mu = 0.1,
  opt.meth = "quadprog",
  kernel = "add_kernel",
  diff.meth = NULL
)
formula |
A survival formula of the form |
data |
A |
type |
SVM loss type used by |
gamma.mu |
Regularization parameter for the margin/hinge component. |
opt.meth |
Optimization method (e.g., |
kernel |
Kernel type (e.g., |
diff.meth |
Optional differentiation method passed to the engine. |
Design contract. All fit_*() functions in survalis:
(i) return a named list with model, learner, engine, formula,
data, time, and status; (ii) preserve terms(formula) or equivalent
for consistent prediction design; and (iii) keep engine arguments required
downstream by predict_*().
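As a rough illustration of part (i) of the contract, the sketch below checks that a fit object carries the listed elements. check_fit_contract and toy_fit are hypothetical names introduced here for illustration; they are not part of survalis.

```r
# Element names taken from the design contract above.
required_elements <- c("model", "learner", "engine", "formula",
                       "data", "time", "status")

# Hypothetical checker: errors if any required element is absent.
check_fit_contract <- function(fit) {
  absent <- setdiff(required_elements, names(fit))
  if (length(absent) > 0) {
    stop("fit object is missing: ", paste(absent, collapse = ", "))
  }
  invisible(TRUE)
}

# A toy object shaped like an mlsurv_model (no real engine behind it).
toy_fit <- structure(
  list(
    model = list(),
    learner = "toy",
    engine = "none",
    formula = y ~ x,
    data = data.frame(x = 1:3, y = c(2, 4, 6)),
    time = "time",
    status = "status"
  ),
  class = "mlsurv_model"
)

check_fit_contract(toy_fit)
```

A check like this could be run in unit tests for a new learner before wiring it into cross-validation.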
An object of class mlsurv_model, a named list with elements:
model: The underlying fitted survivalsvm model.
learner: Character scalar identifying the learner ("survsvm").
engine: Character scalar naming the engine ("survivalsvm").
formula: The original survival formula.
data: The training dataset (or a minimal subset needed for prediction).
time: Name of the survival time variable.
status: Name of the event indicator (1 = event, 0 = censored).
Uses survivalsvm::survivalsvm. Some optimization methods (e.g.,
"quadprog") require additional system dependencies and the quadprog
package to be installed.
Binder H, et al. (2009). Survival Support Vector Machines (and related work). survivalsvm package documentation.
predict_survsvm(), tune_survsvm()
mod_svm <- fit_survsvm(Surv(time, status) ~ age + celltype + karno,
  data = veteran, type = "regression", gamma.mu = 0.1, kernel = "lin_kernel")
times <- c(100, 300, 500)
predict_survsvm(mod_svm, newdata = veteran[1:5, ], times = times, dist = "exp")
predict_survsvm(mod_svm, newdata = veteran[1:5, ], times = times, dist = "weibull", shape = 1.5)
cv_results_svm <- cv_survlearner(
  formula = Surv(time, status) ~ age + celltype + karno,
  data = veteran,
  fit_fun = fit_survsvm,
  pred_fun = predict_survsvm,
  times = c(100, 300, 500),
  metrics = c("cindex", "ibs"),
  folds = 5,
  seed = 42,
  gamma.mu = 0.1,
  kernel = "lin_kernel"
)
print(cv_results_svm)
cv_summary(cv_results_svm)
cv_plot(cv_results_svm)
Fits a gradient-boosted tree model for time-to-event outcomes using
xgboost. Supports "survival:aft" (default) and "survival:cox".
Returns an mlsurv_model object compatible with the survalis pipeline.
fit_xgboost(
  formula,
  data,
  booster = "gbtree",
  objective = "survival:aft",
  aft_loss_distribution = "extreme",
  aft_loss_distribution_scale = 1,
  nrounds = 100
)
formula |
A survival formula of the form |
data |
A |
booster |
XGBoost booster type (default "gbtree"). |
objective |
One of "survival:aft" or "survival:cox". |
aft_loss_distribution |
AFT error distribution: |
aft_loss_distribution_scale |
Positive numeric scale for the AFT loss (default 1). |
nrounds |
Integer number of boosting iterations (default 100). |
Design contract: returns a named list with engine metadata and preserves
terms(formula) so prediction uses the exact same encoding as training.
An object of class mlsurv_model:
Fitted xgboost model.
"xgboost".
Original inputs.
Names of the survival time and event variables.
Objective used.
AFT distribution and scale (populated even if Cox is used).
Preserved terms for consistent prediction design matrices.
Uses xgboost. For "survival:aft", interval labels are set via
label_lower_bound/label_upper_bound with right-censoring
represented by Inf on the upper bound. For "survival:cox", labels
follow the Cox objective convention.
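The interval-label convention for "survival:aft" can be sketched in base R. The vectors below are made-up data, and this sketch only builds the bounds; it does not call xgboost. Using the observed time as the lower bound for every subject is the usual convention for right-censored data and is assumed here.

```r
# Toy right-censored data: 1 = event observed, 0 = right-censored.
time   <- c(5, 12, 30, 42)
status <- c(1, 0, 1, 0)

# Lower bound is always the observed time.
label_lower_bound <- time

# Upper bound equals the observed time for events and Inf for
# right-censored subjects (the true event time is only known to exceed it).
label_upper_bound <- ifelse(status == 1, time, Inf)

data.frame(time, status, label_lower_bound, label_upper_bound)
```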
predict_xgboost(), tune_xgboost()
mod_xgb <- fit_xgboost(
  Surv(time, status) ~ age + karno + celltype,
  data = veteran
)
Computes \(\mathrm{IAE} = \int |\bar{S}(t) - \hat{S}_{KM}(t)|\,dt\), where
\(\bar{S}(t)\) is the mean predicted survival across subjects and
\(\hat{S}_{KM}(t)\) is the Kaplan-Meier estimate. Integration is carried
out over KM event times using a left Riemann sum.
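A left Riemann sum over Kaplan-Meier event times can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: km_at_event_times and left_riemann_iae are hypothetical helpers, the data are toy values, and the mean predicted survival is assumed to be already evaluated at the KM event times.

```r
# Hypothetical helper: Kaplan-Meier estimate at the distinct event times.
km_at_event_times <- function(time, status) {
  tj <- sort(unique(time[status == 1]))
  factors <- vapply(tj, function(t) {
    1 - sum(time == t & status == 1) / sum(time >= t)
  }, numeric(1))
  list(time = tj, surv = cumprod(factors))
}

# Hypothetical helper: left Riemann sum of the absolute difference,
# using the value at the left end of each interval between event times.
left_riemann_iae <- function(t, s_model, s_km) {
  err <- abs(s_model - s_km)
  sum(err[-length(t)] * diff(t))
}

time   <- c(2, 4, 4, 6, 8)
status <- c(1, 1, 0, 1, 0)
km <- km_at_event_times(time, status)

# Made-up mean predicted survival at the KM event times.
s_bar <- c(0.9, 0.6, 0.4)

iae <- left_riemann_iae(km$time, s_bar, km$surv)
iae
```

Replacing the absolute difference with a squared difference gives the analogous integrated squared error.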
iae_survmat(object, sp_matrix, times)
object |
A |
sp_matrix |
Matrix/data frame of survival probabilities (rows = subjects,
columns aligned with |
times |
Numeric vector of times corresponding to the columns of |
A named numeric scalar: "iae".
y <- survival::Surv(time = veteran$time, event = veteran$status)
times <- c(60, 120)
lp <- stats::plogis(scale(veteran$karno))
sp <- cbind("t=60" = pmin(1, lp + 0.05), "t=120" = pmax(0, lp - 0.05))
colnames(sp) <- c("t=60", "t=120")
iae_survmat(y, sp_matrix = sp, times = times)
Computes the Integrated Brier Score (IBS) over a vector of times, using discrete left Riemann integration of IPCW Brier scores.
ibs_survmat(object, sp_matrix, times)
object |
A |
sp_matrix |
Matrix/data frame of survival probabilities with one column
per time in |
times |
Numeric vector of strictly increasing times (length must equal
|
A named numeric scalar: "ibs".
y <- survival::Surv(time = veteran$time, event = veteran$status)
times <- c(60, 120)
lp <- stats::plogis(scale(veteran$karno))
sp <- cbind("t=60" = pmin(1, lp + 0.05), "t=120" = pmax(0, lp - 0.05))
colnames(sp) <- c("t=60", "t=120")
ibs_survmat(y, sp_matrix = sp, times = times)
Computes \(\mathrm{ISE} = \int (\bar{S}(t) - \hat{S}_{KM}(t))^2\,dt\), where
\(\bar{S}(t)\) is the mean predicted survival and
\(\hat{S}_{KM}(t)\) is the Kaplan-Meier curve. Integration uses KM event times and a left
Riemann sum.
ise_survmat(object, sp_matrix, times)
object |
A |
sp_matrix |
Matrix/data frame of survival probabilities (rows = subjects,
columns aligned with |
times |
Numeric vector of times corresponding to the columns of |
A named numeric scalar: "ise".
y <- survival::Surv(time = veteran$time, event = veteran$status)
times <- c(60, 120)
lp <- stats::plogis(scale(veteran$karno))
sp <- cbind("t=60" = pmin(1, lp + 0.05), "t=120" = pmax(0, lp - 0.05))
colnames(sp) <- c("t=60", "t=120")
ise_survmat(y, sp_matrix = sp, times = times)
Returns a table mapping each compute_* function to its paired
plot_* helper (if any). Methods without a plot helper show NA.
list_interpretability_methods()
A tibble with columns compute, plot,
has_compute, and has_plot.
list_interpretability_methods()
subset(list_interpretability_methods(), is.na(plot)) # methods without plot
Returns a data frame describing the evaluation metrics supported by
survalis (as used by helpers like cv_survlearner() and score_survmodel()).
list_metrics()
Columns:
metric: short metric name used throughout the package.
direction: whether higher or lower values are better.
summary: brief description.
range: typical value range.
A tibble/data.frame with one row per metric.
list_metrics()
Returns a table of known survival learners, showing the expected
fit_*, predict_*, and (if present) tune_* functions, plus
booleans indicating which functions are available.
list_survlearners(has_tune = FALSE)
has_tune |
Logical (default FALSE); if TRUE, only tunable learners are returned. |
A tibble with columns:
learner – learner id (e.g., "ranger")
fit, predict, tune – function names (tune may be NA)
has_fit, has_predict, has_tune – logical flags
available – has_fit & has_predict
list_survlearners()                # all learners (default)
list_survlearners(has_tune = TRUE) # only tunable learners
Filters learners to only those that provide a tune_* function in addition
to fit_* and predict_*.
list_tunable_survlearners()
A base data.frame like list_survlearners() but containing only
rows where tune is not NA.
list_tunable_survlearners()
Visualizes ALE results produced by compute_ale() either as per-time curves
(one curve per evaluation time) or as an integrated curve averaged across
times.
plot_ale(
  ale_result,
  feature,
  which = c("per_time", "integrated"),
  smooth = FALSE
)
ale_result |
A list returned by |
feature |
Character name of the feature (for axis labeling only). |
which |
Either |
smooth |
Logical; if |
Per-time plots show how the feature's local effect varies across different evaluation times. The integrated plot summarizes the average effect over the supplied time grid (simple mean across times of the centered ALE values).
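The integrated summary described above (a simple mean across times of centered ALE values) can be sketched in base R. The matrix below is made-up data with rows as feature grid points and columns as evaluation times; the layout of the real compute_ale() output may differ.

```r
# Made-up uncentered ALE values: rows = grid points, columns = times.
ale <- cbind(
  "t=80"  = c(0.0, 0.1, 0.2),
  "t=160" = c(-0.2, 0.2, 0.3)
)

# Center each time's curve so that it averages to zero over the grid.
centered <- sweep(ale, 2, colMeans(ale))

# Integrated curve: plain mean across the time columns at each grid point.
integrated <- rowMeans(centered)
integrated
```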
A ggplot2 object.
compute_ale(), compute_pdp(), plot_pdp()
mod <- fit_coxph(Surv(time, status) ~ age + karno + celltype, data = veteran)
ale_res <- compute_ale(
  model = mod,
  newdata = veteran,
  feature = "karno",
  times = c(80, 160),
  grid.size = 8
)
plot_ale(ale_res, feature = "karno", which = "per_time")
plot_ale(ale_res, feature = "karno", which = "integrated", smooth = TRUE)
Produces box‑and‑jitter plots of CV metric values per learner, faceted by metric for quick visual comparison.
plot_benchmark(benchmark_results)
benchmark_results |
A data frame from
|
A ggplot2 object.
benchmark_default_survlearners(), summarise_benchmark()
res <- tibble::tibble(
  learner = c("coxph", "coxph", "rpart", "rpart"),
  metric = c("cindex", "ibs", "cindex", "ibs"),
  value = c(0.64, 0.19, 0.60, 0.23)
)
plot_benchmark(res)
Produces a calibration plot comparing mean predicted survival (x-axis) to observed survival with bootstrap CIs (y-axis) at a single evaluation time.
plot_calibration(calib_output, smooth = TRUE)
calib_output |
Output list returned by |
smooth |
Logical; if |
Points above the diagonal indicate underprediction (observed survival higher than predicted), while points below indicate overprediction.
A ggplot2 object showing bin-wise calibration points, bootstrap
error bars, the 45° reference line, and (optionally) a smooth curve.
mod <- fit_coxph(Surv(time, status) ~ age + karno + celltype, data = veteran)
calib <- compute_calibration(
  model = mod,
  data = veteran,
  time = "time",
  status = "status",
  eval_time = 80,
  n_bins = 4,
  n_boot = 5,
  seed = 1
)
plot_calibration(calib)
Visualizes counterfactual feature changes returned by
compute_counterfactual, ranking recommendations by either raw
survival gain or penalized gain.
plot_counterfactual(
  counterfactual_df,
  metric = c("penalized_gain", "survival_gain"),
  top_n = NULL,
  include_negative = FALSE
)
counterfactual_df |
A data frame returned by |
metric |
Character scalar; one of |
top_n |
Optional integer limiting the plot to the top |
include_negative |
Logical; if |
A ggplot2 object.
df <- veteran
df$A <- df$trt
mod <- fit_coxph(survival::Surv(time, status) ~ A + age + karno, data = df)
cf <- compute_counterfactual(
  model = mod,
  newdata = df[1, , drop = FALSE],
  times = c(50, 100, 150),
  target_time = 100,
  features_to_change = c("A", "age", "karno"),
  grid.size = 10
)
plot_counterfactual(cf)
Visualizes interaction outputs from compute_interactions as
(i) a ranked bar chart for one-way interactions, (ii) a pairwise heatmap,
or (iii) time-varying interaction trajectories.
plot_interactions(object, type = c("1way", "heatmap", "time"))
object |
A data frame returned by |
type |
One of |
1way: Bars rank features by Friedman-H interaction strength at the target time.
heatmap: Tiles show pairwise interaction magnitudes (symmetric).
time: Lines show interaction strength vs. time for each feature.
A ggplot2 object.
mod <- fit_coxph(Surv(time, status) ~ age + karno + trt, data = veteran)
times <- c(80, 160)
ia <- compute_interactions(
  model = mod,
  data = veteran,
  times = times,
  target_time = 80,
  features = c("age", "karno"),
  type = "1way",
  grid.size = 6
)
plot_interactions(ia, type = "1way")
Plots partial dependence (PDP) and/or individual conditional expectation (ICE)
results returned by compute_pdp either per evaluation time or as
an integrated PDP over time.
plot_pdp(
  pdp_ice_output,
  feature,
  method = "pdp+ice",
  ids = NULL,
  which = c("per_time", "integrated"),
  alpha_ice = 0.2,
  smooth = FALSE
)
pdp_ice_output |
The list returned by |
feature |
Character scalar; the same feature analyzed in |
method |
One of |
ids |
Optional vector of row ids ( |
which |
One of |
alpha_ice |
Alpha transparency for ICE lines/boxes in per-time plots (default 0.2). |
smooth |
Logical; if |
Per-time: For numeric features, draws ICE lines and PDP overlays per time.
For categorical features, shows ICE as boxplots per level and PDP as point summaries.
Integrated: Plots the PDP integrated across time (if provided by
compute_pdp()); numeric features can be smoothed with smooth=TRUE.
A ggplot2 object.
mod <- fit_coxph(Surv(time, status) ~ age + karno + trt, data = veteran)
pdp_age <- compute_pdp(
  model = mod,
  data = veteran,
  feature = "age",
  times = c(80, 160),
  method = "pdp+ice",
  grid.size = 8
)
plot_pdp(pdp_age, feature = "age", which = "per_time")
plot_pdp(pdp_age, feature = "age", which = "integrated", smooth = TRUE)
Plots time-dependent SHAP estimates (lines over time) or aggregated SHAP
(bar chart) returned by compute_shap().
plot_shap(shapley_result, type = c("auto"))
shapley_result |
A data frame returned by |
type |
One of |
A ggplot2 object.
mod <- fit_coxph(survival::Surv(time, status) ~ age + karno + celltype, data = veteran)
shap_td <- compute_shap(mod, veteran[10, , drop = FALSE], veteran,
  times = c(50, 100), sample.size = 5)
p1 <- plot_shap(shap_td) # auto -> time plot
shap_ag <- compute_shap(mod, veteran[10, , drop = FALSE], veteran,
  times = c(50, 100), sample.size = 5, aggregate = TRUE, method = "meanabs")
p2 <- plot_shap(shap_ag) # auto -> bar plot
Visualizes the signed local effects from compute_surrogate as a
horizontal bar chart, optionally limiting to the top n contributors.
plot_surrogate(surrogate_df, top_n = NULL)
surrogate_df |
A data frame returned by |
top_n |
Optional integer; if provided, display only the top |
A ggplot2 object showing feature contributions
(positive/negative) at the target time.
mod <- fit_coxph(Surv(time, status) ~ age + karno + celltype, data = veteran)
local_expl <- compute_surrogate(
  model = mod,
  newdata = veteran[2, , drop = FALSE],
  baseline_data = veteran,
  times = c(80, 160),
  target_time = 80,
  k = 3
)
plot_surrogate(local_expl, top_n = 3)
Plots one or more predicted survival curves from a survival-probability
matrix. By default, each row of S is shown as an individual curve. If
a group vector is supplied, curves are summarized by group using the
requested aggregation function.
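The group-wise summary can be sketched in base R. This is only an illustration of averaging curves within groups; the toy matrix and the use of tapply() are assumptions, not the function's internals.

```r
# Toy survival-probability matrix: rows = subjects, columns = times.
S <- rbind(
  c(0.95, 0.80, 0.60),
  c(0.90, 0.70, 0.45),
  c(0.92, 0.78, 0.55)
)
colnames(S) <- c("t=1", "t=2", "t=3")
group <- c("A", "B", "A")

# One summary curve per group: apply the aggregation (here mean) to each
# time column within each group. Swap mean for median to change the summary.
grouped <- apply(S, 2, function(col) tapply(col, group, mean))
grouped
```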
plot_survmat(
  S,
  times = NULL,
  group = NULL,
  ids = NULL,
  summary_fun = c("mean", "median"),
  show_individual = NULL,
  alpha = 0.2,
  linewidth = 0.7
)
S |
A |
times |
Optional numeric vector of time points corresponding to columns
of |
group |
Optional vector of group labels of length |
ids |
Optional integer vector of row ids to subset before plotting. |
summary_fun |
Aggregation for grouped curves; one of |
show_individual |
Logical. If |
alpha |
Alpha transparency for individual curves (default 0.2). |
linewidth |
Line width for plotted curves (default 0.7). |
A ggplot2 object.
S <- data.frame(
  `t=1` = c(0.95, 0.90, 0.92),
  `t=2` = c(0.80, 0.70, 0.78),
  `t=3` = c(0.60, 0.45, 0.55),
  check.names = FALSE
)
plot_survmat(S)
plot_survmat(S, group = c("A", "B", "A"))
Visualizes the learned nonnegative NNLS stacking weights over time
for each base learner.
plot_survmetalearner_weights(model)
model |
A |
A ggplot2 object showing weight trajectories (one line per learner).
form <- Surv(time, status) ~ age + karno + trt
times <- c(80, 160)
mod_cox <- fit_coxph(form, data = veteran)
mod_rpart <- fit_rpart(form, data = veteran)
base_models <- list(coxph = mod_cox, rpart = mod_rpart)
base_preds <- list(
  coxph = predict_coxph(mod_cox, newdata = veteran, times = times),
  rpart = predict_rpart(mod_rpart, newdata = veteran, times = times)
)
meta_model <- fit_survmetalearner(
  base_preds = base_preds,
  time = veteran$time,
  status = veteran$status,
  times = times,
  base_models = base_models,
  formula = form,
  data = veteran
)
plot_survmetalearner_weights(meta_model)
Visualizes the results of a tree_surrogate object returned by
compute_tree_surrogate(). Can display either the fitted surrogate trees
or aggregated feature importance based on split counts.
plot_tree_surrogate(tree_surrogate, type = c("tree", "importance"), top_n = 10)
tree_surrogate |
An object of class |
type |
Character string indicating the type of plot:
|
top_n |
Integer, the number of top features to display in the importance plot.
Ignored if |
If type = "tree", a separate tree diagram is produced for each evaluation time.
If type = "importance", feature split counts are summed across all times
and plotted as a bar chart.
Requires the partykit package for tree plotting and ggplot2 for importance plotting.
A plot object (for "importance") or printed tree diagrams (for "tree").
plot_tree_surrogate(tree_ranger, type = "tree")
plot_tree_surrogate(tree_ranger, type = "importance", top_n = 5)
Creates a dot plot of permutation-based variable importance, using either the scaled importance (default) or the raw importance column.
plot_varimp(varimp_df, use_scaled = TRUE)
varimp_df |
A data frame as returned by |
use_scaled |
Logical; if |
A ggplot2 object.
mod <- fit_coxph(survival::Surv(time, status) ~ age + karno + celltype, data = veteran)
imp <- compute_varimp(
  model = mod,
  times = 80,
  metric = "brier",
  n_repetitions = 3,
  seed = 1,
  subset = 40
)
plot_varimp(imp, use_scaled = TRUE)
plot_varimp(imp, use_scaled = FALSE)
Computes survival probabilities at specified time points from a model fitted
with fit_aalen.
predict_aalen(object, newdata, times)
object |
An |
newdata |
A data frame of new observations. |
times |
Numeric vector of time points at which to evaluate survival probabilities. |
A data frame (rows = observations, columns = "t=<time>").
mod <- fit_aalen(Surv(time, status) ~ trt + karno + age, data = veteran, max.time = 600)
head(predict_aalen(mod, newdata = veteran[1:5, ], times = 0:10))
Computes survival probabilities at specified time points from a fitted
aftgee AFT model using a log-normal approximation for the error distribution.
predict_aftgee(object, newdata, times = NULL)
object |
An |
newdata |
A data.frame of predictor values for prediction. |
times |
Numeric vector of time points at which to estimate survival probabilities.
If |
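We use the approximation
\(S(t \mid x) = 1 - \Phi\left((\log t - \eta(x)) / \sigma\right)\),
where \(\Phi\) is the standard normal CDF, \(\eta(x)\) is the linear predictor, and \(\sigma\) is the scale of the error term. Here \(\sigma\) is held fixed rather than estimated, as a simple,
distribution-agnostic proxy; this yields a monotone-in-time score useful for benchmarking.
For production use, prefer a parametric AFT fit where \(\sigma\) is estimated.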
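A log-normal survival approximation of this kind can be sketched in base R as \(S(t \mid x) = 1 - \Phi((\log t - \eta(x)) / \sigma)\). The linear predictors below are made-up numbers, and sigma = 1 is an assumption chosen for illustration, not a value confirmed by the package.

```r
# Hypothetical linear predictors on the log-time scale, one per subject.
lp <- c(4.2, 5.0)

# Assumed fixed scale for the error term (illustrative only).
sigma <- 1

times <- c(20, 60, 120)

# One row per subject, one column per requested time.
surv <- t(vapply(
  lp,
  function(eta) 1 - stats::pnorm((log(times) - eta) / sigma),
  numeric(length(times))
))
colnames(surv) <- paste0("t=", times)
surv
```

Each row is non-increasing in time, and a larger linear predictor yields a higher survival probability at every time.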
A data.frame of survival probabilities with one row per newdata observation
and one column per requested time ("t=<time>").
mod <- fit_aftgee(Surv(time, status) ~ trt + karno + age, data = veteran)
predict_aftgee(mod, newdata = veteran[1:5, ], times = c(20, 60, 120))
Generates survival probability predictions at requested times for new data
using a model fitted by fit_bart.
predict_bart(object, newdata, times)
object |
An |
newdata |
A |
times |
Numeric vector of time points at which to estimate survival probabilities (same scale as the training time). |
The BART engine predicts survival on its internal grid object$eval_times.
Requested times are aligned to that grid by nearest‐neighbor matching,
returning one survival estimate per requested time.
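Nearest-neighbor alignment to an internal grid can be sketched in base R. The grid and survival values below are made up, and the tie-breaking shown (which.min picks the earlier grid point) is an assumption about behaviour, not something the package documents.

```r
# Made-up internal grid and survival estimates on that grid
# (rows = observations, columns = grid times).
eval_times <- c(5, 20, 45, 90)
surv_on_grid <- rbind(
  c(0.95, 0.80, 0.60, 0.35),
  c(0.90, 0.70, 0.45, 0.20)
)

times <- c(10, 30, 60)

# For each requested time, find the index of the closest grid time.
idx <- vapply(times, function(t) which.min(abs(eval_times - t)), integer(1))

pred <- as.data.frame(surv_on_grid[, idx, drop = FALSE])
names(pred) <- paste0("t=", times)
pred
```

Because values are matched rather than interpolated, two requested times that map to the same grid point receive identical survival estimates.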
A base data.frame with one row per observation in newdata
and columns named "t=<time>" (character), containing survival
probabilities in [0, 1].
ex_data <- veteran[1:40, c("time", "status", "age", "karno", "celltype")]
mod_bart <- fit_bart(
  Surv(time, status) ~ age + karno + celltype,
  data = ex_data,
  K = 1,
  ntree = 5,
  ndpost = 20,
  nskip = 5,
  mc.cores = 1,
  seed = 42
)
predict_bart(mod_bart, newdata = ex_data[1:5, ], times = c(10, 30, 60))
Generates survival probabilities at requested times from a fitted
mboost Cox boosting model produced by fit_blackboost().
predict_blackboost(object, newdata, times, ...)
object |
An |
newdata |
A |
times |
Numeric vector of time points at which to compute survival probabilities. |
... |
Additional arguments forwarded to |
Predictions use mboost::survFit() to obtain a step function
\(S(t)\) per observation. If any requested times exceed the model's
maximum time, the last survival value is carried forward (right-constant).
Values are then matched to times using stepwise (piecewise-constant)
interpolation.
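Piecewise-constant lookup with the last value carried forward can be sketched in base R with findInterval(). The step function below is made up, and it is assumed to start at time 0 with survival 1 so that every requested time falls in some interval.

```r
# Made-up step function for one observation: S(t) jumps at step_times
# and stays constant until the next jump.
step_times <- c(0, 10, 25, 60)
step_surv  <- c(1, 0.9, 0.7, 0.4)

times <- c(5, 10, 40, 100)

# findInterval() returns the index of the last jump at or before each time;
# times beyond the final jump map to the last index (value carried forward).
idx <- findInterval(times, step_times)
surv_at <- step_surv[idx]
surv_at
```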
A base data.frame with one row per observation in newdata and
columns named "t=<time>" (character), containing survival probabilities
in [0, 1].
mod <- fit_blackboost(Surv(time, status) ~ age + karno + celltype, data = veteran)
predict_blackboost(mod, newdata = veteran[1:5, ], times = c(5, 10, 40))
Generates survival probabilities from an mlsurv_model fitted by
fit_bnnsurv. If times is supplied, survival curves are
linearly interpolated from the engine's internal time grid to those times.
predict_bnnsurv(object, newdata, times = NULL)
object |
A fitted |
newdata |
A data frame of new observations for prediction. |
times |
Optional numeric vector of evaluation time points. If |
Internally, predictions are obtained via bnnSurvival::predict().
The returned survival matrix is post-processed to enforce monotonicity
(cumulative minimum over time). When times is provided, values are
obtained by stats::approx() (linear interpolation, rule = 2).
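The two post-processing steps can be sketched in base R for a single curve. The raw values and grid are made up; only cummin() and stats::approx() with rule = 2 are taken from the description above, and clipping is done here with pmin()/pmax() as an assumption.

```r
# Made-up engine output for one observation on its internal time grid;
# note the raw curve is not monotone.
grid <- c(10, 50, 100, 200)
raw  <- c(0.92, 0.95, 0.60, 0.65)

# Enforce a non-increasing curve with a cumulative minimum, then clip to [0, 1].
mono <- pmin(pmax(cummin(raw), 0), 1)

# Linear interpolation to the requested times; rule = 2 extends the nearest
# end value for times outside the grid.
times <- c(5, 75, 300)
out <- stats::approx(grid, mono, xout = times, method = "linear", rule = 2)$y
out
```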
A base data.frame with one row per observation and columns
named "t=<time>" containing survival probabilities in [0, 1].
Probabilities are clipped to [0, 1] and made non-increasing over time.
mod <- fit_bnnsurv(Surv(time, status) ~ age + karno + diagtime + prior, data = veteran)
pred <- predict_bnnsurv(mod, newdata = veteran[1:3, ], times = c(50, 100, 200))
pred
Generates predicted survival probabilities at specified time points from a
fitted cforest survival model.
predict_cforest(object, newdata, times, ...)
object |
An |
newdata |
A data frame containing the predictor variables for prediction. |
times |
A numeric vector of time points at which to estimate survival probabilities. |
... |
Not used, included for compatibility. |
Survival curves are extracted from the fitted cforest model and
linearly interpolated to the requested time points.
A data frame of survival probabilities with one row per observation
in newdata and one column per requested time point (named "t=<time>").
mod <- fit_cforest(Surv(time, status) ~ age + celltype + karno, data = veteran)
predict_cforest(mod, newdata = veteran[1:5, ], times = c(100, 200, 300))
Generates predicted survival probabilities at specified time points for new data, using a fitted Cox proportional hazards model.
predict_coxph(object, newdata, times)
object |
An |
newdata |
A data frame containing the predictor variables for which to compute predictions. |
times |
A numeric vector of time points at which to evaluate survival probabilities. |
Predictions are computed using pec::predictSurvProb().
The output is formatted as a data frame with one row per observation
in newdata and one column per time point.
A data frame of survival probabilities with columns named
"t=<time>" for each requested time.
mod_cox <- fit_coxph(Surv(time, status) ~ age + karno + celltype, data = veteran)
predict_coxph(mod_cox, newdata = veteran[1:5, ], times = c(100, 200, 300))
This function generates survival probability predictions at specified time
points from a model fitted with fit_flexsurvreg.
predict_flexsurvreg(object, newdata, times, ...)
object |
A fitted |
newdata |
A data frame of new observations for which to predict survival probabilities. |
times |
Numeric vector of time points at which to estimate survival probabilities. |
... |
Additional arguments passed to |
A data frame of survival probabilities with one row per observation in
newdata and one column per time point. Column names are of the form
"t=100", "t=200", etc.
mod_flex <- fit_flexsurvreg(Surv(time, status) ~ age + celltype + karno,
  data = veteran, dist = "weibull")
predict_flexsurvreg(mod_flex, newdata = veteran[1:5, ], times = c(100, 200, 300))
Predicts survival probabilities at specified time points from a fitted
penalized Cox proportional hazards model produced by fit_glmnet.
predict_glmnet(object, newdata, times, ...)
object |
A fitted |
newdata |
A |
times |
Numeric vector of time points at which to estimate survival probabilities. Must be in the same scale as the training time variable. |
... |
Not used. |
Predictions are computed by:
1. Computing the linear predictors for training and new data at s = "lambda.min".
2. Fitting a Cox PH model on the training linear predictors to estimate the baseline cumulative hazard.
3. Interpolating the baseline cumulative hazard at each requested time in times and transforming via $S(t \mid x) = \exp\{-H_0(t)\exp(\eta(x))\}$, where $\eta(x)$ is the linear predictor.
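The interpolation and transformation steps can be sketched in base R. This is only an illustration under stated assumptions, not the package's code: the baseline cumulative hazard `base_H0`, its grid `base_time`, and the linear predictors `lp_new` are made-up values, and the standard Cox relation S(t | x) = exp(-H0(t) * exp(lp)) is assumed.

```r
# Hypothetical baseline cumulative hazard on a training time grid
base_time <- c(50, 100, 150, 200, 250, 300)
base_H0   <- c(0.10, 0.25, 0.45, 0.70, 1.00, 1.40)

# Hypothetical linear predictors for three new observations
lp_new <- c(-0.5, 0, 0.8)
times  <- c(100, 175, 300)

# Linearly interpolate H0 at the requested times (rule = 2 clamps outside the grid)
H0_t <- approx(base_time, base_H0, xout = times, rule = 2)$y

# Assumed Cox relation: S(t | x) = exp(-H0(t) * exp(lp)); rows = observations
surv <- as.data.frame(exp(-outer(exp(lp_new), H0_t)))
names(surv) <- paste0("t=", times)
surv
```

Each row is one observation and each column one requested time, matching the "t=<time>" naming convention used throughout.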
A data.frame with one row per observation in newdata
and one column per requested time point ("t=<time>").
mod_glmnet <- fit_glmnet(
  Surv(time, status) ~ age + karno + celltype,
  data = veteran
)
predict_glmnet(
  mod_glmnet,
  newdata = veteran[1:5, ],
  times = c(100, 200, 300)
)
Generates survival probability predictions at specified time points from a fitted Oblique Random Survival Forest model.
predict_orsf(object, newdata, times, ...)
object |
A fitted ORSF model object from |
newdata |
A data frame of new observations for prediction. |
times |
A numeric vector of time points at which to estimate survival probabilities. |
... |
Additional arguments passed to |
Predictions are computed using the pred_type = "surv" option from aorsf,
returning estimated survival probabilities for each observation at the specified time points.
A data frame where each row corresponds to an observation in newdata
and each column corresponds to a requested prediction time ("t=<time>").
mod <- fit_orsf(Surv(time, status) ~ age + karno, data = veteran)
pred <- predict_orsf(mod, newdata = veteran[1:5, ], times = c(100, 200, 300))
head(pred)
Generates predicted survival probabilities for given time points
from a model fitted with fit_ranger.
predict_ranger(object, newdata, times)
object |
An |
newdata |
A |
times |
Numeric vector of time points at which to estimate survival probabilities. |
Predictions are obtained from predict.ranger and
survival curves are interpolated to match the requested times.
A data.frame with one row per observation in newdata
and one column per time point (columns named "t=<time>").
mod <- fit_ranger(
  Surv(time, status) ~ age + karno + celltype,
  data = veteran,
  num.trees = 25
)
predict_ranger(mod, newdata = veteran[1:5, ], times = c(100, 200, 300))
Predicts survival probabilities at specified time points from a fitted
rpart survival tree model, assuming an exponential survival distribution
within each terminal node.
predict_rpart(object, newdata, times, ...)
object |
An |
newdata |
A |
times |
Numeric vector of time points at which to compute survival probabilities. |
... |
Additional arguments passed to internal prediction functions. |
Predictions are based on converting the predicted mean survival times from the survival tree into survival probabilities under an exponential assumption:
$S(t) = \exp(-t / \hat{\mu})$,
where $\hat{\mu}$ is the predicted mean survival time from the terminal node.
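The exponential conversion can be illustrated with a few lines of base R. The mean survival times in `mu_hat` are hypothetical values standing in for terminal-node predictions; an exponential distribution with mean mu has survival function exp(-t / mu).

```r
# Hypothetical predicted mean survival times, one per observation
mu_hat <- c(80, 150, 400)
times  <- c(100, 200, 300)

# Exponential survival with mean mu: S(t) = exp(-t / mu); rows = observations
surv <- exp(-outer(1 / mu_hat, times))
colnames(surv) <- paste0("t=", times)
surv
```

Because a single rate is used per node, every observation in the same terminal node receives the same curve.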
A data.frame with one row per observation in newdata and one column
per requested time point, containing predicted survival probabilities.
mod_rpart <- fit_rpart(Surv(time, status) ~ age + karno + celltype, data = veteran)
predict_rpart(mod_rpart, newdata = veteran[1:5, ], times = c(100, 200, 300))
Generates survival probability predictions for new data using a model
fitted by fit_rsf().
predict_rsf(object, newdata, times = NULL, ...)
object |
A fitted |
newdata |
A |
times |
Optional numeric vector of time points at which to return survival
probabilities. If |
... |
Additional arguments forwarded to
|
If times is provided, the function aligns predictions by selecting, for each
requested time, the closest available time from the RSF prediction object's
time.interest grid (nearest-neighbor matching).
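Nearest-neighbor matching on a time grid can be sketched as follows. The grid `time_interest` and the survival matrix `surv_grid` are hypothetical stand-ins for what an RSF prediction object would hold; ties are resolved in favor of the earlier grid time here, which is an assumption of this sketch.

```r
# Hypothetical time grid and survival matrix (rows = observations)
time_interest <- c(30, 90, 150, 210, 330)
surv_grid <- rbind(
  c(0.95, 0.80, 0.60, 0.45, 0.20),
  c(0.90, 0.70, 0.50, 0.30, 0.10)
)
times <- c(100, 200, 300)

# For each requested time, index of the closest available grid time
idx <- vapply(times, function(t) which.min(abs(time_interest - t)), integer(1))

surv <- as.data.frame(surv_grid[, idx, drop = FALSE])
names(surv) <- paste0("t=", times)
surv
```

Note that a requested time of 300 is matched to the grid time 330, so the reported probability refers to a slightly later time than requested.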
A base data.frame of survival probabilities with one row per
observation in newdata and columns named t={time} (character), containing
numeric values in [0, 1].
mod_rsf <- fit_rsf(Surv(time, status) ~ age + celltype + karno, data = veteran, ntree = 200)
times <- c(100, 200, 300)
pred_probs <- predict_rsf(mod_rsf, newdata = veteran[1:5, ], times = times)
print(round(pred_probs, 3))
Generates survival probabilities at specified time points using a fitted
selected‑predictor Cox model (mlsurv_model) via pec::predictSurvProb().
predict_selectcox(object, newdata, times)
object |
A fitted |
newdata |
A |
times |
Numeric vector of evaluation time points (same scale as the training survival time). Must be non‑negative and finite. |
Internally calls pec::predictSurvProb(object$model, newdata, times)
and renames columns to the standard t=... convention used by survalis.
A base data.frame with one row per observation in newdata and
columns named t={time} (character), containing numeric survival
probabilities in [0, 1].
fit_selectcox(), tune_selectcox()
mod <- fit_selectcox(Surv(time, status) ~ age + celltype + karno, data = veteran)
predict_selectcox(mod, newdata = veteran[1:5, ], times = c(100, 200, 300))
Generates survival probabilities at specified time points using a fitted
rstpm2 model wrapped as an mlsurv_model.
predict_stpm2(object, newdata, times, ...)
object |
A fitted |
newdata |
A |
times |
Numeric vector of evaluation time points (same scale as the training survival time). Must be non‑negative and finite. |
... |
Additional arguments forwarded to |
Internally expands newdata over the requested times and calls
predict(object$model, type = "surv", newtime = times); results are
reshaped to a wide format with one column per time point.
A base data.frame with one row per observation in newdata and
columns named t={time} (character), containing numeric survival
probabilities in [0, 1].
mod_stpm2 <- fit_stpm2(Surv(time, status) ~ age + karno + celltype, data = veteran, df = 4)
predict_stpm2(mod_stpm2, newdata = veteran[1:5, ], times = c(100, 200, 300))
Generates survival probabilities at specified time points using a fitted
deep neural network survival model (mlsurv_model).
predict_survdnn(
  object,
  newdata,
  times = NULL,
  type = c("survival", "lp", "risk"),
  ...
)
object |
A fitted |
newdata |
A |
times |
Numeric vector of evaluation time points (same scale as the
survival time used at training). Required for |
type |
Prediction type. |
... |
Additional arguments forwarded to |
Delegates to the installed survdnn prediction method. The default
remains type = "survival" so the result stays compatible with
cv_survlearner() and the rest of the survalis evaluation pipeline.
If type = "survival", a base data.frame with one row per observation
in newdata and columns named t={time} containing values in [0, 1].
If type = "lp" or type = "risk", a numeric vector.
if (requireNamespace("survdnn", quietly = TRUE) &&
    requireNamespace("torch", quietly = TRUE) &&
    torch::torch_is_installed()) {
  mod <- fit_survdnn(Surv(time, status) ~ age + karno + celltype, data = veteran, loss = "cox", epochs = 50, verbose = FALSE)
  pred <- predict_survdnn(mod, newdata = veteran[1:5, ], times = c(30, 90, 180))
  print(pred)
}
Produces stacked survival probabilities by combining base learner predictions
via the time‑specific weights learned by fit_survmetalearner.
predict_survmetalearner(model, newdata, times)
model |
A |
newdata |
A data frame of new observations for prediction. |
times |
Numeric vector of evaluation times (must be a subset of the times used to train the meta‑learner). |
For each base learner $l$ listed in model$learners, the corresponding
predict_<learner> function is called to obtain $\hat{S}_l(t \mid x)$.
Stacked predictions are computed as $\hat{S}(t \mid x) = \sum_l w_l(t)\, \hat{S}_l(t \mid x)$,
where $w_l(t)$ are the learned nonnegative weights for time $t$.
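The weighted combination can be sketched in base R. Both the base-learner predictions and the weight matrix `w` below are invented for illustration; in particular, weights that sum to one at each time are an assumption of this sketch, since the text above only requires them to be nonnegative.

```r
times <- c(80, 160)

# Hypothetical base-learner survival predictions (rows = observations)
base_preds <- list(
  coxph = rbind(c(0.80, 0.60), c(0.70, 0.40)),
  rpart = rbind(c(0.90, 0.50), c(0.60, 0.30))
)

# Hypothetical time-specific weights: rows = learners, columns = times
w <- rbind(coxph = c(0.7, 0.4), rpart = c(0.3, 0.6))

# Weighted sum over learners, applied column by column (one weight per time)
stacked <- Reduce(`+`, lapply(names(base_preds), function(l) {
  sweep(base_preds[[l]], 2, w[l, ], `*`)
}))
colnames(stacked) <- paste0("t=", times)
stacked
```

Because the weights differ by column, a learner can dominate at early times and contribute little at later ones.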
A data frame with one row per observation and one column per requested
time (columns named "t=<time>"), containing stacked survival probabilities.
fit_survmetalearner, plot_survmetalearner_weights
form <- Surv(time, status) ~ age + karno + trt
times <- c(80, 160)
mod_cox <- fit_coxph(form, data = veteran)
mod_rpart <- fit_rpart(form, data = veteran)
base_models <- list(coxph = mod_cox, rpart = mod_rpart)
base_preds <- list(
  coxph = predict_coxph(mod_cox, newdata = veteran, times = times),
  rpart = predict_rpart(mod_rpart, newdata = veteran, times = times)
)
meta_model <- fit_survmetalearner(
  base_preds = base_preds,
  time = veteran$time,
  status = veteran$status,
  times = times,
  base_models = base_models,
  formula = form,
  data = veteran
)
predict_survmetalearner(meta_model, newdata = veteran[1:3, ], times = times)
Generates survival probabilities at specified times from a fitted survival
SVM (mlsurv_model). Since many SVM variants output a single predicted time
(or rank), this function maps predicted times to survival curves using either
an Exponential or Weibull parametric assumption.
predict_survsvm(object, newdata, times, dist = "exp", shape = 1)
object |
A fitted |
newdata |
A |
times |
Numeric vector of evaluation time points (same scale as the survival time used at training). Must be non-negative and finite. |
dist |
Parametric family used to map predicted times to survival curves:
|
shape |
Weibull shape parameter (used only if |
Parametric mapping. Let $\hat{T}$ be the predicted time from
the SVM model (per row). For a requested evaluation time $t$:
Exponential: $S(t) = \exp(-t / \hat{T})$
Weibull: $S(t) = \exp\{-(t / \hat{T})^{k}\}$, where $k$ is shape
This mapping provides calibrated-looking survival probabilities but is a modeling assumption external to the SVM fit; verify adequacy in practice.
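A small base R sketch of such a mapping follows. The helper `map_surv()` and the predicted times `t_hat` are hypothetical, and the functional form S(t) = exp(-(t / t_hat)^shape), with shape = 1 giving the exponential case, is an assumption of this sketch rather than a statement of the package's exact parameterization.

```r
# Hypothetical predicted times from a survival SVM, one per observation
t_hat <- c(120, 300)
times <- c(100, 300, 500)

# Assumed mapping: S(t) = exp(-(t / t_hat)^shape); shape = 1 is exponential
map_surv <- function(t_hat, times, shape = 1) {
  ratio <- outer(1 / t_hat, times)   # t / t_hat, rows = observations
  out <- exp(-(ratio^shape))
  colnames(out) <- paste0("t=", times)
  out
}

map_surv(t_hat, times)               # exponential
map_surv(t_hat, times, shape = 1.5)  # Weibull with shape 1.5
```

Under this form the predicted time acts as a scale parameter: survival at t = t_hat equals exp(-1) regardless of shape.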
A base data.frame with one row per observation in newdata and
columns named t={time} (character), containing numeric values in [0, 1].
mod_svm <- fit_survsvm(Surv(time, status) ~ age + celltype + karno, data = veteran, type = "regression", gamma.mu = 0.1, kernel = "lin_kernel")
times <- c(100, 300, 500)
predict_survsvm(mod_svm, newdata = veteran[1:5, ], times = times, dist = "exp")
predict_survsvm(mod_svm, newdata = veteran[1:5, ], times = times, dist = "weibull", shape = 1.5)
Generates predictions from an XGBoost survival mlsurv_model.
If times is NULL, returns the raw linear predictor. If times
is provided, returns survival probabilities computed via the AFT mapping
using object$dist and object$scale.
predict_xgboost(object, newdata, times = NULL)
object |
A fitted |
newdata |
A |
times |
Optional numeric vector of evaluation time points (same scale as training time).
If |
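AFT mapping with $z = (\log t - \eta) / \sigma$, where $\eta$ is the linear predictor and $\sigma$ is the stored scale:
Normal: $S(t) = 1 - \Phi(z)$
Extreme (Gumbel): $S(t) = \exp(-e^{z})$
Logistic: $S(t) = 1 / (1 + e^{z})$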
Note: If the model was trained with objective = "survival:cox", survival
probabilities are still computed using the supplied AFT distribution/scale
stored in the object; interpret with caution.
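The AFT mapping can be sketched in base R. The helper `aft_surv()` is hypothetical, and the standardized residual z = (log(t) - lp) / scale together with the three survival formulas are the textbook AFT forms, assumed here rather than taken from the package source.

```r
aft_surv <- function(lp, times, dist = c("normal", "extreme", "logistic"), scale = 1) {
  dist <- match.arg(dist)
  # Standardized residual z = (log(t) - lp) / scale; rows = observations
  log_t <- matrix(log(times), nrow = length(lp), ncol = length(times), byrow = TRUE)
  z <- (log_t - lp) / scale
  out <- switch(dist,
    normal   = 1 - pnorm(z),       # log-normal AFT
    extreme  = exp(-exp(z)),       # Weibull-type AFT
    logistic = 1 / (1 + exp(z))    # log-logistic AFT
  )
  colnames(out) <- paste0("t=", times)
  out
}

# Hypothetical linear predictors on the log-time scale
aft_surv(lp = c(4.5, 5.2), times = c(100, 200, 300), dist = "normal", scale = 0.8)
```

For the symmetric normal and logistic cases, survival equals 0.5 exactly when log(t) equals the linear predictor.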
If times is NULL: a numeric vector of linear predictors (one per row of newdata).
If times is provided: a base data.frame of survival probabilities with
columns named "t=<time>" and row names "ID_<i>".
mod_xgb <- fit_xgboost(Surv(time, status) ~ age + karno + celltype, data = veteran, nrounds = 20)
predict_xgboost(mod_xgb, newdata = veteran[1:5, ], times = c(100, 200, 300))
Computes one or more performance metrics for a fitted mlsurv_model,
predicting on the training data with the model's corresponding predict_*
function.
score_survmodel(
  model,
  times,
  metrics = c("cindex", "ibs", "brier", "iae", "ise")
)
model |
An object of class |
times |
Numeric vector of evaluation times. For |
metrics |
Character vector of metrics to compute. Supported:
|
The function constructs the appropriate predict_* function name from
model$learner, predicts survival probabilities on model$data,
builds a Surv object from model$formula, and computes the metrics.
If "brier" is requested with multiple times, an error is thrown.
A tibble with columns metric and value.
fitted_model <- fit_coxph(Surv(time, status) ~ age + karno + trt, data = veteran)
score_survmodel(
  fitted_model,
  times = c(80, 160),
  metrics = c("cindex", "ibs")
)
Aggregates cross‑validated benchmark results by learner and metric, reporting mean, standard deviation, standard error, and an approximate 95% Wald confidence interval.
summarise_benchmark(benchmark_results)
benchmark_results |
A data frame produced by
|
A tibble with columns learner, metric, mean,
sd, n, se, lower, upper.
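The aggregation can be sketched in base R on a toy results table. The fold-level values are invented, and the interval mean ± 1.96 × se is assumed as the form of the approximate 95% Wald interval.

```r
# Hypothetical fold-level results: three folds per learner, one metric
res <- data.frame(
  learner = rep(c("coxph", "rpart"), each = 3),
  metric  = "cindex",
  value   = c(0.64, 0.66, 0.62, 0.60, 0.58, 0.59)
)

# Summarise each learner/metric group
groups <- split(res, list(res$learner, res$metric), drop = TRUE)
summ <- do.call(rbind, lapply(groups, function(d) {
  n  <- nrow(d)
  m  <- mean(d$value)
  s  <- sd(d$value)
  se <- s / sqrt(n)
  data.frame(learner = d$learner[1], metric = d$metric[1],
             mean = m, sd = s, n = n, se = se,
             lower = m - 1.96 * se, upper = m + 1.96 * se)
}))
summ
```

With a single value per group, as in a one-fold table, sd and the interval bounds are NA because the standard deviation is undefined.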
benchmark_default_survlearners(), plot_benchmark(),
summarize_benchmark_results()
res <- tibble::tibble(
  learner = c("coxph", "coxph", "rpart", "rpart"),
  metric = c("cindex", "ibs", "cindex", "ibs"),
  value = c(0.64, 0.19, 0.60, 0.23)
)
summarise_benchmark(res)
Creates a wide table summarizing each learner's performance as
mean ± sd per metric, suitable for reporting.
summarize_benchmark_results(results, digits = 3)
results |
A data frame with columns |
digits |
Integer number of decimal places in the formatted summary
(default |
A wide tibble with one row per learner and one column per metric,
containing formatted strings "mean ± sd".
summarise_benchmark(), benchmark_default_survlearners()
res <- tibble::tibble(
  learner = c("coxph", "coxph", "rpart", "rpart"),
  metric = c("cindex", "ibs", "cindex", "ibs"),
  value = c(0.64, 0.19, 0.60, 0.23)
)
summarize_benchmark_results(res, digits = 2)
Produces a compact, human‑readable overview of a fitted survival learner that
follows the mlsurv_model contract (e.g., objects returned by fit_*() in
this toolkit). The summary prints the learner id, engine, original formula,
and basic data characteristics (sample size, predictor names, time range,
event rate). A structured list is returned invisibly for programmatic use.
## S3 method for class 'mlsurv_model'
summary(object, ...)
object |
An object of class |
... |
Ignored; included for S3 signature compatibility. |
This method relies on the presence of the fields learner, formula,
and data stored in the fitted object (the standard mlsurv_model
contract). Output printing uses cli if available.
Invisibly returns a list of class "summary.mlsurv_model" containing:
Character id of the learner (e.g., "ranger", "coxph").
Underlying package/engine used to fit the model.
The original survival formula.
List with observations, predictors,
time_range, and event_rate.
fit_coxph(), other fit_*() learners returning mlsurv_model objects.
mod <- fit_coxph(Surv(time, status) ~ age + trt + celltype, data = veteran)
summary(mod)
s <- summary(mod) # capture the structured result invisibly
str(s)
Uses the identity $H(t) = -\log S(t)$ to compute
cumulative hazards for each observation and time point.
survmat_to_chf(S, eps = 1e-12)
S |
A |
eps |
Numeric in (0,1). Stabilizer to avoid |
A numeric matrix with the same dimensions as S, containing
cumulative hazards $H(t) = -\log S(t)$.
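The conversion from survival probabilities to cumulative hazards is a one-liner in base R. In this sketch the stabilizer is applied by clamping probabilities below at `eps` before taking the log, which is an assumption about how such a guard is typically implemented.

```r
# Survival matrix: rows = observations, columns = time points
S <- rbind(c(0.9, 0.7), c(0.8, 0.0))
colnames(S) <- c("t=1", "t=2")
eps <- 1e-12

# Clamp away from zero, then apply H = -log(S)
H <- -log(pmax(S, eps))
H
```

The zero probability in the second row maps to a large finite hazard (about 27.6) instead of Inf.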
S <- data.frame(`t=1` = c(0.9, 0.8), `t=2` = c(0.7, 0.6))
survmat_to_chf(S)
Computes individual-specific (i.e., conditional on covariates) hazards
from predicted survival curves on a discrete time grid.
survmat_to_haz(S, times, eps = 1e-12, t0 = 0)
S |
A |
times |
Numeric vector of time points corresponding to the columns of |
eps |
Numeric in (0,1). Stabilizer to avoid |
t0 |
Numeric scalar; left boundary for the first interval. Default is |
This function returns a piecewise-constant hazard per interval:
$h_j = \{H(t_j) - H(t_{j-1})\} / (t_j - t_{j-1})$,
with $H(t) = -\log S(t)$ and $H(t_0) = 0$.
A numeric matrix of hazards with the same dimensions as S.
Each column corresponds to the interval ending at times[j].
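A piecewise-constant hazard on a discrete grid can be sketched in base R as the increment in cumulative hazard divided by the interval width. This sketch assumes the cumulative hazard is zero at the left boundary `t0` and omits the `eps` guard for brevity.

```r
times <- c(1, 2, 3)
S <- rbind(c(0.9, 0.8, 0.7), c(0.95, 0.9, 0.85))
t0 <- 0

H  <- -log(S)                                     # cumulative hazard per column
dH <- H - cbind(0, H[, -ncol(H), drop = FALSE])   # increments, assuming H(t0) = 0
dt <- diff(c(t0, times))                          # interval widths
haz <- sweep(dH, 2, dt, `/`)                      # hazard on each interval
colnames(haz) <- paste0("t=", times)
haz
```

Column j holds the hazard on the interval ending at times[j], so the first column covers (t0, times[1]].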
times <- c(1, 2, 3)
S <- matrix(c(0.9, 0.8, 0.7, 0.95, 0.9, 0.85), nrow = 2, byrow = TRUE)
colnames(S) <- paste0("t=", times)
H <- survmat_to_haz(S, times)
H
Returns the smallest grid time $t$ such that $S(t) \le 1 - p$.
This is a right-continuous approximation on the provided time grid.
survmat_to_quantile(S, times, p = 0.5)
S |
A |
times |
Numeric vector of time points corresponding to columns of |
p |
Probability in (0,1). Default is |
For p = 0.5, this yields a grid-based median survival time.
Numeric vector of quantile times (length = nrow(S)). Returns
NA if the threshold is not crossed on the grid (e.g., survival stays
above $1 - p$ over all times).
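A grid-based median can be sketched in base R by locating the first column at which each survival curve reaches the threshold. The threshold is written as 1 - p here, which is an assumption of this sketch; for the median (p = 0.5) it coincides with p, so the example does not depend on that choice.

```r
times <- c(1, 2, 3)
S <- rbind(c(0.9, 0.6, 0.4), c(0.95, 0.9, 0.85))
p <- 0.5

# First grid time at which survival drops to the threshold, NA if never
q <- apply(S, 1, function(s) {
  hit <- which(s <= 1 - p)
  if (length(hit) == 0) NA_real_ else times[hit[1]]
})
q
```

The second curve never falls to 0.5 on this grid, so its median is reported as NA rather than extrapolated.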
times <- c(1, 2, 3)
S <- matrix(c(0.9, 0.6, 0.4, 0.95, 0.9, 0.85), nrow = 2, byrow = TRUE)
colnames(S) <- paste0("t=", times)
survmat_to_quantile(S, times, p = 0.5) # median on the grid
Computes $\mathrm{RMST}(\tau) = \int_0^{\tau} S(t)\, dt$ for each observation,
approximated via the trapezoidal rule on the provided time grid.
survmat_to_rmst(S, times, tau = max(times))
S |
A |
times |
Numeric vector of time points corresponding to columns of |
tau |
Positive numeric scalar; upper integration limit. Default is |
If the grid does not include 0, the function prepends $S(0) = 1$
to correctly integrate from time 0.
Numeric vector of RMST values (length = nrow(S)).
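The trapezoidal approximation can be sketched in base R. This sketch covers only the case tau = max(times); truncating at a smaller tau would additionally require cutting or interpolating the grid, which is not shown.

```r
times <- c(1, 2, 3)
S <- rbind(c(0.9, 0.8, 0.7), c(0.95, 0.9, 0.85))

# Prepend t = 0 with S(0) = 1 so integration starts at the origin
tt <- c(0, times)
SS <- cbind(1, S)

# Trapezoidal rule: average height on each interval times its width
avg_height <- (SS[, -1] + SS[, -ncol(SS)]) / 2
rmst <- as.vector(avg_height %*% diff(tt))
rmst
```

For the first row the three interval averages are 0.95, 0.85, and 0.75, each with unit width, giving an RMST of 2.55.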
times <- c(1, 2, 3)
S <- matrix(c(0.9, 0.8, 0.7, 0.95, 0.9, 0.85), nrow = 2, byrow = TRUE)
colnames(S) <- paste0("t=", times)
survmat_to_rmst(S, times, tau = 3)
Cross-validates BART survival models over a user‐supplied grid and selects the best configuration by the primary metric. Optionally refits the best model on the full dataset.
tune_bart(
  formula,
  data,
  times,
  param_grid = expand.grid(K = c(3, 5), ntree = c(50, 100), power = c(2, 2.5), base = c(0.75, 0.95)),
  metrics = c("cindex", "ibs"),
  folds = 5,
  seed = 123,
  ncores = 1,
  refit_best = FALSE
)
formula |
A survival formula of the form |
data |
A |
times |
Numeric vector of evaluation time points. |
param_grid |
A |
metrics |
Character vector of evaluation metrics (default |
folds |
Integer; number of cross‐validation folds (default |
seed |
Integer random seed for reproducibility (default |
ncores |
Integer number of CPU cores passed to |
refit_best |
Logical; if |
Internally calls cv_survlearner() with fit_bart()/predict_bart()
so tuning mirrors the production prediction path.
If refit_best = FALSE, a data.frame (class "tuned_surv")
sorted by the primary metric with one row per grid combination.
If refit_best = TRUE, a fitted mlsurv_model returned by fit_bart.
fit_bart, predict_bart, surv.bart
grid <- expand.grid(K = c(3), ntree = c(50), power = c(2), base = c(0.75, 0.95))
res <- tune_bart(
  formula = Surv(time, status) ~ age + karno + celltype,
  data = veteran,
  param_grid = grid,
  times = c(10, 60),
  refit_best = FALSE
)
print(res)
mod_bart <- tune_bart(
  formula = Surv(time, status) ~ age + karno + celltype,
  data = veteran,
  param_grid = grid,
  times = c(10, 60),
  refit_best = TRUE
)
summary(mod_bart)
Cross-validates mboost Cox boosting models over a hyperparameter grid and selects the best configuration according to the primary metric. Optionally refits the best model on the full dataset.
tune_blackboost(
  formula,
  data,
  times,
  param_grid = expand.grid(mstop = c(50, 100, 200), nu = c(0.05, 0.1), maxdepth = c(2, 3)),
  metrics = c("cindex", "ibs"),
  folds = 5,
  seed = 123,
  ncores = 1,
  refit_best = FALSE,
  ...
)
formula |
A survival formula of the form |
data |
A |
times |
Numeric vector of evaluation time points (same scale as the training time). |
param_grid |
A |
metrics |
Character vector of metrics to compute (e.g., |
folds |
Integer; number of CV folds. Default |
seed |
Integer random seed for reproducibility. Default |
ncores |
Integer number of CPU cores passed to |
refit_best |
Logical; if |
... |
Additional arguments forwarded to |
Internally calls cv_survlearner() with fit_blackboost()/predict_blackboost()
so tuning mirrors the production path. Typical grids vary mstop, nu, and
tree maxdepth.
If refit_best = FALSE, a data.frame (class "tuned_surv") of grid
results with metric columns, sorted by the primary metric. If refit_best = TRUE,
a fitted mlsurv_model from fit_blackboost() using the selected hyperparameters.
res_blackboost <- tune_blackboost(
  formula = Surv(time, status) ~ age + karno + celltype,
  data = veteran,
  times = c(5, 10, 40),
  param_grid = expand.grid(
    mstop = c(100, 200),
    nu = c(0.05, 0.1),
    maxdepth = c(2, 3)
  ),
  metrics = c("cindex", "ibs"),
  folds = 3
)
print(res_blackboost)
mod_blackboost_best <- tune_blackboost(
  formula = Surv(time, status) ~ age + karno + celltype,
  data = veteran,
  times = c(5, 10, 40),
  param_grid = expand.grid(
    mstop = c(100, 200),
    nu = c(0.05, 0.1),
    maxdepth = c(2, 3)
  ),
  metrics = c("cindex", "ibs"),
  folds = 3,
  refit_best = TRUE
)
summary(mod_blackboost_best)
Cross-validates fit_bnnsurv over a user-supplied grid and
aggregates metrics (for example, "cindex", "ibs"). Optionally
refits and returns the best configuration.
tune_bnnsurv(
  formula,
  data,
  times,
  param_grid = expand.grid(k = c(2, 3), num_base_learners = c(30, 50), sample_fraction = c(0.5, 1), stringsAsFactors = FALSE),
  metrics = c("cindex", "ibs"),
  folds = 5,
  seed = 123,
  ncores = 1,
  refit_best = FALSE
)
formula |
A survival formula of the form |
data |
Training data frame. |
times |
Numeric vector of evaluation time points used during CV. |
param_grid |
A data frame (for example from |
metrics |
Character vector of metric names to compute and summarize. The first entry is treated as the primary ranking metric. |
folds |
Integer number of cross-validation folds. Default |
seed |
Integer random seed for reproducibility. |
ncores |
Integer number of CPU cores passed to |
refit_best |
Logical; if |
Evaluation is performed by cv_survlearner() using
fit_bnnsurv() and predict_bnnsurv() to match production code
paths. Results are ordered by the first entry in metrics.
If refit_best = FALSE, a tibble with one row per
configuration containing metrics and the tuning parameters, with a
failed flag for combinations that errored during CV. If
refit_best = TRUE, an mlsurv_model refit at the selected
hyperparameters.
fit_bnnsurv(), predict_bnnsurv(), cv_survlearner()
grid <- expand.grid(
  k = c(2, 3),
  num_base_learners = c(5, 10),
  sample_fraction = c(0.5, 1),
  stringsAsFactors = FALSE
)
res <- tune_bnnsurv(
  formula = Surv(time, status) ~ age + karno + diagtime + prior,
  data = veteran,
  times = c(100, 200),
  param_grid = grid,
  refit_best = FALSE
)
res
best <- tune_bnnsurv(
  formula = Surv(time, status) ~ age + karno + diagtime + prior,
  data = veteran,
  times = c(100, 200),
  param_grid = grid,
  refit_best = TRUE
)
head(predict_bnnsurv(best, newdata = veteran[1:5, ], times = c(50, 100)))
Performs cross-validated hyperparameter tuning for a conditional inference
survival forest using the fit_cforest() and predict_cforest() functions.
tune_cforest(
  formula,
  data,
  times,
  param_grid = expand.grid(ntree = c(100, 300), mtry = c(2, 3), mincriterion = c(0, 0.95), fraction = c(0.5, 0.632)),
  metrics = c("cindex", "ibs"),
  folds = 5,
  seed = 123,
  ncores = 1,
  refit_best = FALSE,
  ...
)
formula |
A |
data |
A data frame containing the variables in the model. |
times |
A numeric vector of time points at which to evaluate performance. |
param_grid |
A data frame or list specifying the grid of hyperparameter
values to evaluate. Columns should include |
metrics |
A character vector of performance metrics to compute
(default = |
folds |
Integer, number of cross-validation folds (default = |
seed |
Integer, random seed for reproducibility. |
ncores |
Integer number of CPU cores passed to |
refit_best |
Logical, if |
... |
Additional arguments passed to |
Cross-validation is performed using cv_survlearner() and results
are sorted in descending order of the first metric specified in metrics.
If refit_best = FALSE, a data frame of mean cross-validation scores
for each hyperparameter combination (class "tuned_surv").
If refit_best = TRUE, an mlsurv_model object fitted with the optimal
hyperparameters.
grid <- expand.grid(
  ntree = c(50, 100),
  mtry = c(2, 4),
  mincriterion = c(0, 0.95),
  fraction = c(0.632)
)
res <- tune_cforest(
  Surv(time, status) ~ age + celltype + karno,
  data = veteran,
  times = c(100, 200),
  param_grid = grid,
  folds = 3
)
print(res)
Performs hyperparameter tuning for fit_flexsurvreg over a grid
of parametric distributions using cross-validation. Returns either a summary
table of performance metrics or a refitted best model.
tune_flexsurvreg(
  formula,
  data,
  times,
  param_grid = c("weibull", "exponential", "lognormal"),
  metrics = c("cindex", "ibs"),
  folds = 5,
  seed = 123,
  ncores = 1,
  refit_best = FALSE,
  ...
)
formula |
A |
data |
A data frame containing the variables in the model. |
times |
Numeric vector of time points at which to evaluate performance. |
param_grid |
Character vector of distributions to test
(default = |
metrics |
Character vector of performance metrics to compute
(default = |
folds |
Integer, number of cross-validation folds (default = 5). |
seed |
Random seed for reproducibility. |
ncores |
Integer number of CPU cores passed to |
refit_best |
Logical; if |
... |
Additional arguments passed to |
If refit_best = FALSE, returns a data frame of tuning results
with one row per distribution and columns for each metric.
If refit_best = TRUE, returns the best mlsurv_model object.
res <- tune_flexsurvreg(
  Surv(time, status) ~ age + karno + celltype,
  data = veteran,
  param_grid = c("weibull", "exponential", "lognormal"),
  times = c(100, 200, 300)
)
print(res)
best_mod <- tune_flexsurvreg(
  Surv(time, status) ~ age + karno + celltype,
  data = veteran,
  param_grid = c("weibull", "exponential", "lognormal"),
  times = c(100, 200, 300),
  refit_best = TRUE
)
summary(best_mod)
Performs hyperparameter tuning for penalized Cox models (glmnet) over
a grid of alpha values using cross-validation on one or more metrics.
Optionally refits the best model on the full dataset.
tune_glmnet(
  formula,
  data,
  times,
  param_grid = c(alpha = seq(0, 1, by = 0.25)),
  metrics = c("cindex", "ibs"),
  folds = 5,
  seed = 123,
  ncores = 1,
  refit_best = FALSE,
  ...
)
formula |
A survival formula of the form |
data |
A |
times |
Numeric vector of time points at which to evaluate performance. |
param_grid |
A |
metrics |
Character vector of performance metrics to compute. The first entry is used as the primary selection metric. |
folds |
Integer; number of cross-validation folds. Default is |
seed |
Integer random seed for reproducibility. Default is |
ncores |
Integer number of CPU cores passed to |
refit_best |
Logical; if |
... |
Additional arguments passed to |
If refit_best = FALSE, returns a data.frame (class "tuned_surv")
with hyperparameters and metric values for each grid combination, sorted by
the primary metric.
If refit_best = TRUE, returns a fitted "mlsurv_model" from
fit_glmnet.
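The default `param_grid` in the usage is written with `c()`, while the example builds the grid with `expand.grid()`. The base R sketch below only shows how the two constructions differ in shape; it makes no claim about which forms `tune_glmnet()` accepts internally. The object names `alpha_vec` and `alpha_grid` are illustrative.

```r
# In glmnet, alpha = 0 is the ridge penalty and alpha = 1 is the lasso;
# values in between give the elastic net.

# Default from the usage: a named numeric vector (names get a numeric suffix)
alpha_vec <- c(alpha = seq(0, 1, by = 0.25))
names(alpha_vec)   # "alpha1" "alpha2" "alpha3" "alpha4" "alpha5"

# Form used in the example: a data frame with one row per candidate value
alpha_grid <- expand.grid(alpha = seq(0, 1, by = 0.25))
nrow(alpha_grid)   # 5
```

Building the grid with `expand.grid()`, as the example does, keeps a single `alpha` column and extends naturally if further hyperparameters are added.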
param_grid <- expand.grid(alpha = seq(0, 1, by = 0.25))

res_glmnet <- tune_glmnet(
  formula = Surv(time, status) ~ age + karno + celltype,
  data = veteran,
  times = c(100, 200, 300),
  param_grid = param_grid,
  metrics = c("cindex", "ibs"),
  folds = 3
)
print(res_glmnet)

mod_glmnet_best <- tune_glmnet(
  formula = Surv(time, status) ~ age + karno + celltype,
  data = veteran,
  times = c(100, 200, 300),
  param_grid = param_grid,
  refit_best = TRUE
)
summary(mod_glmnet_best)
Performs grid-search hyperparameter tuning for aorsf models using cross-validation and selects the best configuration by the primary metric. Optionally refits the best model on the full dataset.
tune_orsf(
  formula,
  data,
  times,
  param_grid = expand.grid(n_tree = c(100, 300), mtry = c(2, 3),
    min_events = c(5, 10)),
  metrics = c("cindex", "ibs"),
  folds = 5,
  seed = 123,
  ncores = 1,
  refit_best = FALSE,
  ...
)
formula |
A survival formula of the form |
data |
A |
times |
Numeric vector of evaluation time points (same scale as the training survival time). Must be positive and finite. |
param_grid |
A
|
metrics |
Character vector of metrics to evaluate/optimize
(e.g., |
folds |
Integer; number of cross-validation folds. Default is |
seed |
Integer random seed for reproducibility. Default is |
ncores |
Integer number of CPU cores passed to |
refit_best |
Logical; if |
... |
Additional arguments forwarded to the underlying engine via
|
Internally calls cv_survlearner() with fit_orsf /
predict_orsf so tuning uses the exact same code paths as production.
The min_events column of param_grid is passed to the engine as
n_split (minimum events for a candidate split).
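As a rough illustration of the grid and the renaming described above, the base R sketch below expands the default grid and converts one row into an argument list. The helper `row_to_engine_args()` is hypothetical and is not part of the package; it only mimics the `min_events` to `n_split` mapping stated in the text.

```r
# Default grid from the usage: 2 x 2 x 2 = 8 configurations
grid <- expand.grid(
  n_tree     = c(100, 300),
  mtry       = c(2, 3),
  min_events = c(5, 10)
)
nrow(grid)  # 8

# Hypothetical helper (not part of the package): turn one grid row into
# engine arguments, renaming min_events to n_split
row_to_engine_args <- function(row) {
  args <- as.list(row)
  names(args)[names(args) == "min_events"] <- "n_split"
  args
}

engine_args <- row_to_engine_args(grid[1, ])
names(engine_args)  # "n_tree" "mtry" "n_split"
```

With the default `folds = 5`, a grid of 8 rows implies 40 model fits before any refit, which is worth keeping in mind when enlarging the grid.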
If refit_best = FALSE, a data.frame (class "tuned_surv")
with one row per grid combination and columns for hyperparameters and metrics,
ordered by the first metric. If refit_best = TRUE, a fitted
mlsurv_model from fit_orsf using the best settings.
fit_orsf, predict_orsf, aorsf
res_orsf <- tune_orsf(
  formula = Surv(time, status) ~ age + karno,
  data = veteran,
  times = c(100, 200, 300),
  param_grid = expand.grid(
    n_tree = c(100, 200),
    mtry = c(1, 2),
    min_events = c(5, 10)
  ),
  metrics = c("cindex", "ibs"),
  folds = 2
)
print(res_orsf)

mod_orsf_best <- tune_orsf(
  formula = Surv(time, status) ~ age + karno,
  data = veteran,
  times = c(100, 200, 300),
  param_grid = expand.grid(
    n_tree = c(100, 200),
    mtry = c(1, 2),
    min_events = c(5, 10)
  ),
  metrics = c("cindex", "ibs"),
  folds = 2,
  refit_best = TRUE
)
summary(mod_orsf_best)
Performs grid search tuning of a survival random forest model using ranger over a set of hyperparameter combinations.
tune_ranger(
  formula,
  data,
  times,
  param_grid = expand.grid(num.trees = c(100, 300), mtry = c(1, 2, 3),
    min.node.size = c(3, 5)),
  metrics = c("cindex", "ibs"),
  folds = 5,
  seed = 123,
  ncores = 1,
  refit_best = FALSE,
  ...
)
formula |
A survival formula of the form |
data |
A |
times |
Numeric vector of time points at which to evaluate performance. |
param_grid |
A |
metrics |
Character vector of evaluation metrics (e.g., |
folds |
Number of cross-validation folds. |
seed |
Random seed for reproducibility. |
ncores |
Integer number of CPU cores passed to |
refit_best |
Logical; if |
... |
Additional arguments passed to |
Uses cv_survlearner to perform cross-validation for each
parameter combination. If refit_best = TRUE, the function returns the
best-fitting model; otherwise, it returns a tuning results table.
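To make the cost of "cross-validation for each parameter combination" concrete, the base R sketch below counts the fits implied by the default grid and shows one common way to assign observations to folds reproducibly. The helper `make_folds()` is hypothetical; the splitting logic inside `cv_survlearner` may differ.

```r
# Default grid from the usage: 2 x 3 x 2 = 12 configurations
grid <- expand.grid(
  num.trees     = c(100, 300),
  mtry          = c(1, 2, 3),
  min.node.size = c(3, 5)
)
n_fits <- nrow(grid) * 5  # 12 configurations x 5 folds = 60 fits

# Hypothetical fold assignment (an assumption, not the package's code):
# shuffle balanced fold labels under a fixed seed
make_folds <- function(n, folds, seed) {
  set.seed(seed)
  sample(rep(seq_len(folds), length.out = n))
}

# 137 is the number of rows in the veteran data used in the examples
fold_id <- make_folds(n = 137, folds = 5, seed = 123)
table(fold_id)
```

Fixing the seed makes the fold labels identical across calls, so every configuration is scored on the same splits.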
If refit_best = FALSE, returns a data.frame of tuning results
sorted by the primary metric. If refit_best = TRUE, returns an
"mlsurv_model" object with the best parameters and an attribute
"tuning_results".
mod_ranger_best <- tune_ranger(
  formula = Surv(time, status) ~ age + karno + celltype,
  data = veteran,
  times = c(80, 160),
  param_grid = expand.grid(
    num.trees = c(25),
    mtry = c(2),
    min.node.size = c(5)
  ),
  metrics = c("cindex", "ibs"),
  folds = 2,
  refit_best = TRUE
)
summary(mod_ranger_best)
Performs hyperparameter tuning for the rpart survival tree model using
cross-validation, returning either a table of results or the best fitted model.
tune_rpart(
  formula,
  data,
  times,
  param_grid = expand.grid(minsplit = c(10, 20), cp = c(0.001, 0.01),
    maxdepth = c(10, 30)),
  metrics = c("cindex", "ibs"),
  folds = 5,
  seed = 123,
  ncores = 1,
  refit_best = TRUE,
  ...
)
formula |
A survival formula of the form |
data |
A |
times |
Numeric vector of time points for evaluation. |
param_grid |
A |
metrics |
Character vector of evaluation metrics to compute (e.g., |
folds |
Number of cross-validation folds. Default is |
seed |
Random seed for reproducibility. |
ncores |
Integer number of CPU cores passed to |
refit_best |
Logical; if |
... |
Additional arguments passed to |
If refit_best = FALSE, returns a tibble summarizing mean CV performance for each parameter combination.
If refit_best = TRUE, returns an "mlsurv_model" object for the best parameters, with a "tuning_results" attribute.
res_rpart <- tune_rpart(
  formula = Surv(time, status) ~ age + karno + celltype,
  data = veteran,
  times = c(100, 200, 300),
  param_grid = expand.grid(
    minsplit = c(10, 20),
    cp = c(0.001, 0.01),
    maxdepth = c(10, 30)
  ),
  metrics = c("cindex", "ibs"),
  folds = 3,
  seed = 42,
  refit_best = FALSE
)
print(res_rpart)

mod_rpart_best <- tune_rpart(
  formula = Surv(time, status) ~ age + karno + celltype,
  data = veteran,
  times = c(100, 200, 300),
  param_grid = expand.grid(
    minsplit = c(10, 20),
    cp = c(0.001, 0.01),
    maxdepth = c(10, 30)
  ),
  metrics = c("cindex", "ibs"),
  folds = 3,
  seed = 42,
  refit_best = TRUE
)
Cross-validates RSF models over a specified parameter grid and selects the best configuration according to the primary metric. Optionally refits the best model on the full dataset.
tune_rsf(
  formula,
  data,
  times,
  param_grid = expand.grid(ntree = c(200, 500), mtry = c(1, 2, 3),
    nodesize = c(5, 15)),
  metrics = c("cindex", "ibs"),
  folds = 5,
  seed = 123,
  ncores = 1,
  refit_best = FALSE,
  ...
)
formula |
A survival formula of the form |
data |
A |
times |
Numeric vector of evaluation time points (same scale as the survival time used at training). Must be non-negative and finite. |
param_grid |
A data frame (e.g., from |
metrics |
Character vector of metrics to evaluate/optimize
(e.g., |
folds |
Integer; number of cross-validation folds. |
seed |
Integer random seed for reproducibility. |
ncores |
Integer number of CPU cores passed to |
refit_best |
Logical; if |
... |
Additional arguments forwarded to the underlying engine where applicable. |
Internally calls cv_survlearner() with fit_rsf()/predict_rsf() so tuning
uses the same code paths as production. Typical grids vary ntree, mtry,
and nodesize.
If refit_best = FALSE, a data.frame (class "tuned_surv") of grid
results with metric columns and hyperparameters, ordered by the first metric.
If refit_best = TRUE, a fitted mlsurv_model from fit_rsf() with
attribute "tuning_results" containing the full grid results.
mod_rsf_best <- tune_rsf(
  formula = Surv(time, status) ~ age + celltype + karno,
  data = veteran,
  times = c(100, 200, 300),
  param_grid = expand.grid(
    ntree = c(200, 500),
    mtry = c(1, 2, 3),
    nodesize = c(5, 15)
  ),
  metrics = c("cindex", "ibs"),
  folds = 3,
  refit_best = TRUE
)
summary(mod_rsf_best)
Cross‑validates pec's selectCox() across one or more selection
rules and selects the best configuration by the primary metric. Optionally
refits the best rule on the full dataset.
tune_selectcox(
  formula,
  data,
  times,
  rules = c("aic", "p"),
  metrics = c("cindex", "ibs"),
  folds = 5,
  seed = 123,
  ncores = 1,
  refit_best = TRUE,
  ...
)
formula |
A survival formula of the form |
data |
A |
times |
Numeric vector of evaluation time points (same scale as the training survival time). Must be non‑negative and finite. |
rules |
Character vector of selection rules to compare (e.g.,
|
metrics |
Character vector of metrics to evaluate/optimize
(e.g., |
folds |
Number of cross‑validation folds. |
seed |
Integer random seed for reproducibility. |
ncores |
Integer number of CPU cores passed to |
refit_best |
Logical; if |
... |
Additional arguments forwarded to the underlying engine where applicable. |
Evaluation. Internally calls cv_survlearner() with
fit_selectcox()/predict_selectcox() to ensure tuning uses the same code
paths as production. Rules typically include "aic" and/or "p".
If refit_best = FALSE, a data.frame (class "tuned_surv") with a
row per rule and metric columns, ordered by the first metric. If
refit_best = TRUE, a fitted mlsurv_model from fit_selectcox() with
attribute "tuning_results" containing the full results.
fit_selectcox(), predict_selectcox(), pec::selectCox()
res_selectcox <- tune_selectcox(
  formula = Surv(time, status) ~ age + karno + celltype,
  data = veteran,
  times = c(100, 200, 300),
  rules = c("aic", "p"),
  metrics = c("cindex", "ibs", "ise"),
  folds = 3,
  refit_best = FALSE
)
print(res_selectcox)
class(res_selectcox)

mod_selectcox <- tune_selectcox(
  formula = Surv(time, status) ~ age + karno + celltype,
  data = veteran,
  times = c(100, 200, 300),
  rules = c("aic", "p"),
  metrics = c("cindex", "ibs", "ise"),
  folds = 3,
  refit_best = TRUE
)
summary(mod_selectcox)
Cross-validates survdnn-based models over a user-specified grid and selects the best configuration by the primary metric. Optionally refits the best model on the full dataset.
tune_survdnn(
  formula,
  data,
  times,
  param_grid = list(hidden = list(c(32, 16), c(64, 32, 16)),
    lr = c(0.001, 5e-04), activation = c("relu", "gelu"),
    epochs = c(100, 200), loss = c("cox", "aft"), optimizer = "adam",
    dropout = c(0.1, 0.3), batch_norm = c(TRUE)),
  metrics = c("cindex", "ibs"),
  folds = 3,
  seed = 42,
  ncores = 1,
  refit_best = FALSE,
  ...
)
formula |
A survival formula of the form |
data |
A |
times |
Numeric vector of evaluation time points (same scale as the survival time used at training). Must be non-negative and finite. |
param_grid |
A named list of candidate hyperparameters. Typical entries:
|
metrics |
Character vector of metrics to evaluate/optimize
(e.g., |
folds |
Number of cross-validation folds. |
seed |
Integer random seed for reproducibility. |
ncores |
Integer number of CPU cores passed to |
refit_best |
Logical; if |
... |
Additional arguments forwarded to the underlying engine where applicable. |
Evaluation. Internally calls cv_survlearner() with
fit_survdnn()/predict_survdnn() so tuning uses the same code paths as
production. Hyperparameters are combined via tidyr::crossing() and each
row is passed through to fit_survdnn(), so the grid can include any
supported engine argument exposed by this wrapper.
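The text says the grid is combined with `tidyr::crossing()`. As a dependency-free approximation, the sketch below crosses the default `param_grid` in base R by expanding the positions of each entry, which keeps list-valued entries such as `hidden` intact. The names `idx` and `config` are illustrative, and the row order need not match what `tidyr::crossing()` produces.

```r
# Default grid from the usage
param_grid <- list(
  hidden     = list(c(32, 16), c(64, 32, 16)),
  lr         = c(0.001, 5e-04),
  activation = c("relu", "gelu"),
  epochs     = c(100, 200),
  loss       = c("cox", "aft"),
  optimizer  = "adam",
  dropout    = c(0.1, 0.3),
  batch_norm = c(TRUE)
)

# Cross the positions of each entry rather than the values themselves
idx <- expand.grid(lapply(param_grid, seq_along))
nrow(idx)  # 2 * 2 * 2 * 2 * 2 * 1 * 2 * 1 = 64 configurations

# Recover the settings for one configuration (here the last row)
config <- Map(function(values, i) values[[i]], param_grid, idx[nrow(idx), ])
config$hidden  # 64 32 16
```

With the default `folds = 3`, 64 configurations mean 192 network fits, so trimming the grid is often the first lever for reducing run time.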
If refit_best = FALSE, a data.frame (class "tuned_surv") with
hyperparameter settings and metric columns, ordered by the first metric.
If refit_best = TRUE, a fitted mlsurv_model from fit_survdnn() with
attribute "tuning_results" containing the full grid results.
fit_survdnn(), predict_survdnn()
if (requireNamespace("survdnn", quietly = TRUE) &&
    requireNamespace("torch", quietly = TRUE) &&
    torch::torch_is_installed()) {
  grid <- list(
    hidden = list(c(16), c(32, 16)),
    lr = c(1e-4, 5e-4),
    activation = c("relu", "tanh"),
    epochs = c(300),
    loss = c("cox", "coxtime"),
    optimizer = "adam",
    dropout = c(0.1, 0.3)
  )
  mod <- tune_survdnn(
    formula = Surv(time, status) ~ age + karno + celltype,
    data = veteran,
    times = c(90),
    metrics = c("cindex", "ibs"),
    param_grid = grid,
    refit_best = TRUE
  )
  summary(mod)
}
Cross-validates Survival SVM models over a user-specified grid and selects the best configuration based on the chosen metric. Optionally refits the best model on the full dataset.
tune_survsvm(
  formula,
  data,
  times,
  metrics = "cindex",
  param_grid,
  folds = 5,
  seed = 42,
  ncores = 1,
  refit_best = FALSE,
  dist = "exp",
  shape = 1
)
formula |
A survival formula of the form |
data |
A |
times |
Numeric vector of evaluation time points (same scale as the survival time used at training). Must be non-negative and finite. |
metrics |
Character vector of metrics to evaluate/optimize
(e.g., |
param_grid |
A named list of candidate hyperparameters; typical entries
include |
folds |
Number of cross-validation folds. |
seed |
Integer random seed for reproducibility. |
ncores |
Integer number of CPU cores passed to |
refit_best |
Logical; if |
dist |
Parametric mapping for predictions during CV: |
shape |
Weibull shape parameter if |
Evaluation. Internally calls cv_survlearner() with
fit_survsvm()/predict_survsvm() to keep code paths consistent with
production usage. The prediction step applies the specified dist/shape
mapping to convert predicted times into survival probabilities at times.
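The exact mapping is not spelled out here, so the following is only one plausible reading and not the implemented formula: treat each predicted time as the scale of an exponential or Weibull distribution and evaluate its survival function at `times`. The function `time_to_surv()` and the input values are hypothetical.

```r
# Hypothetical mapping (an assumption, not the package's code):
# S(t) = exp(-(t / scale)^k), with k = 1 for the exponential case
time_to_surv <- function(pred_time, times, dist = "exp", shape = 1) {
  k <- if (dist == "weibull") shape else 1
  # rows = subjects, columns = evaluation times
  S <- outer(pred_time, times, function(scale, t) exp(-(t / scale)^k))
  colnames(S) <- paste0("t=", times)
  S
}

# Illustrative predicted times for two subjects
S_exp <- time_to_surv(c(150, 400), times = c(100, 300, 500))
S_wb  <- time_to_surv(c(150, 400), times = c(100, 300, 500),
                      dist = "weibull", shape = 1)

all.equal(S_exp, S_wb)  # TRUE: a Weibull with shape 1 is exponential
```

Under this reading, a longer predicted time yields a higher survival probability at every evaluation time, and `shape` only matters when the Weibull mapping is selected.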
If refit_best = FALSE, a data.frame (class "tune_surv") of grid
results with metric columns and tuning parameters. If refit_best = TRUE, a
fitted mlsurv_model (class augmented with "tune_surv") with the full
results attached in attr(, "tuning_results").
fit_survsvm(), predict_survsvm()
grid <- list(
  gamma.mu = c(0.01, 0.1),
  kernel = c("lin_kernel", "add_kernel")
)

res_svm <- tune_survsvm(
  formula = Surv(time, status) ~ age + celltype + karno,
  data = veteran,
  times = c(100, 300, 500),
  metrics = c("cindex", "ibs"),
  param_grid = grid,
  folds = 3,
  refit_best = TRUE
)
summary(res_svm)

res_svm <- tune_survsvm(
  formula = Surv(time, status) ~ age + celltype + karno,
  data = veteran,
  times = c(100, 300, 500),
  metrics = c("cindex", "ibs"),
  param_grid = grid,
  folds = 3,
  refit_best = FALSE
)
res_svm
Cross-validates XGBoost survival models over a user-specified grid and
returns a results table with metric summaries per configuration. Any row
that errors during CV is marked failed = TRUE.
tune_xgboost(
  formula,
  data,
  times,
  param_grid = expand.grid(nrounds = c(50, 100), max_depth = c(3, 6),
    eta = c(0.01, 0.1), aft_loss_distribution = c("extreme", "logistic"),
    aft_loss_distribution_scale = c(0.5, 1), objective = "survival:aft",
    stringsAsFactors = FALSE),
  metrics = c("cindex", "ibs"),
  folds = 5,
  seed = 123,
  ncores = 1
)
formula |
A survival formula of the form |
data |
A |
times |
Numeric vector of evaluation time points. |
param_grid |
A |
metrics |
Character vector of metrics to evaluate (e.g., |
folds |
Integer number of CV folds. |
seed |
Integer random seed for reproducibility. |
ncores |
Integer number of CPU cores passed to |
Internally calls cv_survlearner with fit_xgboost /
predict_xgboost. Any configuration that errors (e.g., due to
invalid parameters or data issues) is recorded with failed = TRUE and
omitted from metric summarization.
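The failure handling described above can be pictured with a small base R sketch. It assumes a hypothetical `evaluate_config()` standing in for one cross-validated fit; the real function wraps `cv_survlearner`, and its internals may differ.

```r
# Hypothetical evaluator standing in for one cross-validated fit;
# it fails on purpose for a non-positive learning rate
evaluate_config <- function(eta) {
  if (eta <= 0) stop("eta must be positive")
  TRUE
}

# Illustrative grid with one deliberately invalid row
grid <- data.frame(eta = c(0.1, -1, 0.2))

# Mark a configuration as failed instead of aborting the whole search
grid$failed <- vapply(grid$eta, function(eta) {
  tryCatch({
    evaluate_config(eta)
    FALSE
  }, error = function(e) TRUE)
}, logical(1))

grid$failed  # FALSE TRUE FALSE
```

Catching errors per row means one bad configuration does not discard the results already computed for the rest of the grid.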
A tibble with one row per grid configuration, containing:
The grid values.
Logical; TRUE if the configuration errored.
One column per entry in metrics (when available).
The table is arranged by the first metric in metrics (ascending, as implemented).
grid <- expand.grid(
  nrounds = c(20, 40),
  max_depth = c(2, 3),
  eta = c(0.1, 0.2),
  aft_loss_distribution = c("extreme", "logistic"),
  aft_loss_distribution_scale = c(0.5, 1),
  objective = "survival:aft",
  stringsAsFactors = FALSE
)

res_xgb <- tune_xgboost(
  formula = survival::Surv(time, status) ~ age + karno + celltype,
  data = survival::veteran,
  times = c(100, 200),
  metrics = c("cindex", "ibs"),
  param_grid = grid,
  folds = 2,
  seed = 123
)
head(res_xgb)
This is the veteran dataset originally from the survival package,
containing data from a randomized trial of lung cancer treatments.
veteran
A data frame with 137 observations and 8 variables:
trt: Treatment (1 = standard, 2 = test)
celltype: Cell type (squamous, smallcell, adeno, large)
time: Survival time in days
status: Censoring status (1 = dead, 0 = alive)
karno: Karnofsky performance score (higher = better)
diagtime: Months from diagnosis to randomization
age: Age in years
prior: Prior therapy (0 = no, 10 = yes)
survival package, originally from Kalbfleisch and Prentice (1980) The Statistical Analysis of Failure Time Data.
head(veteran)
summary(veteran$time)