| Title: | Sample Size and Simulation for Negative Binomial Outcomes |
|---|---|
| Description: | Provides tools for planning and simulating recurrent event trials with overdispersed count endpoints analyzed using negative binomial (or Poisson) rate models. Implements sample size and power calculations for fixed designs with variable accrual, dropout, maximum follow-up, and event gaps, including methods of Zhu and Lakkis (2014) <doi:10.1002/sim.5947> and Friede and Schmidli (2010) <doi:10.3414/ME09-02-0060> as well as extensions for score-test sizing and gaps between events. Supports group sequential monitoring by building on the 'gsDesign' package. Includes recurrent-event simulation utilities (including seasonal rates), interim data truncation, Wald and score-test inference for rate ratios, and blinded or unblinded information estimation and sample size re-estimation. |
| Authors: | Keaven Anderson [aut, cre], Hongtao Zhang [aut], Andrea Maes [aut], Nan Xiao [ctb], Merck & Co., Inc., Rahway, NJ, USA and its affiliates [cph] (ROR: <https://ror.org/02891sr49>) |
| Maintainer: | Keaven Anderson <[email protected]> |
| License: | GPL (>= 3) |
| Version: | 0.3.2 |
| Built: | 2026-06-03 13:56:20 UTC |
| Source: | https://github.com/keaven/gsdesignnb |
Estimates the blinded dispersion and event rate from aggregated interim data and calculates the required sample size to maintain power, assuming the planned treatment effect holds. This function supports constant rates (Friede & Schmidli 2010) and accommodates future extensions for time-varying rates (Schneider et al. 2013) by using the exposure-adjusted rate.
blinded_ssr( data, ratio = 1, lambda1_planning, lambda2_planning, rr0 = 1, power = 0.8, alpha = 0.025, method = "friede", accrual_rate, accrual_duration, trial_duration, dropout_rate = 0, max_followup = NULL, event_gap = NULL )blinded_ssr( data, ratio = 1, lambda1_planning, lambda2_planning, rr0 = 1, power = 0.8, alpha = 0.025, method = "friede", accrual_rate, accrual_duration, trial_duration, dropout_rate = 0, max_followup = NULL, event_gap = NULL )
data |
A data frame containing the blinded interim data. Must include
columns |
ratio |
Planned allocation ratio (experimental / control). Default is 1. |
lambda1_planning |
Planned event rate for the control group used in original calculation. |
lambda2_planning |
Planned event rate for the experimental group used in original calculation. |
rr0 |
Rate ratio under the null hypothesis (lambda2/lambda1). Default is 1. |
power |
Target power (1 - beta). Default is 0.8. |
alpha |
One-sided significance level. Default is 0.025. |
method |
Method for sample size recalculation. Currently "friede" (Friede & Schmidli 2010) is implemented, which uses the blinded nuisance parameter estimates. |
accrual_rate |
Vector of accrual rates (patients per unit time). |
accrual_duration |
Vector of durations for each accrual rate. Must be same length
as |
trial_duration |
Total planned duration of the trial. |
dropout_rate |
Dropout rate (hazard rate). Default is 0. |
max_followup |
Maximum follow-up time for any patient. Default is NULL (infinite). |
event_gap |
Gap duration after each event during which no new events are counted. Default is NULL (no gap). |
A list containing:
Re-estimated total sample size using blinded estimates.
Estimated dispersion parameter (k) from blinded data.
Estimated overall event rate from blinded data.
Estimated information fraction at interim (blinded information / target information).
Estimated statistical information from the blinded interim data.
Target statistical information required for the planned power.
Friede, T., & Schmidli, H. (2010). Blinded sample size reestimation with count data: methods and applications in multiple sclerosis. Statistics in Medicine, 29(10), 1145–1156. doi:10.1002/sim.3861
Schneider, S., Schmidli, H., & Friede, T. (2013). Blinded sample size re-estimation for recurrent event data with time trends. Statistics in Medicine, 32(30), 5448–5457. doi:10.1002/sim.5977
interim <- data.frame(events = c(1, 2, 1, 3), tte = c(0.8, 1.0, 1.2, 0.9)) blinded_ssr( interim, ratio = 1, lambda1_planning = 0.5, lambda2_planning = 0.3, power = 0.8, alpha = 0.025, accrual_rate = 10, accrual_duration = 12, trial_duration = 18 )interim <- data.frame(events = c(1, 2, 1, 3), tte = c(0.8, 1.0, 1.2, 0.9)) blinded_ssr( interim, ratio = 1, lambda1_planning = 0.5, lambda2_planning = 0.3, power = 0.8, alpha = 0.025, accrual_rate = 10, accrual_duration = 12, trial_duration = 18 )
Estimates the blinded dispersion and event rate from
pooled (blinded) interim data and calculates the observed statistical
information for the log rate ratio. The estimation follows
the approach of Friede & Schmidli (2010): a single negative binomial model
is fit to the pooled data, then the estimated overall rate is split into
group-specific rates using the planned rate ratio.
calculate_blinded_info( data, ratio = 1, lambda1_planning, lambda2_planning, event_gap = NULL )calculate_blinded_info( data, ratio = 1, lambda1_planning, lambda2_planning, event_gap = NULL )
data |
A data frame containing the blinded interim data. Must include
columns |
ratio |
Planned allocation ratio |
lambda1_planning |
Planned event rate |
lambda2_planning |
Planned event rate |
event_gap |
Optional gap duration (numeric). If provided, planning
rates are adjusted to effective rates
|
If the ML negative binomial fit fails to converge or produces an unreliable
shape estimate, the function falls back to method-of-moments (MoM)
estimation via estimate_nb_mom() rather than silently assuming
. This avoids the anti-conservative behaviour that would result
from treating overdispersed data as Poisson.
The statistical information is computed as:
where and
is the expected count for subject
if they belonged to group .
A list containing:
Estimated statistical information
.
Estimated dispersion parameter .
Estimated overall (pooled) event rate.
Re-estimated control rate
.
Re-estimated experimental rate
.
Character label describing which estimator was used
("ml" or "mom").
Friede, T., & Schmidli, H. (2010). Blinded sample size reestimation with negative binomial counts in superiority and non-inferiority trials. Methods of Information in Medicine, 49(06), 618–624. doi:10.3414/ME09-02-0060
blinded_ssr() for blinded sample size reestimation;
sample_size_nbinom() for the underlying sample size formula.
interim <- data.frame(events = c(1, 2, 1, 3), tte = c(0.8, 1.0, 1.2, 0.9)) calculate_blinded_info( interim, ratio = 1, lambda1_planning = 0.5, lambda2_planning = 0.3 )interim <- data.frame(events = c(1, 2, 1, 3), tte = c(0.8, 1.0, 1.2, 0.9)) calculate_blinded_info( interim, ratio = 1, lambda1_planning = 0.5, lambda2_planning = 0.3 )
Updates the group sequential design boundaries based on observed information and checks if boundaries have been crossed.
check_gs_bound( sim_results, design, info_scale = c("blinded", "unblinded"), info_col = NULL )check_gs_bound( sim_results, design, info_scale = c("blinded", "unblinded"), info_col = NULL )
sim_results |
Data frame of simulation results (from |
design |
The planning |
info_scale |
Character. Legacy selector for |
info_col |
Optional explicit column name containing the information
metric to use for bounds, e.g. |
A data frame with added columns:
Logical, true if upper bound crossed (efficacy)
Logical, true if lower bound crossed (futility)
Logical, true if harm bound crossed (test.type 7 or 8)
design <- gsDesign::gsDesign(k = 2, n.fix = 100, test.type = 2, timing = c(0.5, 1)) sim_df <- data.frame( sim = c(1, 1, 2, 2), analysis = c(1, 2, 1, 2), z_stat = c(2.5, NA, -0.2, 2.2), blinded_info = c(50, 100, 50, 100), unblinded_info = c(50, 100, 50, 100) ) check_gs_bound(sim_df, design) check_gs_bound(sim_df, design, info_col = "unblinded_info")design <- gsDesign::gsDesign(k = 2, n.fix = 100, test.type = 2, timing = c(0.5, 1)) sim_df <- data.frame( sim = c(1, 1, 2, 2), analysis = c(1, 2, 1, 2), z_stat = c(2.5, NA, -0.2, 2.2), blinded_info = c(50, 100, 50, 100), unblinded_info = c(50, 100, 50, 100) ) check_gs_bound(sim_df, design) check_gs_bound(sim_df, design, info_col = "unblinded_info")
Computes the statistical information for the log rate
ratio at a given calendar analysis
time, accounting for staggered enrollment, dropout, maximum follow-up, and
event gaps.
compute_info_at_time( analysis_time, accrual_rate, accrual_duration, lambda1, lambda2, dispersion, ratio = 1, dropout_rate = 0, event_gap = 0, max_followup = Inf )compute_info_at_time( analysis_time, accrual_rate, accrual_duration, lambda1, lambda2, dispersion, ratio = 1, dropout_rate = 0, event_gap = 0, max_followup = Inf )
analysis_time |
Calendar time of the analysis. |
accrual_rate |
Enrollment rate (subjects per time unit). |
accrual_duration |
Duration of the enrollment period. |
lambda1 |
Event rate |
lambda2 |
Event rate |
dispersion |
Dispersion parameter |
ratio |
Allocation ratio |
dropout_rate |
Dropout hazard rate. Default is 0. Can be a vector of length 2 for group-specific rates (control, treatment). |
event_gap |
Gap duration after each event. Default is 0. |
max_followup |
Maximum follow-up time per subject. Default is |
This function delegates to sample_size_nbinom() with power = NULL and
returns from the resulting
variance. This ensures full consistency with package design calculations,
including piecewise accrual, dropout, max follow-up truncation, event-gap
correction, and follow-up variability inflation ().
The statistical information (inverse of variance)
at the analysis time.
compute_info_at_time( analysis_time = 12, accrual_rate = 10, accrual_duration = 10, lambda1 = 0.5, lambda2 = 0.3, dispersion = 0.1 )compute_info_at_time( analysis_time = 12, accrual_rate = 10, accrual_duration = 10, lambda1 = 0.5, lambda2 = 0.3, dispersion = 0.1 )
Subsets the data to all subjects randomized by the specified date,
and prepares the data for analysis. This is a wrapper for cut_data_by_date()
typically used with a date determined by cut_date_for_completers().
cut_completers(data, cut_date, event_gap = 0)cut_completers(data, cut_date, event_gap = 0)
data |
Data generated by |
cut_date |
Calendar time (relative to trial start) at which to cut the data. |
event_gap |
Gap duration after each event during which no new events are counted.
Can be a numeric value (default |
A data frame with one row per subject randomized prior to cut_date.
Contains the truncated follow-up time (tte) and total number of observed events (events).
enroll_rate <- data.frame(rate = 20 / (5 / 12), duration = 5 / 12) fail_rate <- data.frame(treatment = c("Control", "Experimental"), rate = c(0.5, 0.3)) dropout_rate <- data.frame( treatment = c("Control", "Experimental"), rate = c(0.1, 0.05), duration = c(100, 100) ) sim <- nb_sim(enroll_rate, fail_rate, dropout_rate, max_followup = 2, n = 20) # Find date when 5 subjects have completed date_5 <- cut_date_for_completers(sim, 5) # Get analysis dataset for this cut date (includes partial follow-up) cut_completers(sim, date_5)enroll_rate <- data.frame(rate = 20 / (5 / 12), duration = 5 / 12) fail_rate <- data.frame(treatment = c("Control", "Experimental"), rate = c(0.5, 0.3)) dropout_rate <- data.frame( treatment = c("Control", "Experimental"), rate = c(0.1, 0.05), duration = c(100, 100) ) sim <- nb_sim(enroll_rate, fail_rate, dropout_rate, max_followup = 2, n = 20) # Find date when 5 subjects have completed date_5 <- cut_date_for_completers(sim, 5) # Get analysis dataset for this cut date (includes partial follow-up) cut_completers(sim, date_5)
Censors follow-up at a specified calendar time and aggregates events per subject. Returns one row per subject randomized before the cut date, with the total number of observed events and follow-up times.
cut_data_by_date(data, cut_date, event_gap = 0, ...) ## Default S3 method: cut_data_by_date(data, cut_date, event_gap = 0, ...) ## S3 method for class 'nb_sim_data' cut_data_by_date(data, cut_date, event_gap = 0, ...) ## S3 method for class 'nb_sim_seasonal' cut_data_by_date(data, cut_date, event_gap = 0, ...)cut_data_by_date(data, cut_date, event_gap = 0, ...) ## Default S3 method: cut_data_by_date(data, cut_date, event_gap = 0, ...) ## S3 method for class 'nb_sim_data' cut_data_by_date(data, cut_date, event_gap = 0, ...) ## S3 method for class 'nb_sim_seasonal' cut_data_by_date(data, cut_date, event_gap = 0, ...)
data |
Data generated by |
cut_date |
Calendar time (relative to trial start) at which to censor follow-up. |
event_gap |
Gap duration after each event during which no new events are counted.
Can be a numeric value (default |
... |
Additional arguments passed to methods. |
A data frame with one row per subject randomized prior to cut_date containing:
Subject identifier
Treatment group
Time of enrollment relative to trial start
Time at risk (total follow-up minus event gap periods)
Total follow-up time (calendar time, not adjusted for gaps)
Number of observed events
A data frame with one row per subject randomized prior to cut_date.
This method stops with an error for unsupported classes.
A data frame with one row per subject randomized prior to cut_date.
Includes total events and follow-up time within the cut window.
A data frame with one row per subject randomized prior to cut_date.
Includes season and follow-up time within the cut window.
cut_data_by_date(default): Default method.
cut_data_by_date(nb_sim_data): Method for nb_sim data.
cut_data_by_date(nb_sim_seasonal): Method for nb_sim_seasonal data.
enroll_rate <- data.frame(rate = 20 / (5 / 12), duration = 5 / 12) fail_rate <- data.frame(treatment = c("Control", "Experimental"), rate = c(0.5, 0.3)) dropout_rate <- data.frame( treatment = c("Control", "Experimental"), rate = c(0.1, 0.05), duration = c(100, 100) ) sim <- nb_sim(enroll_rate, fail_rate, dropout_rate, max_followup = 2, n = 20) cut_data_by_date(sim, cut_date = 1)enroll_rate <- data.frame(rate = 20 / (5 / 12), duration = 5 / 12) fail_rate <- data.frame(treatment = c("Control", "Experimental"), rate = c(0.5, 0.3)) dropout_rate <- data.frame( treatment = c("Control", "Experimental"), rate = c(0.1, 0.05), duration = c(100, 100) ) sim <- nb_sim(enroll_rate, fail_rate, dropout_rate, max_followup = 2, n = 20) cut_data_by_date(sim, cut_date = 1)
Finds the calendar time (since start of randomization) at which a specified number of subjects have completed their follow-up.
cut_date_for_completers(data, target_completers)cut_date_for_completers(data, target_completers)
data |
A data frame of simulated data, typically from |
target_completers |
Integer. The target number of completers. |
Numeric. The calendar date when target_completers is achieved.
If the dataset contains fewer than target_completers completers, returns the maximum
calendar time in the dataset and prints a message.
enroll_rate <- data.frame(rate = 20 / (5 / 12), duration = 5 / 12) fail_rate <- data.frame(treatment = c("Control", "Experimental"), rate = c(0.5, 0.3)) dropout_rate <- data.frame( treatment = c("Control", "Experimental"), rate = c(0.1, 0.05), duration = c(100, 100) ) sim <- nb_sim(enroll_rate, fail_rate, dropout_rate, max_followup = 2, n = 20) cut_date_for_completers(sim, target_completers = 5)enroll_rate <- data.frame(rate = 20 / (5 / 12), duration = 5 / 12) fail_rate <- data.frame(treatment = c("Control", "Experimental"), rate = c(0.5, 0.3)) dropout_rate <- data.frame( treatment = c("Control", "Experimental"), rate = c(0.1, 0.05), duration = c(100, 100) ) sim <- nb_sim(enroll_rate, fail_rate, dropout_rate, max_followup = 2, n = 20) cut_date_for_completers(sim, target_completers = 5)
Estimates the event rate(s) and common dispersion parameter (k) for negative binomial count data using the method of moments. This is a robust alternative to Maximum Likelihood Estimation (MLE), especially when MLE fails to converge or produces boundary estimates.
estimate_nb_mom(data, group = NULL)estimate_nb_mom(data, group = NULL)
data |
A data frame containing the data. Must include columns
|
group |
Optional character string specifying the grouping column name (e.g., "treatment"). If provided, rates are estimated separately for each group, while a common dispersion parameter is estimated across groups. If NULL (default), a single rate and dispersion are estimated (blinded case). |
The method of moments estimator for the dispersion parameter is derived
by equating the theoretical variance to the observed second central moment,
accounting for varying exposure times.
For a given group with rate , the expected count for subject
is . The variance is .
The estimator is calculated as:
where is the number of events, is the exposure time,
and is the estimated expected count.
When multiple groups are present, the numerator and denominator are summed
across all groups to estimate a common .
A list containing:
lambda |
Estimated event rate(s). A single numeric value if |
dispersion |
Estimated common dispersion parameter (k). |
# Blinded estimation (single group) df <- data.frame(events = c(1, 2, 0, 3), tte = c(1, 1.2, 0.5, 1.5)) estimate_nb_mom(df) # Unblinded estimation (two groups) df_group <- df df_group$group <- c("A", "A", "B", "B") estimate_nb_mom(df_group, group = "group")# Blinded estimation (single group) df <- data.frame(events = c(1, 2, 0, 3), tte = c(1, 1.2, 0.5, 1.5)) estimate_nb_mom(df) # Unblinded estimation (two groups) df_group <- df df_group$group <- c("A", "A", "B", "B") estimate_nb_mom(df_group, group = "group")
Fits a negative binomial generalized linear mixed model (GLMM) to
longitudinal count data using glmmTMB::glmmTMB() with
family = nbinom2(link = "log"). The fitted model and the estimated
dispersion parameter (where Var(Y) = ) are
returned for use in subsequent imputation steps. This mirrors PROC GLIMMIX
with dist=negbin link=log in SAS.
fit_nb_glmm(data, formula, replicate_col = NULL)fit_nb_glmm(data, formula, replicate_col = NULL)
data |
Data frame of observed (non-missing) records. |
formula |
A two-sided formula specifying fixed and random effects, e.g.
|
replicate_col |
Character or |
The dispersion parameter follows the parameterisation used throughout
gsDesignNB: such that . This
matches glmmTMB's nbinom2 family, where glmmTMB::sigma() returns
directly.
A named list—one element per unique replicate—where each element contains:
modelThe fitted glmmTMB object.
kEstimated NB dispersion (= glmmTMB::sigma(model)).
When replicate_col = NULL, the list has a single element named "1".
## Not run: # Requires glmmTMB obs_data <- long_data[!is.na(long_data$count), ] fits <- fit_nb_glmm( data = obs_data, formula = count ~ baseline + trt + visit + (1 | id) ) fits[["1"]]$k # estimated dispersion ## End(Not run)## Not run: # Requires glmmTMB obs_data <- long_data[!is.na(long_data$count), ] fits <- fit_nb_glmm( data = obs_data, formula = count ~ baseline + trt + visit + (1 | id) ) fits[["1"]]$k # estimated dispersion ## End(Not run)
Finds the calendar time (since start of randomization) at which a specified total number of events is reached in the simulated dataset.
get_analysis_date(data, planned_events, event_gap = 5/365.25)get_analysis_date(data, planned_events, event_gap = 5/365.25)
data |
A data frame of simulated data, typically from |
planned_events |
Integer. The target number of events. |
event_gap |
Gap duration after each event during which no new events are counted.
Can be a numeric value (default |
Numeric. The calendar date when planned_events is achieved.
If the dataset contains fewer than planned_events, returns the maximum
calendar time in the dataset and prints a message.
enroll_rate <- data.frame(rate = 20 / (5 / 12), duration = 5 / 12) fail_rate <- data.frame(treatment = c("Control", "Experimental"), rate = c(0.5, 0.3)) dropout_rate <- data.frame( treatment = c("Control", "Experimental"), rate = c(0.1, 0.05), duration = c(100, 100) ) sim <- nb_sim(enroll_rate, fail_rate, dropout_rate, max_followup = 2, n = 40) get_analysis_date(sim, planned_events = 15)enroll_rate <- data.frame(rate = 20 / (5 / 12), duration = 5 / 12) fail_rate <- data.frame(treatment = c("Control", "Experimental"), rate = c(0.5, 0.3)) dropout_rate <- data.frame( treatment = c("Control", "Experimental"), rate = c(0.1, 0.05), duration = c(100, 100) ) sim <- nb_sim(enroll_rate, fail_rate, dropout_rate, max_followup = 2, n = 40) get_analysis_date(sim, planned_events = 15)
Finds the earliest calendar date at which all specified criteria are met. Criteria can include a specific calendar date, a target number of events, a target number of completers, or a target amount of blinded information.
get_cut_date( data, planned_calendar = NULL, target_events = NULL, target_completers = NULL, target_info = NULL, event_gap = 0, ratio = 1, lambda1 = NULL, lambda2 = NULL, min_date = 0, max_date = Inf )get_cut_date( data, planned_calendar = NULL, target_events = NULL, target_completers = NULL, target_info = NULL, event_gap = 0, ratio = 1, lambda1 = NULL, lambda2 = NULL, min_date = 0, max_date = Inf )
data |
A data frame of simulated data (from |
planned_calendar |
Numeric. Target calendar time. |
target_events |
Integer. Target number of observed events. |
target_completers |
Integer. Target number of subjects with complete follow-up. |
target_info |
Numeric. Target blinded information. |
event_gap |
Numeric. Gap duration for event counting and info calculation. |
ratio |
Numeric. Randomization ratio (experimental/control) for info calculation. |
lambda1 |
Numeric. Planned control rate for info calculation. |
lambda2 |
Numeric. Planned experimental rate for info calculation. |
min_date |
Numeric. Minimum possible date (e.g., 0 or previous analysis time). |
max_date |
Numeric. Maximum possible date (e.g., trial duration). |
Numeric. The calendar date satisfying the criteria. If criteria cannot be met
within max_date (or data limits), returns max_date (or max data time).
set.seed(456) enroll_rate <- data.frame(rate = 15, duration = 1) fail_rate <- data.frame( treatment = c("Control", "Experimental"), rate = c(0.6, 0.4) ) sim_data <- nb_sim(enroll_rate, fail_rate, max_followup = 1, n = 20) get_cut_date(sim_data, planned_calendar = 0.5, target_events = 5, event_gap = 0)set.seed(456) enroll_rate <- data.frame(rate = 15, duration = 1) fail_rate <- data.frame( treatment = c("Control", "Experimental"), rate = c(0.6, 0.4) ) sim_data <- nb_sim(enroll_rate, fail_rate, max_followup = 1, n = 20) get_cut_date(sim_data, planned_calendar = 0.5, target_events = 5, event_gap = 0)
Creates a group sequential design for negative binomial outcomes based on
sample size calculations from sample_size_nbinom().
gsNBCalendar( x, k = 3, test.type = 4, alpha = 0.025, beta = 0.1, astar = 0, delta = 0, sfu = gsDesign::sfHSD, sfupar = -4, sfl = gsDesign::sfHSD, sflpar = -2, sfharm = gsDesign::sfHSD, sfharmparam = -2, testUpper = TRUE, testLower = TRUE, testHarm = TRUE, tol = 1e-06, r = 18, usTime = NULL, lsTime = NULL, analysis_times = NULL )gsNBCalendar( x, k = 3, test.type = 4, alpha = 0.025, beta = 0.1, astar = 0, delta = 0, sfu = gsDesign::sfHSD, sfupar = -4, sfl = gsDesign::sfHSD, sflpar = -2, sfharm = gsDesign::sfHSD, sfharmparam = -2, testUpper = TRUE, testLower = TRUE, testHarm = TRUE, tol = 1e-06, r = 18, usTime = NULL, lsTime = NULL, analysis_times = NULL )
x |
An object of class |
k |
Number of analyses (interim + final). Default is 3. |
test.type |
Test type as in
Default is 4. |
alpha |
Type I error (one-sided). Default is 0.025. |
beta |
Type II error (1 - power). Default is 0.1. |
astar |
For test.type 5 or 6, allocated Type I error for the lower
bound. For test.type 7 or 8, total probability of crossing the harm bound
under the null. Default is 0 (set to |
delta |
Standardized effect size. Default is 0 (computed from design). |
sfu |
Spending function for upper bound. Default is |
sfupar |
Parameter for upper spending function. Default is -4. |
sfl |
Spending function for lower bound. Default is |
sflpar |
Parameter for lower spending function. Default is -2. |
sfharm |
Spending function for the harm bound (test.type 7 or 8).
Default is |
sfharmparam |
Parameter for the harm spending function. Default is -2. |
testUpper |
Logical scalar or vector of length |
testLower |
Logical scalar or vector of length |
testHarm |
Logical scalar or vector of length |
tol |
Tolerance for convergence. Default is 1e-06. |
r |
Integer controlling grid size for numerical integration. Default is 18. |
usTime |
Spending time for upper bound (optional). |
lsTime |
Spending time for lower bound (optional). |
analysis_times |
Vector of calendar times for each analysis.
Must have length k. These times are stored in the |
An object of class gsNB which inherits from gsDesign
and sample_size_nbinom_result.
While the final sample size would be planned total enrollment, interim analysis
sample sizes are the expected number enrolled at the times specified in analysis_times.
Output value contains all elements from
gsDesign::gsDesign() plus:
The original sample_size_nbinom_result object
A vector with sample size per analysis for group 1
A vector with sample size per analysis for group 2
A vector with total sample size per analysis
A vector with expected total events per analysis
A vector with expected events per analysis for group 1
A vector with expected events per analysis for group 2
A vector with expected average calendar exposure per analysis
A vector with expected at-risk exposure per analysis for group 1
A vector with expected at-risk exposure per analysis for group 2
A vector with variance of log rate ratio per analysis
Calendar time at each analysis (if analysis_times provided)
Logical vector indicating which analyses have an efficacy bound
Logical vector indicating which analyses have a futility bound
Logical vector indicating which analyses have a harm bound (test.type 7 or 8 only)
Note that n.I in the returned object represents the statistical information
at each analysis.
Jennison, C. and Turnbull, B.W. (2000), Group Sequential Methods with Applications to Clinical Trials. Boca Raton: Chapman and Hall.
# First create a sample size calculation nb_ss <- sample_size_nbinom( lambda1 = 0.5, lambda2 = 0.3, dispersion = 0.1, power = 0.9, accrual_rate = 10, accrual_duration = 20, trial_duration = 24 ) # Then create a group sequential design with analysis times gs_design <- gsNBCalendar(nb_ss, k = 3, test.type = 4, analysis_times = c(10, 18, 24) ) # Selective bound testing: futility only at IA1, efficacy deferred to IA2+ gs_selective <- gsNBCalendar(nb_ss, k = 3, test.type = 4, analysis_times = c(10, 18, 24), testUpper = c(FALSE, TRUE, TRUE), testLower = c(TRUE, TRUE, FALSE) ) gs_selective# First create a sample size calculation nb_ss <- sample_size_nbinom( lambda1 = 0.5, lambda2 = 0.3, dispersion = 0.1, power = 0.9, accrual_rate = 10, accrual_duration = 20, trial_duration = 24 ) # Then create a group sequential design with analysis times gs_design <- gsNBCalendar(nb_ss, k = 3, test.type = 4, analysis_times = c(10, 18, 24) ) # Selective bound testing: futility only at IA1, efficacy deferred to IA2+ gs_selective <- gsNBCalendar(nb_ss, k = 3, test.type = 4, analysis_times = c(10, 18, 24), testUpper = c(FALSE, TRUE, TRUE), testLower = c(TRUE, TRUE, FALSE) ) gs_selective
Orchestrates the full multiple imputation (MI) pipeline for longitudinal recurrent-event count data with negative binomial overdispersion:
impute_nb( data, formula, outcome_col, miss_flag_col, baseline_col, trt_col, reference_trt, subject_col, strata_cols = NULL, mar_values = "MAR", mnar_value = "MNAR", composite_value = "Comp", n_imp = 5L, n_boot = 1L, seed = NULL )impute_nb( data, formula, outcome_col, miss_flag_col, baseline_col, trt_col, reference_trt, subject_col, strata_cols = NULL, mar_values = "MAR", mnar_value = "MNAR", composite_value = "Comp", n_imp = 5L, n_boot = 1L, seed = NULL )
data |
Data frame in long format (one row per subject × visit). |
formula |
Two-sided formula passed to |
outcome_col |
Character. Column name of the count outcome. |
miss_flag_col |
Character. Column name of the missingness mechanism
flag. Values in this column control which imputation strategy is applied:
|
baseline_col |
Character. Column name of the baseline count used by the composite strategy. |
trt_col |
Character. Column name of the treatment group. |
reference_trt |
Value in |
subject_col |
Character. Column name of the subject identifier (cluster unit for bootstrap resampling). |
strata_cols |
Character vector of column names used to stratify the
bootstrap resampling. Default |
mar_values |
Character vector. Values of |
mnar_value |
Character. Value of |
composite_value |
Character. Value of |
n_imp |
Integer. Number of imputations per bootstrap
replicate. Default |
n_boot |
Integer. Number of bootstrap replicates. Default
|
seed |
Integer or |
Bootstrap resampling (optional): cluster-level (subject-level)
stratified resampling with replacement, creating n_boot replicates.
This propagates estimation uncertainty into the imputed values,
mirroring the PROC SURVEYSELECT method=urs cluster=USUBJID step in the
SAS macro.
GLMM fitting: a negative binomial GLMM is fitted to the observed
(non-missing) rows of each replicate via fit_nb_glmm().
Imputation by mechanism:
MAR rows: predicted mean with subject BLUPs → Gamma–Poisson draw.
MNAR reference-arm rows: same as MAR (reference arm has no "better" treatment to copy from).
MNAR non-reference-arm rows: reference-based (copy-reference)
imputation. The counterfactual mean is the fixed-effects-only
prediction under the reference arm multiplied by the subject's
random-effect ratio (BLUP prediction / FE prediction on the response
scale). See impute_nb_mnar_ref().
Composite ICE rows: missing value set to baseline count.
See impute_nb_composite().
Returns a long-format data frame with one row per original observation × bootstrap replicate × imputation.
Setting n_boot > 1 combines bootstrap and MI ("boot-MI"), which yields
a valid variance estimator without requiring Rubin's rules. Setting
n_boot = 1 produces conventional MI; apply Rubin's rules to the n_imp
imputed datasets when pooling.
The formula is passed directly to glmmTMB::glmmTMB(). A typical formula
mirrors the PROC GLIMMIX model:
outcome ~ baseline + strat1 + strat2 + trt + visit + param + (1 | id)
The original SAS model also included an unstructured residual covariance
across visits within id:param:
+ (0 + visit | id:param)
Complex random-effect structures may cause convergence issues; start with a random intercept only and add complexity as needed.
The composite strategy applies only to missing post-ICE rows
(is.na(outcome_col) must be TRUE). Observed rows with miss_flag_col == composite_value are left unchanged.
A data frame with all columns from data plus:
replicateBootstrap replicate index (1 to n_boot).
imputationImputation index (1 to n_imp).
imputed_valueImputed count. Equals the observed value for non-missing rows; contains imputed draws for missing rows.
The total number of rows is
nrow(data) * n_boot * n_imp.
## Not run: # Requires glmmTMB result <- impute_nb( data = long_data, formula = count ~ baseline + trt + visit + (1 | id), outcome_col = "count", miss_flag_col = "miss_flag", baseline_col = "baseline", trt_col = "trt", reference_trt = 0L, subject_col = "id", strata_cols = c("trt", "strat1"), mar_values = "MAR", mnar_value = "MNAR", composite_value = "Comp", n_imp = 5L, n_boot = 10L, seed = 42L ) head(result[!is.na(result$miss_flag), ]) ## End(Not run)## Not run: # Requires glmmTMB result <- impute_nb( data = long_data, formula = count ~ baseline + trt + visit + (1 | id), outcome_col = "count", miss_flag_col = "miss_flag", baseline_col = "baseline", trt_col = "trt", reference_trt = 0L, subject_col = "id", strata_cols = c("trt", "strat1"), mar_values = "MAR", mnar_value = "MNAR", composite_value = "Comp", n_imp = 5L, n_boot = 10L, seed = 42L ) head(result[!is.na(result$miss_flag), ]) ## End(Not run)
For subjects whose missingness flag matches composite_value, all missing
post-ICE count observations are set to the subject's baseline count. This
implements the composite estimand strategy for intercurrent events such as
death or treatment discontinuation due to disease worsening, where the
event itself is incorporated into the outcome (e.g., baseline carried
forward as a "worst case" placeholder).
impute_nb_composite( data, outcome_col, imputed_value_col = "imputed_value", miss_flag_col, composite_value = "Comp", baseline_col )impute_nb_composite( data, outcome_col, imputed_value_col = "imputed_value", miss_flag_col, composite_value = "Comp", baseline_col )
data |
Data frame. |
outcome_col |
Character. Column with the original count outcome;
used to identify which rows are missing ( |
imputed_value_col |
Character. Column to update. If absent, it is
created as a copy of |
miss_flag_col |
Character. Column with the missingness flag. |
composite_value |
Character. Flag value triggering the composite
strategy. Default |
baseline_col |
Character. Column with the baseline count used as the fill value. |
The function is intentionally simple and requires no model. It can be
applied to a dataset already containing imputed_value from a prior MAR
or MNAR imputation step, or directly to the original data.
Data frame with imputed_value_col updated for composite rows.
df <- data.frame( count = c(3L, NA, NA, 5L), imputed_value = c(3L, 7L, NA, 5L), miss_flag = c(NA, "MAR", "Comp", NA), baseline = c(4L, 4L, 4L, 6L) ) impute_nb_composite( df, outcome_col = "count", miss_flag_col = "miss_flag", composite_value = "Comp", baseline_col = "baseline" )df <- data.frame( count = c(3L, NA, NA, 5L), imputed_value = c(3L, 7L, NA, 5L), miss_flag = c(NA, "MAR", "Comp", NA), baseline = c(4L, 4L, 4L, 6L) ) impute_nb_composite( df, outcome_col = "count", miss_flag_col = "miss_flag", composite_value = "Comp", baseline_col = "baseline" )
For observations whose missingness flag matches mar_values, generates
n_imp imputed counts using the GLMM predicted mean including
subject-level BLUPs. Draws use the Gamma–Poisson compound distribution:
impute_nb_mar( data, fits, outcome_col, miss_flag_col, mar_values = "MAR", n_imp = 5L, replicate_col = NULL )impute_nb_mar( data, fits, outcome_col, miss_flag_col, mar_values = "MAR", n_imp = 5L, replicate_col = NULL )
data |
Data frame including all rows (observed and missing). |
fits |
Named list of fits as returned by |
outcome_col |
Character. Column with the count outcome (may have |
miss_flag_col |
Character. Column with the missingness flag. |
mar_values |
Character vector. Flag values treated as MAR. Default
|
n_imp |
Integer. Number of imputations per replicate. Default |
replicate_col |
Character or |
This function handles one or more bootstrap replicates when a
replicate_col is provided and fits contains one model per replicate.
Observed rows (non-missing outcome) are passed through unchanged
(imputed_value = observed value).
Data frame in long format with all original columns plus
imputation (integer 1 to n_imp) and imputed_value (imputed count;
equals the observed value for non-missing rows).
## Not run: fits <- fit_nb_glmm(obs_data, count ~ base + trt + visit + (1 | id)) imp_mar <- impute_nb_mar( data = long_data, fits = fits, outcome_col = "count", miss_flag_col = "miss_flag", mar_values = "MAR", n_imp = 5L ) ## End(Not run)## Not run: fits <- fit_nb_glmm(obs_data, count ~ base + trt + visit + (1 | id)) imp_mar <- impute_nb_mar( data = long_data, fits = fits, outcome_col = "count", miss_flag_col = "miss_flag", mar_values = "MAR", n_imp = 5L ) ## End(Not run)
Implements the copy-reference (CR) strategy for observations whose
missingness flag matches mnar_value in non-reference treatment arms.
The imputation mean is the fixed-effects-only prediction under the reference
arm, adjusted upward (or downward) by the subject's estimated random effect:
This mirrors the SAS PROC PLM approach that re-predicts under the
counterfactual treatment and then multiplies by the BLUP ratio on the
response scale.
impute_nb_mnar_ref( data, fits, outcome_col, miss_flag_col, mnar_value = "MNAR", trt_col, reference_trt, n_imp = 5L, replicate_col = NULL )impute_nb_mnar_ref( data, fits, outcome_col, miss_flag_col, mnar_value = "MNAR", trt_col, reference_trt, n_imp = 5L, replicate_col = NULL )
data |
Data frame including all rows (observed and missing). |
fits |
Named list of fits as returned by |
outcome_col |
Character. Column with the count outcome. |
miss_flag_col |
Character. Column with the missingness flag. |
mnar_value |
Character. Flag value identifying MNAR rows. Default
|
trt_col |
Character. Column with the treatment assignment. |
reference_trt |
Value in |
n_imp |
Integer. Number of imputations per replicate. Default
|
replicate_col |
Character or |
MNAR subjects already in the reference arm should be handled by
impute_nb_mar() (MAR imputation is appropriate for the reference arm
because there is no better arm to "copy from").
Data frame in long format with all original columns plus
imputation and imputed_value. Only MNAR non-reference rows have
counterfactual imputations; all other rows pass through unchanged.
## Not run: fits <- fit_nb_glmm(obs_data, count ~ base + trt + visit + (1 | id)) imp_mnar <- impute_nb_mnar_ref( data = long_data, fits = fits, outcome_col = "count", miss_flag_col = "miss_flag", mnar_value = "MNAR", trt_col = "trt", reference_trt = 0L, n_imp = 5L ) ## End(Not run)## Not run: fits <- fit_nb_glmm(obs_data, count ~ base + trt + visit + (1 | id)) imp_mnar <- impute_nb_mnar_ref( data = long_data, fits = fits, outcome_col = "count", miss_flag_col = "miss_flag", mnar_value = "MNAR", trt_col = "trt", reference_trt = 0L, n_imp = 5L ) ## End(Not run)
Fits a negative binomial (or Poisson) log-rate model to the aggregated
subject-level data produced by cut_data_by_date(). With
test_type = "wald" (default), the method matches the
Wald test described by Mutze et al. (2019). With test_type = "score",
the function fits only the null (no treatment effect) model and computes
the score statistic, which evaluates all quantities under and
avoids the finite-sample anti-conservatism of the Wald test.
mutze_test( data, method = c("nb", "poisson"), test_type = c("wald", "score"), conf_level = 0.95, sided = 1, poisson_threshold = 50, mom_threshold = 20 ) ## S3 method for class 'mutze_test' print(x, ...)mutze_test( data, method = c("nb", "poisson"), test_type = c("wald", "score"), conf_level = 0.95, sided = 1, poisson_threshold = 50, mom_threshold = 20 ) ## S3 method for class 'mutze_test' print(x, ...)
data |
A data frame with at least the columns |
method |
Type of model to fit: "nb" (default) uses a negative binomial
GLM via |
test_type |
Type of test statistic: |
conf_level |
Confidence level for the rate ratio interval. Default 0.95. |
sided |
Number of sides for the test: 1 (default) or 2. |
poisson_threshold |
Upper threshold (in units of |
mom_threshold |
Lower threshold on |
x |
An object of class |
... |
Additional arguments (currently ignored). |
When the maximum likelihood negative binomial fit is unreliable, the test automatically switches to one of two statistically sensible fallbacks: a Poisson test when the data are essentially Poisson, or a method-of-moments (MoM) variance estimate plugged into the same negative binomial information formula when the data are extremely overdispersed or the ML fit fails to converge.
An object of class mutze_test containing:
method: A string indicating the test method used.
estimate: log rate ratio (experimental vs control). For
test_type = "score", this is a plug-in estimate.
se: standard error for the log rate ratio.
z: test statistic (Wald or score).
p_value: one-sided or two-sided p-value.
rate_ratio: estimated rate ratio and its confidence interval.
dispersion: estimated dispersion on the scale.
group_summary: observed subjects/events/exposure per treatment.
fallback: character label ("ml", "poisson", or "mom").
test_type: character label ("wald" or "score").
Invisibly returns the input object.
print(mutze_test): Print method for mutze_test objects.
enroll_rate <- data.frame(rate = 20 / (5 / 12), duration = 5 / 12) fail_rate <- data.frame(treatment = c("Control", "Experimental"), rate = c(0.5, 0.3)) dropout_rate <- data.frame( treatment = c("Control", "Experimental"), rate = c(0.1, 0.05), duration = c(100, 100) ) sim <- nb_sim(enroll_rate, fail_rate, dropout_rate, max_followup = 2, n = 40) cut <- cut_data_by_date(sim, cut_date = 1.5) mutze_test(cut) mutze_test(cut, test_type = "score")enroll_rate <- data.frame(rate = 20 / (5 / 12), duration = 5 / 12) fail_rate <- data.frame(treatment = c("Control", "Experimental"), rate = c(0.5, 0.3)) dropout_rate <- data.frame( treatment = c("Control", "Experimental"), rate = c(0.1, 0.05), duration = c(100, 100) ) sim <- nb_sim(enroll_rate, fail_rate, dropout_rate, max_followup = 2, n = 40) cut <- cut_data_by_date(sim, cut_date = 1.5) mutze_test(cut) mutze_test(cut, test_type = "score")
Simulates recurrent events for a clinical trial with piecewise constant enrollment, exponential failure rates (Poisson process), and piecewise exponential dropout.
nb_sim( enroll_rate, fail_rate, dropout_rate = NULL, max_followup = NULL, n = NULL, block = c(rep("Control", 2), rep("Experimental", 2)), event_gap = 0 )nb_sim( enroll_rate, fail_rate, dropout_rate = NULL, max_followup = NULL, n = NULL, block = c(rep("Control", 2), rep("Experimental", 2)), event_gap = 0 )
enroll_rate |
A data frame with columns |
fail_rate |
A data frame with columns |
dropout_rate |
A data frame with columns |
max_followup |
Numeric. Maximum duration of follow-up for each individual (relative to their randomization time). |
n |
Total sample size. If NULL, it is estimated from |
block |
Block vector for treatment allocation. Default is
|
event_gap |
Numeric. Gap duration after each event during which no new events are counted. Default is 0. |
The simulation generates data consistent with the negative binomial models described by Friede and Schmidli (2010) and Mütze et al. (2019). Specifically, it simulates a Gamma-distributed frailty variable for each individual (if dispersion > 0), which acts as a multiplier for that individual's event rate. Events are then generated according to a Poisson process with this subject-specific rate.
More explicitly, for a subject with baseline rate and exposure time
, the model used here is a Gamma–Poisson mixture:
Marginally, follows a negative binomial distribution with
and .
This is the package dispersion parameter (and corresponds to
in MASS::glm.nb() terminology).
A data frame (tibble) with columns:
Subject identifier
Treatment group
Time of enrollment relative to trial start
Time to event or censoring relative to randomization
Calendar time of event or censoring (enroll_time + tte)
Binary indicator: 1 for event, 0 for censoring
Multiple rows per subject are returned (one for each event, plus one for the final censoring time).
Friede, T., & Schmidli, H. (2010). Blinded sample size reestimation with count data: methods and applications in multiple sclerosis. Statistics in Medicine, 29(10), 1145–1156. doi:10.1002/sim.3861
Mütze, T., Glimm, E., Schmidli, H., & Friede, T. (2019). Group sequential designs for negative binomial outcomes. Statistical Methods in Medical Research, 28(8), 2326–2347. doi:10.1177/0962280218773115
enroll_rate <- data.frame(rate = 20 / (5 / 12), duration = 5 / 12) fail_rate <- data.frame(treatment = c("Control", "Experimental"), rate = c(0.5, 0.3)) dropout_rate <- data.frame( treatment = c("Control", "Experimental"), rate = c(0.1, 0.05), duration = c(100, 100) ) sim <- nb_sim(enroll_rate, fail_rate, dropout_rate, max_followup = 2, n = 20) head(sim)enroll_rate <- data.frame(rate = 20 / (5 / 12), duration = 5 / 12) fail_rate <- data.frame(treatment = c("Control", "Experimental"), rate = c(0.5, 0.3)) dropout_rate <- data.frame( treatment = c("Control", "Experimental"), rate = c(0.1, 0.05), duration = c(100, 100) ) sim <- nb_sim(enroll_rate, fail_rate, dropout_rate, max_followup = 2, n = 20) head(sim)
Simulates recurrent events where event rates depend on the season.
nb_sim_seasonal( enroll_rate, fail_rate, dropout_rate = NULL, max_followup = NULL, randomization_start_date = NULL, n = NULL, block = c(rep("Control", 2), rep("Experimental", 2)) )nb_sim_seasonal( enroll_rate, fail_rate, dropout_rate = NULL, max_followup = NULL, randomization_start_date = NULL, n = NULL, block = c(rep("Control", 2), rep("Experimental", 2)) )
enroll_rate |
A data frame with columns |
fail_rate |
A data frame with columns |
dropout_rate |
A data frame with columns |
max_followup |
Numeric. Max follow-up duration (years). |
randomization_start_date |
Date. Start of randomization. |
n |
Integer. Total sample size. |
block |
Character vector for block randomization. |
A data frame of class nb_sim_seasonal with columns:
id, treatment, season, enroll_time, start, end, event, calendar_start, calendar_end.
Rows represent intervals of risk or events. event=1 indicates an event at end.
event=0 indicates censoring or end of a seasonal interval at end.
enroll_rate <- data.frame(rate = 20 / (5 / 12), duration = 5 / 12) fail_rate <- data.frame( treatment = rep(c("Control", "Experimental"), each = 4), season = rep(c("Winter", "Spring", "Summer", "Fall"), times = 2), rate = c(0.6, 0.5, 0.4, 0.5, 0.4, 0.3, 0.2, 0.3) ) sim <- nb_sim_seasonal( enroll_rate = enroll_rate, fail_rate = fail_rate, max_followup = 1, randomization_start_date = as.Date("2020-01-01"), n = 20 ) head(sim)enroll_rate <- data.frame(rate = 20 / (5 / 12), duration = 5 / 12) fail_rate <- data.frame( treatment = rep(c("Control", "Experimental"), each = 4), season = rep(c("Winter", "Spring", "Summer", "Fall"), times = 2), rate = c(0.6, 0.5, 0.4, 0.5, 0.4, 0.3, 0.2, 0.3) ) sim <- nb_sim_seasonal( enroll_rate = enroll_rate, fail_rate = fail_rate, max_followup = 1, randomization_start_date = as.Date("2020-01-01"), n = 20 ) head(sim)
pkgdown::preview_site() opens docs/index.html as a file:// URL. Many
browsers restrict loading stylesheets, scripts, and fonts from local files, so
the site can appear almost unstyled. This function serves docs/ over HTTP
on the loopback interface so the site matches the published GitHub Pages
appearance.
preview_pkgdown_site(port = 8787L)preview_pkgdown_site(port = 8787L)
port |
Integer; TCP port for the static server (default |
Call from the package root after pkgdown::build_site() (or any build
that populates docs/).
The server runs until you interrupt the R session (e.g. Esc in RStudio or Ctrl+C in the terminal).
Invisibly, NULL (called for its side effect of running the server).
## Not run: pkgdown::build_site() preview_pkgdown_site() ## End(Not run)## Not run: pkgdown::build_site() preview_pkgdown_site() ## End(Not run)
Print method for gsNBsummary objects
## S3 method for class 'gsNBsummary' print(x, ...)## S3 method for class 'gsNBsummary' print(x, ...)
x |
An object of class |
... |
Additional arguments (currently ignored). |
Invisibly returns the input object.
nb_ss <- sample_size_nbinom( lambda1 = 0.5, lambda2 = 0.3, dispersion = 0.1, power = 0.9, accrual_rate = 10, accrual_duration = 20, trial_duration = 24 ) gs_design <- gsNBCalendar(nb_ss, k = 3, analysis_times = c(12, 18, 24)) s <- summary(gs_design) print(s)nb_ss <- sample_size_nbinom( lambda1 = 0.5, lambda2 = 0.3, dispersion = 0.1, power = 0.9, accrual_rate = 10, accrual_duration = 20, trial_duration = 24 ) gs_design <- gsNBCalendar(nb_ss, k = 3, analysis_times = c(12, 18, 24)) s <- summary(gs_design) print(s)
Prints a concise summary of the sample size calculation results.
## S3 method for class 'sample_size_nbinom_result' print(x, ...)## S3 method for class 'sample_size_nbinom_result' print(x, ...)
x |
An object of class |
... |
Additional arguments (currently ignored). |
Invisibly returns the input object.
x <- sample_size_nbinom( lambda1 = 0.5, lambda2 = 0.3, dispersion = 0.1, power = 0.8, accrual_rate = 10, accrual_duration = 20, trial_duration = 24 ) print(x)x <- sample_size_nbinom( lambda1 = 0.5, lambda2 = 0.3, dispersion = 0.1, power = 0.8, accrual_rate = 10, accrual_duration = 20, trial_duration = 24 ) print(x)
Print method for sample_size_nbinom_summary objects
## S3 method for class 'sample_size_nbinom_summary' print(x, ...)## S3 method for class 'sample_size_nbinom_summary' print(x, ...)
x |
An object of class |
... |
Additional arguments (currently ignored). |
Invisibly returns the input object.
x <- sample_size_nbinom( lambda1 = 0.5, lambda2 = 0.3, dispersion = 0.1, power = 0.8, accrual_rate = 10, accrual_duration = 20, trial_duration = 24 ) s <- summary(x) print(s)x <- sample_size_nbinom( lambda1 = 0.5, lambda2 = 0.3, dispersion = 0.1, power = 0.8, accrual_rate = 10, accrual_duration = 20, trial_duration = 24 ) s <- summary(x) print(s)
Opens a lightweight Shiny interface for interactive exploration of adaptive
sample size re-estimation (SSR) scenarios. The app is a thin wrapper over
sample_size_nbinom(), gsNBCalendar(), sim_ssr_nbinom(), and
summarize_ssr_sim(), so the statistical computations remain in the package.
run_ssr_shiny( display.mode = c("normal", "showcase"), launch.browser = interactive() )run_ssr_shiny( display.mode = c("normal", "showcase"), launch.browser = interactive() )
display.mode |
Character; passed to |
launch.browser |
Logical; passed to |
Invisibly returns the Shiny app object.
## Not run: run_ssr_shiny() ## End(Not run)## Not run: run_ssr_shiny() ## End(Not run)
Computes the sample size (or power) for comparing two treatment groups
assuming negative binomial distributed event counts. When
test_type = "wald" (default), the formula uses a single variance
evaluated under the alternative, corresponding to Method 3 of
Zhu & Lakkis (2014) and the formulas of Friede & Schmidli (2010)
and Mutze et al. (2019). When test_type = "score", separate null
and alternative variances are used (Farrington & Manning style),
aligning the calculation with the null-variance scale of the score test.
In practice, the final test statistic affects Type I error more than the
small difference between Wald and score sizing, so score-test designs should
be checked by simulation for both Type I error and power.
sample_size_nbinom( lambda1, lambda2, dispersion, power = NULL, alpha = 0.025, sided = 1, ratio = 1, rr0 = 1, accrual_rate, accrual_duration, trial_duration, dropout_rate = 0, max_followup = NULL, test_type = c("wald", "score"), event_gap = NULL )sample_size_nbinom( lambda1, lambda2, dispersion, power = NULL, alpha = 0.025, sided = 1, ratio = 1, rr0 = 1, accrual_rate, accrual_duration, trial_duration, dropout_rate = 0, max_followup = NULL, test_type = c("wald", "score"), event_gap = NULL )
lambda1 |
Event rate for group 1 (control), in events per unit time. |
lambda2 |
Event rate for group 2 (treatment), in events per unit time. |
dispersion |
Dispersion parameter |
power |
Target power ( |
alpha |
Significance level. Default is 0.025. |
sided |
Number of sides for the test: 1 (one-sided) or 2 (two-sided). Default is 1. |
ratio |
Allocation ratio |
rr0 |
Rate ratio under the null hypothesis
( |
accrual_rate |
Vector of accrual rates (patients per unit time) for each recruitment segment. |
accrual_duration |
Vector of durations for each accrual segment.
Must be the same length as |
trial_duration |
Total planned duration of the trial. If
|
dropout_rate |
Dropout hazard rate. Can be:
|
max_followup |
Maximum follow-up time for any patient. Default is
|
test_type |
Type of test for which to size the study:
|
event_gap |
Gap duration after each event during which no new events
are counted (e.g., a recovery period). Default is |
Wald test (test_type = "wald"):
Score test (test_type = "score"):
where ,
, and:
is the variance under . Under (pooled rate
):
with the expected event count and
the average exposure for group .
In superiority settings, the traditional Wald/Zhu-Lakkis sample size may be slightly larger than score sizing and can provide a useful power margin when the final analysis uses the score test. Compare both sizing rules and verify the chosen design with simulation when finite-sample calibration matters.
The average exposure accounts for piecewise accrual,
piecewise exponential dropout, and maximum follow-up truncation. With
piecewise constant dropout hazards
over successive intervals, the survival function is
where is the
time spent in interval . The expected exposure for a patient with
potential follow-up is , computed
as a sum of exponential integrals over each piece. For a single constant
rate this simplifies to
.
The overall average is a weighted mean across accrual segments.
When follow-up times are variable, the dispersion is inflated by a factor
(Zhu & Lakkis,
2014) to account for the non-linear dependence of the NB variance on
exposure.
When event_gap > 0, the naive effective rate
overestimates the true population-level
effective rate because of subject-level heterogeneity (frailty).
In the Gamma-Poisson mixture, each subject's rate
is random.
Since is concave, Jensen's inequality gives
.
A second-order Taylor correction is applied:
This uses and
.
An object of class sample_size_nbinom_result, which is a list
containing:
Named list of the original function arguments.
Sample size for group 1 (control).
Sample size for group 2 (treatment).
Total sample size ().
Significance level used.
One-sided or two-sided test.
Power of the test.
Average calendar exposure (vector of
length 2 for control and treatment).
Average at-risk exposure for group 1 (adjusted for event gap).
Average at-risk exposure for group 2 (adjusted for event gap).
Expected number of events in group 1.
Expected number of events in group 2.
Total expected number of events.
Variance of the log rate ratio
.
Null variance of the log rate ratio used for
score-test sizing, on the same final-analysis scale as variance.
Accrual rate(s) used (possibly scaled to achieve target power).
Accrual duration(s) used.
Zhu, H., & Lakkis, H. (2014). Sample size calculation for comparing two negative binomial rates. Statistics in Medicine, 33(3), 376–387. doi:10.1002/sim.5947
Friede, T., & Schmidli, H. (2010). Blinded sample size reestimation with negative binomial counts in superiority and non-inferiority trials. Methods of Information in Medicine, 49(06), 618–624. doi:10.3414/ME09-02-0060
Mutze, T., Glimm, E., Schmidli, H., & Friede, T. (2019). Group sequential designs for negative binomial outcomes. Statistical Methods in Medical Research, 28(8), 2326–2347. doi:10.1177/0962280218773115
compute_info_at_time() for computing statistical information at a given
analysis time; blinded_ssr() for blinded sample size reestimation;
gsNBCalendar() for group sequential designs;
vignette("sample-size-nbinom", package = "gsDesignNB") for detailed
methodology.
# Basic sample size calculation x <- sample_size_nbinom( lambda1 = 0.5, lambda2 = 0.3, dispersion = 0.1, power = 0.8, accrual_rate = 10, accrual_duration = 20, trial_duration = 24 ) class(x) summary(x) # With piecewise accrual x2 <- sample_size_nbinom( lambda1 = 0.5, lambda2 = 0.3, dispersion = 0.1, power = 0.8, accrual_rate = c(5, 10), accrual_duration = c(3, 3), trial_duration = 12 ) summary(x2) # Compute power for a fixed design (power = NULL) sample_size_nbinom( lambda1 = 0.5, lambda2 = 0.3, dispersion = 0.1, power = NULL, accrual_rate = 10, accrual_duration = 20, trial_duration = 24 )# Basic sample size calculation x <- sample_size_nbinom( lambda1 = 0.5, lambda2 = 0.3, dispersion = 0.1, power = 0.8, accrual_rate = 10, accrual_duration = 20, trial_duration = 24 ) class(x) summary(x) # With piecewise accrual x2 <- sample_size_nbinom( lambda1 = 0.5, lambda2 = 0.3, dispersion = 0.1, power = 0.8, accrual_rate = c(5, 10), accrual_duration = c(3, 3), trial_duration = 12 ) summary(x2) # Compute power for a fixed design (power = NULL) sample_size_nbinom( lambda1 = 0.5, lambda2 = 0.3, dispersion = 0.1, power = NULL, accrual_rate = 10, accrual_duration = 20, trial_duration = 24 )
Simulates multiple replicates of a group sequential clinical trial with negative binomial outcomes, performing interim analyses at specified calendar times. Supports parallel execution via the future framework for faster simulation with reproducible random number generation.
sim_gs_nbinom( n_sims, enroll_rate, fail_rate, dropout_rate = NULL, max_followup, event_gap = NULL, analysis_times = NULL, n_target = NULL, design = NULL, data_cut = cut_data_by_date, cuts = NULL, test_type = c("wald", "score"), seed = TRUE )sim_gs_nbinom( n_sims, enroll_rate, fail_rate, dropout_rate = NULL, max_followup, event_gap = NULL, analysis_times = NULL, n_target = NULL, design = NULL, data_cut = cut_data_by_date, cuts = NULL, test_type = c("wald", "score"), seed = TRUE )
n_sims |
Number of simulations to run. |
enroll_rate |
Enrollment rates (data frame with |
fail_rate |
Failure rates (data frame with |
dropout_rate |
Dropout rates (data frame with |
max_followup |
Maximum follow-up time. |
event_gap |
Event gap duration. If |
analysis_times |
Vector of calendar times for interim and final analyses.
Optional if |
n_target |
Total sample size to enroll (optional, if not defined by |
design |
An object of class |
data_cut |
Function to cut data for analysis. Defaults to |
cuts |
A list of cutting criteria for each analysis. Each element of the list
should be a list of arguments for |
test_type |
Type of test statistic passed to |
seed |
Random seed for reproducible simulations. Controls the
When future.apply is not installed, |
This function uses future.apply::future_lapply() to distribute simulation
replicates across workers. By default, simulations run sequentially
(equivalent to lapply()). To enable parallel execution, set a
future plan before calling this function:
library(future) plan(multisession, workers = 4) # use 4 parallel workers results <- sim_gs_nbinom(...) plan(sequential) # restore default
The default seed = TRUE ensures that results are fully reproducible
regardless of the future plan (sequential or parallel) and regardless
of the number of workers. This is achieved via the L'Ecuyer-CMRG algorithm
which generates statistically independent random number streams for each
simulation replicate. To obtain the same results across runs:
set.seed(42) res1 <- sim_gs_nbinom(n_sims = 100, ..., seed = TRUE) set.seed(42) res2 <- sim_gs_nbinom(n_sims = 100, ..., seed = TRUE) identical(res1, res2) # TRUE, even with different plan()
A data frame containing simulation results for each analysis of each trial. Columns include:
Simulation ID
Analysis index
Calendar time of analysis
Number of subjects enrolled
Number of subjects in control group
Number of subjects in experimental group
Total events observed
Events in control group
Events in experimental group
Exposure at risk in control group (adjusted for event gaps)
Exposure at risk in experimental group (adjusted for event gaps)
Total exposure in control group (calendar follow-up)
Total exposure in experimental group (calendar follow-up)
Z-statistic from the Wald test (positive favors experimental if rate ratio < 1)
Estimated log rate ratio from the model
Standard error of the estimate
Method used for inference ("nb" or "poisson")
Estimated dispersion parameter from the model
Estimated blinded statistical information (ML)
Observed unblinded statistical information (ML)
Observed unblinded statistical information (ML)
Estimated blinded statistical information (ML)
Observed unblinded statistical information (Method of Moments)
Estimated blinded statistical information (Method of Moments)
# Basic sequential usage with reproducible seed set.seed(123) enroll_rate <- data.frame(rate = 10, duration = 3) fail_rate <- data.frame( treatment = c("Control", "Experimental"), rate = c(0.6, 0.4), dispersion = 0.2 ) dropout_rate <- data.frame( treatment = c("Control", "Experimental"), rate = c(0.05, 0.05), duration = c(6, 6) ) design <- sample_size_nbinom( lambda1 = 0.6, lambda2 = 0.4, dispersion = 0.2, power = 0.8, accrual_rate = enroll_rate$rate, accrual_duration = enroll_rate$duration, trial_duration = 6 ) cuts <- list( list(planned_calendar = 2), list(planned_calendar = 4) ) sim_results <- sim_gs_nbinom( n_sims = 2, enroll_rate = enroll_rate, fail_rate = fail_rate, dropout_rate = dropout_rate, max_followup = 4, n_target = 30, design = design, cuts = cuts, seed = TRUE ) head(sim_results) ## Not run: # Parallel execution (requires future and future.apply) library(future) plan(multisession, workers = 4) set.seed(42) sim_results <- sim_gs_nbinom( n_sims = 1000, enroll_rate = enroll_rate, fail_rate = fail_rate, dropout_rate = dropout_rate, max_followup = 4, n_target = 30, design = design, cuts = cuts, seed = TRUE ) plan(sequential) ## End(Not run)# Basic sequential usage with reproducible seed set.seed(123) enroll_rate <- data.frame(rate = 10, duration = 3) fail_rate <- data.frame( treatment = c("Control", "Experimental"), rate = c(0.6, 0.4), dispersion = 0.2 ) dropout_rate <- data.frame( treatment = c("Control", "Experimental"), rate = c(0.05, 0.05), duration = c(6, 6) ) design <- sample_size_nbinom( lambda1 = 0.6, lambda2 = 0.4, dispersion = 0.2, power = 0.8, accrual_rate = enroll_rate$rate, accrual_duration = enroll_rate$duration, trial_duration = 6 ) cuts <- list( list(planned_calendar = 2), list(planned_calendar = 4) ) sim_results <- sim_gs_nbinom( n_sims = 2, enroll_rate = enroll_rate, fail_rate = fail_rate, dropout_rate = dropout_rate, max_followup = 4, n_target = 30, design = design, cuts = cuts, seed = TRUE ) head(sim_results) ## Not run: # Parallel execution (requires future and future.apply) library(future) plan(multisession, workers = 4) set.seed(42) sim_results <- sim_gs_nbinom( n_sims = 1000, enroll_rate = enroll_rate, fail_rate = fail_rate, dropout_rate = dropout_rate, max_followup = 4, n_target = 30, design = design, cuts = cuts, seed = TRUE ) plan(sequential) ## End(Not run)
Simulates recurrent-event group sequential trials with information-based
interim analyses and optional sample size re-estimation (SSR). Interim timing
follows the blinded-information targeting used in the SSR study vignette:
get_cut_date() searches for the earliest cut date where the blinded
information reaches the planned information fraction, with an unblinded
fallback if the blinded fit is unavailable.
sim_ssr_nbinom( n_sims, enroll_rate, fail_rate, dropout_rate = NULL, max_followup, design, n_max = NULL, strategies = c("No adaptation", "Blinded SSR", "Unblinded SSR"), adapt_analysis = NULL, min_if_futility = 0, max_enrollment_frac_for_adapt = 1, min_months_to_close_for_adapt = 0, analysis_lag_months = 0, event_gap = NULL, bound_info = c("unblinded_ml", "blinded_ml", "unblinded_mom", "blinded_mom"), first_min_time = 1, min_analysis_gap = 0.5, ignore_futility = FALSE, metadata = NULL, test_type = c("wald", "score"), seed = TRUE )sim_ssr_nbinom( n_sims, enroll_rate, fail_rate, dropout_rate = NULL, max_followup, design, n_max = NULL, strategies = c("No adaptation", "Blinded SSR", "Unblinded SSR"), adapt_analysis = NULL, min_if_futility = 0, max_enrollment_frac_for_adapt = 1, min_months_to_close_for_adapt = 0, analysis_lag_months = 0, event_gap = NULL, bound_info = c("unblinded_ml", "blinded_ml", "unblinded_mom", "blinded_mom"), first_min_time = 1, min_analysis_gap = 0.5, ignore_futility = FALSE, metadata = NULL, test_type = c("wald", "score"), seed = TRUE )
n_sims |
Number of simulated trials. |
enroll_rate |
Enrollment-rate data frame passed to |
fail_rate |
Failure-rate data frame passed to |
dropout_rate |
Optional dropout-rate data frame passed to |
max_followup |
Maximum follow-up per patient in the simulated trial. |
design |
A planning object of class |
n_max |
Maximum total enrollment allowed after SSR. Defaults to 150% of the planned final enrollment, rounded up and constrained to be at least the planned sample size. |
strategies |
Character vector of strategies to simulate. Must be chosen
from |
adapt_analysis |
Interim analysis index where SSR may be applied.
Defaults to the last interim analysis ( |
min_if_futility |
Minimum observed information fraction required before
allowing a futility stop. Default is |
max_enrollment_frac_for_adapt |
Maximum fraction of the planned
enrollment already accrued at the adaptation cut for SSR to be allowed.
Default is |
min_months_to_close_for_adapt |
Minimum predicted months remaining to
planned enrollment close required to allow SSR. Default is |
analysis_lag_months |
Additional months of enrollment counted after a
futility stop to approximate operational lag. Default is |
event_gap |
Optional event-gap duration. If |
bound_info |
Information measure used for efficacy/futility bounds.
Choices are |
first_min_time |
Minimum calendar time allowed for the first
information-based interim search. Default is |
min_analysis_gap |
Minimum gap between successive information-based
interim searches. Default is |
ignore_futility |
Logical. If |
metadata |
Optional named list or one-row data frame of scenario labels to repeat across the returned rows. |
test_type |
Type of test statistic passed to |
seed |
Random-seed control passed to |
The function compares one or more strategies:
The trial keeps the planned sample size.
A blinded nuisance-parameter update is applied at
adapt_analysis using blinded_ssr().
An unblinded nuisance-parameter update is applied at
adapt_analysis using unblinded_ssr().
The returned trial_results include stage-specific columns such as
z_ia1, if_ia1, ia1_time, participants_with_events_ia1,
events_observed_ia1, and analogous columns for later analyses. For the
actual stopping stage, participants_with_events_stop and
events_observed_stop summarize the observed burden carried into the final
decision for each simulated trial.
An object of class sim_ssr_nbinom with components:
One row per simulation and strategy, containing trial-level outcomes and stage-specific summaries in wide format.
One row per simulation, strategy, and analysis in long format.
A list of key design and simulation settings.
set.seed(123) enroll_rate <- data.frame(rate = 12, duration = 6) fail_rate <- data.frame( treatment = c("Control", "Experimental"), rate = c(0.6, 0.42), dispersion = 0.4 ) dropout_rate <- data.frame( treatment = c("Control", "Experimental"), rate = c(0.05, 0.05), duration = c(12, 12) ) fixed_design <- sample_size_nbinom( lambda1 = 0.6, lambda2 = 0.42, dispersion = 0.4, power = 0.8, alpha = 0.025, accrual_rate = 12, accrual_duration = 6, trial_duration = 12, max_followup = 6 ) gs_design <- gsNBCalendar( fixed_design, k = 3, test.type = 4, alpha = 0.025, sfu = sfHSD, sfupar = -2, sfl = sfHSD, sflpar = 1, analysis_times = c(4, 8, 12) ) sim_res <- sim_ssr_nbinom( n_sims = 2, enroll_rate = enroll_rate, fail_rate = fail_rate, dropout_rate = dropout_rate, max_followup = 6, design = gs_design, seed = 123 ) names(sim_res) head(sim_res$trial_results)set.seed(123) enroll_rate <- data.frame(rate = 12, duration = 6) fail_rate <- data.frame( treatment = c("Control", "Experimental"), rate = c(0.6, 0.42), dispersion = 0.4 ) dropout_rate <- data.frame( treatment = c("Control", "Experimental"), rate = c(0.05, 0.05), duration = c(12, 12) ) fixed_design <- sample_size_nbinom( lambda1 = 0.6, lambda2 = 0.42, dispersion = 0.4, power = 0.8, alpha = 0.025, accrual_rate = 12, accrual_duration = 6, trial_duration = 12, max_followup = 6 ) gs_design <- gsNBCalendar( fixed_design, k = 3, test.type = 4, alpha = 0.025, sfu = sfHSD, sfupar = -2, sfl = sfHSD, sflpar = 1, analysis_times = c(4, 8, 12) ) sim_res <- sim_ssr_nbinom( n_sims = 2, enroll_rate = enroll_rate, fail_rate = fail_rate, dropout_rate = dropout_rate, max_followup = 6, design = gs_design, seed = 123 ) names(sim_res) head(sim_res$trial_results)
Provides a summary of the operating characteristics of the group sequential design based on simulation results.
summarize_gs_sim(x, info_trim = 0.01)summarize_gs_sim(x, info_trim = 0.01)
x |
A data frame returned by |
info_trim |
Proportion of observations trimmed from each tail when
summarizing information estimates. Defaults to |
A list containing:
Number of simulations
Overall power (probability of crossing upper bound)
Overall futility rate (probability of crossing lower bound and not upper)
Data frame with per-analysis statistics (sample
size, events, information, crossings, and optional exposure columns when
present in x).
design <- gsDesign::gsDesign(k = 2, n.fix = 80, test.type = 2, timing = c(0.5, 1)) sim_df <- data.frame( sim = c(1, 1, 2, 2), analysis = c(1, 2, 1, 2), z_stat = c(2.4, NA, -0.5, 1.9), blinded_info = c(40, 80, 40, 80), unblinded_info = c(40, 80, 40, 80), n_enrolled = c(30, 60, 30, 60), events_total = c(12, 25, 10, 22) ) bounds_checked <- check_gs_bound(sim_df, design) summarize_gs_sim(bounds_checked)design <- gsDesign::gsDesign(k = 2, n.fix = 80, test.type = 2, timing = c(0.5, 1)) sim_df <- data.frame( sim = c(1, 1, 2, 2), analysis = c(1, 2, 1, 2), z_stat = c(2.4, NA, -0.5, 1.9), blinded_info = c(40, 80, 40, 80), unblinded_info = c(40, 80, 40, 80), n_enrolled = c(30, 60, 30, 60), events_total = c(12, 25, 10, 22) ) bounds_checked <- check_gs_bound(sim_df, design) summarize_gs_sim(bounds_checked)
Produces trial-level and analysis-level summaries from the output of
sim_ssr_nbinom(). The summary includes expected sample size, stopping
probabilities, expected participants with events, and expected events
observed.
summarize_ssr_sim(x, by = "strategy")summarize_ssr_sim(x, by = "strategy")
x |
A |
by |
Character vector of grouping columns for the trial-level summary.
Defaults to |
A list with:
Trial-level grouped summary.
Analysis-level grouped summary.
set.seed(123) enroll_rate <- data.frame(rate = 10, duration = 4) fail_rate <- data.frame( treatment = c("Control", "Experimental"), rate = c(0.5, 0.35), dispersion = 0.3 ) fixed_design <- sample_size_nbinom( lambda1 = 0.5, lambda2 = 0.35, dispersion = 0.3, power = 0.8, alpha = 0.025, accrual_rate = 10, accrual_duration = 4, trial_duration = 8, max_followup = 4 ) gs_design <- gsNBCalendar( fixed_design, k = 3, test.type = 4, alpha = 0.025, analysis_times = c(3, 5, 8) ) sim_res <- sim_ssr_nbinom( n_sims = 2, enroll_rate = enroll_rate, fail_rate = fail_rate, max_followup = 4, design = gs_design, strategies = "No adaptation", seed = 321 ) summarize_ssr_sim(sim_res)$trial_summaryset.seed(123) enroll_rate <- data.frame(rate = 10, duration = 4) fail_rate <- data.frame( treatment = c("Control", "Experimental"), rate = c(0.5, 0.35), dispersion = 0.3 ) fixed_design <- sample_size_nbinom( lambda1 = 0.5, lambda2 = 0.35, dispersion = 0.3, power = 0.8, alpha = 0.025, accrual_rate = 10, accrual_duration = 4, trial_duration = 8, max_followup = 4 ) gs_design <- gsNBCalendar( fixed_design, k = 3, test.type = 4, alpha = 0.025, analysis_times = c(3, 5, 8) ) sim_res <- sim_ssr_nbinom( n_sims = 2, enroll_rate = enroll_rate, fail_rate = fail_rate, max_followup = 4, design = gs_design, strategies = "No adaptation", seed = 321 ) summarize_ssr_sim(sim_res)$trial_summary
Provides a textual summary of a group sequential design for negative binomial
outcomes, similar to the summary provided by gsDesign::gsDesign().
For tabular output, use gsDesign::gsBoundSummary() directly on
the gsNB object.
## S3 method for class 'gsNB' summary(object, ...)## S3 method for class 'gsNB' summary(object, ...)
object |
An object of class |
... |
Additional arguments (currently ignored). |
A character string summarizing the design (invisibly). The summary is also printed to the console.
nb_ss <- sample_size_nbinom( lambda1 = 0.5, lambda2 = 0.3, dispersion = 0.1, power = 0.9, accrual_rate = 10, accrual_duration = 20, trial_duration = 24 ) gs_design <- gsNBCalendar(nb_ss, k = 3, analysis_times = c(12, 18, 24)) summary(gs_design) # For tabular bounds summary, use gsBoundSummary() directly: gsBoundSummary(gs_design)nb_ss <- sample_size_nbinom( lambda1 = 0.5, lambda2 = 0.3, dispersion = 0.1, power = 0.9, accrual_rate = 10, accrual_duration = 20, trial_duration = 24 ) gs_design <- gsNBCalendar(nb_ss, k = 3, analysis_times = c(12, 18, 24)) summary(gs_design) # For tabular bounds summary, use gsBoundSummary() directly: gsBoundSummary(gs_design)
Provides a textual summary of the sample size calculation for negative binomial outcomes, similar to the summary for gsNB objects.
## S3 method for class 'sample_size_nbinom_result' summary(object, ...)## S3 method for class 'sample_size_nbinom_result' summary(object, ...)
object |
An object of class |
... |
Additional arguments (currently ignored). |
A character string summarizing the design (invisibly). The summary is also printed to the console.
x <- sample_size_nbinom( lambda1 = 0.5, lambda2 = 0.3, dispersion = 0.1, power = 0.8, accrual_rate = 10, accrual_duration = 20, trial_duration = 24 ) class(x) summary(x)x <- sample_size_nbinom( lambda1 = 0.5, lambda2 = 0.3, dispersion = 0.1, power = 0.8, accrual_rate = 10, accrual_duration = 20, trial_duration = 24 ) class(x) summary(x)
Generic function to round sample sizes in a group sequential design to integers.
This extends the gsDesign::toInteger() function from the gsDesign
package to work with gsNB objects.
toInteger(x, ...) ## S3 method for class 'gsDesign' toInteger(x, ratio = x$ratio, roundUpFinal = TRUE, ...) ## S3 method for class 'gsNB' toInteger(x, ratio = x$nb_design$inputs$ratio, roundUpFinal = TRUE, ...)toInteger(x, ...) ## S3 method for class 'gsDesign' toInteger(x, ratio = x$ratio, roundUpFinal = TRUE, ...) ## S3 method for class 'gsNB' toInteger(x, ratio = x$nb_design$inputs$ratio, roundUpFinal = TRUE, ...)
x |
An object of class |
... |
Additional arguments passed to methods. |
ratio |
Randomization ratio (n2/n1). If an integer is provided, rounding
is done to a multiple of |
roundUpFinal |
If |
This function rounds the final sample size while maintaining the randomization ratio. When calendar analysis times are available, interim sample sizes remain expected enrollment counts at those calendar times after rescaling the accrual rate to the rounded final sample size.
When analysis_times were provided to gsNBCalendar(),
expected events, exposure, and statistical information (n.I) are recomputed
at each analysis time based on the new sample size and expected exposures.
An object of the same class as input with integer sample sizes.
toInteger(gsDesign): Method for gsDesign objects (calls gsDesign::toInteger()).
toInteger(gsNB): Method for gsNB objects.
Rounds sample sizes in a group sequential negative binomial design to integers, respecting the randomization ratio.
nb_ss <- sample_size_nbinom( lambda1 = 0.5, lambda2 = 0.3, dispersion = 0.1, power = 0.9, accrual_rate = 10, accrual_duration = 20, trial_duration = 24 ) gs_design <- gsNBCalendar(nb_ss, k = 3, analysis_times = c(12, 18, 24)) gs_integer <- toInteger(gs_design)nb_ss <- sample_size_nbinom( lambda1 = 0.5, lambda2 = 0.3, dispersion = 0.1, power = 0.9, accrual_rate = 10, accrual_duration = 20, trial_duration = 24 ) gs_design <- gsNBCalendar(nb_ss, k = 3, analysis_times = c(12, 18, 24)) gs_integer <- toInteger(gs_design)
Estimates the event rates and dispersion from unblinded interim data and calculates the required sample size to maintain power, assuming the planned treatment effect holds (or using the observed control rate).
unblinded_ssr( data, ratio = 1, lambda1_planning, lambda2_planning, rr0 = 1, power = 0.8, alpha = 0.025, accrual_rate, accrual_duration, trial_duration, dropout_rate = 0, max_followup = NULL, event_gap = NULL )unblinded_ssr( data, ratio = 1, lambda1_planning, lambda2_planning, rr0 = 1, power = 0.8, alpha = 0.025, accrual_rate, accrual_duration, trial_duration, dropout_rate = 0, max_followup = NULL, event_gap = NULL )
data |
A data frame containing the unblinded interim data. Must include
columns |
ratio |
Planned allocation ratio (experimental / control). Default is 1. |
lambda1_planning |
Planned event rate for the control group used in original calculation. |
lambda2_planning |
Planned event rate for the experimental group used in original calculation. |
rr0 |
Rate ratio under the null hypothesis (lambda2/lambda1). Default is 1. |
power |
Target power (1 - beta). Default is 0.8. |
alpha |
One-sided significance level. Default is 0.025. |
accrual_rate |
Vector of accrual rates (patients per unit time). |
accrual_duration |
Vector of durations for each accrual rate. Must be same length
as |
trial_duration |
Total planned duration of the trial. |
dropout_rate |
Dropout rate (hazard rate). Default is 0. |
max_followup |
Maximum follow-up time for any patient. Default is NULL (infinite). |
event_gap |
Gap duration after each event during which no new events are counted. Default is NULL (no gap). |
If the maximum likelihood negative binomial fit fails to converge, the
function falls back to method-of-moments estimation via estimate_nb_mom()
rather than erroring out. The observed Fisher information is then computed
analytically from the MoM-estimated rates and dispersion using the same
subject-level weight formula as calculate_blinded_info(). This keeps SSR
updates well-defined under extreme overdispersion or sparse interim data.
A list containing:
Re-estimated total sample size using unblinded estimates.
Estimated dispersion parameter (k) from unblinded data.
Estimated control event rate from unblinded data.
Estimated experimental event rate from unblinded data.
Estimated information fraction at interim (unblinded information / target information).
Estimated statistical information from the unblinded interim data.
Target statistical information required for the planned power.
Character label for which estimator was used
("ml" or "mom").
interim <- data.frame( events = c(1, 2, 1, 3), tte = c(0.8, 1.0, 1.2, 0.9), treatment = c("Control", "Control", "Experimental", "Experimental") ) unblinded_ssr( interim, ratio = 1, lambda1_planning = 0.5, lambda2_planning = 0.3, power = 0.8, alpha = 0.025, accrual_rate = 10, accrual_duration = 12, trial_duration = 18 )interim <- data.frame( events = c(1, 2, 1, 3), tte = c(0.8, 1.0, 1.2, 0.9), treatment = c("Control", "Control", "Experimental", "Experimental") ) unblinded_ssr( interim, ratio = 1, lambda1_planning = 0.5, lambda2_planning = 0.3, power = 0.8, alpha = 0.025, accrual_rate = 10, accrual_duration = 12, trial_duration = 18 )