--- title: "Multi-season studies for rare events" output: rmarkdown::html_vignette bibliography: gsDesign.bib vignette: > %\VignetteIndexEntry{Multi-season studies for rare events} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include=FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>", dev = "svg", fig.ext = "svg", fig.width = 7.2916667, fig.asp = 0.618, fig.align = "center", out.width = "80%" ) options(width = 58) ``` ```{r, message=FALSE, warning=FALSE} library(gsDesign) library(gt) library(tibble) ``` ## Introduction This vignette demonstrates a practical design workflow for rare seasonal events. The motivating use case is a vaccine efficacy trial with annual high-risk seasons. The same ideas can be adapted to infectious disease studies where simple superiority or non-inferiority may be considered. For a general introduction to exact-binomial vaccine efficacy monitoring, see `vignette("VaccineEfficacy", package = "gsDesign")`. We focus on exact binomial monitoring with spending functions: - analyses after each season with equally spaced spending times, - efficacy spending based on `sfHSD` with `gamma = 1` (Pocock-like), - exact binomial p-values from `repeatedPValueBinomialExact()` and `sequentialPValueBinomialExact()`. We include both a fixed group sequential design and a blinded information-adaptive option that can increase future enrollment if events accrue slowly. Conceptually, enrollment is concentrated in early fall (for example, September and October), and each analysis uses a data cut at the end of the corresponding high-risk season (for example, May 1). ## Assumptions and parameterization We use a super-superiority example: - null vaccine efficacy (VE) = 30% (`HR0 = 0.7`), - alternative VE = 80% (`HR = 0.2`), - one-sided `alpha = 0.025`, `beta = 0.1`, - randomization ratio = 3:1 (experimental:control), - control seasonal event rate = 0.3% over 6 months, - dropout modeled as exponential with 10% dropout over each 6-month season. ```{r} alpha <- 0.025 beta <- 0.1 ratio <- 3 ve0 <- 0.30 ve1 <- 0.80 hr0 <- 1 - ve0 hr1 <- 1 - ve1 seasonal_event_rate_control <- 0.003 season_length_months <- 6 season_length_years <- season_length_months / 12 dropout_6mo <- 0.10 n_seasons <- 3 stopifnot(n_seasons >= 2) timing <- seq_len(n_seasons) / n_seasons enrollment_months <- 2 off_enrollment_months <- 10 annual_cycle_months <- enrollment_months + off_enrollment_months enroll_pattern <- c(rep(c(1, 0), n_seasons - 1), 1) enroll_periods <- c(rep(c(enrollment_months, off_enrollment_months), n_seasons - 1), enrollment_months) calendar_time <- enrollment_months + season_length_months + annual_cycle_months * (seq_len(n_seasons) - 1) test_lower <- rep(FALSE, n_seasons) test_lower[1] <- TRUE ``` For the exact binomial approximation, the probability that an event is in the experimental group is ```{r} p_event_experimental <- function(ve, randomization_ratio) { randomization_ratio / (randomization_ratio + 1 / (1 - ve)) } p0 <- p_event_experimental(ve0, ratio) p1 <- p_event_experimental(ve1, ratio) c(p0 = p0, p1 = p1) ``` ## Initial group sequential setup To define spending-function monitoring, we first construct a calendar-time `gsSurvCalendar` object and then convert to integer event looks and exact binomial bounds. Here we use `r n_seasons` looks at months `r paste(calendar_time, collapse = ", ")`, corresponding to the end of each season after that year's enrollment period. Enrollment is modeled as `r enrollment_months` months on and `r off_enrollment_months` months off each year for `r n_seasons` years. The `gsSurvCalendar` inputs use months as the time unit, so seasonal event and dropout probabilities are converted to monthly hazards before fitting the design. Since events are seasonal, the control event hazard is modeled as piecewise constant: positive for the `r season_length_months`-month high-risk season and zero afterward. ```{r} lambdaC <- c(-log(1 - seasonal_event_rate_control) / season_length_months, 0) S <- season_length_months eta <- -log(1 - dropout_6mo) / season_length_months # Integer conversion can slightly adjust total enrollment so the final integer # event target is achievable with the seasonal piecewise event-rate model. design_calendar <- gsSurvCalendar( test.type = 4, # Non-binding lower bound framework alpha = alpha, # One-sided Type I error beta = beta, # Type II error calendarTime = calendar_time, # Analysis times in months spending = "information", # Spending by information fraction sfu = sfHSD, # Efficacy spending function sfupar = 1, # Pocock-like efficacy spending sfl = sfHSD, # Futility spending function sflpar = -2, # Futility spending parameter lambdaC = lambdaC, # Control hazard rate per month by period S = S, # Event-rate period duration in months hr = hr1, # Alternative hypothesis HR hr0 = hr0, # Null hypothesis HR eta = eta, # Dropout hazard rate per month gamma = enroll_pattern, # Relative enrollment rates by period R = enroll_periods, # Enrollment period durations in months minfup = season_length_months, # Minimum follow-up for final enrollees ratio = ratio, # Experimental:control randomization ratio testLower = test_lower # Futility only at selected analyses ) |> toInteger() |> suppressWarnings() gsBoundSummary(design_calendar) planned_final_events <- design_calendar$n.I[design_calendar$k] planned_counts <- as.integer(round(planned_final_events * timing)) planned_counts[n_seasons] <- planned_final_events for (j in seq_along(planned_counts)[-1]) { planned_counts[j] <- max(planned_counts[j], planned_counts[j - 1] + 1L) } design_exact <- toBinomialExact(design_calendar, observedEvents = planned_counts) planned_enrollment_period <- as.numeric(rowSums(as.matrix(design_calendar$gamma))) * as.numeric(design_calendar$R) season_id <- rep(seq_len(n_seasons), each = 2, length.out = length(planned_enrollment_period)) planned_enrollment_by_season <- as.integer(round(tapply(planned_enrollment_period, season_id, sum))) planned_enrollment_control <- as.integer(round(planned_enrollment_by_season / (1 + ratio))) planned_enrollment_experimental <- planned_enrollment_by_season - planned_enrollment_control planned_cum_enrollment <- cumsum(planned_enrollment_by_season) ``` The table above is the initial survival-design approximation and includes the approximate futility `Z`-boundary. The table below summarizes planned event looks and exact binomial efficacy/futility bounds. In this table, `x` is the cumulative number of observed events in the experimental arm at a given analysis. ```{r} target_alpha_spend <- design_calendar$upper$sf( alpha = alpha, t = timing, param = design_calendar$upper$param )$spend achieved_alpha_spend <- cumsum( gsBinomialExact( k = design_exact$k, # Number of analyses theta = design_exact$theta[1], # Null event probability in experimental arm n.I = design_exact$n.I, # Cumulative events by analysis a = design_exact$lower$bound, # Efficacy bounds (x <= a crosses) b = design_exact$n.I + 1 # Non-binding upper bound for alpha-spend check )$lower$prob[, 1] ) achieved_power_h1 <- cumsum( gsBinomialExact( k = design_exact$k, # Number of analyses theta = design_exact$theta[2], # Alternative event probability in experimental arm n.I = design_exact$n.I, # Cumulative events by analysis a = design_exact$lower$bound, # Efficacy bounds b = design_exact$n.I + 1 # Non-binding upper bound for cumulative efficacy probability )$lower$prob[, 1] ) ve_from_bound <- function(x, n, ratio) { out <- rep(NA_real_, length(x)) ok <- x >= 0 & x < n if (any(ok)) { p <- x[ok] / n[ok] hr <- p / (ratio * (1 - p)) out[ok] <- 1 - hr } out } futility_active <- if (!is.null(design_calendar$testLower)) { tl <- design_calendar$testLower if (length(tl) == 1) tl <- rep(tl, design_exact$k) as.logical(tl) } else { design_exact$upper$bound <= design_exact$n.I } nominal_p_futility <- rep(NA_real_, design_exact$k) nominal_p_futility[futility_active] <- stats::pbinom( q = design_exact$upper$bound[futility_active], size = design_exact$n.I[futility_active], prob = p0 ) tibble( Season = seq_len(n_seasons), `Spending time` = timing, `Planned total events` = design_exact$n.I, `Approx cumulative enrollment` = planned_cum_enrollment, `Exact efficacy bound (x <= a)` = design_exact$lower$bound, `VE at bound (efficacy)` = ve_from_bound(design_exact$lower$bound, design_exact$n.I, ratio), `Nominal 1-sided p at bound (efficacy)` = stats::pbinom(design_exact$lower$bound, design_exact$n.I, p0), `Exact futility bound (x >= b)` = ifelse(futility_active, design_exact$upper$bound, NA_integer_), `VE at bound (futility)` = ve_from_bound( ifelse(futility_active, design_exact$upper$bound, -1L), design_exact$n.I, ratio ), `Nominal 1-sided p at bound (futility)` = nominal_p_futility, `Target cumulative alpha spend` = target_alpha_spend, `Achieved cumulative alpha spend` = achieved_alpha_spend, `Cumulative power under H1` = achieved_power_h1 ) |> gt() |> fmt_number(columns = 2, decimals = 3) |> fmt_percent(columns = c(`VE at bound (efficacy)`, `VE at bound (futility)`), decimals = 1) |> fmt_number( columns = c( `Nominal 1-sided p at bound (efficacy)`, `Nominal 1-sided p at bound (futility)`, `Target cumulative alpha spend`, `Achieved cumulative alpha spend`, `Cumulative power under H1` ), decimals = 4 ) |> tab_spanner( label = "Efficacy", columns = c( `Exact efficacy bound (x <= a)`, `VE at bound (efficacy)`, `Nominal 1-sided p at bound (efficacy)` ) ) |> tab_spanner( label = "Futility", columns = c( `Exact futility bound (x >= b)`, `VE at bound (futility)`, `Nominal 1-sided p at bound (futility)` ) ) |> tab_header( title = "Planned exact binomial seasonal monitoring", subtitle = "Super-superiority example with Pocock-like efficacy spending" ) |> tab_footnote( footnote = "x denotes cumulative observed events in the experimental arm; efficacy is established when x is at or below the listed efficacy bound.", locations = cells_column_labels(columns = `Exact efficacy bound (x <= a)`) ) |> tab_footnote( footnote = "Blank futility entries indicate no futility stopping boundary at that analysis.", locations = cells_column_labels(columns = `Exact futility bound (x >= b)`) ) ``` The next table gives planned enrollment/sample size by season and overall. ```{r} enrollment_table <- tibble( Season = as.character(seq_len(n_seasons)), `Control planned enrollment` = planned_enrollment_control, `Experimental planned enrollment` = planned_enrollment_experimental ) |> dplyr::mutate( `Total planned enrollment` = `Control planned enrollment` + `Experimental planned enrollment`, `Cumulative planned enrollment` = cumsum(`Total planned enrollment`) ) dplyr::bind_rows( enrollment_table, tibble( Season = "Overall", `Control planned enrollment` = sum(enrollment_table$`Control planned enrollment`), `Experimental planned enrollment` = sum(enrollment_table$`Experimental planned enrollment`), `Total planned enrollment` = sum(enrollment_table$`Total planned enrollment`), `Cumulative planned enrollment` = sum(enrollment_table$`Total planned enrollment`) ) ) |> gt() |> tab_header(title = "Planned enrollment by season and overall") ``` ## Example repeated and sequential p-values Suppose observed event totals at the `r n_seasons` seasonal analyses are equal to the planned totals and observed experimental events are as below. Here, `x` is the observed number of events in the experimental arm. The offset sets season 2 to be one event above the efficacy bound to demonstrate that the repeated p-value is greater than 0.025. ```{r} x_offset_from_efficacy <- rep(0L, n_seasons) x_offset_from_efficacy[min(2L, n_seasons)] <- 1L example_x <- pmax(0L, design_exact$lower$bound + x_offset_from_efficacy) example_p <- repeatedPValueBinomialExact( gsD = design_calendar, n.I = design_exact$n.I, x = example_x ) example_p ``` The sequential p-value is the minimum repeated p-value: ```{r} sequentialPValueBinomialExact( gsD = design_calendar, n.I = design_exact$n.I, x = example_x ) ``` ## Update exact bounds at analysis time The official package path for updating exact-binomial bounds at analysis time is to call `toBinomialExact()` with `observedEvents`. The spending denominator remains the original planned final event count from the design object, so earlier analyses are not changed when observed totals differ from plan. At look `j`, spending time is based on `observedEvents[j] / planned_final_events` (capped at 1). If a look reaches the planned final event count, all remaining spending is used at that look. If the final look is under-run but you still want to use full alpha at the last analysis, set `maxSpend = TRUE`. This only changes the final spending time to 1 and leaves earlier looks unchanged. For explicit control, you can pass `usTime` (and for `test.type = 4`, `lsTime`) directly to `toBinomialExact()`, following the same spending-time conventions as `gsDesign()`, `gsSurv()`, and `gsSurvCalendar()`. As above, setting `x` one event above an updated efficacy bound at a look gives a repeated p-value above 0.025 for that analysis. ```{r} observed_counts_update <- c( planned_counts[-n_seasons], max(planned_counts[n_seasons - 1] + 1L, planned_counts[n_seasons] - 5L) ) update_exact <- toBinomialExact(design_calendar, observedEvents = observed_counts_update) update_exact_full <- toBinomialExact( design_calendar, observedEvents = observed_counts_update, maxSpend = TRUE ) tibble( Analysis = seq_along(update_exact$n.I), `Observed total events` = update_exact$n.I, `Updated efficacy bound (x <= a), default spending` = update_exact$lower$bound, `Updated efficacy bound (x <= a), maxSpend=TRUE` = update_exact_full$lower$bound, `Updated futility bound, default spending` = update_exact$upper$bound, `Updated futility bound, maxSpend=TRUE` = update_exact_full$upper$bound ) |> gt() |> tab_header(title = "Updated exact bounds using observedEvents") ``` ## Lightweight runnable simulation We use the package helper `simBinomialSeasonalExact()` to run both fixed and blinded-adaptive seasonal scenarios. In this vignette, we set `final_full_spending = TRUE` to force full alpha spending at the final look even when the final observed total event count is below plan. ```{r} ve_scenarios <- c(`H0 (VE=30%)` = ve0, `H1 (VE=80%)` = ve1) planned_control_event_rates <- rep(seasonal_event_rate_control, length(ve_scenarios)) sim_light <- simBinomialSeasonalExact( gsD = design_calendar, ve = ve_scenarios, nsim = rep(150, length(ve_scenarios)), control_event_rate = planned_control_event_rates, season_length = season_length_years, dropout_rate = dropout_6mo, planned_counts = planned_counts, enroll_control_per_look = planned_enrollment_control, enroll_experimental_per_look = planned_enrollment_experimental, adaptive = c(FALSE, TRUE), max_multiplier = 2, final_full_spending = TRUE, seed = 101 ) ``` ```{r} oc <- sim_light$summary |> dplyr::mutate( Scenario = ifelse(adaptive, paste0("Adaptive: ", scenario), paste0("Fixed: ", scenario)) ) |> dplyr::select( Scenario, `Efficacy crossing probability` = rejection_rate, `Futility stopping probability` = futility_stop_rate, `MC SE (efficacy)` = mc_se, `MC SE (futility)` = futility_mc_se, `Mean total events` = mean_total_events, `Mean total enrolled` = mean_total_enrolled, `Mean looks used` = mean_looks ) oc |> gt() |> fmt_number(columns = 2:5, decimals = 4) |> fmt_number(columns = 6:8, decimals = 2) |> tab_header( title = "Lightweight simulation results", subtitle = "Exact-binomial monitoring with seasonal analyses" ) |> tab_footnote( footnote = "For VE=30% scenarios, efficacy crossing probability is Type I error under the non-binding futility convention (futility crossings do not block later efficacy crossings).", locations = cells_column_labels(columns = `Efficacy crossing probability`) ) ``` ## Example with lower-than-planned event rates To illustrate adaptation when events are lower than planned, we halve the seasonal control event rate and compare fixed versus adaptive monitoring for both `VE = 30%` and `VE = 80%`. ```{r} low_control_event_rates <- planned_control_event_rates / 2 sim_low <- simBinomialSeasonalExact( gsD = design_calendar, ve = ve_scenarios, nsim = rep(300, length(ve_scenarios)), control_event_rate = low_control_event_rates, season_length = season_length_years, dropout_rate = dropout_6mo, planned_counts = planned_counts, enroll_control_per_look = planned_enrollment_control, enroll_experimental_per_look = planned_enrollment_experimental, adaptive = c(FALSE, TRUE), max_multiplier = 2, final_full_spending = TRUE, seed = 505 ) low <- sim_low$summary tibble( Scenario = c( "Without adaptation: Type I error (VE=30%)", "With adaptation: Type I error (VE=30%)", "Without adaptation: Power (VE=80%)", "With adaptation: Power (VE=80%)" ), `Efficacy crossing probability` = c( low$rejection_rate[!low$adaptive & low$scenario == "H0 (VE=30%)"], low$rejection_rate[low$adaptive & low$scenario == "H0 (VE=30%)"], low$rejection_rate[!low$adaptive & low$scenario == "H1 (VE=80%)"], low$rejection_rate[low$adaptive & low$scenario == "H1 (VE=80%)"] ), `Futility stopping probability` = c( low$futility_stop_rate[!low$adaptive & low$scenario == "H0 (VE=30%)"], low$futility_stop_rate[low$adaptive & low$scenario == "H0 (VE=30%)"], low$futility_stop_rate[!low$adaptive & low$scenario == "H1 (VE=80%)"], low$futility_stop_rate[low$adaptive & low$scenario == "H1 (VE=80%)"] ), `Mean total events` = c( low$mean_total_events[!low$adaptive & low$scenario == "H0 (VE=30%)"], low$mean_total_events[low$adaptive & low$scenario == "H0 (VE=30%)"], low$mean_total_events[!low$adaptive & low$scenario == "H1 (VE=80%)"], low$mean_total_events[low$adaptive & low$scenario == "H1 (VE=80%)"] ), `Mean total enrolled` = c( low$mean_total_enrolled[!low$adaptive & low$scenario == "H0 (VE=30%)"], low$mean_total_enrolled[low$adaptive & low$scenario == "H0 (VE=30%)"], low$mean_total_enrolled[!low$adaptive & low$scenario == "H1 (VE=80%)"], low$mean_total_enrolled[low$adaptive & low$scenario == "H1 (VE=80%)"] ), `Mean looks used` = c( low$mean_looks[!low$adaptive & low$scenario == "H0 (VE=30%)"], low$mean_looks[low$adaptive & low$scenario == "H0 (VE=30%)"], low$mean_looks[!low$adaptive & low$scenario == "H1 (VE=80%)"], low$mean_looks[low$adaptive & low$scenario == "H1 (VE=80%)"] ) ) |> gt() |> fmt_number(columns = 2:3, decimals = 4) |> fmt_number(columns = 4:6, decimals = 2) |> tab_header( title = "Lower-than-planned event rate illustration", subtitle = "Adaptive approach increases enrollment to recover information" ) |> tab_footnote( footnote = "Type I error rows use non-binding futility for efficacy crossing probability; futility stopping probability is shown separately.", locations = cells_column_labels(columns = `Efficacy crossing probability`) ) ``` This table provides side-by-side comparisons of Type I error, power, and futility stopping probability without and with adaptation under the lower-than-planned event-rate scenario. ## Larger offline runs (template) The chunk below is intentionally not executed in package builds. ```{r, eval=FALSE} # Suggested offline settings type1_nsim <- 20000 power_nsim <- 3500 sim_type1_big <- simBinomialSeasonalExact( gsD = design_calendar, ve = c(`H0 (VE=30%)` = ve0), nsim = type1_nsim, control_event_rate = seasonal_event_rate_control, season_length = season_length_years, dropout_rate = dropout_6mo, planned_counts = planned_counts, enroll_control_per_look = planned_enrollment_control, enroll_experimental_per_look = planned_enrollment_experimental, adaptive = c(FALSE, TRUE), final_full_spending = TRUE, seed = 5001 ) sim_power_big <- simBinomialSeasonalExact( gsD = design_calendar, ve = c(`H1 (VE=80%)` = ve1), nsim = power_nsim, control_event_rate = seasonal_event_rate_control, season_length = season_length_years, dropout_rate = dropout_6mo, planned_counts = planned_counts, enroll_control_per_look = planned_enrollment_control, enroll_experimental_per_look = planned_enrollment_experimental, adaptive = c(FALSE, TRUE), final_full_spending = TRUE, seed = 6001 ) ``` ## Notes and extensions - The adaptive pathway shown here is blinded because it uses total event counts only; treatment-group differences are not used to update enrollment. - Spending fractions are set by `timing`; with the current specification they are `r paste(round(timing, 3), collapse = ", ")`. - For modified intention-to-treat analyses, additional exclusion/dropout mechanisms can be layered into the simulation by reducing at-risk counts before each seasonal event/dropout draw. - Simple superiority and non-inferiority variants can be studied by changing `hr0`/`hr1` and rebuilding the same workflow. ## References