Time to event

Noemi Montobbio

2026-06-18

This vignette demonstrates how to use the msprog package to analyse the disability course in multiple sclerosis (MS) in a time-to-event setting (e.g., to compute survival times for survival analysis).

For a more general introduction on msprog package usage for disability course assessment, please refer to the vignette Analysing disability course in MS.

We shall use the toy datasets toydata_visits and toydata_relapses provided in msprog.

library(msprog)
head(toydata_visits)
#> # A tibble: 6 × 5
#>   id    date       visit_day  EDSS  SDMT
#>   <chr> <date>         <dbl> <dbl> <dbl>
#> 1 1     2021-09-23         0   5.5    54
#> 2 1     2021-11-03        41   5.5    54
#> 3 1     2022-01-19       118   5.5    57
#> 4 1     2022-04-27       216   5.5    55
#> 5 1     2022-07-12       292   6      57
#> 6 1     2022-11-06       409   6      51
head(toydata_relapses)
#> # A tibble: 5 × 3
#>   id    date       visit_day
#>   <chr> <date>         <dbl>
#> 1 2     2021-06-12       198
#> 2 2     2022-10-25       698
#> 3 3     2022-12-01       409
#> 4 6     2022-12-18       426
#> 5 7     2021-09-11       185

Time to first disability worsening event

Given the relative rarity of disability worsening events in MS, it is quite common to use the time to first confirmed disability worsening as an endpoint of interest. This is the case especially in clinical trials, where follow-up periods are often limited to 2-3 years. Survival times may be computed using the following code:

output_edss <- MSprog(toydata_visits,
                 subj_col="id", value_col="EDSS", date_col="date",
                 outcome="edss",
                 event="firstCDW",   # <--- only detect first CDW event
                 relapse=toydata_relapses,
                 verbose=0)

Note in particular that the event argument is set to firstCDW – i.e., only the first confirmed disability worsening (CDW) event is detected. Many other arguments are not explicitly set: for those, the defaults are used (you can type ?MSprog to access the documentation explaining how to specify each argument). Let’s print out a description of the criteria used for the above computation, to make sure all the settings are as expected.

print(output_edss)
#> ---
#> #> msprog version: 1.0.0
#> #> ---
#> #> MSprog() arguments:
#> #> outcome=edss, event=firstCDW, RAW_PIRA=FALSE, baseline=fixed, proceed_from=firstconf,
#> validconf_col=validconf, skip_local_extrema=none, conf_days=84, conf_tol_days=c(7,
#> 730.5), require_sust_days=0, check_intermediate=TRUE, relapse_to_bl=c(30, 0),
#> relapse_to_event=c(0, 0), relapse_to_conf=c(30, 0), relapse_assoc=c(90, 0),
#> relapse_indep=list(prec = list(0, 0), event = list(90, 30), conf = list(90, 30),
#> prec_type = "baseline"), renddate_col=NULL, sub_threshold_rebl=none, bl_geq=FALSE,
#> relapse_rebl=FALSE, impute_last_visit=0, worsening=increase,
#> #> delta_fun=NULL
#> #>
#> #> Textual description of applied criteria:
#> #> We detected the first EDSS CDW event confirmed over 84 days (with a lower tolerance of
#> 7 days and an upper tolerance of 730.5 days). A visit could not be used as confirmation
#> if occurring within 30 days after the onset of a relapse. The baseline was kept fixed at
#> the first visit, unless this occurred within 30 days after the onset of a relapse - in
#> which case the baseline was moved to the next eligible visit.
#> #> ---
#> #> Clinically meaningful threshold for EDSS change (delta function): default for EDSS
#> (1.5 if baseline=0, 1.0 if 0.0<baseline<=5.0, 0.5 if baseline>5.0)
#> 

To access the time-to-event data, we need to extract the results attribute from the function output.

res <- output_edss$results
# print(res, row.names=FALSE)
DT::datatable(res, rownames=F,
              options = list(dom="t", scrollX=T, scrollY="200px", paging = FALSE)
              )

If we were to conduct a survival analysis with censoring, we would be interested in the "time2event" column (time to CDW, in days) and in the "nevent" column (1 if the event occurred, 0 otherwise):

survival_data <- res[c("id", "time2event", "nevent")]
# print(survival_data, row.names=FALSE)
DT::datatable(survival_data, rownames=F,
              options = list(dom="t", scrollX=T, scrollY="200px", paging = FALSE)
              )

Note that, in a more general multiple-event setting, the "nevent" column would contain the count of events for each subject. Here, since we’re stopping at the first event, this number can only be 0 (no event) or 1 (event).

The data is now ready to use for survival analysis. Here’s an example code that one may use to plot Kaplan-Meier curves from survival_data:

# library(dplyr)
# library(ggsurvfit)
survfit2(Surv(time2event, nevent) ~ 1, data=survival_data) %>%
  ggsurvfit() +
  labs(
    x = "time (days)",
    y = "survival probability"
  )

Time to disability milestone

Instead of studying disability worsening with respect to a baseline value, one can focus on the time taken to reach a specific disability milestone. This can be computed using msprog::value_milestone().

The following code detects the time to EDSS \(\geq\) 4.5 for all subjects in our toy data. Similar to MSprog(), we can set verbose=2 to display progress info.

vm <- value_milestone(toydata_visits, milestone=4.5,
                 subj_col="id", value_col="EDSS", date_col="date",
                 outcome="edss", relapse=toydata_relapses,
                 verbose=2)
#> 
#> Subject #1: 8 visits, 0 relapses
#> Found value>=4.5 at visit no.1 (2021-09-23); potential confirmation visits available: no. 3, 4, 5, 6, 7, 8
#> Confirmed value>=4.5 (visit no.1, 2021-09-23): end process
#> 
#> Subject #2: 10 visits, 2 relapses
#> Found value>=4.5 at visit no.3 (2021-03-24); potential confirmation visits available: no. 5, 6, 7, 8, 9, 10
#> Confirmed value>=4.5 (visit no.3, 2021-03-24): end process
#> 
#> Subject #3: 7 visits, 1 relapse
#> Found value>=4.5 at visit no.2 (2021-12-01); potential confirmation visits available: no. 4, 5, 7
#> Value>=4.5 not confirmed: proceed with search
#> No value>=4.5 in any visit: end process
#> 
#> Subject #4: 7 visits, 0 relapses
#> Found value>=4.5 at visit no.1 (2021-09-18); potential confirmation visits available: no. 2, 3, 4, 5, 6, 7
#> Value>=4.5 not confirmed: proceed with search
#> Found value>=4.5 at visit no.4 (2022-07-19); potential confirmation visits available: no. 5, 6, 7
#> Confirmed value>=4.5 (visit no.4, 2022-07-19): end process
#> 
#> Subject #5: 9 visits, 0 relapses
#> Found value>=4.5 at visit no.3 (2021-11-01); potential confirmation visits available: no. 4, 5, 6, 7, 8, 9
#> Confirmed value>=4.5 (visit no.3, 2021-11-01): end process
#> 
#> Subject #6: 7 visits, 1 relapse
#> Found value>=4.5 at visit no.1 (2021-10-18); potential confirmation visits available: no. 3, 4, 5, 7
#> Confirmed value>=4.5 (visit no.1, 2021-10-18): end process
#> 
#> Subject #7: 9 visits, 1 relapse
#> Found value>=4.5 at visit no.3 (2021-09-11); potential confirmation visits available: no. 4, 5, 6, 7, 8, 9
#> Value>=4.5 not confirmed: proceed with search
#> Found value>=4.5 at visit no.5 (2022-03-17); potential confirmation visits available: no. 6, 7, 8, 9
#> Confirmed value>=4.5 (visit no.5, 2022-03-17): end process
#> 
#> ---
#> Outcome: edss
#> Confirmation over: 84 days (-7 days, +730.5 days)
#> Event skipped if: -
#> Confirmation visit skipped if: <30 days from last relapse
#> 
#> ---
#> Total subjects: 7
#> 6 reached the milestone EDSS=4.5.

For example, subject 2 reached EDSS=4.5 at their third visit, on date 2021-03-24:

print(toydata_visits[toydata_visits$id==2, c("date", "EDSS")], row.names=FALSE)
#> # A tibble: 10 × 2
#>    date        EDSS
#>    <date>     <dbl>
#>  1 2020-11-26   4  
#>  2 2020-12-30   4  
#>  3 2021-03-24   4.5
#>  4 2021-06-12   5.5
#>  5 2021-09-04   5  
#>  6 2021-12-02   4.5
#>  7 2022-02-23   4.5
#>  8 2022-05-19   6  
#>  9 2022-08-28   6  
#> 10 2022-11-26   6

Subject 4 had EDSS=4.5 at their first visit, but it was not confirmed; the first confirmed value \(\geq\) 4.5 was found at the fourth visit, on date 2022-07-19:

print(toydata_visits[toydata_visits$id==4, c("date", "EDSS")], row.names=FALSE)
#> # A tibble: 7 × 2
#>   date        EDSS
#>   <date>     <dbl>
#> 1 2021-09-18   4.5
#> 2 2021-12-04   3.5
#> 3 2022-03-12   3.5
#> 4 2022-07-19   5  
#> 5 2022-10-05   5  
#> 6 2023-01-16   5.5
#> 7 2023-04-27   5

The function returned the following data frame:

# print(vm)
DT::datatable(vm, 
              options = list(dom="t", scrollX=T, scrollY="200px", paging = FALSE)
              )

where: "date" contains the date of the first confirmed EDSS \(\geq\) 4.5 (or last date of follow-up if milestone was not reached or not confirmed); "EDSS" contains the first EDSS value \(\geq\) 4.5, if present, otherwise no value; "time2event" contains the time to reach EDSS \(\geq\) 4.5 (or total follow-up length if not reached or not confirmed); "observed" indicates whether EDSS \(\geq\) 4.5 was reached (1) or not (0).

Please refer to the documentation (by typing ?value_milestone) for a complete illustration of each of the function arguments and their default values.

As before, for survival analysis we would need the "time2event" column (time to reach EDSS \(\geq\) 4.5, in days) and the "observed" column (1 if the milestone was reached, 0 otherwise).

Multiple events

In studies with longer follow-ups, it can be informative to analyse recurrent events with consecutive survival times.

The following code extracts multiple EDSS events from our toy data with a roving baseline scheme (i.e., the baseline is updated after each event to the first available confirmation visit).

output <- MSprog(toydata_visits, "id", "EDSS", "date", "edss", relapse=toydata_relapses,
                event="multiple", baseline="roving",  # <--- detect multiple events with a roving baseline
                verbose=0)
survival_data <- output$results[c("id", "nevent", "time2event")]
# print(survival_data, row.names=FALSE)
DT::datatable(survival_data, rownames=F,
              options = list(dom="t", scrollX=T, scrollY="200px", paging = FALSE)
              )

Note that the event argument is set to multiple, and that the results can contain more than one row for the same subject, with the "nevent" column keeping track of cumulative event count.