This vignette demonstrates how to use the msprog package
to analyse the disability course in multiple sclerosis (MS) in a
time-to-event setting (e.g., to compute survival times for survival
analysis).
For a more general introduction on msprog package usage
for disability course assessment, please refer to the vignette
Analysing disability course in MS.
We shall use the toy datasets toydata_visits and
toydata_relapses provided in msprog.
library(msprog)
head(toydata_visits)
#> # A tibble: 6 × 5
#> id date visit_day EDSS SDMT
#> <chr> <date> <dbl> <dbl> <dbl>
#> 1 1 2021-09-23 0 5.5 54
#> 2 1 2021-11-03 41 5.5 54
#> 3 1 2022-01-19 118 5.5 57
#> 4 1 2022-04-27 216 5.5 55
#> 5 1 2022-07-12 292 6 57
#> 6 1 2022-11-06 409 6 51head(toydata_relapses)
#> # A tibble: 5 × 3
#> id date visit_day
#> <chr> <date> <dbl>
#> 1 2 2021-06-12 198
#> 2 2 2022-10-25 698
#> 3 3 2022-12-01 409
#> 4 6 2022-12-18 426
#> 5 7 2021-09-11 185Given the relative rarity of disability worsening events in MS, it is quite common to use the time to first confirmed disability worsening as an endpoint of interest. This is the case especially in clinical trials, where follow-up periods are often limited to 2-3 years. Survival times may be computed using the following code:
output_edss <- MSprog(toydata_visits,
subj_col="id", value_col="EDSS", date_col="date",
outcome="edss",
event="firstCDW", # <--- only detect first CDW event
relapse=toydata_relapses,
verbose=0)Note in particular that the event argument is set to
firstCDW – i.e., only the first confirmed disability
worsening (CDW) event is detected. Many other arguments are not
explicitly set: for those, the defaults are used (you can type
?MSprog to access the documentation explaining how to
specify each argument). Let’s print out a description of the criteria
used for the above computation, to make sure all the settings are as
expected.
print(output_edss)
#> ---
#> #> msprog version: 1.0.0
#> #> ---
#> #> MSprog() arguments:
#> #> outcome=edss, event=firstCDW, RAW_PIRA=FALSE, baseline=fixed, proceed_from=firstconf,
#> validconf_col=validconf, skip_local_extrema=none, conf_days=84, conf_tol_days=c(7,
#> 730.5), require_sust_days=0, check_intermediate=TRUE, relapse_to_bl=c(30, 0),
#> relapse_to_event=c(0, 0), relapse_to_conf=c(30, 0), relapse_assoc=c(90, 0),
#> relapse_indep=list(prec = list(0, 0), event = list(90, 30), conf = list(90, 30),
#> prec_type = "baseline"), renddate_col=NULL, sub_threshold_rebl=none, bl_geq=FALSE,
#> relapse_rebl=FALSE, impute_last_visit=0, worsening=increase,
#> #> delta_fun=NULL
#> #>
#> #> Textual description of applied criteria:
#> #> We detected the first EDSS CDW event confirmed over 84 days (with a lower tolerance of
#> 7 days and an upper tolerance of 730.5 days). A visit could not be used as confirmation
#> if occurring within 30 days after the onset of a relapse. The baseline was kept fixed at
#> the first visit, unless this occurred within 30 days after the onset of a relapse - in
#> which case the baseline was moved to the next eligible visit.
#> #> ---
#> #> Clinically meaningful threshold for EDSS change (delta function): default for EDSS
#> (1.5 if baseline=0, 1.0 if 0.0<baseline<=5.0, 0.5 if baseline>5.0)
#> To access the time-to-event data, we need to extract the
results attribute from the function output.
res <- output_edss$results
# print(res, row.names=FALSE)
DT::datatable(res, rownames=F,
options = list(dom="t", scrollX=T, scrollY="200px", paging = FALSE)
)If we were to conduct a survival analysis with censoring, we would be
interested in the "time2event" column (time to CDW, in
days) and in the "nevent" column (1 if the event occurred,
0 otherwise):
survival_data <- res[c("id", "time2event", "nevent")]
# print(survival_data, row.names=FALSE)
DT::datatable(survival_data, rownames=F,
options = list(dom="t", scrollX=T, scrollY="200px", paging = FALSE)
)Note that, in a more general multiple-event setting, the
"nevent" column would contain the count of events for each
subject. Here, since we’re stopping at the first event, this number can
only be 0 (no event) or 1 (event).
The data is now ready to use for survival analysis. Here’s an example
code that one may use to plot Kaplan-Meier curves from
survival_data:
Instead of studying disability worsening with respect to a baseline
value, one can focus on the time taken to reach a specific disability
milestone. This can be computed using
msprog::value_milestone().
The following code detects the time to EDSS \(\geq\) 4.5 for all subjects in our toy
data. Similar to MSprog(), we can set
verbose=2 to display progress info.
vm <- value_milestone(toydata_visits, milestone=4.5,
subj_col="id", value_col="EDSS", date_col="date",
outcome="edss", relapse=toydata_relapses,
verbose=2)
#>
#> Subject #1: 8 visits, 0 relapses
#> Found value>=4.5 at visit no.1 (2021-09-23); potential confirmation visits available: no. 3, 4, 5, 6, 7, 8
#> Confirmed value>=4.5 (visit no.1, 2021-09-23): end process
#>
#> Subject #2: 10 visits, 2 relapses
#> Found value>=4.5 at visit no.3 (2021-03-24); potential confirmation visits available: no. 5, 6, 7, 8, 9, 10
#> Confirmed value>=4.5 (visit no.3, 2021-03-24): end process
#>
#> Subject #3: 7 visits, 1 relapse
#> Found value>=4.5 at visit no.2 (2021-12-01); potential confirmation visits available: no. 4, 5, 7
#> Value>=4.5 not confirmed: proceed with search
#> No value>=4.5 in any visit: end process
#>
#> Subject #4: 7 visits, 0 relapses
#> Found value>=4.5 at visit no.1 (2021-09-18); potential confirmation visits available: no. 2, 3, 4, 5, 6, 7
#> Value>=4.5 not confirmed: proceed with search
#> Found value>=4.5 at visit no.4 (2022-07-19); potential confirmation visits available: no. 5, 6, 7
#> Confirmed value>=4.5 (visit no.4, 2022-07-19): end process
#>
#> Subject #5: 9 visits, 0 relapses
#> Found value>=4.5 at visit no.3 (2021-11-01); potential confirmation visits available: no. 4, 5, 6, 7, 8, 9
#> Confirmed value>=4.5 (visit no.3, 2021-11-01): end process
#>
#> Subject #6: 7 visits, 1 relapse
#> Found value>=4.5 at visit no.1 (2021-10-18); potential confirmation visits available: no. 3, 4, 5, 7
#> Confirmed value>=4.5 (visit no.1, 2021-10-18): end process
#>
#> Subject #7: 9 visits, 1 relapse
#> Found value>=4.5 at visit no.3 (2021-09-11); potential confirmation visits available: no. 4, 5, 6, 7, 8, 9
#> Value>=4.5 not confirmed: proceed with search
#> Found value>=4.5 at visit no.5 (2022-03-17); potential confirmation visits available: no. 6, 7, 8, 9
#> Confirmed value>=4.5 (visit no.5, 2022-03-17): end process
#>
#> ---
#> Outcome: edss
#> Confirmation over: 84 days (-7 days, +730.5 days)
#> Event skipped if: -
#> Confirmation visit skipped if: <30 days from last relapse
#>
#> ---
#> Total subjects: 7
#> 6 reached the milestone EDSS=4.5.For example, subject 2 reached EDSS=4.5 at their third
visit, on date 2021-03-24:
print(toydata_visits[toydata_visits$id==2, c("date", "EDSS")], row.names=FALSE)
#> # A tibble: 10 × 2
#> date EDSS
#> <date> <dbl>
#> 1 2020-11-26 4
#> 2 2020-12-30 4
#> 3 2021-03-24 4.5
#> 4 2021-06-12 5.5
#> 5 2021-09-04 5
#> 6 2021-12-02 4.5
#> 7 2022-02-23 4.5
#> 8 2022-05-19 6
#> 9 2022-08-28 6
#> 10 2022-11-26 6Subject 4 had EDSS=4.5 at their first visit, but it was
not confirmed; the first confirmed value \(\geq\) 4.5 was found at the fourth visit,
on date 2022-07-19:
print(toydata_visits[toydata_visits$id==4, c("date", "EDSS")], row.names=FALSE)
#> # A tibble: 7 × 2
#> date EDSS
#> <date> <dbl>
#> 1 2021-09-18 4.5
#> 2 2021-12-04 3.5
#> 3 2022-03-12 3.5
#> 4 2022-07-19 5
#> 5 2022-10-05 5
#> 6 2023-01-16 5.5
#> 7 2023-04-27 5The function returned the following data frame:
where: "date" contains the date of the first confirmed
EDSS \(\geq\) 4.5 (or last date of
follow-up if milestone was not reached or not confirmed);
"EDSS" contains the first EDSS value \(\geq\) 4.5, if present, otherwise no value;
"time2event" contains the time to reach EDSS \(\geq\) 4.5 (or total follow-up length if
not reached or not confirmed); "observed" indicates whether
EDSS \(\geq\) 4.5 was reached (1) or
not (0).
Please refer to the documentation (by typing
?value_milestone) for a complete illustration of each of
the function arguments and their default values.
As before, for survival analysis we would need the
"time2event" column (time to reach EDSS \(\geq\) 4.5, in days) and the
"observed" column (1 if the milestone was reached, 0
otherwise).
In studies with longer follow-ups, it can be informative to analyse recurrent events with consecutive survival times.
The following code extracts multiple EDSS events from our toy data with a roving baseline scheme (i.e., the baseline is updated after each event to the first available confirmation visit).
output <- MSprog(toydata_visits, "id", "EDSS", "date", "edss", relapse=toydata_relapses,
event="multiple", baseline="roving", # <--- detect multiple events with a roving baseline
verbose=0)
survival_data <- output$results[c("id", "nevent", "time2event")]
# print(survival_data, row.names=FALSE)
DT::datatable(survival_data, rownames=F,
options = list(dom="t", scrollX=T, scrollY="200px", paging = FALSE)
)Note that the event argument is set to
multiple, and that the results can contain more than one
row for the same subject, with the "nevent" column keeping
track of cumulative event count.