---
title: "Analysing disability course in MS"
author: "Noemi Montobbio"
date: "`r Sys.Date()`"
output:
rmarkdown::html_vignette:
toc: true
number_sections: true
# pdf_document:
# highlight: null
bibliography: MSbiblio.bib
csl: aaps-pharmscitech.csl
link-citations: true
vignette: >
%\VignetteIndexEntry{Analysing disability course in MS}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(
collapse=TRUE,
comment="#>"
)
```
```{r wrap-hook, include=FALSE}
library(knitr)
hook_output = knit_hooks$get("output")
knit_hooks$set(output = function(x, options) {
# this hook is used only when the linewidth option is not NULL
if (!is.null(n <- options$linewidth)) {
x = xfun::split_lines(x) #substring(x, 4, nchar(x)))
# any lines wider than n should be wrapped
if (any(nchar(x) > n)) x = strwrap(x, width = n)
x = paste0(paste(x, collapse = "\n#> "), "\n")
}
hook_output(x, options)
})
```
This vignette illustrates how to use the `msprog` package to analyse the evolution of disability in multiple sclerosis (MS) based on repeated assessments through time of an outcome measure (EDSS, NHPT, T25FW, SDMT; or any custom outcome measure). We'll start by illustrating the type of [input data](#input) needed, and by giving a [minimal working example](#example) to introduce the main function and facilitate the reading of the document. We'll then move on to a more detailed description of the different parameter configurations.
```{r}
library(msprog)
```
# Input data {#input}
The data must be organised in a data frame object containing (at least) the following columns:
* Subject IDs
* Visit dates
* Outcome values.
The visits should be listed in chronological order (if they are not, `msprog::MSprog()` will sort them automatically before analysing them).
The `msprog` package provides a toy dataset `toydata_visits` with artificially generated EDSS and SDMT assessments for a small cohort of patients.
```{r}
head(toydata_visits)
```
For relapsing MS patients, an additional data frame object with the onset dates of relapses is needed to correctly assess disease course (and possibly to characterise worsening events as relapse-associated or relapse-independent). The dataset should contain (at least) the following columns:
* Subject IDs
* Relapse onset dates.
The package provides a toy dataset with artificially generated relapse onset dates associated with the patients in `toydata_visits`:
```{r}
head(toydata_relapses)
```
Dates are interpreted as calendar days (e.g., YYYY-mm-dd) by default, but can optionally be provided as number of days "from start" (starting point can be different across subjects -- e.g., days from randomisation in a clinical trial): see `"visit_day"` column in the toy datasets, and [SDMT example below](#example). Disability assessment dates and relapse dates must be given in the same format.
# Minimal example {#example}
The core block of `msprog` is the `MSprog()` function. Given data on visits and relapses[^rnames] in the form specified above, `MSprog()` detects the confirmed disability worsening (CDW) or improvement (CDI) events for each subject for the outcome of interest. If specified, CDW events can be further [classified](#rawpira) based on their timing with respect to relapses and identified as relapse-associated worsening ([RAW](#raw)) or progression independent of relapse activity ([PIRA](#pira), see [@lublin2014; @kappos2018; @silent2019]).
[^rnames]: If the names of columns with subject ID and date in the relapse database are different from the main database, they must be specified using arguments `rsubj_col` and `rdate_col`.
By default, `MSprog()` only detects the first CDW. We can test it on the EDSS toy data...
```{r}
output_edss <- MSprog(data=toydata_visits, # data on visits
subj_col="id", value_col="EDSS", date_col="date", # specify column names
outcome="edss", # specify outcome type
relapse=toydata_relapses) # data on relapses
```
...or on the SDMT toy data:
```{r}
output_sdmt <- MSprog(data=toydata_visits, # data on visits
subj_col="id", value_col="SDMT", date_col="visit_day", # specify column names
outcome="sdmt", # specify outcome type
relapse=toydata_relapses, # data on relapses
date_format="day") # specify that dates are given as days
```
Note that, in the SDMT example, `"visit_day"` is used as date column. For the `MSprog()` function to interpret it correctly, we need to specify `date_format="day"` (note that this also applies to relapse dates).
The function prints out concise info (the `verbose` argument can be used to control the amount of info printed out -- see [the relevant section](#progressinfo)).
Full tabulation of the results can be accessed via the following attributes of the function output.
1. `results`: detailed info on each event for all subjects. In our example:
```{r, indent=" "}
# print(output_edss$results, row.names=FALSE)
DT::datatable(output_edss$results, rownames=F,
options = list(dom="t", scrollX=T, scrollY=F, paging = FALSE)
)
```
where: `nevent` is the cumulative event count for each subject; `event_type` characterises the event; `total_fu` is the number of days from start to end of follow-up; `time2event` is the number of days from start of follow-up to event[^timeto]; `bl2event` is the number of days from current baseline to event; `sust_days` is the number of days for which the event was sustained; `sust_last` reports whether the event was sustained until the last visit.
2. `event_count`: a data frame summarising event counts for each subject. Here's what it looks like for the EDSS-based computation above:
```{r, indent=" "}
# print(output_edss$event_count)
DT::datatable(output_edss$event_count,
options = list(dom="t", scrollX=T, scrollY=F, paging = FALSE)
)
```
> **Note.** In this example, we called `MSprog()` without specifying the `event` argument - which defaults to `firstCDW` (only detect the first CDW event). For this reason,
>
> 1. In `output_edss$results`, `nevent` can only be 0 or 1.
> 2. In `output_edss$event_count`, CDW count can only be 0 or 1, and CDI column is omitted.
>
> In a multiple-event setting (`event="multiple"`), more than one event per subject can be detected, and `event_count` also includes the event sequence for each subject. See [the relevant section](#multiple).
Additionally, by applying the `print` method to the `MSprog()` output, the user can print out the full list of function arguments, as well as a short paragraph describing the complete set of applied criteria, to be reported to ensure complete reproducibility. In our example:
```{r, linewidth=90}
print(output_edss)
```
[^timeto]: For subjects with no confirmed events, `time2event` is the total follow-up length. To omit these subjects, set `include_stable=FALSE` in `MSprog()`.
Several qualitative and quantitative options for analysing disability evolution are given as optional arguments of `MSprog()` that can be set by the user. In order to ensure reproducibility, the results should always be complemented by the set of criteria used to obtain them (e.g., using the `print` method mentioned above). In the following sections we will go into more detail about usage and best practices for the different options. Please refer to the documentation (by typing `?MSprog`) for a complete illustration of each of the function arguments and their default values.
# Outcome {#outcome}
In `MSprog()`, outcome type is specified via the mandatory argument `outcome`. This triggers:
* Value range checks (e.g., EDSS in the 0-10 range)
* Direction of worsening (e.g., for EDSS, "higher values = worse")
* Outcome-specific **"minimum clinically relevant change"**
Outcome values are scanned in chronological order, and tested for their difference from the current reference value. Such difference is typically required to be larger than a clinically meaningful threshold $\delta$, depending on the reference value $x$. Default settings are implemented in `msprog` for the most used disability scales (see [@lorscheider2016; @bosma2010; @kalinowski2022; @strober2019]):
* `outcome="edss"`: Expanded Disability Status Scale (EDSS). $\delta(x)=\begin{cases} 1.5 \quad \text{ if } x=0\\1 \quad\;\;\; \text{ if } 0 < x \leq 5\\0.5 \quad \text{ if } x>5\end{cases}$;
* `outcome="nhpt"`: Nine-Hole Peg Test (NHPT), for either the dominant or the non-dominant hand. $\delta=$ 20% of reference score;
* `outcome="t25fw"`: Timed 25-Foot Walk (T25FW). $\delta=$ 20% of the reference score;
* `outcome="sdmt"`: Symbol Digit Modalities Test (SDMT). $\delta=$ either 4 points or 20% of the reference score.
These default options are internally implemented by the `compute_delta()` function. For example, if the baseline EDSS score is 4, by default a clinically meaningful EDSS worsening will correspond to an increase by (at least):
```{r}
print(compute_delta(4, outcome="edss"))
```
If the baseline T25FW score is 10, the minimum clinically relevant change will be:
```{r}
print(compute_delta(10, outcome="t25fw"))
```
## Customising "clinically relevant change"
The user can provide their own function to customise the computation of clinically relevant thresholds, using the `delta_fun` argument of `MSprog()`. The user-defined function must take the baseline value as argument, and return the corresponding threshold. This applies to two scenarios.
1. **The outcome of interest is among the ones listed above, but we want to define the thresholds differently.** For example, we want to change the minimum $\delta$ for SDMT to, say, "either 3 points or 10% of the reference value". In this case, we would define:
```{r}
my_sdmt_delta <- function(x) {min(c(x/10, 3))}
```
Let's compute the minimum clinically relevant change in SDMT score starting from a baseline of 50 using our custom function, and compare it with the default function:
```{r}
print(my_sdmt_delta(50)) # my delta
print(compute_delta(50, outcome="sdmt")) # default delta
```
We can then compute:
``` {r, eval=FALSE}
output <- MSprog(...
outcome="sdmt", delta_fun=my_sdmt_delta,
...)
```
Specifying `outcome="sdmt"`, the outcomes values will be checked internally to verify that they are in the correct range for SDMT.
2. **The outcome of interest is not among the ones listed above.** In this case, we set `outcome="custom"`, and `delta_fun` as our desired custom-defined function. For custom outcomes, the direction of worsening must be specified as well (i.e., whether increasing or decreasing values of the outcome are interpreted as a worsening). This can be done by setting the `worsening` argument to either `"increase"` or `"decrease"`. Note that the provided `worsening` argument is only used by `MSprog()` when `outcome` is set to `"custom"`. Otherwise, `worsening` is automatically set to `"increase"` if `outcome` is set to `"edss"`, `"nhpt"`, `"t25fw"`, and to `"decrease"` if `outcome` is set to `"sdmt"`.
# Baseline scheme
The assessment of the disability evolution strongly depends on the choices made in defining the starting point, i.e., the baseline. Different behaviours are appropriate in different contexts. This aspect is controlled by the `baseline` argument in `MSprog()`. The following alternative baseline schemes can be adopted.
* *Fixed baseline* (`baseline="fixed"`, default): the baseline visit is set to be the first available assessment.
* *Roving baseline* (`baseline="roving"`): the baseline visit is initially set as the first available assessment, then updated after each event. This scheme is recommended in a **"multiple events" setting** [@kappos2018] (see [example below](#multiple)), but **not recommended for clinical trial data** as it may break the randomisation.
* *Improvement-based roving baseline* (`baseline="roving_impr"`): the baseline visit is initially set as the first available assessment, then updated after each confirmed improvement. This scheme is suitable for a **first-CDW setting** to discard fluctuations around the baseline [@muller2023], but **not recommended for clinical trial data** as it may break the randomisation.
* *Worsening-based roving baseline* (`baseline="roving_wors"`): the baseline visit is initially set as the first available assessment, then updated after each confirmed worsening. This scheme is suitable when the endpoint of interest is **a [specific type](#input) of CDW**. For instance, when searching specifically for PIRA events, the reference should be moved after any previous RAW event; when searching specifically for RAW events, the reference should be moved after any previous PIRA event.
> **Note.** If the baseline data is stored in a separate file, the user should merge it with the main data frame containing longitudinal visit data. This can be done by inserting the baseline date and outcome value for each subject at the beginning of the data frame (not necessarily next to the other visits from the same subject -- they will be grouped by subject ID).
## Additional options
* For roving baseline schemes, the re-baseline procedure can be made finer through the `sub_threshold_rebl` argument in `MSprog()`: setting `sub_threshold_rebl="change"`, for instance, rebaseline can be triggered by *any* confirmed change, even if below [the clinically meaningful threshold](#outcome). In this case, the sub-threshold improvement events are used to move the baseline, but not considered in event count.
* When a roving baseline is used, the reference is moved after a previously detected event. The new reference may be set at the date of the previous event (`proceed_from="event"`), or at the first available confirmation visit for such event (`proceed_from="firstconf"`, default).
* If the baseline visit occurs in the vicinity of a clinical relapse, the reference value may be overestimated. The `relapse_to_bl` argument allows to specify the minimum accepted distance (in days) of the baseline visit from the onset of a previous relapse. For instance, if `baseline="fixed"` and `relapse_to_bl=30`, the baseline visit is chosen as the first available visit satisfying the requirement of "no relapses within the preceding 30 days". If two values are provided, they are interpreted as intervals before *and after* the event -- e.g., `relapse_to_bl=c(30, 2)` implements the constraint of "no relapses within the preceding 30 days or within the following 2 days". If relapse **end dates** are available (may be provided as an additional column in the relapse file whose name is specified by argument `renddate_col` in `MSprog()`), the first value of `relapse_to_bl` is overwritten by the relapse duration, unless it was set to 0 (in which case it stays 0 and the baseline is **not** moved based on the influence of previous relapses). If the `baseline` argument is set to a roving scheme, all the above applies to every baseline change as well.
* On top of the chosen baseline scheme, *post-relapse re-baseline* @kappos2020 can be applied by setting `relapse_rebl=TRUE` in `MSprog()`. If this is enabled, the onset of each relapse prompts a re-baseline to the next available visit following the relapse and out of its influence (as per `relapse_to_bl`). As an additional constraint, the updated baseline can be forced to have a score greater than or equal to the previous baseline by setting `bl_geq=TRUE` in `MSprog()`.
* Due to fluctuations in the data, the new reference value following a re-baseline may fall at a local minimum or maximum of the disability trajectory. These may be excluded for consideration as baseline visits using the `skip_local_extrema` argument in `MSprog()`.
As already mentioned above, extra caution should be used when applying any re-baseline scheme to randomised data, as moving the reference value based on post-randomisation events can introduce bias (especially if the occurrence of these events is influenced by the treatment). **For clinical trial data, general recommendation is to use a fixed baseline (at the time of randomisation).**
## Multiple events {#multiple}
The `event` argument allows to specify which events to detect. By default, it is set to `"firstCDW"` (only detect the first CDW event). It can be set to `"multiple"` to sequentially detect all CDW or CDI events.
For example, extracting multiple EDSS events for subject `4` from `toydata_visits` with a **fixed** baseline would result in the following.
```{r}
print(toydata_visits[toydata_visits$id==4, c("date", "EDSS")]) # EDSS visits
output <- MSprog(data=toydata_visits,
subj_col="id", value_col="EDSS", date_col="date", outcome="edss",
subjects=4,
relapse=toydata_relapses,
event="multiple", baseline="fixed", # <---
verbose=0)
# print(output$results, row.names=FALSE) # results
DT::datatable(output$results, rownames=F,
options = list(dom="t", scrollX=T, scrollY=F, paging = FALSE)
)
```
With these settings, the EDSS improvement at visit 2 (EDSS=3.5, confirmed at visit 3) does not trigger a re-baseline. The algorithm keeps searching for events from after the confirmation visit, evaluating the changes **relative to the original baseline** (EDSS=4.5). The subsequent EDSS worsening is therefore not detected. On the other hand, adopting a **roving** baseline scheme we get:
```{r}
output <- MSprog(data=toydata_visits,
subj_col="id", value_col="EDSS", date_col="date", outcome="edss",
subjects=4,
relapse=toydata_relapses,
event="multiple", baseline="roving", # <---
verbose=0)
# print(output$results, row.names=FALSE) # results
DT::datatable(output$results, rownames=F,
options = list(dom="t", scrollX=T, scrollY=F, paging = FALSE)
)
```
The baseline has now been moved to the confirmation visit of the first CDI event (visit 3, EDSS=3.5). With respect to this baseline, a subsequent CDW event is detected at visit 4 (EDSS=5), confirmed at visit 5.
Other valid values for the `event` argument are:
* `"first"`: only the very first confirmed event -- CDI or CDW
* `"firstCDI"`: first CDI event
* `"firstPIRA"`: first PIRA event
* `"firstRAW"`: first RAW event.
> **Note.** if, for example, `"event=firstPIRA"`, all non-PIRA events *preceding* the first PIRA are actually detected -- although not reported. This allows to combine this scenario with a roving baseline scheme, where the baseline is moved after every confirmed non-PIRA event. For instance, if a RAW event occurs first, the subsequent worsening would be counted from the re-baselined outcome value after the RAW.
# Event confirmation
An event is only validated if the change in the outcome value from the current baseline is maintained **up to**[^upto] a subsequent *confirmation visit* at a pre-specified distance from the event [@ontaneda2017]. The event is confirmed if the difference in the outcome value from the baseline score remains above-[threshold](#outcome) at all assessments up to the confirmation visit. For example, with the default EDSS thresholds, an increase in EDSS from 4 points to 6 points is confirmed if EDSS=5 at all subsequent visits; it is not confirmed if EDSS=4.5 at all subsequent visits.
[^upto]: The value change from baseline must also be maintained at all intermediate visits between the event and the confirmation visit. This behaviour may be changed by setting the `check_intermediate` argument to `FALSE` (in this case, the change only needs to be confirmed **at** the designated confirmation visit). We do not recommend this, as it may lead to including random fluctuations as events. We included this option to provide the possibility of replicating the results from previous studies that used this rationale.
The chosen confirmation period depends on the type of study and on the frequency of visits, and can be set in `MSprog()` by using the `conf_days` argument. A tolerance interval around `conf_days` can be specified using the `conf_tol_days` argument, given as a sequence of two integers (lower and upper tolerance)[^lrtol]. Setting the right end of the interval to `Inf` allows event confirmation to occur at any visit *after* `conf_days`. Let's see two examples.
i. A common setting for **clinical trial data** would be: `conf_days=7*12`, `conf_tol_days=c(0, Inf)`,
i.e., "confirmation over 12 or more weeks".
ii. A common setting for **observational data** would be: `conf_days=7*24`, `conf_tol_days=c(0,7*12)`,
i.e., the confirmation visit must lie between 24 weeks after the event, and 36 weeks after the event.
[^lrtol]: `conf_tol_days` can also be specified as a single integer if symmetric lower and upper tolerance is desired.
> **Note.** "Confirmed over 12 *or more* weeks" (i.e., no upper bound) $\neq$ "sustained over the whole follow up".
Setting no upper bound to the confirmation interval means that **the first visit *at least* 12 weeks after the initial change is selected as confirmation visit**. The confirmation window is then defined as the interval between the initial worsening and that confirmation visit, and scores are checked therein.
## Additional options
* If `conf_days` is specified as a sequence of values (e.g., `conf_days=c(7*12,7*24)`), events are retained if confirmed by *at least one visit* falling within any of the specified periods (here, 12 **or** 24 weeks with their relative tolerance interval)[^1224]. Confirmation dates/values at *all* specified values will be included in the results table (if `include_dates=T` / `include_values=T` in `MSprog()`).
[^1224]: An event is only confirmed if the value change from baseline is maintained **at all visits up to the confirmation visit**. So an event can only be "confirmed over 24 weeks" and *not* "confirmed over 12 weeks" if there are no valid confirmation visits falling within the 12-week window (unless `check_intermediate=FALSE`).
* If the confirmation visit occurs in the vicinity of a clinical relapse, it is typically considered invalid. The `relapse_to_conf` argument allows to specify the minimum accepted distance (in days) of a confirmation visit from the onset of a previous relapse. For example, `relapse_to_conf=30` implements the constraint "no relapses within the preceding 30 days" for a visit to be used as confirmation. If two values are provided, they are interpreted as intervals before *and after* the event -- e.g., `relapse_to_conf=c(30, 2)` implements the constraint "no relapses within the preceding 30 days or within the following 2 days". If relapse **end dates** are available (may be provided as an additional column in the relapse file whose name is specified by argument `renddate_col` in `MSprog()`), the first value of `relapse_to_conf` is overwritten by the relapse duration, unless it was set to 0 (in which case it stays 0 and confirmation visits are **not** discarded based on the influence of previous relapses).
* By default, a disability worsening occurring at the last available assessment is ignored by `MSprog()`. This behaviour can be changed using the `impute_last_visit` argument. If set to `TRUE`, any disability worsening occurring at the last visit is retained even though unconfirmed. If set to a numeric value N, unconfirmed worsening events are included only if occurring within N days of follow up (e.g., in case of early discontinuation).
* In scenarios (e.g., a clinical trial) where disability course is compared between two cohorts with different relapse rates, there may be an imbalance in visit frequency between the two groups, due to unscheduled visits after relapses. This implies a higher probability of detecting outcome changes in the group with more assessments. Such issue is sometimes addressed by only using scheduled visits as confirmation visits [@hauser2017]. It is possible to implement this with `MSprog()` by including an additional column in the longitudinal data, specifying which visits can (`TRUE`) or cannot (`FALSE`) be used as confirmation visits. The name of such column must be specified using the `validconf_col` argument.
```{r echo=FALSE, out.width="40%", fig.cap = "*Figure 2. Example of data with unscheduled visits.*"}
knitr::include_graphics(paste0(getwd(), "unscheduled.png"))
```
In the example depicted in Figure 2, unscheduled visits can be excluded from consideration as confirmation visits by setting:
``` {r, eval=FALSE}
output <- MSprog(...
validconf_col="scheduled",
...)
```
## Sustained CDW or CDI {#sustained}
In addition to event confirmation, some studies require events to be *sustained* for a specified period of time. The `require_sust_days` argument allows to specify, if desired, the length of such period. For example, if `require_sust_days=7*48`, an event is only retained if the change in the outcome measure from the current baseline is confirmed at *all* visits falling within the following 48 weeks. If the event is sustained for the entire follow-up, it is retained even if the follow-up period ends <48 weeks after the event. Setting `require_sust_days=Inf`, events are retained only when sustained for the entire follow-up duration.
The `require_sust_days` argument may be of use in the following scenarios.
* Suppose the maximum follow-up in the population is 96 weeks excluding the baseline visit. If one wants to detect CDWs sustained over the whole follow-up, coding it as "worsening confirmed over 96 weeks or more" (`conf_days=7*96`, `conf_tol_days=c(0, Inf)`) would discard CDWs of patients whose follow-up is shorter than 96 weeks. In a scenario where patients have different follow-up lengths, the `require_sust_days` argument offers a more reliable way of implementing this, e.g., as "worsening confirmed over 12 weeks or more and sustained for the whole follow-up period" (`conf_days=7*12`, `conf_tol_days=c(0, Inf)`, `require_sust_days=Inf`).
* In observational data with coarsely- and unevenly-spaced visits, the `require_sust_days` argument may be used to detect sustained CDW (e.g., `require_sust_days=7*96`) while still requiring the presence of a confirmation visit within a specified interval from the event (e.g., `conf_days=7*12`, `conf_tol_days=c(0,7*12)`).
* Sustained disability worsening can be required to exclude transient accumulation of disability (e.g., as a consequence of relapses) followed by complete recovery.
# Relapse-based classification of CDW events {#rawpira}
Detected CDW events may be further categorised based on their timing with respect to relapses[^reldata] and identified as RAW or PIRA. This has to be explicitly enabled by setting the flag `RAW_PIRA=TRUE` in `MSprog()` (unless the endpoint of interest is already `firstPIRA` or `firstRAW`). On top of that, the definitions used to label a CDW as RAW or as PIRA are controlled by arguments `relapse_assoc` and `relapse_indep`, as detailed below.
[^reldata]: Relapse data is to be provided using the optional `relapse` argument in `MSprog()`. It should be given as a data frame containing subject IDs and relapse onset dates (see ["Input data"](#input)). If the names of columns with subject IDs and dates in the relapse database are different from the main database, they must be specified using arguments `rsubj_col` and `rdate_col`.
## Relapse-associated worsening (RAW) {#raw}
RAW events are typically defined as CDW events occurring within a specified interval from the onset of a relapse. The length (in days) of such interval can be specified using the `relapse_assoc` argument in `MSprog()`. Common values are 30 or 90 days. Additionally, one may also provide a maximum distance from relapses whose onset is *after* the event. The logic is that the reported onset date of a relapse may be slightly delayed. The examples below illustrate the usage of the `relapse_assoc` argument:
* `relapse_assoc=90` (equivalent to `relapse_assoc=c(90, 0)`): RAW = "a CDW event occurring within 90 days after the onset of a relapse"
* `relapse_assoc=c(90, 7)`: RAW = "a CDW event occurring within 90 days after or within 7 days before the onset of a relapse".
## Progression independent of relapse activity (PIRA) {#pira}
In the literature, PIRA is generally defined as CDW occurring in the absence of relapses within predefined intervals.
In most cases (see, e.g., [@kappos2020; @cagol2022; @muller2023]), relapse-free intervals are anchored to (possibly a subset of):
* a visit preceding the event (e.g., baseline)
* the initial worsening
* an eligible confirmation visit.
The `relapse_indep` argument in `MSprog()` allows to specify custom relapse-free intervals. The auxiliary function `relapse_indep_from_bounds()` organises the given interval bounds into a named list to be given to `MSprog()` through the `relapse_indep` argument.
``` {r, eval=FALSE}
output <- MSprog(...
relapse_indep=relapse_indep_from_bounds(p0, p1, e0, e1, c0, c1),
...)
```
where: `p0` and `p1` specify the interval around a visit preceding the event (can be the current reference, the last visit before event onset, or the last visit before event onset with a clinically meaningful score difference from it [^pira]); `e0` and `e1` specify the interval around the event; `c0` and `c1` specify the interval around the confirmation visit; see Figure 1. If both ends of an interval are 0 (e.g., if both `p0=0` and `p1=0`), the checkpoint is ignored. To merge two intervals together, set both the right end of the first interval and the left end of the second interval to `NULL` (e.g., "between baseline and event onset": `p1=NULL` and `e0=NULL`).
```{r echo=FALSE, out.width="50%", fig.cap = "*Figure 1. Relapse-free intervals characterising PIRA, as defined by arguments `p0`, `p1`, `e0`, `e1`, `c0`, `c1`.*"}
knitr::include_graphics(paste0(getwd(), "relapse_indep_def.png"))
```
For example, in @muller2023, the authors recommend an absence of relapses during the 90 days before and 30 days after the event, and during the 90 days before and 30 days after confirmation for a CDW event to be considered as PIRA. This translates into: `p0<-0`, `p1<-0`, `e0<-90`, `e1<-30`, `c0<-90`, `c1<-30`. When a high specificity is desired, they recommend an absence of relapses in the whole period between reference and confirmation, which is implemented by setting: `p0<-0`, `p1<-NULL`, `e0<-NULL`, `e1<-NULL`, `c0<-NULL`, `c1<-0`.
For more details and examples on PIRA computation, please refer to the dedicated package vignette *Assessing progression independent of relapse activity (PIRA) in MS*.
## Example
The following code detects multiple events with the `RAW_PIRA` flag enabled.
```{r}
output <- MSprog(data=toydata_visits,
subj_col="id", value_col="EDSS", date_col="date", outcome="edss",
event="multiple", RAW_PIRA=TRUE, baseline="roving",
relapse=toydata_relapses)
```
The `output$event_count` data frame (and the summary printed out by the function)
now includes RAW and PIRA counts:
```{r}
# print(output$event_count)
DT::datatable(output$event_count,
options = list(dom="t", scrollX=T, scrollY=F, paging = FALSE)
)
```
In the `output$results` data frame, each CDW event is now further characterised by a `CDW_type` column:
```{r}
# print(output$results, row.names=FALSE)
DT::datatable(output$results, rownames=F,
options = list(dom="t", scrollX=T, scrollY=F, paging = FALSE)
)
```
# `MSprog()` outputs
## What to include in results
The `results` attribute provides extended info on each event for all subjects. The following arguments can be used to control the information included.
* `include_dates`: if `TRUE`, the results will include the dates of: event onset; baseline; last visit before event onset at clinically meaningful score difference from it; first confirmation visit.
* `include_value`: if `TRUE`, the results will include the outcome value at: event onset; baseline; last visit before event onset at clinically meaningful score difference from it; first confirmation visit.
* `include_stable`: if `TRUE`, subjects with no confirmed events are included in the results, with `time2event` = total follow up.
For example:
```{r}
output <- MSprog(data=toydata_visits,
subj_col="id", value_col="EDSS", date_col="date",
outcome="edss", relapse=toydata_relapses, verbose=0,
include_dates=T, include_value=T, include_stable=F) # <---- !
# print(output$results, row.names=FALSE)
DT::datatable(output$results, rownames=F,
options = list(dom="t", scrollX=T, scrollY=F, paging = FALSE)
)
```
## Printing progress info {#progressinfo}
The `verbose` argument controls the amount of info printed out by the `MSprog()` function. When setting `verbose=0`, the function prints no info. When setting `verbose=1` (default), the function only prints out concise info. When setting `verbose=2`, the function prints out an extended log of the ongoing computations. See the example below.
```{r}
output <- MSprog(data=toydata_visits,
subj_col="id", value_col="EDSS", date_col="date", outcome="edss",
event="multiple", baseline="roving",
relapse=toydata_relapses, verbose=2)
```
For further insight into the event detection process, all score changes from baseline that were not identified as valid events (e.g., not confirmed) are stored in `output$unconfirmed`. In this example, this looks like:
```{r}
# print(output$unconfirmed, row.names=FALSE)
DT::datatable(output$unconfirmed, rownames=F,
options = list(dom="t", scrollX=T, scrollY=F, paging = FALSE)
)
```
# Time to disability milestone
Instead of studying disability course with respect to a baseline value, one can focus on the time taken to reach a specific disability milestone (e.g., EDSS $\geq$ 6). This can be computed using another function from the `msprog` package, `value_milestone()`.
The following code detects the time to EDSS $\geq$ 6 for each subject in the toy data.
```{r}
vm <- value_milestone(toydata_visits, milestone=6,
subj_col="id", value_col="EDSS", date_col="date",
outcome="edss", relapse=toydata_relapses,
verbose=0)
```
The function returns a data frame indexed by subject IDs:
```{r}
print(vm)
```
where: `"date"` contains the date of the first confirmed EDSS $\geq$ 6 (or last date of follow-up if milestone is not reached or not confirmed); `"EDSS"` contains the first EDSS value $\geq$ 6, if present, otherwise no value; `"time2event"` contains the time to reach EDSS $\geq$ 6 (or total follow-up length if not reached or not confirmed); `"observed"` indicates whether EDSS $\geq$ 6 was reached (1) or not (0).
Several arguments controlling the behaviour of the `MSprog()` function are also available for the `value_milestone()` function (e.g., `require_sust_days`, `impute_last_visit`...). Please refer to the documentation (by typing `?value_milestone`) for a complete illustration of each of the function arguments and their default values.
For more examples on the usage of the `value_milestone()` function, please refer to the package vignette *Time to event*.
# References