021_ Providing a use case of longitudinal data for the manuscript "Phenendo: a tool for clustering of cross-sectional and longitudinal phenotype data"
Research Question and Aims
As mentioned above, the aim of this proposal is not to answer a scientific question but to provide a use case that demonstrates the ability of the PhenEndo toolbox to cluster longitudinal data. For this purpose, the use case should include more than one content-related group of variables, and at least one of these variable groups should contain different data types (mixed data).
Schizophrenia is a heterogeneous illness with regard to symptoms and level of functioning. Many, but not all, patients with schizophrenia experience not only episodes of positive symptoms, such as delusions and hallucinations, but also negative and/or depressive symptoms. Negative symptoms in particular are hard to treat and often have a strong impact on patients' quality of life and level of functioning. Therefore, we were interested in which clusters of patients emerge from the PsyCourse data with regard to level of functioning and negative as well as depressive symptoms.
In addition, the severity of symptoms can either be rated by an external rater (e.g., a clinician or a study interviewer) or be assessed via patient self-report. Thus, we wanted to explore whether clusters of patients show comparable patterns over time between interviewer ratings of negative and depressive symptoms and self-reported depressive symptoms.
We selected three groups of variables for dimension reduction: psychosocial functioning, negative and depressive symptoms rated by an interviewer and depressive symptoms rated on a self-report form by study participants.
I) Sample selection
We selected a subsample of study participants with a DSM-IV diagnosis of schizophrenia and complete data at all four study visits (n = 76; dataset version PsyCourse 3.0, long format). The three variable groups comprise the following variables:
Group 1: Functioning (mixed data)
- Global Assessment of Functioning Score GAF ("gaf", continuous)
- Current employment status ("curr_paid_empl", ordinal)
- Current relationship status ("partner", categorical)
Group 2: Negative and depressive symptoms (rating by interviewer)
- Inventory of Depressive Symptomatology IDS-C30, 30 items ("idsc_", ordinal)
- Negative Scale of the Positive and Negative Syndrome Scale PANSS, 7 items ("panss_n", ordinal)
Group 3: Depressive symptoms (self-report)
- Beck Depression Inventory II BDI-II, 21 items ("bdi2_", ordinal)
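The sample selection and variable groups above can be sketched in a few lines of Python. This is only an illustration, not part of PhenEndo: the column names `id`, `visit` and `diagnosis` and the label `"schizophrenia"` are hypothetical placeholders, the variable names are treated as column-name prefixes (an assumption), and "complete data" is simplified to "a row exists for each of the four visits", without checking individual items for missing values.

```python
from collections import defaultdict

# Variable groups as listed above. Names are treated as column prefixes
# (assumption), e.g. "idsc_" is taken to match "idsc_1", "idsc_2", ...
VARIABLE_GROUPS = {
    "functioning": ["gaf", "curr_paid_empl", "partner"],
    "interviewer_rated": ["idsc_", "panss_n"],
    "self_report": ["bdi2_"],
}
N_VISITS = 4


def select_complete_cases(rows, id_col="id", visit_col="visit",
                          dx_col="diagnosis", dx_label="schizophrenia"):
    """Keep long-format rows of participants with the target diagnosis
    who have a row for every one of the four visits.

    id_col, visit_col, dx_col and dx_label are hypothetical names.
    """
    visits_seen = defaultdict(set)
    for row in rows:
        if row[dx_col] == dx_label:
            visits_seen[row[id_col]].add(row[visit_col])
    keep = {pid for pid, seen in visits_seen.items() if len(seen) == N_VISITS}
    return [r for r in rows if r[id_col] in keep and r[dx_col] == dx_label]


def group_columns(all_columns, group):
    """Resolve the prefixes of one variable group to actual column names."""
    prefixes = VARIABLE_GROUPS[group]
    return [c for c in all_columns if any(c.startswith(p) for p in prefixes)]
```

Each variable group resolved this way would then be passed to the dimension reduction step separately.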
II) Clustering pipeline
Data are uploaded into the PhenEndo toolbox. In a first step, a dimension reduction method, factor analysis of mixed data (FAMD), is applied to each variable group. Next, the clustering algorithm flexmix is used to identify clusters of participants with similar trajectories across all variable groups.
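To illustrate the idea behind the dimension reduction step, the following sketch shows one common formulation of FAMD for a single variable group: continuous variables are z-scored, each category indicator is divided by the square root of the category's proportion and then centered, and a principal component analysis is run on the combined matrix. This is not PhenEndo's implementation. How the toolbox handles ordinal variables and repeated visits is not specified here, the function names are hypothetical, the toy values are made up, and the subsequent flexmix clustering step is omitted.

```python
import math
from collections import Counter


def famd_encode(continuous, categorical):
    """Build a FAMD-style design matrix (rows = observations).

    continuous:  dict name -> list of numbers (assumed non-constant)
    categorical: dict name -> list of labels
    """
    first = next(iter(list(continuous.values()) + list(categorical.values())))
    n = len(first)
    columns = []
    for values in continuous.values():
        mean = sum(values) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
        columns.append([(v - mean) / sd for v in values])  # z-score
    for labels in categorical.values():
        for level, count in sorted(Counter(labels).items()):
            p = count / n  # proportion of this category
            col = [(1.0 if lab == level else 0.0) / math.sqrt(p) for lab in labels]
            mean = sum(col) / n
            columns.append([c - mean for c in col])  # center the indicator
    return [list(row) for row in zip(*columns)]


def first_component_scores(matrix, iterations=200):
    """Scores on the first principal axis via power iteration on X^T X.

    The fixed start vector is assumed not to be orthogonal to that axis.
    """
    n_cols = len(matrix[0])
    axis = [float(j + 1) for j in range(n_cols)]
    for _ in range(iterations):
        scores = [sum(x * a for x, a in zip(row, axis)) for row in matrix]
        axis = [sum(row[j] * s for row, s in zip(matrix, scores))
                for j in range(n_cols)]
        norm = math.sqrt(sum(a * a for a in axis))
        axis = [a / norm for a in axis]
    return [sum(x * a for x, a in zip(row, axis)) for row in matrix]


# Toy example with made-up values for two variables from Group 1.
X = famd_encode({"gaf": [30, 35, 40, 60, 65, 70]},
                {"partner": ["no", "no", "no", "yes", "yes", "yes"]})
scores = first_component_scores(X)
```

In the pipeline described above, component scores of this kind, computed per variable group, would serve as input to the clustering step.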
III) Descriptive characterization of clusters
The identified clusters are compared regarding age, sex, functioning, current work status, current relationship status, depression sum scores, negative symptoms sum scores and an overall rating of course of illness.
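A descriptive comparison of this kind could be tabulated as in the following minimal sketch, which reports cluster size, means of numeric variables and counts of categorical variables per cluster. The column names `cluster`, `age` and `sex` are hypothetical placeholders, and the function is an illustration rather than part of the toolbox.

```python
from collections import Counter, defaultdict
from statistics import mean


def describe_clusters(rows, cluster_col="cluster",
                      numeric=("age", "gaf"), categorical=("sex",)):
    """Per-cluster size, means of numeric columns, and category counts.

    rows: list of dicts, one per participant, with a cluster label.
    """
    by_cluster = defaultdict(list)
    for row in rows:
        by_cluster[row[cluster_col]].append(row)
    summary = {}
    for cluster, members in sorted(by_cluster.items()):
        summary[cluster] = {
            "n": len(members),
            "means": {c: mean(r[c] for r in members) for c in numeric},
            "counts": {c: dict(Counter(r[c] for r in members))
                       for c in categorical},
        }
    return summary
```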
The analyses will be performed in a subset of the PsyCourse sample (version PsyCourse 3.0, long format). Only participants with a DSM-IV diagnosis of schizophrenia and data at all four study visits will be included (n = 76).
The phenotypic variables for analysis will be as follows:
Variables for longitudinal clustering
Variables for descriptive characterization of clusters
Of note, these data will NOT be published in raw form. Only the results will be published, and illustrations of these results will be contained in the documentation of the toolbox. Individual pseudonyms will not be published.
No biological data are needed.