PsyCourse

2020-11-16

037_ Multi-level integrative omics to identify biomarkers in schizophrenia and other major psychoses (MulioBio): smallRNAome comparison of broad diagnostic groups

Research Question and Aims

Despite many years of research and many promising candidates, there are no validated and reliable biomarkers (BM) in clinical use for schizophrenia or other major psychoses (MP). In the MulioBio project, we postulate that the way phenotypes are defined is the reason BM could not be discovered. To resolve this, we intended to classify patients of the longitudinal PsyCourse cohort who show the same behavioral profiles over time into three to four transdiagnostic groups. In the MulioBio study, former PsyCourse participants will be re-contacted, PBMCs will be obtained from them under the auspices of the MulioBio study, and a multi-level BM screening of molecular phenotypes (RNAome, smallRNAome, epigenome, proteome) will be performed, eventually leading to potential biomarkers able to differentiate between the transdiagnostic groups. Initially, we planned to validate robust BM in an independent cohort.
To find phenotypic groups within the PsyCourse cohort, we initially planned to use the longitudinal clustering pipeline developed by the Helmholtz Institute for Computational Biology (either the RShiny app PhenEndo or R functions implementing this pipeline). This pipeline is able to cluster PsyCourse individuals into groups that show different phenotypic trajectories over time. As development of this pipeline has been delayed, we will, for now, separate PsyCourse participants according to their broad diagnostic group (Affective versus Psychotic). We will use those data to identify potential biomarkers of these groups with Differential Expression Analysis and Weighted Gene Co-expression Network Analysis (WGCNA) and integrate the results obtained from those data into the MulioBio project.

Analytic Plan

We hypothesize that the expression of circulating small RNA is related to broad diagnostic groups in psychosis. To prepare the MulioBio project, we intend to study smallRNAome data from biological samples collected at the first visit of the PsyCourse study. As a first step, we will evaluate differences and similarities between the smallRNAomes of the different diagnostic groups. After the newly collected samples of the MulioBio project are processed, we will integrate the results from the PsyCourse participants with the newly obtained data.
This will be realized in the following steps:
1. The COMPSRA pipeline (https://github.com/cougarlj/COMPSRA) will be used to analyze the PsyCourse smallRNAome data.
2. We will compare the different groups of patients using differential expression analysis (with the R package DESeq2).
3. We will use Weighted Correlation Network Analysis (with the R package WGCNA) to identify relevant modules and candidate small RNAs.
4. We will annotate the results using gene ontology analysis (with the R package topGO) and carry out further literature research.
Later in the project, we also aim to integrate available SNP data of the PsyCourse participants into the obtained results.
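To illustrate the normalization step that precedes the group comparison in step 2, the following is a minimal Python sketch of median-of-ratios size factors, the normalization scheme used by DESeq2. It is not a substitute for the DESeq2 package, which will be used for the actual analysis; the function name, the sample labels and the toy counts are illustrative assumptions.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors.

    counts: dict mapping sample name -> list of raw read counts,
            with the same small RNA order in every sample (hypothetical layout).
    """
    samples = list(counts)
    n_features = len(counts[samples[0]])

    # Log geometric mean per small RNA, using only features that are
    # non-zero in every sample (a zero count has no defined log ratio).
    usable, log_geo_means = [], []
    for i in range(n_features):
        values = [counts[s][i] for s in samples]
        if all(v > 0 for v in values):
            usable.append(i)
            log_geo_means.append(sum(math.log(v) for v in values) / len(values))
    if not usable:
        raise ValueError("no feature is expressed in all samples")

    # Size factor of a sample = median ratio of its counts to the geometric means.
    factors = {}
    for s in samples:
        log_ratios = [math.log(counts[s][i]) - g
                      for i, g in zip(usable, log_geo_means)]
        factors[s] = math.exp(median(log_ratios))
    return factors


# Toy example: sample "B" was sequenced twice as deeply as sample "A".
toy_counts = {"A": [10, 20, 30, 0], "B": [20, 40, 60, 5]}
toy_factors = size_factors(toy_counts)
normalized = {s: [c / toy_factors[s] for c in toy_counts[s]] for s in toy_counts}
```

Dividing each sample by its size factor makes counts comparable across samples with different sequencing depth, which is a prerequisite for both the differential expression and the network analysis.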

Addendum 1:
In our proposal, we asked for the smallRNAome FASTQ and SNP data generated from the V1 biomaterial.
A transcriptome-wide association study (TWAS) uses samples for which both gene expression and genotype were measured, together with GWAS summary statistics, to identify associations between genes and the GWAS phenotype (Gusev et al. 2016), while considering only the part of the expression that is driven by the genotype. The approach computes weights (correlations) between the expression of each gene and the SNPs located within +/- 500 bp of the TSS (cis-SNPs). Those weights are integrated with previously computed linkage disequilibrium to impute the expression driven by the cis-SNPs in the GWAS summary data before testing for an association between the imputed expression and the GWAS phenotype.
A previous study has already performed a TWAS on a cohort of patients with schizophrenia and bipolar disorder using GWAS summary data (Gusev et al. 2018). While small RNAs play important roles in the regulation of gene expression and of the immune system, there is currently no psychosis-related TWAS based on small RNA data. Our first objective is to use the PsyCourse genotype and smallRNAome data to compute the TWAS weights and linkage disequilibrium.
Those data will be used with public GWAS summary data (from the Psychiatric Genomics Consortium) to identify transcriptome-wide associations of miRNAs with schizophrenia and other psychoses. We will then carry out a literature search to check whether the TWAS-associated miRNAs regulate the expression of genes involved in schizophrenia.
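The association test described in this addendum can be sketched as follows. This is a minimal Python illustration of the summary-statistic TWAS statistic in the form given by Gusev et al. 2016, where the weighted sum of GWAS z-scores is scaled by the variance implied by the linkage disequilibrium between the cis-SNPs. The function names and all numeric values are illustrative assumptions; the actual analysis would rely on dedicated TWAS software.

```python
import math


def twas_z(weights, gwas_z, ld):
    """TWAS z-score for one miRNA: (w . z) / sqrt(w' * LD * w).

    weights: expression weights of the cis-SNPs
    gwas_z:  GWAS summary z-scores of the same SNPs, same order
    ld:      SNP-by-SNP correlation (LD) matrix as a list of lists
    """
    n = len(weights)
    if len(gwas_z) != n or len(ld) != n or any(len(row) != n for row in ld):
        raise ValueError("weights, gwas_z and ld must have matching dimensions")
    numerator = sum(w * z for w, z in zip(weights, gwas_z))
    variance = sum(weights[i] * ld[i][j] * weights[j]
                   for i in range(n) for j in range(n))
    if variance <= 0:
        raise ValueError("weights imply zero variance of imputed expression")
    return numerator / math.sqrt(variance)


def two_sided_p(z):
    """Two-sided p-value of a z-score under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


# Toy example with two cis-SNPs (all values hypothetical).
w = [0.5, 0.5]
z = [2.0, 2.0]
z_independent = twas_z(w, z, [[1.0, 0.0], [0.0, 1.0]])  # SNPs not in LD
z_perfect_ld = twas_z(w, z, [[1.0, 1.0], [1.0, 1.0]])   # SNPs in perfect LD
```

In the toy example, two independent SNPs pointing in the same direction give a stronger signal than two SNPs in perfect LD, which carry the same information twice. This is why the linkage disequilibrium has to be estimated alongside the weights.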

Addendum 2:
Our analysis of the smallRNAome enabled the identification of 52 unique miRNAs that were deregulated between different clinical groups. The TWAS analysis identified miRNAs associated with schizophrenia (12) or bipolar disorder (9) that are not among the miRNAs identified by the previous DGE analysis.
To continue the analysis, we intend to:
1- Check whether the miRNAs we identified have already been associated with psychosis (this could be considered a form of validation of our results)
2- Check whether SNPs are present in the mature miRNAs (in particular, we want to check the seed regions)
3- Determine the cell types in which the miRNAs are expressed
4- Use the identified miRNAs to perform target, pathway, and interaction analyses
5- Use the FASTQ files to check whether novel miRNAs can be identified
Points 4 and 5 will be carried out by Igor Jurisica of the Krembil institute. Eventually, to make our research accessible, we could add the data (after publication) to a data interaction portal, NeuroDIP (not yet public).
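The seed-region check in point 2 could be sketched as below. This minimal Python example assumes that the seed region corresponds to nucleotides 2 to 8 counted from the 5' end of the mature miRNA (a common convention; other definitions exist) and that coordinates are 1-based and inclusive. The function names and the coordinates are hypothetical.

```python
def seed_interval(start, end, strand, seed_first=2, seed_last=8):
    """Genomic interval of the seed region of a mature miRNA.

    start, end: 1-based inclusive genomic coordinates of the mature miRNA
    strand:     "+" or "-"; on the minus strand the 5' end lies at `end`
    seed_first, seed_last: seed positions counted from the 5' end (assumed 2-8)
    """
    if strand == "+":
        return start + seed_first - 1, start + seed_last - 1
    if strand == "-":
        return end - seed_last + 1, end - seed_first + 1
    raise ValueError("strand must be '+' or '-'")


def classify_snp(pos, start, end, strand):
    """Return 'seed', 'mature' (outside the seed) or 'outside' for a SNP position."""
    if not start <= pos <= end:
        return "outside"
    lo, hi = seed_interval(start, end, strand)
    return "seed" if lo <= pos <= hi else "mature"


# Hypothetical 22-nt mature miRNA located at positions 100-121.
plus_hits = [classify_snp(p, 100, 121, "+") for p in (99, 100, 101, 107, 108)]
minus_hits = [classify_snp(p, 100, 121, "-") for p in (121, 120, 114, 113)]
```

Handling the strand explicitly matters because, for miRNAs on the minus strand, the seed lies near the higher genomic coordinates of the mature sequence.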

Individual level data will not be published without access restrictions.

Resources needed

Phenotype data:
v1_id
v1_dsm_dx
v1_scid_dsm_dx_cat
v1_curr_paid_empl
v1_partner
v1_sex
v1_age
v1_dur_illness
v1_evr_ill_drg
v1_ever_smkd
v1_age_smk
v1_no_cig
v1_alc_pst12_mths
v1_alc_5orm
v1_lftm_alc_dep
v1_Antidepressants
v1_Antipsychotics
v1_Mood_stabilizers
v1_Tranquilizers
v1_gaf
v1_idsc_itm1
v1_idsc_itm10
v1_idsc_itm15
v1_idsc_itm16
v1_idsc_itm17
v1_idsc_itm18
v1_idsc_itm19
v1_idsc_itm2
v1_idsc_itm20
v1_idsc_itm21
v1_idsc_itm22
v1_idsc_itm23
v1_idsc_itm24
v1_idsc_itm25
v1_idsc_itm26
v1_idsc_itm27
v1_idsc_itm28
v1_idsc_itm29
v1_idsc_itm3
v1_idsc_itm30
v1_idsc_itm4
v1_idsc_itm5
v1_idsc_itm6
v1_idsc_itm7
v1_idsc_itm8
v1_idsc_itm9
v1_idsc_11_12
v1_idsc_13_14
v1_panss_p1
v1_panss_p2
v1_panss_p3
v1_panss_p4
v1_panss_p5
v1_panss_p6
v1_panss_p7
v1_panss_n1
v1_panss_n2
v1_panss_n3
v1_panss_n4
v1_panss_n5
v1_panss_n6
v1_panss_n7
v1_panss_g1
v1_panss_g2
v1_panss_g3
v1_panss_g4
v1_panss_g5
v1_panss_g6
v1_panss_g7
v1_panss_g8
v1_panss_g9
v1_panss_g10
v1_panss_g11
v1_panss_g12
v1_panss_g13
v1_panss_g14
v1_panss_g15
v1_panss_g16
v1_ymrs_itm1
v1_ymrs_itm2
v1_ymrs_itm3
v1_ymrs_itm4
v1_ymrs_itm5
v1_ymrs_itm6
v1_ymrs_itm7
v1_ymrs_itm8
v1_ymrs_itm9
v1_ymrs_itm10
v1_ymrs_itm11
v1_smRNAome_id
v1_gwas_id

Biological data:
Small RNAome data in fastq format: visit 1 smallRNAome data
Imputed GWAS data (GSA chip, whenever available)