041_ Outlier analyses in whole blood transcriptomes to identify rare genetic variants in individuals with schizophrenia
Research Question and Aims
Even in common, genetically complex disorders such as schizophrenia, common genetic variants explain only part of the heritability. It is therefore plausible that the genetic architecture of schizophrenia involves not only common variants (PMID 31835028; PMID 29483656; PMID 29056061) but also, in a proportion of cases, rare variants with stronger effects on the phenotype. The population frequency of a genetic variant is inversely related to its potential effect on the phenotype (PMID 19812666): rarer variants tend to have stronger effects. Some such rare variants are already known and well characterized in the context of schizophrenia (PMID 28650482; PMID 25821909; PMID 25132547), but many more rare variants with a strong effect on the schizophrenia phenotype have likely not yet been recognized. This project aims to identify such variants. Specifically, it will apply a recently developed bioinformatics method, transcriptomic outlier analysis, to identify functionally relevant rare genetic variants in a dataset of 550 whole transcriptome analyses (WTAs) from the blood of schizophrenia patients in the PsyCourse cohort.
The goal of this research project is to identify rare genetic variants that play a role in the development of schizophrenia. To this end, outlier analyses in transcriptome sequencing data will be used to identify individuals in whom rare genetic variants with a strong functional effect are present.
We will include those individuals with schizophrenia from the PsyCourse cohort for whom whole blood transcriptome data are available. Whole transcriptome analyses (WTAs) were performed for 550 patients with schizophrenia as part of a separate research project unrelated to the one described here (Lexogen 3' RNA sequencing, Illumina platform, Next Generation Sequencing (NGS) Competence Center Bonn, Rheinische Friedrich-Wilhelms-Universität Bonn; bioinformatic processing by the Institute for Systems Biology and Bioinformatics, University of Rostock).
The existing WTA data from the blood of schizophrenia patients in the PsyCourse study will be analyzed for expression and splicing outliers. After a quality control step, the BAM files will be analyzed using the DROP workflow (PMID 33462443; PMID 33483494). Among other things, this workflow uses an autoencoder to control for latent variables in the dataset and to identify statistical outliers. Whereas most case-control comparisons exclude such outliers, they are the focus of this analysis.
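As an illustration of the outlier concept only, the following Python sketch flags (sample, gene) pairs with extreme expression after removing latent structure. It substitutes a truncated SVD (principal components) for the autoencoder and plain z-scores for the statistical model, so it is not the DROP implementation. The function names, the simulated data, the number of latent factors, and the cutoff are all assumptions made for this example.

```python
import numpy as np


def simulate_counts(n_samples=200, n_genes=40, outlier=(7, 3), seed=0):
    """Toy count matrix (samples x genes) with two hidden factors and one
    artificially down-regulated (sample, gene) entry. Purely illustrative."""
    rng = np.random.default_rng(seed)
    base = rng.uniform(200, 2000, size=n_genes)        # mean count per gene
    factors = rng.normal(0, 1, size=(n_samples, 2))    # hidden covariates
    loadings = rng.normal(0, 0.3, size=(2, n_genes))   # gene-wise effects
    counts = rng.poisson(base * 2.0 ** (factors @ loadings)).astype(float)
    counts[outlier] *= 0.2                             # inject one outlier
    return counts


def expression_outliers(counts, n_latent=2, z_cutoff=4.0):
    """Return per-entry z-scores and the (sample, gene) index pairs with
    |z| >= z_cutoff, after removing the top n_latent principal components.

    Assumes `counts` is already normalised for sequencing depth.
    PCA is a simple stand-in for the autoencoder-based correction.
    """
    x = np.log2(np.asarray(counts, dtype=float) + 1.0)
    x = x - x.mean(axis=0, keepdims=True)              # centre each gene

    # Low-rank reconstruction captures shared (latent) variation.
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    latent = (u[:, :n_latent] * s[:n_latent]) @ vt[:n_latent]
    resid = x - latent

    # Gene-wise z-scores of the residuals; constant genes become NaN.
    sd = resid.std(axis=0, ddof=1)
    sd[sd == 0] = np.nan
    z = resid / sd

    hits = [(int(i), int(j)) for i, j in np.argwhere(np.abs(z) >= z_cutoff)]
    return z, hits


if __name__ == "__main__":
    z_scores, outlier_hits = expression_outliers(simulate_counts())
    print(outlier_hits)
```

In this toy setting the injected entry is the strongest deviation in the matrix, while the shared variation introduced by the two hidden factors is absorbed by the low-rank term rather than being reported as outliers.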
Interpreting the detected outliers with aberrant gene expression or aberrant RNA splicing requires genetic data, since both common and rare regulatory variants in non-coding regions of the genome can often be responsible for changes in gene expression or RNA splicing. To this end, we will use the available genome-wide genotyping data and, where available, whole genome sequencing (WGS) data (funding applied for but not yet received). Where possible, already available proteomics data will also be used to corroborate findings from the transcriptomic outlier analyses.
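One way the link between outliers and genetic data could be made is sketched below in Python: for each (sample, gene) outlier, list the rare variants carried by that individual in or near the gene. This is a hypothetical illustration, not a description of the planned pipeline. The data structures, the 1% allele-frequency threshold, the 10 kb flanking window, and the toy identifiers are assumptions chosen for the example, and common variants would need a separate treatment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    sample: str          # individual carrying the variant
    chrom: str
    pos: int
    allele_freq: float   # population allele frequency


@dataclass(frozen=True)
class Gene:
    name: str
    chrom: str
    start: int
    end: int


def candidate_variants(outliers, genes, variants, max_af=0.01, flank=10_000):
    """Map each (sample, gene_name) outlier to the rare variants that the
    same sample carries within the gene body +/- `flank` base pairs.

    max_af and flank are illustrative defaults, not recommended values.
    """
    # Index rare variants by sample so each outlier is a cheap lookup.
    rare_by_sample = {}
    for v in variants:
        if v.allele_freq <= max_af:
            rare_by_sample.setdefault(v.sample, []).append(v)

    result = {}
    for sample, gene_name in outliers:
        g = genes[gene_name]
        result[(sample, gene_name)] = [
            v for v in rare_by_sample.get(sample, [])
            if v.chrom == g.chrom and g.start - flank <= v.pos <= g.end + flank
        ]
    return result


if __name__ == "__main__":
    # Toy inputs with made-up identifiers and coordinates.
    toy_genes = {"GENE_A": Gene("GENE_A", "1", 100_000, 120_000)}
    toy_variants = [
        Variant("S1", "1", 95_000, 0.001),    # rare, inside the flank
        Variant("S1", "1", 300_000, 0.001),   # rare, too far away
        Variant("S1", "1", 110_000, 0.2),     # in the gene, but common
    ]
    print(candidate_variants([("S1", "GENE_A")], toy_genes, toy_variants))
```

An outlier with an empty candidate list would then be a natural case for follow-up with WGS data, where regulatory variants not covered by genotyping arrays may be found.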
All bioinformatics analyses will be performed in R or Python as applicable.
raw medication data sets (v1_med_clin_orig)
raw medication data sets (v3_med_clin_orig)
FASTQ and BAM files of Lexogen 3' RNA Seq
QC-ed intensities and protein levels
Whole blood transcriptomes have already been sequenced by the Next Generation Sequencing Core Facility of the University of Bonn. Bioinformatic work-up was performed in part by the Department of Systems Biology and Bioinformatics at the University of Rostock as part of a different project led by PD Dr. Franziska Degenhardt. Plasma proteome MS data for PsyCourse individuals are already available to the applicant through a collaboration with the Matthias Mann Lab at the MPI of Biochemistry in Munich.
Antibody-based plasma proteome data for PsyCourse individuals are already available to the applicant through a collaboration with the Peter Nilsson Lab at SciLifeLab, School of Biotechnology, KTH Royal Institute of Technology in Sweden.