2025-12-17
103_ The microRNA signatures of a Common Executive Function factor in the PsyCourse Study: A cross-sectional analysis
Research Question and Aims
Executive functions (EF) are a set of higher-order skills that regulate goal-oriented behavior, and their impairment is a shared feature of psychiatric disorders (1,2). Although EF encompass various correlated constructs such as working memory, set shifting, and inhibitory control, they can be modeled using a higher-order common latent factor that explains most of the variability among the lower-order constructs, known as the common Executive Function factor (cEF) (3). Although EFs are influenced by the environment and can be improved by training, the cEF is highly heritable according to various twin studies and the largest GWAS on cEF to date, conducted in the UK Biobank (n > 400,000) (1).
MicroRNAs are non-coding RNAs that regulate gene expression and have been reported to be dysregulated in various psychiatric disorders (4). Owing to favorable properties such as high stability in cell-free environments, the presence of brain-derived microRNAs in peripheral circulation, and the possibility of manipulating their expression using blockers or mimics, they are promising biomarkers and therapeutic targets for cognitive decline (5).
In this study we propose to analyze the circulating microRNAome expression associated with the latent cEF factor, which will be calculated from the individual executive function test scores. The main objective is to propose a combinatorial signature of 2 or 3 microRNAs that is associated with altered cEF scores in a transdiagnostic fashion. The secondary aim is to conduct a bioinformatics analysis of the candidates and explore potentially related biological processes.
Analytic Plan
Assessment of the phenotype
We hypothesize that there is a characteristic microRNA signature related to the cEF phenotype. As the phenotype of interest, we will use the previously calculated scores for a phenotypic latent common EF factor (Heilbronner et al., under review), which were derived following a procedure similar to that of Hatoum, 2023 (1). This latent factor analysis includes the scores of the Trail Making Tests A and B, the Verbal Digit Span Forwards and Backwards, and the Digit Symbol test.
Independent variables
The differential gene expression analysis will be conducted using the newly obtained microRNAome sequencing data (DFG Sequencing Costs in Projects) from the first visit (n=808). Additional variables of interest are sex, age at first interview, year of birth, educational status, and medication status.
QC and preprocessing
For quality control and preprocessing of the microRNA FASTQ files, we will follow an analysis pipeline established by Fischer et al. (5,6). Briefly, after sequencing, QC will be done using miRTrace (v1.11.0). Alignment and mapping of the resulting reads will be conducted using the default functions of miRDeep2 (v2.0.1.2), and read counts will be calculated using the quantifier script, also from miRDeep2. Transcripts with a low read number (sum of read counts < 50) will be filtered out, as will those enriched in blood.
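The low-count filter described above can be sketched in a few lines. This is a minimal illustration in plain Python, not the pipeline's actual code: the count table, the miRNA names, and the `exclude` argument (standing in for a list of blood-enriched transcripts) are all hypothetical, and we assume that "sum of read counts" means the sum across all samples.

```python
# Hypothetical count table: miRNA name -> read counts per sample.
counts = {
    "miR-A": [30, 25, 40],  # sum 95, kept
    "miR-B": [5, 10, 3],    # sum 18, removed
    "miR-C": [0, 49, 0],    # sum 49, removed
    "miR-D": [20, 20, 10],  # sum 50, kept (the plan removes sums < 50)
}

MIN_TOTAL_READS = 50  # threshold stated in the plan


def filter_low_counts(table, min_total=MIN_TOTAL_READS, exclude=()):
    """Keep transcripts whose summed reads reach min_total and are not in exclude."""
    excluded = set(exclude)  # hypothetical list of blood-enriched transcripts
    return {
        name: reads
        for name, reads in table.items()
        if sum(reads) >= min_total and name not in excluded
    }


kept = filter_low_counts(counts)
kept_without_blood = filter_low_counts(counts, exclude={"miR-D"})
```

Here `kept` retains miR-A and miR-D, and `kept_without_blood` additionally drops miR-D because it was passed in the illustrative exclusion list.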
Covariates that contribute to the variation in the data will be identified using the variancePartition R package (7). Factors of unwanted variation will be removed using the RUVSeq package (8).
The differential expression analysis will be performed using DESeq2 v1.3.40 (9). Transcripts will be considered significantly dysregulated if they meet the following criteria: adjusted p-value < 0.05 and log2 fold change > 0.5 or < -0.5. The PCA plots will be generated using the plotPCA function from DESeq2, and the volcano plots will be created with the EnhancedVolcano package (Bioconductor).
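As an illustration of the significance criteria, the following sketch classifies each transcript from its log2 fold change and adjusted p-value. The result rows and the function name `classify` are hypothetical. A missing adjusted p-value (which DESeq2 reports as NA for some transcripts) is represented here as `None` and treated as not significant, which is an assumption of this sketch.

```python
LFC_CUTOFF = 0.5  # absolute log2 fold change threshold from the plan
ALPHA = 0.05      # adjusted p-value threshold from the plan


def classify(log2fc, padj, lfc_cutoff=LFC_CUTOFF, alpha=ALPHA):
    """Return 'up', 'down', or 'ns' (not significant) for one transcript."""
    if padj is None or padj >= alpha:
        return "ns"
    if log2fc > lfc_cutoff:
        return "up"
    if log2fc < -lfc_cutoff:
        return "down"
    return "ns"


# Hypothetical results: (miRNA, log2 fold change, adjusted p-value).
results = [
    ("miR-A", 0.9, 0.001),
    ("miR-B", -0.7, 0.03),
    ("miR-C", 0.2, 0.0001),  # significant p-value but small effect
    ("miR-D", 1.5, 0.20),    # large effect but not significant
    ("miR-E", -1.1, None),   # adjusted p-value unavailable
]

calls = {name: classify(lfc, padj) for name, lfc, padj in results}
```

Both thresholds are strict, so a transcript with a log2 fold change of exactly 0.5 or an adjusted p-value of exactly 0.05 is not called dysregulated.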
Identification of the signature
Two models of differential gene expression will be tested. First, the cEF scores will be categorized into tertiles using a built-in R function, and the tertiles will be labeled high, medium, and low. Contrasts will be defined to produce the three possible pairwise comparisons. Second, the cEF scores will be used as a continuous variable to identify microRNAs whose expression varies significantly with the score (dose-response candidates). The overlap of significantly dysregulated miRNAs from the extreme comparison (Low vs High) and the dose-response analysis will be used for further analysis.
For the selection of the composite signature, logistic regression models will be tested in R. ROC curves will be calculated, and the best signatures will be selected based on the AUC. In addition, machine learning methods (random forest) will be employed in Python (v3.8.1) to identify a composite signature of 2 to 3 microRNAs. The signatures from both approaches will be compared to select the final candidates.
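The tertile labeling and the overlap step can be sketched as follows. This is an illustrative Python sketch with hypothetical scores and hit lists. The plan performs the categorization in R, and the exact cut-point convention may differ slightly from the default of `statistics.quantiles` used here.

```python
from statistics import quantiles


def tertile_labels(scores):
    """Label each score as low, medium, or high using the two tertile cut points."""
    lower, upper = quantiles(scores, n=3)  # two cut points for three groups
    return [
        "low" if s <= lower else "medium" if s <= upper else "high"
        for s in scores
    ]


# Hypothetical cEF factor scores for nine participants.
cef_scores = [-1.2, 0.9, -0.5, 0.0, 1.4, -0.8, 0.2, 0.6, -0.1]
labels = tertile_labels(cef_scores)

# Hypothetical hit lists from the two models.
extreme_hits = {"miR-A", "miR-B", "miR-C"}     # Low vs High contrast
continuous_hits = {"miR-B", "miR-C", "miR-D"}  # dose-response model

# Candidates carried forward: significant in both analyses.
overlap = sorted(extreme_hits & continuous_hits)
```

With these example values each tertile contains three participants, and `overlap` holds the two miRNAs shared by both hit lists.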
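To make the AUC-based ranking concrete, the sketch below enumerates all 2- and 3-microRNA combinations and ranks them by AUC. It is deliberately simplified and is not the planned method: instead of fitting a logistic regression or a random forest, it scores each combination with the mean of standardized expression values, after flipping the sign of markers whose single-marker AUC is below 0.5. The data, names, and group coding are hypothetical, and because orientation and evaluation use the same samples, the resulting AUCs are optimistic.

```python
from itertools import combinations
from statistics import mean, pstdev


def auc(scores, labels):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def zscores(values):
    """Standardize values; a constant marker is mapped to all zeros."""
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / (sd or 1.0) for v in values]


def rank_signatures(expr, labels, sizes=(2, 3)):
    """Rank every combination of the given sizes by the AUC of its composite score."""
    oriented = {}
    for name, values in expr.items():
        z = zscores(values)
        # Flip markers that are lower in the positive group so all point the same way.
        oriented[name] = [-v for v in z] if auc(z, labels) < 0.5 else z
    ranked = []
    for k in sizes:
        for combo in combinations(sorted(oriented), k):
            composite = [mean(col) for col in zip(*(oriented[m] for m in combo))]
            ranked.append((auc(composite, labels), combo))
    return sorted(ranked, reverse=True)


# Hypothetical data: 1 = low-cEF group, 0 = high-cEF group.
groups = [0, 0, 0, 1, 1, 1]
expression = {
    "miR-A": [1, 2, 3, 4, 5, 6],  # higher in the low-cEF group
    "miR-B": [6, 5, 4, 3, 2, 1],  # lower in the low-cEF group
    "miR-C": [3, 1, 2, 3, 2, 1],  # uninformative
}
ranking = rank_signatures(expression, groups)
best_auc, best_combo = ranking[0]
```

In a real analysis the composite score would come from the fitted model, and the AUC would be estimated on held-out data or by cross-validation.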
Bioinformatics Analysis
mRNA targets will be identified using an in-house tool developed by Krueger et al. (10).
Biological pathways related to the common targets of the microRNAs will be identified using gene ontology analysis with the enrichGO function (clusterProfiler package) in R.
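Over-representation analyses of this kind are commonly based on a hypergeometric test, which asks whether a GO term is annotated to more target genes than expected by chance. The sketch below shows that calculation for a single term under this assumption. The function name and the numbers are hypothetical, and it omits the multiple-testing correction that a full analysis applies across terms.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k): chance of seeing at least k annotated genes among n targets.

    N = genes in the background, K = background genes annotated to the term,
    n = target genes tested, k = target genes annotated to the term.
    """
    upper = min(n, K)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


# Hypothetical toy example: 10 background genes, 3 annotated to the term,
# 2 target genes, of which at least 1 is annotated.
p_value = hypergeom_pvalue(k=1, n=2, K=3, N=10)
```

In this toy example the p-value equals 24/45, the probability that at least one of two randomly drawn genes carries the annotation.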
A network analysis of the targets of those microRNAs will be conducted using Cytoscape (11).
Resources needed