forked from ayushnoori/apoe-glia
-
Notifications
You must be signed in to change notification settings - Fork 1
/
ROSMAP-preprocess.Rmd
96 lines (61 loc) · 1.62 KB
/
ROSMAP-preprocess.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
---
title: "Pre-Process ROSMAP Data"
description: |
This script removes low-expressed ROSMAP genes and adjusts by age, sex, and post-mortem interval (PMI).
output:
distill::distill_article:
toc: true
---
```{r setup, include = FALSE}
knitr::opts_chunk$set(eval = FALSE)
```
# Dependencies
Load requisite packages.
```{r load-packages}
library(data.table)
```
# Load Data
Load `.Rdata` object saved by prior script.
```{r load-data}
# load data
load("../Data/ROSMAP-24.Rdata")
rosmap_fpkm = rosmap$fpkm
cov = rosmap$cov
```
# Filter Genes
Remove low-expressed genes.
```{r filter-genes}
# remove low expressed genes
gene_sel = rowMeans(rosmap_fpkm) > 0.1
rosmap_fpkm = rosmap_fpkm[gene_sel, ]
```
# Adjust and Transform
Adjust by age, sex, and post-mortem interval (PMI), then log-transform data.
```{r adjust-covariates}
# create empty matrix to populate
adj_exp = matrix(NA, ncol = ncol(rosmap_fpkm), nrow = nrow(rosmap_fpkm))
colnames(adj_exp) = cov$projid
rownames(adj_exp) = rownames(rosmap_fpkm)
# adjust by age, sex, and PMI
msex = cov$msex
age_at_death = cov$age_at_death
pmi = cov$pmi
# impute missing values with mean
pmi[is.na(pmi)] = mean(pmi, na.rm = T)
```
Log-transform data.
```{r log-transform}
# log2(fpkm + 0.0001) adjusted by age, sex, and PMI
for(gene in rownames(rosmap_fpkm)){
exp = log2(as.numeric(rosmap_fpkm[gene,]) + 0.1^4)
lm_fit = lm(exp ~ msex + age_at_death + pmi)
adj_exp[gene, ] = lm_fit$residuals
}
```
# Save Data
```{r save-data}
# save data
expSet = list(mData = adj_exp,
cov = cov)
save(expSet, file = "../Data/ROSMAP-24-adj-low.expr.genes.removed.Rdata")
```