docs/Bioinformatics_Concepts/GWAS_By_Subtraction.md
Lines changed: 17 additions & 15 deletions
@@ -14,8 +14,8 @@ It is useful to understand GWAS-by-subtraction via linear algebra.
 Consider a [Euclidean space](https://en.wikipedia.org/wiki/Euclidean_space) in which:

 - GWAS traits are vectors.
-- The [inner product](https://en.wikipedia.org/wiki/Inner_product_space) of two traits is their [genetic covariance](Genetic_Correlation.md). Denote the inner product of $u$ and $v$ as $\langle u,v \rangle$.
-- We assume all phenotypes have been normalized to have variance of 1. Under this assumption, a trait's squared [Euclidean norm](https://en.wikipedia.org/wiki/Inner_product_space#Norm_properties) is its heritability: $\lVert v \rVert^2=h^2_v$ where $h^2_v$ is the heritability of $v$.
+- The [inner product](https://en.wikipedia.org/wiki/Inner_product_space) of two traits is their [genetic covariance](Genetic_Correlation.md#genetic-covariance). Denote the inner product of traits $u$ and $v$ as $\langle u,v \rangle$.
+- We assume all phenotypes have been normalized to have variance of 1. Under this assumption, a trait's squared [Euclidean norm](https://en.wikipedia.org/wiki/Inner_product_space#Norm_properties) is its heritability: $\lVert v \rVert^2=h^2_v$ where $h^2_v$ is the heritability of trait $v$.

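The geometric picture in this hunk can be made concrete with a small numerical sketch. The heritabilities and genetic covariance below are hypothetical, illustrative numbers (not from the document), and using a Cholesky factor to obtain coordinates is just one convenient choice among many. The sketch checks that vectors built this way reproduce the stated identities: the inner product equals the genetic covariance and the squared norm equals the heritability. The cosine of the angle between the two vectors is then the usual genetic correlation.

```python
import numpy as np

# Hypothetical values for two variance-normalized traits u and v.
h2_u, h2_v = 0.5, 0.3   # heritabilities (squared norms)
gcov_uv = 0.2           # genetic covariance (inner product)

# Gram matrix of the two trait vectors.
gram = np.array([[h2_u, gcov_uv],
                 [gcov_uv, h2_v]])

# Any factorization gram = V @ V.T gives valid coordinates;
# the rows of the Cholesky factor are one such choice.
vectors = np.linalg.cholesky(gram)
u, v = vectors[0], vectors[1]

inner_uv = float(u @ v)    # recovers the genetic covariance
norm_sq_u = float(u @ u)   # recovers h2_u
norm_sq_v = float(v @ v)   # recovers h2_v

# Cosine of the angle between u and v: the genetic correlation.
cosine = inner_uv / np.sqrt(norm_sq_u * norm_sq_v)
print(inner_uv, norm_sq_u, norm_sq_v, cosine)
```

Only the Gram matrix is determined by the genetics; the coordinates themselves are arbitrary up to rotation, which is why any factorization works here.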
@@ -90,13 +90,14 @@ Where:
 - $x\in\mathbb{R}^M$ is the random genotype. We assume $x$ has mean zero, but unlike in [LDSC](LDSC.md), we do not assume it has been variance-standardized. Let $H_i$ be the variance of the $i$th variant.
 - $\beta_F,\beta_R\in\mathbb{R}^M$ are the underlying causal effects of the genetic variants.
 - $F,R$ are the two orthonormal underlying factors.
-- $\delta_1, \delta_2$ are the non-genetic components of the two traits. We assume these effects are independent of all genotypes.
+- $a_F,a_R,b\in\mathbb{R}$ are the scalar multipliers that relate the normalized factors $F,R$ to the unnormalized factors $F',R'$.
+- $\delta_1, \delta_2\in\mathbb{R}$ are the random non-genetic components of the two traits. We assume these effects are independent of all genotypes.


 ### Marginal Model


-Let's now focus on SNP $i$, and develop a model around the marginal GWAS regression on this SNP.
+Let's now focus on an arbitrary SNP $i$ and model the marginal GWAS regression on this SNP.


 Define
@@ -125,32 +126,32 @@ R &= \hat\beta_{R,i}x_i+\zeta_{R,i}\\
 \end{align}
 $$

-We assume $\zeta_{F,i},\zeta_{R,i}$ are approximately independent of $x_i$. This is a good approximation so long as individual variant effects ($\beta_{F,i},\beta_{R,i}$) are small, as is the case for most non-Mendelian traits.
+We assume $\zeta_{F,i},\zeta_{R,i}$ are approximately independent of $x_i$. While not strictly true, this is a good approximation so long as individual variant effects ($\beta_{F,i},\beta_{R,i}$) are small, as is the case for polygenic traits.

 ### Theoretical covariance

-Next, let us examine the genetic covariance structure of the random variables $(x_i, T_1, T_2)$.
+Next, let us examine the genetic covariance structure of the scalar random variables $(x_i, T_1, T_2)$.

-We will denote by $\mathrm{GCov}$ and $\mathrm{GVar}$ the genetic covariance and variance respectively\footnote{Because of our earlier assumption that phenotype variance has been normalized to 1, genetic variance equals heritability.}.
+We will denote by $\mathrm{GCov}$ and $\mathrm{GVar}$ the genetic covariance and variance respectively[^covnote].


 $$
 \begin{align}
-&\mathrm{GCov}(X_i, T_1)\\
-&=\mathrm{GCov}(X_i, a_F F + a_R R + \delta_1)\\
-&=\mathrm{Cov}(X_i, a_F F + a_R R ) & \text{Since $\delta_1$ is non-genetic}\\
docs/Bioinformatics_Concepts/Genetic_Correlation.md
Lines changed: 7 additions & 3 deletions
@@ -8,8 +8,8 @@ The model is

 $$
 \begin{align}
-Y_A &= E_A + G_A,\\
-Y_B &= E_B + G_B.
+Y_A &= E_A + G_A, \label{model1} \\
+Y_B &= E_B + G_B. \label{model2}
 \end{align}
 $$

@@ -27,4 +27,8 @@ What does it mean biologically when two traits are genetically correlated? The m
 Besides these straightforward cases, there are more exotic possible causes of genetic correlation, as discussed [here](https://gcbias.org/2016/04/19/what-is-genetic-correlation/). Briefly,

 - Two traits can be genetically correlated because genetics affects the behavior of a parent, which affects the phenotype of their child.
-- Two traits can be genetically correlated because individuals with these traits tend to mate at a higher rate than would be expected under random mating.
+- Two traits can be genetically correlated because individuals with these traits tend to mate at a higher rate than would be expected under random mating.
+
+## Genetic Covariance
+
+Some applications require the calculation of the genetic covariance between two traits. In the context of the model of $(\ref{model1},\ref{model2})$, the genetic covariance is $\mathrm{Cov}(G_A, G_B)$. Note that genetic covariance depends strongly on how the traits are scaled.
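The scale dependence noted in the added section can be illustrated with a short simulation. This is a minimal sketch under stated assumptions: the genetic components `g_a` and `g_b` are simulated from a hypothetical shared-plus-independent normal model, and the scale factors are arbitrary stand-ins for a change of measurement units. None of these values come from the document. It shows that rescaling the traits multiplies the covariance by the product of the scale factors, while the correlation is unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical genetic components G_A and G_B sharing a common part.
shared = rng.normal(size=n)
g_a = shared + rng.normal(size=n)
g_b = 0.5 * shared + rng.normal(size=n)


def cov(x, y):
    """Sample covariance between two 1-D arrays."""
    return float(np.cov(x, y)[0, 1])


def corr(x, y):
    """Sample Pearson correlation between two 1-D arrays."""
    return float(np.corrcoef(x, y)[0, 1])


# Arbitrary positive rescalings, e.g. a change of measurement units.
scale_a, scale_b = 100.0, 0.1

cov_original = cov(g_a, g_b)
cov_rescaled = cov(scale_a * g_a, scale_b * g_b)
corr_original = corr(g_a, g_b)
corr_rescaled = corr(scale_a * g_a, scale_b * g_b)

# Covariance picks up the factor scale_a * scale_b; correlation does not.
print(cov_rescaled / cov_original, corr_rescaled - corr_original)
```

This is one reason a covariance is only interpretable alongside a stated normalization, such as the unit-variance phenotypes assumed in the GWAS-by-subtraction notes.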
-where the last line follows from the Projection Theorem (pg. 345 in Grimmett and Stirzaker[@grimmett2020probability]). Where before we needed to assume $\mathbb{Cov}(E,G)=0$, here this property is automatic.
+where the last line follows from the Projection Theorem (pg. 345 in Grimmett and Stirzaker[@grimmett2020probability]). Where before we needed to assume $\mathrm{Cov}(E,G)=0$, here this property is automatic.


 - This approach has the **advantage** of its mathematical clarity. Whereas the standard definition of heritability requires some fairly restrictive assumptions, this alternative definition is applicable to any phenotype representable by a random variable in $L_2$. Mathematically, it is now crystal clear what we mean when we speak of $G$ and $E$.