-
Notifications
You must be signed in to change notification settings - Fork 0
Expand file tree
/
Copy pathloading-tree.qmd
More file actions
188 lines (126 loc) · 9.17 KB
/
Copy pathloading-tree.qmd
File metadata and controls
188 lines (126 loc) · 9.17 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
# Loading a Phylogenetic Tree {#sec-itol-load}
## Introduction
As noted in @sec-introduction, you will generally encounter phylogenetic trees as a plain text file in one of a handful of common file formats, such as:
- [`newick`](https://en.wikipedia.org/wiki/Newick_format), file ending `.new`, `.nwk`, `.newick`
- [`NEXUS`](https://en.wikipedia.org/wiki/Nexus_file), file ending `.nex`, `.nexus`
- [`phylip`](https://en.wikipedia.org/wiki/PHYLIP), file ending `.phy`, `.phylip`
Whichever software tool you use to read and visualise your tree, you will need to _load_ your data into that tool. In `iToL` you need to upload the data to the server. In `figtree` or `dendroscope` you would open the file in the application. Using `ggtree` in `R`, you would write code to open the file and run commands to visualise it.
::: { .callout-note }
In this workshop we will be using the online `iToL` service to visualise and interpret trees. This does not require you to install any software on your machine. The `iToL` service is available at the link below.
- [**`iToL`**](https://itol.embl.de/)
:::
.](assets/images/itol-landing.png){#fig-itol-landing width=80%}
## Load your tree data
Your tree data is in a file called `tree_newick.nwk`. This tree file describes a phylogeny with 13 samples, and it should be downloaded to your computer from the link below.
::: { .callout-note }
If you single-click on the link below, the tree file will open in your browser.
If you right-click on the link, you will see a _context menu_ with the option `Save link as…` (or similar, depending on your operating system/browser). This will allow you to save the file to your computer.
:::
- [**`tree_newick.nwk`**](https://raw.githubusercontent.com/sipbs-compbiol/BM211-Workshop-5/main/assets/trees/tree_newick.nwk)
::: { .callout-important title="Task" }
Download the tree file `tree_newick.nwk` to your computer.
:::
::: { .callout-tip collapse="true" }
## I need a hint!
Right-click on the link above and use `Save link as…` to save the file on your computer.
:::
::: { .callout-important title="Task" }
Upload the tree file `tree_newick.nwk` to `iToL`.
:::
::: { .callout-tip collapse="true" }
## I need a hint!
- Click on the `Upload` button (@fig-itol-upload-button) to see the tree file upload page
- Follow the instructions on the page
{#fig-itol-upload-button width=-80%}
:::
::: { .callout-warning collapse="true" }
## Help! I'm stuck!
- Click on the `Upload` button (@fig-itol-upload-button) to see the tree file upload page
- Enter a name for the tree in the `Tree name:` field.
- Click on the `Browse` button, and select your downloaded `tree_newick.nwk` file. The name of the file will appear next to the button.
- Click on the `Upload` button
{#fig-itol-tree-upload width=80%}
:::
::: { .callout-caution collapse="true" }
## Check your result
After uploading the tree, `iTol` should present the default tree view, as in (@fig-itol-tree-newick-01).
{#fig-itol-tree-newick-01 width=80%}
:::
## Understanding your tree
Considering the `newick` file contents you saw in @sec-introduction (and shown below), you might expect that the tree shows relationships between species represented by the letters `A`-`M` in that file:
```text
(((((((A:4,B:4):6,C:5):8,D:6):3,E:21):10,((F:4,G:12):14,H:8):13):13,((I:5,J:2):30,(K:11,L:11):2):17):4,M:56);
```
The tree `iTol` presents at first (@fig-tree-newick-01) shows these relationships.
{#fig-tree-newick-01 width=80%}
- the tree is a set of _bifurcating branches_ that spread out from the most recent common ancestor (MRCA) and represent an **estimated** evolutionary history
- this is a _rooted tree_, so the most recent common ancestor (MRCA) is considered to be represented at the very far left of the tree, at its _root_
- **in this tree, _ancestors_ are on the left, and _descendants_ are on the right**
- the horizontal dimension in this plot shows the amount of genetic change since the MRCA - less change towards the left-hand side, more towards the right
- **this tree is a _phylogram_, which means that the lengths of the branches are meaningful and represent the amount of genetic change**
::: { .callout-note }
Branch lengths are usually drawn in units of _substitutions per site_ - the estimated total number of nucleotide substitutions, divided by (_normalised to_) the length of the sequence.
You may sometimes see alternative units being used, such as estimated time, or the percentage of sites that have changed.
:::
::: { .callout-important collapse="true" }
## Question 01
In @fig-tree-newick-01 which species shows the most genetic change since the MRCA?
- `A`
- `E`
- `G`
- `K`
:::
### Trees and time
The general relationship between the phylogram you have made and time is shown in @fig-tree-newick-02.
{#fig-tree-newick-02 width=80%}
As the tree represents genetic change, and the rate of genetic change may not be constant in all sequences or organisms, we can't immediately interpret branch lengths in terms of _elapsed time_. But, when the sequences/organisms used to build the tree are known to exist _now_, we can say that the leaves of the tree approximately represent the current date.
::: { .callout-note }
There are phylogenetic techniques that allow us to convert genetic distances into approximate times, but they are beyond the scope of this workshop.
:::
- **currently existing species (`A`-`M`) - the input for building the tree - are shown at the _leaves_** (or _leaf nodes_) of the tree, at the ends of the _branches_
- the branching events (_bifurcations_) show how evolutionary lineages split, as the amount of genetic change since the MRCA increases from left to right; they represent events in evolution where a population divides into two subgroups the go on to have distinct genetic histories
- **in a biological context, this process is often referred to as _speciation_**
- **the order of bifurcation since the MRCA can be interpreted as the order in which speciation events occurred**
- the amount of genetic change between branching events is represented by the length of the branch that separates them; the scale bar allows you to interpret the branch length quantitatively
::: { .callout-important collapse="true" }
## Question 02
@fig-tree-newick-01 implies that which speciation event was the earliest in time?
- the speciation separating `A` and `B`
- the speciation separating `B` and `C`
- the speciation separating `B` and `D`
- the speciation separating `C` and `E`
:::
### Ancestry and history
Phylogenetic trees represent patterns of shared ancestry and history between lineages. For instance, in @fig-tree-newick-03 common and unique ancestors of `A`, `B` and `C` are indicated, as are common and unique histories of `I` and `J`.
{#fig-tree-newick-03 width=80%}
::: { .callout-important collapse="true" }
## Question 03
In @fig-tree-newick-01 which of the following accumulated the greatest amount of genetic change?
- the unique history of `E`
- the shared history of `I` and `J`
- the shared history of `I`, `J`, `K`, and `L`
- the unique history of `D`
:::
### Clades
A _clade_ is a grouping on the tree that includes a common ancestor and all of its descendants. We call a group with these properties _monophyletic_ (i.e. it comprises a single phylum). @fig-tree-newick-04 shows examples of clades in the `iToL` tree you generated.
{#fig-tree-newick-04 width=80%}
::: { .callout-important collapse="true" }
## Question 04
In @fig-tree-newick-01 which of the following groups of leaf nodes form a _clade_?
1. (`J`, `K`, `L`)
2. (`F`, `G`, `H`)
3. (`D`, `E`)
4. (`K`, `L`)
:::
## Summary
::: { .callout-note title="Well Done!"}
After successfully working through this section you should be able to:
- upload a phylogenetic tree into `iToL`
- explain the meaning of branching events and branch lengths in a phylogenetic tree, and interpret the speciation events in a tree
- explain how a phylogenetic tree represents history and ancestry
- explain the concept of a _clade_ in phylogenetics, and identify one on a phylogenetic tree
:::
::: { .callout-important }
**Please answer the questions below in the formative quiz on MyPlace**
- [MyPlace formative quiz]({{< var myplace.quiz1 >}})
:::