In this lab we'll begin to explore phylogenetic analysis. A single lab session is not enough time to gain experience with all of the most common methods, or to explore, adjust, and rerun an analysis. Instead, we'll use one method, work through the analysis, and try to understand the results.
We will use an interesting protein, SRY, the male-determining protein in mammals and some other groups.
Linked above are the sequences of the SRY polypeptide. One file contains SRY from human, gorilla, marmoset, horse, pig, sheep, and European bison. The second file contains SRY from a marsupial, the dunnart. Use the dunnart as the outgroup in your analysis.
- 1a. Examine the NCBI phylogeny for this group of organisms. Use the "Taxonomy common tree" tool linked from the Taxonomy section of NCBI. Include a screenshot of the NCBI phylogeny.
- 1b. Why has the dunnart sequence been chosen as the outgroup? Explain briefly.
- 2a. Perform a multiple alignment of these SRY proteins using ClustalW on the EBI web site. Note the perfect conservation of the amino acids MNAF (at about position 70). Are these amino acids useful in phylogenetic analysis of this group of sequences?
- 2b. Clustal uses a neighbor-joining algorithm to construct its guide tree. Does the Clustal guide tree agree precisely with the NCBI phylogeny? (yes or no).
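If you want to locate perfectly conserved positions such as MNAF without scanning the alignment by eye, a short script can do it. The following is only a sketch: the function name `conserved_columns` and the toy sequences are made up for illustration, and it assumes the aligned sequences are already loaded as equal-length strings with `-` as the gap character.

```python
def conserved_columns(seqs):
    """Return 0-based indices of columns where every sequence
    carries the same residue (columns containing gaps are skipped)."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("aligned sequences must all have the same length")
    conserved = []
    for i in range(length):
        first = seqs[0][i]
        # A column is conserved if it is not a gap and all rows match row 0.
        if first != "-" and all(s[i] == first for s in seqs):
            conserved.append(i)
    return conserved


# Toy alignment (hypothetical, not the real SRY data)
toy = ["MNAFIV", "MNAFLV", "MNAF-V"]
print(conserved_columns(toy))  # [0, 1, 2, 3, 5]
```

The script only reports where the conserved columns are. Whether such columns help to distinguish between candidate trees is the question asked in 2a.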
Use Jalview to trim the poorly aligned N-terminal and C-terminal regions from the alignment, leaving an alignment of roughly 100-170 aa. Export this alignment in FASTA format.
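The trimming step amounts to keeping the same block of columns from every aligned sequence and writing the result back out as FASTA. The sketch below illustrates that idea in plain Python. It is not how Jalview works internally, and the helper names (`parse_fasta`, `trim_columns`, `to_fasta`) and the toy sequences are assumptions made for this example.

```python
def parse_fasta(text):
    """Parse FASTA text into a dict of {name: sequence}."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the sequence name.
            name = line[1:].split()[0]
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def trim_columns(records, start, end):
    """Keep alignment columns start..end-1 (0-based, like a Python slice)."""
    lengths = {len(seq) for seq in records.values()}
    if len(lengths) > 1:
        raise ValueError("sequences are not aligned to the same length")
    return {name: seq[start:end] for name, seq in records.items()}


def to_fasta(records, width=60):
    """Format records as FASTA text, wrapping sequences at `width` characters."""
    lines = []
    for name, seq in records.items():
        lines.append(">" + name)
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"


# Toy aligned input (hypothetical sequences, not the real SRY alignment)
aligned = parse_fasta(">seqA\n--MNAFIVW--\n>seqB\nM-MNAFLVWS-\n")
print(to_fasta(trim_columns(aligned, 2, 9)))
```

Because every sequence is cut at the same column positions, the trimmed sequences stay aligned with one another, which is what the tree-building program requires.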
We'll use the PHYLIP PROTPARS program to generate phylogenetic trees. Check the bootstrap option, enter a random seed, and set the number of replicates to 100. Check the 'Compute a consensus tree' box. This runs CONSENSE on the output of PROTPARS. Under "Other options", indicate your outgroup sequence. The outgroup number is the position of that sequence in the input file. Run the analysis.
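To see what the bootstrap option is doing, it can help to build one replicate by hand. A bootstrap replicate is a new alignment of the same length, made by sampling columns from the original alignment with replacement, and a tree is then built from each replicate. The sketch below shows only the resampling step. The function name `bootstrap_columns` and the toy alignment are hypothetical, and this is not the PHYLIP implementation.

```python
import random


def bootstrap_columns(seqs, rng):
    """Return one bootstrap replicate of an alignment.

    Columns are drawn with replacement, and the same drawn columns are
    used for every sequence so that the rows stay aligned.
    """
    n_cols = len(seqs[0])
    picks = [rng.randrange(n_cols) for _ in range(n_cols)]
    return ["".join(seq[i] for i in picks) for seq in seqs]


# Toy alignment (hypothetical); the seed makes the run repeatable,
# which is why the lab asks you to enter a random seed.
rng = random.Random(12345)
toy = ["MNAFIV", "MNAFLV", "MKAFLI"]
for replicate_number in range(3):
    print(bootstrap_columns(toy, rng))
```

Some columns appear more than once in a replicate and others not at all, so a grouping that depends on only a few columns will be missing from many of the replicate trees.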
- 3a. What type of algorithm does PROTPARS use? Give a one or two sentence answer.
- 3b. Does PROTPARS generate rooted or unrooted trees?
- 3c. From the outfile, copy the first tree and a tree that shows a different relationship between the sequences to your lab report. Use a monospaced font.
- 4a. The Mobyle server should have run CONSENSE on the output of PROTPARS (see #3 above). The CONSENSE output appears below the PROTPARS output. CONSENSE generates a consensus tree from the set of trees produced by PROTPARS. Copy the consensus tree to your lab report.
- 4b. Compare the consensus PROTPARS tree to the NCBI phylogeny. Indicate if they are in perfect agreement or briefly describe how the consensus tree differs from the NCBI tree.
- 4c. Briefly explain the meaning of the numbers on the consensus tree.
- 4d. Give an example of a well-supported clade in the tree in 4c (list the species in the clade in a format like human-pig-dog) and an example of an unreliable clade. If no clades are unreliable, just indicate that.
- 4e. For one of the sets not included in the consensus tree, give the line from the output file and indicate which sequences are in this set.
Example set line:
.***... 7.50
- 5. Indicate two things you could do to improve this phylogenetic analysis.
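The example set line in 4e can be read mechanically: each character in the pattern corresponds to one sequence, an asterisk marks a sequence that belongs to the set, and the trailing number is how many times the set occurred among the trees. Here is a minimal sketch of that reading. The function name `decode_set_line` and the placeholder names `sp1` to `sp7` are hypothetical, and it assumes you supply the names in the same order that the output file uses for the pattern columns.

```python
def decode_set_line(line, species):
    """Split a set line such as '.***... 7.50' into (members, count).

    `species` must list the sequence names in the same order as the
    pattern columns in the output file (an assumption of this sketch).
    """
    pattern, value = line.rsplit(None, 1)
    # Long patterns may be printed in space-separated blocks; join them.
    pattern = pattern.replace(" ", "")
    if len(pattern) != len(species):
        raise ValueError("pattern length does not match the number of species")
    members = [name for mark, name in zip(pattern, species) if mark == "*"]
    return members, float(value)


# Placeholder names; replace with the names and order from your own output.
species = ["sp1", "sp2", "sp3", "sp4", "sp5", "sp6", "sp7"]
print(decode_set_line(".***... 7.50", species))  # (['sp2', 'sp3', 'sp4'], 7.5)
```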
BIO520