Bioinformatics assignment-1

The document discusses Sanger sequencing and Next Generation Sequencing (NGS), highlighting Sanger's accuracy for specific mutations and NGS's broader application. It compares UPGMA and Neighbor Joining methods for phylogenetic analysis, emphasizing the limitations of UPGMA in variable evolution rates. Additionally, it includes a distance matrix for sequence comparison and outlines clustering steps for constructing phylogenetic trees.

Uploaded by molemosaul

Bioinformatics Module 2: ABBFY5A

Assignment 2

Next Generation Sequencing and Phylogenetic Analysis

Question 1

1.1 Briefly explain the principle behind Sanger sequencing (5 marks):

Sanger sequencing, also called the chain-termination method, uses modified nucleotides
called dideoxynucleotides (ddNTPs). A ddNTP lacks the 3'-OH group needed to attach the
next nucleotide, so synthesis of the strand stops wherever a ddNTP is incorporated.
Because this happens at random positions, DNA synthesis produces fragments of many
different lengths, which are separated by size using gel or capillary electrophoresis.
Reading the terminal (chain-terminating) base of each fragment, from the shortest
fragment to the longest, reconstructs the DNA sequence.
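The principle can be illustrated with a small simulation. This is only a sketch under simplifying assumptions: the template is written 3'→5' so the new strand is its position-by-position complement, termination is a fixed-probability random event, and the names `terminated_fragments`, `read_sequence` and `ddntp_fraction` are hypothetical, not part of any sequencing software.

```python
import random

# Watson-Crick pairing used to build the strand complementary to the template.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def terminated_fragments(template, reads=2000, ddntp_fraction=0.2, seed=0):
    """Simulate chain termination and return the distinct terminated strands."""
    rng = random.Random(seed)
    new_strand = "".join(COMPLEMENT[base] for base in template)
    fragments = set()
    for _ in range(reads):
        for length in range(1, len(new_strand) + 1):
            # With probability ddntp_fraction a ddNTP is incorporated here;
            # lacking a 3'-OH, it blocks any further extension.
            if rng.random() < ddntp_fraction:
                fragments.add(new_strand[:length])
                break
    return fragments


def read_sequence(fragments):
    # Electrophoresis orders fragments by length; the terminal base of each
    # fragment contributes one base of the read.
    return "".join(frag[-1] for frag in sorted(fragments, key=len))


template = "TACGGATCCGTA"  # made-up template, written 3'->5'
print(read_sequence(terminated_fragments(template)))
```

With enough simulated reads, every fragment length is represented, and the printed read is the complement of the template.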

1.2 Explain a situation that would warrant a researcher to use Sanger sequencing
rather than NGS (4 marks):

Sanger sequencing is the better choice when a small, selected number of DNA sequences
must be read with high accuracy. An example is clinical diagnostics, where a single gene
is sequenced to confirm a mutation that is already known to the researcher. Sanger
sequencing is very accurate and is therefore well suited to checking one or a few
specific point mutations in particular genes, whereas NGS is more practical when the
diagnosis requires the whole genome or the whole exome.

1.3 Select the incorrect statement about Sanger and NGS (3 marks):

The incorrect statement is:

• "In next-generation sequencing, one needs to have prior knowledge of the sequence
  identity of the 3' and 5' ends of the gene to be sequenced."

NGS does not require prior knowledge of the sequence ends, because it sequences
randomly fragmented DNA, an approach known as shotgun sequencing.

1.4 Select the correct workflow for library generation in NGS (3 marks):

The correct workflow is:

• Shearing/fragmentation of the target → Generate blunt-end fragments → Addition of an
  'A' base to the 3' end of each strand → Ligation of adapters to the fragments → PCR
  amplification of the target.

Question 2

2.1 UPGMA method for phylogenetic analysis and comparison to Neighbor Joining (6 marks):

UPGMA (Unweighted Pair Group Method with Arithmetic Mean) is a simple method of
constructing a phylogenetic tree from a distance matrix that records the dissimilarity
between each pair of taxa. At every step it groups the two taxa or clusters separated by
the smallest distance and recalculates the distance from the new cluster to the others as
an average. It assumes a constant rate of evolution in all lineages and gives a rooted
tree. The main limitation is that this assumption often does not hold: if the rates of
evolution are unequal, UPGMA can group the wrong taxa together and produce an inaccurate
tree. Neighbor Joining builds the tree without the restrictive assumption that the rate
of evolution is constant. At each step it joins the pair of taxa that minimises the total
branch length of the tree, so it is more accurate when evolution rates vary. Neighbor
Joining results in an unrooted tree of the phylogenetic relationships among the taxa.
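The difference between the two selection rules can be seen on a small made-up example. The sketch below is illustrative only: the distances are hypothetical (not from the assignment) and correspond to an unrooted tree ((A,B),(C,D)) in which A and C sit on short branches (length 1), B and D on long branches (length 4), and the internal branch has length 1. The function names are assumptions for this example.

```python
from itertools import combinations

# Hypothetical additive distances for the tree ((A,B),(C,D)) with unequal rates.
DIST = {
    ("A", "B"): 5, ("A", "C"): 3, ("A", "D"): 6,
    ("B", "C"): 6, ("B", "D"): 9, ("C", "D"): 5,
}
TAXA = ["A", "B", "C", "D"]


def d(x, y):
    # Look up a distance regardless of the order of the pair.
    return DIST[(x, y)] if (x, y) in DIST else DIST[(y, x)]


def upgma_first_pair():
    # UPGMA joins the pair with the smallest raw distance.
    return min(combinations(TAXA, 2), key=lambda p: d(*p))


def nj_first_pair():
    # Neighbor Joining joins the pair minimising
    # Q(i, j) = (n - 2) * d(i, j) - sum_k d(i, k) - sum_k d(j, k).
    n = len(TAXA)
    total = {t: sum(d(t, u) for u in TAXA if u != t) for t in TAXA}
    return min(combinations(TAXA, 2),
               key=lambda p: (n - 2) * d(*p) - total[p[0]] - total[p[1]])


print("UPGMA joins first:", upgma_first_pair())
print("NJ joins first:   ", nj_first_pair())
```

In this example UPGMA pairs the two slowly evolving taxa A and C because their raw distance is smallest, while the Neighbor Joining criterion corrects for the long branches and selects a true neighbour pair.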

2.2 Value of constructing phylogenetic trees in bioinformatics studies (1 mark):

A phylogenetic tree shows the inferred evolutionary relationships among species or genes.
It lets researchers see how closely related they are, group genes by shared ancestry, and
trace the transmission of diseases and the evolution of life forms.

2.3 Construct the distance matrix (8 marks):

Using the following sequences:

• A: TCGCCGGGTTTATATATACG
• B: ACGCCGGGTTTATATATACC
• C: TGGCCGGCTATATATAAACG
• D: TCGCCGGGAATATATATAGC
• E: AAGCCGGGTTTATATAGGGG
• F: ACGCCGGGTTTATATATACG

Here is the pairwise distance matrix (the number of positions at which the two sequences
differ, counted from the sequences listed above):

   A  B  C  D  E  F
A  -  2  4  4  5  1
B  -  -  6  4  5  1
C  -  -  -  6  7  5
D  -  -  -  -  7  5
E  -  -  -  -  -  4
F  -  -  -  -  -  -
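The counting behind such a matrix can be sketched in a few lines of Python, which is also a convenient way to check each entry by machine. The sketch assumes the six sequences are already aligned and of equal length; the names `hamming` and `distance_matrix` are illustrative.

```python
from itertools import combinations

SEQUENCES = {
    "A": "TCGCCGGGTTTATATATACG",
    "B": "ACGCCGGGTTTATATATACC",
    "C": "TGGCCGGCTATATATAAACG",
    "D": "TCGCCGGGAATATATATAGC",
    "E": "AAGCCGGGTTTATATAGGGG",
    "F": "ACGCCGGGTTTATATATACG",
}


def hamming(s, t):
    """Number of positions at which two equal-length sequences differ."""
    if len(s) != len(t):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(s, t))


def distance_matrix(seqs):
    # One entry per unordered pair, i.e. the upper triangle of the matrix.
    return {(x, y): hamming(seqs[x], seqs[y]) for x, y in combinations(seqs, 2)}


matrix = distance_matrix(SEQUENCES)
names = list(SEQUENCES)
print("  " + " ".join(names))
for x in names:
    row = [str(matrix[(x, y)]) if (x, y) in matrix else "-" for y in names]
    print(x + " " + " ".join(row))
```

The script prints the upper triangle in the same layout as the table, with "-" for the diagonal and the lower triangle.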

2.4 Using the distance matrix method, construct the phylogenetic tree using UPGMA.

2.5 Construct the phylogenetic tree using MEGA and include a screenshot.

2.6 Second cluster and respective distance

Original distance matrix (taxa A to E):

   A  B  C  D  E
A  -  2  4  3  5
B  -  -  6  7  3
C  -  -  -  4  8
D  -  -  -  -  6
E  -  -  -  -  -

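Identifying the first cluster:

The smallest distance in the matrix is 2, between A and B, so the first cluster is (A-B).

Under UPGMA, the distance from (A-B) to each remaining taxon is the average of the
distances from A and from B:

d((A-B), C) = (4 + 6) / 2 = 5
d((A-B), D) = (3 + 7) / 2 = 5
d((A-B), E) = (5 + 3) / 2 = 4

After clustering A and B, the updated distance matrix is:

        (A-B)  C  D  E
(A-B)     -    5  5  4
C         -    -  4  8
D         -    -  -  6
E         -    -  -  -

Next smallest distance:

The smallest distance is now 4, and it occurs twice: between (A-B) and E, and between C
and D. The tie can be broken in either order without changing the final tree. Taking
(A-B) and E first gives the second cluster (A-B-E) at distance 4.

d((A-B-E), C) = (4 + 6 + 8) / 3 = 6
d((A-B-E), D) = (3 + 7 + 6) / 3 ≈ 5.33

The updated matrix is:

          (A-B-E)  C     D
(A-B-E)      -     6   5.33
C            -     -     4
D            -     -     -

Final clustering:

The smallest distance is 4, between C and D, so they form (C-D). The last step joins
(A-B-E) with (C-D) at the average of the six pairwise distances between their members:
(4 + 3 + 6 + 7 + 8 + 6) / 6 ≈ 5.67.

Summary of clusters:

First cluster: (A-B) at distance 2

Second cluster: (A-B-E) at distance 4

Third cluster: (C-D) at distance 4

Final clustering step: (A-B-E) with (C-D) at distance 5.67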
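The clustering procedure can also be sketched in Python as a way to check a hand calculation. This is a minimal illustration, not the output of MEGA or any other package: `cluster_distance` and `upgma` are hypothetical helper names, distances between clusters are taken as the arithmetic mean of all leaf-to-leaf distances, and ties are broken by list order, so merges at equal distance may be reported in a different order than in a manual worked solution.

```python
from itertools import combinations

# Pairwise distances from the five-taxon matrix above.
DIST = {
    ("A", "B"): 2, ("A", "C"): 4, ("A", "D"): 3, ("A", "E"): 5,
    ("B", "C"): 6, ("B", "D"): 7, ("B", "E"): 3,
    ("C", "D"): 4, ("C", "E"): 8,
    ("D", "E"): 6,
}


def leaf_distance(x, y):
    # Look up a distance regardless of the order of the pair.
    return DIST[(x, y)] if (x, y) in DIST else DIST[(y, x)]


def cluster_distance(c1, c2):
    # UPGMA: arithmetic mean of all leaf-to-leaf distances between clusters.
    return sum(leaf_distance(x, y) for x in c1 for y in c2) / (len(c1) * len(c2))


def upgma(taxa):
    clusters = [(t,) for t in taxa]
    merges = []
    while len(clusters) > 1:
        # Ties are broken by taking the first minimal pair in list order.
        c1, c2 = min(combinations(clusters, 2),
                     key=lambda p: cluster_distance(*p))
        merges.append((c1, c2, cluster_distance(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


for left, right, dist in upgma("ABCDE"):
    print("-".join(left), "+", "-".join(right), "at distance", round(dist, 2))
```

Each printed line is one merge, listing the two clusters joined and the average distance at which they were joined.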
