DEPARTMENT OF CHEMICAL ENGINEERING
IIT BOMBAY
CL 662: INTRODUCTION TO COMPUTATIONAL BIOLOGY
Homework 3: Sequence Analysis. Due on 19/2/2018. Marks: 20
Note: 1. You are expected to answer the assignments in your own words. Cite references and
websites / web servers wherever possible. Copying from any source or from other students in the
class (or past students) will result in an FR grade. Both the source and sink will receive the same
penalty in cases of copying. Therefore, please do not share your work with others even after the
deadline.
2. Note that you will need to refer to textbooks, research or review articles, web sources, databases
or web servers to answer some of the questions. You are expected to find an appropriate source
when not mentioned in the assignment. Please cite your source.
3. Report a concise summary of your results and analysis rather than submitting page after page of
output from a software program. Your comments on the results will carry weight.
1. What are the differences between an iterated BLAST search (PSI-BLAST) and a simple BLAST
search? What are the advantages of PSI-BLAST? [2 marks]
2. Show the above difference with an example using the genome sequence provided to you for
the earlier assignment (HW2). Show the PSSM generated from the search and compare it with
BLOSUM62. [6 Marks]
3. Describe in brief the steps involved in the creation of the BLOSUM62 matrix. [2 Marks]
4. What is the difference between PAM and BLOSUM substitution matrices? Please show in a
tabular format. [2 marks]
5. Given the wobble rules for codon-anticodon pairing in bacteria (Table 1), estimate the
minimum number of anticodons needed to recognize all 61 codons in bacteria. Show your
answer in a step-wise manner elucidating the minimum anticodons for each amino acid from
Table 3. Additionally, you may refer to any Biochemistry textbook such as Lehninger
Principles of Biochemistry for further details. [4 marks]
Table 1. The rule for wobble base-pairing between codon and anticodon.
Wobble codon base    Possible anticodon base
U                    A, G, or I
C                    G or I
A                    U or I
G                    C or U
Table 2. Codon preference in E. coli. The numbers represent the average frequency of codon
usage per 1000 codons based upon the DNA sequences of selected genes in E. coli.
6. The genetic code was elucidated with polyribonucleotides synthesized either enzymatically or
chemically in the laboratory. Polynucleotide phosphorylase catalyzes the formation of RNA
polymers starting from ADP, UDP, CDP and GDP. This enzyme requires no template and
makes polymers with a base composition that directly reflects the relative concentrations of
the nucleoside 5'-diphosphate precursors in the medium. E.g., when presented with UDP
only, the enzyme makes poly(U). Assume that this enzyme and all the necessary reagents are
available with you.
a. Given that we now know about the genetic code, how would you make a
polyribonucleotide that could serve as an mRNA coding predominantly for many Phe
residues and a small number of Leu and Ser residues? What other amino acid(s) would
be coded for by this mRNA? [1 mark]
b. If U and C were used in the ratio 80:20 while synthesizing the mRNA, what would be
the proportion of various amino acids in the polypeptide coded by your synthetic mRNA? [3 marks]
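The reasoning behind part (b) can be sketched in code. The following is a minimal sketch, assuming that polynucleotide phosphorylase incorporates each base independently, so that the probability of a codon is the product of the fractions of its three bases. The function name amino_acid_fractions and the 50:50 ratio in the usage lines are hypothetical and chosen only for illustration; the standard genetic code is assumed, and stop codons (shown as "*") are counted like any other codon.

```python
from itertools import product

# Standard genetic code, built in U, C, A, G order (first base varies slowest).
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def amino_acid_fractions(base_fractions):
    """Expected fraction of each amino acid (one-letter code) coded by a
    random copolymer whose bases occur with the given fractions.

    Assumption: bases are incorporated independently, so
    P(codon) = P(base 1) * P(base 2) * P(base 3).
    """
    totals = {}
    for codon, aa in CODON_TABLE.items():
        p = 1.0
        for base in codon:
            p *= base_fractions.get(base, 0.0)
        if p > 0:
            totals[aa] = totals.get(aa, 0.0) + p
    return totals


# Hypothetical example: a copolymer made from equal amounts of U and C.
fractions = amino_acid_fractions({"U": 0.5, "C": 0.5})
for aa, frac in sorted(fractions.items()):
    print(aa, round(frac, 4))
```

Changing the dictionary passed to the function gives the expected composition for any other precursor ratio.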
***************End*****************
Additional practice questions (not required to be submitted)
1. The BLOSUM62 matrix is computed by counting the frequencies of substitutions across
groups of multiple sequence alignments, where each group has 60-80% identity
among the sequences. Given below are three such groups from a block (an ungapped
alignment). Calculate the log-odds scores for the following substitutions: s(T,W); s(T,P);
s(R,A); s(N,Q); s(F,P). Clearly show your calculations while computing the values of
q(T,W); q(T,P); q(R,A); q(N,Q); q(F,P) from the data below. The values of p_a and p_b must be
based only on the amino acid sequences given below. [5 Marks]
Group 1:  T P D E N
          T P D R Q
          T W D E Q
Group 2:  W F D A N
          W A D A Q
Group 3:  P A D R N
Table 1: BLOSUM62 matrix
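Once the observed pair frequency q and the background frequencies p_a and p_b have been counted by hand, the final scoring step can be sketched as below. This is a minimal sketch, assuming the usual BLOSUM convention of half-bit units, i.e. a score of 2 * log2(q / e) rounded to the nearest integer, with expected frequency e = 2 * p_a * p_b for two different residues and e = p_a * p_a for identical residues. The function name log_odds_score and the numbers in the usage line are hypothetical and are not taken from the block above.

```python
import math


def log_odds_score(q_ab, p_a, p_b, same_residue=False):
    """Log-odds substitution score in half-bit units (assumed convention).

    q_ab: observed frequency of the residue pair (a, b)
    p_a, p_b: background frequencies of residues a and b
    """
    # Expected pair frequency if residues paired at random.
    expected = p_a * p_b if same_residue else 2.0 * p_a * p_b
    # Note: Python's round() rounds exact halves to the nearest even integer.
    return round(2.0 * math.log2(q_ab / expected))


# Hypothetical values, for illustration only.
print(log_odds_score(q_ab=0.01, p_a=0.1, p_b=0.2))
```

A positive score means the pair is observed more often than expected by chance; a negative score means it is observed less often.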
2. Messenger RNA Translation Predict the amino acid sequences of peptides formed by
ribosomes in response to the following mRNA sequences, assuming that the reading frame
begins with the first three bases in each sequence.
(a) GGUCAGUCGCUCCUGAUU
(b) UUGGAUGCGCCAUAAUUUGCU
(c) CAUGAUGCCUGUUGCUAC
(d) AUGGACGAA
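Answers to this kind of question can be checked with a short script. The following is a minimal sketch, assuming the standard genetic code, a reading frame that starts at the first base, and translation that stops at the first stop codon. The function name translate and the example sequence are hypothetical; the example is deliberately not one of the sequences above.

```python
from itertools import product

# Standard genetic code, built in U, C, A, G order (first base varies slowest).
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(mrna):
    """Translate an mRNA (5' to 3') into one-letter amino acid codes.

    Reading starts at the first base; translation ends at the first stop
    codon or when fewer than three bases remain.
    """
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":  # stop codon
            break
        peptide.append(aa)
    return "".join(peptide)


# Hypothetical sequence, for illustration only.
print(translate("AUGUUUUAA"))
```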
3. How Many Different mRNA Sequences Can Specify One Amino Acid Sequence?
Write all the possible mRNA sequences that can code for the simple tripeptide segment Leu–
Met–Tyr. Your answer will give you some idea about the number of possible mRNAs that
can code for one polypeptide.
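The enumeration can also be done programmatically by taking every combination of synonymous codons. The following is a minimal sketch, assuming the standard genetic code and one-letter amino acid codes; the function names codons_for and possible_mrnas and the dipeptide in the usage line are hypothetical.

```python
from itertools import product

# Standard genetic code, built in U, C, A, G order (first base varies slowest).
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codons_for(amino_acid):
    """All codons that code for the given one-letter amino acid."""
    return [codon for codon, aa in CODON_TABLE.items() if aa == amino_acid]


def possible_mrnas(peptide):
    """Every mRNA sequence that could code for the peptide."""
    choices = [codons_for(aa) for aa in peptide]
    return ["".join(combo) for combo in product(*choices)]


# Hypothetical dipeptide Met-Phe, for illustration only.
for mrna in possible_mrnas("MF"):
    print(mrna)
```

The number of sequences is the product of the number of codons for each residue, which grows rapidly with peptide length.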
4. Can the Base Sequence of an mRNA Be Predicted from the Amino Acid Sequence of
Its Polypeptide Product? A given sequence of bases in an mRNA will code for one and only
one sequence of amino acids in a polypeptide, if the reading frame is specified. From a given
sequence of amino acid residues in a protein such as cytochrome c, can we predict the base
sequence of the unique mRNA that coded it? Give reasons for your answer.
5. Coding of a Polypeptide by Duplex DNA The template strand of a segment of
double-helical DNA contains the sequence
(5’)CTTAACACCCCTGACTTCGCGCCGTCG(3’)
(a) What is the base sequence of the mRNA that can be transcribed from this strand?
(b) What amino acid sequence could be coded by the mRNA in (a), starting from the 5’ end?
(c) If the complementary (nontemplate) strand of this DNA were transcribed and translated,
would the resulting amino acid sequence be the same as in (b)? Explain the biological
significance of your answer.
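The transcription step in part (a) can be sketched in code. The following is a minimal sketch, assuming that the template strand is written 5' to 3' and that the mRNA is its antiparallel complement, also reported 5' to 3', with U in place of T. The function name transcribe_template and the short template in the usage line are hypothetical.

```python
# RNA base that pairs with each DNA base of the template strand.
COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe_template(template_5_to_3):
    """Return the mRNA (5' to 3') transcribed from a DNA template strand
    that is itself written 5' to 3'.

    The strands are antiparallel, so the template is read in reverse.
    """
    return "".join(COMPLEMENT[base] for base in reversed(template_5_to_3.upper()))


# Hypothetical template, for illustration only.
print(transcribe_template("TTACAT"))
```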
6. Methionine Has Only One Codon Methionine is one of two amino acids with only one
codon. How does the single codon for methionine specify both the initiating residue and
interior Met residues of polypeptides synthesized by E. coli?
7. Predicting Anticodons from Codons Most amino acids have more than one codon and
attach to more than one tRNA, each with a different anticodon. Write all possible
anticodons for the four codons of glycine: (5’)GGU, GGC, GGA, and GGG.
(a) From your answer, which of the positions in the anticodons are primary determinants of
their codon specificity in the case of glycine?
(b) Which of these anticodon-codon pairings has/have a wobbly base pair?
(c) In which of the anticodon-codon pairings do all three positions exhibit strong Watson-
Crick hydrogen bonding?
8. Effect of Single-Base Changes on Amino Acid Sequence Much important confirmatory
evidence on the genetic code has come from assessing changes in the amino acid sequence of
mutant proteins after a single base has been changed in the gene that encodes the protein.
Which of the following amino acid replacements would be consistent with the genetic code if
the replacements were caused by a single base change? Which cannot be the result of a
single-base mutation? Why?
(a) Phe→Leu (b) Ile→Leu (c) Lys→Ala (d) His→Glu (e) Ala→Thr (f) Pro→Ser
(g) Phe→Lys
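One way to check such replacements is to compare every codon of the first amino acid with every codon of the second and look for a pair that differs at exactly one position. The following is a minimal sketch, assuming the standard genetic code and one-letter amino acid codes; the function name single_base_change_possible and the pairs in the usage lines are hypothetical and are not taken from the list above.

```python
from itertools import product

# Standard genetic code, built in U, C, A, G order (first base varies slowest).
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def single_base_change_possible(aa_from, aa_to):
    """True if some codon for aa_from differs from some codon for aa_to
    at exactly one of the three positions."""
    codons_from = [c for c, aa in CODON_TABLE.items() if aa == aa_from]
    codons_to = [c for c, aa in CODON_TABLE.items() if aa == aa_to]
    for c1 in codons_from:
        for c2 in codons_to:
            if sum(x != y for x, y in zip(c1, c2)) == 1:
                return True
    return False


# Hypothetical pairs, for illustration only.
print(single_base_change_possible("G", "A"))  # Gly -> Ala
print(single_base_change_possible("M", "W"))  # Met -> Trp
```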
9. Resistance of the Genetic Code to Mutation The following RNA sequence represents the
beginning of an open reading frame. What changes (if any) can occur at each position without
generating a change in the encoded amino acid residue?
(5’)AUGAUAUUGCUAUCUUGGACU
***************End*****************