Unix Commands for Computational Biology

Uploaded by

Krish krish

DEPARTMENT OF BIOTECHNOLOGY
KUMARAGURU COLLEGE OF TECHNOLOGY

P18BTI2203L: Computational Biology Laboratory
Academic Year: 2021-22
1 - Unix Commands & Scripting
Instructor: Dr. Ram K
Scribes: R18-MBT1

Each question should be addressed and codes for each section should be marked correctly

Answer all the following [(CO1,5), (K2)]
1. (5 points) A listing of all processes that you are currently running on the machine you are using, sorted by the command
name in reverse alphabetical order. The output should consist only of the processes you are running.
2. (5 points) The number of words in the file /usr/dict/words (*) which contain all of the letters "ass", "bae", "zer". List
them individually. [1]
3. (5 points) A "long" listing of the largest 5 files in the /etc directory whose name contains the string ".conf", sorted by
decreasing file size.
4. (5 points) Create multiple folders whose names start with 20MBTxxx, where xxx runs over (001..018). Copy a file called
"sample" into all the folders.
5. (2 points) List all files in the tmp directory owned by root.
6. (3 points) Create a file "detail" which contains the names of those files in the My Documents directory which begin with
"a" and which have been modified in the last three days.
7. (5 points) Display the date in the mm/dd/yy format, along with the present time in AM/PM.
8. (10 points) Create a file called places whose sample data is as follows and answer the questions below. [2]

bombay   india     45677  asia
karachi  pakistan  54876  Asia
nairobi  Kenya     32196  africa

(a) List the details for the countries usa, kenya and canada
(b) List the details for the continent asia, ignoring case
(c) Display the list of those countries whose population is between 40000 and 60000
(d) Extract the lines which end with "fa"

Question: 1 2 3 4 5 6 7 8  Total
Points:   5 5 5 5 2 3 5 10 40
Score:

Course Coordinator
****
[1] Note: On some Unix/Linux systems, the dictionary has the filename /usr/share/dict/words
[2] Create as many entries as you require for the solution
Experiment 1-UNIX commands
# Process monitoring

> top -d 5 -b | grep -i "COMMAND" -A 15
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1685 bioinfo 20 0 1739020 140976 60364 S 1.0 3.6 0:44.08 cinnamon
4722 bioinfo 20 0 41784 3712 3120 R 0.4 0.1 0:00.02 top
2318 bioinfo 20 0 1321048 249192 117016 S 0.2 6.3 1:29.26 chrome
2379 bioinfo 20 0 585604 118740 57660 S 0.2 3.0 0:20.54 chrome
4081 bioinfo 20 0 1026908 242260 75716 S 0.2 6.1 0:34.96 chrome
4554 root 20 0 0 0 0 S 0.2 0.0 0:00.30 kworker/u16:0
1 root 20 0 119784 5908 3956 S 0.0 0.1 0:01.40 systemd
2 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kthreadd
4 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/0:0H
6 root 20 0 0 0 0 S 0.0 0.0 0:00.03 ksoftirqd/0
7 root 20 0 0 0 0 S 0.0 0.0 0:00.77 rcu_sched
8 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcu_bh
9 root rt 0 0 0 0 S 0.0 0.0 0:00.00 migration/0
10 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 lru-add-drain
11 root rt 0 0 0 0 S 0.0 0.0 0:00.00 watchdog/0

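Question 1 asks only for the current user's processes, sorted by command name in reverse alphabetical order, whereas top lists every user's processes ordered by CPU use. A minimal sketch of one way to get closer to the question, assuming a ps that accepts -u and -o; the helper name sort_by_command_desc is hypothetical:

```shell
#!/bin/sh
# Sketch only: list the current user's processes, sorted by command name
# in reverse alphabetical order. Assumes a ps that supports -u and -o.

# Sort "PID COMMAND" lines on the second field, descending.
# -b ignores the leading blanks that ps uses to right-align the PID column.
# sort_by_command_desc is a hypothetical helper name.
sort_by_command_desc() {
  LC_ALL=C sort -b -r -k2,2
}

# "pid=,comm=" suppresses the header line so it is not sorted with the data.
if command -v ps >/dev/null 2>&1; then
  ps -u "$(id -u)" -o pid=,comm= | sort_by_command_desc
fi
```

Sorting is done by a separate sort step rather than a ps-specific sort option, so the same helper works on any "PID COMMAND" listing.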
# Word count
> cd /usr/share/dict/
bioinfo@bioinfo-OptiPlex-380 /usr/share/dict $ ls
american-english british-english cracklib-small [Link]-wordlist words [Link]-dictionaries-common
bioinfo@bioinfo-OptiPlex-380 /usr/share/dict $ grep -i "ass" words | wc -w
710
bioinfo@bioinfo-OptiPlex-380 /usr/share/dict $ grep -i "bae" words | wc -w
8
bioinfo@bioinfo-OptiPlex-380 /usr/share/dict $ grep -i "zer" words | wc -w
144

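The three counts above can also be produced in one loop. The following is a sketch, assuming the dictionary lives at /usr/share/dict/words and holds one word per line; count_matches and the WORDS variable are hypothetical names:

```shell
#!/bin/sh
# Sketch only: count dictionary entries containing a substring, one count
# per pattern. The default path is an assumption; override it with WORDS=...
WORDS=${WORDS:-/usr/share/dict/words}

# count_matches PATTERN FILE (hypothetical helper)
count_matches() {
  # -i ignores case, -c prints the number of matching lines
  grep -i -c -- "$1" "$2"
}

if [ -r "$WORDS" ]; then
  for pat in ass bae zer; do
    printf '%s: %s\n' "$pat" "$(count_matches "$pat" "$WORDS")"
  done
fi
```

Dropping -c prints the matching words themselves instead of their count.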
# Long listing
bioinfo@bioinfo-OptiPlex-380 ~ $ cd /etc/
bioinfo@bioinfo-OptiPlex-380 /etc $ ls -l -a -s -s /etc/*.conf | head -5
4 -rw-r--r-- 1 root root 3028 Nov 24 2017 /etc/[Link]
4 -rw-r--r-- 1 root root 112 Jan 10 2014 /etc/[Link]
24 -rw-r--r-- 1 root root 23444 Apr 28 2016 /etc/[Link]
8 -rw-r--r-- 1 root root 6488 Nov 24 2017 /etc/[Link]
4 -rw-r--r-- 1 root root 429 Nov 24 2017 /etc/casper.conf

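The listing above is in name order, so its first five lines are not necessarily the five largest files. A sketch of a size-ordered variant, assuming an ls that supports -S (sort by size, largest first), as GNU and BSD ls do; largest_conf is a hypothetical helper name:

```shell
#!/bin/sh
# Sketch only: long listing of the five largest entries whose names contain
# ".conf". Assumes ls supports -S (sort by size, largest first).
# largest_conf DIR (hypothetical helper)
largest_conf() {
  # -d lists a matching directory itself instead of its contents
  ls -ldS -- "$1"/*.conf* 2>/dev/null | head -n 5
}

largest_conf /etc
```

The glob `*.conf*` matches names that contain ".conf" anywhere after the first character, which is slightly wider than the `*.conf` used in the transcript.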
# Create multiple files
bioinfo@bioinfo-OptiPlex-380 ~ $ mkdir test01
bioinfo@bioinfo-OptiPlex-380 ~ $ cd test01/
bioinfo@bioinfo-OptiPlex-380 ~/test01 $ ls
bioinfo@bioinfo-OptiPlex-380 ~/test01 $ mkdir 20MBT{001..018}
bioinfo@bioinfo-OptiPlex-380 ~/test01 $ ls
20MBT001 20MBT003 20MBT005 20MBT007 20MBT009 20MBT011 20MBT013 20MBT015 20MBT017
20MBT002 20MBT004 20MBT006 20MBT008 20MBT010 20MBT012 20MBT014 20MBT016 20MBT018

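The transcript creates the folders but does not show the second half of question 4, copying "sample" into each of them. A sketch of both steps together, written with a POSIX loop instead of bash brace expansion; make_folders is a hypothetical helper and it assumes a file named sample already exists in the target directory:

```shell
#!/bin/sh
# Sketch only: create 20MBT001 .. 20MBT018 inside a directory and copy the
# file "sample" from that directory into each new folder.
# make_folders DIR (hypothetical helper; DIR/sample is assumed to exist)
make_folders() {
  i=1
  while [ "$i" -le 18 ]; do
    # %03d zero-pads the counter to three digits: 001, 002, ...
    dir=$(printf '%s/20MBT%03d' "$1" "$i")
    mkdir -p "$dir" && cp "$1/sample" "$dir/"
    i=$((i + 1))
  done
}

# Example call (the directory name follows the transcript):
# make_folders "$HOME/test01"
```

In bash, the same copy step could reuse the brace expansion from the transcript, for example `for d in 20MBT{001..018}; do cp sample "$d"/; done`.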
# Files owned by root
bioinfo@bioinfo-OptiPlex-380 ~ $ ls -l /tmp/ | grep "root"
drwx------ 3 root root 4096 Feb 11 2016 [Link]-LkiuAp
drwx------ 3 root root 4096 Feb 11 2016 [Link]-Qsm9kT
bioinfo@bioinfo-OptiPlex-380 ~ $

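Filtering `ls -l` through grep also matches entries whose group or name merely contains "root". A sketch of an owner-based test instead, assuming a find that supports -mindepth and -maxdepth (GNU, BSD and BusyBox find do); entries_owned_by is a hypothetical helper name:

```shell
#!/bin/sh
# Sketch only: list entries directly inside a directory that belong to a
# given owner (user name or numeric user ID).
# entries_owned_by DIR OWNER (hypothetical helper)
entries_owned_by() {
  # -mindepth 1 skips DIR itself; -maxdepth 1 stops find from descending
  find "$1" -mindepth 1 -maxdepth 1 -user "$2" 2>/dev/null
}

entries_owned_by /tmp root
```

Adding `-type f` would restrict the result to regular files and leave out directories such as the two shown in the transcript.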
# List only files starting with "A"
bioinfo@bioinfo-OptiPlex-380 ~/Documents $ ls -d a* > [Link]
bioinfo@bioinfo-OptiPlex-380 ~/Documents $ cat [Link]
[Link]
[Link]
bioinfo@bioinfo-OptiPlex-380 ~/Documents $

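The `ls -d a*` command selects by name only; question 6 also requires that the files were modified in the last three days. A sketch that applies both conditions, assuming find supports -maxdepth; recent_a_files is a hypothetical helper and the Documents path in the example is an assumption:

```shell
#!/bin/sh
# Sketch only: names of regular files in a directory that start with "a" and
# were modified within the last three days.
# recent_a_files DIR (hypothetical helper)
recent_a_files() {
  # -mtime -3 matches files modified less than 3*24 hours ago
  find "$1" -maxdepth 1 -type f -name 'a*' -mtime -3 -exec basename {} \;
}

# Example call (the directory is an assumption):
# recent_a_files "$HOME/Documents" > detail
```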

P18BTI2203-Computational biology 21MBT011
CBL-001
# date function
bioinfo@bioinfo-OptiPlex-380 ~/Documents $ date +%D%r
08/17/[Link] PM IST

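With `+%D%r` the date and time are printed with no separator between them. A sketch that spells out the mm/dd/yy date and the 12-hour time with a space in between; show_date_time is a hypothetical helper name, and LC_ALL=C is set on the assumption that the C locale prints AM/PM for %p:

```shell
#!/bin/sh
# Sketch only: date as mm/dd/yy followed by a 12-hour clock time with AM/PM.
# show_date_time is a hypothetical helper name.
show_date_time() {
  # %m/%d/%y is the date, %I:%M:%S the 12-hour time, %p the AM/PM marker
  LC_ALL=C date '+%m/%d/%y %I:%M:%S %p'
}

show_date_time
```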
# Grep function
bioinfo@bioinfo-OptiPlex-380 ~/Documents $ egrep "usa | Kenya | canada " [Link]
nairobi Kenya 32196 africa
bioinfo@bioinfo-OptiPlex-380 ~/Documents $

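The egrep pattern above depends on the spaces inside the quotes and on the exact capitalisation of each country. A sketch that instead tests the country column directly and ignores case, assuming the column order city, country, population, continent from the sample data; by_country is a hypothetical helper name:

```shell
#!/bin/sh
# Sketch only: select rows of the places table whose country (field 2) is
# usa, kenya or canada, ignoring case.
# by_country FILE (hypothetical helper)
by_country() {
  awk 'tolower($2) ~ /^(usa|kenya|canada)$/' "$1"
}

# Sample data from the question, written to a temporary file.
places=$(mktemp)
cat > "$places" <<'EOF'
bombay india 45677 asia
karachi pakistan 54876 Asia
nairobi Kenya 32196 africa
EOF

by_country "$places"
```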
# Remove case sensitive grep
bioinfo@bioinfo-OptiPlex-380 ~/Documents $ grep -i "asia" [Link]
bombay india 45677 asia
karachi pakistan 54876 Asia
bioinfo@bioinfo-OptiPlex-380 ~/Documents $

# population between 40K to 60K
bioinfo@bioinfo-OptiPlex-380 ~/Documents $ cat [Link] | grep "[4-6][0-9][0-9][0-9][0-9]"
bombay india 45677 asia
karachi pakistan 54876 Asia
bioinfo@bioinfo-OptiPlex-380 ~/Documents $

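The character-class pattern above would also accept values from 60001 to 69999, because it checks digits rather than the number itself. A sketch of a numeric comparison on the population column; in_population_range is a hypothetical helper, and the "demotown" row is made-up demo data added only to show the difference:

```shell
#!/bin/sh
# Sketch only: numeric range filter on the population column (field 3).
# in_population_range FILE LOW HIGH (hypothetical helper)
in_population_range() {
  awk -v lo="$2" -v hi="$3" '$3 >= lo && $3 <= hi' "$1"
}

# Sample data from the question plus one made-up row (demotown) whose
# population lies outside the range but would match [4-6][0-9][0-9][0-9][0-9].
places=$(mktemp)
cat > "$places" <<'EOF'
bombay india 45677 asia
karachi pakistan 54876 Asia
nairobi Kenya 32196 africa
demotown demoland 65000 asia
EOF

in_population_range "$places" 40000 60000
```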
# Search lines with "ia"
bioinfo@bioinfo-OptiPlex-380 ~/Documents $ grep ia [Link]
bombay india 45677 asia
karachi pakistan 54876 Asia
bioinfo@bioinfo-OptiPlex-380 ~/Documents $

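Question 8(d) asks for lines that end with "fa", which needs an end-of-line anchor rather than a plain substring search. A sketch under that reading; ends_with_fa is a hypothetical helper, and the "demotown" row is made-up demo data, since none of the three sample rows ends with "fa":

```shell
#!/bin/sh
# Sketch only: print lines that end with "fa", allowing trailing whitespace.
# ends_with_fa FILE (hypothetical helper)
ends_with_fa() {
  # $ anchors the match at the end of the line
  grep -E 'fa[[:space:]]*$' "$1"
}

# Sample data from the question plus one made-up row that ends with "fa".
places=$(mktemp)
cat > "$places" <<'EOF'
bombay india 45677 asia
karachi pakistan 54876 Asia
nairobi Kenya 32196 africa
demotown demoland 12345 alfa
EOF

ends_with_fa "$places"
```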

21MBT011
P18BTI2203 Computational biology CBL-002

P18BTI2203L: Computational Biology Laboratory
Academic Year: 2021-22
8 - Molecular Visualization
Instructor: Dr. Ram K
Scribes: R18-MBT1

Each question should be addressed and codes for each section should be marked correctly

Answer all the following [(CO5), (K2)]

1. (10 points) Perform the following to represent the outer membrane surface protein of [Link] [OmpV], which has been
structurally solved using X-ray crystallography.
(a) Compare the close homologous structure. Align them and highlight the structural difference
(b) Identify the cofactor/ligand bound to the protein and represent them
(c) 2D-label the regions of interaction and represent them

Question: 1 Total
Points: 10 10
Score:

Course Coordinator
****

Experiment 2 – Molecular visualization
A) Compare the close homologue structure. Align them and highlight the structural
difference.

Fig 2.1 Comparison of two homologous proteins

B) Identify the cofactors/ligand bound to the protein and represent them

Fig 2.2 Ligand of the protein structure
Fig 2.3 Distance between two HOH molecules

C) 2D-label the regions of interaction and represent them.

Fig 2.4 2D labelling of two protein structures along with the completely aligned sequence (green)


Inference:
The outer membrane surface protein of Vibrio cholerae [OmpV] has been
structurally solved using X-ray crystallography.
2WK7 and 2WK9 are found to be homologous in structure and sequence; they are
very close in evolution, as they are found in similar organisms.
2WK7 - Structure of apo form of Vibrio cholerae CqsA
2WK9 - Structure of PLP-Thr aldimine form of Vibrio cholerae CqsA
In Fig 2.1, 2WK7 is represented in grey and 2WK9 in light blue. The two proteins were aligned.
In Fig 2.2, the ligands are represented in red. They are PLG 600 B c5 and
PLP600 A c5.
In Fig 2.3, the distance between two atoms has been calculated. The distance from one HOH to
another HOH is 4.304 Å.
In Fig 2.4, 2D labelling has been done for the visualization, along with a
representation of the portion of similar sequence of the two proteins. The A-chain
of 2WK7 and the B-chain of 2WK9 are nearly identical. In Fig 2.4, the region
shown in green runs from ASN 47A to PRO 4A.

P18BTI2203-Computational Biology 21MBT011
CBL-003

P18BTI2203L: Computational Biology Laboratory
Academic Year: 2021-22
3 - Sequence Similarity using BLAST Program
Instructor: Dr. Ram K
Scribes: R18-MBT1
Each question should be addressed and codes for each section should be marked correctly

Answer all the following [(CO6,5), (K4)]

1. (10 points) Obtain the human HBA and HBB protein sequences. Perform pairwise alignment at the
NCBI BLAST website.
(a) Use a comparison tool from the EBI website.
(b) Vary the scoring matrix (e.g. try different PAM and BLOSUM matrices) and record the effects
on the score, the number of gaps, the percent identity, and the length of the aligned region.
(c) For the NCBI BLASTP program note that the output of a pairwise alignment includes a dot
matrix view.

Question: 1 Total
Points: 10 10
Score:

Course Coordinator
****


Experiment 3- BLAST
1. Identify all homologous proteins of human retinol-binding protein 4 (RBP4; NP_006735) using BLASTP

Fig 3.1 Identification of homologous sequences using BLASTP

Fig 3.2 Graphical summary for the selected sequence


Fig 3.3 phylogenetic tree and Multiple alignment results for the

BLASTP was run and homologous sequences from different organisms were found. The BLAST tree
was also viewed, and multiple sequence alignment was performed for the top 10 sequences.


2. Identify a distant homolog of the above protein using PSI-BLAST

Fig 3.4 Distant homolog after running 4 iterations using PSI-BLAST

Fig 3.5 Distant homologous protein found, with a high score

Fig 3.6 Amino acid sequence of the distant homologous protein


Fig 3.7 Graphical summary for the selected sequence in PSI-BLAST

Fig 3.8 Multiple alignment results and phylogenetic tree for the selected sequence in PSI-BLAST
In a WebLogo, the height of each stack indicates the sequence conservation at that position, while
the height of each symbol within the stack indicates the relative frequency of that amino acid at that
position.


[Link] the signature and search for related proteins using PHI Blast
The conserved sequence signature found for the RBP4 protein is
DCRVSSFRVKE. The residues marked in red are hydrophobic amino acids. These hydrophobic amino acids
help in stabilizing the structure of the protein.

Fig 3.9 Searching in PHI Blast with the conserved sequence

Fig 3.10 Conserved sequence found in web logo


Inference:
2. PSI-BLAST detects distant relationships between proteins. A PSSM is used to
search the database further for new matches, and is updated in subsequent iterations with the
newly detected sequences.
The identified distant homolog of this protein is retinol binding protein 4 (Phyllostomus
discolor). GenBank common name: pale spear-nosed bat
Kingdom: Animalia Phylum: Chordata Class: Mammalia Order: Chiroptera Family:
Phyllostomidae Genus: Phyllostomus
Distribution and habitat: The species is found from southern Mexico to northern Peru and Bolivia.
When we query a database, our sequence is compared to every other sequence until the top hits
are found and reported in the results with quality metrics.
Some hits may report the same scores, so distinguishing the level of confidence
that each parameter describes is necessary to choose sequences for the next phase of analysis.
The results are defined as follows.
Maximum bit score: 398. This is the highest alignment score (bit score)
between the query sequence and the database segments. The higher the bit score, the better the
sequence similarity; the E-value falls exponentially as the bit score rises.
Total score: 398. This is the sum of the alignment scores of all segments from the same database sequence.
Percent query coverage: here it is 90% to 100% after three iterations. It
is the percentage of the query sequence that is covered by the aligned segments.
E-value: observed as 2e-146. It is the number of hits of similar quality (score) expected to be found just by chance,
given a random database of the same size. It is the first quality filter for the BLAST search
result, used to obtain only results equal to or better than the number given by the E-value option.
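The link between the bit score and the E-value can be written out explicitly. As a sketch, assuming the standard Karlin-Altschul statistics that BLAST reports, with symbols that are not in the original record: $m$ is the effective query length, $n$ the effective database length and $S'$ the bit score.

```latex
E = m \, n \, 2^{-S'}
```

Under this relation each additional bit of score halves the expected number of chance hits, and a larger database (larger $n$) raises the E-value for the same score.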

BLAST hits with an E-value smaller than 1e-50 are database
matches of very high quality. BLAST hits with an E-value smaller than 0.01 can still be considered
good hits for homology [Link] PSSM captures the conservation pattern in the alignment
and stores it as a matrix of scores, in which weakly conserved positions receive scores near zero.
The newly detected sequences from the second round of search that are above the specified score
(E-value) threshold are again added to the alignment, and the profile is refined for another round of
searching.
This process is continued iteratively for as long as desired or until convergence, which is the state
where no new sequence is detected above the defined threshold.

P18BTI2203 Computational biology 21MBT011
CBL-004

Experiment 4 - Artificial Neural Network

P18BTI2203L: Computational Biology Laboratory
Academic Year: 2021-22
4 - Artificial Neural Network
Instructor: Dr. Ram K
Scribes: R18-MBT1
Each question should be addressed and codes for each section should be marked correctly

Answer all the following [(CO5), (K2)]
1. (5 points) Construct an artificial neural network for predicting the percentage of adsorption for the biochar used.
The training data is given in the table below
(a) Where X1 is temperature in °C, and X2, X3 and X4 are pH, initial concentration (mg/L)
and biochar dose (g)
2. (5 points) Given the seed dataset for miRNA-mRNA interaction, predictions have been made to detect
whether the interaction would result in Class 1 (oncogene) or Class 2 (tumour suppressor gene). Construct
an ANN architecture and report your inference.

Question: 1 2 Total
Points: 5 5 10
Score:

Course Coordinator
****


1. Construct an artificial neural network for predicting the percentage of adsorption for the biochar used. The
training data is given in the table below
a) Where X1 is temperature in °C, and X2, X3 and X4 are pH, initial concentration
(mg/L) and biochar dose (g)
Construction of artificial neural network:

Fig 4.1 Neural network structure for the inputs

Fig 4.2 Best validation performance at epoch 3


Fig 4.3 Regression line of Training, Validation, Test and All with respect to Target

Fig 4.4 Final output after simulating the test value

The network trained and tested on the input data predicts a target output of 68.1348 for the given set of
data. The first 70% of the data is taken as training values and target values, and the network is tested with the
remaining data to predict the expected target. The expected target with minimal error is obtained in the 7th iteration.
Information flows through the constructed neural network, which has input, hidden and output layers, as follows.
Patterns of information are fed into the network via the input units, which trigger the layers of hidden units, and
these in turn arrive at the output units. Each unit takes its inputs, computes their weighted sum and adds a bias.
This computation is represented in the form of a transfer function. The weighted total is then passed as an input to an
activation function to produce the output. In the first iteration the values of R are: Training: 0.81779,
validation: 0.32095, test: 0.78845, All: 0.85211. Here the validation and test values are nowhere near 0.9, and the
output obtained is 57.52, but our expected target value should be nearer to 68. Therefore, further iterations
were run to get our expected target output value. In the seventh iteration the values of R are: Training:
0.98651, validation: 0.93437, test: 0.99996, All: 0.98932. Here the training, validation and test
values are all above 0.9, which indicates minimal error. The final result obtained is 68.1348, which is close to our
expected target value of 68.18. As the neural network has arrived at the closest expected target
value, the iteration can be stopped here.
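The weighted-sum step described above can be written as a single formula. This is a sketch with symbols that are not in the original record: $x_i$ are the inputs to a unit (for the first layer here, the four variables X1 to X4), $w_i$ the corresponding weights, $b$ the bias, $f$ the activation function and $y$ the unit's output.

```latex
y = f\!\left( \sum_{i=1}^{n} w_i x_i + b \right)
```

Training adjusts the $w_i$ and $b$ of every unit so that the network output moves closer to the target values, which is what the rising R values across iterations reflect.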

2. Given the seed dataset for miRNA-mRNA interaction, predictions have been made to detect whether
the interaction would result in Class 1 (oncogene) or Class 2 (tumour suppressor gene). Construct an
ANN architecture and report your inference.

Fig 4.5 Neural network structure for the inputs


Fig 4.6 Best Validation Performance at epoch 4

Fig 4.7 Regression line of Training, Validation, Test and All with respect to Target


Fig 4.8 Final output after simulating the test value

for the given set of data. The first 70% of the data is taken as training values and target values, and the network is tested with the
remaining data (30%) to predict the expected target. The expected target with minimal error is obtained in the
4th iteration.
The interaction was predicted as Class 2 (tumour suppressor gene). The ANN processes input data
by looping over time steps and updating the network state. The network contains information remembered
over all previous steps. In each time step of the input sequence, the network learns to predict the value of the
next step. The training progress is displayed in the form of a plot. The test data are prepared using the same steps as the
training data. For the training/evaluation/test dataset splitting, the model was trained on the training
dataset with enough epochs, evaluated on the evaluation dataset, and finally its performance was tested on
the test dataset. Three iterations were done to predict the targeted value.
In the first iteration the values of R are: Training: 0.4783, validation: 0.4252, test: 0.3818, All: 0.4410. Here the
training, validation and test values are nowhere near 0.9, and the output obtained is
1.18. Therefore, another iteration can be run to get our expected target output value.

In the 4th iteration the values of R are: Training: 0.99033, validation: 0.8365, test: 0.8471, All: 0.9486.
Here the test value is near to the output (2), which is the same as that obtained in the 4th iteration.
Therefore, the expected target value is obtained. The ANN automatically extracts patterns from canonical and
non-canonical pairing between the miRNAs and their targets, and the result is Class 2 (tumour suppressor gene).

P18BTI2203-Computational biology 21MBT011
CBL-005

Answer all the following [(CO4), (K4)]

1. (10 points) Perform a multiple sequence alignment of beta globins of plant origin and construct a
phylogenetic tree for the same.
Question: 1 Total
Points: 10 10
Score:

Course Coordinator
****

Generate a multiple sequence alignment of beta globin among various species to prove that the "regions
around the heme-binding regions are highly conserved"
(a) Identify the conserved domain and hypervariable regions and infer the changes from the
alignment
(b) Structurally, are there any changes among the aligned homologs?


Solution:

Globin is a single structural unit of haemoglobin, which is involved in binding gaseous ligands such as O2,
NO and CO. There are different types of globin such as haemoglobin, myoglobin, cytoglobin etc. In this experiment
we take 6 different globins from different species to find the conserved domain.

Name                       Source                 ID              Amino acid length
beta-globin                Podocnemis unifilis    BAJ46574.1      147 aa
beta-globin A subunit      Archilochus alexandri  APA23495.1      147 aa
HBB protein                Urocynchramus pylzowi  NWU01539.1      147 aa
hemoglobin subunit beta    Catharus ustulatus     XP_032907297.1  147 aa
hemoglobin beta subunit A  Aegithalos caudatus    AVA16350.1      147 aa
beta-globin A subunit      Schistes geoffroyi     APA23487.1      147 aa

Fig 5.1 Conserved domain for selected globin sequence

Using the MEGA X software tool, the sequences were aligned and the conserved regions were found;
conserved positions are marked with (*). We found five conserved regions, which can be shown with the help of
the WebLogo tool, where the dominant residue is shown as a large letter. In Fig 5.2 the
conserved regions are marked.

In Fig 5.3 the Plotcon graph has also been drawn for the aligned sequences. A particular peak
is found, which represents the highly conserved domain. It lies in positions 26 to 43.
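The (*) marking of conserved positions can be illustrated on a toy alignment. The following is a sketch only, not the method MEGA X uses internally: it assumes one aligned sequence per line, all of equal length, and marks a column with "*" when every sequence has the same residue there. The helper name mark_conserved and the three short sequences are made up for the example.

```shell
#!/bin/sh
# Sketch only: print a marker line with "*" under every alignment column in
# which all sequences carry the same residue (gap-only columns stay blank).
# Input: one aligned sequence per line on stdin, all of equal length.
# mark_conserved is a hypothetical helper name.
mark_conserved() {
  awk '
    NR == 1 {
      # The first sequence is the reference every other row is compared to
      ref = $0; n = length($0)
      for (i = 1; i <= n; i++) same[i] = 1
      next
    }
    {
      for (i = 1; i <= n; i++)
        if (substr($0, i, 1) != substr(ref, i, 1)) same[i] = 0
    }
    END {
      line = ""
      for (i = 1; i <= n; i++) {
        c = " "
        if (same[i] && substr(ref, i, 1) != "-") c = "*"
        line = line c
      }
      print line
    }
  '
}

# Made-up toy alignment: columns 4 and 5 vary, the rest are identical.
printf '%s\n' 'MVHWTAEEK' 'MVHWSAEEK' 'MVHLTAEEK' | mark_conserved
```

For the toy input the marker line has stars under columns 1 to 3 and 6 to 9 and blanks under columns 4 and 5, which is the same kind of annotation the alignment view shows under fully conserved positions.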
Fig 5.2: Highly conserved domain R1, R2, R3, R4, R5 in Web logo

Fig 5.3: Plot con graph for aligned sequence


Fig 5.4: Protein structure of beta-globin (Podocnemis unifilis) with its conserved regions

Inference:
By performing MSA for the selected sequences with the MEGA X tool and drawing the Plotcon
graph, it is found that the highly conserved region lies between amino acids 20 and 50. In the WebLogo we
also found four other conserved regions. From Fig 5.4 we interpret that in different species
the amino acids surrounding the active sites are highly conserved (R1, R2, R3). Some other parts of the
sequence are also conserved (R4, R5).
