What Is Bioinformatics

Bioinformatics is an interdisciplinary field that uses computer science, mathematics, and statistics to analyze and interpret large amounts of biological data. It aims to understand biological processes by developing tools and software to store, process, and analyze rapidly growing volumes of biological data from areas like genomics and proteomics. Key applications include sequence alignment and comparison to determine evolutionary relationships, and functional annotation of genes and proteins.

Uploaded by

maki ababi
Copyright
© All Rights Reserved

1 Introduction

Bioinformatics represents an interdisciplinary and rapidly evolving area of science that applies
mathematics, statistics, computer science, and biology to the understanding of living systems.
Bioinformatics is driven by the advent of fast and reliable technology for sequencing nucleic
acids and proteins that results in an ever-increasing volume of experimental data to be analyzed.
Many of the recent developments in the field use algorithmic techniques in order to reach
answers to key challenges in molecular biology research, including understanding the
mechanisms of genome evolution, elucidating the structure of protein interaction networks, and
determining the genetic basis for susceptibility to disease.

A major application of Bioinformatics is the analysis of the DNA and protein sequences of
organisms that have been sequenced. Sequence comparison is one of the basic operations in
Bioinformatics, serving as a basis for many other more complex manipulations. It provides
important information for solving many key problems, such as determining the function of a
newly discovered sequence, determining the evolutionary relationships among genes and
proteins, and predicting the structure and function of proteins.

When a new biological sequence is discovered, its function and structure must be determined. A
common approach is to compare the new sequence to known sequences held in biological
databases, in search of similarities. We can compare a sequence to another sequence,
performing a pairwise sequence comparison, which consists of deciding whether a pair of
sequences are evolutionarily related, that is, whether they share a common evolutionary history.
We can also compare a sequence to a profile that models a family of sequences, performing a
sequence-profile comparison, which consists of deciding whether a sequence is evolutionarily
related to a known sequence family.
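
To make the idea of sequence-profile comparison more concrete, the following is a minimal sketch that builds a very simple profile (a position-specific matrix of log-odds scores) from a toy, ungapped alignment and scores a query sequence against it. The function names `build_profile` and `score_against_profile`, the pseudocount, the uniform background frequencies, and the example family are all assumptions made for illustration. Practical profile methods also model insertions and deletions, which this sketch does not.

```python
import math
from collections import Counter

DNA = "ACGT"


def build_profile(aligned, alphabet=DNA, pseudocount=1.0):
    """Return one dict of log-odds scores per column of an ungapped alignment."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")
    background = 1.0 / len(alphabet)  # assumption: uniform background frequencies
    total = len(aligned) + pseudocount * len(alphabet)
    profile = []
    for col in range(length):
        counts = Counter(seq[col] for seq in aligned)
        profile.append({
            sym: math.log2(((counts[sym] + pseudocount) / total) / background)
            for sym in alphabet
        })
    return profile


def score_against_profile(seq, profile):
    """Sum the column scores of `seq`; higher means more family-like."""
    if len(seq) != len(profile):
        raise ValueError("sequence and profile must have the same length")
    return sum(column[sym] for sym, column in zip(seq, profile))


# Hypothetical family of four aligned sequences.
family = ["ACGTAC", "ACGTTC", "ACGAAC", "ACGTAC"]
profile = build_profile(family)
member_score = score_against_profile("ACGTAC", profile)
other_score = score_against_profile("TTTGGG", profile)
print(member_score > other_score)
```

A sequence that resembles the family receives a positive total score, while an unrelated one receives a negative score. Deciding whether a score is significant requires a statistical model that is outside the scope of this sketch.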

When we recognize a significant similarity between a new sequence and a known sequence or
sequence family, we can transfer information about structure and/or function to the new
sequence. We say that the sequences are homologous and that we are transferring information by
homology [3].

Comprehensive databases of DNA and protein sequences are now established as major tools in
current molecular biology research. Given the advances in sequencing technologies, the
significant amount of biological sequence data produced, and the effectiveness of sequence
comparison, it is logical to systematically organize and store the biological sequences to be
compared. As a consequence, sequence databases have grown exponentially in the last decade.

The most accurate algorithms for solving the problems of pairwise sequence comparison and
sequence-profile comparison are usually based on the dynamic programming technique. Because
these algorithms have quadratic time and memory complexity and biological sequences are
usually long, searching large databases can lead to very long execution times and very large
memory requirements.
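
As a minimal illustration of the dynamic programming approach, the sketch below computes a global alignment score in the style of the Needleman-Wunsch algorithm. The scoring values (match +1, mismatch -1, linear gap penalty -2) and the function name are assumptions chosen for the example, not values taken from this chapter. The table has (m+1) x (n+1) cells, each filled in constant time, which is where the quadratic time and memory cost comes from.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Return the optimal global alignment score of sequences a and b."""
    m, n = len(a), len(b)
    # table[i][j] holds the best score for aligning a[:i] with b[:j].
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        table[i][0] = i * gap  # a[:i] aligned against gaps only
    for j in range(1, n + 1):
        table[0][j] = j * gap  # b[:j] aligned against gaps only
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            table[i][j] = max(
                table[i - 1][j - 1] + substitution,  # align a[i-1] with b[j-1]
                table[i - 1][j] + gap,               # gap in b
                table[i][j - 1] + gap,               # gap in a
            )
    return table[m][n]


print(global_alignment_score("ACGT", "AGT"))
```

If only the score is needed, keeping just the previous and current rows reduces memory to linear in the length of one sequence. Recovering the alignment itself requires either the full table or a more elaborate divide-and-conquer scheme.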

High-performance computing resources and techniques can be used to accelerate these
operations. Several solutions for parallel sequence comparison have been proposed, targeting
different high-performance platforms, such as multicore architectures, clusters, and
field-programmable gate arrays (FPGAs).

Graphics processing units (GPUs) have evolved into highly parallel platforms due to their vast
number of simple, data-parallel, deeply multithreaded cores. Their impressive computational
power, high memory bandwidth, and comparatively low cost make them an attractive platform to
solve problems based on computationally intensive algorithms. Moreover, GPUs are becoming
increasingly programmable, offering the potential of significant speedups for a wide range of
applications compared to general-purpose processors (CPUs).

The goal of this chapter is to discuss in detail and compare the recent advances in GPU solutions
for some biological sequence analysis applications. The problems discussed are two classes of
biological sequence comparison: pairwise sequence comparison and sequence-profile
comparison. The first one is widely used as a first step in the solution of complex problems such
as the determination of the evolutionary history of the species. The second one is extremely
important because it is used to decide whether a recently sequenced protein belongs to a
particular protein family. For both problems, several GPU solutions have been proposed that
obtained substantial speedups over the sequential implementation and over solutions in other
parallel platforms.

History

Bioinformatics has experienced many paradigm shifts throughout its history, in three different
dimensions:

1. The amount of data was very small in the 1970s but has since increased dramatically, from
small sets of several sequences to modern large databases holding many terabytes of data [72].

2. The complexity level of the observed systems has changed significantly, from individual proteins
and RNA sequences to complex multi-peptide proteins and large RNA/DNA molecules, to the
genome and proteome of a complete organism and to comparative research across different
species [73].

3. The focus of modeling biological data has shifted from a simple linear structure of molecules to
their 3D structure, and from simple molecular interactions to complete functional and
physiological interactions in a living cell [74].

In the first decades of the field, bioinformaticians were mainly interested in creating
methodological breakthroughs, while today most researchers in the field target very specific
biological problems, and most of them use models and tools that have already been developed.
This shift is a strong indication that bioinformatics is now a mature field.

Bioinformatics is an interdisciplinary research area. Bioinformatics programs at the bachelor
level are still quite rare, while master's and doctoral programs are somewhat more common.
Consequently, the majority of current bioinformaticians are either biologists who have added
computer science skills or, the other way around, mathematicians or computer scientists who
have added biological skills. The breadth of the field is the most common source of problems in
both presenting new results and understanding other people's results.

What is Bioinformatics?

Bioinformatics is an interdisciplinary field that develops methods and software tools for
understanding biological data. It combines computer science, statistics, mathematics, and
engineering to analyze and interpret that data.

Biological analyses produce rapidly growing amounts of data, which become very hard to analyze
by manual means. This is where computer science comes to the rescue. Computational
techniques are used to analyze large volumes of biological data more accurately and efficiently
by means of automated processes. Hence, bioinformatics can be considered a field of data
science for solving problems in biology and medicine.

The Objective of Bioinformatics


The overall aim of bioinformatics is to increase the understanding of biological processes.
It pursues this aim with methods from computer science, statistics, and mathematics. More
specifically, it has three aims: storing biological data, developing the tools needed to
process that data, and applying those computational tools to analyze the data and interpret
the results.

Bioinformatics – Goals

One of the primary goals of bioinformatics is to increase knowledge of various biological
processes. Bioinformatics differs from other approaches to this objective because it focuses on
the development and application of techniques that are often described as "computationally
intensive." These techniques include visualization, machine learning algorithms, data mining,
and pattern recognition. Major research areas in the field include gene finding, sequence
alignment, drug discovery, drug design, protein structure prediction, protein structure
alignment, prediction of gene expression, protein-protein interactions, the modeling of
evolution, and genome-wide association studies.

Role of Bioinformatics
Processing biological information involves deriving the relevant data, which may include
implementing and running software programs. These programs may draw on:

• Algorithms
• Graph theory
• Artificial intelligence
• Data mining
• Soft computing
• Computer simulation
• Image processing

The algorithms involved also draw on discrete mathematics, information theory, statistics, and
control theory. Much of the contribution of computer science to bioinformatics takes the form
of such algorithms.

Algorithms
Algorithms are essential in bioinformatics for analyzing and processing data accurately.
Researchers in the field rely on computer science algorithms for analyzing sequencing data
and assembling it.

Graph Theory
Graph methods are used in this field for sequence comparison. They are also used in assembling
sequence fragments, where the overlaps between fragments are represented as a graph. A graph
gives a clear representation that is easy to understand.
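
The overlap idea can be sketched in a few lines: each fragment (read) is a node, and a directed edge from one read to another is weighted by the length of the longest suffix of the first read that matches a prefix of the second. The reads, the `min_overlap` threshold, and the function names below are hypothetical and chosen only for illustration. Real assemblers use more elaborate graph structures and must tolerate sequencing errors.

```python
from itertools import permutations


def overlap_length(left, right, min_overlap=3):
    """Length of the longest suffix of `left` equal to a prefix of `right`.

    Returns 0 if no overlap of at least `min_overlap` characters exists.
    """
    if min_overlap < 1:
        raise ValueError("min_overlap must be at least 1")
    longest_possible = min(len(left), len(right))
    for size in range(longest_possible, min_overlap - 1, -1):
        if left[-size:] == right[:size]:
            return size
    return 0


def build_overlap_graph(reads, min_overlap=3):
    """Map each ordered pair of overlapping reads to its overlap length."""
    graph = {}
    for left, right in permutations(reads, 2):
        size = overlap_length(left, right, min_overlap)
        if size:
            graph[(left, right)] = size
    return graph


# Hypothetical reads sampled from the sequence ATGCGTACGA.
reads = ["ATGCGT", "GCGTAC", "GTACGA"]
graph = build_overlap_graph(reads)
for (left, right), size in graph.items():
    print(left, "->", right, "overlap", size)
```

Following the edges from the first read to the last and merging along the overlaps reconstructs the original sequence in this toy case. With real data, choosing a path through the graph is the hard part of assembly.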

Artificial Intelligence
As the name suggests, artificial intelligence aims to mimic human intelligence with computers.
It is a rapidly developing technique that connects with many areas. In bioinformatics it is
applied to DNA sequencing and sequence reconstruction, and it helps in building the tools
needed to process the data. Researchers frequently choose this kind of methodology.

Data mining
Data mining is important in many areas, especially for prediction. It relies on classification
and clustering algorithms, which are used to process biological information and to search the
biomedical literature. The approach also supports meta-search, which gives researchers a single
point of access to many online databases. Graph theory, data integration, and text mining
depend on these procedures.

Soft Computing
Although many devices for storing biomedical information continue to improve, computers still
play a significant role for researchers and biologists. Soft computing is used in particular to
analyze gene expression data, along with other bioinformatics data. It builds on neural network
models, especially artificial neural networks, and is regarded as an easy and reliable way to
analyze such data. Its main applications are in proteomics and genomics, where experiments
produce vast amounts of data.

Computer Simulation
Many simulation techniques exist, but computer simulation is reliable, flexible, and portable.
It is particularly effective in bioinformatics applications, which involve vast quantities of
input data. Here, simulation refers to computation used in developing algorithms and software,
constructing and curating databases, and analyzing sequences, functions, and structures.

Image Processing
Image processing is useful at every stage of a project for visualizing results. It helps
biologists and other scientists view their bioinformatics research, showing each stage of an
implementation visually. It is an important method within computer science and supports the
visual communication of results.

Benefits of Bioinformatics
The techniques described above are very useful for generating research results and are very
helpful to researchers. As a result, using computer science in bioinformatics is an effective
approach for analyzing, fragmenting, and sequencing.

Why Learn and Apply Bioinformatics?

Bioinformatics has become an interdisciplinary science, and if
you are a biologist, you will find that having knowledge of
bioinformatics can benefit you immensely with your experiments
and research.

The job market currently has many openings for people with skills in
bioinformatics. Major pharmaceutical, biotech and software
companies are seeking to hire professionals with experience in
bioinformatics, who will work with huge amounts of
biological and health care information. You can check
out Indeed.com for job opportunities in the field of
bioinformatics.

A major application of bioinformatics can be found in the fields
of precision medicine and preventive medicine. Precision
medicine consists of health care techniques customized for
individual patients, including treatments and practices. Rather
than treating or curing diseases, preventive medicine focuses on
developing measures to prevent them. Diseases currently being
targeted include influenza, cancer, heart disease and diabetes.

Research is being carried out to identify genetic
alterations in patients, allowing scientists to come up with better
treatments and even possible measures of prevention. Certain
types of cancer caused by such genetic alterations can be
identified beforehand and treated before the condition gets
worse. You can read more about the role of bioinformatics in
cancer treatment at the National Cancer Institute.

How to Approach Bioinformatics?

Before getting deep into the subject, as a first step you
will have to learn a little biology, genetics and
genomics to be specific. This includes studying genes,
DNA, RNA, protein structures, various synthesis processes, and so on.

Next, you will have to study biological sequences (for
example, sequences found in DNA, RNA and proteins) and
techniques to discover and analyze various patterns and
informative sites in them. You will come across various algorithms
used by different techniques. You will also get the chance to use
various machine learning and data mining techniques such as
hidden Markov models, neural networks and clustering.
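
As a first taste of pattern analysis on sequences, here is a minimal sketch of two of the simplest measurements on a DNA string: its GC content and the positions where a short motif occurs. The function names and the example sequence are made up for illustration. The statistical models mentioned above, such as hidden Markov models, go well beyond this kind of exact matching.

```python
def gc_content(seq):
    """Fraction of bases in `seq` that are G or C (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def find_motif(seq, motif):
    """Return 0-based start positions of every occurrence of `motif`.

    Overlapping occurrences are reported as well.
    """
    seq, motif = seq.upper(), motif.upper()
    if not motif:
        raise ValueError("motif must not be empty")
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)  # step by one to allow overlaps
    return positions


# Hypothetical example sequence.
dna = "ATGCGATATATGC"
print(gc_content(dna))
print(find_motif(dna, "ATG"))
```

Python strings are enough for small experiments like this one. For large data sets, dedicated libraries and file formats become necessary.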

Since you will be dealing with large amounts of data, it is crucial to
have a good understanding of statistics, as you will have to analyze
data according to specific requirements.

Of course, you will need good programming skills. R, Python,
and Bash are the most commonly used programming languages in
biological data analyses. Deciding which one to start with depends
on your goals. You can use other languages such as C/C++ and
Java as well.

After gaining a basic understanding of the fundamental
concepts, you can explore other areas such as structural
bioinformatics, systems biology and biological networks.

The human being is a fascinating creature, and its genome is
even more fascinating. It is remarkable that such huge amounts
of information can be encoded in DNA, a molecule of minute size,
and decoded precisely to create human beings with their own unique
characteristics. However, certain alterations in gene expression can
cause fatal genetic diseases. Healthcare systems need
ways to identify such diseases and to provide treatment and
preventive measures that help save human lives.

Bioinformatics has shown great potential to identify
diseases beforehand, determine treatment and help make human
lives better. Drawing on the inspiration and knowledge of computer
science, fields such as gene technology, medicine and healthcare
can evolve from curing individual patients to healing entire
populations.
