Sequence Alignment Methods in Bioinformatics

 
Sequence Alignment
Lecture 5
Department of CSE, DIU

CONTENTS

1. Sequence Alignment
- Why align sequences
2. Sequence Alignment Methods
- Pairwise Alignment
- Multiple Sequence Alignment
3. Pairwise Sequence Alignment Methods
- Global Alignment (Needleman-Wunsch)
- Local Alignment (Smith-Waterman)
 
1. Sequence Alignment
 
Why and how to align sequences
 
Sequence Alignment

A way of arranging the sequences of DNA, RNA, or protein to identify regions of similarity that may be a consequence of functional, structural, or evolutionary relationships between the sequences

CTGTCG-CTGCACG
-TGC-CG-TG----
 
Why align sequences?

Useful for discovering functional, structural, and evolutionary relationships

For example
- To find whether two (or more) genes or proteins are evolutionarily related to each other
- Two proteins with similar sequences will probably be structurally or functionally similar
 
2. Sequence Alignment Methods
 
Pairwise and Multiple
 
Pairwise Sequence Alignment
 
- Takes a pair of sequences as input
- Aligns them so that the chosen alignment, which exposes the assumed region of similarity, scores higher than every other possible alignment

Methods
- Global Alignment (Needleman-Wunsch)
- Local Alignment (Smith-Waterman)

CTGTCGCTGCACG--
-------TGC-CGTG
 
Pairwise Sequence Alignment
 
Idea:
Display one sequence above another with spaces inserted in both
to reveal similarity
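
The idea above can be sketched in a few lines of Python. This is only an illustration: the function name `show_alignment` is made up here, and the two gapped strings are the example from slide 8 of the transcript, with "-" standing for an inserted space.

```python
def show_alignment(top: str, bottom: str) -> str:
    """Return a three-line view of two gapped sequences, with '|' under matching columns."""
    if len(top) != len(bottom):
        raise ValueError("aligned sequences must have the same length")
    # A column counts as a match only when both characters are equal and neither is a gap.
    bars = "".join("|" if a == b and a != "-" else " " for a, b in zip(top, bottom))
    return "\n".join([top, bars, bottom])


print(show_alignment("CAT-TCA-C", "C-TCGCAGC"))
```

The middle line marks the five columns where the two sequences agree.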
 
Scoring the alignment

Reward for matches: 10
Mismatch penalty: 2
Space penalty: 5

Optimum Alignment
 
The score of an alignment is a measure of its quality
Optimum alignment problem: Given a pair of sequences X and Y, find an
alignment (global or local) with maximum score
The similarity between X and Y, denoted sim(X,Y), is the maximum score of an
alignment of X and Y
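
As a minimal sketch of how one alignment is scored, the function below adds a reward for each matching column and a penalty for each mismatch or space. The name `score_alignment` and its parameter names are assumptions made for this illustration; the default values follow the scoring scheme used in the worked examples of this lecture (match 1, mismatch -1, gap -2).

```python
def score_alignment(top: str, bottom: str, match: int = 1,
                    mismatch: int = -1, gap: int = -2) -> int:
    """Sum the column scores of two already-aligned (gapped) sequences."""
    if len(top) != len(bottom):
        raise ValueError("aligned sequences must have the same length")
    total = 0
    for a, b in zip(top, bottom):
        if a == "-" or b == "-":
            total += gap        # a space in either sequence
        elif a == b:
            total += match      # identical characters
        else:
            total += mismatch   # different characters
    return total


# Global-alignment example from this lecture: -AGC over AAAC.
print(score_alignment("-AGC", "AAAC"))
```

sim(X, Y) is then the largest value this function can return over all alignments of X and Y; the dynamic-programming methods later in the lecture find that maximum without enumerating every alignment.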
 
Multiple Sequence Alignment
 
- Takes three or more sequences as input
- Aligns all the sequences together so that the alignment produces the highest score
 
 
3. Pairwise Sequence Alignment
 
Global and Local methods
 
Global VS Local

Global Alignment
- Attempts to align the sequences over their entire length
- Suitable for sequences that are similar and of equal length

CTGTCG-CTGCACG
-TGC-CG-TG----
Global alignment

Local Alignment
- Gathers islands of matches
- Stretches of sequence with the highest density of matches are aligned
- Suitable for sequences that are only partially similar, differ in length, or contain conserved regions

CTGTCGCTGCACG--
-------TGC-CGTG
Local alignment
 
Global Alignment (Needleman-Wunsch)

3 Major Steps
- Create 2D Matrix
- Trace back
- Final Alignment

Create 2D Matrix
- Draw a 2D matrix with cells (0,0) to (Row, Col), where Row and Col are the lengths of the two sequences
- Place the 2 seqs as Row and Column headers
- Cell (0,0) = 0
- Cell (0,1) to Cell (0,Col) and Cell (1,0) to Cell (Row,0): value = previous cell value plus the gap score (the gap penalty is deducted at every step)
- For other cell values, follow the equation in (1)

Trace back
- Start from Cell (Row, Col)
- Go back up to Cell (0,0)
Global Alignment (Needleman-Wunsch) - Example

Input
    - seq1 = AAAC
    - seq2 = AGC

Scoring Scheme
    - δ(x, x) = 1 (Match)
    - δ(x, -) = -2 (Gap)
    - δ(x, y) = -1 (Mismatch)

Eq. 1: Cell Value
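
The cell-value equation is not written out in the text of this lecture. As a sketch, the standard Needleman-Wunsch recurrence, expressed with the δ scoring function above, fills cell (i, j) as follows. The symbols F, a and b are names chosen here: F is the matrix, a is the sequence labelling the rows and b the sequence labelling the columns.

```latex
F(i,j) = \max
\begin{cases}
F(i-1,\,j-1) + \delta(a_i,\, b_j) & \text{(diagonal: match or mismatch)} \\
F(i-1,\,j) + \delta(a_i,\, -)     & \text{(from above: gap)} \\
F(i,\,j-1) + \delta(b_j,\, -)     & \text{(from the left: gap)}
\end{cases}
\qquad F(0,0) = 0
```

With the scoring scheme of this example, the diagonal term adds 1 or -1 and the other two terms add -2.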
 
          A    A    A    C
     0   -2   -4   -6   -8
A   -2    1   -1   -3   -5
G   -4   -1    0   -2   -4
C   -6   -3   -2   -1   -1

Final Alignment
-AGC
AAAC
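
The three steps can be sketched in Python as below. This is an illustrative sketch under stated assumptions, not the lecture's own code: the function name `needleman_wunsch` and its parameters are invented here, seq1 is taken as the column header and seq2 as the row header, and ties in the trace back are broken in the order diagonal, up, left (other tie-breaking rules give other, equally scoring alignments).

```python
def needleman_wunsch(seq1, seq2, match=1, mismatch=-1, gap=-2):
    """Global alignment: fill the matrix, trace back, return (matrix, seq2 row, seq1 row)."""
    rows, cols = len(seq2), len(seq1)  # seq2 labels the rows, seq1 the columns
    F = [[0] * (cols + 1) for _ in range(rows + 1)]

    # First column and first row: previous cell value plus the gap score.
    for i in range(1, rows + 1):
        F[i][0] = F[i - 1][0] + gap
    for j in range(1, cols + 1):
        F[0][j] = F[0][j - 1] + gap

    # Remaining cells: best of diagonal, up and left moves.
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            s = match if seq2[i - 1] == seq1[j - 1] else mismatch
            F[i][j] = max(F[i - 1][j - 1] + s, F[i - 1][j] + gap, F[i][j - 1] + gap)

    # Trace back from Cell (Row, Col) to Cell (0,0).
    top, bottom = [], []
    i, j = rows, cols
    while i > 0 or j > 0:
        s = match if i > 0 and j > 0 and seq2[i - 1] == seq1[j - 1] else mismatch
        if i > 0 and j > 0 and F[i][j] == F[i - 1][j - 1] + s:
            top.append(seq2[i - 1]); bottom.append(seq1[j - 1])  # character in both
            i, j = i - 1, j - 1
        elif i > 0 and F[i][j] == F[i - 1][j] + gap:
            top.append(seq2[i - 1]); bottom.append("-")          # gap in seq1
            i -= 1
        else:
            top.append("-"); bottom.append(seq1[j - 1])          # gap in seq2
            j -= 1
    return F, "".join(reversed(top)), "".join(reversed(bottom))


matrix, aligned_seq2, aligned_seq1 = needleman_wunsch("AAAC", "AGC")
for row in matrix:
    print(row)
print(aligned_seq2)
print(aligned_seq1)
```

For seq1 = AAAC and seq2 = AGC this reproduces the matrix and the final alignment of the example; the bottom-right cell, -1, is the score of that alignment.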
 
Local Alignment (Smith-Waterman)

3 Major Steps
- Create 2D Matrix
- Trace back
- Final Alignment

Create 2D Matrix
- Draw a 2D matrix with cells (0,0) to (Row, Col), where Row and Col are the lengths of the two sequences
- Place the 2 seqs as Row and Column headers
- First Row and First Column: all values = 0
- For other cell values, follow the equation in (2)

Trace back
- Start from each Cell that holds the maximum value of the entire matrix
- Go back until the first Cell with value 0 is reached
Local Alignment (Smith-Waterman) - Example

Input
    - seq1 = AAAC
    - seq2 = AAG

Scoring Scheme
    - δ(x, x) = 1 (Match)
    - δ(x, -) = -2 (Gap)
    - δ(x, y) = -1 (Mismatch)

Eq. 2: Cell Value
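
The cell-value equation is not written out in the text of this lecture. As a sketch, the standard Smith-Waterman recurrence, expressed with the δ scoring function above, is the global recurrence with one extra option, 0, so that no cell goes negative. The symbols H, a and b are names chosen here: H is the matrix, a is the sequence labelling the rows and b the sequence labelling the columns.

```latex
H(i,j) = \max
\begin{cases}
0 \\
H(i-1,\,j-1) + \delta(a_i,\, b_j) & \text{(diagonal: match or mismatch)} \\
H(i-1,\,j) + \delta(a_i,\, -)     & \text{(from above: gap)} \\
H(i,\,j-1) + \delta(b_j,\, -)     & \text{(from the left: gap)}
\end{cases}
\qquad H(i,0) = H(0,j) = 0
```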
 
          A    A    A    C
     0    0    0    0    0
A    0    1    1    1    0
A    0    1    2    2    0
G    0    0    0    1    1

Final Alignment
-AAG
AAAC
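
The local method can be sketched in Python in the same way. Again this is an illustration under assumptions rather than the lecture's own code: the name `smith_waterman` and its parameters are invented here, seq1 labels the columns and seq2 the rows, only the first maximum cell found is traced back (the slides say to start from each maximum), and ties are broken in the order diagonal, up, left.

```python
def smith_waterman(seq1, seq2, match=1, mismatch=-1, gap=-2):
    """Local alignment: return (matrix, best score, seq2 segment, seq1 segment)."""
    rows, cols = len(seq2), len(seq1)  # seq2 labels the rows, seq1 the columns
    H = [[0] * (cols + 1) for _ in range(rows + 1)]  # first row and column stay 0

    best, best_pos = 0, (0, 0)
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            s = match if seq2[i - 1] == seq1[j - 1] else mismatch
            H[i][j] = max(0, H[i - 1][j - 1] + s, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:  # remember the first cell holding the maximum
                best, best_pos = H[i][j], (i, j)

    # Trace back from the maximum cell until a cell with value 0 is reached.
    top, bottom = [], []
    i, j = best_pos
    while i > 0 and j > 0 and H[i][j] > 0:
        s = match if seq2[i - 1] == seq1[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + s:
            top.append(seq2[i - 1]); bottom.append(seq1[j - 1])  # character in both
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            top.append(seq2[i - 1]); bottom.append("-")          # gap in seq1
            i -= 1
        else:
            top.append("-"); bottom.append(seq1[j - 1])          # gap in seq2
            j -= 1
    return H, best, "".join(reversed(top)), "".join(reversed(bottom))


matrix, score, segment2, segment1 = smith_waterman("AAAC", "AAG")
for row in matrix:
    print(row)
print(score, segment2, segment1)
```

For seq1 = AAAC and seq2 = AAG the matrix matches the example and the maximum value is 2. The function returns only the locally aligned segment, AA over AA, which corresponds to the two matching A/A columns in the final alignment shown in the example.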

Sequence alignment is crucial in bioinformatics for identifying similarities between DNA, RNA, or protein sequences. Methods like Pairwise Alignment and Multiple Sequence Alignment help in recognizing functional, structural, and evolutionary relationships among sequences. The Needleman-Wunsch algorithm for global alignment and Smith-Waterman algorithm for local alignment are commonly used approaches. Scoring the alignment involves rewarding matches, penalizing mismatches, and managing spaces to determine the best alignment.

  • Bioinformatics
  • Sequence Alignment
  • Needleman-Wunsch
  • Smith-Waterman
  • Algorithm

Uploaded on Sep 07, 2024 | 0 Views



Presentation Transcript


  1. Sequence Alignment Lecture 5 Department of CSE, DIU

  2. CONTENTS 1. Sequence Alignment -Why align sequences 2. Sequence Alignment Methods -Pairwise Alignment -Multiple Sequence Alignment 3. Pairwise Sequence Alignment Methods -Global Alignment (Needleman-Wunsch) -Local Alignment (Smith-Waterman)

  3. 1. Sequence Alignment Why and how align sequences

  4. Sequence Alignment A way of arranging the sequences of DNA, RNA, or protein to identify regions of similarity that may be a consequence of functional, structural, or evolutionary relationships between the sequences. Example: CTGTCG-CTGCACG over -TGC-CG-TG----

  5. Why align sequences? Useful for discovering functional, structural, and evolutionary relationships. For example: to find whether two (or more) genes or proteins are evolutionarily related to each other. Two proteins with similar sequences will probably be structurally or functionally similar.

  6. 2. Sequence Alignment Methods Pairwise and Multiple

  7. Pairwise Sequence Alignment A pair of sequences as input. Align them in such a way that, for that particular alignment, the assumed region of similarity produces a higher score than all the other alignments. Methods: Global Alignment (Needleman-Wunsch), Local Alignment (Smith-Waterman). Example: CTGTCGCTGCACG-- over -------TGC-CGTG

  8. Pairwise Sequence Alignment Idea: Display one sequence above another with spaces inserted in both to reveal similarity
     A: C A T - T C A - C
        |   |     | |   |
     B: C - T C G C A G C

  9. Scoring the alignment

  10. Scoring the alignment Reward for matches: 10. Mismatch penalty: 2. Space penalty: 5. Rows as given on the slide: C T G T C G C T G C / - T G C C G T G - / column scores: -5 10 10 -2 -5 -2 -5 -5 10 10 -5

  11. Optimum Alignment The score of an alignment is a measure of its quality. Optimum alignment problem: Given a pair of sequences X and Y, find an alignment (global or local) with maximum score. The similarity between X and Y, denoted sim(X,Y), is the maximum score of an alignment of X and Y.

  12. Multiple Sequence Alignment Three or more than three sequences as input Align all the sequences altogether in such a manner that the alignment produces highest score

  13. 3. Pairwise Sequence Alignment Global and Local methods

  14. Global VS Local. Global Alignment: attempts to align the sequences over their entire length; suitable for sequences that are similar and of equal length. Example (global alignment): CTGTCG-CTGCACG over -TGC-CG-TG----. Local Alignment: gathers islands of matches; stretches of sequence with the highest density of matches are aligned; suitable for sequences that are only partially similar, differ in length, or contain conserved regions. Example (local alignment): CTGTCGCTGCACG-- over -------TGC-CGTG

  15. Global Alignment (Needleman-Wunsch) 3 Major Steps: Create 2D Matrix, Trace back, Final Alignment. Create 2D Matrix: Row x Col 2D matrix draw (Row, Col size of seq1 and seq2 respectively); place 2 seqs as Row and Column Header; Cell (0,0) = 0; Cell (0,1) to Cell (0,Column) and Cell (1,0) to Cell (Row,0) value = previous cell value plus the gap score; for other cell values, follow equation in (1). Trace back: start from Cell (Row, Col); go back up to Cell (0,0). Final Alignment: start from Cell (Row, Col); if ... then place character in both seq; if ... or ... then character in start seq & gap in end seq.

  16. Global Alignment (Needleman-Wunsch) - Example. Input: seq1 = AAAC, seq2 = AGC. Scoring Scheme: δ(x, x) = 1 (Match), δ(x, -) = -2 (Gap), δ(x, y) = -1 (Mismatch). Eq. 1: Cell Value
               A    A    A    C
          0   -2   -4   -6   -8
     A   -2    1   -1   -3   -5
     G   -4   -1    0   -2   -4
     C   -6   -3   -2   -1   -1
     Final Alignment: -AGC over AAAC

  17. Local Alignment (Smith-Waterman) 3 Major Steps: Create 2D Matrix, Trace back, Final Alignment. Create 2D Matrix: Row x Col 2D matrix draw (Row, Col size of seq1 and seq2 respectively); place 2 seqs as Row and Column Header; First Row, First Column all value = 0; for other cell values, follow equation in (2). Trace back: start from each Cell which has the maximum value in the entire matrix; go back up to the Cell where 0 first occurs. Final Alignment: start from each Cell with max value; if ... then place character in both seq; if ... or ... then character in start seq & gap in end seq.

  18. Local Alignment (Smith-Waterman) - Example. Input: seq1 = AAAC, seq2 = AAG. Scoring Scheme: δ(x, x) = 1 (Match), δ(x, -) = -2 (Gap), δ(x, y) = -1 (Mismatch). Eq. 2: Cell Value
               A    A    A    C
          0    0    0    0    0
     A    0    1    1    1    0
     A    0    1    2    2    0
     G    0    0    0    1    1
     Final Alignment: -AAG over AAAC
