Introduction to Arrays and Recursion in C++

An Introduction to Programming
through C++
Abhiram G. Ranade
Ch. 16: Arrays and Recursion
Arrays and Recursion
Recursion is very useful for designing
algorithms on sequences
Sequences will be stored in arrays
Topics
Binary Search
Merge Sort
Searching an array
 
Input: A: int array of length n, x (called “key”): int
Output: true if x is present in A, false otherwise.
Natural algorithm: scan through the array and return true if found.
 
for(int i=0; i<n; i++){
  if(A[i] == x) return true;
}
return false;
 
Time consuming:
Entire array scanned if the element is not present,
Half array scanned on the average if it is present.
Can we possibly do all this with fewer operations?
Searching a sorted array
 
sorted array: (non decreasing order)
A[0] ≤ A[1] ≤ … ≤ A[n-1]
sorted array: (non increasing order)
A[0] ≥ A[1] ≥ … ≥ A[n-1]
How do we search in a sorted array (non
increasing or non decreasing)?
Does the sortedness help in searching?
Searching for x in
a non decreasing sorted array A[0..n-1]
 
Key idea for reducing comparisons: First compare x with the “middle” element A[n/2] of the array.
Suppose x < A[n/2]:
x is also smaller than every element of A[n/2..n-1], because of sorting.
x, if present, will be present only in A[0..n/2-1].
So in the rest of the algorithm we will only search the first half of A.
Suppose x >= A[n/2]:
x, if present, will be present in A[n/2..n-1].
Note: if x == A[n/2], x may be present in the first half too (duplicates), but it is certainly present in the second half, so searching the second half suffices.
In the rest of the algorithm we will only search the second half.
How to search the “halves”? Recurse!
Plan
 
We will write a function Bsearch which will search a region of an array instead of the entire array.
Region: specified using 2 numbers: starting index S, length of region L.
When L == 1, we are searching a length 1 array.
So check if that element, A[S] == x.
Otherwise, compare x to the “middle” element of A[S..S+L-1].
Middle element: A[S + L/2]
Algorithm is called “Binary search”, because size of the region to be searched gets roughly halved.
The code
 
#include <iostream>
using namespace std;

bool Bsearch(int A[], int S, int L, int x)
// Search for x in A[S..S+L-1].  Requires L >= 1.
{
  if(L == 1) return A[S] == x;
  int H = L/2;
  if(x < A[S+H]) return Bsearch(A, S,   H,   x);
  else           return Bsearch(A, S+H, L-H, x);
}

int main(){
  int A[8]={-1, 2, 2, 4, 10, 12, 30, 30};
  cout << Bsearch(A,0,8,11) << endl;
  // searches for 11.
}
How does the algorithm execute?
 
A = {-1, 2, 2, 4, 10, 12, 30, 30}
First call: Bsearch(A, 0, 8, 11)
comparison: 11 < A[0+8/2] = A[4] = 10 is false.
Second call: Bsearch(A, 4, 4, 11)
comparison: 11 < A[4+4/2] = A[6] = 30 is true.
Third call: Bsearch(A, 4, 2, 11)
comparison: 11 < A[4+2/2] = A[5] = 12 is true.
Fourth call: Bsearch(A, 4, 1, 11)
Base case.  Return 11 == A[4], i.e. 11 == 10.  So false.
Proof of correctness 1
 
Claim: Bsearch(A,S,L,x) returns true iff x is present in A[S..S+L-1], where L ≥ 1, 0 <= S and S+L-1 < length of A, and A is an array sorted in non decreasing order.
Proof: Induction over L.
Base case L = 1.  Obvious.
Otherwise: L > 1.  Algorithm first computes H = L/2.  Note that 0 < H < L.
If x < A[S+H], then x, if present, must be in A[S..S+H-1].
So algorithm must call Bsearch(A, S, H, x), which it does.
The length argument, H, is smaller than L.  So by induction the call returns correctly.
If x ≥ A[S+H], then x, if present at all, must be present in A[S+H..S+L-1].
So algorithm must call Bsearch(A, S+H, L-H, x), which it does.
The length argument, L-H, is smaller than L.  So by induction the call returns correctly.
Hence the algorithm will work correctly for all L ≥ 1.
Remarks
 
If you are likely to search an array frequently, it is useful to
first sort it. The time to sort the array will be
compensated by the time saved in subsequent searches.
How do you sort an array in the first place?  
Next.
Binary search can be written without recursion.  Exercise.
Even professional programmers make mistakes when
writing binary search.
Should condition use x <= A[S+H] or x < A[S+H]?
Need to ensure correct even if length is odd.
Precise subranges to be searched and precise lengths to be
searched should be exactly correct.
Very important to write down precisely what the function does:
“searches A[S..S+L-1]” – be careful about -1 etc.
Estimating time taken
 
General idea: “standard operations” take 1 cycle.
Arithmetic, comparison, copying one word
address calculation, pointer dereference
Convenient idealization.
We characterize running time as a function of an agreed upon problem
size “n”:
n = Number of keys to be sorted in a sorting problem.
n = Size of matrices in matrix multiplication
We worry only how the time grows as the problem size increases: e.g. the
time is “linear” in n or is “quadratic”…
For large enough problem size, linear, e.g. $100n$, is better than quadratic, e.g. $n^2/2$.
Computers deal with large problems…
If time taken is different for different inputs of the same size, we consider
the max time amongst them.
We want to claim: “No matter what the input is, the time is linear in n”
Sorting
 
Selection Sort (Chapter 14)
Find smallest in A[0..n-1].  Exchange it with A[0].
Find smallest in A[1..n-1].  Exchange it with A[1].
Selection sort time: we count comparisons
(Other operations will take proportional time.)
n-1 comparisons to find smallest
n-2 comparisons to find second smallest …
Total $n(n-1)/2$.   “About $n^2$”.  (Quadratic)
Algorithms requiring fewer comparisons are known:
“About $n \log n$”
One such algorithm is Merge sort.
Mergesort idea
 
To sort a long sequence:
Break up the sequence into two small sequences.
Sort each small sequence. 
(Recurse!)
Somehow “merge” the sorted sequences into a
single long sequence.
Hope: “merging” sorted sequences is easier than
sorting the large sequence.
Our hope is correct, as we will see soon!
Example
 
Suppose we want to sort the sequence
50, 29, 87, 23, 25, 7, 64
Break it into two sequences.
50, 29, 87, 23 and 25, 7, 64.
Sort both
We get 23, 29, 50, 87 and 7, 25, 64.
Merge
Goal is to get 7, 23, 25, 29, 50, 64, 87.
Merge sort
 
void mergesort(int S[], int n){
// Sorts sequence S of length n.
 if(n<=1) return;      // 0 or 1 elements: already sorted
 int U[n/2], V[n-n/2]; // local arrays (variable length: a compiler extension, not standard C++)
 for(int i=0; i<n/2; i++) U[i]=S[i];
 for(int i=0; i<n-n/2; i++) V[i]=S[i+n/2];
 mergesort(U,n/2);
 mergesort(V,n-n/2);
// "Merge" sorted U, V into S.
 merge(U, n/2, V, n-n/2, S, n);
// U, V merge into original array S.
}
Merging example
 
U: 23, 29, 50, 87.
V: 7, 25, 64.
S:
The smallest overall must move into S.
Smallest overall = smaller of smallest in U and
smallest in V.
So after movement we get:
U: 23, 29, 50, 87.
V: -, 25, 64.
S: 7.
What do we do next?
 
U: 23, 29, 50, 87.
V: -, 25, 64.
S: 7.
Now we need to move the second smallest into S.
Second smallest:
smallest in U,V after smallest has moved out.
smaller of what is at the “front” of U, V.
So we get:
U: -, 29, 50, 87.
V: -, 25, 64.
S: 7, 23.
General strategy
 
While both U, V contain a number:
Move smallest from those at the head of U,V to the end of S.
If only U contains numbers: move all to end of S.
If only V contains numbers: move all to end of S.
uf: index denoting which element of U is currently at the front.
U[0..uf-1] have moved out.
vf: similarly for V.
sb: index denoting where the next element should move into S (sb: back of S).
S[0..sb-1] contain elements that have moved in earlier.
Merging two sequences
 
void merge(int U[], int p, int V[], int q, int S[], int n){
// S should receive all elements of U,V, in sorted order.
 int uf=0, vf=0;                   // uf, vf : front of U, V
 for(int sb=0; sb < p + q;  sb++){ // sb = back of S
      //Invariant: S[0..sb-1] contain the smallest sb elements,
      //           U[uf..p-1], V[vf..q-1] contain the rest
   if(uf<p && vf<q){  // both U,V are non empty
     if(U[uf] < V[vf]){ S[sb] = U[uf]; uf++;}
     else{ S[sb] = V[vf]; vf++;}
   }
   else if(uf < p){ // only U is non empty
      S[sb] = U[uf]; uf++;
   }
   else{            // only V is non empty
      S[sb] = V[vf]; vf++;
   }
 }
}
Time Analysis: merging
Time required to merge two sequences of length
p, q:
Loop runs for p+q iterations.
In each iteration a fixed number of operations
are performed.
So time is proportional to p+q.
Time proportional to n, if n=p+q.
Time analysis: sorting
 
$T_i$ = maximum time required for mergesort to sort any sequence of length i.
$T_1 \le c$, where c is some constant.
$T_n \le dn + 2T_{n/2} + en$
dn : time required to create U,V from S.
$T_{n/2}$ : time to sort sequences of length n/2.  Assume n/2 integer.
en : upper bound on time to merge sequences of net length n.
$T_n \le fn + 2T_{n/2}$ for f=d+e
Inequality applies to $T_{n/2}$ also:
$T_{n/2} \le fn/2 + 2T_{n/4}$
$T_n \le fn + 2(fn/2 + 2T_{n/4}) = 2fn + 4T_{n/4}$
Continuing we get $T_n \le kfn + 2^k T_{n/2^k}$
If $n = 2^k$, i.e. $k = \log_2 n$: $T_n \le fn\log_2 n + nT_1 \le fn\log_2 n + nc$
Thus $T_n \le gn\log_2 n$ for some constant g (e.g. g = f + c works for n ≥ 2).
Remarks
Mergesort is much faster than selection sort in
practice.
The eight queens puzzle
 
“Place eight queens on a chess board so that no
queen captures another”.
A queen can capture anything that is in a square
exactly to the East, West, North, South, NE, NW,
SE, SW.
Queens should be in distinct rows, distinct
columns, and distinct “diagonals”
Good example of “constraint satisfaction
problem”.
Solution uses recursion.
Can we represent the problem
mathematically?
 
Frame a question of the form: “Find numbers
x,y,z... such that they satisfy constraints ...”
Constraints: equalities, inequalities, …
It should be possible to interpret the numbers
in terms of queen positions on the board.
Mathematical representation 1
 
For each board square (i,j), let $x_{ij}$ “encode” whether a queen is present or not present in that square.
$x_{ij} = 1$ : queen present
$x_{ij} = 0$ : queen absent
At most one queen in each row i:
$x_{i1} + x_{i2} + \dots + x_{in} \le 1$
Similarly for other conditions
“Solve”!
Example: 3x3 board
 
Row conditions:
$x_{11}+x_{12}+x_{13} \le 1$,   $x_{21}+x_{22}+x_{23} \le 1$,   $x_{31}+x_{32}+x_{33} \le 1$
Column conditions:
$x_{11}+x_{21}+x_{31} \le 1$,   $x_{12}+x_{22}+x_{32} \le 1$,   $x_{13}+x_{23}+x_{33} \le 1$
Diagonal conditions:
$x_{21}+x_{32} \le 1$,   $x_{11}+x_{22}+x_{33} \le 1$,   $x_{12}+x_{23} \le 1$
$x_{12}+x_{21} \le 1$,   $x_{13}+x_{22}+x_{31} \le 1$,   $x_{23}+x_{32} \le 1$
Place 3 queens:
$x_{11}+x_{12}+x_{13}+x_{21}+x_{22}+x_{23}+x_{31}+x_{32}+x_{33} = 3$
0-1 constraint:
All $x_{ij} \in \{0, 1\}$
Another representation
 
What do we want to find?
Positions for 8 queens.
Position in 2d space : 2 numbers, (x, y)
Are we looking for 16 numbers then?
Real or integers?
Integers, in the range 1..8
Distinct columns:
All $x_i$ should be distinct.
Distinct rows:
All $y_i$ should be distinct.
Distinct diagonals:
For all i,j where i ≠ j: $|x_i - x_j| \ne |y_i - y_j|$
Other examples and variations on
constraint satisfaction problems
 
“Find $x_1, x_2, \dots, x_n$ such that ...” is called a constraint satisfaction problem.
8 queens problem is a constraint satisfaction problem.
Another example: solving equations: find x such that $3x^2 + 4x + 1 = 0$.
We may in addition have an “objective function”: of all $x_i$ satisfying the constraints, report one that maximizes some given function $f(x_1, x_2, \dots, x_n)$.
Example of a constraint satisfaction problem with an objective function: finding GCD of x,y.
Find r such that r divides x, and r divides y, and f(r)=r is maximum.
Solving constraint satisfaction
problems
 
Perform algebraic manipulation and deduce solution.
Quadratic equation: factorize...
“Try all possibilities”
Works if each variable that we want to solve for has a finite domain
8 queens formulation 1: 64 variables, each either 0 or 1
8 queens formulation 2: 16 variables, each from {1,2,...,8}
We construct each candidate solution and check if it satisfies our
constraints.  If yes, we report it.
How to construct all solutions?
Aren’t there a huge number of them?
Number of candidate solutions for 8 queens formulation 1: $2^{64}$
Number of candidate solutions for 8 queens formulation 2: $8^{16}$
$8^{16} = 2^{48} < 2^{64}$
Can we reduce this number further?
Formulation 3
 
No column can hold more than 1 queen; else there will be captures.
We want 8 queens, need at least 1 queen in each of the 8 columns
So place exactly 1 queen in each column.
New representation: Let $y_i$ = row position of queen in column i.
What conditions should $y_1, y_2, \dots, y_8$ satisfy?
Distinct columns condition:
Automatically satisfied
Distinct rows condition:
All $y_i$ should be distinct.
Distinct diagonals condition:
For all i,j, i≠j : $|y_i - y_j| \ne |i-j|$
Size of search space: $8^8$, much smaller than previous $8^{16}$ or $2^{64}$.
A program for 4 queens
 
int y[4];  // y[i]: row position of column i queen
// For all ways of placing all queens:
for(y[0] = 0; y[0] < 4; y[0]++){
   for(y[1] = 0; y[1] < 4; y[1]++){
      for(y[2] = 0; y[2] < 4; y[2]++){
         for(y[3] = 0; y[3] < 4; y[3]++){
            if(!capture(y,4))
               cout <<y[0]<<y[1]<<y[2]<<y[3]<<endl;
         }
      }
   }
}
Function to check for capture
 
bool capture(int y[], int n){
// Decides whether any queen captures any
// other.  n = board size.
// check for all pairs j>k
 for(int j=1; j<n; j++){
   for(int k=0; k<j; k++){
     if((y[j] == y[k]) ||
        (abs(j-k) == abs(y[j]-y[k])))
       return true;
   }
 }
 return false;
}                  // Loop invariant?
Will the same idea work for any n?
 
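Many programming languages will not allow you
to nest more than a certain number of loops.
Also, the number of nested loops is fixed when the program
is written, so it cannot depend on an n that is read at run time.
In other constraint satisfaction problems, the
number of variables to be selected could be very
large, making it difficult to do so much nesting.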
Recursion comes to our rescue!
We will write a recursive program which will not
have much nesting but will have the same effect
as writing a program with nesting.
A different view of searching through
the candidate configurations
 
S = Set of all possible ways (“configurations”) to place
queens, one queen per column.
= All possible ways to assign values to $y_0, y_1, \dots, y_7$
Suppose n = 3.  S has 27 elements:
 {000, 001, 002, 010, 011, 012, 020, … 222}
 
Algorithm outline for n = 3 :
Store first configuration in y[0..2], then call capture.
Store second configuration in y[0..2], then call capture.
…
Store 27th configuration in y[0..2], then call capture.
“Searching the set S of configurations”
How to search S
 
Observation: $S = S_0 \cup S_1 \cup \dots \cup S_{n-1}$, where
$S_i$ = set of configurations in which queen in column 0 is in row i, and other queens anywhere.
3 Queens: $S_0$ = {000,001,002,010,011,012,020,021,022}
Searching S = searching $S_0, \dots, S_{n-1}$.
But $S_i$ is also a union of smaller sets (recursion!):
$S_i = S_{i0} \cup S_{i1} \cup \dots \cup S_{i,n-1}$, where
$S_{ij}$ = set of configurations in which queen in column 0 is in row i, and queen in column 1 in row j, and other queens anywhere.
What is $S_{02}$ for the 3 queen problem?
{020, 021, 022}
General case
 
Notational change: We will write S(x) rather than $S_x$.
$S(i_0, i_1, \dots, i_{k-1})$ : Configurations with queens in first k columns in rows $i_j$ for j=0..k-1.
$S(i_0, \dots, i_{k-1}) = S(i_0, \dots, i_{k-1}, 0) \cup S(i_0, \dots, i_{k-1}, 1) \cup \dots \cup S(i_0, \dots, i_{k-1}, n-1)$
How to search S (contd.)
 
void search(int n, int y[], int k){
// n = number of queens, also length of array y.
// Function searches subspace S(y[0],y[1],...y[k-1]) of all candidate
// positions and of these prints those in which there is no capture.
 if(k == n){      // base case
   if(!capture(y,k)){
     for(int j=0; j<k; j++) cout << y[j];
     cout << endl;
   }
 }
 else{            // Search S(y[0],...y[k-1]) recursively
   for(int j=0; j<n; j++){
     y[k] = j;    // decomposition given earlier
     search(n, y, k+1);
   }
 }
}
How to search S (contd)
 
The function search has an important post condition: it
does not modify y[0..k-1].
Only because of this can we merely set y[k] = j before
recursion.
 
The main program is natural.
 
int main(){
 const int n=8;
 int y[n];
 search(n, y, 0);
}
Recursion tree for n=3
search(3,[-,-,-],0)
  search(3,[0,-,-],1)
    . . .
  search(3,[1,-,-],1)
    search(3,[1,0,-],2)
      . . .
    search(3,[1,1,-],2)
      . . .
    search(3,[1,2,-],2)
      search(3,[1,2,0],3)
      . . .
  search(3,[2,-,-],1)
    . . .
An improvement: Early check
 
$S(i_0, i_1, \dots, i_{k-1})$ : Set of configurations such
that queen in column j is in row $i_j$ for j=0..k-1.
If any of the first k queens capture each other, we
need not worry about the remaining n-k queens:
clearly there is no non capturing configuration in
this subspace.
Key idea: whenever we place the kth queen, we
first check if it captures the previous queens.
Only if it does not do we bother to explore the set of
configurations further.
Checking if the kth queen captures any
previous queens
 
bool lastCaptures(int y[], int k){
// check whether queen in column k
// captures any in columns 0..k-1.
for(int j=0; j<k; j++){
  if((y[j] == y[k]) ||
     (abs(j-k) == abs(y[j]-y[k])))
    return true;
}
return false;
}
Search function and main program
 
void search(int n, int y[], int k){
 if(k == n){
   for(int j=0; j<k; j++) cout << y[j];
   cout << endl;
 }
 else {
   for(int j=0; j<n; j++){
     y[k] = j;
     if(!lastCaptures(y,k)) search(n,y,k+1);
   }
 }
}
// Precisely state what this function expects and does.
// main program:
//      as before.
Remarks
 
Recursion is very powerful even with arrays.
The idea of mergesort (divide input into parts, sort
each part, then combine) is called
“divide-conquer-combine”.
“Divide-conquer-combine” is useful for other
problems besides sorting.
“Try out all possibilities” works for many
problems, see problems at the end of the
chapter.
“Early condition checking” also works quite often.
Exercise 1
Suppose the comparison in our binary search
code used <= rather than <.
Give an instance (set of input values) for
which the algorithm will produce a wrong
answer.
Where would the proof not work?
Exercise 2
Suppose the binary search code returns the
index of the last element it examines, rather
than a Boolean value.  What value would this
be, assuming (1) the element x being searched is
not present in the array, (2) there is exactly one
occurrence of x in the array, (3) there are
multiple occurrences of x?
Exercise 3
Instead of specifying the subarray to be
searched by giving the starting index and the
length, you might give the starting and ending
indices.  Write the code using this, and prove its
correctness.
Exercise 4
Suppose I represent a set of integers using an array
that holds the integers in sorted order.   Given two
such arrays representing two sets, give an algorithm
that prints their union.  (The common elements
must be printed just once.)  Your algorithm should
run in time proportional to the size of the union.
Suppose the elements were stored without sorting.
How many comparisons do you think the natural
algorithm would perform?  Does sorting help?
Exercise 5
A seller receives bids for an item from n buyers.  The seller examines the bids and sells
to the highest bidder.  Each buyer b bids $H_b$ if she is happy, and $S_b$ if she is sad.  Buyers
are happy/sad with equal probability, and independently of each other.  Write a
program that reads in n and the values $H_b$ and $S_b$ for each buyer and prints the
expected value of the winning bid.  Your program should do this in two ways, and print
the two answers on separate lines.
1. The probability space has $2^n$ points, corresponding to the different ways in which
the buyers can be happy or sad.  Your program should go over each point and
add up the contribution of different points to the expectation.   This should be
done using a recursive function, similar to what is used in n queens.
2. What is the probability that the highest in all the numbers H, S actually becomes
the winning bid?  What can you say going down the sorted order of bids?
Develop this idea into a very fast algorithm and code that.  For sorting, use the
standard library function as discussed in Section 22.3.2. To sort an array of
structures, you should use a “lambda expression” as an extra argument, as
shown at the top of page 321.  It is shown for vectors, but is the same for arrays.
Lambda expressions are discussed earlier in the book, and you may find it
interesting to understand them fully.
Exercise 6
The basic idea of binary search can be used even when the values over
which to search are not given explicitly.  Here is an example.
The input consists of numbers $x_1, x_2, \dots, x_n$ which are lengths of
consecutive billboards located along a road.  We have an additional
input, k, giving the number of painters available to paint the boards.
Each board requires time proportional to its length to paint.  We must
assign some number of consecutive boards to each painter such that
the maximum sum of the length of boards assigned to any painter is as
small as possible, i.e. so as to finish in minimum time.
Observe that it is easy to check whether in T time the k painters can
finish the job.
If the check succeeds, you can decide to check for a lower time T.
Develop this idea into an algorithm that determines the minimum
time.  Your algorithm should do about log n checks only.

Arrays and recursion play a vital role in designing algorithms on sequences in programming. This introduction covers the implementation of searching in arrays, binary search, merge sort, and the concept of searching in sorted arrays using recursion. The use of recursion helps reduce comparisons and enhances efficiency in searching algorithms for both non-decreasing and non-increasing sorted arrays. The binary search algorithm is demonstrated through code examples, showcasing the effectiveness of dividing the search region in half to optimize the search process.

  • Arrays
  • Recursion
  • C++
  • Algorithms
  • Searching

Uploaded on Dec 11, 2024 | 0 Views


Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

You are allowed to download the files provided on this website for personal or commercial use, subject to the condition that they are used lawfully. All files are the property of their respective owners.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.

E N D

Presentation Transcript


  1. An Introduction to Programming though C++ Abhiram G. Ranade Ch. 16: Arrays and Recursion

  2. Arrays and Recursion Recursion is very useful for designing algorithms on sequences Sequences will be stored in arrays Topics Binary Search Merge Sort

  3. Searching an array Input: A: int array of length n, x (called key ) : int Output: true if x is present in A, false otherwise. Natural algorithm: scan through the array and return true if found. for(int i=0; i<n; i++){ if(A[i] == x) return true; } return false; Time consuming: Entire array scanned if the element is not present, Half array scanned on the average if it is present. Can we possibly do all this with fewer operations?

  4. Searching a sorted array sorted array: (non decreasing order) A[0] A[1] A[n-1] sorted array: (non increasing order) A[0] A[1] A[n-1] How do we search in a sorted array (non increasing or non decreasing)? Does the sortedness help in searching?

  5. Searching for x in a non decreasing sorted array A[0..n-1] Key idea for reducing comparisons: First compare x with the middle element A[n/2] of the array. Suppose x < A[n/2]: x is also smaller than A[n/2..n-1], because of sorting x if present will be present only in A[0..n/2-1]. So in the rest of the algorithm we will only search first half of A. Suppose x >= A[n/2]: x if present will be present in A[n/2..n-1] Note: x may be present in first half too, In the rest of the algorithm we will only search second half. How to search the halves ? Recurse!

  6. Plan We will write a function Bsearch which will search a region of an array instead of the entire array. Region: specified using 2 numbers: starting index S, length of region L When L == 1, we are searching a length 1 array. So check if that element, A[S] == x. Otherwise, compare x to the middle element of A[S..S+L-1] Middle element: A[S + L/2] Algorithm is called Binary search , because size of the region to be searched gets roughly halved.

  7. The code bool Bsearch(int A[], int S, int L, int x) // Search for x in A[S..S+L-1] { if(L == 1) return A[S] == x; int H = L/2; if(x < A[S+H]) return Bsearch(A, S, H, x); else return Bsearch(A, S+H, L-H, x); } int main(){ int A[8]={-1, 2, 2, 4, 10, 12, 30, 30}; cout << Bsearch(A,0,8,11) << endl; // searches for 11. }

  8. How does the algorithm execute? A = {-1, 2, 2, 4, 10, 12, 30, 30} First call: Bsearch(A, 0, 8, 11) comparison: 11 < A[0+8/2] = A[4] = 10 Is false. Second call: Bsearch(A, 4, 4, 11) comparison: 11 < A[4+4/2] = A[6] = 30 Is true. Third call: Bsearch(A, 4, 2, 11) comparison: 11 < A[4+2/2] = A[5] = 12 Is true. Fourth call: Bsearch(A, 5, 1, 11) Base case. Return 11 == A[5]. So false.

  9. Proof of correctness 1 Claim: Bsearch(A,S,L,x) returns true Iff x is present in A[S,S+L-1], where 0 <= S,S+L-1 < length of A, where A is an array sorted in non decreasing order. Proof: Induction over L. Base case L = 1. Obvious. Otherwise: L > 1. Algorithm first computes H = L/2. Note that 0 < H < L. If x < A[S+H], then x if present must be in A[S,S+H-1] So algorithm must call Bsearch(A, S, H, x), which it does. The length argument, H, is smaller than L. So by induction call returns correctly. If x A[S+H], then x if present, must be in A[S+H, L]. So algorithm must call Bsearch(A, S+H, L-H, x), which it does. The length argument, L-H, is smaller than L. So by induction call returns correctly. Hence the algorithm will work correctly for all L.

  10. Remarks If you are likely to search an array frequently, it is useful to first sort it. The time to sort the array will be be compensated by the time saved in subsequent searches. How do you sort an array in the first place? Next. Binary search can be written without recursion. Exercise. Even professional programmers make mistakes when writing binary search. Should condition use x <= A[H] or x < A[H]? Need to ensure correct even if length is odd. Precise subranges to be searched and precise lengths to be searched should be exactly correct. Very important to write down precisely what the function does: searches A[S..S+L-1] be careful about -1 etc.

  11. Estimating time taken General idea: standard operations take 1 cycle. Arithmetic, comparison, copying one word address calculation, pointer dereference Convenient idealization. We characterize running time as a function of an agreed upon problem size n : n = Number of keys to be sorted in a sorting problem. n = Size of matrices in matrix multiplication We worry only how the time grows as the problem size increases: e.g. the time is linear in n or is quadratic For large enough problem size, linear e.g. 100n is better than n2/2 Computers deal with large problems If time taken is different for different inputs of the same size, we consider the max time amongst them. We want to claim: No matter what the input is, the time is linear in n

  12. Sorting Selection Sort (Chapter 14) Find smallest in A[0..n-1]. Exchange it with A[0]. Find smallest in A[1..n-1]. Exchange it with A[1]. Selection sort time: we count comparisons (Other operations will take proportional time.) n-1 comparisons to find smallest n-2 comparisons to find second smallest Total n(n-1)/2. About n2 . (Quadratic) Algorithms requiring fewer comparisons are known: About nlog n One such algorithm is Merge sort.

  13. Mergesort idea To sort a long sequence: Break up the sequence into two small sequences. Sort each small sequence. (Recurse!) Somehow merge the sorted sequences into a single long sequence. Hope: merging sorted sequences is easier than sorting the large sequence. Our hope is correct, as we will see soon!

  14. Example Suppose we want to sort the sequence 50, 29, 87, 23, 25, 7, 64 Break it into two sequences. 50, 29, 87, 23 and 25, 7, 64. Sort both We get 23, 29, 50, 87 and 7, 25, 64. Merge Goal is to get 7, 23, 25, 29, 50, 64, 87.

  15. Merge sort void mergesort(int S[], int n){ // Sorts sequence S of length n. if(n==1) return; int U[n/2], V[n-n/2]; // local arrays for(int i=0; i<n/2; i++) U[i]=S[i]; for(int i=0; i<n-n/2; i++) V[i]=S[i+n/2]; mergesort(U,n/2); mergesort(V,n-n/2); // Merge sorted U, V into S. merge(U, n/2, V, n-n/2, S, n); // U, V merge into original array S. }

  16. Merging example U: 23, 29, 50, 87. V: 7, 25, 64. S: The smallest overall must move into S. Smallest overall = smaller of smallest in U and smallest in V. So after movement we get: U: 23, 29, 50, 87. V: -, 25, 64. S: 7.

  17. What do we do next? U: 23, 29, 50, 87. V: -, 25, 64. S: 7. Now we need to move the second smallest into S. Second smallest: smallest in U,V after smallest has moved out. smaller of what is at the front of U, V. So we get: U: -, 29, 50, 87. V: -, 25, 64. S: 7, 23.

  18. General strategy While both U, V contain a number: Move smallest from those at the head of U,V to the end of S. If only U contains numbers: move all to end of S. If only V contains numbers: move all to end of S. uf: index denoting which element of U is currently at the front. U[0..uf-1] have moved out. vf: similarly for V. sb: index denoting where next element should move into S next (sb: back of S) S[0..sb-1] contain elements that have moved in earlier.

  19. Merging two sequences void merge(int U[], int p, int V[], int q, int S[], int n){ // S should receive all elements of U,V, in sorted order. int uf=0, vf=0; // uf, vf : front of u, v for(int sb=0; sb < p + q; sb++){ // sb = back of s //Invariant: s[0..sb-1] contain smallest sb, // u[uf..p-1], v[vf..q-1] contain rest if(uf<p && vf<q){ // both U,V are non empty if(U[uf] < V[vf]){ S[sb] = U[uf]; uf++;} else{ S[sb] = V[vf]; vf++;} } else if(uf < p){ // only U is non empty S[sb] = U[uf]; uf++; } else{ // only V is non empty S[sb] = V[vf]; vf++; } } }

  20. Time Analysis: merging Time required to merge two sequences of length p, q: Loop runs for p+q iterations. In each iteration a fixed number of operations are performed. So time is proportional to p+q. Time proportional to n, if n=p+q.

  21. Time analysis: sorting Ti= maximum time required for mergesort to sort any sequence of length i. T1 c, where c is some constant. Tn dn + 2Tn/2+ en dn : time required to create U,V from S. Tn/2: time to sort sequences of length n/2. Assume n/2 integer. en : upper bound on time to merge sequence of net length n. Tn fn + 2Tn/2for f=d+e Inequality applies to Tn/2also Tn/2 fn/2 + 2Tn/4 Tn fn + 2(fn/2 + 2Tn/4) = 2fn + 4Tn/4 Continuing we get Tn kfn + 2kTn/2k If n=2kor k = log2n: Tn fn log n + nT1= fnlog2n + nc Thus Tn gn log2n for some constant g.

  22. Remarks Mergesort is much faster than selection sort in practice.

  23. The eight queens puzzle Place eight queens on a chess board so that no queen captures another . A queen can capture anything that is in a square exactly to the East, West, North, South, NE, NW, SE, SW. Queens should be in distinct rows, distinct columns, and distinct diagonals Good example of constraint satisfaction problem . Solution uses recursion.

  24. Can we represent the problem mathematically? Frame a question of the form: Find numbers x,y,z... such that they satisfy constraints ... Constraints: equalities, inequalities, It should be possible to interpret the numbers in terms of queen positions on the board.

  25. Mathematical representation 1 For each board square (i,j), let xij encode whether a queen is present or not present in that square. xij= 1 : queen present xij= 0 : queen absent At most one queen in each row i: xi1+ xi2+ ... + xin 1 Similarly for other conditions Solve !

  26. Example: 3x3 board
Row conditions: x_{11}+x_{12}+x_{13} ≤ 1, x_{21}+x_{22}+x_{23} ≤ 1, x_{31}+x_{32}+x_{33} ≤ 1
Column conditions: x_{11}+x_{21}+x_{31} ≤ 1, x_{12}+x_{22}+x_{32} ≤ 1, x_{13}+x_{23}+x_{33} ≤ 1
Diagonal conditions: x_{21}+x_{32} ≤ 1, x_{11}+x_{22}+x_{33} ≤ 1, x_{12}+x_{23} ≤ 1,
  x_{12}+x_{21} ≤ 1, x_{13}+x_{22}+x_{31} ≤ 1, x_{23}+x_{32} ≤ 1
Place 3 queens: x_{11}+x_{12}+x_{13}+x_{21}+x_{22}+x_{23}+x_{31}+x_{32}+x_{33} = 3
0-1 constraint: All x_{ij} ∈ {0, 1}
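For a board this small, the 0-1 formulation can be checked by simply trying every assignment: a 3x3 board has only 2^9 = 512 of them. The following is a minimal sketch under that assumption; the function name countFormulation1 is hypothetical, and the approach is only practical for very small n because the number of assignments is 2^(n·n).

```cpp
#include <vector>

// Counts the 0-1 assignments x[i][j] on an n x n board that place exactly
// n queens with at most one queen per row, per column, and per diagonal.
// Each assignment is encoded as a bit mask; bit i*n+j stands for x[i][j].
int countFormulation1(int n) {
    int cells = n * n;
    int count = 0;
    for (unsigned long long mask = 0; mask < (1ULL << cells); mask++) {
        std::vector<int> row(n, 0), col(n, 0);
        std::vector<int> diag(2 * n - 1, 0), anti(2 * n - 1, 0);
        int queens = 0;
        bool ok = true;
        for (int i = 0; i < n && ok; i++) {
            for (int j = 0; j < n && ok; j++) {
                if ((mask >> (i * n + j)) & 1ULL) {   // x[i][j] == 1
                    queens++;
                    row[i]++; col[j]++;
                    diag[i - j + n - 1]++; anti[i + j]++;
                    if (row[i] > 1 || col[j] > 1 ||
                        diag[i - j + n - 1] > 1 || anti[i + j] > 1) ok = false;
                }
            }
        }
        if (ok && queens == n) count++;
    }
    return count;
}
```

countFormulation1(3) returns 0: the 3x3 constraints have no solution, so three queens cannot be placed. countFormulation1(4) returns 2.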

  27. Another representation
What do we want to find? Positions for 8 queens.
Position in 2d space : 2 numbers, (x, y)
Are we looking for 16 numbers then? Real or integers? Integers, in the range 1..8
Distinct columns: All x_i should be distinct.
Distinct rows: All y_i should be distinct.
Distinct diagonals: For all i, j where i ≠ j: |x_i - x_j| ≠ |y_i - y_j|

  28. Other examples and variations on constraint satisfaction problems
"Find x_1, x_2, ..., x_n such that ..." is called a constraint satisfaction problem.
8 queens problem is a constraint satisfaction problem.
Another example: solving equations: find x such that 3x^2 + 4x + 1 = 0.
We may in addition have an objective function: of all x_i satisfying the constraints, report one that maximizes some given function f(x_1, x_2, ..., x_n)
Example of a constraint satisfaction problem with an objective function: finding GCD of x, y. Find r such that r divides x, and r divides y, and f(r) = r is maximum.
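In the GCD example the variable r has a finite domain, 1..min(x, y), so one simple (though slow) way to solve it is to test every candidate against the constraints and keep the one with the largest objective value. The sketch below assumes x and y are positive; the name gcdBySearch is made up for this illustration, and it is not meant to replace Euclid's algorithm.

```cpp
// Tries every candidate r in 1..min(x, y). A candidate satisfies the
// constraints if it divides both x and y; the objective f(r) = r is
// maximized by remembering the last (largest) candidate that qualifies.
// Assumes x >= 1 and y >= 1.
int gcdBySearch(int x, int y) {
    int limit = (x < y) ? x : y;
    int best = 1;                       // r = 1 always satisfies the constraints
    for (int r = 1; r <= limit; r++) {
        if (x % r == 0 && y % r == 0) best = r;
    }
    return best;
}
```

For instance, gcdBySearch(12, 18) returns 6.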

  29. Solving constraint satisfaction problems
Perform algebraic manipulation and deduce solution. Quadratic equation: factorize...
Try all possibilities
Works if each variable that we want to solve for has a finite domain
8 queens formulation 1: 64 variables, each either 0 or 1
8 queens formulation 2: 16 variables, each from {1,2,...,8}
We construct each candidate solution and check if it satisfies our constraints. If yes, we report it.
How to construct all solutions? Aren't there a huge number of them?
Number of candidate solutions for 8 queens formulation 1: 2^64
Number of candidate solutions for 8 queens formulation 2: 8^16
8^16 = 2^48 < 2^64
Can we reduce this number further?

  30. Formulation 3
No column can hold more than 1 queen; else there will be captures.
We want 8 queens, need at least 1 queen in each of the 8 columns
So place exactly 1 queen in each column.
New representation: Let y_i = row position of queen in column i.
What conditions should y_1, y_2, ..., y_8 satisfy?
Distinct columns condition: Automatically satisfied
Distinct rows condition: y_i should be distinct.
Distinct diagonals condition: For all i, j, i ≠ j : |y_i - y_j| ≠ |i - j|
Size of search space: 8^8, much smaller than previous 8^16 or 2^64.

  31. A program for 4 queens
int y[4]; // y[i]: row position of column i queen
// For all ways of placing all queens:
for(y[0] = 0; y[0] < 4; y[0]++){
  for(y[1] = 0; y[1] < 4; y[1]++){
    for(y[2] = 0; y[2] < 4; y[2]++){
      for(y[3] = 0; y[3] < 4; y[3]++){
        if(!capture(y,4)) cout << y[0] << y[1] << y[2] << y[3] << endl;
      }
    }
  }
}

  32. Function to check for capture
bool capture(int y[], int n){
  // Decides whether any queen captures any
  // other. n = board size.
  // check for all pairs j>k
  for(int j=1; j<n; j++){
    for(int k=0; k<j; k++){
      // same row, or same diagonal
      if((y[j] == y[k]) || (abs(j-k) == abs(y[j]-y[k])))
        return true;
    }
  }
  return false;
}
// Loop invariant?
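Putting the nested loops and the capture check together gives a complete brute-force solver for 4 queens. The sketch below collects the solutions in a vector of strings instead of printing them, so that the result can be inspected; the wrapper name fourQueens is an assumption made for this example.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// Returns true if any two queens capture each other.
// y[i] = row of the queen in column i, n = board size.
bool capture(const int y[], int n) {
    for (int j = 1; j < n; j++) {
        for (int k = 0; k < j; k++) {
            if (y[j] == y[k] || std::abs(j - k) == std::abs(y[j] - y[k]))
                return true;            // same row or same diagonal
        }
    }
    return false;
}

// Hypothetical wrapper: tries all 4^4 placements with four nested loops
// and returns each non capturing placement as a string of row digits.
std::vector<std::string> fourQueens() {
    std::vector<std::string> found;
    int y[4];
    for (y[0] = 0; y[0] < 4; y[0]++)
        for (y[1] = 0; y[1] < 4; y[1]++)
            for (y[2] = 0; y[2] < 4; y[2]++)
                for (y[3] = 0; y[3] < 4; y[3]++)
                    if (!capture(y, 4)) {
                        std::string s;
                        for (int i = 0; i < 4; i++) s += char('0' + y[i]);
                        found.push_back(s);
                    }
    return found;
}
```

fourQueens() returns the two placements 1302 and 2031, which are mirror images of each other.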

  33. Will the same idea work for any n? Many programming languages will not allow you to nest more than a certain number of loops. In other constraint satisfaction problems, the number of variables to be selected could be very large, making it difficult to do so much nesting. Also, the number of nested loops is fixed when the program is written, whereas n may be known only when the program runs. Recursion comes to our rescue! We will write a recursive program which will not have much nesting but will have the same effect as writing a program with nesting.

  34. A different view of searching through the candidate configurations
S = Set of all possible ways ("configurations") to place queens, one queen per column.
  = All possible ways to assign values to y_0, y_1, ..., y_7
Suppose n = 3. S has 27 elements: {000, 001, 002, 010, 011, 012, 020, ..., 222}
Algorithm outline for n = 3:
Store first configuration in y[0..2], then call capture.
Store second configuration in y[0..2], then call capture.
...
Store 27th configuration in y[0..2], then call capture.
Searching the set S of configurations

  35. How to search S
Observation: S = S_0 ∪ S_1 ∪ ... ∪ S_{n-1}, where S_i = set of configurations in which queen in column 0 is in row i, and other queens anywhere.
3 Queens: S_0 = {000,001,002,010,011,012,020,021,022}
Searching S = searching S_0, ..., S_{n-1}.
But S_i is also a union of smaller sets (recursion!): S_i = S_{i0} ∪ S_{i1} ∪ ... ∪ S_{i,n-1}, where S_{ij} = set of configurations in which queen in column 0 is in row i, and queen in column 1 in row j, and other queens anywhere.
What is S_{02} for the 3 queen problem? {020, 021, 022}

  36. General case
Notational change: We will write S(x) rather than S_x.
S(i_0, i_1, ..., i_{k-1}) : Configurations with queens in first k columns in rows i_j for j=0..k-1.
S(i_0, i_1, ..., i_{k-1}) = S(i_0, i_1, ..., i_{k-1}, 0) ∪ S(i_0, i_1, ..., i_{k-1}, 1) ∪ ... ∪ S(i_0, i_1, ..., i_{k-1}, n-1)
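The decomposition of S(i_0, ..., i_{k-1}) into n smaller subspaces translates directly into a recursive enumeration. The sketch below lists the members of a subspace given its fixed prefix; the function name listSubspace and the use of digit strings for configurations are assumptions for this illustration, and it assumes n ≤ 10 so that each row fits in one digit.

```cpp
#include <string>
#include <vector>

// Appends to out every configuration in S(prefix): the rows of the first
// prefix.size() columns are fixed by prefix, the remaining columns are free.
// Configurations are written as strings of row digits, so n must be <= 10.
void listSubspace(int n, std::string prefix, std::vector<std::string>& out) {
    if (static_cast<int>(prefix.size()) == n) {   // all columns fixed
        out.push_back(prefix);
        return;
    }
    // S(prefix) = S(prefix,0) U S(prefix,1) U ... U S(prefix,n-1)
    for (int r = 0; r < n; r++) {
        listSubspace(n, prefix + char('0' + r), out);
    }
}
```

With n = 3, the prefix "02" produces 020, 021, 022; the prefix "0" produces the 9 elements of S_0; and the empty prefix produces all 27 configurations.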

  37. How to search S (contd.)
void search(int n, int y[], int k){
  // n = number of queens, also length of array y.
  // Function searches subspace S(y[0],y[1],...y[k-1]) of all candidate
  // positions and of these prints those in which there is no capture.
  if(k == n){ // base case
    if(!capture(y,k)){
      for(int j=0; j<k; j++) cout << y[j];
      cout << endl;
    }
  }
  else{ // Search S(y[0],...y[k-1]) recursively
    for(int j=0; j<n; j++){
      y[k] = j; // decomposition given earlier
      search(n, y, k+1);
    }
  }
}

  38. How to search S (contd)
The function search has an important post condition: it does not modify y[0..k-1]. Only because of this can we merely set y[k] = j before recursion.
The main program is natural.
int main(){
  const int n=8;
  int y[n];
  search(n, y, 0);
}
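To experiment with different board sizes, the same recursion can count solutions instead of printing them. The following is a sketch along those lines; countSearch and countQueens are hypothetical names, and the structure mirrors search: every one of the n^n configurations is generated and checked with capture only at the base case.

```cpp
#include <cstdlib>
#include <vector>

// Returns true if any two of the n queens capture each other.
bool capture(const int y[], int n) {
    for (int j = 1; j < n; j++)
        for (int k = 0; k < j; k++)
            if (y[j] == y[k] || std::abs(j - k) == std::abs(y[j] - y[k]))
                return true;
    return false;
}

// Returns the number of non capturing configurations in the subspace
// S(y[0],...,y[k-1]). Does not modify y[0..k-1].
int countSearch(int n, int y[], int k) {
    if (k == n) return capture(y, n) ? 0 : 1;   // base case: full configuration
    int total = 0;
    for (int j = 0; j < n; j++) {
        y[k] = j;                                // fix the queen in column k
        total += countSearch(n, y, k + 1);
    }
    return total;
}

// Hypothetical convenience wrapper: number of solutions for n queens.
int countQueens(int n) {
    std::vector<int> y(n);
    return countSearch(n, y.data(), 0);
}
```

countQueens(4) returns 2 and countQueens(6) returns 4. countQueens(8) examines 8^8 configurations and returns 92, the well-known number of solutions to the eight queens puzzle.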

  39. Recursion tree for n=3
search(3,[-,-,-],0)
  search(3,[0,-,-],1)
    . . .
  search(3,[1,-,-],1)
    search(3,[1,0,-],2)
      . . .
    search(3,[1,1,-],2)
      . . .
    search(3,[1,2,-],2)
      search(3,[1,2,0],3)
      . . .
  search(3,[2,-,-],1)
    . . .

  40. An improvement: Early check S(i_0, i_1, ..., i_{k-1}) : Set of configurations such that queen in column j is in row i_j for j=0..k-1. If any of the first k queens capture each other, we need not worry about the remaining n-k queens: clearly there is no non capturing configuration in this subspace. Key idea: whenever we place the kth queen, we first check if it captures any of the previous queens. Only if it does not do we bother to explore the set of configurations further.

  41. Checking if the kth queen captures any previous queens
bool lastCaptures(int y[], int k){
  // check whether queen in column k
  // captures any in columns 0..k-1.
  for(int j=0; j<k; j++){
    if((y[j] == y[k]) || (abs(j-k) == abs(y[j]-y[k])))
      return true;
  }
  return false;
}

  42. Search function and main program
void search(int n, int y[], int k){
  if(k == n){
    // all n queens placed, and no capture was found on the way
    for(int j=0; j<k; j++) cout << y[j];
    cout << endl;
  }
  else {
    for(int j=0; j<n; j++){
      y[k] = j;
      // recurse on the next column only if queen k captures no earlier queen
      if(!lastCaptures(y,k)) search(n,y,k+1);
    }
  }
}
// Precisely state what this function expects and does.
// main program:
// as before.
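One way to see the effect of the early check is to count how many times the recursive function is called. The sketch below counts solutions rather than printing them and also reports the number of calls; the names countPruned and countQueensPruned and the calls parameter are assumptions made for this illustration. Note that the recursive call moves on to column k+1.

```cpp
#include <cstdlib>
#include <vector>

// Returns true if the queen in column k captures any queen in columns 0..k-1.
bool lastCaptures(const int y[], int k) {
    for (int j = 0; j < k; j++)
        if (y[j] == y[k] || std::abs(j - k) == std::abs(y[j] - y[k]))
            return true;
    return false;
}

// Counts the solutions in S(y[0],...,y[k-1]), assuming the first k queens
// do not capture each other. calls is incremented once per invocation.
int countPruned(int n, int y[], int k, long& calls) {
    calls++;
    if (k == n) return 1;               // all queens placed without capture
    int total = 0;
    for (int j = 0; j < n; j++) {
        y[k] = j;
        if (!lastCaptures(y, k))        // early check before going deeper
            total += countPruned(n, y, k + 1, calls);
    }
    return total;
}

// Hypothetical wrapper: returns the number of solutions for n queens and
// stores the number of recursive calls made in calls.
int countQueensPruned(int n, long& calls) {
    std::vector<int> y(n);
    calls = 0;
    return countPruned(n, y.data(), 0, calls);
}
```

For n = 8 this returns 92. Without the early check, the recursion visits every node of the full tree, which has 1 + 8 + 8^2 + ... + 8^8 = 19173961 nodes; with the check, calls ends up far smaller, since a subtree is abandoned as soon as its prefix contains a capture.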

  43. Remarks Recursion is very powerful even with arrays. The idea of mergesort (divide the input into parts, sort each part, then combine) is called "divide-conquer-combine". Divide-conquer-combine is useful for other problems besides sorting. "Try out all possibilities" works for many problems; see the problems at the end of the chapter. Early condition checking also works quite often.

  44. Exercise 1 Suppose the comparison in our binary search code used <= rather than <. Give an instance (set of input values) for which the algorithm will produce a wrong answer. Where would the proof not work?

  45. Exercise 2 Suppose the binary search code returns the index of the last element it examines, rather than a Boolean value. What value would this be, assuming (1) the element x being searched is not present in the array, (2) there is exactly one occurrence of x in the array, (3) there are multiple occurrences of x?

  46. Exercise 3 Instead of specifying the subarray to be searched by giving the starting index and the length, you might give the starting and ending indices. Write the code using this, and prove its correctness.

  47. Exercise 4 Suppose I represent a set of integers using an array that holds the integers in sorted order. Given two such arrays representing two sets, give an algorithm that prints their union. (The common elements must be printed just once.) Your algorithm should run in time proportional to the size of the union. Suppose the elements were stored without sorting. How many comparisons do you think the natural algorithm would perform? Does sorting help?

  48. Exercise 5 A seller receives bids for an item from n buyers. The seller examines the bids and sells to the highest bidder. Each buyer b bids H_b if she is happy, and S_b if she is sad. Buyers are happy/sad with equal probability, and independently of each other. Write a program that reads in n and the values H_b and S_b for each buyer and prints the expected value of the winning bid. Your program should do this in two ways, and print the two answers on separate lines.
1. The probability space has 2^n points, corresponding to the different ways in which the buyers can be happy or sad. Your program should go over each point and add up the contribution of different points to the expectation. This should be done using a recursive function, similar to what is used in n queens.
2. What is the probability that the highest in all the numbers H, S actually becomes the winning bid? What can you say going down the sorted order of bids? Develop this idea into a very fast algorithm and code that. For sorting, use the standard library function as discussed in Section 22.3.2. To sort an array of structures, you should use a lambda expression as an extra argument, as shown at the top of page 321. It is shown for vectors, but is the same for arrays. Lambda expressions are discussed earlier in the book, and you may find it interesting to understand them fully.

  49. Exercise 6 The basic idea of binary search can be used even when the values over which to search are not given explicitly. Here is an example. The input consists of numbers x_1, x_2, ..., x_n which are lengths of consecutive billboards located along a road. We have an additional input, k, giving the number of painters available to paint the boards. Each board requires time proportional to its length to paint. We must assign some number of consecutive boards to each painter such that the maximum sum of the lengths of boards assigned to any painter is as small as possible, i.e. so as to finish in minimum time. Observe that it is easy to check whether in time T the k painters can finish the job. If the check succeeds, you can decide to check for a lower time T. Develop this idea into an algorithm that determines the minimum time. Your algorithm should do about log n checks only.
