B-Trees: Efficient Data Storage and Retrieval

 
B-trees
 
Eduardo Laber
David Sotelo
 
What are B-trees?
 
Balanced search trees designed for secondary
storage devices
 
Similar to AVL-trees but better at minimizing
disk I/O operations
 
Main data structure used by DBMS to store
and retrieve information
 
What are B-trees?
 
Nodes may have many children (from a few to
thousands)
 
Branching factor can be quite large
 
Every B-tree of n keys has height O(log n)
 
In practice,  its height is smaller than the height
of an AVL-Tree
 
B-trees and Branching Factor
 
 
Definition of B-trees
 
 
B-tree is a rooted tree containing the following
five properties:
 
1.
Every node x has the following attributes:
a)
 
x.n
, the number of keys stored in node x
b)
The 
x.n 
keys: 
x.key
1
x.key
2
 
≤  ...  ≤ 
x.key
x.n
c)
The boolen 
x.leaf 
indicating if x is a 
leaf
 or an
internal node
 
Definition of B-trees
 
2.
If x is an internal node it contains 
x.n + 1
pointers 
x.p
1
 , 
x.p
2
 
,  ...  , 
x.p
(x.n + 1) 
to its
children
 
3.
The keys 
x.key
i
 separate ranges of trees
stored in each subtree  
(x.p
i
 , x.p
i+1 
)
 
4.
All leaves have the same depth ==  tree’s
height.
 
 
Definition of B-trees
 
5.
Bounds on the number of keys of a node:
 
Let 
B 
 be a positive integer representing the 
order
 of
the B-tree.
 
Every node (except the root) must have 
at least 
B
keys
.
 
Every node (except the root) must have 
at most 
2B
keys
.
 
Root  is free to contain between 
1
 and 
2B
 nodes
(why?)
 
 
 
 
 
 
Example of a B-tree
 
 
Exercise 1
 
 
 
Enumerate all valid B-trees of order 2
that represent the set {1, 2, ... , 8}
 
Exercise 1
Solution:
 
 
 
 
 
 
The height of a B-tree
 
 
Theorem 
: Let 
h
 be the height of a B-tree of 
n
keys and order 
B > 1
.  Then:  
h ≤ log 
B
 (n+1)/2
 
 
Proof:
Root contains at least one key.
All other nodes contain at least B keys
At least one key at depth 0
At least 2B keys at depth 1
At least 2B
2 
+ B keys at depth 2
At least 2B
i 
+ B
i-1
 + B
i-2  
+  ... + B keys at depth i
 
 
 
 
 
 
 
 
 
 
Proof (continued)
 
 
 
 
 
 
 
 
 
        
 
 
 
 
 
 
 
 
 
 
 
Searching a B-tree
 
Similar to searching a binary search tree.
 
Multiway branching decision according to the
number of  the node’s chidren.
 
Recursive procedure with a time complexity of
 
O(B log 
B 
n) 
for a tree of 
n
 keys and order 
B
.
 
 
Searching a B-tree
B-TREE-SEARCH (x, k)
1
 
i = 1
2
 while
 i ≤ x.n  
and  
k > x.key
i
  
do
  i = i + 1
3
 if
 i ≤ x.n 
and
 k == x.key
i
  
then  return
 (x, i)
4
 if
 x.leaf 
then return 
NIL
5
 else
 
DISK-READ(x.p
i
)
 
return  
B-TREE-SEARCH (x.p
i 
, k)
 
Searching a B-tree
 
Search for the key 
F
 
 
 
 
Searching a B-tree
J
A   B
N   O
H   I
Q   R
D   E  F
L   M
T   U
K   P  S
C   G
 
Search for the key 
F
 
 
 
 
Searching a B-tree
J
A   B
N   O
H   I
Q   R
D   E  F
L   M
T   U
K
   P  S
C   G
 
Search for the key 
F
 
 
 
 
Searching a B-tree
J
A   B
N
   O
H   I
Q   R
D   E  F
L   M
T   U
K
   
P
  S
C   G
 
Search for the key 
N
 
 
 
 
Searching a B-tree
 
Lemma: 
 The time complexity of procedure
B-TREE-SEARCH
 
 is O(B log 
B
 n)
 
Proof:
Number of recursive calls is equal to tree’s
height.
The height  of a B-tree is O(log 
B
 n)
Cost between 
B
 and 
2B
 iterations per call.
Total of O(B log 
B
 n) steps.   
 
 
Exercise 2
 
Suppose that B-TREE-SEARCH is implemented
to use  binary search rather than linear search
within each node.
 
Show that this changes makes the time
complexity O(lg n), independently of how 
B
might be chosen as a function of 
n
.
 
Exercise 2
 
Solution:
By using binary search the number of steps of
the algorithm  becomes 
O(lg B log 
B
 n) 
.
 
Observe that  
log 
B
 n = lg n / lg B 
.
 
Therefore 
O(lg B log 
B
 n)  = O(lg n)
.
 
 
 
Linear or Binary B-tree search ?
 
Lemma: 
 If 1 < B < n then  
lg n
 
 
B log
B
 n
 
Proof:
 
 
 
 
 
 
Inserting a key into a B-tree
 
The new key is always inserted into an existing leaf node
(why?)
 
Firstly we search for the leaf position at which to insert the
new key.
 
If such a node is full we split it.
 
A 
split operation 
splits a full node around its 
median key
into two nodes having 
B
 keys each.
 
Median key moves up into splitted node
’s parent
 
(insertion recursive call).
 
Split operation
 
Inserting key 
F
 into a full node (
B
 = 2)
A  C  E  G
K  M  O   Q
J
 
Split operation
 
Node found but already full
A  C  E  
F
 G
K  M  O   Q
J
 
Split operation
 
Median key identified
A  C  
E
  F G
K  M  O   Q
J
 
Split operation
 
Splitting the node
A  C
K  M  O  Q
E   
J
F  G
 
Inserting a key into a B-tree
 
Insertion can be propagated upward (
B
 = 2)
A  C
K  M  O  Q
E    J    T    X
F  G
U W
Y  Z
 
Inserting a key into a B-tree
 
Insertion can be propagated upward (
B
 = 2)
A  C
K  M 
N 
O  Q
E    J    T    X
F  G
U W
Y  Z
 
Inserting a key into a B-tree
 
Insertion can be propagated upward (
B
 = 2)
A  C
K  M
E    J   
N
   T    X
F  G
U W
Y  Z
O  Q
 
SPLIT
 
Inserting a key into a B-tree
 
Insertion can be propagated upward (
B
 = 2)
A  C
K  M
F  G
E  J
N
O  Q
Y  Z
U  W
T  X
 
SPLIT
 
Inserting a key into a B-tree
 
B-TREE-INSERT (x, 
k, y
)
1
 
i = 1
2
 while
 i ≤ x.n  
and  
k < x.key
i
  
do
  i = i + 1
3
 
x.n =  x.n + 1
4
 
x.key
i
 = k
5
 
x.p
i+1
 = y
6
 for
 j = x.n 
downto
 i+1 do
7
      
x.key
j
 = x.key
j-1
8
      
x.p
j
 = x.p
j-1
9
 
end-for
10
 
DISK-WRITE(x)
 
Inserting a key into a B-tree
 
B-TREE-INSERT (x, k)
11
 if
 x.n >  2*B 
then
12
     
[m, z] = SPLIT (x)
13
 
    
if
 x.parent  !=  NIL 
then
14
          
DISK-READ (x.parent)
15
 
    
end-if
16
     else
17
         
x.parent = ALLOCATE-NODE()
18
 
        DISK-WRITE (x)
19
 
        root = x.parent
20
     end-else
21
     
B-TREE-INSERT (x.parent, m, z)
22
 end-if
 
Inserting a key into a B-tree
 
SPLIT
 (x)
1
 
z = ALLOCATE-NODE()
2
 
m = FIND-MEDIAN (x)
3
 
COPY-GREATER-ELEMENTS(x, m, z)
4
 
DISK-WRITE (z)
5
 
COPY-SMALLER-ELEMENTS(x, m, x)
6
 
DISK-WRITE (x)
7
return [m, z]
 
Inserting a key into a B-tree
 
Function B-TREE-INSERT has three arguments:
The node 
x 
at which an element of key 
k 
should be
inserted
The key
 k 
to be inserted
A pointer 
y
 to the left child of  
k
 to be used as one of the
pointers of 
x
 during insertion process.
 
There is a global variable named 
root
 which is a pointer to the
root of the B-Tree.
 
Observe that the field 
x.parent
 was not defined as an original
B-tree attribute, but is considered just to simplify the process.
 
The fields 
x.leaf
 should also be updated accordingly.
 
 
Inserting a key into a B-tree
 
Lemma: 
 The time complexity of  B-TREE-INSERT
 
is O(B log 
B
 n)
Proof:
Recall that B-TREE-SEARCH function is called first and costs
 
O(log  n) by using binary search. Then, B-TREE-INSERT starts by
visiting a node and proceeds upward.
 
At most one node is visited per level/depth  and  only visited nodes
can be splitted. A most one node is created during the insertion
process. Cost for splitting is proportional to 
2B
.
 
Number of visited nodes is equal to tree’s height and the height  of
a B-tree is O(log 
B
 n). Cost between 
B
 and 
2B
 iterations per visited
node. Total of O(B log 
B
 n) steps.   
 
 
Some questions on insertion
 
Which split operation increases the tree’s height?
 
The split of the root of the tree.
 
How many 
DISK-READ
 operations are executed
by the insertion algorithm?
 
Every node was read at least twice.
 
Does binary search make sense here?
 
Not exactly. We already pay O(B) to split a node
(for finding the median).
 
 
Drawbacks of our insertion method
 
Once that the key’s insertion node is found it may be
necessary to read its parent node again (due to splitting).
 
DISK-READ/WRITE
 operations are expensive and would be
executed al least 
twice
 for each node in the key’s path.
 
It would be necessary to store a nodes’s parent or to use
the recursion stack to keep its reference.
 
(Mond and Raz, 1985)
 provide a solution that spends one
DISK-READ/WRITE
 per visited node (
See at CLRS
)
 
Exercise 3
 
 
Show the results of inserting the keys
 
E, H, B, A, F, G, C, J, D, I
 
in order into an empty B-tree of order 1.
 
Exercise 3
Solution: 
(final configuration)
 
G  I
F
C  D
H
J
A
B
E
 
Exercise 4
 
 
Does a B-tree of order 1 is a good choice for a
balanced search tree?
 
What about the expression 
h ≤ log 
B
 (n+1)/2
when B = 1?
 
Deleting a key from a B-tree
 
Analogous to insertion but a little more complicated.
 
A key can be deleted from any node (not just a leaf) and
can affect  its parent and its children (insertion operation
just affect parents).
 
One must ensure that a node does not get to small
during deletion (less than 
B
 keys).
 
As a result deleting a node is the most complex operation
on B-trees. It will be considered  in 4 particular cases.
 
 
Deleting a key from a B-tree
 
 
Case 1: 
 The key is in a 
leaf
 node with 
more
than
 
B
 elements.
 
 
Procedure
:
Just remove the key from the node.
 
Deleting a key from a B-tree
 
Case 1
: The key is in a leaf node with more
than 
B
 elements (
B
 = 2)
A  
C
  D
K  M
F  G
E  J
N
O  Q
Y  Z
U  W
T  X
 
Deleting a key from a B-tree
 
Case 1
: The key is in a leaf node with more
than 
B
 elements (
B
 = 2)
A  D
K  M
F  G
E  J
N
O  Q
Y  Z
U  W
T  X
 
Deleting a key from a B-tree
 
Case 2:  The join procedure
The key 
k
1
 to be deleted is in a leaf 
x 
with exactly 
B
 elements.
Let 
y
 be a node that is an “adjacent brother” of 
x
.
Suppose that 
y
 has
 exactly 
B
 elements.
Procedure
:
Remove the key 
k
1
.
Let
 k
2
 
be the key
 
that separates nodes 
x
 and 
y
 in their parent.
Join
 the the nodes 
x
 and 
y 
and move the key 
k
2 
from the parent
to the new joined node.
If the parent of 
x 
becomes with 
B-1 elements
 and also has an
“adjacent brother” with 
B
 elements,  apply the  
join
 procedure
recursively for the parent of 
x 
(seen as 
x
) and its adjacent
brother (seen as 
y
).
 
Deleting a key from a B-tree
 
Case 2
: Delete key 
Q
  (
B = 2
)
H  I
F
O  Q
Y  Z
U  W
K       T      X
...
 
Deleting a key from a B-tree
 
Case 2
: Delete key 
Q
  (
B = 2
)
H  I
F
O  
Q
Y  Z
U  W
K       T      X
 
Node x
 
Node y
 
Parent
...
 
Deleting a key from a B-tree
 
Case 2
: Delete key 
Q
  (
B = 2
)
H  I
F
O
Y  Z
U  W
K       T      X
 
Node x
 
Node y
 
Parent
...
 
Deleting a key from a B-tree
 
Case 2
: Delete key 
Q
  (
B = 2
)
H  I
F
O
Y  Z
U  W
K       
T
      X
 
Node x
 
Node y
 
Parent
...
 
Deleting a key from a B-tree
 
Case 2
: Delete key 
Q
  (
B = 2
)
H  I
F
Y  Z
O  T  U  W
K            X
...
 
Parent
 
Join
 
Deleting a key from a B-tree
 
Case 2
: Delete key 
Q
  (
B = 2
)
H  I
F
Y  Z
O  T  U  W
K            X
...
 
Deleting a key from a B-tree
 
Case 3:  join and split
The key 
k
1
 to be deleted is in a leaf 
x 
with exactly 
B
 elements.
Let 
y
 be a node that is an “adjacent brother” of 
x
.
Suppose that 
y
 has
 more than 
B
 elements.
Procedure
:
Remove the key 
k
1
.
Let
 k
2
 
be the key
 
that separates nodes 
x
 and 
y
 in their parent.
Join
 the the nodes 
x
 and 
y 
and move the key 
k
2 
from the parent
to the new joined node 
z
.
Find
 
the median key 
m
 of 
z
Determine the new nodes 
x
 and 
y
 by splitting
 z
 around 
m
.
Insert
 m
 into the parent of 
x 
and 
y
.
 
Deleting a key from a B-tree
 
Case 3
: Delete key 
F
  (
B = 2
)
A  C  D
K  M
F  G
E  J
N
O  Q
Y  Z
U  W
T  X
 
Deleting a key from a B-tree
 
Case 3
: Delete key 
F
  (
B = 2
)
A  C  D
K  M
F 
 G
E  J
N
O  Q
Y  Z
U  W
T  X
 
Deleting a key from a B-tree
 
Case 3
: Delete key 
F
  (
B = 2
)
A  C  D
K  M
G
E  J
N
O  Q
Y  Z
U  W
T  X
 
Deleting a key from a B-tree
 
Case 3
: Delete key 
F
  (
B = 2
)
A  C  D
K  M
G
E
  J
N
O  Q
Y  Z
U  W
T  X
 
Parent
 
Node x
 
Node y
 
Deleting a key from a B-tree
 
Case 3
: Delete key 
F
  (
B = 2
)
A  C  D
K  M
G
E
  J
N
O  Q
Y  Z
U  W
T  X
 
Parent
 
Node x
 
Node y
 
Deleting a key from a B-tree
 
Case 3
: Delete key 
F
  (
B = 2
)
A  C  D  E  G
K  M
J
N
O  Q
Y  Z
U  W
T  X
 
Join
 
Deleting a key from a B-tree
 
Case 3
: Delete key 
F
  (
B = 2
)
A  C  
D
  E  G
K  M
J
N
O  Q
Y  Z
U  W
T  X
 
Median key
 
Deleting a key from a B-tree
 
Case 3
: Delete key 
F
  (
B = 2
)
A  C
K  M
D 
  J
N
O  Q
Y  Z
U  W
T  X
 
Split
E  G
 
Deleting a key from a B-tree
 
Case 3
: Delete key 
F
  (
B = 2
)
A  C
K  M
D
 
  J
N
O  Q
Y  Z
U  W
T  X
E  G
 
Deleting a key from a B-tree
 
Case 4:  internal node
The key 
k
1
 to be deleted is in a node 
x
 that is not a leaf
 
or
a 
 root
.
 
Procedure
:
Let 
k
2
 be the smallest key that is greater than 
k
1
.
Let 
y
 be the node of 
k
2
, which will be a leaf.
Insert key 
k
2
 into 
x
.
Remove the key 
k
1
 from 
x
.
Solve now the problem of removing 
k
2
 from a leaf 
y
,
previously considered.
 
Deleting a key from a B-tree
 
Case 4
: Delete key 
T
  (
B = 2
)
A  C
K  M
D
 
  J
N
O  Q
Y  Z
U  W
T  X
E  G
 
Deleting a key from a B-tree
 
Case 4
: Delete key 
T
  (
B = 2
)
A  C
K  M
D
 
  J
N
O  Q
Y  Z
U  W
T
  X
E  G
 
Node x
 
Deleting a key from a B-tree
 
Case 4
: Delete key 
T
  (
B = 2
)
A  C
K  M
D
 
  J
N
O  Q
Y  Z
U 
 W
T
  X
E  G
 
Node x
 
Node y
 
Deleting a key from a B-tree
 
Case 4
: Delete key 
T
  (
B = 2
)
A  C
K  M
D
 
  J
N
O  Q
Y  Z
W
U
  X
E  G
 
Node x
 
Node y
 
Deleting a key from a B-tree
 
Case 4
: Delete key 
T
  (
B = 2
)
A  C
K  M
D
 
  J
N
O  Q
Y  Z
W
U  X
E  G
 
Node x
 
Node y
 
Deleting a key from a B-tree
 
Case 4
: Delete key 
T
  (
B = 2
)
A  C
K  M
D
 
  J
N
O  Q
Y  Z
W
U  X
E  G
 
Node x
 
Node y
 
Deleting a key from a B-tree
 
Case 4
: Delete key 
T
  (
B = 2
)
A  C
K  M
D
 
  J
N
O  Q
Y  Z
W
U  
X
E  G
 
Node x
 
Node y
 
Deleting a key from a B-tree
 
Case 4
: Delete key 
T
  (
B = 2
)
A  C
K  M
D
 
  J
N
O  Q
W  X  Y  Z
U
E  G
 
Node x
 
Node y
 
Deleting a key from a B-tree
 
Case 4
: Delete key 
T
  (
B = 2
)
A  C
K  M
D
 
  J
N
O  Q
W  X  Y  Z
U
E  G
 
Node x
 
Node y
 
Deleting a key from a B-tree
 
Case 4
: Delete key 
T
  (
B = 2
)
A  C
K  M
D
 
  J
N
O  Q
W  X  Y  Z
U
E  G
 
Node x
 
Node y
 
Deleting a key from a B-tree
 
Case 4
: Delete key 
T
  (
B = 2
)
A  C
K  M
D
 
  J
N
O  Q
W  X  Y  Z
U
E  G
 
Node x
 
Node y
 
Deleting a key from a B-tree
 
Case 4
: Delete key 
T
  (
B = 2
)
A  C
K  M
D
 
  J
N
O  Q
W  X  Y  Z
U
E  G
 
Node x
 
Node y
 
Deleting a key from a B-tree
 
Case 4
: Delete key 
T
  (
B = 2
)
A  C
K  M
D  
 
  J     N     U
O  Q
W  X  Y  Z
E  G
Slide Note
Embed
Share

B-Trees are balanced search trees designed for secondary storage devices, commonly used by databases. They can have many children, allowing for efficient data organization. The branching factor of B-Trees keeps their height low, making them ideal for minimizing disk I/O operations. This article explains the properties and characteristics of B-Trees, along with examples and exercises to enhance understanding.

  • B-Trees
  • Data Storage
  • Retrieval
  • Database Systems

Uploaded on Jul 10, 2024 | 2 Views


Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

You are allowed to download the files provided on this website for personal or commercial use, subject to the condition that they are used lawfully. All files are the property of their respective owners.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.

E N D

Presentation Transcript


  1. B-trees Eduardo Laber David Sotelo

  2. What are B-trees? Balanced search trees designed for secondary storage devices Similar to AVL-trees but better at minimizing disk I/O operations Main data structure used by DBMS to store and retrieve information

  3. What are B-trees? Nodes may have many children (from a few to thousands) Branching factor can be quite large Every B-tree of n keys has height O(log n) In practice, its height is smaller than the height of an AVL-Tree

  4. B-trees and Branching Factor

  5. Definition of B-trees B-tree is a rooted tree containing the following five properties: 1. Every node x has the following attributes: a) x.n, the number of keys stored in node x b) The x.n keys: x.key1 x.key2 ... x.keyx.n c) The boolen x.leaf indicating if x is a leaf or an internal node

  6. Definition of B-trees 2. If x is an internal node it contains x.n + 1 pointers x.p1, x.p2, ... , x.p(x.n + 1) to its children 3. The keys x.keyiseparate ranges of trees stored in each subtree (x.pi, x.pi+1 ) 4. All leaves have the same depth == tree s height.

  7. Definition of B-trees 5. Bounds on the number of keys of a node: Let B be a positive integer representing the order of the B-tree. Every node (except the root) must have at least B keys. Every node (except the root) must have at most 2B keys. Root is free to contain between 1 and 2B nodes (why?)

  8. Example of a B-tree

  9. Exercise 1 Enumerate all valid B-trees of order 2 that represent the set {1, 2, ... , 8}

  10. Exercise 1 Solution: 4 5 1 2 3 5 6 7 8 1 2 3 4 6 7 8 3 6 1 2 4 5 7 8

  11. The height of a B-tree Theorem : Let h be the height of a B-tree of n keys and order B > 1. Then: h log B (n+1)/2 Proof: Root contains at least one key. All other nodes contain at least B keys At least one key at depth 0 At least 2B keys at depth 1 At least 2B2 + B keys at depth 2 At least 2Bi + Bi-1 + Bi-2 + ... + B keys at depth i

  12. Proof (continued) h = i + i 1 2 n B 1 + 1 h 1 B + 1 2 n 1 B + 1 n h h 2 1 n B B 2 + 1 n log h B 2

  13. Searching a B-tree Similar to searching a binary search tree. Multiway branching decision according to the number of the node s chidren. Recursive procedure with a time complexity of O(B log B n) for a tree of n keys and order B.

  14. Searching a B-tree B-TREE-SEARCH (x, k) 1 i = 1 2 whilei x.n and k > x.keyido i = i + 1 3 ifi x.n and k == x.keyithen return (x, i) 4 if x.leaf then return NIL 5 else DISK-READ(x.pi) return B-TREE-SEARCH (x.pi , k)

  15. Searching a B-tree Search for the key F J K P S C G D E F H I L M Q R T U A B N O

  16. Searching a B-tree Search for the key F J K P S C G D E F H I L M Q R T U A B N O

  17. Searching a B-tree Search for the key F J K P S C G D E F H I L M Q R T U A B N O

  18. Searching a B-tree Search for the key N J KP S C G D E F H I L M Q R T U A B N O

  19. Searching a B-tree Lemma: The time complexity of procedure B-TREE-SEARCHis O(B log B n) Proof: Number of recursive calls is equal to tree s height. The height of a B-tree is O(log B n) Cost between B and 2B iterations per call. Total of O(B log B n) steps.

  20. Exercise 2 Suppose that B-TREE-SEARCH is implemented to use binary search rather than linear search within each node. Show that this changes makes the time complexity O(lg n), independently of how B might be chosen as a function of n.

  21. Exercise 2 Solution: By using binary search the number of steps of the algorithm becomes O(lg B log B n) . Observe that log B n = lg n / lg B . Therefore O(lg B log B n) = O(lg n).

  22. Linear or Binary B-tree search ? Lemma: If 1 < B < n then lg n B logB n Proof: ( ) B lg n = = = B B log log lg B n n n B B B lg B B B / n B n B ( ( ) ( ) lg ( ) B B B lg lg / lg / n B n B n n ) ( ) n 1 B B lg / lg n n n

  23. Inserting a key into a B-tree The new key is always inserted into an existing leaf node (why?) Firstly we search for the leaf position at which to insert the new key. If such a node is full we split it. A split operation splits a full node around its median key into two nodes having B keys each. Median key moves up into splitted node s parent (insertion recursive call).

  24. Split operation Inserting key F into a full node (B = 2) J A C E G K M O Q

  25. Split operation Node found but already full J A C E F G K M O Q

  26. Split operation Median key identified J A C E F G K M O Q

  27. Split operation Splitting the node E J A C F G K M O Q

  28. Inserting a key into a B-tree Insertion can be propagated upward (B = 2) E J T X Y Z U W A C F G K M O Q

  29. Inserting a key into a B-tree Insertion can be propagated upward (B = 2) E J T X Y Z U W A C F G K M N O Q

  30. Inserting a key into a B-tree Insertion can be propagated upward (B = 2) E J N T X A C F G K M O Q U W Y Z SPLIT

  31. Inserting a key into a B-tree Insertion can be propagated upward (B = 2) N SPLIT E J T X A C F G K M O Q U W Y Z

  32. Inserting a key into a B-tree B-TREE-INSERT (x, k, y) 1 i = 1 2 whilei x.n and k < x.keyido i = i + 1 3 x.n = x.n + 1 4 x.keyi = k 5 x.pi+1 = y 6 for j = x.n downto i+1 do 7 x.keyj = x.keyj-1 8 x.pj = x.pj-1 9 10 DISK-WRITE(x) end-for

  33. Inserting a key into a B-tree B-TREE-INSERT (x, k) 11 if x.n > 2*B then 12 [m, z] = SPLIT (x) 13 if x.parent != NIL then 14 DISK-READ (x.parent) 15 end-if 16 else 17 x.parent = ALLOCATE-NODE() 18 DISK-WRITE (x) 19 root = x.parent 20 end-else 21 B-TREE-INSERT (x.parent, m, z) 22 end-if

  34. Inserting a key into a B-tree SPLIT (x) 1 z = ALLOCATE-NODE() 2 m = FIND-MEDIAN (x) 3 COPY-GREATER-ELEMENTS(x, m, z) 4 DISK-WRITE (z) 5 COPY-SMALLER-ELEMENTS(x, m, x) 6 DISK-WRITE (x) 7 return [m, z]

  35. Inserting a key into a B-tree Function B-TREE-INSERT has three arguments: The node x at which an element of key k should be inserted The key k to be inserted A pointer y to the left child of k to be used as one of the pointers of x during insertion process. There is a global variable named root which is a pointer to the root of the B-Tree. Observe that the field x.parent was not defined as an original B-tree attribute, but is considered just to simplify the process. The fields x.leaf should also be updated accordingly.

  36. Inserting a key into a B-tree Lemma: The time complexity of B-TREE-INSERTis O(B log B n) Proof: Recall that B-TREE-SEARCH function is called first and costs O(log n) by using binary search. Then, B-TREE-INSERT starts by visiting a node and proceeds upward. At most one node is visited per level/depth and only visited nodes can be splitted. A most one node is created during the insertion process. Cost for splitting is proportional to 2B. Number of visited nodes is equal to tree s height and the height of a B-tree is O(log B n). Cost between B and 2B iterations per visited node. Total of O(B log B n) steps.

  37. Some questions on insertion Which split operation increases the tree s height? The split of the root of the tree. How many DISK-READ operations are executed by the insertion algorithm? Every node was read at least twice. Does binary search make sense here? Not exactly. We already pay O(B) to split a node (for finding the median).

  38. Drawbacks of our insertion method Once that the key s insertion node is found it may be necessary to read its parent node again (due to splitting). DISK-READ/WRITE operations are expensive and would be executed al least twice for each node in the key s path. It would be necessary to store a nodes s parent or to use the recursion stack to keep its reference. (Mond and Raz, 1985) provide a solution that spends one DISK-READ/WRITE per visited node (See at CLRS)

  39. Exercise 3 Show the results of inserting the keys E, H, B, A, F, G, C, J, D, I in order into an empty B-tree of order 1.

  40. Exercise 3 Solution: (final configuration) E B G I A C D F H J

  41. Exercise 4 Does a B-tree of order 1 is a good choice for a balanced search tree? What about the expression h log B (n+1)/2 when B = 1?

  42. Deleting a key from a B-tree Analogous to insertion but a little more complicated. A key can be deleted from any node (not just a leaf) and can affect its parent and its children (insertion operation just affect parents). One must ensure that a node does not get to small during deletion (less than B keys). As a result deleting a node is the most complex operation on B-trees. It will be considered in 4 particular cases.

  43. Deleting a key from a B-tree Case 1: The key is in a leaf node with more than B elements. Procedure: Just remove the key from the node.

  44. Deleting a key from a B-tree Case 1: The key is in a leaf node with more than B elements (B = 2) N E J T X A C D F G K M O Q U W Y Z

  45. Deleting a key from a B-tree Case 1: The key is in a leaf node with more than B elements (B = 2) N E J T X A D F G K M O Q U W Y Z

  46. Deleting a key from a B-tree Case 2: The join procedure The key k1to be deleted is in a leaf x with exactly B elements. Let y be a node that is an adjacent brother of x. Suppose that y has exactly B elements. Procedure: Remove the key k1. Let k2be the keythat separates nodes x and y in their parent. Join the the nodes x and y and move the key k2 from the parent to the new joined node. If the parent of x becomes with B-1 elements and also has an adjacent brother with B elements, apply the join procedure recursively for the parent of x (seen as x) and its adjacent brother (seen as y).

  47. Deleting a key from a B-tree Case 2: Delete key Q (B = 2) F K T X ... H I O Q U W Y Z

  48. Deleting a key from a B-tree Case 2: Delete key Q (B = 2) F Parent K T X ... H I O Q U W Y Z Node x Node y

  49. Deleting a key from a B-tree Case 2: Delete key Q (B = 2) F Parent K T X ... H I O U W Y Z Node x Node y

  50. Deleting a key from a B-tree Case 2: Delete key Q (B = 2) F Parent K T X ... H I O U W Y Z Node x Node y

More Related Content

giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#