Binary Search Trees and Balance Strategies

undefined
CS106X –
Programming
Abstractions in C++
Cynthia Bailey Lee
 
 
CS2 in C++ Peer Instruction
Materials by 
 is
licensed under a 
.
Permissions beyond the scope of this
license may be available
at 
.
http://peerinstruction4cs.orgShareAlike 4.0 International LicenseAttribution-NonCommercial-Creative CommonsCynthia Bailey Lee  
 
Today’s Topics:
Map Interface, continued
 
1.
Binary Search Tree math
(continued from Friday)
How to maintain balance
2.
Introduction to Hashing
3.
C++ template details (if we have time)
 
2
 
BST math
 
 
 
Bad BSTs
 
One way to create a bad BST is to insert the
elements in order: 34, 22, 18, 9, 3
That’s not the only way…
 
How many 
distinctly structured 
BSTs are there that
exhibit the worst case height (height equals number
of nodes) for a tree with 5 nodes?
A.
2-3
B.
4-5
C.
6-7
D.
8-9
E.
More than 9
 
BALANCE!!!
 
The 
#1
 issue to remember with BSTs is that
they are great when balanced (O(log n)
operations), and horrible when unbalanced
(O(n) operations)
Balance depends on order of insert of
elements
Not the implementor’s control
Over the years, implementors have devised
ways of making sure BSTs stay balanced no
matter what order the elements are inserted
 
BST Balance Strategies
 
 
Red-Black trees
 
One of the most famous (and most tricky) strategies
for keeping a BST balanced
 
Red-Black trees
 
In addition to the requirements of binary
search trees, red–black trees must meet these:
A node is either red or black.
The root is black.
All leaves (null children) are black.
Both children of every red node are black.
Every simple path from a given node to any of its
descendant leaves contains the same number of
black nodes.
 
Red-Black trees
 
In addition to the requirements of binary
search trees, red–black trees must meet these:
A node is either red or black.
The root is black.
All leaves (null children) are black.
Both children of every red node are black.
Every simple path from a given node to any of its
descendant leaves contains the same number of
black nodes.
According to these, what is the greatest possible difference
between the longest and shortest root-to-leaf paths in the tree?
A.
All root-to-leaf paths are equal length
B.
25% longer
C.
50% longer
D.
100% longer
E.
Other/none/more than one
 
Red-Black trees
 
Every simple path from a given node to any
of its descendant leaves contains the same
number of black nodes.
This is what guarantees “close” to balance
 
This file is licensed under the 
Creative Commons
 
Attribution-Share Alike 3.0 Unported
 license. 
http://commons.wikimedia.org/wiki/File:Red-black_tree_example.svg
 
Insert procedure must maintain
the invariants (this gets tricky)
 
Video:
http://www.youtube.com/watch?v=vDHFF4wjWYU
 
This file is licensed under the 
Creative Commons
 
Attribution-Share Alike 3.0 Unported
 license. 
http://commons.wikimedia.org/wiki/File:Red-black_tree_example.svg
 
Other BST balance strategies
 
Red-Black tree
AVL tree
Treap (BST + heap in one tree! What could be cooler
than that, amirite? 
♥ ♥ ♥ 
)
 
Other fun types of 
BST
:
Splay tree
Rather than only worrying about balance, Splay Tree
dynamically readjusts based on 
how 
often
 users search
for an item
. Commonly-searched items move to the
top, saving time
B-Tree
Like BST, but a node can have many children, not just 2
 
Hashing
 
Implementing the Map interface with Hashing/Hash
Tables
Imagine you want to look up your
neighbors’ names, based on their house
number
 (all on your same street)
House numbers: 10565 through 90600
(roughly 1000 houses—there are varying
gaps in house numbers between houses)
Names: one last name per house
 
Options
BST
 (balanced):
Add/remove: 
 
O(logn)
Find: 
  
O(logn)
Linked list
:
Add/remove: 
 
O(n)
Find: 
  
O(n)
Array
:
Add/remove:
Find:
 
Hash Table is just a modified,
more flexible array
 
Keys don’t have to be integers 0-(size-1)
(Ideally) avoids big gaps like our gap from
0 to 10565 in the house numbers and
between numbers
 
Hash function is what makes this possible!
 
Closed Addressing
 
Where does
key=“Annie”
value=10 go if hash
(“Annie”)   = 3?
Where does
key=“Solange”
value=12 go if
hash(“Solange”) =
5?
 
 
Hash collisions
 
We may need to worry about 
hash
collisions
 
 
 
Hash collision:
Two keys a, b, a≠b, have the same hash
code index (i.e. hash(a) = hash(b))
 
Closed Addressing
 
Where does
key=“Annie”
value=55 go if
hashkey(“Annie”)
= 3?
A.
55 overwrites 10
at 3
B.
A link list node is
added at 3
C.
Other/none/
more than one
 
Hash key collisions
 
We can NOT overwrite the value the way
we would if it really were the same key
Need a way of storing multiple values in a
given “place” in the hash table
Slide Note
Embed
Share

The concepts of Binary Search Trees (BSTs), including strategies for maintaining balance such as Red-Black trees. Learn about the importance of balance in BST operations and how different structures impact efficiency. Dive into the world of tree algorithms and height optimization for better data organization and retrieval.

  • Binary Search Trees
  • Balance Strategies
  • Red-Black Trees
  • Data Structures
  • Algorithms

Uploaded on Feb 16, 2025 | 0 Views


Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. Download presentation by click this link. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

E N D

Presentation Transcript


  1. Creative Commons License CS2 in C++ Peer Instruction Materials by Cynthia Bailey Lee is licensed under a Creative Commons Attribution-NonCommercial- ShareAlike 4.0 International License. Permissions beyond the scope of this license may be available at http://peerinstruction4cs.org. CS106X Programming Abstractions in C++ Cynthia Bailey Lee

  2. 2 Today s Topics: Map Interface, continued 1. Binary Search Tree math (continued from Friday) How to maintain balance 2. Introduction to Hashing 3. C++ template details (if we have time)

  3. BST math

  4. Bad BSTs One way to create a bad BST is to insert the elements in order: 34, 22, 18, 9, 3 That s not the only way How many distinctly structured BSTs are there that exhibit the worst case height (height equals number of nodes) for a tree with 5 nodes? A. 2-3 B. 4-5 C. 6-7 D. 8-9 E. More than 9

  5. BALANCE!!! The #1 issue to remember with BSTs is that they are great when balanced (O(log n) operations), and horrible when unbalanced (O(n) operations) Balance depends on order of insert of elements Not the implementor s control Over the years, implementors have devised ways of making sure BSTs stay balanced no matter what order the elements are inserted

  6. BST Balance Strategies

  7. Red-Black trees One of the most famous (and most tricky) strategies for keeping a BST balanced

  8. Red-Black trees In addition to the requirements of binary search trees, red black trees must meet these: A node is either red or black. The root is black. All leaves (null children) are black. Both children of every red node are black. Every simple path from a given node to any of its descendant leaves contains the same number of black nodes.

  9. Red-Black trees In addition to the requirements of binary search trees, red black trees must meet these: A node is either red or black. The root is black. All leaves (null children) are black. Both children of every red node are black. Every simple path from a given node to any of its descendant leaves contains the same number of black nodes. According to these, what is the greatest possible difference between the longest and shortest root-to-leaf paths in the tree? A. All root-to-leaf paths are equal length B. 25% longer C. 50% longer D. 100% longer E. Other/none/more than one

  10. Red-Black trees Every simple path from a given node to any of its descendant leaves contains the same number of black nodes. This is what guarantees close to balance This file is licensed under the Creative Commons Attribution-Share Alike 3.0 Unported license. http://commons.wikimedia.org/wiki/File:Red-black_tree_example.svg

  11. Insert procedure must maintain the invariants (this gets tricky) Video: http://www.youtube.com/watch?v=vDHFF4wjWYU This file is licensed under the Creative Commons Attribution-Share Alike 3.0 Unported license. http://commons.wikimedia.org/wiki/File:Red-black_tree_example.svg

  12. Other BST balance strategies Red-Black tree AVL tree Treap (BST + heap in one tree! What could be cooler than that, amirite? ) Other fun types of BST: Splay tree Rather than only worrying about balance, Splay Tree dynamically readjusts based on how often users search for an item. Commonly-searched items move to the top, saving time B-Tree Like BST, but a node can have many children, not just 2

  13. Hashing Implementing the Map interface with Hashing/Hash Tables

  14. Imagine you want to look up your neighbors names, based on their house number (all on your same street) House numbers: 10565 through 90600 (roughly 1000 houses there are varying gaps in house numbers between houses) Names: one last name per house

  15. Options Index value 0 1 10565 10566 10567 90598 90599 90600 KyungLee IsaiahWhite Solange Clark BST (balanced): Add/remove: O(logn) Find: Linked list: Add/remove: O(n) Find: Array: Add/remove: Find: O(logn) O(n)

  16. Hash Table is just a modified, more flexible array Keys don t have to be integers 0-(size-1) (Ideally) avoids big gaps like our gap from 0 to 10565 in the house numbers and between numbers Hash function is what makes this possible!

  17. Closed Addressing Array index 0 1 2 3 4 5 6 7 8 Value Where does key= Annie value=10 go if hash ( Annie ) = 3? Where does key= Solange value=12 go if hash( Solange ) = 5?

  18. Hash collisions We may need to worry about hash collisions Hash collision: Two keys a, b, a b, have the same hash code index (i.e. hash(a) = hash(b))

  19. Closed Addressing Array index 0 1 2 3 4 5 6 7 8 Value Where does key= Annie value=55 go if hashkey( Annie ) = 3? A. 55 overwrites 10 at 3 B. A link list node is added at 3 C. Other/none/ more than one

  20. Hash key collisions We can NOT overwrite the value the way we would if it really were the same key Need a way of storing multiple values in a given place in the hash table

More Related Content

giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#