Understanding B+ Trees in Data Abstractions

cse 332 data abstractions b trees n.w
1 / 27
Embed
Share

This content discusses B+ trees in data abstractions, covering topics such as tree structures, search trees, disk friendliness, and the comparison with AVL trees. It explains the properties and characteristics of B+ trees, including internal and leaf nodes, order properties, tree balancing, and disk-friendly features. The information provided gives insights into the design, structure, and operations of B+ trees in managing data efficiently.

  • Data Abstractions
  • B+ Trees
  • Disk Friendliness
  • AVL Trees
  • Search Trees

Uploaded on | 0 Views


Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

You are allowed to download the files provided on this website for personal or commercial use, subject to the condition that they are used lawfully. All files are the property of their respective owners.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.

E N D

Presentation Transcript


  1. CSE 332 Data Abstractions B-Trees Richard Anderson Spring 2016

  2. Announcements Next two weeks: Hashing and sorting Upcoming dates Friday, April 29. Midterm 2

  3. Cycles to access: CPU 1 Registers 2 L1 Cache 30 L2 Cache 250 Main memory Disk Random: 30,000,000 Streamed: 5000 3

  4. M-ary Search Tree Consider a search tree with branching factor M: Complete tree has height: # hops for find: Runtime of find: 4

  5. B+ Trees (book calls these B-trees) Each internal node has (up to) M-1 keys: Order property: subtree between two keys x and y contain leaves with valuesv such that x v < y Note the Leaf nodes have up to L sorted keys. 3 7 12 21 x<3 7 x<12 12 x<21 21 x 3 x<7 5

  6. B+ Tree Structure Properties Internal nodes store up to M-1 keys have between M/2 and M children Leaf nodes where data is stored all at the same depth contain between L/2 and L data items Root (special case) has between 2 and M children (or root could be a leaf) 6

  7. B+ Tree: Example B+ Tree with M = 4 (# pointers in internal node) and L = 5 (# data items in leaf) 12 44 Data objects which I ll ignore in slides 6 20 27 34 50 1, AB.. 2, GH.. 6 8 9 10 12 14 16 17 19 20 22 24 27 28 32 34 38 39 41 44 47 49 50 60 70 All leaves at the same depth 4, XY.. 7 Definition for later: neighbor is the next sibling to the left or right.

  8. Disk Friendliness What makes B+ trees disk-friendly? 1.Many keys stored in a node All brought to memory/cache in one disk access. 2.Internal nodes contain only keys; Only leaf nodes contain keys and actual data Much of tree structure can be loaded into memory irrespective of data object size Data actually resides in disk 8

  9. B+ trees vs. AVL trees Suppose again we have n = 230 109 items: Depth of AVL Tree Depth of B+ Tree with M = 256, L = 256 Great, but how to we actually make a B+ tree and keep it balanced ? 12 9

  10. Building a B+ Tree with Insertions Insert(3) Insert(18) Insert(14) The empty B-Tree M = 3 L = 3 10

  11. 3 3 Insert(30) 14 14 18 18 11 M = 3 L = 3

  12. 18 18 18 3 18 3 18 3 18 Insert(32) Insert(36) 14 30 14 30 14 30 18 Insert(15) 3 18 14 30 12 M = 3 L = 3

  13. 18 32 18 32 3 18 32 3 18 32 Insert(16) 14 30 36 14 30 36 15 15 18 32 18 32 30 36 13 M = 3 L = 3

  14. Insert(12,40,45,38) 18 18 15 32 15 32 40 3 18 32 40 3 18 32 15 15 14 16 30 36 45 12 16 30 36 14 38 14 M = 3 L = 3

  15. Insertion Algorithm 1. Insert the key in its leaf in sorted order 2. If the leaf ends up with L+1 items, overflow! Split the leaf into two nodes: original with (L+1)/2 smaller keys new one with (L+1)/2 larger keys Add the new child to the parent If the parent ends up with M+1 children, overflow! 3. If an internal node ends up with M+1 children, overflow! Split the node into two nodes: original with (M+1)/2 children with smaller keys new one with (M+1)/2 children with larger keys Add the new child to the parent If the parent ends up with M+1 items, overflow! 4. Split an overflowed root in two and hang the new nodes under a new root 5. Propagate keys up tree. This makes the tree deeper! 15

  16. And Now for Deletion Delete(32) 18 18 15 40 15 32 40 40 40 3 18 3 18 32 15 15 45 45 12 16 30 12 16 30 36 14 14 38 16 M = 3 L = 3

  17. Delete(15) 18 18 15 36 40 36 40 40 3 18 36 40 15 3 18 36 45 12 16 30 38 45 12 16 30 38 14 17 M = 3 L = 3

  18. Delete(16) 18 18 14 36 40 36 40 40 3 18 36 14 40 18 36 45 12 16 30 38 45 30 38 18 M = 3 L = 3

  19. Delete(16) 18 36 40 40 3 18 36 3 45 12 30 38 12 14 14 19 M = 3 L = 3

  20. Delete(14) 36 36 18 40 18 40 40 3 18 36 40 3 18 36 45 12 30 38 45 12 30 38 14 20 M = 3 L = 3

  21. Delete(18) 36 36 18 40 40 40 40 3 18 36 3 36 45 45 12 30 38 12 38 21 M = 3 L = 3

  22. 36 40 40 3 36 45 12 38 30 22 M = 3 L = 3

  23. Deletion Algorithm 1.Remove the key from its leaf 2. If the leaf ends up with fewer than L/2 items, underflow! Adopt data from a neighbor; update the parent If adopting won t work, delete node and merge with neighbor If the parent ends up with fewer than M/2 children, underflow! 23

  24. Deletion Slide Two 3. If an internal node ends up with fewer than M/2 children, underflow! Adopt from a neighbor; update the parent If adoption won t work, merge with neighbor If the parent ends up with fewer than M/2 children, underflow! 4. If the root ends up with only one child, make the child the new root of the tree This reduces the height of the tree! 5. Propagate keys up through tree. 24

  25. Thinking about B+ Trees B+ Tree insertion can cause (expensive) splitting and propagation up the tree B+ Tree deletion can cause (cheap) adoption or (expensive) merging and propagation up the tree Split/merge/propagation is rare if M and L are large (Why?) Pick branching factor M and data items/leaf L such that each node takes one full page/block of memory/disk. 25

  26. Complexity Find: Insert: find: Insert in leaf: split/propagate up: Claim: O(M) costs are negligible 26

  27. Tree Names You Might Encounter B-Trees More general form of B+ trees, allows data at internal nodes too Range of children is (key1,key2) rather than [key1, key2) B-Trees with M = 3, L = x are called 2-3 trees Internal nodes can have 2 or 3 children B-Trees withM = 4,L = x are called 2-3-4 trees Internal nodes can have 2, 3, or 4 children 27

Related


More Related Content