Introduction to Data Structures in Computer Science

Slide Note
Embed
Share

Storing and organizing data efficiently is key in computer science. Data structures such as lists, search trees, heaps, hashing, and more play a vital role in this process. Abstract data types like lists and stacks are fundamental concepts that help in managing data effectively.


Uploaded on Sep 22, 2024 | 0 Views


Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. Download presentation by click this link. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

E N D

Presentation Transcript


  1. Data Structures Lectures: Haim Kaplan, Uri Zwick Exam: 80% Theoretical Assingnments: 10% Practical Assignments: 10% Cormen, Leiserson, Rivest and Stein Introduction to Algorithms (Second/Third Editions) (Contains only some of the material)

  2. Data Structures In computer science, a data structure is a particular way of storing and organizing data in a computer so that it can be used efficiently. Lists Search Trees Heaps/Priority Queues Hashing Union-Find Tries and Suffix Trees Sorting and Selection

  3. Data Structures Lecture 1 Abstract Data Types Lists, Stacks, Queues, Deques Arrays, Linked Lists Graphs Haim Kaplan, Uri Zwick March 2018

  4. Lists (Sequences) b [a0 a1 a2 a3 ai 1 aiai+1 an 2 an 1] Retrieve the item at position i Insert b at position i Delete the item at position i When items are inserted or deleted, the indices of some other items change.

  5. Abstract Data Type (ADT) Lists (Sequences) List() Create an empty list Length(L) Return the length of list L Retrieve(L,i) Return the i-th item of L Insert(L,i,b) Insert b as the i-th item of L Delete(L,i) Delete and return the i-th item of L ( Search(L,b) Return the position of b in L, or 1 ) Concat(L1, L2) Concatenate L1 and L2 Plant(L1,i, L2) Insert L2 starting at the i-th position of L1 Split(L,i) Split Linto two lists Interesting special cases: Retrieve-First(L), Insert-First(L,b), Delete-First(L) Retrieve-Last(L), Insert-Last(L,b), Delete-Last(L) 5

  6. Abstract Data Types (ADT) Stacks, Queues, Dequeues Stacks Implement only: Insert-Last(L,b), Retrieve-Last(L), Delete-Last(L) Also known as: Push(L,b), Top(L), Pop(L) Last In First Out (LIFO) Queues Implement only: Insert-Last(L,b), Retrieve-First(L), Delete-First(L) First In First Out (FIFO) Deques (double ended queues) Implement only: Insert-First(L,b), Retrieve-First(L), Delete-First(L) Insert-Last(L,b), Retrieve-Last(L), Delete-Last(L)

  7. Implementing lists using arrays We need to know the maximal length M in advance. n L array maxlen length an 1 a0 a1 * * * M n M Retrieve(L,i) (of any element) takes O(1) time Insert-Last(L,b) and Delete-Last(L) take O(1) time Stack operations in O(1) time

  8. Implementing lists using arrays We need to know the maximal length M in advance. n L array maxlen length an 1 a0 a1 * * * M n M Retrieve(L,i) (of any element) takes O(1) time Insert-Last(L,b) and Delete-Last(L) take O(1) time Insert(L,i,b) and Delete(L,i) take O(n i+1) time

  9. Implementing lists using arrays L n 0 1 n 1 array maxlen length M n+1 n M Delete-Last very similar

  10. Implementing lists using arrays L i n 0 1 n 1 array maxlen length M n+1 n M We need to move n i items, then insert. O(n i+1) time.

  11. Implementing lists using circular arrays Implement queue and deque operations in O(1) time n L 0 1 2 array maxlen length start M n 2 M New field: start

  12. Implementing lists using circular arrays Implement queue and deque operations in O(1) time L 0 1 2 M 4 M 1 array maxlen length start M 7 M 4 M Occupied region can wrap around!

  13. Implementing lists using circular arrays Implement queue and deque operations in O(1) time L 0 1 n n 1 array maxlen length start M n 1 n M 0 1 Code of other operations similar

  14. Arrays vs. Circular Arrays Circular Arrays Arrays Insert/Delete Last Insert/Delete First O(1) O(1) O(n+1) O(1) Insert/Delete(i) O(n i+1) O(min{i+1,n i+1}) Retrieve(i) O(1) O(1) Main advantage: Constant access time. Main disadvantage: Inserting or deleting elements in the middle is expensive. 14

  15. Implementing lists using singly linked lists Lists of unbounded length Support some additional operations L List object first last n length item next a0 a1 a2 an 1 List-Node object

  16. Insert-First with singly linked lists Insert-First(L,b) Generate a new List-Node object B containing b Add B to the list Adjust L.last, if necessary L first last length n+1 n B Increment L.length (Return B) b item a0 next a1 a2 an 1

  17. Insert-First with singly linked lists L first last length n+1 n B b item a0 next a1 a2 an 1 Insert-Last and Delete-First very similar, also O(1) time Unfortunately, Delete-Last requires O(n+1) time

  18. Retrieve with singly linked lists L first last length n item a0 next a1 a2 an 1 Retrieving the i-th item takes O(i+1) time

  19. Insert with singly linked lists L first last length n item a0 next a1 a2 an 1 Inserting an item into position i takes O(i+1) time

  20. Inserting a node (After a given node) Insert-After(A,B) Insert B after A A ai 1 ai B b Assignments order is important

  21. Deleting a node (After a given node, not the last one) Delete-After(A) Delete the node following A A ai 1 ai ai+1 What happens to the node removed?

  22. Concatenating lists Concat(L1,L2) Attach L2 to the end of L1 L1 n n+m a1 an 1 a0 a2 L2 m b1 bm 1 b0 b2 O(1) time! (What happened to L2?)

  23. Circular arrays vs. Linked Lists Circular arrays Linked lists Insert/Delete-First Insert-Last Delete-Last O(1) O(1) O(1) O(n+1) Insert/Delete(i) O(min{i+1,n i+1}) O(i+1) Retrieve(i) O(1) O(i+1) Concat O(min{n1,n2}+1) O(1) With linked lists we can insert or delete elements in the middle in O(1) time 23

  24. More fun with linked lists Circular singly linked lists (Circular) Doubly linked lists Doubly linked lists with a sentinel 24

  25. Circular singly linked lists L If we make the list circular, we don t need first first last length n All capabilities remain the same item a0 next a1 a2 an 1

  26. How do we implement Delete-Last(L) in O(1) time? Doubly linked lists! 26

  27. (Circular) Doubly linked lists L first length n prev item next a0 a2 a1 an 1 Each List-Node now has a prev field All previous benefits + Delete-Last(L) in O(1) time

  28. Inserting a node into a Doubly Linked List Insert-After(A,B) Insert node B after node A A ai+1 ai b B

  29. Deleting a node from a Doubly Linked List Each node now has a prev field Delete-Node(A) Delete node A from its list A ai 1 ai+1 ai Note: A itself not changed! Is that good?

  30. Circular Doubly linked lists with a sentinel L an-1 a0 a1 a2 sentinel Each node has a successor and a predecessor No special treatment of first and last elements No special treatment of empty lists In some case we can identify a list with its sentinel

  31. Circular Doubly linked lists with a sentinel L Empty list

  32. Circular Doubly linked lists with a sentinel L an 1 a1 a2 a3

  33. Abstraction barriers User List Insert, Retrieve, Delete Search List-Node Retrieve-Node Insert-After, Delete-After, Delete-Node 33

  34. Inserting and later deleting Suppose we inserted a into L. At a later stage, we want to delete a from L. With the current interface all we can do is: What we would like to do: O(n) O(n) O(1) Need to implement Search. Would take O(n) time. Did we delete the right a ? 34

  35. Modified ADT for lists The current specification does not allow us to utilize one of the main capabilities of linked lists: Insert-After, Delete-Node, Next in O(1) time We next define a new ADT in which the user is allowed to call List-Node, Insert-After, Delete-Node, Next

  36. Lists A modified abstraction A B [ a0 a1 ai an-1] Lists are now composed of List-Nodes List-Node(b) Create a List-Node containing item b Item(B) Return the item contained in List-Node B List() Create an empty list Length(L) Return the length of list L Insert(L,i,B) Insert B as the i-th List-Node of L Retrieve(L,i) Return the i-th List-Node of L Delete(L,i) Delete and return the i-th List-Node of L Concat(L1, L2) Concatenate L1 and L2

  37. Lists A modified abstraction A B [ a0 a1 ai an-1] We can now allow the following additional operations: Next(A) Return the List-Node following A (assuming there is one) Insert-After(A,B) Insert B after A Delete-Node(A) Delete A from its current list These operations assume that A is contained in some list, while B is not contained in any list Note: L is not an argument of these operations!

  38. Lists A modified abstraction The user explicitly manipulates List-Nodes The actual implementation using linked lists remains essentially the same The length field is removed from the implementation as it is hard to keep it up to date Due to concatenations, hard to keep track of which list contains a given List-Node

  39. Traversing a list With previous abstraction: for i 0 to Length(L) 1 do { a Retrieve(L,i) process(a) } With the new abstraction: A Retrieve(L,0) while (A null) { a Item(A) process(a) A Next(A) } 39

  40. Pitfalls of the modified ADT A L1 an-1 a0 a1 a2 B L2 bm-1 b0 b1 b2 L2is not a valid list now Insert-After(A,B) Should call Delete-Node(B) before Insert-After(A,B)

  41. Which-List? Concatenations move List-Nodes from lists to lists Which-List(A) return the list currently containing A Na ve implementation: scan the list from A until getting to the sentinel Much more efficient implementation possible using a Union-Find data structure.

  42. Array-based implementation of the List/List-Node abstraction Can we implement the List/List-Node abstraction such that Retrieve will take O(1) time? L 0 1 2 n i array maxlen length start M n 2 M item list List-Node Array cells contain pointers to List-Nodes. b A list node contains a pointer to the list and its position in the list. i pos (Why is the list pointer needed?) ( How are different operations implemented?)

  43. Implementation of lists Doubly Linked lists Circular arrays Balanced Trees Insert/Delete-First Insert/Delete-Last Insert/Delete(i) O(1) O(1) O(log n) O(i+1) O(i+1) O(log n) Retrieve(i) O(1) O(i+1) O(log n) Concat O(n+1) O(1) O(log n) (Doubly) linked lists also support Insert-After and Delete- Node in O(1) time. This is used by later data structures.

  44. Representing digraphs (directed graphs) Vertex Vertex B A B name info out A blabla C D Edge from to info F E

  45. Adjacency lists representation of digraphs Graph B vertices A List of vertices , , ] , [ A B C C D out Do we need the from field? List of outgoing edges of A F , , , ] [ E List of outgoing edges of B Keep a list of vertices. , , ] , [ For each vertex keep a list of its outgoing edges. How are the lists implemented?

  46. Should List-Nodes contain items, or point to items? Suppose we use linked lists to represent adjacency lists. Option 1: List-Nodes point to Edges ek 1 e0 e1 e2 sentinel Edge Disadvantage: Another level of indirection from to info Advantage: An item may belong to several lists 46 Are the dotted pointers useful?

  47. Should List-Nodes contain items, or point to items? Suppose we use linked lists to represent adjacency lists. Option 2: List-Nodes contain Edges sentinel prev next from to info Edge Edge(List-Node) List-Node<Edge> Less pointers Faster and more compact An item can be in only one list 47

  48. Adjacency lists representation of digraphs B 1 Using a linked-list representation of adjacency lists we can in O(1) time: 1 2 A Move from an edge to the edge following it or preceding it in the corresponding adjacency list. 1 2 C D 1 Delete or insert edges before or after a given edge. 2 F 1 2 3 E We cannot efficiently traverse incoming edges of a vertex. Adjacency lists define an (arbitrary) order on the outgoing edges of each vertex.

  49. Incoming and outgoing adjacency lists For each vertex keep both an incoming and an outgoing adjacency lists B Each Edge now contained in two Lists Each Edge points to the two List-Nodes containing it A C D Edge Vertex A blabla from to info in-node out-node F name info in out E

  50. Incoming and outgoing adjacency lists Option 1: List-Nodes point to Edges ek 1 e0 e1 e2 sentinel Edge Outgoing edges of A from to info in-node out-node A B Incoming edges of B fm 1 f0 f1 f2 sentinel

Related