Introduction to Greedy Algorithms in Problem-Solving
Greedy algorithms make locally optimal choices at each stage in order to approximate globally optimal solutions efficiently. This presentation covers Huffman coding trees, the shortest path problem (Dijkstra's algorithm), and minimum cost spanning trees (Kruskal's algorithm). It shows how Huffman coding trades running time for reduced space requirements, how variable-length codes improve efficiency by assigning shorter codes to more frequent characters, and how Huffman coding trees are built so that code length depends on character frequency and depth.
Presentation Transcript
GREEDY ALGORITHMS DR. MARAM BANIYOUNES
CONTENTS
- Greedy Algorithm
- Huffman Coding Tree
- Shortest Path Problem (Dijkstra's Algorithm)
- Minimum Cost Spanning Tree (Kruskal's Algorithm)
GREEDY ALGORITHM A greedy algorithm is any algorithm that follows the problem-solving heuristic of making the locally optimal choice at each stage. In many problems, a greedy strategy does not produce an optimal solution, but a greedy heuristic can yield locally optimal solutions that approximate a globally optimal solution in a reasonable amount of time.
HUFFMAN CODING TREE The space/time tradeoff suggests that one can often gain an improvement in space requirements in exchange for a penalty in running time. A typical example is storing files on disk. If the files are not actively used, the owner may wish to compress them to save space; they can then be uncompressed for use, which costs some time, but only once.
We can represent a set of items in a computer program by assigning a unique code to each item. The ASCII coding scheme assigns a unique 8-bit value to each character. It takes lg 128 = 7 bits to provide 128 unique codes for the 128 symbols of the ASCII character set. The eighth bit is used either to check for transmission errors, or to support extended ASCII codes with an additional 128 characters.
These requirements assume that all codes have the same length (a fixed-length coding scheme); in general, lg n bits are needed to represent n unique codes. If all characters were used equally often, a fixed-length coding scheme would be the most space-efficient method. But not all characters are used equally often. It is possible to store the more frequent letters with shorter codes, at the cost of longer codes for the other characters. Huffman coding is an approach for assigning such variable-length codes.
BUILDING HUFFMAN CODING TREES A Huffman coding tree assigns codes to characters such that the length of each code depends on the relative frequency, or weight, of the corresponding character; it is a variable-length code. The Huffman code for each letter is derived from a full binary tree called the Huffman tree, in which each leaf corresponds to a letter. The goal is to build a tree with minimum external path weight, where the weighted path length of a leaf is its weight times its depth. A letter with high weight should therefore have low depth.
PROCESS OF BUILDING THE HUFFMAN TREE First, order the letters in a list by ascending weight (frequency). Remove the first two letters (the ones with lowest weight) from the list and assign them to leaves in what will become the Huffman tree. Make these leaves the children of an internal node whose weight is the sum of the weights of the two children. Insert this sum back into the list at the position needed to preserve the list's order. Repeat until only one item remains on the list.
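The steps above can be sketched in Python using a min-heap in place of the sorted list (an implementation choice, not part of the slides; the letter weights are taken from the example that follows):

```python
import heapq
import itertools

# Letter weights from the example below.
weights = {'Z': 2, 'K': 7, 'F': 24, 'C': 32, 'U': 37, 'D': 42, 'L': 42, 'E': 120}

def build_huffman_tree(weights):
    """Repeatedly join the two lowest-weight subtrees until one remains."""
    counter = itertools.count()          # tie-breaker so heap entries always compare
    # Each heap entry: (weight, tie_breaker, tree); a leaf is its letter,
    # an internal node is a (left_subtree, right_subtree) pair.
    heap = [(w, next(counter), letter) for letter, w in weights.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, t1 = heapq.heappop(heap)  # the two subtrees of least weight
        w2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next(counter), (t1, t2)))
    return heap[0][2]                    # root of the completed Huffman tree

tree = build_huffman_tree(weights)
```

Because E has by far the highest weight (120), it ends up as a direct child of the root, i.e. at depth 1.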
EXAMPLE This process builds a full Huffman tree for the following letters and weights: Z = 2, K = 7, F = 24, C = 32, U = 37, D = 42, L = 42, E = 120.
Note This is an example of a greedy algorithm, because at each step the two subtrees with least weight are joined together.
ASSIGNING HUFFMAN CODES After the Huffman tree is constructed, we assign codes to individual letters. Starting at the root, we assign either a 0 or a 1 to each edge in the tree: 0 to edges connecting a node with its left child, and 1 to edges connecting it with its right child. The Huffman code for a letter is then simply the binary number determined by the path from the root to the leaf corresponding to that letter.
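A minimal sketch of this code assignment, assuming the tree is stored as nested pairs with letters at the leaves (the three-letter tree here is a small hypothetical example, not the full tree from the slides):

```python
# A small Huffman tree as nested pairs: left child first, right child second.
# Hypothetical shape for letters E, U, D, with E the most frequent and shallowest.
tree = ('E', ('U', 'D'))

def assign_codes(tree, prefix=''):
    """Derive each letter's code from its root-to-leaf path (0 = left, 1 = right)."""
    if isinstance(tree, str):            # leaf: the path followed so far is the code
        return {tree: prefix}
    left, right = tree
    codes = assign_codes(left, prefix + '0')
    codes.update(assign_codes(right, prefix + '1'))
    return codes

codes = assign_codes(tree)   # → {'E': '0', 'U': '10', 'D': '11'}
```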
Letter     C     D    E  F      K       L    U    Z
Frequency  32    42   120 24    7       42   37   2
Code       1110  101  0  11111  111101  110  100  111100
Bits       4     3    1  5      6       3    3    6
DECODING THE MESSAGE Decoding the message is done by looking at the bits in the coded string from left to right until a letter is decoded. This can be done using the Huffman tree in a process that reverses the one used to generate the codes: start from the root and take branches depending on the bit value (0 = left, 1 = right) until reaching a leaf node. A set of codes is said to meet the prefix property if no code in the set is a prefix of another. The prefix property guarantees that there is no ambiguity in how a bit string is decoded: once we reach the last bit of a code during the decoding process, we know which letter it is the code for.
Huffman codes have the prefix property, since any prefix of a code would correspond to an internal node, while all codes correspond to leaf nodes. The average expected cost per character is the sum, over all characters, of the cost of each character times the probability of its occurrence:

C1*P1 + C2*P2 + ... + Cn*Pn

or, expressed with frequencies,

(C1*F1 + C2*F2 + ... + Cn*Fn) / FT

where Ci is the length of the code for the i-th character, Pi its probability of occurrence, Fi its frequency, and FT the total of all frequencies. The expected cost per letter for the tree above is 785/306 ≈ 2.57 bits, while a fixed-length code would require lg 8 = 3 bits per letter. Huffman coding is thus expected to save about 14% for this set of letters.
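The expected cost can be checked directly from the frequency and code-length table, as a quick sanity computation:

```python
# Letter frequencies and Huffman code lengths from the table above.
freq = {'C': 32, 'D': 42, 'E': 120, 'F': 24, 'K': 7, 'L': 42, 'U': 37, 'Z': 2}
bits = {'C': 4,  'D': 3,  'E': 1,   'F': 5,  'K': 6, 'L': 3,  'U': 3,  'Z': 6}

total = sum(freq.values())                               # F_T = 306
weighted = sum(bits[c] * freq[c] for c in freq)          # sum of C_i * F_i = 785
expected = weighted / total                              # expected bits per letter
print(round(expected, 2))                                # → 2.57
```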
IMPLEMENTATION The operations required for Huffman's encoding are: insertions into a data structure; deletions of the two characters with minimal frequency; and building the tree. A heap is a good data structure for the first two operations, each of which requires O(lg n) steps in the worst case.
COMPLEXITY Building the tree takes constant time per node. Insertions and deletions take O(lg n) steps each. Overall, the running time of the algorithm is O(n lg n).
SHORTEST PATH PROBLEM (DIJKSTRA'S ALGORITHM) Use an adjacency matrix representation, in which the edge lengths are the costs (distances, times, ...) associated with the edges. We then initialize an array called Dist to equal the first row of the edge matrix.
Iter     S                W    Dist(2)  Dist(3)  Dist(4)  Dist(5)  Dist(6)
initial  {1}              -    30*      70       50       40       100
1        {1,2}            2    30       70       50       40*      100
2        {1,2,5}          5    30       60       50*      40       100
3        {1,2,5,4}        4    30       60*      50       40       100
4        {1,2,5,4,3}      3    30       60       50       40       90*
5        {1,2,5,4,3,6}    6    30       60       50       40       90

(* marks the minimum tentative distance in that row, i.e. the vertex W chosen at the next step.)
This algorithm is called a greedy algorithm because at each stage it simply does what is locally optimal. If the graph is undirected, we can think of it as a directed graph in which each undirected edge corresponds to two directed edges in opposite directions with the same length. Computing the cost of the shortest path from V0 to each vertex requires O(n^2) time.
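The trace above can be reproduced with a short Python sketch. The original graph is not shown on the slides, so the edge weights here are assumptions chosen to be consistent with the final distances in the trace; the sketch also uses a binary heap rather than the slides' adjacency-matrix scan, which changes the bookkeeping but not the greedy selection rule:

```python
import heapq

# Assumed edge weights, consistent with the final distances in the trace above.
graph = {
    1: {2: 30, 3: 70, 4: 50, 5: 40, 6: 100},
    2: {},
    3: {6: 30},
    4: {},
    5: {3: 20},
    6: {},
}

def dijkstra(graph, source):
    """Shortest-path costs from source, always settling the closest vertex next."""
    dist = {v: float('inf') for v in graph}
    dist[source] = 0
    heap = [(0, source)]
    settled = set()                      # the set S of settled vertices
    while heap:
        d, u = heapq.heappop(heap)
        if u in settled:
            continue                     # skip stale heap entries
        settled.add(u)                   # greedy step: u has least tentative cost
        for v, w in graph[u].items():
            if d + w < dist[v]:          # relax edge (u, v)
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
    return dist

dijkstra(graph, 1)   # → {1: 0, 2: 30, 3: 60, 4: 50, 5: 40, 6: 90}
```

With a heap the running time is O(E lg V), versus the O(n^2) of the matrix-scan version described on the slide; for dense graphs the matrix version can still be preferable.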
MINIMUM COST SPANNING TREE (KRUSKAL'S ALGORITHM) One approach to determining a minimum-cost spanning tree of a graph is given by Kruskal. We partition the set of vertices into |V| equivalence classes, each consisting of one vertex, and then process the edges in order of weight. Edges can be processed in order of weight by using a min-heap, which is faster than sorting the edges first.
Suppose the edges have been sorted into the following order:

Edge    Cost
(1, 2)  1
(1, 3)  1
(2, 3)  1
(6, 7)  1
(3, 4)  2
(5, 6)  2
(5, 7)  2
(1, 4)  3
(3, 5)  4
The efficiency of the algorithm depends upon the implementations chosen for the priority queue (heap) and for Union-Find. A priority queue implemented by a heap requires O(lg n) time for enqueue (insertion) and dequeue (deletion) operations. A Union-Find structure implemented by weight-balanced trees also yields O(lg n) time per operation.
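A minimal sketch of Kruskal's algorithm on the sorted edge list above, using a simple Union-Find with path halving (an implementation choice; the slides suggest a min-heap instead of sorting, but both give the same edge-processing order and the same asymptotic cost):

```python
# Edges from the sorted list above, as (cost, u, v) triples.
edges = [(1, 1, 2), (1, 1, 3), (1, 2, 3), (1, 6, 7),
         (2, 3, 4), (2, 5, 6), (2, 5, 7), (3, 1, 4), (4, 3, 5)]

parent = {v: v for v in range(1, 8)}     # each vertex starts in its own class

def find(v):
    """Find the representative of v's equivalence class, halving the path."""
    while parent[v] != v:
        parent[v] = parent[parent[v]]
        v = parent[v]
    return v

def kruskal(edges, n):
    """Accept each cheapest edge that joins two different components."""
    mst, total = [], 0
    for cost, u, v in sorted(edges):     # process edges in order of weight
        ru, rv = find(u), find(v)
        if ru != rv:                     # edge merges two components: keep it
            parent[ru] = rv
            mst.append((u, v))
            total += cost
            if len(mst) == n - 1:        # a spanning tree has n - 1 edges
                break
    return mst, total

tree, cost = kruskal(edges, 7)           # → 6 edges, total cost 11
```

For this edge list the algorithm skips (2, 3), (5, 7), and (1, 4), since each would close a cycle, and accepts the remaining six edges for a total cost of 11.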