Understanding Laundry Pipelining and Digital Circuits


Explore the concept of pipelining through a laundry example and digital circuits. Learn how pipelining increases throughput (more results per unit of time) at the expense of latency and area, and how to calculate the minimum clock period, latency, and maximum throughput of non-pipelined and pipelined circuits.


Uploaded on Oct 11, 2024



Presentation Transcript


  1. ECE 352 Digital System Fundamentals: Pipelining

  2. Laundry Pipelining Example
     You could wait for a load to completely finish before you start the next.
     Four loads in 8 time units.
     Timeline (time units 0 to 8): Green done after 2, Purple done after 4, Orange done after 6, Blue done after 8.

  3. Laundry Pipelining Example
     2-stage pipeline: start a new washer load at the same time you put the first load in the dryer.
     Work on two loads at once, at different stages of completion.
     Four loads in 5 time units.
     Timeline (time units 0 to 5): Green done after 2, Purple done after 3, Orange done after 4, Blue done after 5.

  4. Laundry Pipelining Example
     What does this mean?
     If you wash your favorite shirt first, it is not done any sooner.
     But you get approximately twice the amount of laundry done in the same time (if doing many loads).
     Four loads in 5 time units.
     Timeline (time units 0 to 5): Green done after 2, Purple done after 3, Orange done after 4, Blue done after 5.
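The two laundry schedules can be reproduced with a short calculation. This is a minimal sketch, assuming each load spends one time unit in the washer and then one time unit in the dryer (which matches the first load finishing after 2 time units); the function names are illustrative and do not come from the slides.

```python
def unpipelined_finish_times(num_loads, stage_time=1, num_stages=2):
    """Each load runs through every stage before the next load starts."""
    load_time = stage_time * num_stages
    return [load_time * (i + 1) for i in range(num_loads)]


def pipelined_finish_times(num_loads, stage_time=1, num_stages=2):
    """A new load enters the first stage every stage_time units."""
    first_done = stage_time * num_stages
    return [first_done + stage_time * i for i in range(num_loads)]


loads = ["Green", "Purple", "Orange", "Blue"]
for name, a, b in zip(loads, unpipelined_finish_times(4), pipelined_finish_times(4)):
    print(f"{name}: unpipelined done after {a}, pipelined done after {b}")
```

The first load finishes after 2 time units in both schedules; only the later loads finish earlier when pipelined.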

  5. Terminology
     Latency: the length of time it takes for a value ready at the input to propagate to a result at the output.
     Throughput: the rate at which results are produced.
     Laundry example:
       Unpipelined: latency = 2 time units; throughput = 1 result per 2 time units.
       Pipelined: latency = 2 time units; throughput = 1 result per time unit.
     Note: throughput ignores startup latency (2 time units until the first result is produced).
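The note about startup latency can be made concrete by comparing the average rate over a finite run with the steady-state rate. A minimal sketch, assuming the 2-stage laundry pipeline with one time unit per stage; `effective_throughput` is an illustrative helper, not something defined in the slides.

```python
from fractions import Fraction


def effective_throughput(num_results, num_stages=2, stage_time=1):
    """Average results per time unit, including the pipeline fill time."""
    total_time = stage_time * (num_stages + num_results - 1)
    return Fraction(num_results, total_time)


for n in (1, 4, 100):
    rate = effective_throughput(n)
    print(f"{n} loads: {rate} results per time unit (about {float(rate):.2f})")
```

As the number of loads grows, the average approaches the steady-state figure of 1 result per time unit.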

  6. Pipelining Digital Circuits
     The original circuit has lower throughput than desired. What can we do to increase it?
     Increase throughput by still producing one result per clock cycle, but with a shorter tmin.
     How can we decrease tmin and still accomplish the same amount of work?
     [Figure: Registers -> Logic -> Registers]

  7. Pipelining Digital Circuits
     Insert registers to subdivide long-latency combinational blocks into two (or more) stages.
     The new circuit has a shorter critical path.
     The clock rate of the modified circuit is limited by the longest combinational path between registers, so we want to subdivide as evenly as possible (balance the stages).
     [Figure: original circuit (Registers -> Logic -> Registers) and pipelined circuit (Registers -> Stage 1 Logic -> Registers -> Stage 2 Logic -> Registers)]
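To see why balanced stages matter, compare two ways of splitting the same combinational logic. This is a sketch with made-up numbers: the 20 ns of logic and the register delays (tpd = 4 ns, ts = 1 ns) are assumptions for illustration, and `min_clock_period` is a hypothetical helper.

```python
def min_clock_period(stage_delays_ns, t_pd_ns=4.0, t_s_ns=1.0):
    """Clock period is set by the slowest stage plus register overhead."""
    return t_pd_ns + max(stage_delays_ns) + t_s_ns


balanced = min_clock_period([10.0, 10.0])    # 20 ns of logic split evenly
unbalanced = min_clock_period([5.0, 15.0])   # same 20 ns split unevenly

print(f"balanced:   tmin = {balanced} ns")
print(f"unbalanced: tmin = {unbalanced} ns")
```

Both splits contain the same total logic, but the uneven split is limited by its 15 ns stage and so needs the longer clock period.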

  8. Pipelining Effects
     ORIGINAL (Reg -> Logic -> Reg): produces 1 result per cycle.
     PIPELINED (Reg -> Logic -> Reg -> Logic -> Reg) with 2 evenly balanced stages: still produces 1 result per cycle.
       tcomb,pipe = tcomb,orig / 2
       tmin,pipe = tpd + tcomb,pipe + ts = tpd + (tcomb,orig / 2) + ts
       tmin,pipe > tmin,orig / 2
     The pipelined circuit can be clocked faster, but not 2× faster!
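The inequality holds because the register overhead is not divided when the logic is split. A short derivation, assuming the original minimum period has the same form as the pipelined one, tmin,orig = tpd + tcomb,orig + ts (the slide does not write this out explicitly):

```latex
\begin{align*}
t_{min,orig} &= t_{pd} + t_{comb,orig} + t_{s} \\
\frac{t_{min,orig}}{2} &= \frac{t_{pd} + t_{s}}{2} + \frac{t_{comb,orig}}{2} \\
t_{min,pipe} &= t_{pd} + \frac{t_{comb,orig}}{2} + t_{s} \\
t_{min,pipe} - \frac{t_{min,orig}}{2} &= \frac{t_{pd} + t_{s}}{2} > 0
\end{align*}
```

The gap is half the flip-flop overhead, so it is positive for any real register.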

  9. Pipelining Effects
     Throughput is increased!
       Produce a result once per cycle.
       fmax,pipe is higher than fmax,orig.
     Latency is increased!
       N stages, so latency is N cycles.
       tmin,pipe is more than tmin,orig / N.
     Pipelining is only useful if we can take advantage of the throughput increase and can tolerate the latency increase.
       Need to be processing a sequence of data.
       Diminishing returns as pipeline depth (N) increases.
     [Figure: ORIGINAL (Reg -> Logic -> Reg) and PIPELINED (Reg -> Logic -> Reg -> Logic -> Reg)]
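The diminishing returns can be tabulated directly. A minimal sketch, assuming the logic divides into N perfectly balanced stages and using illustrative values (22 ns of combinational logic, tpd = 4 ns, ts = 1 ns); `pipeline_metrics` is a hypothetical helper.

```python
def pipeline_metrics(n_stages, t_comb_ns=22.0, t_pd_ns=4.0, t_s_ns=1.0):
    """Return (tmin in ns, latency in ns, fmax in MHz) for N balanced stages."""
    t_min = t_pd_ns + t_comb_ns / n_stages + t_s_ns
    latency = n_stages * t_min
    f_max_mhz = 1000.0 / t_min
    return t_min, latency, f_max_mhz


print(" N  tmin(ns)  latency(ns)  fmax(MHz)")
for n in (1, 2, 4, 8):
    t_min, latency, f_max = pipeline_metrics(n)
    print(f"{n:2d}  {t_min:8.2f}  {latency:11.2f}  {f_max:9.1f}")
```

Under these assumptions, going from 1 to 2 stages shortens tmin by 11 ns, while going from 4 to 8 stages saves only 2.75 ns, and latency grows at every step.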

  10. Add Four Values: Non-Pipelined
     Circuit: A + B and C + D are each computed by an adder with delay tADD1; the two sums feed a final adder with delay tADD2 that produces Y.
     For these delay values: ts = 1 ns, tpd = 4 ns, tADD1 = 10 ns, tADD2 = 12 ns.
     Calculate tmin: all paths in this circuit have the same delay.
       tmin = tpd + tADD1 + tADD2 + ts = 4 + 10 + 12 + 1 = 27 ns
     Calculate latency: the time it takes for input values that are ready in their registers to propagate to output Y.
       min latency = 1 cycle = 1 × tmin = 27 ns
     Throughput = 1 result per cycle; max = 1 result / 27 ns = 37 M results / s.
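The same arithmetic can be checked in a few lines. A minimal sketch using the delay values from the slide, reading ts as the flip-flop setup time and tpd as its propagation delay; the variable names are illustrative.

```python
# Delay values from the slide, in nanoseconds.
T_S = 1.0      # flip-flop setup time
T_PD = 4.0     # flip-flop propagation delay
T_ADD1 = 10.0  # first-level adders (A + B and C + D)
T_ADD2 = 12.0  # final adder producing Y

# Register-to-register path: register delay, both adder levels, then setup.
t_min_ns = T_PD + T_ADD1 + T_ADD2 + T_S
latency_ns = 1 * t_min_ns                       # one cycle from input to output
throughput_mresults_per_s = 1000.0 / t_min_ns   # 1 result per cycle

print(f"tmin = {t_min_ns} ns")
print(f"latency = {latency_ns} ns")
print(f"max throughput = {throughput_mresults_per_s:.0f} M results/s")
```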

  11. Add Four Values: Pipelined
     Circuit: the same adders, now split into two pipeline stages (the tADD1 adders, then the tADD2 adder).
     For these delay values: ts = 1 ns, tpd = 4 ns, tADD1 = 10 ns, tADD2 = 12 ns.
     Calculate tmin: based on the longest path (tCOMB is the longest of the combinational paths between registers).
       tmin = tpd + max(tADD1, tADD2) + ts = 4 + max(10, 12) + 1 = 17 ns
     Calculate latency: same idea, but remember that the length of each pipeline stage is dictated by the same clock!
       min latency = 2 cycles = 2 × tmin = 34 ns
     Throughput = 1 result per cycle; max = 1 result / 17 ns = 59 M results / s.
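A cycle-by-cycle model makes the 2-cycle latency and the 1-result-per-cycle throughput visible. This is a behavioral sketch only: it assumes the pipeline registers sit between the tADD1 adders and the tADD2 adder, treats each loop iteration as one clock edge, and ignores the actual gate delays. The function name and the input values are illustrative.

```python
def simulate_pipelined_adder(input_stream):
    """Return the output register value after each clock edge.

    input_stream: list of (a, b, c, d) tuples, one held in the input
    registers per cycle. Registers start out empty (None).
    """
    stage1 = None  # pipeline registers holding (a + b, c + d)
    y = None       # output register
    outputs = []
    # One extra cycle with no new input lets the last result reach the output.
    for inputs in list(input_stream) + [None]:
        # On a clock edge every register samples its input at once, so
        # compute the next values from the current state before updating.
        next_y = None if stage1 is None else stage1[0] + stage1[1]
        next_stage1 = (
            None if inputs is None
            else (inputs[0] + inputs[1], inputs[2] + inputs[3])
        )
        y, stage1 = next_y, next_stage1
        outputs.append(y)
    return outputs


results = simulate_pipelined_adder([(1, 2, 3, 4), (10, 20, 30, 40), (5, 5, 5, 5)])
print(results)
```

The first result appears after the second clock edge (2 cycles of latency), and a new result follows on every edge after that.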

  12. Comparison
     For these delay values: ts = 1 ns, tpd = 4 ns, tADD1 = 10 ns, tADD2 = 12 ns.
     Non-Pipelined:
       tmin: 27 ns
       Max throughput: 1 result / cycle = 37 M results / s
       Minimum latency: 1 cycle = 1 × tmin = 27 ns
     Pipelined:
       tmin: 17 ns
       Max throughput: 1 result / cycle = 59 M results / s
       Minimum latency: 2 cycles = 2 × tmin = 34 ns
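Whether the pipelined version comes out ahead depends on how many input sets are processed back to back. A rough sketch using the tmin values from the comparison; it assumes a new input set is available every cycle and counts time from when the first inputs are ready in their registers. `total_time_ns` is a hypothetical helper.

```python
def total_time_ns(num_inputs, t_min_ns, num_stages):
    """Time until the last result is captured in the output register."""
    cycles = num_stages + (num_inputs - 1)
    return cycles * t_min_ns


for k in (1, 2, 10, 1000):
    plain = total_time_ns(k, t_min_ns=27, num_stages=1)
    piped = total_time_ns(k, t_min_ns=17, num_stages=2)
    print(f"{k:5d} input sets: non-pipelined {plain} ns, pipelined {piped} ns")
```

For a single input set the non-pipelined circuit finishes first (27 ns versus 34 ns). From two input sets onward the pipelined circuit is ahead, and for long streams the ratio approaches 27/17, about 1.6.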

  13. Pipelining Summary
     Pipelining is a technique that can increase frequency and throughput at the expense of latency and area.
     If adding pipeline stages, we need to evaluate:
       Is latency, throughput, or area most important for how that particular circuit will be used?
       Where should pipeline registers be added?
         Clock speed depends on the longest path.
         Limited by flip-flop ts and tpd (diminishing returns).
         There are tricks we can use to mitigate this, but they are beyond the scope of the class.

  14. ECE 352 Digital System Fundamentals: Pipelining
