Simplifying Post-Silicon Timing Diagnosis with NGSPA Tool
Explore how the NGSPA tool by Intel simplifies post-silicon timing diagnosis, replacing costly machines with a CAD application running on an x86 server, saving resources and enabling parallelized CAD operations. Learn about the challenges in post-silicon speed debugging, the importance of static timing analysis, and the process of isolating speed paths for efficient debugging. Discover how NGSPA revolutionizes speed path isolation, making the process faster, more cost-effective, and deterministic.
Presentation Transcript
Post-silicon Timing Diagnosis Made Simple using Formal Technology
Daher Kaiss, Jonathan Kalechstain
Formal Engines and Technologies Team, Core CAD Technologies
Intel Corp. - Haifa
Agenda
- Motivation
- Speed path debug at Intel
- Introducing our tool: NGSPA (Next Generation Speed Path Analyzer)
- Results
- Challenges and next steps
Static Timing Analysis
- An important pre-silicon design activity
- Pros: computes the expected timing of a digital circuit without requiring simulation
- Cons:
  - Miscorrelation between pre-silicon and post-silicon behavior
  - Simplified delay models, with limited ability to consider the effects of logical interactions between signals
- Result: about 5% of the chip frequency is achieved by post-silicon speed path debug
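As a rough illustration of what static timing analysis computes, the sketch below propagates worst-case arrival times through a small combinational graph without simulating any input vectors. The netlist, the gate delays, and the function name `arrival_times` are hypothetical and chosen only for this example; production STA uses much richer delay models.

```python
from graphlib import TopologicalSorter

def arrival_times(fanin, delay):
    """Return the latest arrival time at every node of a combinational DAG.

    fanin: dict mapping node -> list of driver nodes (primary inputs map to []).
    delay: dict mapping node -> propagation delay of that node.
    """
    arrival = {}
    # static_order() yields each node only after all of its drivers.
    for node in TopologicalSorter(fanin).static_order():
        latest_input = max((arrival[d] for d in fanin.get(node, [])), default=0.0)
        arrival[node] = latest_input + delay.get(node, 0.0)
    return arrival

# Hypothetical netlist: inputs a and b feed g1; g1 and b feed g2.
FANIN = {"a": [], "b": [], "g1": ["a", "b"], "g2": ["g1", "b"]}
DELAY = {"a": 0.0, "b": 0.0, "g1": 1.5, "g2": 2.0}
ARRIVAL = arrival_times(FANIN, DELAY)
```

Because the analysis takes the maximum over all drivers regardless of whether a path can be logically activated, it illustrates the limitation listed under Cons: logical interactions between signals are not considered.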
Post-silicon Speed Debug
- Time-consuming process
  - Hundreds of speed paths for some chips
- Based on Laser Assisted Device Alteration (LADA)
  - Costly machines (>$1 million per machine)
  - Requires skilled operators
  - Serial process
  - Some units might be burnt or broken
- Time-to-market (TTM) requirements sometimes cause projects to ship at a lower frequency (GHz)
How it was done so far
[Flow diagram. Stages shown: Validation, Collect All Failures, Failures to Debug, Debug Each Failure, Reproduce the Failure, Isolate and Id Speedpath, Si. Debug, Probing, ZBB'ed Failures, Bug Fix]
Timing Domains
- A timing domain is a set of HW devices controlled by a common clock
[Diagram: combinatorial blocks between the SRC clock domain and the DST clock domain, with optical probing]
What is NGSPA?
- Next Generation Speed Path Analyzer
- A new CAD tool for performing speed path isolation
- Enables replacing >$1M optical probing (LADA) machines with a CAD application running on a $1K x86 server
- Optical probing vs. NGSPA:
  - Saves machine cost
  - Saves machine operator resources
  - From serial LADA execution to parallelized CAD
  - From burnt/broken units to deterministic SW
Inputs to NGSPA
- Gate-level schematic model (structural Verilog)
- A trace produced by simulating the test on the RTL
  - Either RTL simulation (~overnight)
  - Or an emulation trace (~2 hours)
- Failing scan and failing cycle
- Path length: 10-20 cycles
- Source and destination timing domains
How it works
[Diagram: failing scanout; CORE A and CORE B; Block1 through Block6; SRC domain and DST domain]
A speed path
[Diagram: a speed path from the SRC domain to the DST domain. Labels: Scan, Inputs, Not widely inserted]
Our approach for isolating speed paths
- Reproduce the functional behavior of the speed path
- Instead of silicon debug, we use the logical model of the design
- Assumption: the speed path was triggered by a logic transition at one of the sequentials in the source domain
Using SAT for Backward propagation
[Diagram: a gate network annotated with 0, 1, and X values]
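The idea of backward propagation with SAT can be sketched on a toy circuit: encode the gates as CNF clauses, fix the required output value, and let a solver find input values that justify it. Everything below is an assumption made for illustration, not NGSPA's implementation: the circuit `out = (a AND b) OR c`, the variable numbering, and the tiny backtracking search `solve`, which stands in for a real SAT solver. Variables the search never needs to assign are reported as X.

```python
A, B, C, N, OUT = 1, 2, 3, 4, 5  # hypothetical variable numbering

def and_gate(out, a, b):
    # CNF for out <-> (a AND b)
    return [[-out, a], [-out, b], [out, -a, -b]]

def or_gate(out, a, b):
    # CNF for out <-> (a OR b)
    return [[out, -a], [out, -b], [-out, a, b]]

def solve(clauses, assignment=None):
    """Tiny backtracking SAT search; returns a dict var -> bool, or None."""
    assignment = dict(assignment or {})
    simplified = []
    for clause in clauses:
        # A clause is satisfied if any literal agrees with the assignment.
        if any(assignment.get(abs(lit)) == (lit > 0) for lit in clause):
            continue
        rest = [lit for lit in clause if abs(lit) not in assignment]
        if not rest:
            return None  # every literal is false: conflict
        simplified.append(rest)
    if not simplified:
        return assignment  # unassigned variables are don't-cares
    var = abs(simplified[0][0])
    for value in (True, False):
        result = solve(simplified, {**assignment, var: value})
        if result is not None:
            return result
    return None

def justify(out_value):
    """Find input values that force OUT to out_value."""
    target = [[OUT]] if out_value else [[-OUT]]
    return solve(target + and_gate(N, A, B) + or_gate(OUT, N, C))

def show(assignment):
    names = {A: "a", B: "b", C: "c"}
    bit = {True: "1", False: "0", None: "X"}
    return " ".join(f"{n}={bit[assignment.get(v)]}" for v, n in names.items())

ONE = justify(True)
ZERO = justify(False)
```

Calling `show(ONE)` renders the justification found for `out = 1` in the same 0/1/X notation used on the slide.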
Finding Speed paths
[Diagram: two copies of the logic between the SRC and DST scan elements, driven by inputs Inp1 and Inp2]
- Copy 1: scan values [v0, v1, ..., vj, ..., vk], with stimuli from the trace
- Copy 2: same inputs with the same values and the same free inputs; only one selector is high; scan values [?, ?, ..., NOT vj, ..., ?]
- Flipping the scan at scanout_phase (= j)
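The two-copy query can be approximated on a toy cone as follows: keep the inputs identical, let exactly one source sequential take a different value (the "only one selector is high" condition), and check whether the destination value flips. This is a brute-force stand-in for the SAT query, written under assumptions: `candidate_sources` and `toy_logic` are hypothetical names, and the cone is invented for the example.

```python
def candidate_sources(logic, src_values, inputs):
    """Return indices of source sequentials whose single flip flips the output."""
    reference = logic(src_values, inputs)
    candidates = []
    for i in range(len(src_values)):
        # One-hot selector: only sequential i differs from the reference copy.
        faulty = list(src_values)
        faulty[i] = not faulty[i]
        if logic(faulty, inputs) != reference:
            candidates.append(i)
    return candidates

def toy_logic(src, inputs):
    # Hypothetical cone: DST = (s0 AND s1) OR (s2 AND inp0)
    return (src[0] and src[1]) or (src[2] and inputs[0])

# Values assumed to come from the trace: s0=1, s1=1, s2=0, inp0=0.
CANDIDATES = candidate_sources(toy_logic, [True, True, False], [False])
```

With these trace values, flipping `s0` or `s1` changes the destination value while flipping `s2` does not, so only the first two are reported as possible speed path sources.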
First Challenge: Reconverging logic
[Diagram: reconverging logic from A to Out, with signal values labeled [F] and [T]]
In a more general way
[Diagram: Scan, f, SRC]
Handling Reconverging Paths
[Diagram: Scan, f, SRC, with selectors SEL-1, SEL-2, and SEL-3 added]
- Mutex (SEL-2, SEL-3)
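A mutex constraint such as Mutex (SEL-2, SEL-3) is commonly expressed in CNF with the pairwise "at most one" encoding: one two-literal clause per pair of selectors. The sketch below shows that encoding and enumerates the selector assignments it allows. The variable numbers and helper names are hypothetical, and whether NGSPA uses this exact encoding is an assumption.

```python
from itertools import combinations, product

def at_most_one(selectors):
    """Pairwise encoding: for every pair, at least one selector is low."""
    return [[-a, -b] for a, b in combinations(selectors, 2)]

def satisfies(clauses, assignment):
    """Check a full assignment (dict var -> bool) against CNF clauses."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )

SEL_2, SEL_3 = 2, 3  # hypothetical variable numbers for the two selectors
MUTEX = at_most_one([SEL_2, SEL_3])

# Enumerate every assignment of the two selectors and keep the legal ones.
ALLOWED = [
    values
    for values in product([False, True], repeat=2)
    if satisfies(MUTEX, {SEL_2: values[0], SEL_3: values[1]})
]
```

Only the assignment with both selectors high is excluded, which matches the intent that the changed value travels along at most one of the reconverging branches.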
Second Challenge: Dealing with complexity
[Diagram: CORE A and CORE B; Block1 through Block6; SRC and DST]
Iterative Cone Expansion
[Diagram: the cone of logic behind the failing scan at phase j, expanded step by step through phases j-1, j-2, j-3, and j-4]
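Iterative cone expansion can be sketched as a backward traversal that grows the cone one level of drivers at a time and stops as soon as the cone is large enough to answer the query. In this sketch the stop condition is simply "a source-domain element has been reached", which is a placeholder for the solver finding a speed path; the netlist and the names `expand_cone`, `iterative_cone`, and `is_source` are hypothetical.

```python
def expand_cone(fanin, frontier, cone):
    """Grow the cone by one level: add the drivers of every frontier node."""
    new_frontier = set()
    for node in frontier:
        for driver in fanin.get(node, []):
            if driver not in cone:
                cone.add(driver)
                new_frontier.add(driver)
    return new_frontier

def iterative_cone(fanin, failing_scan, is_source, max_iterations=10):
    """Expand backward from the failing scan until a source is in the cone.

    Returns (cone, iterations), or (cone, None) if no source was reached.
    """
    cone = {failing_scan}
    frontier = {failing_scan}
    for iteration in range(1, max_iterations + 1):
        frontier = expand_cone(fanin, frontier, cone)
        # Placeholder stop condition standing in for a successful solver query.
        if any(is_source(node) for node in cone):
            return cone, iteration
        if not frontier:
            break  # nothing left to expand
    return cone, None

# Hypothetical netlist: scan <- l2 <- {l1, x}, l1 <- src
FANIN = {"scan": ["l2"], "l2": ["l1", "x"], "l1": ["src"], "x": []}
CONE, ITERATIONS = iterative_cone(FANIN, "scan", lambda n: n == "src")
```

Starting small and expanding only when needed keeps each query limited to the logic that matters, which is the motivation for handling complexity this way.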
Results

Test   # signals   # inputs on   # latches   # reconverg.   # of          Path length   # of    Run time
No.    in cone     boundary      in cone     signals        iterations    (phases)      paths   (sec.)
1      296         26            2           4              5             3             1       248
2      509         67            14          11             6             4             1       278
3      405         54            3           12             11            8             1       214
4      305         19            3           0              6             4             1       290
5      248         11            1           0              1             1             1       186
6      517         50            14          26             55            44            1       227
7      497         83            4           3              7             4             1       222
8      1528        212           59          86             14            8             1       745
9      27696       3009          635         8569           31            16            1       7168
10     3025        617           43          650            15            8             2       434
11     2403        345           22          209            12            7             2       318
12     1798        258           58          236            33            20            2       442
13     855         164           8           27             8             5             3       222
14     25895       7279          294         1070           30            16            3       6458
15     21864       4618          165         2266           33            18            3       3395
16     855         164           8           27             8             5             4       242
17     1545        303           46          5              12            6             5       5555
18     837         90            39          29             23            12            6       619
19     4665        704           106         1149           31            18            7       579
20     8789        994           125         2132           26            14            7       1713
21     26226       4035          168         2422           27            14            15      3285
22     4931        675           167         689            27            14            40      780
What speed paths look like
[Diagram: example speed paths between SRC and DST elements, including nodes labeled A, B, and S]
Results
[Chart: LADA vs. NGSPA by stepping/spin (A0, B0, P0, C0). Axes: # of days, # of speed paths, days per speed path]
Progress so far
- >90% of the optical probing activity was saved
- One of two LADA machines in the debug lab will be released
- Work in progress: deploying this technology across Intel
- Limitation: cases where no failing scan was detected, despite the fact that the test failed
Future work
- Can we drop the need for RTL simulation/emulation and use scan dump traces only?
  - Pros: faster turnaround time (TAT)
  - Cons: less observability
- Use the same technology for yield analysis
Summary
- NGSPA is one of the great examples demonstrating the glory of formal verification
  - Ability to replace laser-based machines with CAD
- The same technology can be applied to adjacent areas such as fault isolation and glitch detection
- Formal technologies (SAT and SMT) are being deployed in other interesting areas at Intel
  - Tester scheduling, layout routing and filling, and others