High-Speed Error-Correcting Modular Computation Advancements
Introducing a powerful form of computation that includes error correction and high precision in FPGA, with modules for multiplication, accumulation, and conversion from binary to fixed-point RNS format. The advancements provide carry-free operations, efficient space allocation, and accurate results until the final normalization step.
Download Presentation
Please find below an Image/Link to download the presentation.
The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. Download presentation by click this link. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.
E N D
Presentation Transcript
High High- -speed, error speed, error- - correcting product correcting product accumulator accumulator Introducing a practical and powerful form of computation that includes error correction.
Error Error- -correcting product correcting product- -accumulator demo test bed demo test bed accumulator Safe region clock Over-clock region Dual Clock FIFO Data A Memory Forward convert Multiply & accumulate Dual Clock FIFO Reverse convert Error Normalize Correction Dual Clock FIFO Data B Memory Forward convert Error Code fifo Result data fifo Binary data flow Nios-II CPU Modular data flow High-speed Variable clock
Modular computation fills classic gap for high Modular computation fills classic gap for high- - performance computation in an FPGA performance computation in an FPGA Double precision float float Double precision Numeric Precision Modular computation computation Modular Single precision float float Single precision 16 bit integer integer 16 bit 8 bit integer integer 8 bit Speed Modular computation provides high-speed and high-precision
Conversion from binary to fixed Conversion from binary to fixed- -point RNS format is performed first format is performed first point RNS d0 d1 d2 d3 d4 d5 d6 d7 d8 d9 Residue output format: {d0,d1.d2,d3,d4,d5,d6,d7,d8,d9} Binary Format input (32.32): (+/-) 0xXXXXXXXX.XXXXXXXX Forward convert Maitrix has developed a new fixed-point fractional format in RNS !
Modular multiply Modular multiply- -accumulator accumulator Modular accumulator provides high-speed, carry-free operation. a0 b0 x Digit Acc. y0 Errors occurring in one digit do not propagate to another digit regardless of the number of products summed. + a1 b1 x Digit Acc. y1 + Modular accumulator is highly accurate, there is no loss of information until the final step of normalization (no rounding, truncation until final step). a2 b2 x Digit Acc. y2 + Modular multiplier is high-efficiency since unit multipliers are allocated in a linear fashion with respect to precision (in bits). a9 b9 x Digit Acc. y9 + Modular circuits are space efficient; example: the output result is same width as input operand width.
Pipeline back Pipeline back- -end includes error correction, end includes error correction, normalization and reverse conversion to binary. normalization and reverse conversion to binary. (+/-) XXXXXXXX.XXXXXXXX Binary Format output (32.32): d0 d1 d2 d3 d4 d5 d6 d7 d8 d9 d0 d1 d2 d3 d4 d5 d6 d7 d8 d9 d0 d1 d2 d3 d4 d5 d6 d7 d0 d1 d2 d3 d4 d5 d6 d7 d0 d1 d2 d3 d4 d5 Reverse convert Normalize Error Correction New arithmetic technology from Maitrix provides computation in the modular domain but provides corrected results in a standard binary fixed-point or floating-point format.
Generating arithmetic errors by Generating arithmetic errors by over over- -clocking clocking Test bed uses over-clocking to generate arithmetic errors. Safe region clock Over-clock region Multiply and Accumulator unit (MAU) is isolated from rest of circuit using dual-clock FIFOs. Dual Clock FIFO Multiply & accumulate Dual Clock FIFO A high-speed variable over- clock source is created using an external frequency generator and a PLL internal to FPGA. Dual Clock FIFO High-speed Variable Over-clock
Vector arrangement amortizes cost of error Vector arrangement amortizes cost of error correction by supporting multiple MAUs. correction by supporting multiple MAUs. Forward convert A1 Multiply & accumulate Forward convert B1 Forward convert A2 Multiply & accumulate Selector Forward convert Reverse convert Error B2 Normalize Correction Forward convert A16 Multiply & accumulate Forward convert B16 Adding error correction increases circuitry by 28% for 16 MAU s that share a single error correction unit.
Single Digit Error Correction kicks Single Digit Error Correction kicks- -in at 372 MHz over 372 MHz over- -clock in at clock Modular computation operates with a small error rate past its rated frequency of 270 MHz. Error correction starts to dominate at critical frequency of 372 MHz.
Non Non- -correctable errors held to less correctable errors held to less than 1% re than 1% re- -tries allows over tries allows over- -clocking! clocking! Effective clock rate versus NC errors Effective clock 380.00 375.00 EFFECTIVE FREQUENCY 370.00 365.00 360.00 355.00 350.00 345.00 340.00 335.00 0.00% 0.50% 1.00% 1.50% 2.00% 2.50% 3.00% 3.50% 4.00% 4.50% 5.00% ERROR RATE If 1% errors occur at 375 MHz, then the effect of 1% retries equates to an effective clock rate of 370 MHz