Efficient High-Speed Video Compression Techniques

An overview of a high-speed video compression and decompression pipeline for transferring high-bit-rate video frames to an FPGA. The slides survey JPEG, JPEG-2000, and MPEG, then cover the Discrete Wavelet Transform, parallelism, dummy FIFOs, and the micro-architecture of the 1D and 2D DWT modules.

  • Video Compression
  • High Speed
  • FPGA Transfer
  • Discrete Wavelet Transform
  • Parallelism



Presentation Transcript


  1. High Speed Video Compression/Decompression Pipeline. Ariana Einsenstein, Yuan Cao

  2. Objective: Transfer video frames to the FPGA quickly. Video data has a high bit rate, so compress the data before transferring. [Diagram labels: PC; Some Interface; Compression Modules (1 byte, 512 bits); Decompression Modules (1 byte, 512 bits); DRAM Wrapper; DDR3 DRAM; 50 MHz]

  3. Some standards: JPEG: Discrete Cosine Transform (medium), Huffman Encoder (medium). JPEG-2000 (better): Discrete Wavelet Transform (complicated), Entropy Encoder (very complicated). MPEG (even better): Motion predictor (very complicated).

  4. Discrete Wavelet Transform (DWT): Low Pass, High Pass

  5. DWT vs. FFT: keep the 40 largest components
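The comparison on this slide rests on discarding all but the largest transform coefficients. A minimal sketch of that thresholding step, assuming the coefficients are already available as a flat list; the function name `keep_largest` is hypothetical and not taken from the slides:

```python
def keep_largest(coeffs, k):
    """Zero every coefficient except the k with the largest magnitude."""
    if k <= 0:
        return [0] * len(coeffs)
    # Indices sorted by decreasing absolute value; ties keep their original order.
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    keep = set(order[:k])
    return [c if i in keep else 0 for i, c in enumerate(coeffs)]


# Example: keep the 2 largest of 5 coefficients.
kept = keep_largest([5, -1, 0.5, -7, 2], 2)
```

For the slide's setting, `k` would be 40, applied once to the DWT coefficients and once to the FFT coefficients before reconstructing the signal.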

  6. DWT 2-dimensional: original image vs. 4-level DWT. [Figure labels: subbands LL, LH, HL, HH and LL2, LH2, HL2, HH2; annotations "Non-zero" and "Close to zero"]

  7. Lifting scheme: y = (x0 + x2) * a + x1
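The lifting formula can be sketched as a function applied to every odd-indexed sample, with its two even-indexed neighbours as x0 and x2. The boundary handling (mirroring the last neighbour) and the function names are assumptions; the slide gives only the formula. The inverse subtracts the same term, which is what makes a lifting step exactly reversible.

```python
def lifting_step(x0, x1, x2, a):
    """One lifting update, as on the slide: y = (x0 + x2) * a + x1."""
    return (x0 + x2) * a + x1


def lift_odd_samples(x, a):
    """Apply the lifting step to every odd-indexed sample of x.

    The missing right neighbour of the last odd sample is mirrored
    (an assumption; the slide does not specify boundary handling).
    """
    y = list(x)
    n = len(x)
    for i in range(1, n, 2):
        right = x[i + 1] if i + 1 < n else x[i - 1]
        y[i] = lifting_step(x[i - 1], x[i], right, a)
    return y


def unlift_odd_samples(y, a):
    """Undo lift_odd_samples: even samples are untouched, so x1 = y - (x0 + x2) * a."""
    x = list(y)
    n = len(y)
    for i in range(1, n, 2):
        right = y[i + 1] if i + 1 < n else y[i - 1]
        x[i] = y[i] - (y[i - 1] + right) * a
    return x


lifted = lift_odd_samples([1, 2, 3, 4], -0.5)
restored = unlift_odd_samples(lifted, -0.5)
```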

  8. How much parallelism? With 1,024 samples processed in parallel: 2,048 multiplier/adders, 1,024 samples/cycle @ 50 MHz ≈ 50 GSPS. With 128*8 samples (8 per cycle): 16 multiplier/adders, 8 samples/cycle @ 50 MHz = 400 MSPS.
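The numbers on this slide can be checked with a few lines of arithmetic. The multiplier/adder count below assumes four lifting stages, each updating half of the samples; that reading is an interpretation that happens to reproduce both figures on the slide, not something the slide states.

```python
CLOCK_HZ = 50_000_000  # 50 MHz, from the slide
LIFTING_STAGES = 4     # assumption: four lifting stages, each touching half the samples


def samples_per_second(samples_per_cycle, clock_hz=CLOCK_HZ):
    """Throughput in samples per second for a fully pipelined design."""
    return samples_per_cycle * clock_hz


def multiplier_adders(samples_per_cycle, stages=LIFTING_STAGES):
    """Multiplier/adder units needed under the stated assumption."""
    return stages * samples_per_cycle // 2


wide = samples_per_second(1024)    # 51.2e9, rounded to 50 GSPS on the slide
narrow = samples_per_second(8)     # 400e6 = 400 MSPS
```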

  9. Dummy FIFOs: (1) Original lifting scheme; (2) Modified with dummy buffers. [Diagram labels: stages S1, S2, S3, S4, Sc; FIFO; inputs x0, x1, x2, x3; outputs l0, h0, l1, h1]

  10. DWT 1D micro-architecture: latency insensitive, fully pipelined. [Diagram labels: ififo and ofifo carrying Vector#(B, WSample); ififo_history; Stage_1, Stage_2, Stage_3, Stage_4, each with a Mult-Adder; Coef_a, Coef_b, Coef_c, Coef_d; s1fifo, s1xfifo, s2fifo, s2save, s3fifo, s3xfifo, s4fifo; +1 and -1 counters with =0 comparisons; Scale blocks using coef_scale and 1/coef_scale]
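A software model of the four lifting stages followed by scaling might look like the sketch below. The slides name Coef_a to Coef_d and coef_scale but give no values; the constants here are the CDF 9/7 lifting constants commonly used for lossy JPEG-2000 and are an assumption. Mirrored boundaries, the choice of which band is divided by the scale, and all function names are likewise assumptions.

```python
# Assumed CDF 9/7 lifting constants (values are not given on the slides).
COEF_A = -1.586134342
COEF_B = -0.05298011854
COEF_C = 0.8829110762
COEF_D = 0.4435068522
COEF_SCALE = 1.149604398


def _lift(x, start, coef, sign=1.0):
    """In place: x[i] += sign * coef * (x[i-1] + x[i+1]) for i = start, start+2, ...

    Neighbours that fall outside the signal are mirrored (an assumption).
    """
    n = len(x)
    for i in range(start, n, 2):
        left = x[i - 1] if i - 1 >= 0 else x[i + 1]
        right = x[i + 1] if i + 1 < n else x[i - 1]
        x[i] += sign * coef * (left + right)


def dwt1d(signal):
    """Forward 1D DWT for an even-length signal; returns (low, high)."""
    x = [float(v) for v in signal]
    _lift(x, 1, COEF_A)  # stage 1: update odd samples
    _lift(x, 0, COEF_B)  # stage 2: update even samples
    _lift(x, 1, COEF_C)  # stage 3: update odd samples
    _lift(x, 0, COEF_D)  # stage 4: update even samples
    low = [v / COEF_SCALE for v in x[0::2]]
    high = [v * COEF_SCALE for v in x[1::2]]
    return low, high


def idwt1d(low, high):
    """Inverse transform: undo the scaling, then the stages in reverse order."""
    x = [0.0] * (len(low) + len(high))
    x[0::2] = [v * COEF_SCALE for v in low]
    x[1::2] = [v / COEF_SCALE for v in high]
    _lift(x, 0, COEF_D, -1.0)
    _lift(x, 1, COEF_C, -1.0)
    _lift(x, 0, COEF_B, -1.0)
    _lift(x, 1, COEF_A, -1.0)
    return x


low, high = dwt1d([3, 7, 1, 8, 2, 9, 4, 6])
roundtrip = idwt1d(low, high)
```

Each stage reads only samples of the opposite parity, so updating in place is safe. In hardware the same property is what lets every stage be its own pipelined Mult-Adder.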

  11. DWT 2D: large BRAM buffers store full lines. [Diagram labels: DWT1D; signals coef_a, coef_b, coef_c, coef_d; Stage1 to Stage4, each a Mult-Add with FIFOs s1fifo[0..1], s2fifo[0..1], s3fifo[0..1], s4fifo[0..1]; s1save, s3save; Distributor; Assembler; Stage_sc with Scale blocks (s and 1/s); output_fifo; data type Vector#(B,WSample)]

  12. Multi-level DWT: Everything is pipelined! Throughput = B samples/cycle. [Diagram labels: DWT2D blocks of size N, N/2, and N/4; Low and High outputs; BRAM FIFOs, each with 16 lines capacity (512 kbit); ofifo; data type Vector#(B, WSample)]
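The chaining idea, where each level re-transforms only the low band of the previous level at half the size, can be illustrated in one dimension. This sketch swaps in a simple Haar average/difference step as a stand-in for the deck's DWT2D blocks; the function names and the 1D simplification are assumptions made for brevity.

```python
def haar_step(x):
    """Stand-in one-level transform: pairwise averages (low) and differences (high)."""
    low = [(x[i] + x[i + 1]) / 2 for i in range(0, len(x), 2)]
    high = [(x[i] - x[i + 1]) / 2 for i in range(0, len(x), 2)]
    return low, high


def multilevel(x, levels, step=haar_step):
    """Apply `step` repeatedly to the low band; sizes go N, N/2, N/4, ...

    Returns the final low band and the high bands from level 1 to `levels`.
    The input length must be divisible by 2**levels.
    """
    low = list(x)
    highs = []
    for _ in range(levels):
        low, high = step(low)
        highs.append(high)
    return low, highs


final_low, high_bands = multilevel([4, 2, 6, 8], levels=2)
```

Only the low band shrinks from level to level, so the later blocks handle a quarter, then a sixteenth, of the original 2D data.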

  13. Histogram of Samples: output from 1-level DWT

  14. Huffman Encoding Tree and Table: 0: 10, 1: 1100, -1: 1101, 2: 11100, -2: 11101, 3: 111100, -3: 111101, -4: 111110, x: 111111. [Tree diagram with 0/1 branch labels leading to the leaves 0, 1, -1, 2, -2, 3, -3, -4]
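A table-driven encoder and decoder for this code can be sketched as follows. It takes the code for -2 to be 11101, an assumption that keeps the table prefix-free (11001 would begin with 1100, the code for 1). The slide does not say what follows the `x` code 111111, so this sketch simply rejects values outside the table; all names are hypothetical.

```python
CODE_TABLE = {
    0: "10", 1: "1100", -1: "1101", 2: "11100", -2: "11101",
    3: "111100", -3: "111101", -4: "111110",
}
ESCAPE_CODE = "111111"  # the slide's "x" entry; its payload format is not specified
DECODE_TABLE = {bits: value for value, bits in CODE_TABLE.items()}
MAX_CODE_LEN = 6


def is_prefix_free(codes):
    """True if no code word is a prefix of a different code word."""
    return not any(a != b and b.startswith(a) for a in codes for b in codes)


def encode(coeffs):
    """Concatenate the code words; raises KeyError for values not in the table."""
    return "".join(CODE_TABLE[c] for c in coeffs)


def decode(bits):
    """Decode a bit string by trying code lengths 2..6 at each position."""
    out = []
    pos = 0
    while pos < len(bits):
        for length in range(2, MAX_CODE_LEN + 1):
            word = bits[pos:pos + length]
            if word in DECODE_TABLE:
                out.append(DECODE_TABLE[word])
                pos += length
                break
        else:
            raise ValueError(f"no code word found at bit {pos}")
    return out


encoded = encode([0, 1, -1])
decoded = decode(encode([0, 2, -2, 3, -4, 0]))
```

Because the code is prefix-free, the first match at each position is the only possible one, and the decoder never needs to look more than 6 bits ahead.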

  15. Encoder Architecture. [Diagram labels: Coeff In Vector FIFO; Encoding Table; Encoded Value Vector FIFO; Circle Buffer (EHR) with Write Index and Read Index; Byte Out FIFO]
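The step from variable-length code words to a byte stream can be modelled in software with a small bit accumulator. This is only a stand-in for the circle buffer on the slide: the class name, the zero padding of the final byte, and the MSB-first bit order are all assumptions.

```python
class BitPacker:
    """Collects variable-length code words and emits whole bytes, MSB first."""

    def __init__(self):
        self._acc = 0      # pending bits held as an integer
        self._nbits = 0    # number of pending bits
        self.out = bytearray()

    def push(self, code):
        """Append a code word given as a string of '0' and '1' characters."""
        for ch in code:
            self._acc = (self._acc << 1) | (1 if ch == "1" else 0)
            self._nbits += 1
        # Emit every complete byte currently available.
        while self._nbits >= 8:
            self._nbits -= 8
            self.out.append((self._acc >> self._nbits) & 0xFF)
        self._acc &= (1 << self._nbits) - 1

    def flush(self):
        """Pad the last partial byte with zeros (an assumption) and return all bytes."""
        if self._nbits:
            self.out.append((self._acc << (8 - self._nbits)) & 0xFF)
            self._acc = 0
            self._nbits = 0
        return bytes(self.out)


packer = BitPacker()
for word in ("10", "1100", "1101"):
    packer.push(word)
packed = packer.flush()
```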

  16. Decoder Architecture. [Diagram labels: Byte In FIFO; Circle Buffer (EHR) with Write Index, Read Index, and Read Index + 6; Encoding Table; Coeff Out FIFO]

  17. FPGA Results, 1024x1024 Monochrome. [Figure panels: Initial Image (Padded to 1024x1024); Matlab and FPGA Compressed Image; Matlab Compressed Image]

  18. FPGA Results, 512x512 Color: 3-level DWT compression, 0.23 compression ratio (lossy). [Figure panels: Initial Image (Padded to 512x512); Matlab and FPGA Compressed Image; Matlab Compressed Image]

  19. Performance. Throughput: 50 MHz * 1 sample/cycle (fully pipelined) = 50 MSPS = 1280x720 RGB @ 18 FPS. Compression ratio: down to 0.23 (5.5 bits/pixel) for lossy compression. FPGA utilization after synthesis: 88k LUTs, 41k registers, 300 Block RAM tiles. Code: 2,620 lines in BSV, 126 lines in C++, and 391 lines in MATLAB.
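The throughput and bits-per-pixel figures follow from simple arithmetic, sketched below. The 8 bits per colour sample is an assumption (it is what makes 0.23 correspond to about 5.5 bits/pixel for RGB); the function names are hypothetical.

```python
def frames_per_second(sample_rate, width, height, channels):
    """Frame rate when every colour sample of every pixel passes through the pipeline."""
    return sample_rate / (width * height * channels)


def bits_per_pixel(compression_ratio, bits_per_sample=8, channels=3):
    """Compressed size per pixel, assuming 8-bit samples (an assumption)."""
    return compression_ratio * bits_per_sample * channels


fps = frames_per_second(50_000_000, 1280, 720, 3)  # about 18.08
bpp = bits_per_pixel(0.23)                         # about 5.52
```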

  20. Thank you!

  21. DWT 1D module w/ serialization & dummy FIFOs. [Diagram labels: stages S1, S2, S3, S4; inputs x0~x7, x8~x15, x16~x23, x24~x31; outputs L0~L3 and H0~H3, L4~L7 and H4~H7, L8~L11 and H8~H11, L12~L15 and H12~H15]
