Analysis of Drawbacks in BlinkDB System
BlinkDB is a system that organizes sampling around query column sets (QCSes) and determines the query classes with the best efficiency. However, potential failure points include unstable QCSes, high rare-subgroup counts, and challenging dimensionality. Drawbacks include unclear parameter tuning, limited o
0 views • 23 slides
Overview of BlinkDB: Query Optimization for Very Large Data
BlinkDB is a framework built on Apache Hive, designed to support interactive SQL-like aggregate queries over massive datasets. It creates and maintains samples from data for fast, approximate query answers, supporting various aggregate functions with error bounds. The architecture includes modules f
0 views • 26 slides
Understanding BlinkDB: A Framework for Fast and Approximate Query Processing
BlinkDB is a framework built on Hive and Spark that creates and maintains offline samples for fast, approximate query processing. It provides error bars for queries executed on the same data and ensures correctness. The paper introduces innovations like sample creation techniques, error latency prof
0 views • 8 slides