NoSQL Database Scalability Using Indexing Techniques

 
Replex: A Multi-Index, Highly-Available Data Store

Amy Tai*§, Michael Wei*, Michael J. Freedman§, Ittai Abraham*, Dahlia Malkhi*
*VMware Research, §Princeton University
SQL vs NoSQL

SQL:
  Rich querying model
  Transaction support
  Poor performance

NoSQL:
  Simple, key-based querying
  No or limited transaction support
  Scalable
  Key-value stores. Ex: HyperDex, Cassandra

NewSQL:
  Rich querying model
  Transaction support
  Scalable, high performance
  Ex: CockroachDB, Yesquel, H-Store
NoSQL Scales with Database Partitioning

[Figure: a table with columns col0, col1, col2, . . ., colc; each row is (r[0] r[1] r[2]  . . . r[c]). Rows are split into partitions by the default partitioning key.]

Partitions are formed with respect to the partitioning index.
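To make the partitioning step concrete, here is a minimal Python sketch of shared-nothing partitioning by a default partitioning key. The partition count and function names are illustrative, not taken from Replex or HyperDex.

    import hashlib

    NUM_PARTITIONS = 8  # illustrative cluster size

    def partition_for(key):
        # Deterministically map a partitioning-key value to one partition.
        digest = hashlib.sha1(str(key).encode("utf-8")).hexdigest()
        return int(digest, 16) % NUM_PARTITIONS

    # A row (r[0], r[1], ..., r[c]) lives on the partition chosen by its
    # default partitioning key r[0]; a lookup on that key is routed to the
    # same single partition, which is what makes scale-out cheap.
    row = ("alice", "engineering", "princeton")
    home = partition_for(row[0])

Queries on any other column carry no such routing information, which is the problem the following slides address.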
 
How to enable richer queries without sacrificing shared-nothing scale-out?
Key Observations
 
1. Indexing enables richer queries
Local Indexing on the Partitioning Index

Index updates are local
Index lookups are local, e.g., select * where col0 = "foo"
Local Indexing

Index stored locally at each partition

Index updates are local, e.g., insert (r[0] r[1] r[2]  . . . r[c])
Local Indexing

Each partition builds a local index

Index updates are local
Index lookups must be broadcast to all partitions, e.g., select * where col2 = "foo"
Potential synchronization across partitions

Global Indexing

Distributed data structure spans all partitions

Index lookups are executed on a single data structure
Index updates are multi-hop, e.g., insert (r[0] r[1] r[2]  . . . r[c])
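The difference between the two lookup paths can be sketched in a few lines of Python. This is a minimal model that assumes hypothetical per-partition dictionaries as the local indexes; it is not HyperDex's or Replex's API.

    from collections import defaultdict

    class Partition:
        # One shared-nothing partition holding its own (local) indexes.
        def __init__(self):
            self.by_col0 = defaultdict(list)   # index on the partitioning key
            self.by_col2 = defaultdict(list)   # local index on a secondary column

        def insert(self, row):
            # Both index updates stay inside this one partition.
            self.by_col0[row[0]].append(row)
            self.by_col2[row[2]].append(row)

    partitions = [Partition() for _ in range(8)]

    def lookup_col0(value):
        # Partitioning-key lookup: touches exactly one partition.
        return partitions[hash(value) % len(partitions)].by_col0[value]

    def lookup_col2(value):
        # Secondary lookup under local indexing: scatter the request to
        # every partition and gather the results (the broadcast above).
        hits = []
        for p in partitions:
            hits.extend(p.by_col2[value])
        return hits

A global index removes the broadcast from lookups but makes inserts multi-hop, because the index entry generally lives on a different partition than the row itself.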
Key Observations
 
1. Indexing enables richer queries
2. Indexes more efficient if data is partitioned according to that index
Partitioning by Index: Storage Overheads

Partitioning by an index other than the default partitioning key must store the data again, once per additional index.

Ex: HyperDex
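A quick back-of-the-envelope for that storage cost, using the configuration that appears later in the evaluation (a table with two indexes and enough copies to tolerate two failures); the variable names are illustrative.

    indexes = 2        # the table has two indexes (default key + one secondary)
    replication = 3    # copies per index-partitioned layout to tolerate 2 failures

    # Partition-by-index layered on top of replication (the HyperDex-style
    # layout above): every index gets its own partitioned copy of the data,
    # and each of those copies is replicated.
    copies_layered = indexes * replication   # 6x storage

    # Replex's proposal: let each replicated copy double as an index.
    copies_replex = replication              # 3x storage

These are the 6x and 3x storage overheads that show up in the evaluation section.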
 
Key Observations
 
1. Indexing enables richer queries
2. Indexes more efficient if data is partitioned according to that index
3. Rethink replication paradigm
Replex

New replication unit: replex -- a data replica partitioned and sorted with respect to an associated index.

A replex serves both data replication and indexing.

Replace data replicas with replexes.
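To make the definition concrete, the sketch below models a replex in Python as one full copy of the rows, partitioned by its associated index column and kept sorted within each partition. The class is a simplified illustration, not Replex's actual implementation.

    import bisect

    class Replex:
        # A data replica partitioned and sorted by one associated index column.
        def __init__(self, index_col, num_partitions):
            self.index_col = index_col
            self.partitions = [[] for _ in range(num_partitions)]  # each kept sorted

        def partition_of(self, row):
            # The replex's own index value decides where the row lives.
            return hash(row[self.index_col]) % len(self.partitions)

        def insert(self, row):
            key = row[self.index_col]
            bisect.insort(self.partitions[self.partition_of(row)], (key, row))

        def lookup(self, value):
            # Single-partition lookup on this replex's index.
            part = self.partitions[hash(value) % len(self.partitions)]
            return [r for (k, r) in part if k == value]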
System Architecture
Inserts in Replex

insert (r[0] r[1] r[2]  . . . r[c])

Partition Determined by a replex's Index

Within each replex, the target partition is determined by that replex's index: the replex indexed on col0 places the row according to r[0], and the replex indexed on col2 places it according to r[2].
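A minimal sketch of how one logical insert fans out, assuming the hypothetical Replex class from the previous sketch: each replex picks its own target partition from the row's value in that replex's index column, and the resulting set of partitions is the row's data chain.

    # Two replexes over the same table: one indexed on col0, one on col2.
    replex0 = Replex(index_col=0, num_partitions=4)
    replex2 = Replex(index_col=2, num_partitions=4)

    def insert_row(row):
        chain = []
        for rx in (replex0, replex2):
            chain.append((rx.index_col, rx.partition_of(row)))  # per-replex target
            rx.insert(row)
        return chain   # the row's data chain: one partition per replex

    insert_row(("alice", "engineering", "princeton"))

The insert is complete only once it has visited every partition in the chain, which lines up with the per-insert hop counts reported later (2 hops for Replex-2, 3 for Replex-3).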
Data Chains in Replex

Partitions in different replexes make up a data chain if they share rows:

  (x[0] x[1] x[2]  . . . x[c])
  (y[0] y[1] y[2]  . . . y[c])
  (z[0] z[1] z[2]  . . . z[c])

All pairs of partitions are potential data chains.
Main Challenge in Replex
 
Increased failure complexity
Partition Failures

When a partition fails, its index is unavailable, but the data is still available in other replexes.

1. Partition Recovery
2. Client Requests

Where is the data?
Finding Data After Failure

Partition Recovery:
  Parallel recovery -> recovery time

Client Requests:
  Requests must be multicast
  Lookups reduce to a linear scan at each partition
  -> request amplification

Tradeoff?
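The client-request side of this tradeoff can be sketched as follows, continuing the hypothetical two-replex example from earlier: if the partition that would normally answer a col2 lookup is down, the request falls back to the col0 replex, where it has to be multicast to every partition and answered by a linear scan. That multicast is the request amplification named above.

    FAILED = set()   # ids of failed partitions in replex2 (illustrative)

    def lookup_col2_with_failover(value):
        home = hash(value) % len(replex2.partitions)
        if home not in FAILED:
            return replex2.lookup(value)      # normal single-partition path
        # Failover: the rows still exist in replex0, but there they are
        # neither partitioned nor sorted by col2, so the lookup is multicast
        # to all partitions and each one does a linear scan.
        hits = []
        for part in replex0.partitions:       # request amplification: n partitions
            hits.extend(r for (_k, r) in part if r[2] == value)
        return hits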
Hybrid Replex: Key Contributions

1. Recovery time vs request amplification tradeoff
2. Improve failure availability of multiple replexes
3. Storage vs recovery performance tradeoff
4. Graceful failure degradation
Hybrid Replex: What is it?

Shared across r replexes
Not associated with an index
Partitioning function dependent on these r replexes

Hybrid Replex

[Figure: a hybrid replex alongside the r replexes that share it.]
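One way to picture such a partitioning function, sketched below for r = 2 with a grid-style layout (an illustration of the idea, not necessarily the exact function used in Replex): the hybrid partition is chosen from both index values, so a lookup or recovery that knows only one of the two values still has to contact only one row or column of the grid.

    import math

    NUM_HYBRID = 16                   # illustrative; a perfect square
    SIDE = math.isqrt(NUM_HYBRID)     # 4 x 4 grid of hybrid partitions

    def hybrid_partition(row, col_a=0, col_b=2):
        # The hybrid replex's partition depends on BOTH sharing replexes.
        i = hash(row[col_a]) % SIDE   # coordinate tied to the col0 replex
        j = hash(row[col_b]) % SIDE   # coordinate tied to the col2 replex
        return i * SIDE + j

    def fallback_partitions_for_col2(value):
        # If the col2 replex is unavailable, only the grid column matching
        # hash(value) can hold the relevant rows: sqrt(n) partitions
        # instead of all n.
        j = hash(value) % SIDE
        return [i * SIDE + j for i in range(SIDE)]

That sqrt(n) figure matches the O(n^(1/2)) request amplification given for r = 2, k = 1 on the generalization slide below.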
Data Chains with Hybrid Replex

insert (r[0] r[1] r[2]  . . . r[c])

The row is written to the partition chosen by each replex's index and to a hybrid partition chosen from the combination of those index values; together these partitions form the row's data chain.
1. Recovery Time vs Request Amplification

Recovery is less parallel, but request amplification is lower.
2. Improve failure availability of multiple replexes

Improving failure availability without hybrid replexes: 2x storage!
3. Storage vs. recovery performance

1.5x storage! ... but worse client request performance.
Generalizing Hybrid Replexes

k hybrid replexes may be shared across r replexes.

For the special case when r = 2, request amplification is O(n^(1/(k+1))).
For the special case when k = 1, request amplification is O(n^((r-1)/r)).
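As a quick worked example of the two bounds (the choice of n = 64 partitions is arbitrary):

    n = 64   # illustrative number of partitions

    # r = 2 replexes sharing k hybrid replexes: O(n^(1/(k+1)))
    for k in (1, 2, 3):
        print(k, round(n ** (1 / (k + 1)), 2))    # 8.0, 4.0, 2.83

    # k = 1 hybrid replex shared across r replexes: O(n^((r-1)/r))
    for r in (2, 3, 4):
        print(r, round(n ** ((r - 1) / r), 2))    # 8.0, 16.0, 22.63

Adding hybrid replexes lowers request amplification at the cost of extra storage; sharing one hybrid replex across more replexes saves storage but raises amplification.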
 
Implementation
 
Built on top of HyperDex, ~700 LOC
Implemented partition recovery and request re-routing on failure
Implemented a variety of hybrid replex configurations
 
Evaluation
 
 
1. What is the impact of replexes on steady-state performance?

2. How do hybrid replexes affect failure performance?
 
 
Evaluation Setup
 
Table with two indexes
12 server machines, 4 client machines
All machines colocated in the same rack, connected via a 1 Gb top-of-rack switch
8 CPUs, 16 GB memory per machine
Systems Under Evaluation

                             Replex-2   Replex-3   HyperDex
  Storage overhead           2x         3x         6x
  Network hops per insert    2          3          6
  Failures tolerated         1          2          2
Steady State Latency

[Figures: read latency and insert latency for Replex-3 and HyperDex. Inserts take 3 network hops in Replex-3 vs 6 in HyperDex.]

Replex-2 not included because it has a lower fault tolerance threshold.
Reads against either index have similar latency; we report reads against the primary index.
Single Failure Performance

Experiment:
  Load with 10M 100-byte rows
  Split reads 50:50 between each index
  Then a single failure is injected
Single Failure: Recovery Time

1. HyperDex recovers slowest because it stores 2-3x more data
2. Replex-2 recovers fastest because it stores the least data and recovery is parallel
Single Failure: Failure Throughput

1. Replex-2 has low throughput because of high request amplification
2. Replex-3 has throughput comparable to HyperDex; its hybrid replex increases throughput during the failure
Two Failure Performance

Experiment:
  Load with 1M 1 KB rows
  50:50 reads against each index
  Then two failures are injected

*Replex-2 not included because it only tolerates 1 failure
Two Failure Performance

1. Recovery Time: Replex-3 recovers nearly 3x faster because it stores less data
2. Failure Throughput: HyperDex drops to about half its throughput because nearly every partition is participating in recovery
Summary

1. Replacing data replicas with replexes decreases index storage AND steady-state overheads

2. Hybrid replexes enable a rich tradeoff space during failures
Questions?