Understanding NoSQL: Evolution and Key Concepts


Exploring the world of NoSQL databases, this comprehensive overview delves into the history, key features, and evolution of NoSQL technology. From the motivations behind its development to the various systems and papers that shaped its landscape, this content provides insights into how NoSQL has revolutionized data management and scalability.


Uploaded on Oct 06, 2024 | 0 Views



Presentation Transcript


  1. NO SQL

  2. https://www.youtube.com/watch?v=uD3p_rZPBUQ

  3. https://www.youtube.com/watch?v=uD3p_rZPBUQ

  4. https://www.youtube.com/watch?v=uD3p_rZPBUQ

  5. NoSQL What is NoSQL? Reasons for inclusion here

  6. Where we are?
     Data science: Data Preparation (at scale), Analytics, Communication
     Databases. Key ideas: relational algebra, physical/logical data independence
     MapReduce. Key ideas: fault tolerance, no loading, direct programming

  7. NoSQL and related systems, by feature
     [Table: systems and papers from RDBMS (1971) to Impala (2013), compared by scale to 1000s, primary index, secondary indexes, transactions, joins/analytics, integrity constraints, views, language/algebra, data model, and label.]

  8. NoSQL (Primary Motivation: Scale)
     Columns: Year, System/Paper, Scale to 1000s, Primary Index, Secondary Indexes, Transactions, Joins/Analytics, Integrity Constraints, Views, Language/Algebra, Data model, my label
     1971 RDBMS O tables sql-like
     2003 memcached O O O O O O key-val nosql
     2004 MapReduce O O O O O O key-val batch
     2005 CouchDB record MR O O document nosql
     2006 BigTable (Hbase) record compat. w/MR / O O ext. record nosql
     2007 MongoDB EC, record O O O O document nosql
     2007 Dynamo O O O O O O ext. record nosql
     2008 Pig O O O / O tables sql-like
     2008 HIVE O O O O tables sql-like
     2008 Cassandra EC, record O O key-val nosql
     2009 Voldemort O EC, record O O O O key-val nosql
     2009 Riak EC, record MR O key-val nosql
     2010 Dremel O O O / O tables sql-like
     2011 Megastore entity groups O / O / tables nosql
     2011 Tenzing O O O O tables sql-like
     2011 Spark/Shark O O O O tables sql-like
     2012 Spanner ? tables sql-like
     2012 Accumulo record compat. w/MR / O O ext. record nosql
     2013 Impala O O O O tables sql-like

  9. NoSQL: Scalability
     1) We need to ensure high availability
     2) We also want to support updates

  10. Example
      User: Joe; Friends: Sue; Status: I'm sleepy; Wall:
      User: Sue; Friends: Joe, Kai; Status: Headed to new Bond flick; Wall:
      User: Kai; Friends: Sue; Status: Done for tonight; Wall:
      Write: Update Sue's status. Who sees the new status, and who sees the old one?
      Databases: Everyone MUST see the same thing, either old or new, no matter how long it takes.
      NoSQL: For large applications, we can't afford to wait that long, and maybe it doesn't matter anyway.

  11. Users: Jim, Sue
      Friends: (Jim, Sue), (Sue, Jim), (Lin, Joe), (Joe, Lin), (Jim, Kai), (Kai, Jim), (Jim, Lin), (Lin, Jim)
      Posts: "Sue: headed to see new Bond flick", "Sue: it was ok", "Kai: I'm hungry"

  12. Two-Phase Commit Motivation
      1) User updates their status
      2) "Do it!" is sent to subordinates 1, 2, and 3
      3) Subordinates 1 and 2 reply "success!"; subordinate 3 replies "FAIL!"
      4) Oops!

  13. Two-Phase Commit
      Phase 1: Coordinator sends "Prepare to Commit". Subordinates make sure they can do so no matter what, writing the action to a log to tolerate failure. Subordinates reply "Ready to Commit".
      Phase 2: If all subordinates are ready, send "Commit". If anyone failed, send "Abort".

  14. Two-Phase Commit
      1) User updates their status
      2) Prepare is sent to subordinates 1, 2, and 3
      3) Each subordinate writes to its log
      4) Each subordinate replies ready
      5) Commit is sent to subordinates 1, 2, and 3
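
The prepare/ready/commit exchange on the two slides above can be sketched in a few lines of Python. This is a minimal in-memory illustration, not a production protocol: the class `Subordinate`, the `can_commit` failure switch, and the function `two_phase_commit` are hypothetical names introduced here, and timeouts, coordinator failure, and durable logging are all left out.

```python
from dataclasses import dataclass, field


@dataclass
class Subordinate:
    """A participant holding one replicated value (hypothetical model)."""
    name: str
    can_commit: bool = True  # assumption: a switch used to simulate a failure
    log: list = field(default_factory=list)
    status: str = ""  # last committed value

    def prepare(self, update: str) -> bool:
        # Phase 1: write the pending action to the log, then vote "ready".
        if not self.can_commit:
            return False
        self.log.append(("prepared", update))
        return True

    def commit(self, update: str) -> None:
        # Phase 2 (success path): make the update visible.
        self.log.append(("committed", update))
        self.status = update

    def abort(self, update: str) -> None:
        # Phase 2 (failure path): discard the pending update.
        self.log.append(("aborted", update))


def two_phase_commit(subordinates, update: str) -> bool:
    """Coordinator logic: commit only if every subordinate voted ready."""
    votes = [s.prepare(update) for s in subordinates]
    if all(votes):
        for s in subordinates:
            s.commit(update)
        return True
    for s in subordinates:
        s.abort(update)
    return False


if __name__ == "__main__":
    subs = [Subordinate("subordinate 1"), Subordinate("subordinate 2"),
            Subordinate("subordinate 3")]
    print(two_phase_commit(subs, "Headed to new Bond flick"))
    subs[2].can_commit = False  # subordinate 3 can no longer prepare
    print(two_phase_commit(subs, "it was ok"))
    print([s.status for s in subs])
```

In this sketch a single failed vote aborts the update everywhere, so no subordinate ever exposes a value the others lack; the cost is that every write waits on every subordinate.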

  15. Other Protocols

  16. Eventual Consistency
      Write conflicts will eventually propagate throughout the system.
      D. Terry et al., "Managing Update Conflicts in Bayou, a Weakly Connected Replicated Storage System", SOSP 1995:
      "We believe that applications must be aware that they may read weakly consistent data and also that their write operations may conflict with those of other users and applications. Moreover, applications must be involved in the detection and resolution of conflicts since these naturally depend on the semantics of the application."

  17. Eventual Consistency
      In the absence of updates, all replicas converge towards identical copies.
      What the application sees in the meantime is sensitive to replication mechanics and difficult to predict.
      Contrast with RDBMS, Paxos: immediate (or "strong") consistency, but there may be deadlocks.
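
To make the convergence claim concrete, here is a small Python sketch of two replicas that accept writes independently and reconcile later. It assumes a last-writer-wins rule keyed on a caller-supplied timestamp, which is only one possible conflict policy and is not the application-specific resolution the Bayou paper argues for. The class `Replica` and its methods are hypothetical names.

```python
class Replica:
    """One copy of the data; stores key -> (timestamp, value)."""

    def __init__(self, name):
        self.name = name
        self.data = {}

    def write(self, key, value, timestamp):
        # Assumption: last writer wins; ties are broken by comparing values
        # so that every replica picks the same winner.
        current = self.data.get(key)
        if current is None or (timestamp, value) > current:
            self.data[key] = (timestamp, value)

    def read(self, key):
        entry = self.data.get(key)
        return entry[1] if entry else None

    def sync_with(self, other):
        # Exchange entries in both directions; afterwards both replicas
        # hold the same winning entry for every key.
        for key, (ts, value) in list(other.data.items()):
            self.write(key, value, ts)
        for key, (ts, value) in list(self.data.items()):
            other.write(key, value, ts)


if __name__ == "__main__":
    near_joe, near_kai = Replica("near Joe"), Replica("near Kai")
    near_joe.write("Sue", "Done for the day", timestamp=1)
    near_joe.sync_with(near_kai)

    # Sue's update reaches only one replica at first.
    near_kai.write("Sue", "Headed to new Bond flick", timestamp=2)
    print(near_joe.read("Sue"), "|", near_kai.read("Sue"))  # stale vs. new

    # Once updates stop and the replicas sync, the copies are identical.
    near_joe.sync_with(near_kai)
    print(near_joe.read("Sue") == near_kai.read("Sue"))
```

Between the write and the sync, which status a reader sees depends entirely on which replica answers, which is the unpredictability the slide describes.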

  18. Columns: Year, System/Paper, Scale to 1000s, Primary Index, Secondary Indexes, Transactions, Joins/Analytics, Integrity Constraints, Views, Language/Algebra, Data model, my label
      2003 memcached O O O O O O key-val nosql
      2005 CouchDB record MR O O document nosql
      2006 BigTable (Hbase) record compat. w/MR / O O ext. record nosql
      2007 MongoDB EC, record O O O O document nosql
      2007 Dynamo O O O O O O key-val nosql
      2008 Cassandra EC, record O O key-val nosql
      2009 Voldemort O EC, record O O O O key-val nosql
      2009 Riak EC, record MR O key-val nosql
      2011 Megastore entity groups O / O / tables nosql
      2012 Accumulo record compat. w/MR / O O ext. record nosql
      2012 Spanner ? tables sql-like

  19. CAP Theorem [Brewer 2000, Lynch 2002]
      Consistency: Do all applications see all the same data?
      Availability: If some nodes fail, does everything still work?
      Partitioning: If two sections of your system cannot talk to each other, can they make forward progress on their own? If not, you sacrifice Availability. If so, you might have to sacrifice Consistency. You can't have everything.
      Conventional databases assume no partitioning: clusters were assumed to be small and local.
      NoSQL systems may sacrifice consistency.
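
The trade-off under a partition can be illustrated with a toy two-node cluster in Python. This is a sketch under simplifying assumptions, not a model of any real system: the `Cluster` class, its `mode` flag ("CP" to refuse writes during a partition, "AP" to accept them locally), and the `partitioned` switch are all hypothetical.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.value = None


class Cluster:
    """Two nodes that replicate every write unless they are partitioned."""

    def __init__(self, mode):
        self.mode = mode  # assumption: "CP" or "AP", chosen up front
        self.nodes = [Node("left"), Node("right")]
        self.partitioned = False

    def write(self, node_index, value):
        if not self.partitioned:
            # Normal operation: the write reaches every node.
            for node in self.nodes:
                node.value = value
            return True
        if self.mode == "CP":
            # No forward progress on one side: availability is sacrificed.
            return False
        # Forward progress on one side only: consistency is sacrificed.
        self.nodes[node_index].value = value
        return True

    def is_consistent(self):
        return len({node.value for node in self.nodes}) == 1


if __name__ == "__main__":
    for mode in ("CP", "AP"):
        cluster = Cluster(mode)
        cluster.write(0, "I'm sleepy")
        cluster.partitioned = True
        accepted = cluster.write(0, "Done for tonight")
        print(mode, "accepted:", accepted, "consistent:", cluster.is_consistent())
```

Running it shows the two outcomes from the slide side by side: the "CP" cluster rejects the write and stays consistent, while the "AP" cluster accepts it and its nodes diverge until the partition heals.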
