Chorus - Effortless Ceph S3 Petabyte Migration

FOSDEM 2024

Sirisha Guduru
Ceph Engineer
 

This FOSDEM 2024 presentation by Sirisha Guduru covers the challenges of migrating petabytes of data between old and new clusters and introduces Chorus, a tool for S3 data replication. It discusses the complexities of data migration, the development of Chorus, and how Chorus addresses reducing downtime and backing up data across different S3 vendors and regions.

  • Data Migration
  • S3 Replication
  • Chorus Tool
  • Petabyte Data
  • Cloud Storage

Uploaded on May 16, 2024



Presentation Transcript


  1. Chorus - Effortless Ceph S3 Petabyte Migration
     FOSDEM 2024
     Sirisha Guduru, Ceph Engineer
     sirisha.guduru@clyso.com

  2. Why data migration?
     - Cluster hardware reaches the lifecycle phase where it has to be replaced, and a new cluster must be built when the existing cluster cannot be augmented.
     - Data then has to be migrated between the old cluster (EOL) and the newly built cluster.

  3. Migration and its woes
     Data migration is a herculean task!
     Challenges with syncing petabytes of data across clusters:
     - Continuous monitoring and the time consumed
     - Tools used for migration
     - Continuous changes in the data (writes and updates on the buckets)

  4. Our experience with data migration
     - The Ceph cluster hardware was EOL, so we built a new cluster from scratch.
     - More than 3 PB of data had to be migrated between the old and new clusters.
     - We used rclone as the data migration tool.
     - Migrating user accounts, along with access and secret keys, ACLs and bucket policies, from the old cluster to the new one was a tough task.
     - Buckets and live data were copied seamlessly using rclone, run in parallel.

  5. Chorus
     These learnings led to the development of Chorus, data replication software capable of synchronizing S3 data between multiple cloud storage backends.
     https://github.com/clyso/chorus

  6. Problem
     - How to migrate to a different S3 vendor with reduced downtime?
     - How to back up S3 data to another S3 in a different region and from a different vendor?

  7. Overview
     - One main storage and followers.
     - Chorus S3 API.
     - Requests are routed to the main storage and replicated asynchronously to the followers.
     - All existing data is also replicated from main to follower in the background.
     - Data replication can be configured, paused and resumed by the user, per bucket, with the web admin UI or the CLI.
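
The routing model on the Overview slide can be illustrated with a toy in-memory sketch. This is not Chorus code or its API: the class `ToyProxy`, its methods and the bucket and follower names are all hypothetical, and plain dictionaries stand in for S3 backends. It only shows the idea of serving requests from main, queuing replication to followers, and pausing or resuming replication per bucket.

```python
from collections import deque


class ToyProxy:
    """Hypothetical in-memory model of 'one main storage and followers'."""

    def __init__(self, follower_names):
        self.main = {}                                    # (bucket, key) -> bytes
        self.followers = {name: {} for name in follower_names}
        self.queue = deque()                              # pending replication tasks
        self.paused = set()                               # buckets with replication paused

    def put_object(self, bucket, key, body):
        # The write is served by main; replication is only queued here.
        self.main[(bucket, key)] = body
        self.queue.append((bucket, key))

    def get_object(self, bucket, key):
        # Reads are routed to the main storage.
        return self.main[(bucket, key)]

    def pause(self, bucket):
        self.paused.add(bucket)

    def resume(self, bucket):
        self.paused.discard(bucket)

    def run_worker_once(self):
        """Copy queued objects to every follower; keep tasks of paused buckets queued."""
        deferred = deque()
        while self.queue:
            bucket, key = self.queue.popleft()
            if bucket in self.paused:
                deferred.append((bucket, key))
                continue
            for store in self.followers.values():
                store[(bucket, key)] = self.main[(bucket, key)]
        self.queue = deferred


proxy = ToyProxy(["follower-a"])
proxy.put_object("photos", "cat.jpg", b"v1")
replicated_before = ("photos", "cat.jpg") in proxy.followers["follower-a"]
proxy.run_worker_once()
replicated_after = ("photos", "cat.jpg") in proxy.followers["follower-a"]
```

In this sketch the client's write returns as soon as main has the object; the follower only catches up when the worker runs, which is the asynchronous part of the design.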

  8. Features
     Proxy:
     - Routing & replication per bucket, PAUSE & RESUME
     - Map custom credentials
     Worker:
     - Sync obj/bucket meta, content, tags, ACLs.
     - Migrate existing data in background
     - Track replication lag in Prometheus <-> autoscale workers
     - Rate limit (RAM, network)
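
The slide lists a network rate limit for workers but does not say how it is implemented. As an illustration only, a token bucket is one common way to cap throughput; the sketch below is a generic version of that technique, not Chorus's implementation. The class name, parameters and the injected clock are assumptions made for this example.

```python
import time


class TokenBucket:
    """Generic token bucket; tokens are bytes a worker may still send."""

    def __init__(self, rate_bytes_per_s, capacity_bytes, clock=time.monotonic):
        self.rate = rate_bytes_per_s
        self.capacity = capacity_bytes
        self.tokens = capacity_bytes      # start with a full bucket
        self.clock = clock                # injectable so the demo is deterministic
        self.last = clock()

    def try_consume(self, nbytes):
        # Refill in proportion to elapsed time, never above capacity.
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False


# Deterministic demo with a fake clock (hypothetical numbers).
fake_now = [0.0]
bucket = TokenBucket(rate_bytes_per_s=100, capacity_bytes=200, clock=lambda: fake_now[0])
first = bucket.try_consume(200)    # bucket starts full, so this is allowed
second = bucket.try_consume(50)    # bucket is now empty, so this is refused
fake_now[0] = 1.0                  # one second later, 100 bytes have been refilled
third = bucket.try_consume(50)
```

A worker using such a limiter would wait and retry when `try_consume` returns `False`, which trades replication lag for a bounded network rate.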

  9. Operations
     Proxy:
     - Stateless, low-memory, low-CPU, high-network.
     Redis:
     - Scale: Redis cluster.
     - Persistence: AOF, snapshot (RDB).
     - Memory: 1M obj ~ 105MB, 1M queue ~ 700MB
     - Low CPU: 100-1000 rps during migration
     Worker:
     - Stateless, high-memory, high-network, low-CPU.
     - Tune worker resources and rate-limit <-> queue size & replication lag
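
The Redis memory figures on this slide lend themselves to a back-of-the-envelope estimate. The sketch below assumes that memory scales linearly with counts, and it reads "1M obj" as one million tracked objects and "1M queue" as one million queued tasks; the slide states neither assumption explicitly. The function name and the example workload are hypothetical.

```python
MB_PER_MILLION_OBJECTS = 105   # slide: "1M obj ~ 105MB"
MB_PER_MILLION_QUEUED = 700    # slide: "1M queue ~ 700MB"


def estimate_redis_mb(objects, queued_tasks):
    """Rough Redis memory estimate in MB, assuming linear scaling."""
    return (objects / 1_000_000) * MB_PER_MILLION_OBJECTS + \
           (queued_tasks / 1_000_000) * MB_PER_MILLION_QUEUED


# Hypothetical workload: 50M objects tracked, 2M tasks waiting in the queue.
estimate = estimate_redis_mb(50_000_000, 2_000_000)
print(f"{estimate:.0f} MB")   # 50 * 105 + 2 * 700 = 6650 MB
```

Under these assumptions the queue dominates per entry, which is consistent with the slide's advice to tune worker resources and rate limits against queue size and replication lag.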

  10. Initial Goals
      - Have a working vendor-agnostic solution fast
      - Pluggable architecture
      - Learn, play around
      - Benchmark
      - Focus on correctness
      - Migrate a big bucket under load without downtime

  11. Next steps
      - More load tests.
      - API cost/resource optimization.
      - Routing policy alternatives: route by obj size, meta, etc.
      - Load balance read requests for replicated data.
      - Chorus agents: subscribe to bucket notification/event log instead of proxy.
      - Swift API compatibility
      - Lifecycle policy

  12. Questions

  13. Outlook
      Use cases:
      - Active transparent proxy
      - Active transparent proxy - migration
      - Backup service
      - Ransomware protection
      - Client deployment
      - Global namespace

  14. Resources
      - https://docs.clyso.com/docs/products/chorus/overview/
      - https://docs.clyso.com/blog/2024/01/24/opensourcing-chorus-project/
      - https://github.com/clyso/chorus

  15. Thank You! Contact: sirisha.guduru@clyso.com
