OVS DPDK Practice in Real Public Cloud: Solutions for Performance Optimization

Explore innovative solutions for optimizing Open vSwitch with Data Plane Development Kit (OVS DPDK) performance in real public cloud environments. Insights include meter lock scenario resolution, conntrack performance improvement, and flow-based connection tracking limitation enhancements. These solutions tackle challenges such as costly meter locks, unnecessary conn timeout updates, and resource allocation issues. Enhance your understanding of leveraging per PMD bucket pools, optimizing conntrack updates, and implementing flow-based connection tracking limits for improved performance.


Uploaded on Oct 04, 2024


Presentation Transcript


  1. 2022 OVS DPDK Practice in Real Public Cloud
     Cheng Li, Ding Han
     Chinatelecom Cloud

  2. Meter lock
     Scenario: both the pps and the bps of a user VM are limited by a meter action.
     Problem: the meter lock becomes expensive as the number of PMDs increases, because multiple PMDs process packets for the same VM.
     Solution: introduce a per-PMD bucket pool to reduce how often the meter lock is taken.

  3. Solution: per-PMD bucket pool
     Each PMD has its own local bucket pool. The cost of a batch is deducted from the local pool if the pool can cover it (no lock).
     If the local pool can't cover the cost, then:
       1. Feed the public bucket pool.
       2. Move at most X times this batch's cost from the public pool to the local pool (with lock).
       3. Deduct the cost from the local pool.
     Thoughts:
     How to decide the local pool size (X)? A larger local pool (X) means fewer lock acquisitions but lower accuracy. (20% improvement.)
     What if the public pool doesn't have enough buckets? Is the lock then taken on every meter action? Yes, but it's not a problem.
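
The three steps on this slide can be sketched as a small model. This is an illustrative sketch, not the authors' patch or OVS code: the class names `SharedMeter` and `PmdLocalPool`, the `lock_count` counter, and the rule that a whole batch either passes or is dropped are all assumptions made for the example. "Feed" is interpreted here as refilling the public pool in proportion to elapsed time.

```python
import threading


class SharedMeter:
    """Public bucket pool shared by all PMD threads (hypothetical model)."""

    def __init__(self, rate, burst, now=0.0):
        self.rate = rate          # tokens fed per second
        self.burst = burst        # maximum size of the public pool
        self.tokens = float(burst)
        self.last_fed = now
        self.lock = threading.Lock()
        self.lock_count = 0       # instrumentation for the example only

    def take(self, want, now):
        """Feed the pool for the elapsed time, then grant up to `want` tokens."""
        with self.lock:
            self.lock_count += 1
            elapsed = max(0.0, now - self.last_fed)
            self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
            self.last_fed = now
            granted = min(want, self.tokens)
            self.tokens -= granted
            return granted


class PmdLocalPool:
    """Local bucket pool owned by exactly one PMD, so it needs no lock."""

    def __init__(self, meter, x):
        self.meter = meter
        self.x = x                # the "X" multiplier from the slide
        self.tokens = 0.0

    def charge(self, cost, now):
        """Return True if the batch may pass, False if it exceeds the meter."""
        if self.tokens < cost:
            # Slow path: steps 1 and 2, move at most X * cost under the lock.
            self.tokens += self.meter.take(self.x * cost, now)
        if self.tokens >= cost:
            # Step 3 (and the lock-free fast path): deduct from the local pool.
            self.tokens -= cost
            return True
        return False
```

In this model, a PMD created with `x=10` takes the lock roughly once per ten equally sized batches, while `x=1` takes it on every batch. Tokens parked in a local pool are invisible to other PMDs, which is one way to read the accuracy trade-off mentioned on the slide.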

  4. Conntrack performance improvement
     Scenario: conntrack is used to ensure that reply packets can reach the VMs.
     Problem: updating the conn timeout for every packet is unnecessary in most scenarios.
     Solution: instead of updating the timeout for every packet, update it at intervals.

  5. Solution: do not update on every packet
     The rule is simple: skip the timeout update if the conn state has not changed and the last update happened within the past 1 second.
     If a connection runs at 1,000,000 pkt/s, this saves 999,999 updates every second, at the cost of the timeout floating by up to 1 second. Most of the time, we are not sensitive to this 1-second float.
     About 5% performance improvement.
     (The sample code was based on an old version; the latest OVS version may have some changes.)
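
The rule on this slide can be illustrated with a minimal sketch. It is not the sample code the slide refers to: the `Conn` fields, the `on_packet` function, and the single per-connection `timeout` value are assumptions made for the example (real conntrack timeouts depend on protocol and state).

```python
from dataclasses import dataclass

UPDATE_INTERVAL = 1.0  # seconds; the 1-second window from the slide


@dataclass
class Conn:
    state: str
    timeout: float                      # idle time allowed, in seconds
    expiration: float = 0.0
    last_update: float = float("-inf")  # forces an update on the first packet
    updates: int = 0                    # instrumentation for the example only


def on_packet(conn, new_state, now):
    """Refresh the expiration only when needed; return True if it was refreshed."""
    state_changed = new_state != conn.state
    if not state_changed and now - conn.last_update < UPDATE_INTERVAL:
        # No state change and updated less than 1 second ago: skip the update.
        return False
    conn.state = new_state
    conn.expiration = now + conn.timeout
    conn.last_update = now
    conn.updates += 1
    return True
```

Because a skipped update leaves `expiration` up to one second older than it would otherwise be, a connection can expire up to one second earlier than with per-packet updates. That is the "floating" the slide accepts in exchange for fewer writes.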

  6. Flow-based CT limitation
     Scenario: one host may run multiple VMs from different users, and one VM may consume all CT resources if there is no limit.
     Problem: OVS currently supports only zone-based CT limits.
     Solution: we have developed a flow-based CT limit to enforce a CT limit for every single VM.

  7. Solution: flow-based CT limit
     OpenFlow changes:
       ovs-ofctl add-ct-limit <bridge_name> "id=<limit_id>,limit=<limit_size>"
       ovs-ofctl add-flow <bridge_name> \
         "in_port=1,action=ct(commit[,limit=<id>],xxx)"
     The conn count is checked when a new conn is created. When a conn expires, limit.count is decremented by 1. Deleting a ct-limit triggers deletion of the ct flows (just like how meters work).
     With this implementation, we can:
     - Limit CT entries for every single VM.
     - Avoid a performance drop, because the limit is checked only for new conns (not on lookup).
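
The counting behaviour described on this slide can be modelled in a few lines. This is a sketch under stated assumptions, not the authors' implementation: `CtLimit`, `ConnTable`, and their methods are hypothetical names, connections are keyed by an opaque tuple, and the deletion of a ct-limit is not modelled.

```python
class CtLimit:
    """One limit object, as created by the slide's add-ct-limit command."""

    def __init__(self, limit):
        self.limit = limit
        self.count = 0


class ConnTable:
    """Toy connection table; `key` stands in for a connection's 5-tuple."""

    def __init__(self):
        self.limits = {}   # limit id -> CtLimit
        self.conns = {}    # conn key -> limit id (or None if unlimited)

    def add_ct_limit(self, limit_id, limit):
        self.limits[limit_id] = CtLimit(limit)

    def commit(self, key, limit_id=None):
        """Model of ct(commit[,limit=<id>]); return False if the conn is refused."""
        if key in self.conns:
            # Existing conn: plain lookup, the limit is not consulted.
            return True
        if limit_id is not None:
            lim = self.limits[limit_id]
            if lim.count >= lim.limit:
                return False          # limit reached, refuse the new conn
            lim.count += 1
        self.conns[key] = limit_id
        return True

    def expire(self, key):
        """When a conn expires, give its slot back to the limit it was counted on."""
        limit_id = self.conns.pop(key, None)
        if limit_id is not None and limit_id in self.limits:
            self.limits[limit_id].count -= 1
```

Assigning one limit id per VM port would give each VM its own budget of CT entries. The counter is touched only in `commit` for a new key and in `expire`, which matches the slide's point that lookups of existing conns pay nothing extra.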

  8. Thank you!
     Cheng Li: lic121@chinatelecom.cn
     Ding Han: handing@chinatelecom.cn
