Efficient Network Traffic Queries Handling Strategies

Beau
Coup
:
Answering
 
many
 
network
 
traffic
 
queries,
one
 
memory
 
update
 
at
 
a
 
time!
Xiaoqi
 
Chen,
 
Shir
 
Landau-Feibish,
 
Mark
 
Braverman,
 
Jennifer
 
Rexford
 
1
[bo’ku]
 
Adv
.
 
many,
 
a
 
lot.
 
1
Network
 
traffic
 
query
2
DDoS:
Are
 
there
 
many
 
Source
 
IPs
 
sending
to
 
one
 
particular
 
Destination
 
IP
?
Select
 
DstIP
 
where
distinct(
SrcIP
)>
1000
Key
Attribute
Threshold
Many
 
network
 
traffic
 
queries
3
DDoS?
Worm?
Port
 
Scan?
Different
 
keys/attrs,
 
need
multiple
 
data
 
structures
Many
 
network
 
traffic
 
queries
4
I
 
have
 
42
 
queries
Run
 
42
 
data
 
structures?
I
 
can’t…
Spec
 
for
 
today’s
 
commodity
 
programmable
 
switch:
XX
 
Tbps
 
aggregated
 
throughput
YY
 
MB
 
data-plane
 
memory
Can
 
only
 
access
 
ZZ
 
bytes
 
of
 
memory
 
per
 
packet
 
(True
 
for
 
CPU,
 
FPGA,
 
etc.,
 
as
 
well…
 
Moore’s
 
law!)
One
 
memory
 
update
 
at
 
a
 
time?
 
Constant
 
memory
 
update
 
per
 
packet,
regardless
 
of
 
the
 
number
 
of
 
queries?
Game
 
plan:
1.
Each
 
query
 
uses
 
only
 
o(1)
 
memory
update
 
per
 
packet
 
on
 
average
2.
Combine
 
many
 
different
 
queries,
on
 
average
 
uses
 
O(1)
3.
Coordinate,
 
at
 
most
 
O(1)
 
per
 
packet
5
Today’s
 
talk
 
Challenge:
many
 
queries,
 
few
 
memory
 
updates
Achieving
 
o(1)
 
memory
 
access:
coupon
 
collectors
System
 
design:
query
 
compiler
 
+
 
data
 
plane
 
program
Evaluation
6
C
D
A
The
 
coupon
 
collector
 
problem
 
4
 
different
 
coupons,
 
collect
 
all
 
of
 
them
Random
 
draws
How
 
many
 
total
 
draws
 
are
 
required?
7
B
?
Beau
Coup
 
coupon
 
collector
 
f
(
10.0.1.15
)
 
->
 
Coupon
f
(
10.0.1.33
)
 
->
 
Coupon
f
(
10.0.1.15
)
 
->
 
Coupon
f
(
10.0.1.42
)
 
->
 
No
 
Coupon
f
(
SrcIP
)
 
->
 
Coupon
?
Mapping
Select
 
DstIP
 
where
 
distinct(
SrcIP
)
>100
Collect
 
different
coupons
C
B
C
SrcIP:
 
10.0.1.15
DstIP:
 
162.249.4.107
SrcIP:
 
10.0.1.33
DstIP:
 
162.249.4.107
SrcIP:
 
10.0.1.42
DstIP:
 
162.249.4.107
 
Key:
         
162.249.4.107
Coupons:
8
Beau
Coup
 
coupon
 
collector
 
Generalization:
 
(
m
,
 
p
,
 
n
)
-coupon
 
collector
m
*
p
<1,
 
most
 
packets
 
collect
 
no
 
coupon
f
(
SrcIP
)
 
->
 
Coupon
?
 
m=8
 
coupons
 
in
 
total
 
stop
 
at
 
n=4
 
different
 
coupons
 
Example:
(
m
=8,
 
p
=1%,
 
n
=4)
Given
 
a
 
new
 
SrcIP
,
 
each
 
coupon
is
 
drawn
 
with
 
probability
 
1%
9
System
 
design
 
Query compiler:
finds coupon collector configurations
Stops near query thresholds
,
 
minimize
 
error
Hardware limits (e.g., memory access limit)
Fairness across queries
Data plane program:
collect coupons into in-memory table
Simultaneously run 
many
 queries
At most 
one
 coupon per packet
Update queries on-the-fly
10
Query
 
compiler
 
γ
q
=
 
Γ
 
/
 
|Q|
(fair
 
allocation)
 
 
11
Compiler
Query
 
set
 
Q
 
=
 
{
q
1
,
 
q
2
,
 
…}
Total
 
memory
 
update
limit:
 
Γ
 
per
 
packet
 
Per-query
 
limit:
γ
q
 
per
 
packet
Query
 
q
i
Key,
 
Attribute,
 
Threshold
q
i
’s
 
Collector
 
Configuration
Total
 
coupons:
           
m
Each
 
probability:
       
p
Coupons
 
to
 
collect:
   
n
Query
 
q
i
Key,
 
Attribute,
 
Threshold
Query
 
q
i
Key,
 
Attribute,
 
Threshold
q
i
’s
 
Collector
 
Configuration
Total
 
coupons:
           
m
Each
 
probability:
       
p
Coupons
 
to
 
collect:
   
n
q
i
’s
 
Collector
 
Configuration
Total
 
coupons:
           
m
Each
 
probability:
       
p
Coupons
 
to
 
collect:
   
n
 
Goal:
I.
Stop
 
near
 
Threshold
II.
Update
 
limit
 
m
*
p
γ
q
III.
HW
 
limit,
 
e.g.,
 
m
32
 
For more detail on compiler heuristics,
please refer to our paper.
Query
 
compiler
12
Query
 
set
 
Q
 
=
 
{
q
1
,
 
q
2
,
 
…}
Query
Compiler
Total
 
coupons:
         
m
Each
 
probability:
     
p
Coupons
 
to
 
collect:
 
n
P4
Program
 
Switch
 
Data
 
Plane
Threshold=
1000
,
 
γ
q 
=0.01
   
(
m
=20,
p
=1/2048,
n
=8)
 
Stacking
 
queries:
 
same
 
attribute
13
 
Hash
 
function
h
1
(
SrcIP
)
 
->
 
[0,1)
q
1
 
#1
q
1
 
#2
q
1
 
#3
q
1
 
#4
 
4
 
coupons
 
for
 
q
1
q
2
#1
q
2
#2
q
2
#3
 
3
 
coupons
 
for
 
q
2
 
 
One
 
hash
 
function
 
for
 
each
 
attribute
14
h
1
(
SrcIP
)
 
->
 
h
2
(
DstIP
)
 
->
TCAM
 
for
 
selecting
 
a
 
coupon
15
Packet
SrcPort:
 
25012
DstPort:
 
443
SrcIP:
 
10.0.1.15
DstIP:
 
162.249.4.107
 
h
A
(
SrcPort
)=
101010…
h
B
(
DstPort
)=
111010…
h
C
(
SrcIP
)=
1010111…
h
D
(
DstIP
)=
0101011…
Random
 
tiebreak
if
 
>1
 
coupons
Coupon
 
collector
 
table
16
Packet
SrcPort:
 
25012
DstPort:
 
443
SrcIP:
 
10.0.1.15
DstIP:
 
162.249.4.107
q
6
#3
q
6
:
SrcIP
q
6
 
Coupon
 
#3
Key:
 
10.0.1.15
 
Query
 
q
6
Key
 
10.0.1.33
3
Packet
SrcPort:
 
27000
DstPort:
 
443
SrcIP:
 
10.0.1.33
DstIP:
 
4.3.2.1
q
6
#1
q
6
 
Coupon
 
#1
Key:
 
10.0.1.33
1
q
6
:
 
DstIP
,
1000
SrcIP
 
10.0.1.33
 
is
 
sending
 
to
 
>
1000
 
distinct
 
DstIP
s.
 
Space
 
efficiency:
Keys
 
from
 
all
 
queries
multiplexed into
 
one
 
table
Only
 
keep
 
rows
 
for
 
“active
keys”
 
(at
 
least
 
one
 
coupon)
Clear
 
rows
 
after
 
timeout
Installing
 
queries
 
into
 
switches
 
The
 
installed
 
rules
represent
 
query
 
set
 
Q
Update
 
queries
 
on
 
the
 
fly
,
without
 
recompiling
 
P4
17
 
Table
 
rules
 
Header
 
field
 
tuples
 
Key
,
 
Attribute
,
 
Threshold
Code
Generator
Query
Compiler
Rules
Generator
P4
Compiler
 
Data
 
plane
program
Static
program
Dynamic
rules
 
Query
 
set
 
Q
 
=
 
{
q
1
,
 
q
2
,
 
…}
Programmable
 
Switch
Packets
 
Alerts
Evaluation
 
highlights
 
How 
efficient
 is BeauCoup?
We
 
uses 
4x~10x fewer memory access 
than
 
the
state-of-the-art
 
to
 
achieve
 
the
 
same
 
accuracy.
What
 
about
 
fairness
?
Equalized
 
accuracy
 
for
 
same-threshold
 
queries.
How much hardware resource?
On
 
the
 
Barefoot
 
Tofino
 
programmable
 
switch,
BeauCoup
 
occupies
 
<50%
 
of
 
each
 
resource
18
Accuracy
 
metric:
 
Mean
 
Relative
 
Error
19
 
Threshold
Wider
Narrow
x
 
=
 
Actual
 
#
 
of
 
distinct
 
Error
=|
x
 
-
 
Threshold
|
MRE
=
 
Error
 
/
 
Threshold
Comparing accuracy
20
Mean Relative Error
 
for
 
distinct(
SrcIP,DstIP
)
 
1
On estimating the number of flows
.
 
Spang
 
&
 
McKeown,
Buffer
 
Sizing
 
Workshop
 
2019
2
NitroSketch: Robust and General Sketch-based Monitoring
in Software Switches
.
 
Liu
 
et
 
al.,
 
SIGCOMM
 
2019
3
One Sketch to Rule Them All:
 
Rethinking Network Flow
Monitoring with UnivMon.
 
Liu
 
et
 
al.,
 
SIGCOMM
 
2016
Single-query
 
accuracy
 
NitroSketch-Univmon
1
BeauCoup
Sampling
2
HyperLogLog
Stricter
Better
Better
4x~10x
21
 
1
NitroSketch: Robust and General Sketch-
based Monitoring
 
in Software Switches
.
Liu
 
et
 
al.,
 
SIGCOMM
 
2019
2
On estimating the number of flows
.
 
Spang
&
 
McKeown,
 
Buffer
 
Sizing
 
Workshop
 
2019
Across
 
multiple
 
queries
22
 
Total
 
memory
 
access
 
per
 
packet
 
Mean
 
Relative
 
Error
 
Key=
SrcIP
Attribute=
 
DstIP
Threshold=1000
 
Key=
SrcIP
Attribute=
 
DstIP+DstPort
Threshold=1000
 
Key=
DstIP+DstPort
Attribute=
 
SrcIP
Threshold=1000
 
Key=
DstIP+DstPort
Attribute=
 
SrcIP+SrcPort
Threshold=1000
 
Run
 
|
Q
|=26
 
queries
 
simultaneously
Fairness
 
among
 
queries:
 
same
 
accuracy
Across
 
multiple
 
queries
 
Run
 
|
Q
|=26
 
queries
 
simultaneously
Intuition:
high-threshold
 
query
 
is
 
”easier”,
low-threshold
 
query
 
benefits
 
from
 
more
 
memory
 
access
23
Total
 
memory
 
access
 
per
 
packet
Mean
 
Relative
 
Error
Threshold=100
Threshold=500
Threshold=5000
Threshold=10000
Hardware
 
resource utilization
(Tofino v1)
 
h
(
Attribute
)->
24
 
Scalable
:
 
built
 
upon
 
coupon
 
collectors,
runs
 
many
 
queries simultaneously
Versatile
:
 
change
 
queries
 
on
 
the
 
fly,
without
 
recompiling
 
P4
 
program
Efficient
:
 
achieve
 
the
 
same
 
accuracy
using
 
4x-10x
 
fewer
 
memory
 
accesses
25
Our
 
code
 
is
 
open-source!
github.com/Princeton-Cabernet/BeauCoup
Beau
Coup
:
Answering
 
many
 
network
 
traffic
 
queries,
one
 
memory
 
update
 
at
 
a
 
time!
Q&A on
Slack
 
Merci
 
Beaucoup!
 
Thank
 
you!
26
Face
Face
Slide Note
Embed
Share

Solve network traffic queries efficiently by ensuring constant memory updates and minimal memory access. Implement strategies such as managing multiple data structures for various query types and coordinating queries to optimize packet processing. Explore the coupon collector problem to enhance query compilation and program evaluation.

  • Network Traffic
  • Memory Optimization
  • Query Handling
  • Data Structures
  • Optimization

Uploaded on Oct 01, 2024 | 0 Views


Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. Download presentation by click this link. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

E N D

Presentation Transcript


  1. 1[boku] Adv. many, a lot. BeauCoup: Answeringmany network traffic queries, one memory update at a time! 1 Xiaoqi Chen, Shir Landau-Feibish, Mark Braverman, Jennifer Rexford

  2. Network traffic query DDoS: Are there many Source IPs sending to one particular Destination IP? Internet Key Select DstIP where distinct(SrcIP)>1000 Threshold 2 Attribute

  3. Many network traffic queries DDoS? Worm? Port Scan? Different keys/attrs, need multiple data structures Internet Query Key Attribute Threshold DDoS DstIP SrcIP 1000 Worm SrcIP DstIP 300 PortScan SrcIP,DstIP DstPort 100 3

  4. Many network traffic queries I have 42 queries Run 42 data structures? Spec for today s commodity programmable switch: XX Tbps aggregated throughput I can t YY MB data-plane memory Can only access ZZ bytes of memory per packet (True for CPU, FPGA, etc., as well Moore s law!) 4

  5. One memory update at a time? Constant memory update per packet, regardless of the number of queries? Game plan: 1. Each query uses only o(1) memory update per packet on average 2. Combine many different queries, on average uses O(1) 3. Coordinate, at most O(1) per packet 5

  6. Todays talk Challenge: many queries, few memory updates Achieving o(1) memory access: coupon collectors System design: query compiler + data plane program Evaluation 6

  7. The coupon collector problem 4 different coupons, collect all of them Random draws How many total draws are required? ? A B C D 7

  8. BeauCoup coupon collector f(SrcIP) -> Coupon ? Select DstIP where distinct(SrcIP)>100 Collect different coupons Mapping f(10.0.1.15) -> Coupon f(10.0.1.33) -> Coupon f(10.0.1.15) -> Coupon f(10.0.1.42) -> No Coupon C B C Key: Coupons: 162.249.4.107 A B C D 8

  9. BeauCoup coupon collector f(SrcIP) -> Coupon ? Generalization: (m, p, n)-coupon collector m*p<1, most packets collect no coupon m=8 coupons in total Example: (m=8, p=1%, n=4) Given a new SrcIP, each coupon is drawn with probability 1% 9 stop at n=4 different coupons

  10. System design Query compiler: finds coupon collector configurations Stops near query thresholds, minimize error Hardware limits (e.g., memory access limit) Fairness across queries Data plane program: collect coupons into in-memory table Simultaneously run many queries At most one coupon per packet Update queries on-the-fly 10

  11. Query compiler Query set Q = {q1, q2, } Total memory update limit: per packet Per-query limit: qper packet Query qi Key, Attribute, Threshold Threshold Threshold Goal: I. II. III. HW limit, e.g., m 32 qi s Collector Configuration Total coupons: Each probability: Coupons to collect: n Coupons to collect: n Coupons to collect: n Query qi Key, Attribute, Key, Attribute, qi s Collector Configuration Total coupons: Each probability: Each probability: Stop near Threshold Update limit m*p q m p Query qi qi s Collector Configuration Total coupons: m p Compiler m p q= / |Q| (fair allocation) For more detail on compiler heuristics, please refer to our paper. 11

  12. Query compiler Total coupons: Each probability: Coupons to collect: n m p Query set Q = {q1, q2, } P4 Query Compiler Program Switch Data Plane Threshold=1000 Threshold=1000, q =0.01 (m=20,p=1/2048,n=8) 12

  13. Stacking queries: same attribute ? q1: f(SrcIP) -> Coupon m1=4, p1=1/8 q2: f(SrcIP) -> Coupon m2=3, p2=1/16 . ? . q2 #3 q2 #2 q2 #1 q1#1 q1#2 q1#3 q1#4 Hash function h1(SrcIP) -> [0,1) 0 1/4 1/2 3/4 1 3 coupons for q2 4 coupons for q1 13

  14. One hash function for each attribute ? q1: f(SrcIP) -> Coupon m1=4, p1=1/8 q6: g(DstIP) -> Coupon m6=3, p6=1/8 . ? . q1#3 q1#4 q1#1 q1#2 h1(SrcIP) -> 0 1/4 1/2 3/4 1 q6#3 q6#1 q6#2 h2(DstIP) -> 0 1/4 1/2 3/4 1 14

  15. TCAM for selecting a coupon Match hA(SrcPort) 000***** 001***** 010***** 01101*** Query#,Coupon# (5,1) (5,2) (5,3) (9,1) Match hB(DstPort) Match hC(SrcIP) Match hD(DstIP) 000***** 001***** 010***** 01101*** Query#,Coupon# Query#,Coupon# Query#,Coupon# (6,1) (6,2) (6,3) (8,1) hA(SrcPort)=101010 hB(DstPort)=111010 hC(SrcIP)=1010111 hD(DstIP)=0101011 No coupon Packet SrcPort: 25012 DstPort: 443 SrcIP: 10.0.1.15 DstIP: 162.249.4.107 No coupon Random tiebreak if >1 coupons No coupon q6 #3 Collect coupon (q6, #3) 15

  16. Coupon collector table Q , Key q4: 8.8.8.8:53 q4: 1.1.1.1:53 q5: 10.0.0.1 q6: 10.0.1.33 q6: 10.0.1.15 Coupons 2 2 2 2 2 Packet q6Coupon #1 Key: 10.0.1.33 q6 #1 SrcPort: 27000 DstPort: 443 SrcIP: 10.0.1.33 DstIP: 4.3.2.1 3 4 4 4 1 1 1 1 1 3 3 3 3 3 3 Queryq6 Key 10.0.1.33 2 1 Packet q6Coupon #3 Key: 10.0.1.15 q6 #3 SrcPort: 25012 DstPort: 443 SrcIP: 10.0.1.15 DstIP: 162.249.4.107 Space efficiency: Keys from all queries multiplexed into one table Only keep rows for active keys (at least one coupon) Clear rows after timeout q6: DstIP, 1000 SrcIP 10.0.1.33 is sending to >1000 distinct DstIPs. q6: SrcIP 16

  17. Installing queries into switches Query set Q = {q1, q2, } Header field tuples Key, Attribute, Threshold Code Generator Query Compiler (m,p,n) Rules Generator code P4 Compiler The installed rules represent query set Q Update queries on the fly, without recompiling P4 Data plane program Table rules Packets Alerts 17

  18. Evaluation highlights How efficient is BeauCoup? We uses 4x~10x fewer memory access than the state-of-the-art to achieve the same accuracy. What about fairness? Equalized accuracy for same-threshold queries. How much hardware resource? On the Barefoot Tofino programmable switch, BeauCoup occupies <50% of each resource 18

  19. Accuracy metric: Mean Relative Error Threshold Wider x = Actual # of distinct Error=|x - Threshold| MRE= Error / Threshold Narrow 19

  20. Comparing accuracy Mean Relative Error for distinct(SrcIP,DstIP) NitroSketch2 +UnivMon3 Memory Access Word Per Packet Sampling1 HyperLogLog BeauCoup q=1/10 82% 33% X 17% q=1/42 290% 84% X 31% 1On estimating the number of flows. Spang & McKeown, Buffer Sizing Workshop 2019 2NitroSketch: Robust and General Sketch-based Monitoring in Software Switches. Liu et al., SIGCOMM 2019 3One Sketch to Rule Them All: Rethinking Network Flow Monitoring with UnivMon. Liu et al., SIGCOMM 2016 20

  21. Single-query accuracy NitroSketch-Univmon1 BeauCoup 4x~10x Sampling2 HyperLogLog Better 1NitroSketch: Robust and General Sketch- based Monitoring in Software Switches. Liu et al., SIGCOMM 2019 2On estimating the number of flows. Spang & McKeown, Buffer Sizing Workshop 2019 Stricter 21

  22. Across multiple queries Run |Q|=26 queries simultaneously Fairness among queries: same accuracy Key=SrcIP Attribute= DstIP Threshold=1000 Key=SrcIP Attribute= DstIP+DstPort Threshold=1000 Key=DstIP+DstPort Attribute= SrcIP Threshold=1000 Key=DstIP+DstPort Attribute= SrcIP+SrcPort Threshold=1000 Mean Relative Error Total memory access per packet 22

  23. Across multiple queries Run |Q|=26 queries simultaneously Intuition: high-threshold query is easier , low-threshold query benefits from more memory access Threshold=100 Threshold=500 Threshold=5000 Threshold=10000 Mean Relative Error Total memory access per packet 23

  24. Hardware resource utilization (Tofino v1) Key: Q , Key Coupons 1 2 3 q6 h(Attribute)-> 10.0.1.15 3 q1:8.8.8.8 q6: SrcIP Alerts! q1:1.1.1.1 1 2 3 0 1/4 1/2 Matching Coupons Extracting Query Key Collecting Coupons Teardown Overall 13.2% TCAM 39.6% 2.3% 0% 0% 12.3% SRAM 9.1% 2.1% 26.3% 0% 12.8% Instruction 25.0% 7.3% 5.4% 3.1% 41.7% Hash Unit 50.0% 24 61.1% 29.1% 0%

  25. BeauCoup: Answeringmany network traffic queries, one memory update at a time! Scalable: built upon coupon collectors, runsmany queries simultaneously Versatile: change queries on the fly, without recompiling P4 program Efficient: achieve the same accuracy using 4x-10x fewer memory accesses Merci Beaucoup! Thank you! Q&A on Slack Our code is open-source! github.com/Princeton-Cabernet/BeauCoup 25

  26. Face Face 26

Related


More Related Content

giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#giItT1WQy@!-/#