Understanding Managed Elasticsearch for Dispatch Services

 
Managed Elasticsearch for Dispatch
Michael Alberhasky, AIS-Architecture Team
April 22, 2020
ITS - Administrative Information Services
 
What is Elasticsearch?
 
Distributed search and analytics engine
Built upon Apache Lucene
RESTful interface
Elasticsearch + Logstash + Kibana = ELK stack
AIS also uses it for log searching and for course searching in MyUI
 
Dispatch's Use Case
 
Feature request to search through message history
Oracle query against 70 million rows = 7.79 minutes
Elasticsearch query in the most inefficient way = 1083 ms
Can't just add more indexes to a table when there are 150 million rows
No downtime window, and an index that big would take eons to build
Storage considerations: database already 2 TB
Need ability to quickly get metadata for new app
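
To put the two timings above side by side, here is a small sketch that converts them to the same unit and takes the ratio. It takes the slide's numbers at face value; the slide does not say whether both queries ran against the same data set, so treat the result as a rough indication only.

```python
# Timings quoted on the slide.
ORACLE_MINUTES = 7.79
ELASTICSEARCH_MS = 1083


def speedup(oracle_minutes: float, es_ms: float) -> float:
    """Return how many times faster the Elasticsearch query was."""
    oracle_ms = oracle_minutes * 60 * 1000  # minutes -> milliseconds
    return oracle_ms / es_ms


ratio = speedup(ORACLE_MINUTES, ELASTICSEARCH_MS)
print(f"Elasticsearch was roughly {ratio:.0f}x faster")
```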
 
Managed Services
 
Paying to use it (hopefully cheaper than it would cost to run it yourself)
Updates/patches are done for you
Backups happen automatically
Self-service tools for configuration
Basic monitoring provided
 
Hosting our own or Managed on AWS?
 
PRO
Elasticity: if I need a cluster with more CPU for ingesting lots of data, I can change the instance type easily
New feature called UltraWarm Storage: extend your storage into low-cost storage so you can search through gobs more data

CON
Blue/green deployments take time, as the entire cluster must be replaced and its data copied to the new cluster. Data can still be read from and written to the old cluster while that is happening
More expensive than just trying to run your own ($335/month); however, time is money
 
Architecture
 
Interfaces
 
RESTful API offers the ability to query indices via:
Query DSL
SQL
Kibana
 
{
  "query": {
    "bool": {
      "filter": [{
        "bool": {
          "minimum_should_match": 1,
          "should": [
            {
              "match_phrase": {
                "member_id": "foobar-1234-abcd-5678"
              }
            }
          ]
        }
      }],
      "must": [{
        "range": {
          "index_date": {
            "format": "strict_date_optional_time",
            "gte": "2020-04-15T20:25:16.707Z",
            "lte": "2020-04-15T20:55:16.707Z"
          }
        }
      }],
      "must_not": [],
      "should": []
    }
  }
}
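
The Query DSL body above can also be assembled in code. The following is a minimal sketch, not the presenter's implementation: `build_member_query` is a hypothetical helper, and it produces a slimmed-down form of the slide's query on the assumption that the empty `must_not`/`should` arrays and the single-clause nested `bool` can be dropped without changing which documents match.

```python
import json


def build_member_query(member_id: str, start: str, end: str) -> dict:
    """Build a bool query: phrase match on member_id within a time window.

    Field names (member_id, index_date) are taken from the slide's query;
    the function itself is a hypothetical helper.
    """
    return {
        "query": {
            "bool": {
                # filter clauses restrict matches without affecting scoring
                "filter": [{"match_phrase": {"member_id": member_id}}],
                "must": [{
                    "range": {
                        "index_date": {
                            "format": "strict_date_optional_time",
                            "gte": start,
                            "lte": end,
                        }
                    }
                }],
            }
        }
    }


query = build_member_query(
    "foobar-1234-abcd-5678",
    "2020-04-15T20:25:16.707Z",
    "2020-04-15T20:55:16.707Z",
)
# Serialize to the JSON body that would be sent to a search endpoint.
body = json.dumps(query, indent=2)
print(body)
```

Building the body from a function keeps the member ID and time window as parameters, so the JSON never has to be edited by hand.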
 
It's a bird, it's a plane, it's a cache
 
Treat indices like a cache; they could go poof at any time
Message metadata is loaded into Elasticsearch after a batch is completed
Daily exports of metadata to S3
A Lambda load function rapidly reloads the indices
Diagram components: Dispatch, SQS Queue, S3 Bucket, Load Function, Elasticsearch
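
One way a reload step like this could feed exported metadata back into Elasticsearch is through the bulk API, which accepts newline-delimited JSON: an action line followed by the document itself. The sketch below only builds that payload. The slide does not show the real load function, so `to_bulk_payload`, the `message_id` field, and the index name are all hypothetical.

```python
import json
from typing import Iterable


def to_bulk_payload(index_name: str, records: Iterable[dict],
                    id_field: str = "message_id") -> str:
    """Turn metadata records into a newline-delimited bulk request body.

    Each record becomes two lines: an "index" action and the document source.
    The id_field default is an assumption made for illustration.
    """
    lines = []
    for record in records:
        action = {"index": {"_index": index_name, "_id": record[id_field]}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(record))
    # The bulk format requires a trailing newline after the last line.
    return "\n".join(lines) + "\n" if lines else ""


# Hypothetical records standing in for one day's export from S3.
exported = [
    {"message_id": "m-1", "member_id": "foobar-1234-abcd-5678",
     "index_date": "2020-04-15T20:25:16.707Z"},
    {"message_id": "m-2", "member_id": "foobar-1234-abcd-5678",
     "index_date": "2020-04-15T20:40:00.000Z"},
]
payload = to_bulk_payload("dispatch-messages-2020.04", exported)
print(payload)
```

Using a stable document ID means replaying the same export overwrites documents rather than duplicating them, which fits the idea of treating the index as a rebuildable cache.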
 
Index Design
 
Index for each day = bad idea
Index for each month = good idea
Aliased to a super index
Curator to manage indices and delete old indices
AWS now offers index management as a feature
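
The month-per-index scheme with a shared alias might look like the sketch below. The naming pattern, the `dispatch-messages` prefix, and both helper functions are assumptions for illustration; the slide does not give the actual index names. The `actions`/`add` structure is the request body shape used by the Elasticsearch aliases API.

```python
from datetime import date
from typing import Sequence


def monthly_index_name(prefix: str, day: date) -> str:
    """Name the index that holds all documents for the month containing `day`."""
    return f"{prefix}-{day:%Y.%m}"


def alias_actions(alias: str, index_names: Sequence[str]) -> dict:
    """Build an aliases request body pointing one alias at many monthly indices."""
    return {
        "actions": [
            {"add": {"index": name, "alias": alias}} for name in index_names
        ]
    }


april = monthly_index_name("dispatch-messages", date(2020, 4, 22))
march = monthly_index_name("dispatch-messages", date(2020, 3, 1))
body = alias_actions("dispatch-messages", [march, april])
print(april)
print(body)
```

Searches go to the alias, so a monthly index can be added or deleted without changing the application. One index per month also means 12 indices a year rather than 365, which is presumably why the slide calls the daily scheme a bad idea.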
 
It ain't free
 
3 x m5.large.elasticsearch with 70GB each
On-demand pricing:
Compute: $0.142/hour x 3 instances x 720 hours = $306.72/month
Storage: $9.45/month x 3 = $28.35/month
Elected not to use dedicated Master nodes
Opportunities for reduced cost:
Reserved instances would save over $100/month
Reduce number of instances and accept higher risk?
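
The monthly figures can be reproduced with a short calculation. This sketch assumes a 720-hour (30-day) month, which is the value that makes the slide's compute total work out; the rates are the ones quoted above.

```python
from decimal import Decimal

INSTANCES = 3
HOURLY_RATE = Decimal("0.142")           # per m5.large.elasticsearch instance
HOURS_PER_MONTH = 720                    # assumption: 30-day month
STORAGE_PER_INSTANCE = Decimal("9.45")   # monthly cost of 70 GB on one instance

compute = HOURLY_RATE * INSTANCES * HOURS_PER_MONTH
storage = STORAGE_PER_INSTANCE * INSTANCES
total = compute + storage

print(f"Compute: ${compute:.2f}/month")
print(f"Storage: ${storage:.2f}/month")
print(f"Total:   ${total:.2f}/month")
```

The total lines up with the roughly $335/month quoted for the managed service elsewhere in the deck.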
 
Demo
 
Search function in Dispatch
Kibana interface
AWS Console
 
Michael Alberhasky
 
319-353-4484
 
michael-alberhasky@uiowa.edu
Slide Note

AIS = 6 full-time; 2 students

Dispatch sent 61 million messages in 2019.


Discover the benefits of using Managed Elasticsearch for dispatch services, including efficient search capabilities, log searching, and course searching. Learn about Elasticsearch's distributed search and analytics engine, built upon Apache Lucene, with a RESTful interface. Explore the challenges faced in querying large datasets and the advantages of managed services for Elasticsearch. Gain insights into hosting options, such as self-hosting or opting for a managed solution on AWS. Delve into the architecture and interfaces utilized in the IT landscape.


Uploaded on Jul 19, 2024 | 0 Views


