Understanding Managed Elasticsearch for Dispatch Services


Discover the benefits of using Managed Elasticsearch for dispatch services, including efficient search capabilities, log searching, and course searching. Learn about Elasticsearch's distributed search and analytics engine, built upon Apache Lucene, with a RESTful interface. Explore the challenges faced in querying large datasets and the advantages of managed services for Elasticsearch. Gain insights into hosting options, such as self-hosting or opting for a managed solution on AWS. Delve into the architecture and interfaces utilized in the IT landscape.


Uploaded on Jul 19, 2024 | 0 Views



Presentation Transcript


  1. Managed Elasticsearch for Dispatch. Michael Alberhasky, AIS-Architecture Team. April 22, 2020. ITS - Administrative Information Services

  2. What is Elasticsearch? A distributed search and analytics engine, built upon Apache Lucene, with a RESTful interface. Elasticsearch + Logstash + Kibana = ELK stack. AIS also uses it for log searching and course searching in MyUI.

  3. Dispatch's Use Case. Feature request to search through message history. Oracle query against 70 million rows = 7.79 minutes; Elasticsearch query in the most inefficient way = 1083 ms. Can't just add more indices to a table when there are 150 million rows: no downtime allowed, and an index that big would take eons to build. Storage considerations: the database is already 2 TB. Need the ability to quickly get metadata for the new app.
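To put the two timings on this slide side by side, the arithmetic below converts the Oracle figure to milliseconds and divides it by the Elasticsearch figure. Only the two measurements come from the slide; the variable names are illustrative.

```python
# Timings quoted on the slide.
ORACLE_QUERY_MINUTES = 7.79
ELASTICSEARCH_QUERY_MS = 1083

# Convert the Oracle timing to milliseconds so both use the same unit.
oracle_query_ms = ORACLE_QUERY_MINUTES * 60 * 1000

# Ratio of the two timings (how many times faster the Elasticsearch query ran).
speedup = oracle_query_ms / ELASTICSEARCH_QUERY_MS

print(f"Oracle: {oracle_query_ms:.0f} ms, Elasticsearch: {ELASTICSEARCH_QUERY_MS} ms")
print(f"Speedup: roughly {speedup:.0f}x")
```

Under these numbers the Elasticsearch query is a little over 430 times faster, even though the slide describes it as the most inefficient form of the query.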

  4. Managed Services. Paying to use it (hopefully cheaper than it would cost to run it yourself). Updates/patches are done for you. Backups happen automatically. Self-service tools for configuration. Basic monitoring provided.

  5. Hosting our own or Managed on AWS? PRO: Elasticity. If I need a cluster with more CPU for ingesting lots of data, I can change instance type easily. New feature called UltraWarm Storage: extend your storage into low-cost storage so you can search through gobs more data. CON: Blue/Green deployments take time, as the entire cluster must be replaced and data copied to the new cluster (data can still be read from and written to the old cluster while that is happening). More expensive than just trying to run your own; however, time is money. Cost: $335/month.

  6. Architecture

  7. Interfaces. RESTful API offers the ability to query indices via: Query DSL, SQL, Kibana. Example Query DSL request body:

        {
          "query": {
            "bool": {
              "filter": [{
                "bool": {
                  "minimum_should_match": 1,
                  "should": [
                    { "match_phrase": { "member_id": "foobar-1234-abcd-5678" } }
                  ]
                }
              }],
              "must": [{
                "range": {
                  "index_date": {
                    "format": "strict_date_optional_time",
                    "gte": "2020-04-15T20:25:16.707Z",
                    "lte": "2020-04-15T20:55:16.707Z"
                  }
                }
              }],
              "must_not": [],
              "should": []
            }
          }
        }
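The Query DSL body on this slide can also be assembled programmatically. The following is a minimal sketch, not the presenter's code: the function name `build_member_query` is hypothetical, while the field names `member_id` and `index_date` and the query structure are taken from the slide.

```python
import json


def build_member_query(member_id: str, start: str, end: str) -> dict:
    """Build a bool query: exact-phrase match on member_id within a time window."""
    return {
        "query": {
            "bool": {
                # Filter clause: the document must match the member_id phrase.
                "filter": [
                    {
                        "bool": {
                            "minimum_should_match": 1,
                            "should": [
                                {"match_phrase": {"member_id": member_id}}
                            ],
                        }
                    }
                ],
                # Must clause: index_date has to fall inside [start, end].
                "must": [
                    {
                        "range": {
                            "index_date": {
                                "format": "strict_date_optional_time",
                                "gte": start,
                                "lte": end,
                            }
                        }
                    }
                ],
                "must_not": [],
                "should": [],
            }
        }
    }


example_body = build_member_query(
    "foobar-1234-abcd-5678",
    "2020-04-15T20:25:16.707Z",
    "2020-04-15T20:55:16.707Z",
)
print(json.dumps(example_body, indent=2))
```

The resulting JSON would be sent as the body of a search request against the index or alias being queried; the endpoint and authentication depend on the cluster and are not shown on the slide.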

  8. It's a bird, it's a plane, it's a cache. Treat indices like a cache; they could go poof at any time. Message metadata is loaded to Elasticsearch after a batch is completed. Daily exports of metadata to S3. A Load Lambda function rapidly reloads indices. Diagram components: Load Function, Elasticsearch, S3 Bucket, SQS Queue, Dispatch.
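One way a reload step like the one on this slide could work is to turn exported metadata records into a payload for Elasticsearch's bulk API, which expects newline-delimited JSON: an action line followed by the document, with a trailing newline. The sketch below covers only that formatting step and leaves out S3, SQS, and Lambda entirely. The function name, the index name, and the `message_id` field are assumptions for illustration; the slides do not describe the export format.

```python
import json
from typing import Iterable


def to_bulk_ndjson(records: Iterable[dict], index: str, id_field: str = "message_id") -> str:
    """Format records as a bulk-API payload (one action line plus one document line each)."""
    lines = []
    for record in records:
        action = {"index": {"_index": index}}
        # Reusing a stable id (hypothetical field) means reloading the same
        # export overwrites documents instead of duplicating them.
        if id_field in record:
            action["index"]["_id"] = str(record[id_field])
        lines.append(json.dumps(action))
        lines.append(json.dumps(record))
    # The bulk API requires the payload to end with a newline.
    return ("\n".join(lines) + "\n") if lines else ""


exported = [
    {"message_id": 1, "member_id": "foobar-1234-abcd-5678", "index_date": "2020-04-15T20:25:16.707Z"},
    {"message_id": 2, "member_id": "foobar-1234-abcd-5678", "index_date": "2020-04-15T20:40:00.000Z"},
]
payload = to_bulk_ndjson(exported, "dispatch-messages-2020.04")
print(payload, end="")
```

Because each document is indexed under its own id, the same daily export can be replayed safely, which fits the slide's advice to treat the indices as a cache that can be rebuilt.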

  9. Index Design. Index for each day = bad idea. Index for each month = good idea. Aliased to a super index. Curator to manage indices and delete old indices. AWS now offers index management as a feature.
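The monthly-index-plus-alias design can be sketched as follows. The index prefix, alias name, and retention window are hypothetical, since the slides do not give them. The helper functions only compute names and request bodies; actually deleting indices would be left to Curator or the AWS index management feature the slide mentions.

```python
from datetime import date

INDEX_PREFIX = "dispatch-messages-"  # hypothetical naming scheme
ALIAS_NAME = "dispatch-messages"     # hypothetical "super index" alias


def monthly_index_name(day: date) -> str:
    """Name of the monthly index a document dated `day` belongs to."""
    return f"{INDEX_PREFIX}{day:%Y.%m}"


def alias_actions(index_name: str) -> dict:
    """Request body for the _aliases API that adds one index to the alias."""
    return {"actions": [{"add": {"index": index_name, "alias": ALIAS_NAME}}]}


def expired_indices(index_names: list[str], today: date, keep_months: int) -> list[str]:
    """Indices older than the current month plus `keep_months` earlier months."""
    cutoff = today.year * 12 + (today.month - 1) - keep_months
    expired = []
    for name in index_names:
        if not name.startswith(INDEX_PREFIX):
            continue  # ignore indices that do not follow the naming scheme
        year_text, month_text = name[len(INDEX_PREFIX):].split(".")
        month_key = int(year_text) * 12 + (int(month_text) - 1)
        if month_key < cutoff:
            expired.append(name)
    return expired


current_index = monthly_index_name(date(2020, 4, 15))
print(current_index)
print(alias_actions(current_index))
```

Searching through the alias means queries do not need to know which monthly indices exist, and old months can be dropped without touching the query code.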

  10. It ain't free. 3 x m5.large.elasticsearch with 70 GB each. On-demand pricing: compute at $0.142/hour per instance = $306.72/month; storage at $9.45/month x 3 = $28.35/month. Elected not to use dedicated master nodes. Opportunities for reduced cost: reserved instances would save over $100/month; reduce the number of instances and accept higher risk?
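The monthly figures on this slide can be reproduced with a short calculation. The hourly rate, per-node storage cost, and node count come from the slide; the 720-hour month (30 days) is an assumption that happens to match the slide's compute total.

```python
NODE_COUNT = 3
COMPUTE_RATE_PER_HOUR = 0.142     # USD per instance-hour, from the slide
STORAGE_COST_PER_NODE = 9.45      # USD per month for 70 GB, from the slide
HOURS_PER_MONTH = 24 * 30         # assumption: a 30-day billing month

compute_monthly = round(COMPUTE_RATE_PER_HOUR * NODE_COUNT * HOURS_PER_MONTH, 2)
storage_monthly = round(STORAGE_COST_PER_NODE * NODE_COUNT, 2)
total_monthly = round(compute_monthly + storage_monthly, 2)

print(f"Compute: ${compute_monthly:.2f}/month")
print(f"Storage: ${storage_monthly:.2f}/month")
print(f"Total:   ${total_monthly:.2f}/month")
```

The total of about $335 per month agrees with the figure quoted on the hosting comparison slide.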

  11. Demo. Search function in Dispatch. Kibana interface. AWS Console.

  12. Michael Alberhasky, 319-353-4484, michael-alberhasky@uiowa.edu
