IoT Data Analytics Architecture for Real-World Use Cases
Explore the IoT data analytics architecture proposed by Adnan Akbar from the University of Surrey, applicable to diverse real-world scenarios like smart homes in Taipei. Discover how IoT leverages the connection of everyday objects to the internet, enabling remote control of physical environments. Delve into the challenges and opportunities of IoT data analytics, including processing models, analytic methods, and tool combinations for maximizing value from the data. Gain insights into batch processing, event processing, machine learning, and the right tools for storing and processing IoT data.
Presentation Transcript
From Intelligent Transportation in Madrid to Smart Homes in Taipei: An IoT Data Analytics Architecture Applicable to Multiple Real-World Use Cases
Adnan Akbar, Institute for Communication Systems (ICS), 5G Innovation Centre (5GIC), University of Surrey, UK
Adnan.akbar@surrey.ac.uk
Joint work with: Paula Ta-Shma (IBM Research), Guy Hadash (IBM Research), Juan Sancho (ATOS)
Thursday, 12 September 2024
What is the Internet of Things?
The Internet of Things is based on the vision of connecting everyday objects to the internet to form a cyber-physical system, where every object is represented by a virtual representation, enabling remote control of the physical world (F. Mattern and C. Floerkemeier).
Connecting Everyday Objects: physical things containing chips/sensors capture and communicate all types of data.
Virtual Representation.
Control of Physical World: interact with other devices, computing systems and the external environment, including people.
IoT Data Analytics
More data means more opportunities, but also more challenges in analyzing and extracting knowledge from this data.
Which analytic methods are available to get more value from IoT data?
Which processing model should be used to analyze this data?
What is the right set of tools?
Which Processing Model to Use?
Batch processing vs. event processing, or real-time vs. historical.
Options for IoT data: batch processing (machine learning, statistical methods), event processing (complex event processing), and hybrid solutions.
Right Combination of Tools for IoT Data?
There is a plethora of open source projects for storing and processing big data, including Secor, Swift and Elasticsearch.
Generic IoT Architecture Data Flow: Ingestion
1. Collect historical time series data: collect data from devices, aggregate into objects, index and/or partition.
Diagram labels: IoT, Secor, Swift.
Generic IoT Architecture Data Flow: Historical Data Access and Analytics
2. Learn patterns in data. Patterns may be time/location dependent. Generate thresholds, classifiers, etc.
Diagram labels: Secor, Swift.
Generic IoT Architecture Data Flow: Real-Time Data Analytics
3. Apply what was learned to the real-time data stream and take action.
Diagram labels: IoT, CEP, Secor, Swift.
Proposed Solution: A Lambda Architecture for IoT
A generic IoT analytics architecture: 1) Ingestion, 2) Historical Data Analytics (Batch Processing), 3) Real-time Data Analytics (Event Processing).
Diagram legend: green flows are real time; purple flows are batch. Diagram labels: IoT, CEP, Secor, Swift.
Use Case 1: Intelligent Transportation System for Madrid Council
Problem: Over 3000 traffic sensors are deployed throughout the city of Madrid. EMT needs to staff control rooms where employees manually analyze Madrid traffic sensor output, which can be slow and costly.
Objective: Improve customer satisfaction and reduce costs by responding more efficiently and quickly to real-time traffic problems.
Approach: Ingest data from up to 3000 sensors into our architecture, learn patterns from historical data, apply them to real-time data using CEP, and react by alerting drivers, calling emergency vehicles, rerouting buses, modifying traffic lights, etc.
Diagram labels: Today, Tomorrow.
IoT Architecture: Madrid Traffic Ingestion Flow
Aim: Collect historical time series data for analysis.
Continuously collect data from up to 3000 Madrid council traffic sensors via a web service. Data includes traffic speeds and intensities, updated every 5 minutes.
Push the messages to Kafka.
Use Secor to aggregate multiple messages into a single Swift object according to policy, e.g. every 60 minutes. Possibly partition the data, e.g. according to date. Convert to Parquet format. Annotate with metadata, e.g. min/max speed, start/end time.
Index Swift objects according to their metadata using Elasticsearch.
Diagram labels: IoT, Secor, Swift.
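The aggregation and annotation steps of the ingestion flow can be illustrated with a minimal Python sketch. It is not Secor's implementation: the `Reading` type, the field names, the units and the dictionary layout of an aggregated object are all assumptions made for illustration, and Parquet conversion and the Swift upload are left out.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Reading:
    """One sensor message (hypothetical schema, not the real Madrid feed)."""
    sensor_id: str
    timestamp: datetime
    speed: float      # assumed unit: km/h
    intensity: int    # assumed unit: vehicles per hour


def window_start(ts, minutes=60):
    """Floor a timestamp to the start of its aggregation window.

    Assumes `minutes` divides a day evenly (e.g. 5, 15, 60).
    """
    minute_of_day = ts.hour * 60 + ts.minute
    floored = minute_of_day - minute_of_day % minutes
    return ts.replace(hour=floored // 60, minute=floored % 60,
                      second=0, microsecond=0)


def aggregate(readings, minutes=60):
    """Group readings into one object per window and annotate each object."""
    objects = {}
    for r in readings:
        key = window_start(r.timestamp, minutes)
        # Partition by date, as suggested on the slide.
        obj = objects.setdefault(
            key, {"partition": key.date().isoformat(), "records": []})
        obj["records"].append(r)
    for obj in objects.values():
        speeds = [r.speed for r in obj["records"]]
        times = [r.timestamp for r in obj["records"]]
        # Metadata that a search index could later be built on.
        obj["metadata"] = {
            "min_speed": min(speeds), "max_speed": max(speeds),
            "start_time": min(times), "end_time": max(times),
        }
    return objects
```

With a 60-minute policy, readings at 08:00 and 08:05 end up in the same object and a reading at 09:00 starts a new one.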
IoT Architecture: Madrid Traffic Data Access
Aim: Access data efficiently and cost effectively.
Store IoT data in OpenStack Swift object storage: open source, low cost deployment, and highly scalable.
Parquet data is accessible via Spark SQL.
Optimized predicate pushdown: a custom Spark SQL external data source driver uses the object metadata indexes and searches for Swift objects whose min/max values overlap the requested ranges.
Example, get all data for morning traffic: SELECT codigo, intensidad, velocidad FROM madridtraffic WHERE tf >= '08:00:00' AND tf <= '12:00:00'
Brute force method: 13245 Swift requests. Optimized predicate pushdown: 616 Swift requests, a 21.5 times improvement.
Diagram label: Swift.
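The idea behind the metadata-based pushdown can be sketched in a few lines of Python: an object needs to be fetched only if its stored min/max range overlaps the range in the query. The object names and the in-memory `index` dictionary below are hypothetical stand-ins for the metadata index; this is not the custom Spark SQL driver itself.

```python
def overlaps(obj_min, obj_max, lo, hi):
    """True if the closed ranges [obj_min, obj_max] and [lo, hi] intersect."""
    return obj_min <= hi and obj_max >= lo


def select_objects(index, lo, hi):
    """Return the names of objects whose min/max metadata overlaps [lo, hi]."""
    return [name for name, (obj_min, obj_max) in index.items()
            if overlaps(obj_min, obj_max, lo, hi)]


# Hypothetical metadata index: object name -> (min tf, max tf).
# 'HH:MM:SS' strings compare correctly as plain strings.
index = {
    "traffic/00-06": ("00:00:00", "05:55:00"),
    "traffic/06-09": ("06:00:00", "08:55:00"),
    "traffic/09-12": ("09:00:00", "11:55:00"),
    "traffic/12-18": ("12:00:00", "17:55:00"),
    "traffic/18-24": ("18:00:00", "23:55:00"),
}

# Same range as the morning-traffic query on the slide.
needed = select_objects(index, "08:00:00", "12:00:00")

# Request counts reported on the slide: brute force vs. pushdown.
speedup = 13245 / 616
```

In this toy index only three of the five objects are requested. The reported counts give 13245 / 616, which is approximately 21.5, matching the improvement stated on the slide.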
IoT Architecture: Madrid Traffic Machine Learning
Aim: Learn to differentiate between good and bad traffic.
This depends on context: time (morning/evening), day (weekday/weekend) and location.
Use Spark MLlib k-means clustering to produce threshold values for real-time decision making.
Re-run the algorithm when the quality of the clusters decreases; the silhouette index can be used to measure quality.
Diagram label: Swift.
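As a rough illustration of the retraining check, the following Python sketch computes the mean silhouette index for one-dimensional clusters and flags when it falls below a cut-off. It assumes at least two non-empty clusters; the `min_score` value of 0.5 and the function names are assumptions, since the slides do not say how the decrease in quality is detected.

```python
def silhouette(clusters):
    """Mean silhouette index over all points of 1-D clusters.

    `clusters` is a list of lists of numbers with at least two non-empty
    clusters. Points in singleton clusters score 0 by convention.
    """
    scores = []
    for i, cluster in enumerate(clusters):
        for idx, p in enumerate(cluster):
            if len(cluster) == 1:
                scores.append(0.0)
                continue
            # a: mean distance to the other points of the same cluster.
            a = sum(abs(p - q) for j, q in enumerate(cluster)
                    if j != idx) / (len(cluster) - 1)
            # b: smallest mean distance to the points of any other cluster.
            b = min(sum(abs(p - q) for q in other) / len(other)
                    for j, other in enumerate(clusters)
                    if j != i and other)
            denom = max(a, b)
            scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


def needs_retraining(clusters, min_score=0.5):
    """Hypothetical policy: re-run clustering when quality drops too low."""
    return silhouette(clusters) < min_score
```

Well-separated clusters such as `[[10, 12, 14], [60, 62, 64]]` score close to 1, while clusters whose points are mixed together score much lower and would trigger a re-run under this policy.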
IoT Architecture: Madrid Traffic Machine Learning
Event detection (example: morning traffic on weekdays): Use Spark MLlib k-means clustering to separate the data into 2 clusters, find the midpoint between the 2 cluster centres, and use this midpoint to generate the thresholds. Repeat for each context, e.g. time period (morning, afternoon, evening, night).
Anomaly detection: Use a single cluster and define an anomaly to be any point further than a certain distance from the cluster centre.
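The threshold derivation can be sketched in plain Python. This is a simplified stand-in, not the Spark MLlib job: it assumes one-dimensional data (e.g. speed only), uses a small hand-written Lloyd's iteration with a deterministic initialisation, and takes the mean as the centre of the single anomaly-detection cluster. All function names are illustrative.

```python
def kmeans_1d(values, k=2, iterations=50):
    """Tiny 1-D k-means (Lloyd's algorithm); returns sorted cluster centres."""
    data = sorted(values)
    # Deterministic initialisation: spread the centres across the sorted data.
    centres = [data[(2 * i + 1) * len(data) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in data:
            nearest = min(range(k), key=lambda i: abs(v - centres[i]))
            clusters[nearest].append(v)
        # Keep the old centre if a cluster ends up empty.
        new_centres = [sum(c) / len(c) if c else centres[i]
                       for i, c in enumerate(clusters)]
        if new_centres == centres:
            break
        centres = new_centres
    return sorted(centres)


def threshold(values):
    """Midpoint between the two cluster centres, used as the rule threshold."""
    low, high = kmeans_1d(values, k=2)
    return (low + high) / 2


def is_anomaly(value, values, max_distance):
    """Single-cluster variant: far from the centre (the mean) means anomaly."""
    centre = sum(values) / len(values)
    return abs(value - centre) > max_distance
```

For speeds `[10, 12, 14, 60, 62, 64]` the two centres settle at 12 and 62, so the threshold is 37. In practice one such threshold would be computed per context, for example per time period.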
IoT Architecture: Madrid Traffic Real-Time Decision Making
Aim: Respond in real time to traffic conditions.
Use a Complex Event Processing (CEP) approach: rule based, processing events record by record.
CEP rules are typically defined manually, but in many cases it is difficult to get them right. We automate this process and make it smart.
Prediction (proactive approach): Use Spark streaming linear regression to predict traffic behavior (e.g. speed, intensity) for the near future, then apply CEP on the predicted data.
Respond proactively to predicted events such as traffic congestion, e.g. EMT can proactively re-route buses.
Diagram labels: IoT, CEP.
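The proactive approach, predict first and then apply the rule to the prediction, can be sketched as follows. The sketch swaps Spark streaming linear regression for an ordinary least-squares fit over a short window of recent readings, and reduces the CEP rule to a single threshold comparison. The function names, the window and the threshold value are assumptions for illustration only.

```python
def fit_line(ys):
    """Least-squares line through (0, ys[0]), (1, ys[1]), ...; needs len >= 2."""
    n = len(ys)
    mean_x = (n - 1) / 2
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_next(ys, steps_ahead=1):
    """Extrapolate the fitted line `steps_ahead` readings into the future."""
    slope, intercept = fit_line(ys)
    return intercept + slope * (len(ys) - 1 + steps_ahead)


def congestion_rule(speed, threshold):
    """Simplified CEP-style rule: emit an event when speed drops too low."""
    return "CONGESTION" if speed < threshold else None


# Recent speed readings for one sensor (made-up values).
recent_speeds = [60, 55, 50, 45]
predicted = predict_next(recent_speeds)
event = congestion_rule(predicted, threshold=42)
```

Here the latest observed speed (45) is still above the threshold, but the predicted next value is below it, so the rule fires one step early. That is the point of applying CEP to predicted rather than observed data.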
Use Case 2: Taipei Smart Homes
The Taipei test scenario comprised 50 volunteer households, each installed with a Smart Energy kit (incl. home gateway, smart plugs, and smart strips).
Real-time monitoring, control, and reporting of home appliance energy usage.
Goal: Real-time monitoring of appliances in order to detect anomalies.
Diagram labels: Home Gateway, Real-time usage, Energy, Smart plugs.
Taipei Smart Homes
Examples of anomalies: short circuit of a device; devices being operated at unusual times. An anomaly at night might not be an anomaly during the day.
The same architecture is used for monitoring energy data; the only difference lies in the type of analytics and rules.
Historical data analytics: learn normal patterns from historical data, then use CEP rules to detect deviations from normal.
Different models for different contexts: time of day (morning, afternoon, evening, night), weekday or weekend, winter or summer, rainy or sunny.
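A minimal sketch of context-dependent models: one mean and standard deviation of power draw per context, with a reading flagged when it deviates too far from its own context's normal. The period boundaries, the `k` multiplier, the 1 W floor on the spread and the restriction to two context dimensions (time of day, weekday/weekend) are all assumptions; the slides do not specify the actual models.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean, pstdev


def context_of(ts):
    """Map a timestamp to (period, day type); boundaries are assumed."""
    if 6 <= ts.hour < 12:
        period = "morning"
    elif 12 <= ts.hour < 18:
        period = "afternoon"
    elif 18 <= ts.hour < 23:
        period = "evening"
    else:
        period = "night"
    day_type = "weekend" if ts.weekday() >= 5 else "weekday"
    return period, day_type


def learn_models(history):
    """Learn (mean, std) of power draw per context from (timestamp, watts)."""
    grouped = defaultdict(list)
    for ts, watts in history:
        grouped[context_of(ts)].append(watts)
    return {ctx: (mean(v), pstdev(v)) for ctx, v in grouped.items()}


def is_anomaly(ts, watts, models, k=3.0):
    """Flag readings more than k deviations from the context's normal."""
    model = models.get(context_of(ts))
    if model is None:
        return False  # no model learned for this context yet
    mu, sigma = model
    # Floor the spread at 1 W so near-constant loads do not trigger on noise.
    return abs(watts - mu) > k * max(sigma, 1.0)
```

With history where an appliance draws about 5 W at night and about 100 W in the evening, a 100 W reading at 02:00 is flagged while the same reading at 19:00 is not, which mirrors the observation that an anomaly at night might not be an anomaly during the day.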
Real-Time Anomaly Detection Using the COSMOS Data Analytics Architecture
Diagram labels: COSMOS Data Analytics, CEP, real-time warning messages, Swift, Secor, Node-RED, istrip sensor, PC/monitor, refrigerator, fan/lighting.
Our Architecture Applies to Many IoT Use Cases
Healthcare: patient monitoring/alert/response.
Logistics: monitoring of sensitive goods.
Social media: event detection when the number of posts is high compared to normal behavior.
Insurance: driver behavior and location monitoring.
Transportation: connected vehicles, engine diagnostics, automated service scheduling.
COSMOS
Funding: EU FP7 at level of 2PY x 3 years. Started: Sept 2013. Coordinator: ATOS.
Technical partners: University of Surrey, IBM, NTUA, Siemens, ATOS.
Use case partners: Hildebrand/Camden, EMT Madrid Bus Transport/Madrid Council, III Taiwan.
Smart Cities use cases.
Project vision: Enable things to interact with each other based on shared experience, trust, reputation, etc.
Thank you. Any questions?
For more details, email: adnan.akbar@surrey.ac.uk
Why Now?
Advancement in technology, with cheap parallel computing and storage: MapReduce (2004); object storage such as Amazon S3 (2006); NoSQL data stores such as Cassandra (2008); software for cloud computing such as OpenStack (2010); Hadoop, a big data ecosystem (2011); Apache Spark, a big data processing platform (2014).
The price of sensors has fallen.
The cost of bandwidth has decreased (40x over 10 years).
Free or cheap WiFi.
Move from IPv4 to IPv6.