IoT Data Analytics Architecture for Real-World Use Cases

From Intelligent Transportation in Madrid to Smart Homes
in Taipei: An IoT Data Analytics architecture applicable to
multiple real world use cases
Thursday, 12 September 2024
Adnan Akbar
Institute for Communication Systems (ICS)
5G Innovation Centre (5GIC)
University of Surrey, UK
Adnan.akbar@surrey.ac.uk
Joint work with:
Paula Ta-Shma, IBM Research
Guy Hadash, IBM Research
Juan Sancho, ATOS
What is the Internet of Things?
“Internet of Things is based on the vision of connecting everyday objects to internet to form a cyber-physical system, where every object will be represented by its virtual representation enabling the control of physical world remotely” (F. Mattern and C. Floerkemeier)
Connecting Everyday Objects
Physical things containing chips/sensors capture and communicate all types of data
Virtual Representation
Control of Physical World
interact with other devices, computing systems and the external environment, including people
IoT Data Analytics
More data, more opportunities, but more challenges for analyzing and extracting knowledge from this data
Which processing model to use?
Batch processing vs. event processing, or real-time vs. historical
Right combination of tools for IoT data?
A plethora of open source projects exists for storing and processing big data, e.g. Secor
Generic IoT Architecture – Data Flow
1. Ingestion
Collect historical time series data
Collect data from devices
Aggregate into objects
Index and/or partition
2. Historical Data Access and Analytics
Learn patterns in data
May be time/location dependent
Generate thresholds, classifiers etc.
3. Real-Time Data Analytics
Apply what was learned to the real-time data stream
Take action
Proposed Solution: A Lambda Architecture for IoT
A generic IoT Analytics architecture
1) Ingestion
2) Historical Data Analytics (Batch Processing)
3) Real-time Data Analytics (Event Processing)
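The three stages above can be sketched in a few lines of Python. This is only an illustration of the data flow, not the COSMOS implementation: an in-memory list stands in for Kafka and Swift, the mean of the stored history stands in for the learned model, and the names `ingest`, `batch_learn`, and `realtime_apply` are hypothetical.

```python
from statistics import mean

# Hypothetical in-memory stand-in for the object store (Swift in the slides).
historical_store = []

def ingest(reading):
    """Stage 1: persist every incoming reading for later batch analysis."""
    historical_store.append(reading)

def batch_learn(store):
    """Stage 2: learn a model from history; here simply the mean speed as a threshold."""
    return mean(r["speed"] for r in store)

def realtime_apply(reading, threshold):
    """Stage 3: apply the learned threshold to one live event and choose an action."""
    return "alert" if reading["speed"] < threshold else "ok"

# Made-up readings from a single hypothetical sensor.
for speed in (60, 55, 20, 65):
    ingest({"sensor": "s1", "speed": speed})

threshold = batch_learn(historical_store)                           # batch path
action = realtime_apply({"sensor": "s1", "speed": 30}, threshold)   # real-time path
print(threshold, action)
```

The point of the split is that the expensive learning step runs occasionally over stored data, while the per-event step only compares against values that were already learned.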
Use Case 1: Intelligent Transportation System for Madrid Council
Problem
Over 3000 traffic sensors are deployed throughout the city of Madrid
EMT needs to staff control rooms where employees manually analyze Madrid traffic sensor output.
This can be slow and costly.
Objective
Improve customer satisfaction and reduce costs by responding more efficiently and quickly to real-time traffic problems
Approach
Ingest data from up to 3000 sensors into our architecture, learn patterns from historical data, apply them to real-time data using CEP, and react by alerting drivers, calling emergency vehicles, rerouting buses, modifying traffic lights, etc.
IoT Architecture – Madrid Traffic – Ingestion Flow
Aim: Collect historical timeseries data for analysis
Continuously collect data from up to 3000 Madrid council traffic sensors via web service
Data includes traffic speeds and intensities, updated every 5 mins
Push the messages to Kafka
Use Secor to aggregate multiple messages into a single Swift object
According to policy, e.g., every 60 mins
Possibly partition the data, e.g. according to date
Convert to Parquet format
Annotate with metadata, e.g., min/max speed, start/end time
Index Swift objects according to their metadata using ElasticSearch
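The aggregation and annotation steps above might look roughly like the following sketch. It does not call Kafka, Secor, Swift, or Elasticsearch; messages are plain dicts, and the field names (`ts` in seconds, `speed`), the object naming scheme, and the sample data are all assumptions made for illustration.

```python
from collections import defaultdict

WINDOW_SECONDS = 3600  # aggregation policy: one object per 60 minutes

def aggregate(messages, window=WINDOW_SECONDS):
    """Group messages into per-window objects and annotate each with metadata."""
    buckets = defaultdict(list)
    for msg in messages:
        buckets[msg["ts"] // window].append(msg)

    objects = []
    for key in sorted(buckets):
        rows = buckets[key]
        objects.append({
            "name": f"traffic-{key}",  # hypothetical object name
            "rows": rows,
            "metadata": {
                "min_speed": min(r["speed"] for r in rows),
                "max_speed": max(r["speed"] for r in rows),
                "start_ts": min(r["ts"] for r in rows),
                "end_ts": max(r["ts"] for r in rows),
            },
        })
    return objects

# Readings every 5 minutes (300 s) for two hours, with made-up speeds.
messages = [{"ts": t, "speed": 40 + (t // 300) % 7} for t in range(0, 7200, 300)]
objects = aggregate(messages)
for obj in objects:
    print(obj["name"], len(obj["rows"]), obj["metadata"])
```

Only the small `metadata` dict of each object would need to be indexed, which is what makes the later lookup by min/max values cheap.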
IoT Architecture – Madrid Traffic – Data Access
Aim: Access data efficiently and cost effectively
Store IoT data in OpenStack Swift object storage
Open source, low cost deployment, and highly scalable
Parquet data is accessible via Spark SQL
Optimized predicate pushdown
Custom Spark SQL external data source driver
Uses object metadata indexes
Searches for Swift objects whose min/max values overlap requested ranges
Get all data for morning traffic:
SELECT codigo, intensidad, velocidad
FROM madridtraffic
WHERE tf >= '08:00:00' AND tf <= '12:00:00'
Brute force method: 13245 Swift requests
Optimized predicate pushdown: 616 Swift requests
21.5 times improvement
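The object-selection idea behind the pushdown can be illustrated with a small range-overlap filter. This is a sketch under assumptions, not the custom Spark SQL driver: the index is a plain list, the metadata keys `min_tf` and `max_tf` and the object names are hypothetical, and times are zero-padded `HH:MM:SS` strings so they compare correctly as strings. The last lines recompute the improvement factor from the two request counts above.

```python
def overlaps(obj_min, obj_max, lo, hi):
    """True if [obj_min, obj_max] intersects the requested range [lo, hi]."""
    return obj_min <= hi and obj_max >= lo

def select_objects(index, lo, hi):
    """Return only the object names whose min/max metadata overlap the query range."""
    return [e["name"] for e in index if overlaps(e["min_tf"], e["max_tf"], lo, hi)]

# Hypothetical metadata index: one entry per stored object.
index = [
    {"name": "obj-00", "min_tf": "00:00:00", "max_tf": "05:59:59"},
    {"name": "obj-06", "min_tf": "06:00:00", "max_tf": "08:59:59"},
    {"name": "obj-09", "min_tf": "09:00:00", "max_tf": "11:59:59"},
    {"name": "obj-12", "min_tf": "12:00:00", "max_tf": "17:59:59"},
    {"name": "obj-18", "min_tf": "18:00:00", "max_tf": "23:59:59"},
]

# Same range as the morning-traffic query: only overlapping objects are fetched.
needed = select_objects(index, "08:00:00", "12:00:00")
print(needed)

# Request counts reported above: brute force vs. optimized pushdown.
speedup = 13245 / 616
print(round(speedup, 1))
```

Objects whose time range lies entirely outside the query are never requested, which is where the reduction in Swift requests comes from.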
IoT Architecture – Madrid Traffic – Machine Learning
Aim: Learn to differentiate between ‘good’ and ‘bad’ traffic
Depends on context
Time (morning/evening), Day (weekday/weekend)
Location
Use Spark MLlib k-means clustering
Produce threshold values for real-time decision making
Re-run algorithm when quality of clusters decreases
Can use silhouette index to measure quality
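As an illustration of the quality check, the silhouette coefficient can be computed directly for small one-dimensional clusters. The sketch below is plain Python rather than Spark MLlib, the sample values are made up, and the cut-off `RETRAIN_BELOW` is a hypothetical choice; it expects at least two clusters.

```python
from statistics import mean

def silhouette(clusters):
    """Mean silhouette coefficient for 1-D clusters given as lists of numbers."""
    scores = []
    for i, own in enumerate(clusters):
        others = [c for j, c in enumerate(clusters) if j != i]
        for idx, x in enumerate(own):
            if len(own) == 1:
                scores.append(0.0)  # usual convention for singleton clusters
                continue
            # a: mean distance to the other members of the same cluster
            a = mean(abs(x - y) for k, y in enumerate(own) if k != idx)
            # b: mean distance to the members of the nearest other cluster
            b = min(mean(abs(x - y) for y in c) for c in others)
            scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return mean(scores)

good = [[10, 12, 11], [60, 62, 61]]   # tight, well-separated clusters
poor = [[10, 40, 30], [35, 20, 60]]   # overlapping clusters
RETRAIN_BELOW = 0.5  # hypothetical cut-off for re-running the clustering

for name, clusters in (("good", good), ("poor", poor)):
    score = silhouette(clusters)
    print(name, round(score, 3), "re-run" if score < RETRAIN_BELOW else "keep")
```

Values close to 1 indicate compact, well-separated clusters; values near or below 0 suggest the learned thresholds no longer describe the data well.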
 
Event Detection:
Use Spark MLlib k-means clustering to separate data into 2 clusters
Find the midpoint between the 2 cluster centres
Use this midpoint to generate the thresholds
Repeat for each context, e.g. time period (morning, afternoon, evening, night)
Anomaly Detection:
Use a single cluster and define an anomaly to be further than a certain distance from the cluster centre
Morning Traffic on Weekdays
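A minimal sketch of both ideas follows, using a hand-written one-dimensional k-means (Lloyd's algorithm with k = 2) in place of Spark MLlib. The speed values, the helper names, and the anomaly distance of 30 are made-up assumptions for illustration only.

```python
from statistics import mean

def kmeans_1d_two(values, iterations=20):
    """Lloyd's algorithm for k=2 on 1-D data, seeded with the min and max values."""
    c_low, c_high = min(values), max(values)
    for _ in range(iterations):
        low = [v for v in values if abs(v - c_low) <= abs(v - c_high)]
        high = [v for v in values if abs(v - c_low) > abs(v - c_high)]
        if not low or not high:
            break  # degenerate data, e.g. all values identical
        c_low, c_high = mean(low), mean(high)
    return c_low, c_high

def is_anomaly(value, centre, max_distance):
    """Single-cluster anomaly test: too far from the cluster centre."""
    return abs(value - centre) > max_distance

# Made-up morning speeds (km/h): a congested group and a free-flowing group.
speeds = [18, 22, 20, 25, 15, 62, 58, 65, 60, 55]

# Event detection: the threshold is the midpoint between the two cluster centres.
c_low, c_high = kmeans_1d_two(speeds)
threshold = (c_low + c_high) / 2
print(c_low, c_high, threshold)

# Anomaly detection: one cluster (here the overall mean) and a distance limit.
centre = mean(speeds)
print(is_anomaly(5, centre, max_distance=30), is_anomaly(60, centre, max_distance=30))
```

Running the same procedure once per context (for example, per time period) would give one threshold per context, matching the repetition step described above.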
IoT Architecture – Madrid Traffic – Real Time Decision Making
Aim: Respond in real time to traffic conditions
Use Complex Event Processing (CEP) approach
Rule based
Process events record by record
CEP rules are typically defined manually but in many cases it is difficult to get them right
We automate this process and make it smart
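To make the rule-based, record-by-record style concrete, here is a small sketch of one rule whose thresholds are passed in rather than hard-coded, so they could come from the clustering step. It is not a real CEP engine: the `CongestionRule` class, the "n events in a row" pattern, and the threshold values are assumptions, while the field names follow the columns in the earlier query.

```python
from collections import defaultdict

class CongestionRule:
    """Fires when a sensor reports low speed and high intensity for n events in a row."""

    def __init__(self, speed_threshold, intensity_threshold, n=3):
        self.speed_threshold = speed_threshold
        self.intensity_threshold = intensity_threshold
        self.n = n
        self.streak = defaultdict(int)  # consecutive matching events per sensor

    def on_event(self, event):
        """Process one event; return an alert dict when the pattern completes, else None."""
        sensor = event["codigo"]
        matches = (event["velocidad"] < self.speed_threshold
                   and event["intensidad"] > self.intensity_threshold)
        self.streak[sensor] = self.streak[sensor] + 1 if matches else 0
        if self.streak[sensor] == self.n:
            return {"type": "congestion", "codigo": sensor}
        return None

# Thresholds are made-up here; in the approach above they are learned from history.
rule = CongestionRule(speed_threshold=40, intensity_threshold=800, n=3)

stream = [
    {"codigo": 1001, "velocidad": 35, "intensidad": 900},
    {"codigo": 1001, "velocidad": 55, "intensidad": 400},   # breaks the streak
    {"codigo": 1001, "velocidad": 30, "intensidad": 950},
    {"codigo": 1001, "velocidad": 28, "intensidad": 990},
    {"codigo": 1001, "velocidad": 25, "intensidad": 1000},  # third match in a row
    {"codigo": 1001, "velocidad": 24, "intensidad": 1010},
]
alerts = [a for a in (rule.on_event(e) for e in stream) if a is not None]
print(alerts)
```

Because the thresholds are constructor arguments, re-running the batch analysis only requires building a new rule instance with updated values.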
 
Prediction
Proactive approach:
Use Spark streaming linear regression to predict traffic behavior (e.g. speed, intensity) for the near future
Apply CEP on predicted data
Respond proactively to predicted events such as traffic congestion, e.g. EMT can proactively reroute buses
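The proactive idea can be sketched as "predict one step ahead, then test the prediction against the rule". Note the substitution: instead of Spark's streaming linear regression, this example fits an ordinary least-squares line over a short sliding window, which is simpler to show in plain Python. The window size, the threshold, and the speed values are assumptions.

```python
from collections import deque

def predict_next(window):
    """Fit y = a + b*t by least squares over the window and extrapolate one step ahead."""
    n = len(window)
    ts = range(n)
    t_mean = sum(ts) / n
    y_mean = sum(window) / n
    denom = sum((t - t_mean) ** 2 for t in ts)
    slope = sum((t - t_mean) * (y - y_mean) for t, y in zip(ts, window)) / denom
    intercept = y_mean - slope * t_mean
    return intercept + slope * n

SPEED_THRESHOLD = 40.0  # hypothetical value; above it is derived from clustering
window = deque(maxlen=4)
alerts = []

# Made-up speeds that are still above the threshold but falling steadily.
for step, speed in enumerate([60, 57, 54, 51, 48, 42]):
    window.append(speed)
    if len(window) < window.maxlen:
        continue  # not enough history yet
    predicted = predict_next(window)
    # Rule applied to the predicted value: warn before congestion is observed.
    if predicted < SPEED_THRESHOLD and speed >= SPEED_THRESHOLD:
        alerts.append((step, predicted))

print(alerts)
```

The alert is raised while the measured speed is still above the threshold, which is the window of time in which an operator could act, for example by rerouting buses.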
Use Case 2: Taipei Smart Homes
 
The Taipei test scenario comprises 50 volunteer households
Each household is fitted with a Smart Energy kit (incl. home gateway, smart plugs, and smart strips)
Real-time energy usage
Goal: Real-time monitoring of appliances in order to detect anomalies
Taipei Smart Homes
Example of Anomalies
Short circuit of a device
Devices being operated at unusual times
An Anomaly at night might not be an anomaly at daytime
Same architecture is used for monitoring energy data
Only difference lies in the type of Analytics and Rules
Historical Data Analytics
Learn normal patterns from historical data
Use CEP rules to detect the deviation from normal
Different Models for different context
Time of a day (Morning, Afternoon, Evening, Night)
Weekday or weekend
Winter or summer
Rainy or sunny
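One way to picture "different models for different context" is a separate mean and spread per context, with an anomaly defined as a reading too far from its own context's mean. The sketch below covers only two of the listed context dimensions (time of day, weekday or weekend); the hour boundaries, the factor `k`, and the wattage values are assumptions, not figures from the Taipei trial.

```python
from collections import defaultdict
from statistics import mean, pstdev

def context_of(hour, weekday):
    """Map a reading to a (time-of-day, day-type) context; boundaries are assumed."""
    if 6 <= hour < 12:
        period = "morning"
    elif 12 <= hour < 18:
        period = "afternoon"
    elif 18 <= hour < 23:
        period = "evening"
    else:
        period = "night"
    return (period, "weekend" if weekday >= 5 else "weekday")

def learn_models(history):
    """Learn one (mean, standard deviation) model per context from historical readings."""
    grouped = defaultdict(list)
    for hour, weekday, watts in history:
        grouped[context_of(hour, weekday)].append(watts)
    return {ctx: (mean(v), pstdev(v)) for ctx, v in grouped.items()}

def is_anomaly(models, hour, weekday, watts, k=3.0):
    """Flag a reading more than k standard deviations from its context's mean."""
    centre, spread = models[context_of(hour, weekday)]
    return abs(watts - centre) > k * spread

# Made-up history for one appliance: (hour, weekday 0-6, watts).
history = [
    (20, 1, 118), (21, 2, 122), (19, 3, 120), (20, 4, 121), (21, 0, 119),  # evening use
    (2, 1, 5), (3, 2, 6), (1, 3, 4), (2, 4, 5), (3, 0, 5),                 # night standby
]
models = learn_models(history)

# The same 120 W reading is normal in the evening but anomalous at night.
print(is_anomaly(models, 20, 1, 120), is_anomaly(models, 2, 1, 120))
```

Season and weather could be added as further elements of the context tuple without changing the rest of the logic.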
Real-Time Anomaly detection using COSMOS Data Analytics Architecture
COSMOS Data Analytics
Our Architecture Applies to Many IoT Use cases
Healthcare
Healthcare patient monitoring/alert/response
Logistics
Monitoring of sensitive goods
Social Media
Event detection when the number of posts is high compared to normal behavior
Insurance
Driver behavior and location monitoring
Transportation
Connected vehicles, engine diagnostics, automated service scheduling
COSMOS
Funding: EU FP7 at level of 2PY x 3 years
Started: Sept 2013
Coordinator: ATOS
Technical partners: University of Surrey, IBM, NTUA, Siemens, ATOS
Use Case Partners: Hildebrand/Camden, EMT Madrid Bus Transport/Madrid Council, III Taiwan – Smart Cities use cases
Project Vision: Enable ‘things’ to interact with each other based on shared experience, trust, reputation etc.
Thank you.
Any Questions?
For more details, Email: adnan.akbar@surrey.ac.uk
Why Now?
Advancement in technology: cheap parallel computing and storage
MapReduce (2004)
Object storage such as Amazon S3 (2006)
NoSQL data stores such as Cassandra (2008)
Software for cloud computing such as OpenStack (2010)
Hadoop – A big data ecosystem (2011)
Apache Spark – Big data processing platform (2014)
Price of sensors has fallen
Cost of bandwidth has decreased (40x over 10 years)
Free or cheap WiFi
Moved from IPv4 to IPv6

Explore the IoT data analytics architecture proposed by Adnan Akbar from the University of Surrey, applicable to diverse real-world scenarios like smart homes in Taipei. Discover how IoT leverages the connection of everyday objects to the internet, enabling remote control of physical environments. Delve into the challenges and opportunities of IoT data analytics, including processing models, analytic methods, and tool combinations for maximizing value from the data. Gain insights into batch processing, event processing, machine learning, and the right tools for storing and processing IoT data.


Uploaded on Sep 12, 2024


Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. Download presentation by click this link. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

E N D

Presentation Transcript


  3. IoT Data Analytics: More data, more opportunities, but more challenges for analyzing and extracting knowledge from this data. Which analytic methods are available to get more value from this data? Which processing model should be used to analyze this data? Which is the right set of tools?

  4. Which processing model to use? Batch processing vs. event processing, or real-time vs. historical. Diagram labels: IoT Data, Batch Processing, Machine Learning, Statistical Methods, Event Processing, Complex Event Processing, Hybrid Solutions.

  5. Right combination of tools for IoT data? A plethora of open source projects exists for storing and processing big data. Diagram labels: Secor, Swift, Elasticsearch.

  9. Proposed Solution: A Lambda Architecture for IoT. A generic IoT analytics architecture: 1) Ingestion, 2) Historical Data Analytics (Batch Processing), 3) Real-time Data Analytics (Event Processing). Diagram labels: CEP, IoT, Swift, Secor. Green flows: real time. Purple flows: batch.

  16. Use Case 2: Taipei Smart Homes. The Taipei test scenario comprises 50 volunteer households, each fitted with a Smart Energy kit (incl. home gateway, smart plugs, and smart strips) for real-time monitoring, control, and reporting of home appliance energy usage. Goal: real-time monitoring of appliances in order to detect anomalies.

  18. Real-Time Anomaly Detection using the COSMOS Data Analytics Architecture. Diagram labels: COSMOS Data Analytics, CEP, real-time warning messages, Swift, Secor, Node-RED, istrip sensor, PC/monitor, refrigerator, fan/lighting.
