Trajectory Data Mining and Classification Overview
Dr. Yu Zheng, a leading researcher at Microsoft Research and Shanghai Jiao Tong University, delves into the paradigm of trajectory data mining, focusing on uncertainty, trajectory patterns, classification, privacy preservation, and outlier detection. The process involves segmenting trajectories, extracting features, and applying models like Dynamic Bayesian Network and Hidden Markov Model for classification. Furthermore, the learning of transportation modes from GPS trajectories aims to differentiate between driving, biking, taking the bus, and walking with high accuracy, offering valuable insights for smart route design and recommendation systems.
Download Presentation
Please find below an Image/Link to download the presentation.
The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. Download presentation by click this link. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.
E N D
Presentation Transcript
Trajectory Data Mining Dr. Yu Zheng Lead Researcher, Microsoft Research Chair Professor at Shanghai Jiao Tong University Editor-in-Chief of ACM Trans. Intelligent Systems and Technology http://research.microsoft.com/en-us/people/yuzheng/
Paradigm of Trajectory Data Mining Uncertainty Traj. Pattern Mining Trajectory Classification Graph Mining Moving Together Patterns Clustering Privacy Preserving Freq. Seq. Patterns Trajectory Outlier/Anomaly Detection Periodic Patterns Reducing Uncertainty Routing Trajectory Indexing and Retrieval Distance of Trajectory Matrix Analysis Query Historical Trajectories Managing Recent Trajectories TD MF Trajectory Preprocessing Map-Matching Compression CF Stay Point Detection Noise Filtering Segmentation Matrix Spatial Trajectories Spatial Trajectories Spatial Trajectories Tensor Graph Yu Zheng. Trajectory Data Mining: An Overview. ACM Transactions on Intelligent Systems and Technology. 2015, vol. 6, issue 3.
Trajectory Classification Differentiate between trajectories (or its segments) of different status: motions transportation modes human activities Applications trip recommendation life experiences sharing context-aware computing
Trajectory Classification General Steps: Divide a trajectory into segments using segmentation methods. Sometimes, each single point is regarded as a minimum inference unit Extract features from each segment (or point) Build a model to classify each segment (or point) Some models Dynamic Bayesian Network (DBN) HMM and Conditional Random Field (CRF)
Learning Transportation Modes Based on GPS Trajectories Goal & Results: Inferring transportation modes from raw GPS data Differentiate driving, riding a bike, taking a bus and walking Achieve a 0.75 inference accuracy (independent of other sensor data) GPS log Users Infer model
Learning Transportation Modes Based on GPS Trajectories Motivation For users: For service provider: Classify trajectories of different transportation modes Enable smart-route design and recommendation Difficulty Velocity-based method cannot handle this problem well (<0.5 accuracy) People usually transfer their transportation modes in a trip The observation of a mode is vulnerable to traffic condition and weather Reflect on past events and understand their own life pattern Obtain more reference knowledge from others experiences Yu Zheng, et al. Understanding Mobility Based on GPS Data. UbiComp 2008
Learning Transportation Modes Based on GPS Trajectories Contributions and insights A change point-based segmentation method Walk is a transition between different transportation modes Handle congestions to some extent A set of sophisticated features Robust to traffic condition Feed into a supervise learning-based inference model A graph-based post-processing Considering typical user behavior Employing location constrains of the real world WWW 2008 (first version) Yu Zheng, et al. Understanding Mobility Based on GPS Data. UbiComp 2008
Architecture Online Inference Offline Learning Test Data Training Data Change Point Clustering Segmentation Segmentation Extracting Feature Graph Building Extracting Feature Knowledge Extraction Inference Model Model Training Spatial Knowledge Post-Processing Trans. Modes Spatially Indexed Knowledge Spatial Indexing
Walk-Based Segmentation Commonsense knowledge from the real world Typically, people need to walk before transferring transportation modes Typically, people need to stop and then go when transferring modes Yu Zheng, et al. Understanding Mobility Based on GPS Data. UbiComp 2008
Walk-Based Segmentation Change point-based Segmentation Algorithm Step 1: distinguish all possible Walk Points, non-Walk Points. Step 2: merge short segment composed by consecutive Walk Points or non-Walk points Step 3: merge consecutive Uncertain Segment to non-Walk Segment. Step 4: end point of each Walk Segment are potential change points Forward Backward Walk Car Bus (a) (b) Car Certain Segment 3 Uncertain Segments Certain Segment (c) Denotes a non-walk Point: P.V>Vtor P.a>at Denotes a possible walk point: P.V<Vtand P.a<at
Feature Extraction (1) Features Category Features Significance Dist Distance of a segment MaxVi The ith maximal velocity of a segment MaxAi The ith maximal acceleration of a segment Basic Features AV Average velocity of a segment EV DV HCR SR VCR Expectation of velocity of GPS points in a segment Variance of velocity of GPS points in a segment Heading Change Rate Stop Rate Velocity Change Rate Advanced Features Yu Zheng, et al. Understanding Mobility Based on GPS Data. UbiComp 2008
Feature Extraction (2) Our features are more discriminative than velocity Heading Change Rate (HCR) Stop Rate (SR) Velocity change rate (VCR) >65 accuracy Velocity Vs Distance a) Driving Velocity Vs p2. head p3 Distance p1. head p2.V2 b) Bus Velocity p1 H1 p1.V1 p2 L1, T1 Vs Distance c) Walking
Graph-Based Post-Processing (1) Using location-constraints to improve the inference performance?? Crossroad Bus Stop Traffic Light Yu Zheng, et al. Understanding Mobility Based on GPS Data. UbiComp 2008
Graph-Based Post-Processing (2) Transition probability between different transportation modes P(Bike|Walk) and P(Bike|Driving) Transition P(Walk|Driving) Transition P(Bike|Walk) Segment[i-1]: Driving Segment[i+1]: Bike Ground Truth Segment[i]: Walk P(Bike): 62% P(Walk): 24% P(Bus): 8% P(Driving): 6% P(Bike): 40% P(Walk): 30% P(Bus): 20% P(Driving): 10% P(Driving): 75% P(Bus): 10% P(Bike): 8% P(Walk): 7% Inference result Segment[i].P(Bike) = Segment[i].P(Bike) * P(Bike|Car) Segment[i].P(Walk) = Segment[i].P(Walk) * P(Walk|Car)
Graph-Based Post-Processing (3) Mine a implied road network from users GPS logs Use the location constraints and typical user behaviors as probabilistic cues Being independent of the map information Change points and start/end points (1) (2) Building Graph N1 N2 N7 N3 N8 N5 N6 A start or end point A change point (4) Probability calculation (3) Spatial indexing N1 N8 N5 N4 N1 N2 P85(Mi) P18(Mi) P54(Mi) N3 N7 N8 P185(Mi|Mj) P854(Mi|Mj) N5 N4 P581(Mi|Mj) P458(Mi|Mj) N6 M={Driving, Walk, Bike, Bus}, E.g., P(M0) = P(Driving); P(M3|M1)= P(Bus | Walk); Yu Zheng, et al. Understanding Mobility Based on GPS Data. UbiComp 2008
Graph-Based Post-Processing (4) Segments of a GPS trajectory Search spatial index N Found in graph ? Y Y Features X Labeled data Is maxProb > T1? N Inference model Knowledge Mining Output the mode with maxProb as result N Normal post- processing Is maxProb < T2? Posterior Probability P(mi | X) Posterior Probability P(mi | Eij) Prior Probablity P(mi) Transition probability-based enhancement Prior probability-based enhancement Final Results: P(mi | X, Eij)= P(mi | X) P(mi | Eij) / P(mi) Output the mode with maxProb as result End Yu Zheng, et al. Understanding Mobility Based on GPS Data. UbiComp 2008
Thanks! Yu Zheng yuzheng@microsoft.com Homepage Yu Zheng. Trajectory Data Mining: An Overview. ACM Transactions on Intelligent Systems and Technology. 2015, vol. 6, issue 3.