Flexible Spatio-temporal Indexing Scheme for Large Scale GPS Tracks Retrieval

Slide Note
Embed
Share

This research paper discusses a novel spatio-temporal indexing scheme optimized for managing large-scale GPS data. The study introduces a stochastic process model to simulate user behavior in uploading GPS tracks, leading to a more efficient indexing scheme with smaller size, minimal update efforts, and improved retrieval performance. Various aspects such as user behavior modeling, index design, and experimental results are explored in detail.


Uploaded on Nov 12, 2024 | 0 Views


Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. Download presentation by click this link. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

E N D

Presentation Transcript


  1. A Flexible Spatio-temporal indexing Scheme for Large Scale GPS Tracks Retrieval Yu Zheng, Longhao Wang, Xing Xie yuzheng@microsoft.com Microsoft Research Asia

  2. Outline Introduction Modeling user behavior Index design Experimental results Conclusion

  3. Outline Introduction Modeling user behavior Index design Experimental results Conclusion

  4. Introduction Background GPS-enabled devices become prevalent Large amount of GPS logs have been accumulated Quite a few GPS-data-sharing applications appeared Spatio-temporal index is necessary For system: to manage the potentially large-scale data For users: to explore the GPS data interested them

  5. Introduction Problem Definition Retrieve the GPS trajectories across a given region and intersecting a given time span Present techniques are not optimized to these applications Spatial query Temporal query

  6. Introduction Our contributions A stochastic process model: simulating user behavior of uploading GPS tracks Users prefer to upload data they created recently The insert frequency of different parts of index are skewed A novel indexing scheme: optimized to the user behavior of uploading GPS tracks Smaller index size Minimal update efforts Satisfactory retrieval performance

  7. Outline Introduction Modeling user behavior Index design Experimental results Conclusion

  8. Modeling User Behavior A GPS track end time: Te Latitude, longitude, Time P1: Lat1, long1, T1 P2: Lat2, long2, T2 ... Pn: Latn, longn, Tn start time: Ts Pn-1 Pn P2 P3 P1 Duration of a GPS track Interval between trajectory created and uploaded

  9. Modeling User Behavior Upload log file to server at time Tup GPS Log File Users arrival can be modeled as Poisson process Te The interval between uploading time and end time of trajectory Tint = Tup -Te Can be modeled as Rayleigh distribution Ts Tdur = Te -Ts 0.25 0.6 0.5 0.2 0.4 Probability Probability 0.15 0.3 0.1 0.2 0.1 0.05 0 0 0 1 2 3 4 5 6 7 8 22 23 24 25 26 5 10 15 20 25 30 35 40 45 50 55 60 Interval between date image uploaded and date image taken (day) Summarized from photos uploaded by multiple users over a period of 3 months on Flickr Duration of GPS track (Minutes) Tdur follows Gaussian distribution

  10. Modeling User Behavior A (Ts, Te) represents a GPS track

  11. Outline Introduction Modeling user behavior Index design Experimental results Conclusion

  12. Index Design Architecture Partition space into disjoint grids Maintain a temporal index for each grid The temporal index (CSE-Tree) is special Spatial index track Segment 1 Segment 2 Segment 3 Temporal index

  13. Temporal Index (CSE-Tree) A GPS segment can be represented by a pair (Ts, Te) A point on two dimensional plane A temporal query is a time span (Timemin , Timemax)

  14. Temporal index Structure Partition the points into groups by Te Build a start time index (B+ Tree)to index points of each group Build a end time index (B+ Tree) to index groups Te Te ti+1 ti+1 Start Time Index Si Si index points with Ti <= Te < Ti+1 End time index ti ti ... t2 t2 S1 t1 t1 S0 Ts Ts t0

  15. Temporal Index (CSE-Tree) Three operations Insert Compress Search

  16. Temporal Index (CSE-Tree) Compress operation Occur when update frequency drops to some extent Convert B+ tree to dynamic array Te insert frequency now Sj+1 B+ Tree Sj ... dynamic array Si Ts now

  17. Temporal Index (CSE-Tree) Search operation Te>Timemin: Search End Time index to get the corresponding start time indexes Ts< Timemax: Look up each start time index candidate to find the correct points Te Tree3 t3 Tree2 t2 Tree1 Timemin t1 Tree0 Ts t0 Timemax

  18. Outline Introduction Modeling user behavior Index design Experimental results Conclusion

  19. Experimental Settings Platform PC with 3.00 GHz Intel Pentium 4 CPU, Windows XP SP2 platform, and 0.99 GB RAM Parameters B+ tree: Inner node size is 64 bytes Leaf size 1024 bytes Poisson process: 100, 300, 500 and 700 Total duration of the process is 2400 hours (100 days) Rayleigh distribution: Tintis 1.07. Normal distributionof Tdur: mean (0.42), variance (0.98).

  20. Experimental Results The compress operation saves index size No overlap between nodes B+ tree Dynamic array 80 70 60 index size (MB) 50 40 R-tree 30 SEB-tree 20 CSE-tree 10 0 100 300 500 700 intensity of upload Index size comparison

  21. Experimental Results Te Te Insert efforts Less node access than both SEB-tree and R-tree Most inserts occur in the area surrounded by the broken line Few node access in End Time Tree Ts Ts 140 120 number of node access 100 80 R-tree 60 SEB-tree 40 CSE-tree 20 0 100 300 500 700 intensity of upload Mean number of node access in one insertion

  22. Experimental Results Query performance 1600 1400 number of node access 1200 1000 800 R-tree 600 SEB-tree 400 CSE-tree 200 0 100 300 500 700 intensity of upload Mean number of node access in one query

  23. Conclusion A model simulating user behavior of upload data Based on stochastic process theory statistical analysis on the data collection in real world CSE-Tree Smaller index size Less node access in insertion Slightly more node access than SEB-tree in query

  24. Thanks! yuzheng@microsoft.com Q&A

Related


More Related Content