Advanced Techniques in 3D Scene Analysis for Spatial Understanding
This research on 3D scene analysis uses sequenced predictions over points and regions for spatial understanding. The approach is contextual classification: rather than relying on classical graphical models, whose inference is intractable and whose training is difficult, it trains an inference procedure directly (an inference machine). The aim is to encode spatial layout and long-range relations.
Presentation Transcript
3-D Scene Analysis via Sequenced Predictions over Points and Regions. Xuehan Xiong, Daniel Munoz, Drew Bagnell, Martial Hebert.
Solution: Contextual Classification. Class labels: Building, Wire, Pole, Veg, Trunk, Car, Ground.
Classical Approach: Graphical Models. Intractable inference: belief propagation, mean field, MCMC. Difficult to train: Kulesza NIPS 2007; Wainwright JMLR 2006; Finley & Joachims ICML 2008. Limited success: Anguelov et al. CVPR 2005; Triebel et al. IJCAI 2007; Munoz et al. CVPR 2009. (Fig. from Anguelov et al. CVPR 2005.)
Our Approach: Inference Machines. Train an inference procedure, not a model, to encode spatial layout and long-range relations (Daume III 2006; Tu 2008; Bagnell 2010; Munoz 2010).
Inference via sequential prediction, e.g. Viola-Jones 2001: a cascade C0 -> C1 -> C2 in which each stage either passes the input on (T) or rejects it (F).
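The rejection cascade on this slide can be sketched roughly as below. This is only an illustration of the T/F control flow: the stage functions and their thresholds are hypothetical and are not taken from Viola-Jones or from the talk.

```python
from typing import Callable, Sequence

# Hypothetical stage type: returns True (T) to continue, False (F) to reject.
Stage = Callable[[float], bool]


def run_cascade(x: float, stages: Sequence[Stage]) -> bool:
    """Return True only if every stage accepts x; stop at the first rejection."""
    for stage in stages:
        if not stage(x):
            return False  # F branch: reject immediately, later stages never run
    return True  # x survived C0, C1, and C2


# Illustrative stages with made-up thresholds (C0, C1, C2).
stages = [lambda x: x > 0.1, lambda x: x > 0.5, lambda x: x > 0.8]
```

Here `run_cascade(0.6, stages)` is rejected at the third stage, while `run_cascade(0.9, stages)` passes all three.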
Inference via sequential prediction, ours: a sequence C0 -> C1 -> C2 in which each stage passes context to the next.
Example. $x_i^{(0)}$: point features. $Y^{(0)} = \mathrm{LogReg}^{(0)}(X^{(0)})$.
$x_i^{(0)}$: point features. Labels: $\arg\max(Y^{(0)})$.
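A minimal sketch of the round-0 step, assuming "LogReg" means a multinomial (softmax) logistic regression. The weights, biases, and function names here are hypothetical placeholders, not values or code from the talk.

```python
import math
from typing import List


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over class scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def logreg_predict(x: List[float], weights: List[List[float]], biases: List[float]) -> List[float]:
    """Class probabilities for one point: softmax(W x + b)."""
    scores = [sum(w_k * x_k for w_k, x_k in zip(w, x)) + b
              for w, b in zip(weights, biases)]
    return softmax(scores)


def argmax(probs: List[float]) -> int:
    """Index of the most probable class."""
    return max(range(len(probs)), key=probs.__getitem__)


# Made-up two-class example: one weight row per class.
probs = logreg_predict([1.0, 0.0], [[2.0, 0.0], [0.0, 2.0]], [0.0, 0.0])
label = argmax(probs)
```

The probability vector plays the role of one row of Y^(0), and `argmax` turns it into a hard label.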
$x_i^{(0)}$: point features. $x_i^{(1)}$: point features and contextual features (top, mid, bottom).
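One way such contextual features could be built is sketched below. It assumes, as a guess at what the top/mid/bottom diagram depicts, that the context for point i is the mean predicted class distribution of nearby points in three vertical bins. The `radius` and `band` parameters and both function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]  # (x, y, z)


def contextual_features(
    i: int,
    points: Sequence[Point],
    probs: Sequence[Sequence[float]],
    radius: float = 1.0,  # hypothetical horizontal search radius
    band: float = 0.5,    # hypothetical half-height of the "mid" bin
) -> List[float]:
    """Concatenate mean neighbor class distributions for top, mid, bottom bins."""
    n_classes = len(probs[0])
    sums = [[0.0] * n_classes for _ in range(3)]  # rows: top, mid, bottom
    counts = [0, 0, 0]
    xi, yi, zi = points[i]
    for j, (xj, yj, zj) in enumerate(points):
        if j == i:
            continue
        if (xj - xi) ** 2 + (yj - yi) ** 2 > radius ** 2:
            continue  # outside the horizontal neighborhood
        dz = zj - zi
        b = 0 if dz > band else (2 if dz < -band else 1)
        counts[b] += 1
        for k in range(n_classes):
            sums[b][k] += probs[j][k]
    feats: List[float] = []
    for b in range(3):
        # Empty bins contribute zeros.
        feats.extend(s / counts[b] if counts[b] else 0.0 for s in sums[b])
    return feats


def augment(i, points, point_feats, probs):
    """Next-round features: original point features followed by context."""
    return list(point_feats[i]) + contextual_features(i, points, probs)
```

With C classes this appends 3C numbers to each point's feature vector, so the next round's classifier can weight, for example, "ground below me" differently from "ground above me".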
$x_i^{(1)}$: point features, with top, mid, and bottom context. $Y^{(1)} = \mathrm{LogReg}^{(1)}(X^{(1)})$.
$x_i^{(1)}$: point features, with top, mid, and bottom context. $Y^{(1)} = \mathrm{LogReg}^{(1)}(X^{(1)})$. Labels: $\arg\max(Y^{(1)})$.
$x_i^{(2)}$: point features, with top, mid, and bottom context. $Y^{(2)} = \mathrm{LogReg}^{(2)}(X^{(2)})$.
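The round-by-round procedure on these slides can be sketched as a loop at prediction time. This sketch assumes the per-round classifiers are already trained and are supplied as plain callables; `run_rounds`, `mean_of_others`, and the toy stage functions are hypothetical stand-ins, not the authors' code.

```python
from typing import Callable, List, Sequence, Tuple

Probs = List[float]
Predictor = Callable[[List[float]], Probs]                 # hypothetical stage classifier
ContextFn = Callable[[int, Sequence[Probs]], List[float]]  # hypothetical context extractor


def run_rounds(
    point_feats: Sequence[Sequence[float]],
    predictors: Sequence[Predictor],
    context_fn: ContextFn,
) -> Tuple[List[int], List[Probs]]:
    """Apply the stage classifiers in order, feeding each one the previous predictions."""
    # Round 0 sees point features only.
    probs = [predictors[0](list(x)) for x in point_feats]
    for predict in predictors[1:]:
        # Later rounds see point features plus context from the previous round.
        augmented = [list(x) + context_fn(i, probs) for i, x in enumerate(point_feats)]
        probs = [predict(x) for x in augmented]
    labels = [max(range(len(p)), key=p.__getitem__) for p in probs]
    return labels, probs


# Toy usage with made-up stages (two classes, one feature per point).
def mean_of_others(i, probs):
    others = [p for j, p in enumerate(probs) if j != i]
    return [sum(col) / len(others) for col in zip(*others)]


def stage0(x):
    return [x[0], 1.0 - x[0]]


def stage1(x):
    p = 0.5 * x[0] + 0.5 * x[1]  # blend own feature with neighbors' class-0 belief
    return [p, 1.0 - p]


labels, probs = run_rounds([[1.0], [1.0], [0.4]], [stage0, stage1], mean_of_others)
```

In this toy run the third point would be labeled class 1 from its own feature alone, but it switches to class 0 once the second round sees that both of its neighbors are confidently class 0.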
Local features only. Legend: Wire, Building, Veg, Pole, Car, Ground.
Round 1
Round 2
Round 3 (Veg, Car)
Create regions: Level 2, Level 1.
Region features $x_j^{(0)}$, $x_j^{(1)}$, $x_j^{(2)}$: point level, region level.
Level 2, Level 1.
Point features $x_i^{(2)}$, $x_i^{(3)}$: region level, point level.
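A rough sketch of how predictions might move between the point level and the region level. It assumes each point belongs to exactly one region, that region context is the mean of its points' class distributions, and that the region's distribution is then handed back to its points. The slides do not spell out the actual features, so both functions are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence


def pool_to_regions(
    point_probs: Sequence[Sequence[float]],
    region_of: Sequence[Hashable],
) -> Dict[Hashable, List[float]]:
    """Average the class distributions of the points in each region."""
    sums: Dict[Hashable, List[float]] = {}
    counts: Dict[Hashable, int] = defaultdict(int)
    for p, r in zip(point_probs, region_of):
        if r not in sums:
            sums[r] = [0.0] * len(p)
        for k, v in enumerate(p):
            sums[r][k] += v
        counts[r] += 1
    return {r: [s / counts[r] for s in sums[r]] for r in sums}


def broadcast_to_points(
    region_probs: Dict[Hashable, List[float]],
    region_of: Sequence[Hashable],
) -> List[List[float]]:
    """Give each point its region's distribution as a contextual feature."""
    return [list(region_probs[r]) for r in region_of]


# Toy usage: three points, the first two share region 0.
region_probs = pool_to_regions([[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]], [0, 0, 1])
point_context = broadcast_to_points(region_probs, [0, 0, 1])
```

Alternating these two steps lets a point-level round use region-level predictions as context, and the reverse.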
With Regions
Learned Relationships. $x_i$: point features; neighbor contextual features (top, mid, bottom); learned weights.
Experiments. Three large-scale datasets: CMU (26M), Moscow State (10M), Univ. Wash (10M). Multiple classes (4 to 8): car, building, veg, wire, fence, people, trunk, pole, ground, street sign. Different sensors: SICK (ground), ALTM 2050 (aerial), Velodyne (ground). Comparisons: graphical models, exemplar based.
Quantitative Results. [Bar chart: average F1 score (axis from 0.3 to 0.9) on CMU [1], Moscow A [2], Moscow B [2], and UWash [3], comparing Ours, Related Work, and LogReg.] [1] Munoz CVPR 2009. [2] Shapovalov PCV 2010. [3] Lai RSS 2010. * Uses additional semi-supervised data not leveraged by other methods.
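For reference, an average F1 score can be computed as sketched below. This assumes the metric is the per-class F1 averaged with equal weight over classes (macro F1), which the slide does not state explicitly. The labels and predictions are made-up toy data, not results from the talk.

```python
from typing import Dict, Hashable, Sequence


def f1_per_class(
    y_true: Sequence[Hashable],
    y_pred: Sequence[Hashable],
    labels: Sequence[Hashable],
) -> Dict[Hashable, float]:
    """F1 = 2*TP / (2*TP + FP + FN) for each class."""
    scores: Dict[Hashable, float] = {}
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        scores[c] = 2 * tp / denom if denom else 0.0  # class absent everywhere: 0
    return scores


def average_f1(y_true, y_pred, labels) -> float:
    """Unweighted mean of the per-class F1 scores."""
    scores = f1_per_class(y_true, y_pred, labels)
    return sum(scores.values()) / len(labels)


# Toy data only.
y_true = ["car", "car", "veg", "veg"]
y_pred = ["car", "veg", "veg", "veg"]
score = average_f1(y_true, y_pred, ["car", "veg"])
```

Averaging per class rather than per point keeps rare classes such as wire or pole from being swamped by ground and vegetation.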
CMU Dataset: Ours vs. Max Margin CRF [1]. [1] Munoz et al. CVPR 2009.
Moscow State Dataset: Ours vs. logistic regression.
Conclusion. Simple and fast approach for scene labeling. No graphical model. Labeling via 5x logistic regression predictions (C0 -> C1 -> C2, with context passed between stages). Supports flexible contextual features. Learns rich relationships.
Thank you! Questions? Acknowledgements: US Army Research Laboratory, Collaborative Technology Alliance; QinetiQ North America Robotics Fellowship.