Advanced Techniques in 3D Scene Analysis for Spatial Understanding

 
3-D Scene Analysis via Sequenced Predictions over Points and Regions

Xuehan Xiong, Daniel Munoz, Drew Bagnell, Martial Hebert
 
Slide Note

ICRA 2011


This presentation describes 3-D scene analysis via sequenced predictions over points and regions. It treats scene labeling as contextual classification and replaces classical graphical models, whose inference is intractable and whose training is difficult, with inference machines: the inference procedure itself is trained, rather than a model, so that it can encode spatial layout and long-range relations.
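The idea of labeling by a sequence of predictions can be sketched in a few lines. What follows is a minimal illustration, not the authors' implementation: the toy 1-D data, the helper names (`train_logreg`, `build_inputs`, `train_sequence`, `run_sequence`), the neighbor-average context, and the choice of three rounds are all assumptions made for the example. Each round trains a logistic regression on the local features plus the neighbors' label distributions from the previous round.

```python
import math
import random

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_proba(W, x):
    # One weight row per class; x already carries a trailing bias term.
    return softmax([sum(w * v for w, v in zip(row, x)) for row in W])

def train_logreg(X, y, n_classes, epochs=200, lr=0.3):
    """Multiclass logistic regression fitted by batch gradient descent."""
    dim = len(X[0])
    W = [[0.0] * dim for _ in range(n_classes)]
    for _ in range(epochs):
        grad = [[0.0] * dim for _ in range(n_classes)]
        for x, label in zip(X, y):
            p = predict_proba(W, x)
            for c in range(n_classes):
                err = p[c] - (1.0 if c == label else 0.0)
                for d in range(dim):
                    grad[c][d] += err * x[d]
        for c in range(n_classes):
            for d in range(dim):
                W[c][d] -= lr * grad[c][d] / len(X)
    return W

def context(probs, neighbors, i):
    """Hypothetical context: mean of the neighbors' previous-round distributions."""
    k = len(probs[0])
    nbrs = neighbors[i]
    if not nbrs:
        return [1.0 / k] * k
    return [sum(probs[j][c] for j in nbrs) / len(nbrs) for c in range(k)]

def build_inputs(local, probs, neighbors):
    rows = []
    for i, feats in enumerate(local):
        ctx = context(probs, neighbors, i) if probs is not None else []
        rows.append(list(feats) + ctx + [1.0])  # trailing 1.0 is the bias
    return rows

def train_sequence(local, y, neighbors, n_classes, rounds=3):
    """Train one classifier per round; round r sees round r-1 predictions."""
    models, probs = [], None
    for _ in range(rounds):
        X = build_inputs(local, probs, neighbors)
        W = train_logreg(X, y, n_classes)
        models.append(W)
        # Simplification: reuses training-set predictions as context.
        probs = [predict_proba(W, x) for x in X]
    return models

def run_sequence(models, local, neighbors):
    probs = None
    for W in models:
        X = build_inputs(local, probs, neighbors)
        probs = [predict_proba(W, x) for x in X]
    return probs

# Toy data (assumed): 20 points on a line, one noisy feature, two classes.
random.seed(0)
y = [0] * 10 + [1] * 10
local = [[label + random.gauss(0.0, 0.8)] for label in y]
neighbors = [[j for j in (i - 1, i + 1) if 0 <= j < len(y)] for i in range(len(y))]

models = train_sequence(local, y, neighbors, n_classes=2)
probs = run_sequence(models, local, neighbors)
labels = [max(range(2), key=lambda c: p[c]) for p in probs]
```

The first classifier sees only the local feature and a bias, while later ones also see one context value per class, so their weight rows are longer. Feeding a classifier the predictions made on its own training data is a shortcut taken here for brevity; generating the context from held-out predictions is a common precaution against overfitting.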

  • Scene Analysis
  • Spatial Understanding
  • 3D Technology
  • Machine Inference
  • Graphical Models

Uploaded on Jul 02, 2024 | 0 Views


Presentation Transcript


  1. 3-D Scene Analysis via Sequenced Predictions over Points and Regions. Xuehan Xiong, Daniel Munoz, Drew Bagnell, Martial Hebert.

  2. Problem: 3D Scene Understanding.

  3. Solution: Contextual Classification. Labels: Building, Wire, Pole, Veg, Trunk, Car, Ground.

  4. Classical Approach: Graphical Models. Intractable inference (belief propagation, mean field, MCMC). Difficult to train (Kulesza NIPS 2007; Wainwright JMLR 2006; Finley & Joachims ICML 2008). Limited success (Anguelov et al. CVPR 2005; Triebel et al. IJCAI 2007; Munoz et al. CVPR 2009). Fig. from Anguelov et al. CVPR 2005.

  11. Our Approach: Inference Machines. Train an inference procedure, not a model, to encode spatial layout and long-range relations (Daume III 2006, Tu 2008, Bagnell 2010, Munoz 2010).

  12. Our Approach: Inference Machines. Inference via sequential prediction, e.g. Viola-Jones 2001: classifiers C0, C1, C2 applied in sequence, each passing a sample on (T) or rejecting it (F).

  13. Our Approach: Inference Machines. Inference via sequential prediction, ours: classifiers C0, C1, C2 applied in sequence, with context passed from each one to the next.

  14. Example features. $x_i^{(0)}$: point features. $Y^{(0)} = \mathrm{LogReg}^{(0)}(X^{(0)})$.

  15. $x_i^{(0)}$: point features. $\arg\max(Y^{(0)})$.

  16. Point features $x_i^{(0)}$, $x_i^{(1)}$. Contextual features: top, mid, bottom.

  17. Contextual features: top, mid, bottom. $x_i^{(1)}$: point features. $Y^{(1)} = \mathrm{LogReg}^{(1)}(X^{(1)})$.

  18. Contextual features: top, mid, bottom. $x_i^{(1)}$: point features. $Y^{(1)} = \mathrm{LogReg}^{(1)}(X^{(1)})$. $\arg\max(Y^{(1)})$.

  19. Contextual features: top, mid, bottom. $x_i^{(1)}$: point features. $Y^{(1)} = \mathrm{LogReg}^{(1)}(X^{(1)})$.

  20. Contextual features: top, mid, bottom. $x_i^{(2)}$: point features. $Y^{(2)} = \mathrm{LogReg}^{(2)}(X^{(2)})$.

  21. Local features only. Labels: Wire, Building, Veg, Pole, Car, Ground.

  22. Round 1.

  23. Round 2.

  24. Round 3. Labels: Veg, Car.

  25. Create regions: Level 2, Level 1.

  27. Region features $x_j^{(0)}$, $x_j^{(1)}$, $x_j^{(2)}$. Point level, region level.

  28. Level 2, Level 1.

  29. Point features $x_i^{(2)}$, $x_i^{(3)}$. Region level, point level.

  30. With Regions.

  31. Learned Relationships. $x_i$: point features. Neighbor contextual feature (top, mid, bottom). Learned weights.

  33. Experiments. 3 large-scale datasets: CMU (26M), Moscow State (10M), Univ. Wash (10M). Multiple classes (4 to 8): car, building, veg, wire, fence, people, trunk, pole, ground, street sign. Different sensors: SICK (ground), ALTM 2050 (aerial), Velodyne (ground). Comparisons: graphical models, exemplar based.

  34. Quantitative Results. [Chart: average F1 score (axis from 0.3 to 0.9) on CMU [1], Moscow A [2], Moscow B [2], and UWash [3]; series: Ours, Related Work, LogReg.] [1] Munoz CVPR 2009. [2] Shapovalov PCV 2010. [3] Lai RSS 2010 *. * Uses additional semi-supervised data not leveraged by other methods.

  35. CMU Dataset: Ours vs. Max Margin CRF [1]. [1] Munoz et al. CVPR 2009.

  38. Moscow State Dataset: Ours vs. logistic regression.

  39. Conclusion. Simple and fast approach for scene labeling: no graphical model; labeling via 5x logistic regression predictions (classifiers C0, C1, C2 in sequence, with context passed between them). Supports flexible contextual features. Learns rich relationships.

  40. Thank you! And Questions? Acknowledgements: US Army Research Laboratory, Collaborative Technology Alliance; QinetiQ North America Robotics Fellowship.
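The slides label the neighbor contextual feature with "top", "mid", and "bottom" but the transcript does not preserve how those bins are defined. The sketch below shows one plausible reading, under stated assumptions: neighbors are the points within a horizontal `radius`, they are split into three bins by vertical offset using a `band` threshold, and the feature is the average predicted label distribution per bin. The function name, both parameters, and the brute-force neighbor search are hypothetical choices for illustration.

```python
import math

def contextual_feature(i, points, probs, radius=1.0, band=0.5):
    """Concatenate mean neighbor label distributions for top, mid, bottom bins.

    points: list of (x, y, z) tuples; probs: one class distribution per point.
    radius and band are assumed parameters, not values from the slides.
    """
    k = len(probs[0])
    sums = {"top": [0.0] * k, "mid": [0.0] * k, "bottom": [0.0] * k}
    counts = {"top": 0, "mid": 0, "bottom": 0}
    xi, yi, zi = points[i]
    for j, (xj, yj, zj) in enumerate(points):
        # Skip the point itself and anything outside the horizontal radius.
        if j == i or math.hypot(xj - xi, yj - yi) > radius:
            continue
        dz = zj - zi
        name = "top" if dz > band else "bottom" if dz < -band else "mid"
        counts[name] += 1
        for c in range(k):
            sums[name][c] += probs[j][c]
    feature = []
    for name in ("top", "mid", "bottom"):
        n = counts[name]
        # An empty bin contributes zeros so the feature length stays fixed.
        feature += [s / n if n else 0.0 for s in sums[name]]
    return feature

# Toy scene (assumed): one neighbor above, one below, one level, one far away.
points = [(0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.2, 0.0, -1.0),
          (0.1, 0.0, 0.1), (5.0, 5.0, 0.0)]
probs = [[0.5, 0.5], [1.0, 0.0], [0.0, 1.0], [0.2, 0.8], [1.0, 0.0]]

near = contextual_feature(0, points, probs)  # [1.0, 0.0, 0.2, 0.8, 0.0, 1.0]
far = contextual_feature(4, points, probs)   # [0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
```

With K classes the feature has 3K entries, which would be appended to a point's own features before the next round of prediction. The linear scan over all points is only practical for small examples; a spatial index would be the usual choice for point clouds with millions of points.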
