3-D Scene Analysis via Sequenced Predictions over Points and Regions

Xuehan Xiong, Daniel Munoz, Drew Bagnell, Martial Hebert

Problem: 3D Scene Understanding

Solution: Contextual Classification

Labels: Car, Pole, Ground, Trunk, Wire, Building, Veg

Classical Approach: Graphical Models

• Intractable inference
– Belief propagation
– Mean field
– MCMC

• Difficult to train
– Kulesza NIPS 2007
– Wainwright JMLR 2006
– Finley & Joachims ICML 2008

• Limited success
– Anguelov, et al. CVPR 2005
– Triebel, et al. IJCAI 2007
– Munoz, et al. CVPR 2009

Fig. from Anguelov, et al. CVPR 2005


Our Approach: Inference Machines

• Train an inference procedure, not a model.
– To encode spatial layout and long range relations
– Daume III 2006, Tu 2008, Bagnell 2010, Munoz 2010

• Inference via sequential prediction


Example features

• Point features
• Contextual features

Local features only

Labels: Car, Pole, Building, Veg, Ground, Wire

Round 1

Round 2

Round 3

Car

Veg

Create regions (Level 1, Level 2)

Region features (Pt level, Region level)

Point features (Region level, Point level)

With Regions

Learned Relationships

Neighbor contextual feature

Learned weights


Experiments

• 3 large-scale datasets
– CMU (26M), Moscow State (10M), Univ. Wash (10M)

• Multiple classes (4 to 8)
– car, building, veg, wire, fence, people, trunk, pole, ground, street sign

• Different sensors
– SICK (ground), ALTM 2050 (aerial), Velodyne (ground)

• Comparisons
– Graphical models, exemplar based

Quantitative Results

[1] Munoz CVPR 2009

[2] Shapovalov PCV 2010

[3] Lai RSS 2010 *

* Use additional semi-supervised data not leveraged by other methods.

CMU Dataset

Panels: Ours, Max Margin CRF [1]

Moscow State Dataset

Ours

Logistic regression

Conclusion

• Simple and fast approach for scene labeling
– No graphical model
– Labeling via 5x logistic regression predictions

• Support flexible contextual features
– Learning rich relationships

Thank you! And Questions?

• Acknowledgements
– US Army Research Laboratory, Collaborative Technology Alliance
– QinetiQ North America Robotics Fellowship

Slide Note

ICRA 2011


This presentation covers 3-D scene analysis using sequenced predictions over points and regions. It frames scene understanding as contextual classification and notes that the classical approach, graphical models, suffers from intractable inference, difficult training, and limited success. The proposed alternative trains an inference procedure, rather than a model, to encode spatial layout and long-range relations.

Uploaded by dagmawir on Feb 18, 2025


Presentation Transcript

Slide 1: 3-D Scene Analysis via Sequenced Predictions over Points and Regions. Xuehan Xiong, Daniel Munoz, Drew Bagnell, Martial Hebert.

Slide 2: Problem: 3D Scene Understanding.

Slide 3: Solution: Contextual Classification. Labels: Building, Wire, Pole, Veg, Trunk, Car, Ground.

Slides 4-6: Classical Approach: Graphical Models. Intractable inference (belief propagation, mean field, MCMC). Difficult to train (Kulesza NIPS 2007; Wainwright JMLR 2006; Finley & Joachims ICML 2008). Limited success (Anguelov, et al. CVPR 2005; Triebel, et al. IJCAI 2007; Munoz, et al. CVPR 2009). Fig. from Anguelov, et al. CVPR 2005.

Slide 11: Our Approach: Inference Machines. Train an inference procedure, not a model. To encode spatial layout and long range relations. Daume III 2006, Tu 2008, Bagnell 2010, Munoz 2010.

Slide 12: Our Approach: Inference Machines (continued). Inference via sequential prediction. Diagram: classifiers C0, C1, C2 with T/F branches, where F leads to Reject. E.g. Viola-Jones 2001.

Slide 13: Our Approach: Inference Machines (continued). Inference via sequential prediction. Diagram: classifiers C0, C1, C2 linked by context. Ours.

Slide 14: Example features. $x_i^{(0)}$: point features. $Y^{(0)} = \mathrm{LogReg}^{(0)}(X^{(0)})$

Slide 15: $x_i^{(0)}$: point features. $\arg\max(Y^{(0)})$

Slide 16: $x_i^{(0)}$, $x_i^{(1)}$: point features, contextual features (top, mid, bottom).
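The top/mid/bottom contextual features on slide 16 might be computed roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: it assumes a point's context is the mean predicted class distribution of nearby points, split into three vertical bands relative to that point. The function name `contextual_features`, the `radius` and `band` parameters, and the brute-force neighbor search are all hypothetical choices made for illustration.

```python
import numpy as np

def contextual_features(xyz, probs, radius=1.0, band=0.5):
    """Average neighbors' class distributions in bottom/mid/top bands.

    xyz:   (N, 3) point coordinates; z is assumed to point up.
    probs: (N, K) predicted class distribution per point.
    Returns an (N, 3*K) array laid out as [bottom | mid | top].
    """
    n, k = probs.shape
    out = np.zeros((n, 3 * k))
    for i in range(n):
        # Brute-force horizontal neighborhood (illustrative, O(N^2) overall).
        d_xy = np.linalg.norm(xyz[:, :2] - xyz[i, :2], axis=1)
        near = d_xy <= radius
        near[i] = False  # a point is not part of its own context
        dz = xyz[:, 2] - xyz[i, 2]
        bands = (dz < -band, np.abs(dz) <= band, dz > band)
        for b, in_band in enumerate(bands):
            mask = near & in_band
            if mask.any():
                out[i, b * k:(b + 1) * k] = probs[mask].mean(axis=0)
    return out

# Toy example: three points stacked roughly vertically, two classes.
xyz = np.array([[0.0, 0.0, 0.0], [0.1, 0.0, 1.0], [0.0, 0.1, -1.0]])
probs = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
ctx = contextual_features(xyz, probs)
```

Bands with no neighbors are left as zeros here; a real system would likely use a spatial index instead of the quadratic loop.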

Slides 17-19: $x_i^{(1)}$: point features, with top, mid, bottom context. $Y^{(1)} = \mathrm{LogReg}^{(1)}(X^{(1)})$, followed by $\arg\max(Y^{(1)})$.

Slide 20: $x_i^{(2)}$: point features, with top, mid, bottom context. $Y^{(2)} = \mathrm{LogReg}^{(2)}(X^{(2)})$
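The sequence of LogReg predictions on the slides above can be sketched as a loop in which each round trains a classifier on the point features plus context derived from the previous round's predictions. This is a hedged illustration only: it assumes a softmax (multinomial logistic) regression fitted by plain gradient descent, and a context that is simply the mean predicted distribution over a given neighbor list. The helper names `fit_softmax`, `neighbor_mean`, `train_sequence`, and `run_sequence` are hypothetical, and the slides do not say how training predictions are generated between rounds, so this sketch naively reuses in-sample predictions.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def fit_softmax(X, y, k, lr=0.5, steps=300):
    """Multinomial logistic regression by gradient descent (illustrative)."""
    Xb = np.hstack([X, np.ones((len(X), 1))])  # append a bias column
    W = np.zeros((Xb.shape[1], k))
    onehot = np.eye(k)[y]
    for _ in range(steps):
        W -= lr * Xb.T @ (softmax(Xb @ W) - onehot) / len(X)
    return W

def predict_proba(W, X):
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return softmax(Xb @ W)

def neighbor_mean(probs, neighbors):
    """Context: mean predicted distribution over each point's neighbors."""
    return np.stack([probs[idx].mean(axis=0) for idx in neighbors])

def train_sequence(X0, y, neighbors, k, rounds=3):
    """Train one model per round; X^(t+1) = [point features | context from Y^(t)]."""
    models, X = [], X0
    for _ in range(rounds):
        W = fit_softmax(X, y, k)
        Y = predict_proba(W, X)  # in-sample predictions (naive assumption)
        models.append(W)
        X = np.hstack([X0, neighbor_mean(Y, neighbors)])
    return models

def run_sequence(models, X0, neighbors):
    """Apply the trained rounds in order and return argmax of the last Y."""
    X = X0
    for W in models:
        Y = predict_proba(W, X)
        X = np.hstack([X0, neighbor_mean(Y, neighbors)])
    return Y.argmax(axis=1)

# Toy chain of 6 points with one scalar feature each.
X0 = np.array([[-1.0], [-1.0], [0.2], [1.0], [1.0], [1.0]])
y = np.array([0, 0, 0, 1, 1, 1])
neighbors = [[1], [0, 2], [1, 3], [2, 4], [3, 5], [4]]
models = train_sequence(X0, y, neighbors, k=2)
labels = run_sequence(models, X0, neighbors)
```

Round 0 sees only point features, so its weight matrix is smaller than those of later rounds, which also receive the neighbor context.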

Slide 21: Local features only. Labels: Wire, Building, Veg, Pole, Car, Ground.

Slide 22: Round 1.

Slide 23: Round 2.

Slide 24: Round 3. Labels: Veg, Car.

Slide 25: Create regions. Level 2, Level 1.

Slide 27: $x_j^{(0)}$, $x_j^{(1)}$, $x_j^{(2)}$: region features. Pt level, Region level.

Slide 28: Level 2, Level 1.

Slide 29: $x_i^{(2)}$, $x_i^{(3)}$: point features. Region level, Point level.
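Passing predictions between the point level and the region level, as these slides suggest, could look something like the sketch below. It assumes each point carries an integer region id (the slides say regions are created but not how), that a region's context is the mean of its points' predicted distributions, and that region predictions are copied back to member points. The names `points_to_regions` and `regions_to_points` are hypothetical.

```python
import numpy as np

def points_to_regions(point_probs, region_ids, n_regions):
    """Mean class distribution of the points inside each region."""
    k = point_probs.shape[1]
    sums = np.zeros((n_regions, k))
    np.add.at(sums, region_ids, point_probs)  # unbuffered scatter-add per region
    counts = np.bincount(region_ids, minlength=n_regions)
    return sums / np.maximum(counts, 1)[:, None]  # empty regions stay all-zero

def regions_to_points(region_probs, region_ids):
    """Give every point the distribution predicted for its region."""
    return region_probs[region_ids]

# Toy example: points 0 and 1 form region 0, point 2 forms region 1.
point_probs = np.array([[0.9, 0.1], [0.7, 0.3], [0.2, 0.8]])
region_ids = np.array([0, 0, 1])
region_ctx = points_to_regions(point_probs, region_ids, n_regions=2)
point_ctx = regions_to_points(region_ctx, region_ids)
```

In a full pipeline these arrays would be appended to the region or point features before the next round of prediction.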

Slide 30: With Regions.

Slides 31-32: Learned Relationships. $x_i$: point features, with top, mid, bottom context. Neighbor contextual feature. Learned weights.

Slide 33: Experiments. 3 large-scale datasets: CMU (26M), Moscow State (10M), Univ. Wash (10M). Multiple classes (4 to 8): car, building, veg, wire, fence, people, trunk, pole, ground, street sign. Different sensors: SICK (ground), ALTM 2050 (aerial), Velodyne (ground). Comparisons: graphical models, exemplar based.

Slide 34: Quantitative Results. [Chart: Average F1 Score (axis from 0.3 to 0.9) on CMU [1], Moscow A [2], Moscow B [2], UWash [3]; series: Ours, Related Work, LogReg.] [1] Munoz CVPR 2009. [2] Shapovalov PCV 2010. [3] Lai RSS 2010 *. * Use additional semi-supervised data not leveraged by other methods.
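For reference, an "Average F1 Score" like the one on this chart could be computed as below. The slide does not say how the average is taken, so this sketch assumes an unweighted mean of per-class F1 (a macro average); the function name `average_f1` is hypothetical.

```python
import numpy as np

def average_f1(y_true, y_pred, n_classes):
    """Unweighted mean of per-class F1 (macro average; an assumption)."""
    scores = []
    for c in range(n_classes):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); define it as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return float(np.mean(scores))

y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0])
score = average_f1(y_true, y_pred, n_classes=3)
```

A macro average weights rare classes such as wire or pole as heavily as ground, which matters on point clouds where class sizes are very uneven.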

Slides 35-37: CMU Dataset. Panels: Ours, Max Margin CRF [1]. [1] Munoz, et al. CVPR 2009.

Slide 38: Moscow State Dataset. Panels: Ours, Logistic regression.

Slide 39: Conclusion. Simple and fast approach for scene labeling: no graphical model; labeling via 5x logistic regression predictions (diagram: C0, C1, C2 linked by context). Support flexible contextual features: learning rich relationships.

Slide 40: Thank you! And Questions? Acknowledgements: US Army Research Laboratory, Collaborative Technology Alliance; QinetiQ North America Robotics Fellowship.
