RIDE: Reversal Invariant Descriptor Enhancement

ICCV 2015
Speaker: Lingxi Xie
Authors: Lingxi Xie, Jingdong Wang, Weiyao Lin, Bo Zhang, Qi Tian
State Key Laboratory of Intelligent Technology and Systems
Department of Computer Science and Technology
Tsinghua University
http://www.tsinghua.edu.cn
Outline
Introduction
The Bag-of-Feature Model
The RIDE Algorithm
Motivation
Towards Reversal Invariance
Experimental Results
Conclusions
2/24/2025
ICCV 2015 - Presentation
Introduction
Image Classification
 
The Bag-of-Feature Model
Pipeline: Raw Image -> Image Descriptors -> Visual Vocabulary -> Compact Feature Codes -> Image-level Vector
Gradient-based Local Descriptors:
SIFT [Lowe, IJCV04]
HOG [Dalal, CVPR05]
LCS [Perronnin, ECCV10]
Clustering Methods:
K-Means
Hierarchical K-Means [Nister, CVPR06]
Approximate K-Means [Philbin, CVPR07]
Hard/Soft/Sparse Coding methods:
Vector Quantization
LLC Encoding [Wang, CVPR10]
Fisher Vector Encoding [Perronnin, ECCV10]
Spatial Pooling:
Sum Pooling/Max Pooling,
Spatial Pyramid Matching [Lazebnik, CVPR06]
Geometric Phrase Pooling [Xie, ACMMM12]
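The pipeline on this slide can be illustrated with a toy sketch. It assumes the simplest options listed (hard vector quantization and sum pooling); the function names and the tiny 2-D "descriptors" are hypothetical, and a real system would use SIFT-like descriptors with a vocabulary learned by K-Means.

```python
# Toy Bag-of-Features sketch: vector quantization + sum pooling.
# All names and data are illustrative assumptions, not the authors' code.

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def quantize(descriptor, vocabulary):
    """Hard coding: index of the nearest visual word."""
    return min(range(len(vocabulary)),
               key=lambda k: squared_distance(descriptor, vocabulary[k]))

def encode_image(descriptors, vocabulary):
    """Sum pooling: histogram of visual-word assignments for one image."""
    histogram = [0] * len(vocabulary)
    for d in descriptors:
        histogram[quantize(d, vocabulary)] += 1
    return histogram

# A 3-word vocabulary (assumed to come from clustering) and 4 local descriptors.
vocabulary = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
descriptors = [(0.1, 0.0), (0.9, 0.1), (0.8, -0.1), (0.1, 0.9)]
image_vector = encode_image(descriptors, vocabulary)
print(image_vector)  # [1, 2, 1]
```

Swapping `quantize` for a soft or Fisher-vector encoder changes only the coding stage; the descriptor and pooling stages keep the same interface.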
The RIDE Algorithm: Motivation
Image Matching: Reversal Copy
[Figure: What We Want vs. What SIFT Does]
Image Matching: Reversal Objects
[Figure: What We Want vs. What SIFT Does]
Image Retrieval: Settings
 
Aircraft-100 Dataset
100 Aircraft Models, 100 Samples in each Model
ALL Images are Manually Oriented to the Right
Why Aircraft?
The Orientation of an Aircraft is Easy to Judge!
[Figure: Orientation?]
Image Retrieval: Sample Images
[Figure: sample images]
Image Retrieval: Original Image
QUERY
Model: BAE-125
Mean AP: 0.4143
Mean Dist.: 0.83
Mean TP Dist.: 0.34
Self-Ranking: #1
First FP: #18
RESULT
#1: BAE-125 (dist. 0.00)
#2: BAE-125 (dist. 0.22)
#3: BAE-125 (dist. 0.23)
#4: BAE-125 (dist. 0.23)
#5: BAE-125 (dist. 0.24)
#6: BAE-125 (dist. 0.25)
Image Retrieval: Reversed Image
QUERY
Model: BAE-125
Mean AP: 0.0025
Mean Dist.: 1.09
Mean TP Dist.: 1.06
Self-Ranking: #514
First TP: #388
RESULT
#1: 707-320 (dist. 0.81)
#2: DC-3 (dist. 0.83)
#3: Cessna-560 (dist. 0.84)
#4: MD-80 (dist. 0.84)
#5: 737-400 (dist. 0.84)
#6: 747-100 (dist. 0.85)
Image Retrieval: Comparison
Metric          Original Query   Reversed Query
Model           BAE-125          BAE-125
Mean AP         0.4143           0.0025
Mean Dist.      0.83             1.09
Mean TP Dist.   0.34             1.06
Self-Ranking    #1               #514
First FP/TP     FP: #18          TP: #388
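For readers unfamiliar with the Mean AP figure quoted on these slides, here is a minimal sketch of average precision for a single ranked list. It assumes the standard definition (precision averaged over the ranks of the true positives); the slides do not say which variant was used, and the two toy rankings are hypothetical.

```python
def average_precision(relevance, num_relevant):
    """AP of one ranked list.

    relevance[i] is True when the item at rank i + 1 is a true positive;
    num_relevant is the total number of true positives in the database.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / num_relevant if num_relevant else 0.0

# Toy rankings with 3 relevant items in total (hypothetical data).
good = [True, True, False, True]    # true positives ranked early
poor = [False, False, True, False]  # true positives ranked late or missing
ap_good = average_precision(good, num_relevant=3)  # (1/1 + 2/2 + 3/4) / 3
ap_poor = average_precision(poor, num_relevant=3)  # (1/3) / 3
print(round(ap_good, 4), round(ap_poor, 4))
```

Mean AP is then the mean of this value over all queries. A ranking in which true positives only start to appear several hundred positions down, as for the reversed query, drives the value toward zero.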
What is Observed?
After an Image is Reversed ...
Handcrafted Descriptors cannot be Matched
Feature Representations are Completely Different
Classifying a Dataset with Reversed Objects
Reversed Objects are Different Prototypes
Fewer Training Samples for Each Prototype
Inferior Recognition Accuracy
The RIDE Algorithm: Towards Reversal Invariance
What Happened after Reversal?
[Figure: original vs. reversed SIFT descriptor. Orientation bin labels read 2, 1, 0, 7, 6, 5, 4, 3 (Original Index) and 2', 3', 4', 5', 6', 7', 0', 1' (Reversed Index).]
Reversal Invariance: Formulation
[Equations not captured in the slide text. Figure: SIFT descriptor with grid cells indexed 0-15 and orientation bins indexed 0-7.]
Summary
Why Reversal Invariance?
Reversal Generates Different Prototypes
How to Obtain Reversal Invariance?
Compute Orientation, Get the Maximum
Extra Computational Costs
Cheap: Little Time, No Memory
RIDE can be Applied to Other Descriptors!
Experimental Results
Datasets
Fine-Grained Object Recognition
Oxford Pet-37 (37 Cats & Dogs, 7390 Images)
Aircraft-100 (100 Aircraft Models, 10000 Images)
Stanford Dog-120 (120 Dogs, 20780 Images)
Caltech-UCSD Bird-200 (200 Birds, 11788 Images)
Scene Classification
LandUse-21 (21 Land Uses, 2100 Images)
MIT Indoor-67 (67 Indoor Scenes, 15620 Images)
SUN-397 (397 In/Out-door Scenes, 108754 Images)
Settings
The BoVW Model
Images are Resized, 300 Pixels on Longer Axis
Various Descriptors, Step = 6, Window Size = 12
PCA Reduced to 64 (Color-SIFT to 128)
GMM with 32 Components
Fisher Vector Encoding
Spatial Pyramid with 3 Horizontal Stripes
Linear SVM with C = 10
Stronger Features can be Used
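To make the sampling settings concrete, the sketch below counts how many descriptor windows a dense grid yields. It assumes an aspect-ratio-preserving resize with rounding and windows that must lie fully inside the image; neither detail is stated on the slide, and the function names are hypothetical.

```python
def resize_longer_axis(width, height, target=300):
    """Scale so the longer side equals `target` pixels (assumed to keep aspect ratio)."""
    scale = target / max(width, height)
    return max(1, round(width * scale)), max(1, round(height * scale))

def windows_along(length, window=12, step=6):
    """Number of window positions along one axis, windows fully inside the image."""
    if length < window:
        return 0
    return (length - window) // step + 1

def dense_grid_size(width, height):
    """(columns, rows) of the dense descriptor grid after resizing."""
    w, h = resize_longer_axis(width, height)
    return windows_along(w), windows_along(h)

cols, rows = dense_grid_size(640, 480)  # resized to 300 x 225
print(cols, rows, cols * rows)          # 49 36 1764
```

Under these assumptions a 640 x 480 image yields 1764 descriptors per descriptor type, each of which is then PCA-reduced and Fisher-encoded.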
Models
Four Different Models
Original Descriptors (ORIG)
RIDE on Original Descriptors (RIDE)
Original Descriptors with Augmentation (AUGM)
RIDE with Doubled Codebook Size (RIDEx2)
Why Use RIDEx2?
Comparable Computational Costs with AUGM
Fair Comparison
Pet-37 Performance
Descriptor   ORIG    RIDE    AUGM    RIDEx2
SIFT         37.92   42.28   42.24   45.61
LCS          43.25   44.27   45.12   46.83
Fused        52.06   54.69   54.67   57.51
RGB-SIFT     44.90   47.35   46.98   49.53
OPP-SIFT     46.53   49.01   48.72   51.19
Aircraft-100 Performance
Descriptor   ORIG    RIDE    AUGM    RIDEx2
SIFT         53.13   57.82   57.16   60.14
LCS          41.82   42.86   43.13   44.81
Fused        57.36   61.27   60.59   63.62
RGB-SIFT     57.89   63.09   62.48   65.11
OPP-SIFT     47.06   53.12   51.39   55.79
Flower-102 Performance
Descriptor   ORIG    RIDE    AUGM    RIDEx2
SIFT         53.68   59.12   58.01   61.09
LCS          73.47   75.30   75.88   77.40
Fused        76.96   80.51   79.49   82.14
RGB-SIFT     71.52   74.97   74.18   77.10
OPP-SIFT     76.12   79.68   78.83   81.69
Bird-200 Performance
Descriptor   ORIG    RIDE    AUGM    RIDEx2
SIFT         25.77   32.14   31.60   34.07
LCS          36.18   38.50   38.97   40.16
Fused        38.11   44.73   43.98   46.38
RGB-SIFT     31.36   39.16   39.79   41.73
OPP-SIFT     35.40   42.18   41.72   44.30
Time & Memory Costs (on Bird-200)
Stage            ORIG      RIDE      AUGM      RIDEx2
Descriptors      2.27Hrs   2.29Hrs   2.30Hrs   2.29Hrs
Codebook         0.13Hrs   0.13Hrs   0.13Hrs   0.27Hrs
Encoding         0.78Hrs   0.78Hrs   1.56Hrs   1.28Hrs
Classification   1.21Hrs   1.21Hrs   2.46Hrs   2.42Hrs
(RAM)            3.71GB    3.71GB    7.52GB    7.51GB
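As a quick arithmetic check on the Time & Memory Costs figures for Bird-200, the snippet below sums the four stage times per model. Treating the stages as sequential, so that their sum is the total running time, is an assumption; the numbers themselves are copied from the slide.

```python
# Hours per stage (descriptors, codebook, encoding, classification),
# copied from the Time & Memory Costs slide for Bird-200.
stage_hours = {
    "ORIG":   [2.27, 0.13, 0.78, 1.21],
    "RIDE":   [2.29, 0.13, 0.78, 1.21],
    "AUGM":   [2.30, 0.13, 1.56, 2.46],
    "RIDEx2": [2.29, 0.27, 1.28, 2.42],
}

# Assumption: stages run one after another, so the total is a plain sum.
total_hours = {model: round(sum(hours), 2) for model, hours in stage_hours.items()}

for model, total in total_hours.items():
    print(f"{model:7s} {total:.2f} Hrs")
```

The totals come to 4.39 Hrs (ORIG), 4.41 Hrs (RIDE), 6.45 Hrs (AUGM) and 6.26 Hrs (RIDEx2): RIDE adds about 0.02 Hrs over ORIG, while RIDEx2 and AUGM land in the same range.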
Comparison to the State-of-the-Art
Compared with
[DB]: the Paper that Proposed the Dataset
[MAX]: Max-SIFT, Another Reversal Invariant Descriptor for Image Classification, ICASSP 2015
Fine-Grained Object Recognition
[GMP]: Generalized Max Pooling, CVPR 2014
Scene Classification
[DIR]: Dirichlet-based Features, CVPR 2014
Comparison: Fine-Grained
Dataset   ORIG    RIDE    [DB]    [MAX]   [GMP]
P-37      60.24   63.49   59.21   60.65   56.8
A-100     74.61   78.92   48.69   74.39   N/A
F-102     83.53   86.45   72.8    83.13   84.6
B-200     47.61   50.81   17.0    47.20   33.3
Comparison: Scene Classification
Dataset     ORIG    RIDE    [DB]    [MAX]   [DIR]
LandU-21    93.64   94.71   81.19   92.91   92.8
Indoor-67   63.17   64.93   26.1    62.45   63.4
SUN-397     48.35   50.12   38.0    47.69   46.1
Bird-200 with Detected Parts
Method           Paper   Ours   +RIDE   +RIDEx2
Chai, ICCV13     56.6    57.7   60.7    61.9
Gavves, ICCV13   62.7    62.9   65.2    66.1
Summary
Reversal Invariance is Useful for Recognition
RIDE Produces Consistent Accuracy Gain
For Every Single Case (Dataset with Descriptor)
RIDE Cooperates well with Part Detectors
RIDE is Cheap and Efficient
Extra Time and Memory Costs are Negligible
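The "every single case" claim can be checked directly against the four per-dataset performance slides. The snippet below copies the ORIG and RIDE numbers from those slides and computes the gain for each dataset and descriptor pair; only the variable names are invented here.

```python
# (ORIG, RIDE) values copied from the Pet-37, Aircraft-100, Flower-102
# and Bird-200 performance slides.
results = {
    "Pet-37": {"SIFT": (37.92, 42.28), "LCS": (43.25, 44.27), "Fused": (52.06, 54.69),
               "RGB-SIFT": (44.90, 47.35), "OPP-SIFT": (46.53, 49.01)},
    "Aircraft-100": {"SIFT": (53.13, 57.82), "LCS": (41.82, 42.86), "Fused": (57.36, 61.27),
                     "RGB-SIFT": (57.89, 63.09), "OPP-SIFT": (47.06, 53.12)},
    "Flower-102": {"SIFT": (53.68, 59.12), "LCS": (73.47, 75.30), "Fused": (76.96, 80.51),
                   "RGB-SIFT": (71.52, 74.97), "OPP-SIFT": (76.12, 79.68)},
    "Bird-200": {"SIFT": (25.77, 32.14), "LCS": (36.18, 38.50), "Fused": (38.11, 44.73),
                 "RGB-SIFT": (31.36, 39.16), "OPP-SIFT": (35.40, 42.18)},
}

# Gain of RIDE over ORIG for every (dataset, descriptor) pair.
gains = {
    (dataset, descriptor): round(ride - orig, 2)
    for dataset, rows in results.items()
    for descriptor, (orig, ride) in rows.items()
}

smallest = min(gains, key=gains.get)
largest = max(gains, key=gains.get)
print("all positive:", all(g > 0 for g in gains.values()))
print("smallest gain:", smallest, gains[smallest])
print("largest gain:", largest, gains[largest])
```

All 20 gains are positive, ranging from 1.02 (LCS on Pet-37) to 7.80 (RGB-SIFT on Bird-200).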
Conclusions
Reversal Invariance is Important
Causes Significant Difference in Images
More Prototypes, Fewer Training Samples
A Popular Solution: Data Augmentation
Our Solution: RIDE
Reversal Invariant Descriptor Enhancement
Aligning an Image with Its Orientation
Accuracy Gain, Little Extra Computation
Future Proposals
Other Image Applications?
Tell the Orientation of an Image
Cooperation with Deep CNN?
Convolution is NOT Reversal Invariant!
Can we Apply Similar Techniques?
Reversal Invariance vs. Data Augmentation
Thank you!
 
Questions please?

Presentation Transcript



  18. What Happened after Reversal? [Figure: original vs. reversed SIFT. The 4 × 4 grid cells, indexed 0-15 row by row in the original descriptor, appear in the order 3 2 1 0 / 7 6 5 4 / 11 10 9 8 / 15 14 13 12 after reversal, and the orientation bins (0-7) in each cell are re-indexed as well. Example: original index 14 × 8 + 5 = 117 corresponds to reversed index 13 × 8 + 7 = 111.]
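The index pair quoted for this slide (117 and 111) is consistent with a simple permutation of the descriptor, sketched below. The sketch assumes a 128-D SIFT layout of 4 × 4 cells in row-major order with 8 orientation bins per cell (index = cell × 8 + bin), and bins spaced 45° apart so that a left-right flip sends bin b to (4 − b) mod 8. These layout details are assumptions; real SIFT implementations may order cells and bins differently.

```python
GRID = 4  # cells per row and per column (assumed layout)
BINS = 8  # orientation bins per cell (assumed layout)

def reversed_index(index):
    """Position that element `index` moves to when the patch is mirrored left-right."""
    cell, b = divmod(index, BINS)
    row, col = divmod(cell, GRID)
    mirrored_cell = row * GRID + (GRID - 1 - col)  # flip the column
    mirrored_bin = (BINS // 2 - b) % BINS          # angle t -> 180 degrees - t
    return mirrored_cell * BINS + mirrored_bin

def reverse_descriptor(d):
    """Descriptor of the mirrored patch, obtained by permuting the elements of d."""
    out = [0] * len(d)
    for i, value in enumerate(d):
        out[reversed_index(i)] = value
    return out

print(reversed_index(14 * 8 + 5))  # 111, i.e. 13 * 8 + 7
```

Because the permutation is its own inverse, reversing twice returns the original descriptor, which matches the slide remark that reversing an image twice changes nothing.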

  19. Reversal Invariance: Formulation. Descriptors: original descriptor $d$, reversed descriptor $d^R$. What is reversal invariance? A function $f(\cdot)$ which holds $f(d) = f(d^R)$ for ANY $d$. (The function symbols were lost in extraction; $f$, $g$ and $q$ are placeholder names.)

  20. How to Find $f(d)$? A specific definition. Define $f(d) = g(d, d^R)$, in which $g(\cdot, \cdot)$ must be symmetric. So $f(d) = g(d, d^R) = g(d^R, d) = f(d^R)$. NOTE: $d^{RR} = d$, since reversing an image twice is NO change!

  21. How to Find $g(d, d^R)$? Another specific definition. Define: $g(d, d^R)$ is either $d$ or $d^R$, maximally preserving the description power of SIFT. There can be many other solutions. An orientation function $q(d)$: the extent to which $d$ is oriented to the right. Compare $q(d)$ with $q(d^R)$; the one with the larger value is selected. If equal, select the one with the larger alphabetical order.

  22. How to Find $q(d)$? [Figure and equations: the descriptor's grid cells are indexed 0-15 and the orientation bins in each cell 0-7; the formula for $q(d)$ involves double sums over the 16 cells and the 8 bins. The exact expressions did not survive extraction.]
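The selection rule in entries 20 to 22 can be sketched as follows. The orientation function here is a hypothetical stand-in (the total horizontal gradient component, assuming bin b points at angle 45° × b), because the slide's own formula did not survive extraction. The descriptor layout is likewise an assumption: 4 × 4 cells in row-major order, 8 bins per cell.

```python
import math

GRID, BINS = 4, 8  # assumed SIFT layout: 4 x 4 cells, 8 orientation bins

def reverse_descriptor(d):
    """Descriptor of the left-right mirrored patch (assumed index layout)."""
    out = [0.0] * len(d)
    for i, value in enumerate(d):
        cell, b = divmod(i, BINS)
        row, col = divmod(cell, GRID)
        j = (row * GRID + GRID - 1 - col) * BINS + (BINS // 2 - b) % BINS
        out[j] = value
    return out

def orientation(d):
    """Hypothetical q(d): summed horizontal component; larger means 'more to the right'."""
    return sum(v * math.cos(2 * math.pi * (i % BINS) / BINS) for i, v in enumerate(d))

def ride(d, q=orientation):
    """Return d or its reversal, whichever has the larger orientation value."""
    d = list(d)
    d_rev = reverse_descriptor(d)
    q_d, q_rev = q(d), q(d_rev)
    if q_d > q_rev:
        return d
    if q_rev > q_d:
        return d_rev
    return max(d, d_rev)  # tie: pick the lexicographically larger vector

# A descriptor with a single right-pointing gradient in the top-left cell.
d = [0.0] * (GRID * GRID * BINS)
d[0] = 1.0
print(ride(d) == ride(reverse_descriptor(d)))  # True
```

Since both candidates are evaluated regardless of which one is passed in, the output is the same for a descriptor and its reversal, which is the invariance property stated in entry 19.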
