AN INTRODUCTION TO PATTERN RECOGNITION
Pattern recognition, also known as pattern classification, involves analyzing data in order to categorize patterns efficiently. The field encompasses several approaches, including statistical, structural, neural, and fuzzy-logic pattern recognition. The process covers system design, the learning problem, feature reduction, and machine learning techniques such as Support Vector Machines and Neural Networks. Understanding these methods is crucial for designing systems that recognize patterns effectively across different domains.
Presentation Transcript
AN INTRODUCTION TO PATTERN RECOGNITION. Rapporteur: Chia-Chun Hsu. Advisor: Jian-Jiun Ding, Ph.D. Graduate Institute of Communication Engineering, National Taiwan University, Taipei, Taiwan. Digital Image and Signal Processing Laboratory.
Outline:
- Introduction to Pattern Recognition: What is Pattern Recognition (PR)?; Four Major Approaches; the PR System; the Design Problem; Learning
- Bayesian Decision Theory: Minimum-Error-Rate / MAP / ML / MD Classifiers; Error Bounds Approach; Minimax Criterion
- Feature Reduction
- Machine Learning: Support Vector Machine; Neural Network
- Fuzzy Logic Pattern Recognition
- Conclusion
Introduction to Pattern Recognition. What is Pattern Recognition (PR)? PR is the act of taking in raw data and taking an action based on the category of the pattern; hence the term Pattern Classification is also used. It is not restricted to images: PR is the research area that studies the operation and design of systems that recognize patterns in data.
Introduction to Pattern Recognition. Four major approaches: Statistical PR (decision-theoretic PR); Structural PR (syntactic PR: grammars, strings, trees, and the relations between components); Neural PR; and Fuzzy Logic PR.
Introduction to Pattern Recognition. The PR system: input passes through five stages, ending in a decision. Sensing (sounds as 1-D signals; images as 2-D signals; video, etc.); Segmentation (e.g., removing the background); Feature extraction (e.g., length, gender); Classification; and Post-processing. The post-processing stage is domain dependent (e.g., mapping a symptom to a treatment) and requires expert knowledge.
Introduction to Pattern Recognition. The design problem: collect data (time-consuming), choose features, choose a model, and train the classifier. Choosing too many features invites the curse of dimensionality. From: [3].
Introduction to Pattern Recognition. Learning. Supervised learning: a category label for each pattern in the training set is provided. Unsupervised learning: the system forms clusters or natural groupings of the input patterns.
Bayesian Decision Theory. The Bayes error rate. Notation: ω1 is class 1 (e.g., male), ω2 is class 2 (e.g., female); "choose ωi" means we assign the observed sample x to class i. Using P(A, B) = P(A|B)P(B) = P(B|A)P(A):

P(error) = P(choose ω1, actually ω2) + P(choose ω2, actually ω1)
         = P(choose ω1 | actually ω2) P(ω2) + P(choose ω2 | actually ω1) P(ω1)
         = P(ω1|ω2) P(ω2) + P(ω2|ω1) P(ω1)   (without any measurement)
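The error decomposition can be checked numerically. A minimal sketch; the priors and conditional error rates below are made-up illustrative numbers, not values from the slides:

```python
# Hypothetical two-class setup: priors and the confusion
# probabilities of some fixed (not necessarily optimal) decision rule.
P_w1, P_w2 = 0.6, 0.4
P_choose2_given1 = 0.1   # P(choose w2 | actually w1)
P_choose1_given2 = 0.2   # P(choose w1 | actually w2)

# Total error, exactly as decomposed on the slide:
# P(error) = P(choose w1 | w2) P(w2) + P(choose w2 | w1) P(w1)
P_error = P_choose1_given2 * P_w2 + P_choose2_given1 * P_w1
print(P_error)  # 0.2*0.4 + 0.1*0.6 = 0.14
```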
Bayesian Decision Theory. The minimum-error-rate classifier minimizes P(error) with respect to the decision rule ω(x) over all x:

min over ω(·) of P(error), where P(error) = ∫ P(error | x) p(x) dx.

Since p(x) ≥ 0 everywhere, the integral is minimized by minimizing the integrand at every point:

min over ω(·) of P(error) = ∫ [ min over ω(x) of P(error | x) ] p(x) dx.
Bayesian Decision Theory. Maximum A Posteriori (MAP) classifier. The minimum-error-rate classifier minimizes P(error | x) at every x. For two classes,

P(error | x) = P(ω1 | x) if we choose ω2;  P(ω2 | x) if we choose ω1.

Hence, given x, choosing ω1 whenever P(ω1|x) > P(ω2|x) minimizes P(error|x): the maximum a posteriori classifier.
Bayesian Decision Theory. Maximum Likelihood (ML) classifier. The MAP classifier chooses ω1 if P(ω1|x) > P(ω2|x), i.e., if P(x|ω1)P(ω1) > P(x|ω2)P(ω2). Under equal priors, P(ω1) = P(ω2), this reduces to: choose ω1 if P(x|ω1) > P(x|ω2), the maximum likelihood classifier, where P(x|ωi) is called the likelihood of ωi with respect to x.
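The MAP and ML rules can be sketched for two hypothetical 1-D Gaussian classes; the class names, means, variances, and priors below are illustrative assumptions, not values from the slides:

```python
import math

def gaussian(x, mu, sigma):
    """Likelihood p(x | class) under a 1-D normal model."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def map_decide(x, priors, params):
    """MAP rule: choose the class maximizing P(x|wi) P(wi)."""
    scores = {c: gaussian(x, *params[c]) * priors[c] for c in priors}
    return max(scores, key=scores.get)

def ml_decide(x, params):
    """ML rule (equal priors): choose the class maximizing P(x|wi)."""
    scores = {c: gaussian(x, *params[c]) for c in params}
    return max(scores, key=scores.get)

params = {"w1": (0.0, 1.0), "w2": (4.0, 1.0)}  # (mean, std) per class
priors = {"w1": 0.9, "w2": 0.1}

print(ml_decide(2.1, params))           # x is closer to w2's mean -> "w2"
print(map_decide(2.1, priors, params))  # the heavy prior on w1 flips it -> "w1"
```

The two calls disagree on the same x: the prior term in the MAP score outweighs the likelihood gap, which is exactly the difference between the two rules.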
Bayesian Decision Theory. Minimum-Distance (MD) classifier. Consider the ML classifier with Gaussian class-conditionals P(x|ωi) ~ N(μi, σi²) and equal variances, σi = σ for all i. Then "choose ω1 if P(x|ω1) > P(x|ω2)" reduces to:

choose ω1 if (x − μ1)² < (x − μ2)²,

i.e., assign x to the class with the nearest mean (the minimum-distance classifier).
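Under those assumptions the whole classifier collapses to comparing distances to the class means, as in this small sketch (the means are illustrative):

```python
def md_decide(x, means):
    """Minimum-distance classifier: with equal priors and equal variances,
    choose the class whose mean is closest to x."""
    return min(means, key=lambda c: (x - means[c]) ** 2)

means = {"w1": 0.0, "w2": 4.0}  # illustrative class means
print(md_decide(1.9, means))  # "w1": 1.9 is nearer to 0.0
print(md_decide(2.1, means))  # "w2": 2.1 is nearer to 4.0
```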
Bayesian Decision Theory. Summary: the minimum-error-rate classifier is the MAP classifier. With equal priors, P(ω1) = P(ω2), it reduces to the maximum likelihood classifier. With the further assumptions P(x|ωi) ~ N(μi, σi²) and σi = σ for all i, it reduces to the minimum-distance classifier.
Bayesian Decision Theory. Error bounds approach. In general the error probability P(error) is hard to calculate exactly, so we bound it. Chernoff bound:

P(error) ≤ P(ω1)^β P(ω2)^(1−β) ∫ p(x|ω1)^β p(x|ω2)^(1−β) dx,  0 ≤ β ≤ 1,

and the minimum of the right-hand side over β is called the Chernoff bound. Setting β = 1/2 gives the looser but easier Bhattacharyya bound:

P(error) ≤ √(P(ω1) P(ω2)) ∫ √( p(x|ω1) p(x|ω2) ) dx.
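For discrete distributions the Bhattacharyya bound is easy to evaluate directly. The sketch below uses made-up class-conditional probabilities and checks that the bound indeed dominates the exact Bayes error:

```python
import math

# Discrete stand-ins for the densities p(x|w1), p(x|w2) (illustrative values).
p1 = [0.7, 0.2, 0.1]
p2 = [0.1, 0.3, 0.6]
P_w1, P_w2 = 0.5, 0.5

# Bhattacharyya bound: the Chernoff bound evaluated at beta = 1/2.
bc = sum(math.sqrt(a * b) for a, b in zip(p1, p2))   # Bhattacharyya coefficient
bound = math.sqrt(P_w1 * P_w2) * bc

# The exact Bayes error for comparison: at each x, the smaller posterior mass.
bayes_error = sum(min(P_w1 * a, P_w2 * b) for a, b in zip(p1, p2))
print(bound >= bayes_error)  # True: the bound dominates the true error
```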
Bayesian Decision Theory. Minimax criterion (worst-case design). Write the error as a function of the prior P(ω1), using P(ω2) = 1 − P(ω1):

P(error) = ∫ over R2 of p(x|ω1) P(ω1) dx + ∫ over R1 of p(x|ω2) P(ω2) dx
         = ∫ over R1 of p(x|ω2) dx + P(ω1) [ ∫ over R2 of p(x|ω1) dx − ∫ over R1 of p(x|ω2) dx ].

Choosing the decision regions so that the bracketed term vanishes,

∫ over R2 of p(x|ω1) dx = ∫ over R1 of p(x|ω2) dx,

makes the error independent of the prior: P(error) = ∫ over R1 of p(x|ω2) dx. This minimax rule achieves the best worst-case minimum error rate over all priors. From: [3].
Feature Reduction. Intuitively, classifier accuracy should increase as the feature dimension grows; in practice, however, high-dimensional features degrade the efficiency of the system, and with limited training data they can hurt accuracy as well (the curse of dimensionality). We should therefore use a reasonably small number of features.
Feature Reduction. Two major approaches: PCA (Principal Component Analysis) seeks the projection that best represents the data in a least-squares sense; LDA (Linear Discriminant Analysis) seeks the projection that best separates the classes.
Feature Reduction. Principal Component Analysis (PCA). How closely can we represent samples of a distribution with a small set of features? PCA maximizes the variance of the transformed data:

e1 = argmax over ||e|| = 1 of var(eᵀx) = argmax over ||e|| = 1 of eᵀΣe,  Σ the covariance matrix of x.

e1 is the eigenvector of Σ corresponding to the largest eigenvalue, and is called the first principal component. Later components are defined the same way under the orthogonality constraint ⟨ei, ej⟩ = 0 for i ≠ j; ei is the eigenvector with the i-th largest eigenvalue. The reduced feature vector is

y = [ e1ᵀx, e2ᵀx, …, e_d'ᵀx ]ᵀ,  d' < d.
Feature Reduction. PCA as minimizing the representation error. Equivalently, approximate each sample as x ≈ m + Σ over i = 1..d' of ai ei, where m is the mean of x and the ei are orthonormal (⟨ei, ej⟩ = 0 for i ≠ j). The basis minimizing the mean squared representation error consists of the eigenvectors of Σ with the d' largest eigenvalues. Conclusion: minimizing the representation error is equivalent to maximizing the transformed variance.
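A minimal PCA sketch for 2-D data, using the closed-form eigendecomposition of a 2x2 covariance matrix so that no linear-algebra library is needed (the sample points are illustrative):

```python
import math

def first_principal_component(xs):
    """First PC of 2-D samples: the unit eigenvector of the 2x2 covariance
    matrix [[a, b], [b, c]] belonging to its largest eigenvalue."""
    n = len(xs)
    mx = sum(x for x, _ in xs) / n
    my = sum(y for _, y in xs) / n
    a = sum((x - mx) ** 2 for x, _ in xs) / n          # var(x)
    c = sum((y - my) ** 2 for _, y in xs) / n          # var(y)
    b = sum((x - mx) * (y - my) for x, y in xs) / n    # cov(x, y)
    # Largest eigenvalue of a symmetric 2x2 matrix, in closed form.
    lam = (a + c) / 2 + math.sqrt(((a - c) / 2) ** 2 + b ** 2)
    if abs(b) > 1e-12:
        vx, vy = b, lam - a            # (b, lam - a) is an eigenvector for lam
    else:
        vx, vy = (1.0, 0.0) if a >= c else (0.0, 1.0)  # already axis-aligned
    norm = math.hypot(vx, vy)
    return vx / norm, vy / norm

# Points spread mainly along the y = x diagonal (illustrative data).
pts = [(0, 0), (1, 1.1), (2, 1.9), (3, 3.2), (4, 3.8)]
e1 = first_principal_component(pts)
print(e1)  # roughly (0.7, 0.7): the direction of maximum variance
```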
Feature Reduction: illustration. (Figure from [3].)
Feature Reduction. Linear (Multiple) Discriminant Analysis. Projects high-dimensional data onto a subspace, chosen so that the classes are well separated after projection. For two classes the projection direction is

w = (S1 + S2)⁻¹ (m1 − m2),

where S1, S2 are the scatter (covariance) matrices of the class 1 and class 2 samples, and m1, m2 are the means of the class 1 and class 2 samples.
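A sketch of that two-class direction for 2-D samples, with the 2x2 matrix inverse written out explicitly (the two clusters below are made up for illustration):

```python
def lda_direction(class1, class2):
    """Fisher discriminant direction w = (S1 + S2)^(-1) (m1 - m2)
    for 2-D samples, using an explicit 2x2 inverse."""
    def mean(pts):
        n = len(pts)
        return (sum(p[0] for p in pts) / n, sum(p[1] for p in pts) / n)

    def scatter(pts, m):
        # Scatter matrix sum (x - m)(x - m)^T, stored as (a, b, c) for [[a, b], [b, c]].
        a = sum((p[0] - m[0]) ** 2 for p in pts)
        b = sum((p[0] - m[0]) * (p[1] - m[1]) for p in pts)
        c = sum((p[1] - m[1]) ** 2 for p in pts)
        return a, b, c

    m1, m2 = mean(class1), mean(class2)
    a1, b1, c1 = scatter(class1, m1)
    a2, b2, c2 = scatter(class2, m2)
    a, b, c = a1 + a2, b1 + b2, c1 + c2        # S = S1 + S2
    det = a * c - b * b
    dx, dy = m1[0] - m2[0], m1[1] - m2[1]
    # S^{-1} = (1/det) [[c, -b], [-b, a]], applied to (m1 - m2).
    return ((c * dx - b * dy) / det, (-b * dx + a * dy) / det)

# Two illustrative clusters, separated mostly along the x axis.
c1 = [(0, 0), (1, 0.2), (0.5, -0.1), (0.2, 0.1)]
c2 = [(4, 0.1), (5, -0.2), (4.5, 0.0), (4.8, 0.2)]
w = lda_direction(c1, c2)
print(abs(w[0]) > abs(w[1]))  # True: the discriminant points mainly along x
```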
Feature Reduction: illustration. (Figure from [3].)
Machine Learning: Neural Networks. A neural network requires no statistical model or prior statistical information; it imitates the neural networks found in biology and is trained by supervised learning. (Diagram of a single neural-network unit; picture from a Google search.)
Machine Learning: Neural Networks. Mathematical description of a single unit:

output = g(b + w1 x1 + w2 x2 + … + wN xN),

where b is a bias term and g(·) is a logistic activation function, e.g. the sigmoid g(x) = 1 / (1 + e^(−x)). (Picture from a Google search.)
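That unit is a few lines of code; this sketch assumes the sigmoid activation:

```python
import math

def neuron(inputs, weights, bias):
    """One neural-network unit: weighted sum plus bias, through a sigmoid."""
    activation = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1.0 / (1.0 + math.exp(-activation))   # g(x) = 1 / (1 + e^-x)

# Weighted sum is 0.5*1.0 - 0.25*2.0 + 0 = 0, and g(0) = 0.5.
print(neuron([1.0, 2.0], [0.5, -0.25], 0.0))  # 0.5
```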
Machine Learning: Neural Networks. Illustration of a traditional neural network (diagram).
Machine Learning: Neural Networks. Training: the back-propagation algorithm. Idea: minimize the error function by propagating the error backward through the layers and updating the weights. For sigmoid units, g'(in) = g(in)(1 − g(in)). Algorithm, with learning rate η, desired output d_k, and actual output y_k. Output layer:

w_jk ← w_jk + η δ_k y_j,  where δ_k = g'(in_k) (d_k − y_k).

Hidden layer:

w_ij ← w_ij + η δ_j x_i,  where δ_j = g'(in_j) Σ over k of w_jk δ_k.
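The update rules can be sketched on a tiny 2-2-1 network. Omitting biases, the learning rate of 0.5, and the initial weights are all simplifying assumptions for illustration; one gradient step should reduce the squared error on the training example:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w_hidden, w_out):
    """2-input, 2-hidden, 1-output network (biases omitted for brevity)."""
    h = [sigmoid(sum(w_hidden[j][i] * x[i] for i in range(2))) for j in range(2)]
    y = sigmoid(sum(w_out[j] * h[j] for j in range(2)))
    return h, y

def backprop_step(x, target, w_hidden, w_out, eta=0.5):
    """One back-propagation step:
    output delta:  delta_k = g'(in_k) (d_k - y_k),
    hidden delta:  delta_j = g'(in_j) * sum_k w_jk delta_k,
    with g'(in) = g(in)(1 - g(in)) for the sigmoid."""
    h, y = forward(x, w_hidden, w_out)
    delta_out = y * (1 - y) * (target - y)
    delta_hidden = [h[j] * (1 - h[j]) * w_out[j] * delta_out for j in range(2)]
    for j in range(2):
        w_out[j] += eta * delta_out * h[j]
        for i in range(2):
            w_hidden[j][i] += eta * delta_hidden[j] * x[i]
    return (target - y) ** 2   # squared error before this update

w_hidden = [[0.2, -0.4], [0.7, 0.1]]   # arbitrary initial weights
w_out = [0.5, -0.3]
x, target = [1.0, 0.5], 1.0

before = backprop_step(x, target, w_hidden, w_out)
after = (target - forward(x, w_hidden, w_out)[1]) ** 2
print(after < before)  # True: one update moves the output toward the target
```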
Machine Learning: Support Vector Machine (SVM). A supervised learning method that looks for the hyperplane best separating the data into different classes. It typically requires less training time than neural networks.
Machine Learning: Support Vector Machine (SVM). Training idea: maximize the margin d+ + d−. The data points lying on the margin hyperplanes are called support vectors. For labels yi ∈ {+1, −1}, the constraints are

wᵀxi + b ≥ +1 if yi = +1,  wᵀxi + b ≤ −1 if yi = −1,

i.e., yi (wᵀxi + b) ≥ 1 for all i. The margin is d+ + d− = 2 / ||w||, so training solves

max 2 / ||w|| (non-convex)  ⇔  min (1/2) ||w||² (convex), subject to yi (wᵀxi + b) ≥ 1.
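The constraints and the 2/||w|| margin can be checked for a hand-chosen hyperplane on toy 2-D data (all values below are illustrative, not the output of an SVM solver):

```python
import math

# A hand-chosen separating hyperplane w^T x + b = 0 for toy 2-D data.
w = (1.0, 0.0)
b = -2.0
data = [((0.5, 1.0), -1), ((1.0, -1.0), -1), ((3.0, 0.5), +1), ((3.5, 2.0), +1)]

def satisfies_constraints(w, b, data):
    """Hard-margin constraints: y_i (w^T x_i + b) >= 1 for every sample."""
    return all(y * (w[0] * x[0] + w[1] * x[1] + b) >= 1 for x, y in data)

margin = 2.0 / math.hypot(*w)   # geometric margin 2 / ||w||
print(satisfies_constraints(w, b, data), margin)  # True 2.0
```

Here the points (1.0, -1.0) and (3.0, 0.5) make their constraints tight, so they would be the support vectors for this hyperplane.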
Fuzzy Logic Pattern Recognition. What is a fuzzy set? Crisp set: X = {1, 2, 3, 4, 5}, A = {1, 2, 3}; each element either belongs to A (1, 2, 3 ∈ A) or does not (4, 5 ∉ A). Fuzzy set: membership is a matter of degree, given by a membership function μA(x). For example,

A = 0.2/1 + 0.5/2 + 0.8/3 + 0.1/4 + 1/5,

meaning μA(1) = 0.2, μA(2) = 0.5, μA(3) = 0.8, μA(4) = 0.1, μA(5) = 1.
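A discrete fuzzy set is naturally a map from elements to membership degrees; this sketch uses the degrees from the slide's example:

```python
# The fuzzy set from the example, written as element -> membership degree.
A = {1: 0.2, 2: 0.5, 3: 0.8, 4: 0.1, 5: 1.0}

def membership(x, fuzzy_set):
    """mu_A(x): the degree to which x belongs to A (0 outside the support)."""
    return fuzzy_set.get(x, 0.0)

print(membership(3, A))  # 0.8: 3 belongs to A to degree 0.8
print(membership(6, A))  # 0.0: 6 does not belong to A at all
```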
Fuzzy Logic Pattern Recognition. An example of fuzzy sets over Age: overlapping membership functions for Young, Middle-aged, and Old. Picture from: [4].
Fuzzy Logic Pattern Recognition. Fuzzy inference. Given a feature vector x = {x1, x2, …, xN}, rules take the form: if x1 is A and x2 is B …, then y is C. E.g., if Color = red and Shape = round, then y = tomato. Another application is the Fuzzy C-Means algorithm, used in image segmentation: an iterative method based on weighted similarity, where the weight is the membership value of a pixel in a cluster and the similarity is an inner-product norm metric.
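A minimal 1-D fuzzy C-means sketch, using the standard alternating updates (the fuzzifier m = 2, the toy data, and the min/max initialization are simplifying choices, not details from the slides):

```python
def fuzzy_c_means_1d(xs, c=2, m=2.0, iters=20):
    """1-D fuzzy C-means: alternate between membership updates
    (inversely related to distance, normalized across clusters) and
    center updates (membership-weighted means)."""
    centers = [min(xs), max(xs)]            # crude init; assumes c == 2
    for _ in range(iters):
        # Membership update: u_ik = 1 / sum_j (d_ik / d_jk)^(2/(m-1))
        u = []
        for x in xs:
            d = [abs(x - v) + 1e-9 for v in centers]   # avoid division by zero
            u.append([1.0 / sum((d[i] / d[j]) ** (2 / (m - 1)) for j in range(c))
                      for i in range(c)])
        # Center update: weighted mean of the data with weights u^m.
        centers = [sum(u[k][i] ** m * xs[k] for k in range(len(xs))) /
                   sum(u[k][i] ** m for k in range(len(xs))) for i in range(c)]
    return centers, u

xs = [0.0, 0.2, 0.4, 9.6, 9.8, 10.0]   # two obvious 1-D clusters
centers, u = fuzzy_c_means_1d(xs)
print(sorted(centers))  # one center near each cluster
```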
Conclusion. Pattern recognition appears nearly everywhere in daily life: any task involving classification or detection can be framed as a pattern recognition problem. The Bayesian model uses statistical information to derive the decision rule. Feature reduction increases the efficiency of a PR system. SVMs have the advantage of shorter training time, while neural networks have the advantage of a high detection rate.
References
[1] R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification, 2nd ed., John Wiley & Sons, 2001.
[2] G. Nagalakshmi and S. Jyothi, "A Survey on Pattern Recognition using Fuzzy Clustering Approaches."
[3] Yi-Ping Hung, "Pattern Classification and Analysis," NTU Ceiba, lecture notes.
[4] Pei-Hwa Huang, "Fuzzy Systems," NTOU Moodle, lecture notes.
[5] Ken-Yi Lee, "Support Vector Machine Tutorial," NTU, available from: http://www.cmlab.csie.ntu.edu.tw/~cyy/learning/tutorials/SVM1.pdf