Artificial Neural Networks Overview and Models

Explore the fundamentals of artificial neural networks (ANNs), including their structure, types, activation functions, and the perceptron model. Learn how an ANN processes information and identifies patterns in data for effective decision-making.

  • ANN
  • Neural Networks
  • Perceptron
  • Activation Functions
  • Data Mining




Presentation Transcript


  1. Data Mining: Lecture Notes for Chapter 4, Artificial Neural Networks. From Introduction to Data Mining, 2nd Edition, by Tan, Steinbach, Karpatne, Kumar. 02/14/2018.

  2. Artificial Neural Networks (ANN). A black box maps three binary inputs X1, X2, X3 to an output Y. Output Y is 1 if at least two of the three inputs are equal to 1:

     X1  X2  X3 |  Y
      1   0   0 | -1
      1   0   1 |  1
      1   1   0 |  1
      1   1   1 |  1
      0   0   1 | -1
      0   1   0 | -1
      0   1   1 |  1
      0   0   0 | -1

  3. Artificial Neural Networks (ANN). Inside the black box: input nodes X1, X2, X3, each connected to an output node by a link of weight 0.3, with threshold t = 0.4 at the output node:

     Y = sign(0.3 X1 + 0.3 X2 + 0.3 X3 - 0.4),
     where sign(x) = 1 if x > 0 and -1 if x < 0.

     (Truth table as on the previous slide.)
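As a quick check, here is a minimal sketch of this unit in Python (the weights and threshold are from the slide; the function name and table literals are ours):

```python
def predict(x1, x2, x3):
    """Weighted sum passed through a sign threshold: fires +1 exactly
    when at least two of the three binary inputs are 1."""
    s = 0.3 * x1 + 0.3 * x2 + 0.3 * x3 - 0.4
    return 1 if s > 0 else -1

# Reproduces the truth table above.
for x1, x2, x3, y in [(1, 0, 0, -1), (1, 0, 1, 1), (1, 1, 0, 1), (1, 1, 1, 1),
                      (0, 0, 1, -1), (0, 1, 0, -1), (0, 1, 1, 1), (0, 0, 0, -1)]:
    assert predict(x1, x2, x3) == y
```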

  4. Artificial Neural Networks (ANN). The model is an assembly of inter-connected nodes and weighted links. The output node sums its input values according to the weights of its links, then compares the sum against a threshold t.

     Perceptron model:
     Y = sign( sum_{i=1..d} w_i X_i - t )
       = sign( sum_{i=0..d} w_i X_i ),  with w_0 = -t and X_0 = 1.
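A small sketch of the threshold-folding step above (function names are ours): the explicit-threshold form and the w_0 = -t, X_0 = 1 form give the same predictions.

```python
def perceptron(x, w, t):
    """Explicit threshold: Y = sign(sum_i w_i x_i - t)."""
    s = sum(wi * xi for wi, xi in zip(w, x)) - t
    return 1 if s > 0 else -1

def perceptron_folded(x, w):
    """Folded form: w[0] = -t acts on a constant input X_0 = 1."""
    s = sum(wi * xi for wi, xi in zip(w, [1] + list(x)))
    return 1 if s > 0 else -1

# The two forms agree, e.g. with the weights from slide 3:
assert perceptron([1, 0, 1], [0.3, 0.3, 0.3], t=0.4) == \
       perceptron_folded([1, 0, 1], [-0.4, 0.3, 0.3, 0.3])
```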

  5. General Structure of an ANN. [Figure: inputs x1-x5 feed an input layer, then a hidden layer, then an output layer producing y. Inset: neuron i receives inputs I1, I2, I3 over weights wi1, wi2, wi3, forms the weighted sum Si, and applies an activation function g with threshold t to produce its output Oi = g(Si).] Training an ANN means learning the weights of the neurons.

  6. Artificial Neural Networks (ANN). Various types of network topology: single-layered network (perceptron) versus multi-layered network; feed-forward versus recurrent network. Various types of activation function f:

     Y = f( sum_i w_i X_i )

  7. Perceptron. A single-layer network containing only input and output nodes. Activation function: f = sign(w · x). Applying the model is straightforward:

     Y = sign(0.3 X1 + 0.3 X2 + 0.3 X3 - 0.4),
     where sign(x) = 1 if x > 0 and -1 if x < 0.

     For X1 = 1, X2 = 0, X3 = 1:  y = sign(0.3 + 0.3 - 0.4) = sign(0.2) = 1.

  8. Perceptron Learning Rule.
     Initialize the weights (w_0, w_1, ..., w_d).
     Repeat:
       For each training example (x_i, y_i):
         Compute f(w^(k), x_i).
         Update the weights: w^(k+1) = w^(k) + lambda ( y_i - f(w^(k), x_i) ) x_i.
     Until stopping condition is met.

  9. Perceptron Learning Rule. Weight update formula:

     w^(k+1) = w^(k) + lambda ( y_i - f(w^(k), x_i) ) x_i,   where lambda is the learning rate.

     Intuition: update each weight based on the error e = y - f(w^(k), x):
     - If y = f(x, w), then e = 0: no update needed.
     - If y > f(x, w), then e = 2: the weight must be increased so that f(x, w) increases.
     - If y < f(x, w), then e = -2: the weight must be decreased so that f(x, w) decreases.

  10. Example of Perceptron Learning. Applying the update rule

      w^(k+1) = w^(k) + lambda ( y_i - f(w^(k), x_i) ) x_i,   Y = sign( sum_{i=0..d} w_i X_i ),

      with learning rate lambda = 0.1 to the training data from slide 2. Weights after processing each example in epoch 1, starting from w = (0, 0, 0, 0):

      Example | X1 X2 X3 |  Y | w0    w1    w2   w3
         1    |  1  0  0 | -1 | -0.2  -0.2  0    0
         2    |  1  0  1 |  1 |  0     0    0    0.2
         3    |  1  1  0 |  1 |  0     0    0    0.2
         4    |  1  1  1 |  1 |  0     0    0    0.2
         5    |  0  0  1 | -1 | -0.2   0    0    0
         6    |  0  1  0 | -1 | -0.2   0    0    0
         7    |  0  1  1 |  1 |  0     0    0.2  0.2
         8    |  0  0  0 | -1 | -0.2   0    0.2  0.2

      Weights at the end of each epoch:

      Epoch | w0    w1   w2   w3
        0   |  0     0    0    0
        1   | -0.2   0    0.2  0.2
        2   | -0.2   0    0.4  0.2
        3   | -0.4   0    0.4  0.2
        4   | -0.4   0.2  0.4  0.4
        5   | -0.6   0.2  0.4  0.2
        6   | -0.6   0.4  0.4  0.2
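A minimal sketch of the learning rule driving this example, assuming lambda = 0.1 and sign(0) = +1 (both consistent with the epoch-1 trace above); function names are ours:

```python
def sign(s):
    return 1 if s >= 0 else -1   # sign(0) = +1, matching the trace above

def train_perceptron(data, lam=0.1, epochs=6):
    """data: list of (x, y) with x = (X1, ..., Xd); returns folded weights
    [w0, w1, ..., wd] learned by w <- w + lam * (y - f(w, x)) * x."""
    d = len(data[0][0])
    w = [0.0] * (d + 1)                     # w[0] is the bias weight (-t)
    for _ in range(epochs):
        for x, y in data:
            xb = [1] + list(x)              # X0 = 1
            f = sign(sum(wi * xi for wi, xi in zip(w, xb)))
            w = [wi + lam * (y - f) * xi for wi, xi in zip(w, xb)]
    return w

data = [((1,0,0),-1), ((1,0,1),1), ((1,1,0),1), ((1,1,1),1),
        ((0,0,1),-1), ((0,1,0),-1), ((0,1,1),1), ((0,0,0),-1)]
print(train_perceptron(data, epochs=1))   # [-0.2, 0.0, 0.2, 0.2], as in epoch 1
```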

  11. Perceptron Learning Rule. Since f(w, x) is a linear combination of the input variables, the decision boundary is linear. For nonlinearly separable problems, the perceptron learning algorithm will fail because no linear hyperplane can separate the data perfectly.

  12. Nonlinearly Separable Data. XOR data: y = x1 ⊕ x2.

      x1  x2 |  y
       0   0 | -1
       1   0 |  1
       0   1 |  1
       1   1 | -1
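Running the perceptron rule on this data illustrates the failure mode from the previous slide: since no linear hyperplane separates XOR, at least one example is misclassified in every epoch, so training never converges. A self-contained sketch (names and the 20-epoch cap are ours):

```python
def sign(s):
    return 1 if s >= 0 else -1

def perceptron_epoch(data, w, lam=0.1):
    """One pass of the perceptron rule; returns updated weights and the
    number of misclassified examples seen during the pass."""
    errors = 0
    for x, y in data:
        xb = [1] + list(x)
        f = sign(sum(wi * xi for wi, xi in zip(w, xb)))
        errors += (f != y)
        w = [wi + lam * (y - f) * xi for wi, xi in zip(w, xb)]
    return w, errors

xor = [((0, 0), -1), ((1, 0), 1), ((0, 1), 1), ((1, 1), -1)]
w = [0.0, 0.0, 0.0]
for epoch in range(20):
    w, errors = perceptron_epoch(xor, w)
print(errors)   # never reaches 0: XOR is not linearly separable
```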

  13. Multilayer Neural Network. Hidden layers: intermediary layers between the input and output layers. More general activation functions (sigmoid, linear, etc.).

  14. Multi-layer Neural Network. A multi-layer neural network can solve any type of classification task involving nonlinear decision surfaces, such as the XOR data. [Figure: input nodes n1 (x1) and n2 (x2) feed hidden nodes n3 and n4 through weights w31, w41, w32, w42; n3 and n4 feed output node n5 through weights w53 and w54, producing y.]
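To make the claim concrete for XOR, here is one hand-picked set of weights for the n1-n5 network above (our choice of values, not from the slide), using 0/1 step units: n3 computes OR, n4 computes AND, and n5 fires when OR holds but AND does not, which is exactly XOR.

```python
def step(s):
    return 1 if s > 0 else 0

def xor_net(x1, x2):
    """2-2-1 network: n3 = OR(x1, x2), n4 = AND(x1, x2), n5 = n3 AND NOT n4."""
    n3 = step(1.0 * x1 + 1.0 * x2 - 0.5)    # w31 = w32 = 1, threshold 0.5
    n4 = step(1.0 * x1 + 1.0 * x2 - 1.5)    # w41 = w42 = 1, threshold 1.5
    return step(1.0 * n3 - 1.0 * n4 - 0.5)  # w53 = 1, w54 = -1, threshold 0.5

assert [xor_net(a, b) for a, b in [(0, 0), (1, 0), (0, 1), (1, 1)]] == [0, 1, 1, 0]
```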

  15. Learning Multi-layer Neural Network. Can we apply the perceptron learning rule to each node, including the hidden nodes? The perceptron learning rule computes the error term e = y - f(w, x) and updates the weights accordingly. Problem: how do we determine the true value of y for the hidden nodes? We can approximate the error in a hidden node by the error in the output nodes, but it is not clear how an adjustment in the hidden nodes affects the overall error, and there is no guarantee of convergence to an optimal solution.

  16. Gradient Descent for Multilayer NN.

      Weight update:  w_j^(k+1) = w_j^(k) - lambda * dE/dw_j

      Error function: E = (1/2) sum_{i=1..N} ( t_i - f( sum_j w_j x_ij ) )^2

      The activation function f must be differentiable. For the sigmoid function:

      w_j^(k+1) = w_j^(k) + lambda * sum_i ( t_i - o_i ) o_i ( 1 - o_i ) x_ij

      Stochastic gradient descent: update the weights immediately after each example.
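A sketch of the stochastic variant of the sigmoid update above, for a single sigmoid unit (learning rate, weights, and input are illustrative choices):

```python
import math

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

def sgd_step(w, x, t, lam=0.1):
    """One stochastic gradient-descent update for a sigmoid unit:
    w_j <- w_j + lam * (t - o) * o * (1 - o) * x_j."""
    o = sigmoid(sum(wj * xj for wj, xj in zip(w, x)))
    return [wj + lam * (t - o) * o * (1 - o) * xj for wj, xj in zip(w, x)]

# Example: target t = 1 for one input vector (x[0] = 1 is the bias input).
w = sgd_step([0.0, 0.0, 0.0], x=[1, 0, 1], t=1)
```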

  17. Gradient Descent for Multilayer NN. [Figure: neurons p and q in hidden layer k-1 feed neuron i in hidden layer k through weights w_pi and w_qi; neuron i feeds neurons x and y in hidden layer k+1 through weights w_ix and w_iy.]

      For output neurons, the weight update formula is the same as before (gradient descent for the perceptron). For hidden neurons:

      w_pi^(k+1) = w_pi^(k) + lambda * o_i ( 1 - o_i ) * ( sum_j delta_j w_ij ) * x_pi

      Output neurons:  delta_j = o_j ( 1 - o_j ) ( t_j - o_j )
      Hidden neurons:  delta_j = o_j ( 1 - o_j ) * sum_k delta_k w_jk
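A sketch of both delta formulas on a tiny 2-2-1 sigmoid network like the XOR network of slide 14 (all weight and input values are arbitrary illustrations):

```python
import math

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

def backprop_deltas(x, t, w_hid, w_out):
    """Forward pass, then the two delta rules:
    output neuron:   delta = o (1 - o) (t - o)
    hidden neuron j: delta_j = o_j (1 - o_j) * sum_k w_jk delta_k."""
    h = [sigmoid(sum(wj * xj for wj, xj in zip(w, x))) for w in w_hid]
    o = sigmoid(sum(wk * hk for wk, hk in zip(w_out, h)))
    d_out = o * (1 - o) * (t - o)
    d_hid = [hj * (1 - hj) * w_out[j] * d_out for j, hj in enumerate(h)]
    # Each incoming weight w_pi then moves by lam * delta_i * (input to w_pi).
    return d_out, d_hid

d_out, d_hid = backprop_deltas(x=[1, 0], t=1,
                               w_hid=[[0.5, -0.5], [0.3, 0.8]],
                               w_out=[0.2, -0.4])
```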

  18. Design Issues in ANN.
      - Number of nodes in the input layer: one input node per binary/continuous attribute; k or ceil(log2 k) nodes for each categorical attribute with k values (see the sketch below).
      - Number of nodes in the output layer: one output node for a binary class problem; k or ceil(log2 k) nodes for a k-class problem.
      - Number of nodes in the hidden layer.
      - Initial weights and biases.
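For instance, a categorical attribute with k = 4 values can be presented to the input layer either as k one-hot nodes or as ceil(log2 k) = 2 binary-coded nodes; a small illustration (function names ours):

```python
import math

def one_hot(value, k):
    """k input nodes, one per category value (0 .. k-1)."""
    return [1 if i == value else 0 for i in range(k)]

def binary_code(value, k):
    """ceil(log2 k) input nodes carrying the value's binary code."""
    bits = max(1, math.ceil(math.log2(k)))
    return [(value >> i) & 1 for i in range(bits)]

assert one_hot(2, 4) == [0, 0, 1, 0]
assert binary_code(2, 4) == [0, 1]
```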

  19. Characteristics of ANN. Multilayer ANNs are universal approximators, but can suffer from overfitting if the network is too large. Gradient descent may converge to a local minimum. Model building can be very time consuming, but testing can be very fast. Can handle redundant attributes, because the weights are learned automatically. Sensitive to noise in the training data. Difficult to handle missing attributes.

  20. Recent Noteworthy Developments in ANN. Use in deep learning and unsupervised feature learning, which seek to automatically learn a good representation of the input from unlabeled data. Example: the Google Brain project, a one-billion-connection network that learned the concept of a cat by looking at unlabeled pictures from YouTube.
