Machine Learning
Winner Takes All Networks (WTA)
Prof. Dr. Volker Sperschneider
AG Maschinelles Lernen und Natürlichsprachliche Systeme
Institut für Informatik Technische Fakultät
Albert-Ludwigs-Universität Freiburg
Motivation • Unsupervised learning, clustering, prototyping • Training set of real-valued n-dimensional vectors
(= „blue“ dots in next diagram)
• Determine k vectors in n-dimensional space (“red“ dots in next diagram) that best represent all training vectors
x_1, …, x_l ∈ ℝ^n
w_1, …, w_k ∈ ℝ^n
Motivation
[Figure: training vectors (blue) and prototype vectors w_1, w_2, w_3 (red); for a training vector x_p, dist = ‖w_3 − x_p‖]
Complexity of Clustering
• Given point set
• Given allowed number of clusters k.
• Problem 1: n = 2, l and k variable
Finding an optimal clustering is NP-hard.
x_1, x_2, …, x_l ∈ ℝ^n
Complexity of Clustering
• Given point set
• Given allowed number of clusters k.
• Problem 2: k = 2, l and n variable
Finding an optimal clustering is NP-hard.
x_1, x_2, …, x_l ∈ ℝ^n
Complexity of Clustering
• Given point set
• Given allowed number of clusters k.
• Problem 3: k = 2, n fixed, l variable
Finding an optimal clustering is polynomially solvable, with a polynomial of degree nk + 1:
x_1, x_2, …, x_l ∈ ℝ^n
O(l^{nk+1} log l)
Motivation
• Assume that the norm of the training vectors x has been normalized to 1 on average.
• Therefore we may assume that the norm of the prototype vectors w is fixed to 1, too.
• This shows that distance minimization is equivalent to inner-product maximization:
dist(w, x)² = ‖w − x‖² = (w − x)ᵀ(w − x) = wᵀw − 2wᵀx + xᵀx = ‖w‖² − 2wᵀx + ‖x‖² = 2 − 2wᵀx   (using ‖w‖ = ‖x‖ = 1)
Motivation
• Inner products are computed as the net input of neurons. A linear neuron also outputs this value as its activation.
• The inner product of two vectors measures their similarity.
• Using k linear output neurons and taking the one that outputs the maximum activation (= inner product) selects the neuron whose weight vector maximally resembles the input vector: THE WINNER.
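As a minimal sketch (assuming random unit-norm vectors and NumPy, not code from the lecture), winner selection by maximal inner product coincides with winner selection by minimal Euclidean distance:

```python
import numpy as np

rng = np.random.default_rng(0)

# k prototype (weight) vectors and one input, all normalized to unit length.
k, n = 4, 6
W = rng.normal(size=(k, n))
W /= np.linalg.norm(W, axis=1, keepdims=True)
x = rng.normal(size=n)
x /= np.linalg.norm(x)

# Each linear output neuron computes its net input w_i^T x as activation.
activations = W @ x
winner = int(np.argmax(activations))

# With unit norms, maximal inner product = minimal Euclidean distance.
distances = np.linalg.norm(W - x, axis=1)
assert winner == int(np.argmin(distances))
```

The final assertion is exactly the identity dist(w, x)² = 2 − 2 wᵀx from the derivation above: a monotone decreasing function of the inner product.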
Motivation
Thus the typical WTA architecture looks as follows – here with 6 input neurons and 4 output neurons:
[Figure: input vector x feeding 4 output neurons with weight vectors w_1, …, w_4 and activations w_1ᵀx, …, w_4ᵀx]
Sometimes output neurons are arranged in a specific manner with neighborhood relations between them – an example with 2 input neurons and 15 = 5 x 3 output neurons arranged in a 2-dimensional grid:
Or as a ring structure
Another arrangement is the hexagonal grid:
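These neighborhood relations can be made concrete as distance functions on the neuron indices. The sketch below uses the Manhattan distance on a 5 × 3 grid, the shortest path on a ring, and a Gaussian neighborhood strength; these are common choices for Kohonen nets, not ones prescribed by the slides:

```python
import numpy as np

def grid_distance(i, j, cols=5):
    # Manhattan distance between output neurons i and j arranged
    # row-major in a grid with `cols` columns (here 5 x 3 = 15 neurons).
    ri, ci = divmod(i, cols)
    rj, cj = divmod(j, cols)
    return abs(ri - rj) + abs(ci - cj)

def ring_distance(i, j, size=15):
    # Shortest distance along a ring of `size` neurons.
    d = abs(i - j) % size
    return min(d, size - d)

def neighborhood(d, sigma=1.0):
    # Gaussian neighborhood strength: 1 at the winner itself,
    # decaying with grid distance d and radius sigma.
    return np.exp(-d**2 / (2 * sigma**2))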
Neighborhoods are useful when prototype vectors carry symbolic labels and identical labels occur in local neighborhoods:
• Neighborhoods are useful when neighbouring neurons are prototypes for similar (e.g. tactile) sensory inputs – a topology-preserving map of the body surface:
• Map of face with eyes, nose and mouth:
• Linearly arranged output neurons may become prototypes for city locations and define a short tour – remember the TSP.
• Note that topology preservation holds only in one direction: neurons far apart in the line may nevertheless represent close vectors.
• A 2-dimensional grid may be trained on triples (x, y, z) with z = f(x, y) to span the graph of a 2-dimensional real function f in a topology-preserving manner – later used for function approximation and inverse kinematics.
• After training the net, fresh function values are recovered as follows: given (x, y), determine the winner neuron of the net with all third weights omitted; then read out the function value f(x, y) as the third weight of the winner neuron.
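This read-out step can be sketched as follows. To keep it self-contained, an idealized "trained" 10 × 10 grid is assumed whose weights lie exactly on the graph of a hypothetical function f(x, y) = x · y (a real Kohonen net would only approximate this placement):

```python
import numpy as np

# Idealized "trained" grid: weights (w_x, w_y, w_z) placed exactly on the
# graph of f(x, y) = x * y over [0, 1]^2.
f = lambda x, y: x * y
g = np.linspace(0.0, 1.0, 10)
X, Y = np.meshgrid(g, g)
weights = np.stack([X, Y, f(X, Y)], axis=-1).reshape(-1, 3)

def recover(x, y):
    # Winner search with the third weight omitted ...
    d = np.linalg.norm(weights[:, :2] - np.array([x, y]), axis=1)
    winner = np.argmin(d)
    # ... then read out f(x, y) as the winner's third weight.
    return weights[winner, 2]

print(recover(0.52, 0.48))  # close to 0.52 * 0.48 ≈ 0.25
```

The accuracy of the recovered value is limited by the grid resolution: the winner's first two weights quantize (x, y) to the nearest grid node.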
[Figure: stages of the grid unfolding over the graph of f in (x, y, z)-space]
Pole balancing
Keep a pole in balance by applying an acceleration (force) depending on the angle θ, which measures the deviation from the vertical position, and on the time derivative v of θ, which measures how fast the angle θ changes.
[Figure: pole deviating from the vertical by angle θ, with applied force]
Pole balancing
The correct control function is, using appropriate constants a and b:
f(θ, dθ/dt) = a · sin θ + b · dθ/dt
A 2-dimensional grid (a Kohonen net) is trained to map the graph of the function f, that is, to map the surface of the points (θ, v, f(θ, v)) in 3-dimensional space.
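A rough sketch of this control law in code. The constants a and b and the simplified pendulum model are illustrative assumptions, not values from the lecture:

```python
import math

# Control law f(theta, v) = a*sin(theta) + b*v with illustrative
# (not physically derived) constants a and b.
a, b = 20.0, 5.0

def control_force(theta, v):
    return a * math.sin(theta) + b * v

def step(theta, v, dt=0.01, g=9.81, length=1.0):
    # One Euler step of a crude inverted-pendulum model under this force
    # (assumed dynamics, for illustration only): gravity tips the pole over,
    # the control force pushes it back.
    accel = (g * math.sin(theta) - control_force(theta, v)) / length
    return theta + v * dt, v + accel * dt
```

With a > g the restoring term dominates gravity and the b·v term damps the motion, so a small initial deviation decays back toward the vertical position.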
Inverse kinematics
[Figure: two-link arm with joint angles α and β and end point (x, y)]
Every pair (α, β) of angles uniquely determines a pair of coordinates (x, y) of the end point (formulas?).
Inverse kinematics
• A 2-dimensional Kohonen grid is trained to map the „surface“ consisting of all points (α, β, x(α, β), y(α, β)).
• Let a destination point (x, y) be given.
Inverse kinematics
• A neuron with weight vector (w_α, w_β, w_x, w_y) is determined such that (w_x, w_y) is closest to (x, y).
• The delivered angles are (w_α, w_β).
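This look-up can be sketched as follows. Again an idealized "trained" map is assumed, with weights lying exactly on the surface of 4-tuples; the two-link arm with unit link lengths and its forward kinematics are illustrative assumptions, not taken from the slides:

```python
import numpy as np

def forward(alpha, beta, l1=1.0, l2=1.0):
    # Forward kinematics of a two-link arm with assumed link lengths l1, l2.
    x = l1 * np.cos(alpha) + l2 * np.cos(alpha + beta)
    y = l1 * np.sin(alpha) + l2 * np.sin(alpha + beta)
    return x, y

# Idealized "trained" map: sample (alpha, beta) on a grid and store the
# 4-tuples (w_a, w_b, w_x, w_y) = (alpha, beta, x(alpha, beta), y(alpha, beta)).
grid = np.linspace(0.0, np.pi / 2, 30)
A, B = np.meshgrid(grid, grid)
X, Y = forward(A, B)
weights = np.stack([A, B, X, Y], axis=-1).reshape(-1, 4)

def angles_for(x, y):
    # Winner: neuron whose (w_x, w_y) part is closest to the target (x, y);
    # the delivered angles are its (w_a, w_b) weights.
    d = np.linalg.norm(weights[:, 2:] - np.array([x, y]), axis=1)
    return weights[np.argmin(d), :2]

alpha, beta = angles_for(1.32, 1.23)  # angles whose end point lands near (1.32, 1.23)
```

Moving the arm to these angles brings the end point close to the destination; the residual error is bounded by the grid resolution of the map.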
Inverse kinematics
• Smooth navigation from source to destination.
• Obstacles are avoided provided the Kohonen grid was trained to map admissible 4-tuples.
• Technical details and further illustrations are, from now on, taken from Martin Riedmiller's presentation wtan.printer.