Perceptrons (popularized by Marvin Minsky) are the building blocks of a certain kind of ANNs.
Input to a perceptron consists of a vector of real values. Output is
1 if a linear combination of these inputs is greater than a threshold
and -1 otherwise. Formally, if
is the
input to a perceptron, then its output
is given by
Learning a perceptron amounts to just choosing the best
.
The hypothesis space H for an n input perceptron is the set of all
possible real valued weight vectors in n+1 dimensions.
If
is given, then
partitions the
(n+1)-dimensional input space into points where
and
.
In fact,
is the equation for an (n+1)-dimensional hyperplane that we can
interpret as a decision surface.
If we are asked to find some
such that
divides all the positive examples
from the negative
examples
,
is this possible? Unfortunately, not always,
because we can imagine linearly inseparable points.
Important observation: If we have m linearly separable training examples, note that we have a linear system of m simultaneous inequations in n unknowns (weights). We can solve this using standard linear programming techniques.
Representational power of perceptrons:
If we represent the boolean values True and False by 1 and -1 resp., all primitive boolean functions, AND, OR, NAND and NOR can be represented by perceptrons.
Each is a special case of the m-of-n functions, i.e. functions where at least m of n inputs must be true.
To implement an m-of-n function, set all input weights
to some identical constant value. Then solve an
inequality for the w0 that gives the desired function. Work out an
example.
Unfortunately, XOR is a boolean function that cannot be represented by a single perceptron. The set of possible instances is linearly inseparable.
XOR can, however, be represented using combinations (networks) of perceptrons, because XOR can be expressed using just AND and OR.
So, in fact, every boolean function can be represented by some network of perceptrons only two levels deep.
Point to ponder: What is the easiest way to determine the change in weights required to convert an AND perceptron into an OR one?