*Here I continue my effort to write an accessible introduction to Parallel Distributed Processing. In Part 1, I introduced units, connections, and connection strengths. In Part 2, I introduce the concept of the activation function, and the difference between linear and saturating (e.g., sigmoidal) activation functions. But with vampires!*

**3. The Activation Function**

Remember, the goal of this network is to answer questions like “If I date a young vampire and an old vampire simultaneously, how much danger will I be in?” or “Is it more dangerous to date a human and an old vampire, or a young vampire and a shapeshifter?” To answer these questions with mathematical rigor, we need to find a way to put numbers on our units and connections, so that we can put a number on DANGER!

We already used our intuition to come up with some relative connection strengths: these were reflected in the thicknesses of the connections in Figure 3. As it turns out, using intuition to come up with connection weights is, historically, the first method that was used to build this type of network. However, it was quickly discovered that when a network has, say, 100,000 connections, this is not a very practical plan. What works better for most of the networks we will consider is using a mathematical rule called a **training algorithm** to set the weights. You won’t ever have to know much about training algorithms, other than that they exist and that they set weights automatically, without the modeler having to figure out 100,000 (or more!) weights. We definitely don’t need to think about training algorithms right now. Instead, let’s use the intuitive pipe-thicknesses we came up with earlier to assign some numerical weights to the network:

[Figure 4: the network with a numerical weight on each connection]

In Figure 4, numbers are
assigned to weights based on our intuitions about how dangerous each type of
suitor is. That is, we thought that we
would rank these potential suitors as follows in order of dangerousness, from
least to most:

- 1: Human
- 2: Young Vampire
- 3 (tie): Shapeshifter, Werewolf
- 4: Old Vampire

The weights in Figure 4
are simply the numbers from that ranking.

Now, how do these numbers help us find numerical solutions to Sookie's DANGER problem? One way to think of each weight is as how much the DANGER unit would be activated if Sookie poured “1” drop of activation into that suitor’s unit. So, if in her simulations Sookie activates the “Young Vampire” unit to “1”, the DANGER unit would be activated to “2”. That number is, of course, meaningless by itself; it only starts to make sense when we consider what would happen if Sookie activated the “Old Vampire” unit instead. In that situation, based on these weights, the DANGER unit would be activated to “4”. This reflects the idea that the old vampire is twice as dangerous as the young vampire.

In this example, the **activation function** for our network is the following: the activation of the DANGER unit is the sum, over all suitors, of each suitor’s connection weight multiplied by that suitor’s activation.

Work out the examples from above for yourself, to convince yourself this is true. Remember, Sookie is starting out with simple simulations, so she’s only pouring “1” of activation into a single unit to start.
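If you would rather let a computer do the pouring, here is a minimal Python sketch of a weighted-sum activation function, as the surrounding text describes it (each weight is multiplied by its suitor's activation, and the results are summed). The weights come from the ranking behind Figure 4; the names `WEIGHTS` and `danger` are made up for illustration and are not part of any library.

```python
# Connection weights taken from the dangerousness ranking (Figure 4).
# The dictionary keys and the function name are hypothetical, chosen for this sketch.
WEIGHTS = {
    "human": 1.0,
    "young_vampire": 2.0,
    "shapeshifter": 3.0,
    "werewolf": 3.0,
    "old_vampire": 4.0,
}


def danger(activations):
    """Linear activation: sum of (weight * activation) over the active suitor units."""
    return sum(WEIGHTS[suitor] * amount for suitor, amount in activations.items())


# Sookie pours "1" of activation into a single unit at a time.
print(danger({"young_vampire": 1.0}))  # 2.0
print(danger({"old_vampire": 1.0}))    # 4.0
```

Any suitor left out of the dictionary is treated as having zero activation, so it contributes nothing to the sum.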

Our activation function lets us quantify some more of our intuitions. For example, if Sookie enters into only a half-hearted relationship (pours in only “1/2” activation to a suitor), that is probably only half as dangerous as a more serious relationship. This intuition is reflected by the fact that a weight has to be multiplied by its suitor’s activation before being “sent” to the DANGER unit. Similarly, the more suitors Sookie engages in a relationship with, the more items will go into the sum, resulting in more activation on the DANGER unit. In fact, with this formula, we now have the mathematical machinery to answer questions of arbitrary complexity. For example: “How dangerous is it to date 2 humans very seriously (human unit activation “6”) while simultaneously dating a shapeshifter and a werewolf half-heartedly (shapeshifter activation “0.5” and werewolf activation “0.5”) after having revoked young vampire’s invitation to my house (young vampire activation “-1”)?” Check for yourself!
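To check your arithmetic on that last question, here is the same kind of weighted-sum sketch in Python, applied to several suitors at once. As before, `WEIGHTS` and `danger` are hypothetical names used only for this illustration; the weights and activations are the ones given in the text.

```python
# Weights from the ranking (Figure 4); names are illustrative only.
WEIGHTS = {
    "human": 1.0,
    "young_vampire": 2.0,
    "shapeshifter": 3.0,
    "werewolf": 3.0,
    "old_vampire": 4.0,
}


def danger(activations):
    """Weighted sum of suitor activations (the linear activation function)."""
    return sum(WEIGHTS[suitor] * amount for suitor, amount in activations.items())


# A half-hearted relationship is half as dangerous as a serious one.
print(danger({"old_vampire": 0.5}))  # 2.0, versus 4.0 at full activation

# The complicated question from the text.
complicated = danger({
    "human": 6.0,           # two humans, very seriously
    "shapeshifter": 0.5,    # half-hearted
    "werewolf": 0.5,        # half-hearted
    "young_vampire": -1.0,  # invitation revoked
})
print(complicated)  # 6*1 + 0.5*3 + 0.5*3 + (-1)*2 = 7.0
```

The same function also settles the questions from the start of this section: a human plus an old vampire (1 + 4) comes out exactly as dangerous as a young vampire plus a shapeshifter (2 + 3).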

Once again, exploring our intuitions about this model also reveals one of its flaws. Consider the following: in this model, every time Sookie adds a new suitor (either by increasing the activation in a pre-existing unit or by adding some activation to a new unit; say she wants to start dating a necromancer, for some reason), the total amount of predicted DANGER will increase. However, past a certain threshold of DANGER, Sookie will no longer be in DANGER at all, because she will simply be DEAD. That is, DANGER is a quantity that **saturates.** There’s no realistic situation where Sookie can be in infinite danger. Mathematically, the problem here is that we have used a **linear** activation function instead of a **saturating** activation function. That is, the relationship between input (suitor) activation and output (DANGER) activation in our network looks like this:

[Figure: DANGER rising along a straight line as input activation increases]

When in reality it should look like this:

[Figure 6: DANGER leveling off as input activation increases]
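The difference between the two curves can be sketched in a few lines of Python. This is only an illustration under some assumptions: the post does not say which saturating function to use, so the standard logistic sigmoid stands in here, and the function names and the `ceiling` parameter are invented for the example.

```python
import math


def linear_danger(net_input):
    """Linear activation: output grows without bound as input grows."""
    return net_input


def saturating_danger(net_input, ceiling=1.0):
    """Logistic sigmoid (an assumed choice), scaled so the output never exceeds `ceiling`."""
    return ceiling / (1.0 + math.exp(-net_input))


# Compare the two as the summed, weighted suitor activation gets larger.
for net_input in (0, 2, 4, 7, 20, 100):
    print(net_input, linear_danger(net_input), round(saturating_danger(net_input), 4))
```

The linear column keeps climbing with every extra suitor, while the sigmoid column flattens out just under the ceiling. One quirk of the plain logistic function is that it gives 0.5 for an input of zero; real models typically shift or rescale it to suit the quantity being modeled.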

For essentially this very reason (though not usually spelled out so luridly), most of the networks we will study will use the **sigmoidal** activation function shown in Figure 6.

**4. Interim Summary**

At this point, in a very serious sense, we have learned everything there ever will be to know about connectionist networks. Namely, there will be **units.** They will be **connected.** The connections will differ in their **weights,** and most of the time weights will be determined by a **training algorithm.** Information will flow between units at different levels as numerically specified by the **activation function**, which will usually **saturate.** Your first step in understanding any model that we read about will be to identify what kind of units there are, and how they are connected. If you can do that, you will be in good shape.

However, the models we
read about will be substantially more complicated than this one. Next, I will talk about two of the most
common complications we will encounter:
deep networks and networks with distributed representation.
