High-Capacity Recursive Neural Coding
New IFS RAAM
Is there a "natural" behavior of a RAAM system that we are not taking advantage
of? The single-layer network with (by default) k inputs and 2k outputs is used
recurrently, with either the left or right k outputs fed back to the input.
Because of the sigmoidal "squashing" function on the outputs, this mapping
behaves much like a contraction; thus, in the limit, the set of all
possible decoded representations falls on a fractal attractor
(Pollack, 1991; Stucki and Pollack, 1992; Kolen, 1994). Given a random initial
condition, any sequence of left/right decodings will ultimately put the output
on the attractor; therefore, any initial condition not on the attractor will
have a structured transient to the attractor.
Thus we can fully understand the logical conundrums described above:
- By not using the attractor as the terminal test, decoding a
random initial condition could lead to an infinite loop of decodings
which are on the attractor yet never satisfy the "logical" terminal test.
- Because the training allowed non-terminals to float around,
some non-terminal could float into regions defined as terminals, leading to
the early termination problem.
- Finally, because the attractor has a fractal nature, it could
not be easily modeled by simple single-layer terminal tests or by epsilon
regions around specific points.
With this knowledge in hand, we can now propose a new terminal test: "Is this
point on the attractor?" This is not so much a simple calculation as a
logical one: the non-terminals (points in the space not on the attractor) are clearly
distinct from the terminals, and every code has a finite tree-shaped transient.
By assigning different terminal symbols to different areas of the attractor derived from the weights of the decoder, an entire grammar of trees over those symbols is induced!