Slightly technical, this paper provides a good overview of how Johnson thought about cognition and perception, with particular focus on the importance of interaction, play, and intention.
ABSTRACT
Introspection leads us to believe that universals and meaning are extracted from sensory information before responsive action commences: as if we act first as passive observers whose subsequent reactions follow a process of recognition and decision. We are inclined to base our designs of cognitive systems upon such subjective insights. However, if we take into account the necessity for a prior, ontogenetic growth of a repertoire of perception, it appears more probable that we are engaged in unceasing interaction with our environment and that a response itself, driven by homeostatic mechanisms, contains the meaningful information about the object of the response. The components of the world are thereby described not in terms of their intrinsic spatio-temporal parameters but by their relation to the observer. A rudimentary visual system employing a Probability State Variable (PSV) Controller as a homeostat is described, and the prototype peripheral hardware suggested. The addition of an associative memory would allow the buildup of a cognitive model of the world, but the reactive searching by the system described (as if it were an autonomous "perceptuo-postural" mechanism) must begin the process of discovering the properties of reality.
INTRODUCTION
It is the purpose of this paper to suggest, both in conceptual and in experimental form, a self-organizing system of coordinated sensory-motor behavior which acts as an active perceptual buffer between the real world and the stored cognitive model of that world. It is, in essence, a reflexive postural mechanism of a very general sort.
When asked to describe the process by which one recognizes objects or to build a system which will simulate this process, one is generally led by way of introspection to formulate the steps of information reduction as if observation were a passive activity: as if the sensory inputs existed in isolation from effector mechanisms which make possible an interaction with the environment; as if the information retrieval which devolves from our opportunity to participate in the world is simply "in addition to" that which is available without participation. Our very desire to formulate the process as a sequential algorithm obscures for us the alternative notion that our meaningful contact with the world might be more accurately described as arising from the equilibrium-seeking adjustments of a complex, homeostatic mechanism which is in operation at all times. The difference in point of view is not trivial; it is the difference between the idea that an object itself has meaning within its own spatio-temporal context, and the idea that the meaning of the object is to be found in its relation to the observer. Gibson (Ref. 4) considers the senses as perceptual systems; the intent of this present paper is to move a step further away from the raw data, and to suggest active, behavioral contact with the world through a homeostatic buffer.
Most of the pattern recognition procedures given currency (Refs. 5, 6, 7) are no more than response-to-stimulus machines and have no means by which to probe the world or to reorient their gaze. The latter limitations necessarily obtain when the patterns are discrete, predetermined, and fixed, as in the analysis of cloud-chamber tracks or weather satellite photographs. However, when the object to be recognized is available as an attribute of the immediate environment of the observing system, it is possible to derive from it more than is conveyed by a simple sequence of "looks" at its spatio-temporal qualities. In fact, if only passive observation is allowed, then a very important question is begged, because we require that the observer be sophisticated: i.e., that somewhere there exist a model for comparison so that a meaningful output may result from the observation. The question: How did that model arise in the first place? What mechanism of "knowing" is available to conscious, physiological organisms?
If one may take the contrasting view that an organism has at its disposal at the outset a process of interaction with its world which is determined solely by the physical constraints of its own embodiment and those of the immediate environment and by a rudimentary homeostatic, equilibrium-seeking response mechanism which it has inherited as the most primitive of survival tactics, then insight may be gained into the buildup of a repertoire of interpretation and meaning. The repertoire grows by way of direct, participating interaction between the physical constraints of the world pitted against those which are peculiar to the organism. The world and the self are discovered simultaneously by way of a sort of interface impedance measurement having the nature of an on-going loop interaction. Whatever cognitive model the organism may derive as the result of experience is not a necessary part of the interactive process. Quite the contrary: the information serving to build up a cognitive model is that which is descriptive of the responsive behavior of the organism necessary for the maintenance of homeostasis. Any testing of the model which might be desirable is performed by the system's inserting from within additional constraints at the input, there to modify system behavior as if they were another component of the external world. Thus, the homeostatic process acts within the mechanism of perception, as if it were a peripheral postural reflex, performing its assigned task in the face of applied perturbations both external and internal.
A COMPARATIVE EXAMPLE
Consider Fig. 1, which embodies in block form a familiar and generalized approach to the problem of pattern recognition. Many of the systems to be found described in the literature do not continue the information processing around a closed loop but terminate at the successful classification of the viewed pattern or at the extraction of some meaning from the data through comparison with a stored, learned model of the world. When an effector feedback loop is included and can either alter the environment or reorient the system's view of it, a more comprehensive set of data may be obtained, but the arrival at an object identification is still the function of the recognition algorithms; the attempt is to identify the external object by characteristics peculiar to it but not in terms descriptive of the relation of the system to the object. Appreciation of relation is a later function, sequentially, of the decision block.
Note also that if the effector outputs (labelled "Behavior") are to be effective and useful to the system, they must be a somewhat predictable function of the decisions made and of the "Do" commands issued. In other words, the world-model stored by the system must not only be able to identify an input pattern sufficiently well to select an appropriate course of action, but the decisions and resulting commands must be meaningful within the context of a prior knowledge of the effector dynamics and of the other physical constraints presented by the environment. It may be necessary but is not sufficient that the block labelled "Effectors" be a stable postural mechanism containing its own error-correcting feedback paths. The "Do" command itself must be tailored to produce recognizable effects. Fig. 1a is a vastly simplified view of Fig. 1 and is presented for comparison with Fig. 2a as illustrative of the main point of this paper.
Now consider Fig. 2, which diagrams the general form of the existential, experiential system discussed in the introduction. The sensor array might be the same as that of Fig. 1, as might the effectors, but there the similarity ends. In fact, the interface with the environment need not be specified very precisely, provided only that it offer opportunities for dialog in appropriate parameters, for the information regarding their spatial arrangement will be found to be redundant. What is essential is the behavior of the system, not its structure. The Zen master will tell you that what kills is not the arrow but its flight.
The block labelled "Statistical Pooling" accepts the raw data and performs various convergent operations upon it, preserving the qualities of the ensemble but dispensing with the addresses of the data origin. The specific statistical characteristics to be derived from the ensemble of data channels will depend upon the homeostatic repertoire one wishes to build into the system. A detailed example is provided below in which sums of data magnitudes and a maximum difference are chosen. Whatever the convergent operations, their results are presented to the "Homeostat", whose goal, via manipulation of the entire system-environment loop, is the nulling or maximizing of its inputs. Actual nulls or maxima may never in fact be reached, but the system will continue to experiment with its output channels in such a manner as to tend toward its goals. Note that neither a priori knowledge of the dynamics of the effector system nor of the properties of the physical environment is necessary. The meaningful output of the system is now the "Behavior" itself. Fig. 2a is an amplification of Fig. 2.
It is as though one system were asked to describe a pencil lying on the table before it. The system of Fig. 1 would tell us of the color, linearity, and pointedness of the pencil. The system of Fig. 2 would reach for it, pick it up, and write with it. The latter would be content that in demonstrating its relation to the pencil actively, it had thereby defined and described a pencil, and in particular that pencil. The system has made a metaphoric identification of the pencil within a given context (Ref. 6).
A PROTOTYPE EXPERIMENTAL SYSTEM
Fig. 3 diagrams a rudimentary "visual" system which incorporates two identical mosaic arrays Ry and Bp of photoresistors as sensing elements, and a number of component position controls as effector outputs. The latter include alteration of the binocular convergence by manipulation of one mirror, K?; simultaneous focus variation of the two sets of optics, L^ and L; and independent control of azimuth and elevation of the entire system (indicated). All outputs might be driven by incremental stepping motors.
The two photomosaic "retinas" are to be distributed as in Fig. 4 so that the elements are more densely concentrated at the center than in the surround. The latter property, effectively an informational zoom, will operate to assure a centering upon the array of the image of whatever object the system finds "interesting".
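One way to picture such a center-weighted distribution is sketched below. It is only an illustration: the ring layout, the growth factor, and the names `foveated_mosaic` and `count_within` are assumptions made here, not the arrangement of Fig. 4.

```python
import math

def foveated_mosaic(rings=5, per_ring=8, r0=1.0, growth=1.6):
    """Return (x, y) positions for a hypothetical center-dense mosaic.

    Every ring carries the same number of elements, but ring radii grow
    geometrically, so elements per unit area fall off away from the center.
    """
    points = [(0.0, 0.0)]  # one element at the very center
    for k in range(rings):
        radius = r0 * growth ** k
        for j in range(per_ring):
            angle = 2.0 * math.pi * j / per_ring
            points.append((radius * math.cos(angle), radius * math.sin(angle)))
    return points

def count_within(points, radius):
    """Count the elements lying within `radius` of the center."""
    return sum(1 for x, y in points if math.hypot(x, y) <= radius)

mosaic = foveated_mosaic()
outer = 1.0 * 1.6 ** 4                      # radius of the outermost ring
inner_count = count_within(mosaic, outer / 2)
print(len(mosaic), inner_count)             # 41 elements, 25 in the inner disc
```

With these assumed parameters the inner disc of half the outer radius covers one quarter of the area yet holds well over half of the elements, so an image drifting toward the center is sampled by progressively more cells.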
Electrically, the elements at corresponding locations in the two arrays will be connected bridge-fashion as in Fig. 5 so that the common tie-point voltage level, E_i, is a measure of the inequality of light falling onto the two cells. Likewise, the current flowing through the pair, I_i, is a measure of the sum of the light levels at the two corresponding points.
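The behavior of one such pair can be sketched numerically under some loudly stated assumptions, none of which come from Fig. 5: each photoresistor is taken to have a resistance inversely proportional to the light falling on it, the pair is placed in series across a split supply, and the tie-point voltage is read against ground. The function name and parameters are hypothetical.

```python
def bridge_measures(light_a, light_b, supply=1.0, k=1.0):
    """Tie-point voltage and series current for one pair of photocells.

    Assumed (hypothetical) circuit: cell A from +supply to the tie point,
    cell B from the tie point to -supply, each with resistance k / light.
    """
    r_a = k / light_a
    r_b = k / light_b
    current = 2.0 * supply / (r_a + r_b)   # rises when either cell brightens
    voltage = supply - current * r_a       # zero when the two cells match
    return voltage, current

print(bridge_measures(2.0, 2.0))   # matched illumination
print(bridge_measures(3.0, 1.0))   # cell A brighter than cell B
```

Under these assumptions the voltage works out to supply * (a - b) / (a + b), a normalized inequality, while the current grows with either light level rather than equalling their exact arithmetic sum; it serves as a brightness measure only in that looser sense.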
The "Statistical Pooling" operation is performed upon all of the bridge current and voltage measures without regard to the position within the mosaic arrays that individual pairs of cells maintain. It is suggested that by use of diode bridge techniques the following convergent operations be performed upon the data:
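The three operations are characterized in the paragraphs marked Eq. (1) through Eq. (3). Read from those descriptions alone, and offered as a reconstruction under that assumption rather than as the author's own notation, they may be written with E_i as the tie-point voltage and I_i as the current of the i-th pair of cells:

```latex
\begin{align}
F &= \sum_i \lvert E_i \rvert \tag{1} \\
G &= \max_{i,j} \lvert I_i - I_j \rvert \tag{2} \\
H &= \sum_i \left\lvert \frac{dE_i}{dt} \right\rvert \tag{3}
\end{align}
```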
The "Homeostat" will be a multivariable Probability State Variable (PSV) Controller (Refs. 9, 10) which will manipulate the incremental outputs described above in such a manner as to null F, to maximize G (or null 1/G), and to null H.
Eq. (1) If E_i is a measure of the difference in the light falling upon the cells in the ith position of the two retinas, then a tendency to null the sum of all such absolute differences will serve to make the two images falling on those retinas as identical as possible. That is, seeking to null F will promote convergence of the binocular optics upon features of the visual field.
Eq. (2) If the current flowing through any pair of photocells is a measure of the brightness of corresponding points on the two retinas, then the tendency to maximize the largest difference to be found between any two such points assures us that the system will be likely to pay more attention to boundaries of high contrast than low, since the former will provide large local differences. Maximizing G should also cause improvement in the sharpness of focus of the images.
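A toy, one-dimensional illustration of this convergence argument follows. It assumes that the per-cell difference is simply the difference of two light levels and that a vergence error appears as a circular shift of one image; the scene values and function names are hypothetical.

```python
def pooled_difference(left, right):
    """F: sum of absolute differences between corresponding cells."""
    return sum(abs(a - b) for a, b in zip(left, right))

def shifted(image, offset):
    """Circularly shift a 1-D image; stands in for a vergence error."""
    n = len(image)
    return [image[(i - offset) % n] for i in range(n)]

scene = [0, 0, 1, 5, 9, 5, 1, 0, 0, 0]     # hypothetical light levels
errors = {d: pooled_difference(scene, shifted(scene, d)) for d in range(-3, 4)}
best = min(errors, key=errors.get)
print(errors, best)
```

In this sketch F is zero only at zero offset and positive for every other offset tried, so a mechanism that merely tends to reduce F is drawn toward the registered condition without being told where the feature lies.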
Eq. (3) As an operation, H is similar to F, but the variables of interest are the time rates of change of the differences between corresponding image points. If F works to make the images as identical as possible, H is conducive to keeping those images from changing rapidly. The function of H nulled, within the context of the total behavior of the system, is to keep the random scanning fairly subdued except in a featureless environment, but to assure locking onto and following closely any moving object which contrasts with the background. Overall the system will attempt to maintain fixed and focussed upon its retinas the boundaries of highest contrast irrespective of their position, orientation, or movement.
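The three pooled measures can be sketched together as follows. The readings are made-up numbers, the time derivative in H is approximated by a finite difference between two successive instants, and the function name `pool` is an assumption of this sketch.

```python
import random

def pool(E_now, E_prev, I_now, dt=1.0):
    """Return (F, G, H) from per-pair bridge readings, ignoring cell position."""
    F = sum(abs(e) for e in E_now)
    G = max(I_now) - min(I_now)            # largest difference between any two points
    H = sum(abs(a - b) / dt for a, b in zip(E_now, E_prev))
    return F, G, H

# Hypothetical readings for four cell pairs at two successive instants.
E_prev = [0.25, -0.5, 0.0, 1.0]
E_now = [0.5, -0.25, 0.0, 0.5]
I_now = [1.0, 3.5, 2.0, 0.5]

order = list(range(4))
random.Random(0).shuffle(order)            # scramble the cell "addresses"
scrambled = pool([E_now[k] for k in order],
                 [E_prev[k] for k in order],
                 [I_now[k] for k in order])
print(pool(E_now, E_prev, I_now), scrambled)
```

Scrambling the order of the cell pairs leaves all three values unchanged, which is the sense in which the pooling preserves the qualities of the ensemble while discarding the addresses of the data origin.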
It is anticipated that a rudimentary visual system, as described above, would exhibit sufficiently complex behavior when presented with a "real" environment to make evident to a viewer of the output behavior alone a wealth of information about the scene under scrutiny. The observer of that behavior will be limited in his accuracy of interpretation largely by the amount of experience he has had in common with the system: that is, by the degree to which he is familiar with the system's responses to the properties of the environment as observed by both. To put it another way: an appended "cognitive processor" which can introduce its own input ensembles and observe the system behavior will be limited in its "knowledge" of the world by the experience it has had in building up a model. The knowledge that it will have will be describable only in terms of experience: of relationship of the world and its parts to the system. All such relationships will have been discovered by way of participation in them (Ref. 11).
By its very nature, the PSV Controller will cause the system sufficient unrest, whatever the eventfulness of the surroundings, to assure that its gaze does not become fixed upon a single feature but rather that it hunts constantly, to some extent, even in the presence of a homeostatically "satisfying" condition. Relative distribution of attention time will be more heavily weighted toward the interesting features of the world, but only with a higher probability than for the others.
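The flavor of such restless, probabilistic hunting can be sketched with a much simpler stand-in. The code below is not the PSV Controller of Refs. 9 and 10, whose internals are not given here; it is a hypothetical guided random search in which each incremental output keeps a probability of stepping up rather than down, and that probability is never allowed to reach certainty. The cost function and all names are assumptions of this sketch.

```python
import random

def hunt(cost, start, steps=2000, rate=0.2, floor=0.1, seed=1):
    """Stochastic, never-resting search over incremental outputs.

    NOT the PSV Controller of Refs. 9 and 10: a hypothetical stand-in in
    which each output keeps a probability of stepping +1 rather than -1.
    """
    rng = random.Random(seed)
    x = list(start)
    p_plus = [0.5] * len(x)                # no initial preference
    last = cost(x)
    visits = {}                            # how often each posture is held
    for _ in range(steps):
        k = rng.randrange(len(x))          # choose one output channel
        d = 1 if rng.random() < p_plus[k] else -1
        x[k] += d                          # the step is always taken, never undone
        now = cost(x)
        # Favor this direction after an improvement, the opposite one otherwise.
        favored = d if now < last else -d
        target = 1.0 if favored > 0 else 0.0
        p_plus[k] += rate * (target - p_plus[k])
        # The floor keeps every direction possible, so hunting never stops.
        p_plus[k] = min(1.0 - floor, max(floor, p_plus[k]))
        last = now
        visits[tuple(x)] = visits.get(tuple(x), 0) + 1
    return x, visits

def toy_cost(x):
    """Hypothetical environment: a pooled error that is nulled at (3, -2)."""
    return (x[0] - 3) ** 2 + (x[1] + 2) ** 2

final, visits = hunt(toy_cost, start=[20, -15])
print(toy_cost(final), len(visits))
```

Because every step is taken and the step probabilities are bounded away from zero and one, the sketch drifts toward the low-cost region and then keeps wandering around it; the `visits` table records how the time spent is distributed over postures rather than a single resting point.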
DISCUSSION
If one of the purposes in discussing cognitive systems is to model them after biological organisms, account should be taken of the behavioral properties necessary both for phylogenetic and for ontogenetic development. For the first, a primitive mechanism of survival must be provided; for the second, allowance must be made for the buildup of cognition from nothing except the immediate interactions available. This paper attempts to offer a crude first step to a solution to both problems in the form of a self-organizing, "perceptuo-postural" mechanism.
Prior to the existence of any cognitive model — a preferred perceptual posture — the equilibrium-seeking mechanism, often termed homeostasis, can serve the survival requirements of the organism while offering a stable mode of interaction with the environment. Emergence from the preconscious (Ref. 1) mode toward a repertoire of integrated responses and predictive recognition of objective relations becomes more understandable when in fact the data employed for cognition is descriptive primarily of those relations. Furthermore, the homeostatic mechanisms themselves may be viewed as determined by a set of physical constraints rather than an acquired habit of information selection. In a sense, then, the system behavior may be described as having the qualities of ultimate heedfulness: attention only to the surroundings of the moment, unmodified by prior experience and therefore not yet useful for inductive verifications but always homeostatically purposeful.
The arrangement of Fig. 2a begins to suggest a form in which to envision the triadic relationship of an observer to his world (Ref. 12). There is shown separately the world (object), the relationship he has to it (word spoken), and the model in the head which generated the active expression of the relationship. The systems of Figs. 1 and 1a do not provide such a clear means of separation of these relations.
In the growing field of technology directed toward the development of semi-autonomous, remote, exploratory vehicles, it is anticipated that very general homeostatic systems will be of great use. The most difficult problems to be faced concern the optimum strategies of information gathering and reduction for return to earth via channels that are severely band-limited. Simple, behavioral descriptions of the process of interaction with the remote environment may offer the most meaningful and efficient use of those channels.
REFERENCES
1. Burrow, T., The Preconscious Foundations of Human Experience, Basic Books, 1964.
2. Piaget, J., The Origins of Intelligence in Children, Norton, 1963.
3. The title of this paper is adapted from an unpublished memorandum of Quillian, M. R., Wortman, P. M., and Baylor, G. W.: The Programmable Piaget: Behavior From the Standpoint of Radical Computerist.
4. Gibson, J. J., The Senses Considered as Perceptual Systems, Houghton Mifflin, 1966.
5. Nilsson, N. J., Adaptive Pattern Recognition: A Survey, 1966 Bionics Symposium, Dayton, Ohio.
6. Sebestyen, G. S., Decision-Making Processes in Pattern Recognition, ACM Monograph Series, Macmillan, 1962.
7. Lin, W. C., and Fu, K. S., An Adaptive Pattern Recognition System Using Neuron-Like Elements, 1966 Bionics Symposium, Dayton, Ohio.
8. Hermann, H. I., and Kotelly, J. C., An Approach to Formal Psychiatry, Perspectives in Biology and Medicine, Vol. 10, No. 2, Winter 1967.
9. Barron, R. L., Self-Organizing and Learning Control Systems, 1966 Bionics Symposium, Dayton, Ohio.
10. Barron, R. L., and Schalkowsky, S., On-Line Self-Organizing Control of Multiple-Goal, Multiple-Actuator Systems, 1967 Joint Automatic Control Conference, University of Pennsylvania, June 28-30, 1967.
11. Held, R., Plasticity in Sensory-Motor Systems, Scientific American, 213, 5, November 1965.
12. McCulloch, W. S., Lekton, Part XIV of Communication: Theory and Research, Lee Thayer, Ed., Charles C. Thomas, 1967.