Introduction
This is a book chapter written by Peter Voss and
published in "Artificial General Intelligence" - Goertzel, Ben; Pennachin, Cassio (Eds).
Written in 2002, this describes the foundation of our project: the
low level, conceptual underpinnings that remain an important
functioning part of our current more advanced research. Note that many
crucial aspects of our current working model of higher-level
intelligence are not explicitly detailed in the book chapter that
follows below, or were developed after the chapter was written.
Peter Voss is
an entrepreneur with a background in electronics, computer systems,
software, and management. His research interest in cognitive science
and the inter-relationship between philosophy, psychology, ethics and
computer science culminated in the development of a breakthrough model
of Artificial General Intelligence. He founded Adaptive A.I.
Inc., with the goal of building and commercializing this highly
adaptive, general-purpose AI engine. He considers himself an Extropian,
and is actively involved in futurism, free-market ideas, and extreme
life-extension.
Essentials of General Intelligence:
The direct path to AGI
1. Introduction
This paper explores the concept of 'artificial general intelligence' (AGI) - its nature, importance, and how best to achieve it. Our theoretical model posits that general intelligence
comprises a limited number of distinct, yet highly integrated,
foundational functional components. Successful implementation of this
model will yield a highly adaptive, general-purpose system that can
autonomously acquire an extremely wide range of specific knowledge and
skills. Moreover, it will be able to improve its own cognitive ability
through self-directed learning. We believe that, given the right
design, current hardware/ software technology is adequate for
engineering practical AGI systems. Our current implementation of a
functional prototype is described below.
The idea of 'general intelligence' is quite controversial; I do not
substantially engage this debate here but rather take the existence of
such non-domain-specific abilities as a given (Gottfredson 1998). It
must also be noted that this essay focuses primarily on low-level (i.e.
roughly animal level) cognitive ability. Higher-level functionality,
while an integral part of our model, is only addressed peripherally.
Finally, certain algorithmic details are omitted for reasons of
proprietary ownership.
2. General Intelligence
Intelligence can be defined simply as an entity's ability to achieve
goals - with greater intelligence coping with more complex and novel
situations. Complexity ranges from the trivial (thermostats and
mollusks, which in most contexts don't even justify the label
'intelligence') to the fantastically complex (autonomous flight
control systems and humans).
Adaptivity, the ability to deal with changing and novel
requirements, also covers a wide spectrum: from rigid, narrowly
domain-specific to highly flexible, general purpose. Furthermore,
flexibility can be defined in terms of scope and permanence - how much,
and how often it changes. Imprinting is an example of
limited scope and high permanence, while innovative, abstract problem
solving is at the other end of the spectrum. While entities with high
adaptivity and flexibility are clearly superior - they can potentially
learn to achieve any possible goal - there is a hefty efficiency price
to be paid. For example, had Deep Blue also been designed to learn
language, direct airline traffic, and do medical diagnosis, it would
not have become chess champion (all other things being equal).
General Intelligence comprises the essential, domain-independent skills necessary for acquiring
a wide range of domain-specific knowledge (data & skills) - i.e.
the ability to learn anything (in principle). More specifically, this
learning ability needs to be autonomous, goal-directed, and highly
adaptive:
· Autonomous - Learning occurs both automatically, through exposure to
sense data (unsupervised), and through bi-directional interaction with
the environment, including exploration/ experimentation
(self-supervised).
· Goal-directed - Learning is directed (autonomously) towards achieving
varying and novel goals and sub-goals - be they 'hard-wired',
externally specified, or self-generated. Goal-directedness also implies
very selective learning and data acquisition (from a massively
data-rich, noisy, complex environment).
· Adaptive - Learning is cumulative, integrative, contextual and adjusts
to changing goals and environments. General adaptivity not only copes
with gradual changes, but also seeds and facilitates the acquisition of
totally novel abilities.
General cognitive ability stands in sharp contrast to inherent
specializations such as speech- or face-recognition, knowledge
databases/ ontologies, expert systems, or search, regression or
optimization algorithms. It allows an entity to acquire a virtually
unlimited range of new specialized abilities. The mark of a generally intelligent system is not having a lot of knowledge and skills, but being able to acquire and improve them - and to be able to appropriately apply
them. Furthermore, knowledge must be acquired and stored in ways
appropriate both to the nature of the data, and to the goals and tasks
at hand.
For example, given the correct set of basic core capabilities, an
AGI system should be able to learn to recognize and categorize a wide
range of novel perceptual patterns that are acquired via different
senses, in many different environments and contexts. Additionally, it
should be able to autonomously learn appropriate, goal-directed
responses to such input contexts (given some feedback mechanism).
We take this concept to be valid not only for high-level human
intelligence, but also for lower-level animal-like ability. The degree of
'generality' (i.e., adaptability) varies along a continuum from
genetically 'hard-coded' responses (no adaptability), to high-level
animal flexibility (significant learning ability as in, say, a dog),
and finally to self-aware human general learning ability.
Core Requirements for General Intelligence
General intelligence, as described above, demands a number of
irreducible features and capabilities. In order to proactively
accumulate knowledge from various (and/ or changing) environments, it
requires:
1. Senses to obtain features from 'the world' (virtual or actual),
2. A coherent means for storing knowledge obtained this way, and
3. Adaptive output/ actuation mechanisms (both static and dynamic).
Such knowledge also needs to be automatically adjusted and updated
on an ongoing basis; new knowledge must be appropriately related to
existing data. Furthermore, perceived entities/ patterns must be stored
in a way that facilitates concept formation and generalization. An
effective way to represent complex feature relationships is through
vector encoding (Churchland 1995).
Any practical application of AGI (and certainly any real-time use) must inherently
be able to process temporal data as patterns in time - not just as
static patterns with a time dimension. Furthermore, AGIs must cope with
data from different sense probes (e.g., visual, auditory, and data),
and deal with data that is noisy, scalar, unreliable, incomplete, or
multi-dimensional (both space/ time dimensional, and having
a large number of simultaneous features). Fuzzy pattern matching
helps deal with pattern variability and noise.
Another essential requirement of general intelligence is to cope
with an overabundance of data. Reality presents massively more features
and detail than is (contextually) relevant, or that can be usefully
processed. This is why the system needs to have some control over what
input data is selected for analysis and learning - both in terms of which
data, and also the degree of detail. Senses ('probes') are needed not
only for selection and focus, but also in order to ground concepts - to
give them (reality-based) meaning.
While input data needs to be severely limited by focus and
selection, it is also extremely important to obtain multiple views of
reality - data from different feature extractors or senses. Provided
that these different input patterns are properly associated, they can
help to provide context for each other, aid recognition, and add
meaning.
In addition to being able to sense via its multiple, adaptive input
groups and probes, the AGI must also be able to act on the world - be
it for exploration, experimentation, communication, or to perform
useful actions. These mechanisms need to provide both static and
dynamic output (states and behavior). They, too, need to be adaptive and
capable of learning.
Underlying all of this functionality is pattern processing. What is
more, not only are sensing and action based on generic patterns, but so
is internal cognitive activity. In fact, even high-level abstract
thought, language, and formal reasoning - abilities outside the scope
of our current project - are 'just' higher-order elaborations of this
(Margolis 1987).
Advantages of Intelligence being General
The advantages of general intelligence are almost too obvious to
merit listing; how many of us would dream of giving up our ability to
adapt and learn new things? In the context of artificial intelligence
this issue takes on a new significance.
There exists an inexhaustible demand for computerized systems that
can assist humans in complex tasks that are highly repetitive,
dangerous, or that require knowledge, senses or abilities that their
users may not possess (e.g., expert knowledge, 'photographic' recall,
overcoming disabilities, etc.). These applications stretch across
almost all domains of human endeavor.
Currently, these needs are filled primarily by systems engineered
specifically for each domain and application (e.g., expert systems).
Problems of cost, lead-time, reliability, and the lack of adaptability
to new and unforeseen situations, severely limit market potential.
Adaptive AGI technology, as described in this paper, promises to
significantly reduce these limitations and to open up these markets. It
specifically implies that -
- Systems can learn (and be taught) a wide spectrum of data and functionality
- They can adapt to changing data, environments and uses/ goals
- This can be achieved without program changes - capabilities are learned, not coded.
More specifically, this technology can potentially:
- Significantly reduce system 'brittleness' through fuzzy pattern matching and adaptive learning -
increasing robustness in the face of changing and unanticipated
conditions or data.
- Learn autonomously, by automatically accumulating knowledge about new environments through exploration.
- Allow systems to be operator-trained to identify new objects and
patterns; to respond to situations in specific ways, and to acquire new
behaviors.
- Eliminate programming in many applications. Systems can be employed in
many different environments, and with different parameters simply
through self-training.
- Facilitate easy deployment in new domains. A general intelligence
engine with pluggable custom input/ output probes allows rapid and
inexpensive implementation of specialized applications.
From a design perspective, AGI offers the advantage that all effort can be focused on achieving the best general
solutions - solving them once, rather than once for each particular
domain. AGI obviously also has huge economic implications: because AGI
systems acquire most of their knowledge and skills (and adapt to
changing requirements) autonomously, programming lead times and costs
can be dramatically reduced, or even eliminated.
The fact that no (artificial!) systems with these capabilities
currently exist seems to imply that it is very hard (or impossible) to
achieve these objectives. However, I believe that, as with other
examples of human discovery and invention, the solution will seem
rather obvious in retrospect. The trick is correctly choosing a few
critical development options.
3. Shortcuts to AGI
When explaining Artificial General Intelligence to the uninitiated,
one often hears the remark that, surely, everyone in AI is working to
achieve general intelligence. This indicates how deeply misunderstood
intelligence is. While it is true that eventually conventional
(domain-specific) research efforts will converge with those of AGI,
without deliberate guidance this is likely to be a long, inefficient
process. High-level intelligence must be adaptive, must be general - yet very little work is being done to specifically identify what general intelligence is, what it requires, and how to achieve it.
In addition to understanding general intelligence, AGI design also requires an appreciation of the differences between artificial (synthetic) and biological intelligence, and between designed and evolved systems.
Our particular approach to achieving AGI capitalizes on extensive
analysis of these issues, and on an incremental development path that
aims to minimize development effort (time and cost), technical
complexity, and overall project risks. In particular, we are focusing
on engineering a series of functional (but low-resolution/ capacity)
proof-of-concept prototypes. Performance issues specifically related to
commercialization are assigned to separate development tracks.
Furthermore, our initial effort concentrates on identifying and
implementing the most general and foundational components first,
leaving high-level cognition such as abstract thought, language, and
formal logic for later development (more on that later). We also focus
more on selective, unsupervised, dynamic, incremental, interactive
learning; on noisy, complex, analog data; and on integrating entity
features and concept attributes in one comprehensive network.
While our project may not be the only one proceeding on this
particular path, it is clear that by far the majority of AI work being
done today follows a substantially different overall approach. Our work
focuses on:
- General rather than domain-specific cognitive ability
- Acquired knowledge and skills, versus loaded databases and coded skills
- Bi-directional, real-time interaction, versus batch processing
- Adaptive attention (focus & selection), versus human pre-selected data
- Core support for dynamic patterns, versus static data
- Unsupervised and self-supervised, versus supervised learning
- Adaptive, self-organizing data structures, versus fixed neural nets or databases
- Contextual, grounded concepts, versus hard-coded, symbolic concepts
- Explicitly engineering functionality, versus evolving it
- Conceptual design, versus reverse-engineering
- General proof-of-concept, versus specific real applications development
- Animal level cognition, versus abstract thought, language, and formal logic.
Let's look at each of these choices in greater detail.
General rather than domain-specific cognitive ability. The advantages listed in the previous section flow from the fact that generally intelligent systems can ultimately learn any specialized knowledge and skills possible - human intelligence is the proof! The reverse is obviously not true.
A complete, well-designed AGI's ability to acquire domain-specific
capabilities is limited only by processing and storage capacity. What
is more, much of its learning will be autonomous - without teachers,
and certainly without explicit programming. This approach implements
(and capitalizes on) the essence of 'Seed AI' - systems with a limited,
but carefully chosen set of basic, initial capabilities that allow them
(in a 'bootstrapping' process) to dramatically increase their knowledge
and skills through self-directed learning and adaptation. By
concentrating on carefully designing the seed of intelligence, and then
nursing it to maturity, one essentially bootstraps intelligence. In our
AGI design this self-improvement takes two distinct forms/ phases:
- Coding the basic skills that allow the system to acquire a large amount of specific knowledge.
- The system reaching sufficient intelligence and conceptual
understanding of its own design, to enable it to deliberately improve
its own design.
Acquired knowledge and skills, versus loaded databases and coded skills. One crucial measure of general intelligence is its ability to acquire
knowledge and skills, not how much it possesses. Many AI efforts
concentrate on accumulating huge databases of knowledge and coding
massive amounts of specific skills. If AGI is possible - and evidence
presented here and elsewhere seems overwhelming - then much of this
effort will be wasted. Not only will an AGI be able to acquire these
additional smarts (largely) by itself, but moreover, it will also be
able to keep its knowledge up-to-date, and to improve it. Not only will
this save initial data collection and preparation as well as
programming, it will also dramatically reduce maintenance.
An important feature of our design is that there are no traditional
databases containing knowledge, nor programs encoding learned skills:
All acquired knowledge is integrated into an adaptive central
knowledge/ skills network. Patterns representing knowledge are
associated in a manner that facilitates conceptualization and
sensitivity to context. Naturally, such a design is potentially far
less prone to brittleness, and more resiliently fault-tolerant.
Bi-directional, real-time interaction, versus batch processing. Adaptive
learning systems must be able to interact bi-directionally with the
environment - virtual or real. They must both sense data and act/ react
on an ongoing basis. Many AI systems do all of their learning in batch
mode and have little or no ability to learn incrementally. Such
systems cannot easily adjust to changing environments or requirements -
in many cases they are unable to adapt beyond the initial training set
without reprogramming or retraining.
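The contrast between batch and incremental learning can be illustrated with a deliberately tiny sketch. It is not part of the design described in this chapter: the class name `RunningMean` and its methods are hypothetical, and a per-feature mean stands in for whatever the system actually learns. The point is only that each new observation adjusts the stored estimate immediately, without retaining or re-reading a training set.

```python
class RunningMean:
    """Illustrative incremental learner: tracks a per-feature mean.

    Hypothetical example, not the chapter's algorithm. Each observation
    updates the estimate in place, so no training set is kept around.
    """

    def __init__(self, n_features):
        self.count = 0
        self.mean = [0.0] * n_features

    def update(self, observation):
        # Incremental rule: mean += (x - mean) / n
        self.count += 1
        for i, x in enumerate(observation):
            self.mean[i] += (x - self.mean[i]) / self.count
        return self.mean


learner = RunningMean(n_features=2)
for obs in ([1.0, 10.0], [2.0, 20.0], [3.0, 30.0]):
    learner.update(obs)  # the estimate is usable after every single step
```

A batch learner would instead need all three observations up front, and would have to be rerun from scratch whenever the environment changed.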
In addition to real-time perception and learning, intelligent
systems must also be able to act. Three distinct areas of action
capability are required:
- Acting on the 'world' - be it to communicate, to navigate or explore, or to
manipulate some external function or device in order to achieve goals.
- Controlling or modifying the system's internal parameters (such as learning rate or
noise tolerance, etc.) in order to set or improve functionality.
- Controlling the system's sense input parameters such as focus,
selection, resolution (granularity) as well as adjusting feature
extraction parameters.
Adaptive attention (focus & selection), versus human pre-selected data.
As mentioned earlier, reality presents far more sense data, detail,
and complexity than is required for any given task - or than
can be processed. Traditionally, this problem has been dealt with by
carefully selecting and formatting data before feeding it to the
system. While this human assistance can improve performance in specific
applications, it is often not realized that this additional
intelligence resides in the human, not the software.
Outside guidance and training can obviously speed learning; however, AGI systems must inherently
be designed to acquire knowledge by themselves. In particular, they
need to control what input data is processed - where specifically to
obtain data, in how much detail, and in what format. Absent this
capability the system will either be overwhelmed by irrelevant data or,
conversely, be unable to obtain crucial information, or get it in the
required format. Naturally, such data focus and selection mechanisms
must themselves be adaptive.
Core support for dynamic patterns, versus static data.
Temporal pattern processing is another fundamental requirement of
interactive intelligence. At least three aspects of AGI rely on it: perception needs to learn/ recognize dynamic entities and sequences, action usually comprises complex behavior, and cognition
(internal processing) is inherently temporal. In spite of this obvious
need for intrinsic support for dynamic patterns, many AI systems only
process static data; temporal sequences, if supported at all, are often
converted ('flattened') externally to eliminate the time dimension.
Real-time temporal pattern processing is technically quite challenging,
so it is not surprising that most designs try to avoid it.
Unsupervised and self-supervised, versus supervised learning.
Auto-adaptive systems such as AGIs require comprehensive capabilities
to learn without supervision. Such teacher-independent knowledge and
skill acquisition falls into two broad categories: unsupervised
(data-driven, bottom-up), and self-supervised (goal-driven, top-down).
Ideally these two modes of learning should seamlessly integrate with
each other - and of course, also with other, supervised methods.
Here, as in other design choices, general adaptive systems are
harder to design and tune than more specialized, unchanging ones. We
see this particularly clearly in the overwhelming focus on
back-propagation in artificial neural network (ANN) development. Relatively
little research aims at better understanding and improving incremental,
autonomous learning. Our own design places heavy emphasis on these
aspects.
Adaptive, self-organizing data structures, versus fixed neural nets or databases.
Another core requirement imposed by data/ goal-driven, real-time
learning is having a flexible, self-organizing data structure. On the
one hand, knowledge representation must be highly integrated, while on
the other hand it must be able to adapt to changing data densities (and
other properties), and to varying goals or solutions. Our AGI encodes all
acquired knowledge and skills in one integrated network-like structure.
This central repository features a flexible, dynamically
self-organizing topology. The vast majority of other AI designs rely
either on loosely-coupled data objects or agents, or on fixed network
topologies and pre-defined ontologies, data hierarchies or database
layouts. This often severely limits their self-learning ability,
adaptivity and robustness, or creates massive communication bottlenecks
or other performance overhead.
Contextual, grounded concepts, versus hard-coded, symbolic concepts. Concepts are probably the most important design aspect of AGI; in fact, one can say that 'high-level intelligence is
conceptual intelligence'. Core characteristics of concepts include
their ability to represent ultra-high-dimensional fuzzy sets that are
grounded in reality, yet fluid with regard to context. In other words,
they encode related sets of complex, coherent, multi-dimensional
patterns that represent features of entities. Concepts obtain their
grounding (and thus their meaning) by virtue of patterns emanating from
features sensed directly from entities that exist in reality. Because
concepts are defined by value ranges within each feature
dimension (sometimes in complex relationships), some kind of fuzzy
pattern matching is essential. In addition, the scope of concepts must be fluid; they must be sensitive and adaptive to both environmental and goal contexts.
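One way to picture a concept as a set of value ranges with graded, context-sensitive membership is sketched below. This is an illustrative assumption rather than the representation used in our system: `FeatureRange`, `concept_fit`, the linear fall-off, the use of the minimum as the combining rule, and the `context_scale` parameter are all hypothetical choices made for the example.

```python
from dataclasses import dataclass


@dataclass
class FeatureRange:
    """Hypothetical value range for one feature dimension of a concept."""
    low: float
    high: float
    tolerance: float  # width of the soft edge outside [low, high]

    def fit(self, value: float, context_scale: float = 1.0) -> float:
        """1.0 inside the range, falling linearly to 0.0 across the soft edge."""
        if self.low <= value <= self.high:
            return 1.0
        distance = self.low - value if value < self.low else value - self.high
        edge = self.tolerance * context_scale
        if edge <= 0:
            return 0.0
        return max(0.0, 1.0 - distance / edge)


def concept_fit(concept: dict, pattern: dict, context_scale: float = 1.0) -> float:
    """Degree of fit of a pattern to a concept: the weakest per-feature fit."""
    return min(r.fit(pattern[name], context_scale) for name, r in concept.items())


# A toy concept over two feature dimensions (names and numbers are made up).
cup = {"height_cm": FeatureRange(8.0, 12.0, 4.0),
       "width_cm": FeatureRange(6.0, 10.0, 2.0)}

typical = concept_fit(cup, {"height_cm": 10.0, "width_cm": 8.0})
borderline = concept_fit(cup, {"height_cm": 14.0, "width_cm": 8.0})
lenient = concept_fit(cup, {"height_cm": 14.0, "width_cm": 8.0}, context_scale=2.0)
```

Here a typical pattern fits fully, a borderline one fits partially, and widening `context_scale` (standing in for a more permissive goal or environmental context) raises the fit of the same borderline pattern.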
Autonomous concept formation is one of the key tests of
intelligence. The many AI systems based on hard-coded or human-defined
concepts fail this fundamental test. Furthermore, systems that do not
derive their concepts via interactive perception are unable to ground
their knowledge in reality, and thus lack crucial meaning. Finally,
concept structures whose activation cannot be modulated by context and
degree of fit are unable to capture the subtlety and fluidity of
intelligent generalization. In combination, these limitations will
cripple any aspiring AGI.
Explicitly engineering (and learning) functionality, versus evolving it.
Design by evolution is extremely inefficient - whether in nature or in
computer science. Moreover, evolutionary solutions are generally
opaque; optimized only for some specified 'cost function', not for
comprehensibility, modularity, or maintainability. Furthermore,
evolutionary learning also requires more data or trials than are
available in everyday problem solving.
Genetic and evolutionary programming do have their uses - they are powerful tools
that can be used to solve very specific problems, such as optimization
of large sets of variables; however, they generally are not appropriate
for creating large systems of infrastructures. Artificially evolving
general intelligence directly seems particularly problematic
because there is no known function measuring such capability along a
single continuum - and absent such direction, evolution doesn't know
what to optimize. One approach to deal with this problem is to try to
coax intelligence out of a complex ecology of competing agents -
essentially replaying natural evolution.
Overall, it seems that genetic programming techniques are
appropriate when one runs out of specific engineering ideas. Here is a
short summary of advantages of explicitly engineered functionality:
- Designs can directly capitalize on and encode the designer's knowledge and insights.
- Designs have comprehensible design documentation.
- Designs can be far more modular - less need for multiple
functionality and high inter-dependency of sub-systems than found in
evolved systems.
- Systems can have a more flow-chart like, logical design - evolution has no foresight.
- They can be designed with debugging aids - evolution didn't need that.
- These features combine to make systems easier to understand, debug,
interface, and - importantly - for multiple teams to simultaneously
work on the design.
Conceptual design, versus reverse-engineering. In addition to
avoiding the shortcomings of evolutionary techniques, there are also
numerous advantages to designing and engineering intelligent systems
based on functional requirements rather than trying to copy
evolution's design of the brain. As aviation has amply demonstrated, it
is much easier to build planes than it is to reverse-engineer birds -
much easier to achieve flight via thrust than flapping wings.
Similarly, in creating artificial intelligence it makes sense to
capitalize on our human intellectual and engineering strengths - to
ignore design parameters unique to biological systems, instead of
struggling to copy nature's designs. Designs explicitly engineered to
achieve desired functionality are much easier to understand, debug,
modify, and enhance. Furthermore, using known and existing technology
allows us to best leverage existing resources. So why limit ourselves
to the single solution to intelligence created by a blind, unconscious
Watchmaker with his own agenda (survival in an evolutionary environment
very different from that of today)?
Intelligent machines designed from scratch carry neither the
evolutionary baggage, nor the additional complexity for epigenesis,
reproduction, and integrated self-repair of biological brains.
Obviously this doesn't imply that we can learn nothing from studying
brains, just that we don't have to limit ourselves to biological
feasibility in our designs. Our (currently) only working example of
high-level general intelligence (the brain) provides a crucial conceptual model of cognition, and can clearly inspire numerous specific design features.
Here are some desirable cognitive features that can be included in
an AGI design that would not (and in some cases, could not) exist in a
reverse-engineered brain:
- More effective control of neurochemistry ('emotional states')
- Selecting the appropriate degree of logical thinking versus intuition
- More effective control over focus and attention
- Being able to learn instantly, on demand
- Direct and rapid interfacing with databases, the Internet, and other
machines - potentially having instant access to all available knowledge
- Optional 'photographic' memory and recall ('playback') on all senses!
- Better control over remembering and forgetting (freezing important knowledge, and being able to unlearn)
- The ability to accurately backtrack and review thought and decision processes (retrace and explore logic pathways)
- Patterns, nodes and links can easily be tagged (labeled) and categorized
- The ability to optimize the design for the available hardware instead of being forced to conform to the brain's requirements
- The ability to utilize the best existing algorithms and software
techniques - irrespective of whether they are biologically plausible
- Custom designed AGI (unlike brains) can have a simple speed/ capacity upgrade path
- The possibility of comprehensive integration with other AI systems
(like expert systems, robotics, specialized sense pre-processors, and
problem solvers)
- The ability to construct AGIs that are highly optimized for specific domains
- Node, link, and internal parameter data is available as 'input data' (full introspection)
- Design specifications are available (to the designer and to the AGI itself!)
- Seed AI design: A machine can inherently be designed to more easily
understand and improve its own functioning - thus bootstrapping
intelligence to ever higher levels.
General proof-of-concept, versus specific real applications development.
Applying given resources to minimalist proof-of-concept designs
improves the likelihood of cutting a swift, direct path towards an
ultimate goal. Having identified high-level artificial general
intelligence as our goal, it makes little sense to squander resources
on inessentials. In addition to focusing our efforts on the ability to acquire
knowledge autonomously, rather than capturing or coding it, we further
aim to speed progress towards full AGI by reducing cost and complexity
through -
· Concentrating on proof-of-concept prototypes, not commercial performance. This
includes working at low data resolution and volume, and putting aside
optimization. Scalability is addressed only at a theoretical level, and
not necessarily implemented.
· Working with radically-reduced sense and motor capabilities. The fact
that deaf, blind, and severely paralyzed people can attain high
intelligence (Helen Keller, Stephen Hawking) indicates that these are
not essential to developing AGI.
· Coping with complexity through a willingness to experiment and
implement poorly understood algorithms - i.e. using an engineering
approach. Using self-tuning feedback loops to minimize free parameters.
· Not being sidetracked by attempting to match the performance of domain-specific designs - focusing more on how
capabilities are achieved (e.g. learned conceptualization, instead of
programmed or manually specified concepts) rather than raw performance.
· Developing and testing in virtual environments, not physical
implementations. Most aspects of AGI can be fully evaluated without the
overhead (time, money, and complexity) of robotics.
Animal level cognition, versus abstract thought, language, and formal logic.
There is ample evidence that achieving high-level cognition requires only modest structural
improvements over animal capability. Discoveries in cognitive
psychology point towards generalized pattern processing being the
foundational mechanism for all higher level functioning. In addition,
the relatively small differences between higher animals and humans
are evidenced by studies of genetics, the evolutionary timetable,
and developmental psychology.
The core challenge of AGI is achieving the robust, adaptive
conceptual learning ability of higher primates or young children. If
human-level intelligence is the goal, then pursuing robotics, language,
or formal logic (at this stage) is a costly sideshow - whether
motivated by misunderstanding the problem, or by commercial or
'political' considerations.
Summary. While our project leans heavily on research done in
many specialized disciplines, it is one of the few efforts dedicated to
integrating such interdisciplinary knowledge with the specific goal of
developing general artificial intelligence. We firmly believe that many of the issues
raised above are crucial to the early achievement of truly intelligent
adaptive learning systems.
4. Foundational Cognitive Capabilities
General intelligence requires a number of foundational cognitive abilities. At a first approximation, it must be able to -
-
Remember and recognize patterns representing coherent features of reality
-
Relate such patterns by various similarities, differences, and associations
-
Learn and perform a variety of actions
-
Evaluate and encode feedback from a goal system
-
Autonomously adjust its system control parameters.
As mentioned earlier, this functionality must handle a very wide
variety of data types and characteristics (including temporal), and
must operate interactively, in real-time. The expanded description
below is based on our particular implementation; however, the features
listed would generally be required (in some form) in any implementation of artificial general intelligence.
Pattern learning, matching, completion, and recall. The
primary method of pattern acquisition consists of a proprietary
adaptation of lazy learning (Aha 1997, Yip 1997). Our implementation
stores feature patterns (static and dynamic) with adaptive fuzzy
tolerances that subsequently determine how similar patterns are
processed. Our recognition algorithm matches patterns on a competitive
winner-take-all basis, as a set or aggregate of similar patterns, or by
forced choice. It also offers inherent support for pattern completion,
and recall (where appropriate).
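The three recognition modes and pattern completion can be illustrated with a minimal Python sketch. It is not the proprietary algorithm referred to above: the Gaussian-style similarity function, the fixed per-feature tolerances, the 0.5 threshold, and all names (StoredPattern, match_winner, and so on) are assumptions made purely for illustration.

```python
import math
from dataclasses import dataclass

@dataclass
class StoredPattern:
    label: str
    features: list      # stored feature vector
    tolerances: list    # per-feature fuzzy tolerance (hypothetical encoding)

def similarity(pattern, sample):
    """Fuzzy match score between 0 and 1; features given as None are ignored."""
    d2 = sum(((s - f) / t) ** 2
             for f, t, s in zip(pattern.features, pattern.tolerances, sample)
             if s is not None)
    return math.exp(-0.5 * d2)

def match_winner(store, sample, threshold=0.5):
    """Winner-take-all: the single best pattern, or None if nothing is close enough."""
    best = max(store, key=lambda p: similarity(p, sample))
    return best if similarity(best, sample) >= threshold else None

def match_set(store, sample, threshold=0.5):
    """Aggregate match: every stored pattern that is similar enough."""
    return [p for p in store if similarity(p, sample) >= threshold]

def match_forced(store, sample):
    """Forced choice: always return the closest pattern, however poor the fit."""
    return max(store, key=lambda p: similarity(p, sample))

def complete(store, partial):
    """Pattern completion: fill unknown (None) features from the closest pattern."""
    best = match_forced(store, partial)
    return [f if s is None else s for f, s in zip(best.features, partial)]

store = [
    StoredPattern("a", [0.0, 0.0], [1.0, 1.0]),
    StoredPattern("b", [5.0, 5.0], [1.0, 1.0]),
]
print(match_winner(store, [0.2, 0.1]).label)   # close to pattern "a"
print(match_winner(store, [2.5, 2.5]))         # too far from both patterns
print(match_forced(store, [3.0, 3.0]).label)   # closest pattern regardless of fit
print(complete(store, [4.8, None]))            # second feature filled from "b"
```

Because every stored pattern carries its own tolerances, the same distance can count as a match for one pattern and a miss for another, which is the role the adaptive fuzzy tolerances play in the description above.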
Data accumulation and forgetting. Because our system learns
patterns incrementally, mechanisms are needed for consolidating and
pruning excess data. Sensed patterns (or sub-patterns) that fall within
a dynamically set noise/ error tolerance of existing ones are
automatically consolidated by a Hebbian-like mechanism that we call
'nudging'. This algorithm also accumulates certain statistical
information. On the other hand, patterns that turn out not to be
important (as judged by various criteria) are deleted.
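A rough sketch of consolidation and forgetting follows. The actual 'nudging' algorithm is not specified in this chapter, so the running-average update, the fixed tolerance, and the hit-count and age criteria used for pruning are all assumptions; PatternStore and its methods are hypothetical names.

```python
import math

class PatternStore:
    """Illustrative incremental store with consolidation and pruning."""

    def __init__(self, tolerance=1.0, rate=0.1):
        self.tolerance = tolerance   # fixed here; described above as dynamically set
        self.rate = rate             # how far a stored pattern moves per observation
        self.patterns = []           # each entry: {"vec": [...], "hits": int, "age": int}

    def observe(self, sample):
        for p in self.patterns:
            p["age"] += 1
        nearest = min(self.patterns,
                      key=lambda p: math.dist(p["vec"], sample),
                      default=None)
        if nearest is not None and math.dist(nearest["vec"], sample) <= self.tolerance:
            # Consolidate: move the stored pattern a little toward the new sample
            # and update its usage statistics.
            nearest["vec"] = [v + self.rate * (s - v)
                              for v, s in zip(nearest["vec"], sample)]
            nearest["hits"] += 1
            return nearest
        new = {"vec": list(sample), "hits": 1, "age": 0}
        self.patterns.append(new)
        return new

    def prune(self, min_hits=2, min_age=3):
        """Forget patterns that are old enough to judge but were rarely matched."""
        self.patterns = [p for p in self.patterns
                         if p["hits"] >= min_hits or p["age"] < min_age]

memory = PatternStore(tolerance=1.0, rate=0.5)
memory.observe([0.0, 0.0])      # stored as a new pattern
memory.observe([0.4, 0.0])      # within tolerance: consolidated into the first
memory.observe([10.0, 10.0])    # far away: stored as a second pattern
for _ in range(3):
    memory.observe([0.1, 0.0])  # keeps reinforcing the first pattern
memory.prune()
print(len(memory.patterns))
```

In this toy run the isolated pattern is never matched again and is dropped by prune(), while the frequently matched one survives and drifts toward the samples it absorbs.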
Categorization and clustering. Vector-coded feature patterns
are acquired in real-time and stored in a highly adaptive network
structure. This central self-organizing repository automatically
clusters data in hyper-dimensional vector-space. Our matching
algorithm's ability to recall patterns by any dimension provides
inherent support for flexible, dynamic categorization. Additional
categorization mechanisms facilitate grouping patterns by additional
parameters, associations, or functions.
Pattern hierarchies and associations. Patterns of perceptual
features do not stand in isolation - they are derived from coherent
external reality. Encoding relationships between patterns serves the
crucial functions of added meaning, context, and anticipation. Our
system captures low-level, perception-driven pattern associations such
as: sequential or coincidental in time, nearby in space, related by
feature group or sense modality. Additional relationships are encoded
at higher levels of the network, including actuation layers. This
overall structure somewhat resembles the 'dual network' described by
Goertzel (1993).
Pattern priming and activation spreading. The core function of association links is to prime related nodes. This helps to disambiguate pattern matching,
and to select contextual alternatives. In the case where activation is
particularly strong and perceptual activity is low, stored patterns
will be 'recognized' spontaneously. Both the scope and decay rate of
such activation spreading are controlled adaptively. These dynamics
combine with the primary, perception-driven activation to form the
system's short-term memory.
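As a minimal illustration of priming, the sketch below spreads activation along weighted association links. The fixed decay factor, hop limit, and activation floor stand in for the adaptively controlled scope and decay rate described above; the function name, the link format, and the example nodes are assumptions.

```python
def spread_activation(links, activation, decay=0.5, steps=2, floor=0.01):
    """Spread activation along association links.

    links:      dict mapping node -> list of (neighbor, weight) pairs
    activation: dict mapping node -> initial (perception-driven) activation
    decay:      attenuation applied at every hop
    steps:      scope of spreading, in hops
    floor:      contributions below this level are dropped
    """
    current = dict(activation)
    frontier = dict(activation)
    for _ in range(steps):
        nxt = {}
        for node, level in frontier.items():
            for neighbor, weight in links.get(node, []):
                boost = level * weight * decay
                if boost >= floor:
                    nxt[neighbor] = nxt.get(neighbor, 0.0) + boost
        for node, boost in nxt.items():
            current[node] = current.get(node, 0.0) + boost
        frontier = nxt
    return current

links = {
    "river": [("bank_river", 0.8)],
    "money": [("bank_money", 0.8)],
    "bank_river": [("water", 0.5)],
}
primed = spread_activation(links, {"river": 1.0})
print(primed)
```

With "river" active, the "bank_river" node is primed and "bank_money" is not, so an ambiguous input that matches both stored patterns equally well could be resolved in favor of the primed one.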
Action patterns. Adaptive action circuits are used to control parameters in the following three domains:
-
Senses, including adjustable feature extractors, focus and selection mechanisms
-
Output actuators for navigation and manipulation
-
Meta-cognition and internal controls.
Different action states and behaviors (action sequences) for each
of these control outputs can be created at design time (using a
configuration script) or acquired interactively. Real-time learning
occurs either by means of explicit teaching, or autonomously through
random exploration. Once acquired, these actions can be tied to
specific perceptual stimuli or whole contexts through various
stimulus-response mechanisms. These S-R links (both activation and
inhibition) are dynamically modified through ongoing reinforcement
learning.
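The interplay of random exploration, stimulus-response links, and reinforcement can be sketched as follows. This is a generic textbook-style update, not the mechanism used in our engine; the class name, the learning and exploration rates, and the reward values are assumptions chosen for illustration.

```python
import random

class StimulusResponse:
    """Illustrative S-R table: positive weights activate, negative weights inhibit."""

    def __init__(self, actions, rate=0.2, explore=0.1, seed=None):
        self.actions = list(actions)
        self.rate = rate          # learning rate for reinforcement updates
        self.explore = explore    # probability of trying a random action
        self.weights = {}         # (stimulus, action) -> link strength
        self.rng = random.Random(seed)

    def choose(self, stimulus):
        if self.rng.random() < self.explore:
            return self.rng.choice(self.actions)          # random exploration
        return max(self.actions,
                   key=lambda a: self.weights.get((stimulus, a), 0.0))

    def reinforce(self, stimulus, action, reward):
        """Move the link strength a fraction of the way toward the observed reward."""
        key = (stimulus, action)
        old = self.weights.get(key, 0.0)
        self.weights[key] = old + self.rate * (reward - old)

agent = StimulusResponse(["left", "right"], explore=0.0, seed=0)
for _ in range(10):
    agent.reinforce("wall_ahead", "left", -1.0)    # punished: link becomes inhibitory
    agent.reinforce("wall_ahead", "right", 1.0)    # rewarded: link becomes excitatory
print(agent.choose("wall_ahead"))
```

Explicit teaching fits the same structure: a trainer simply supplies the reward (or the correct action) instead of leaving the system to discover it through exploration.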
Meta-cognitive control. In addition to adaptive perception
and action functionality, an AGI design must also allow for extensive
monitoring and control of overall system parameters and functions. Any
complex interactive learning system contains numerous crucial control
parameters such as noise tolerance, learning and exploration rates,
priorities and goal management, and a myriad others. Not only must the
system be able to adaptively control these many interactive vectors, it
must also appropriately manage its various cognitive functions (such as
recognition, recall, action, etc.). Our design deals with these
requirements by means of a highly adaptive introspection/ control
'probe'.
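One very simple form such adaptive control could take is a feedback rule that adjusts a single parameter from an observed statistic. The sketch below tunes noise tolerance from the fraction of recent inputs that were stored as new patterns; the rule itself, the target rate, the gain, and the limits are assumptions for illustration and do not describe the actual introspection/ control probe.

```python
def tune_tolerance(tolerance, novelty_rate, target=0.2, gain=0.5,
                   lower=0.01, upper=10.0):
    """Proportional feedback on noise tolerance (hypothetical rule).

    novelty_rate: fraction of recent inputs treated as new patterns.
    If too many inputs look novel, the tolerance widens so that more of them
    are consolidated; if too few do, it narrows to preserve detail.
    """
    adjusted = tolerance * (1.0 + gain * (novelty_rate - target))
    return max(lower, min(upper, adjusted))

print(tune_tolerance(1.0, 0.6))   # too much novelty: tolerance grows
print(tune_tolerance(1.0, 0.0))   # too little novelty: tolerance shrinks
```

A loop of this kind removes one free parameter from the design at the cost of introducing a target value, which is usually easier to set and can itself be placed under higher-level control.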
High-level intelligence. Our AGI model posits that no additional foundational
functions are necessary for higher-level cognition. Abstract thought,
language, and logical thinking are all elaborations of core abilities.
This controversial point is discussed further below.
5. An AGI in the making
The functional prototype currently under development at Adaptive
A.I. Inc. aims to embody all the above-mentioned choices, requirements,
and features. Our development path is as follows:
-
Development framework
-
Memory core and interface structure
-
Individual foundational cognitive components
-
Integrated low-level cognition
-
Increasing level of functionality.
The software comprises an AGI engine framework with the following basic components:
-
A set of pluggable, programmable (virtual) sensors and actuators (called 'probes')
-
A central pattern store/ engine including all data and cognitive algorithms
-
A configurable, dynamic 2D virtual world, plus various training and diagnostic tools.
The AGI engine design is based on, and embodies insights from, a wide
range of research in cognitive science - including computer science,
neuroscience, epistemology (Rand 1990, Kelley 1986), and psychology
(Margolis 1987). Particularly strong influences include: embodied
systems (Brooks 1994), vector encoded representation (Churchland 1995),
adaptive self-organizing neural nets (esp. Growing Neural Gas, Fritzke
1995), unsupervised and self-supervised learning, perceptual learning
(Goldstone 1998), and fuzzy logic (Kosko 1997).
While our design includes several novel and proprietary
algorithms, our key innovation is the particular selection and
integration of established technologies and prior insights.
AGI Engine Architecture & Design Features
Our AGI engine (which provides this foundational cognitive ability)
can logically be divided into three parts (See figure above.):
-
Cognitive core
-
Control/ interface logic
-
Input/ output probes
This 'situated agent architecture' reflects the importance of having
an AGI system that can dynamically and adaptively interact with the
environment. From a theory-of-mind perspective it acknowledges both the
crucial need for concept grounding (via senses), plus the absolute need
for experiential, self-supervised learning.
The components listed below have been specifically designed with
features required for adaptive general intelligence in (ultimately)
real environments. Among other things, they deal with a great variety
and volume of static and dynamic data, cope with fuzzy and uncertain
data and goals, foster coherent integrated representations of reality,
and - most of all - promote adaptivity.
Cognitive Core: This is the central repository of all static
and dynamic data patterns - including all learned cognitive and
behavioral states and sequences. All data is stored in a single,
integrated node-link structure. The design innovates the specific
encoding of pattern 'fuzziness' (in addition to other attributes). The
core allows for several node/ link types with differing dynamics to
help define the network's cognitive structure.
The network's topology is dynamically self-organizing - a feature
inspired by 'Growing Neural Gas' design (Fritzke 1995). This allows
network density to adjust to actual data feature and/ or goal
requirements. Various adaptive local and global parameters further
define network structure and dynamics in real time.
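To show what a dynamically self-organizing topology involves, here is a compact sketch that loosely follows the published Growing Neural Gas procedure (Fritzke 1995): the winner and its neighbors move toward each input, edges age and are pruned, and a new node is periodically inserted where accumulated error is highest. It is a simplification for illustration, not the cognitive core itself; the class name and all parameter values are arbitrary assumptions.

```python
import math
import random

class GrowingNet:
    """Simplified Growing-Neural-Gas-style network (illustrative only)."""

    def __init__(self, a, b, eps_winner=0.2, eps_neighbor=0.01,
                 max_age=50, insert_every=50, alpha=0.5, decay=0.995):
        self.nodes = {0: list(a), 1: list(b)}   # node id -> position vector
        self.error = {0: 0.0, 1: 0.0}           # node id -> accumulated error
        self.edges = {}                          # frozenset({i, j}) -> age
        self.eps_winner, self.eps_neighbor = eps_winner, eps_neighbor
        self.max_age, self.insert_every = max_age, insert_every
        self.alpha, self.decay = alpha, decay
        self.next_id, self.steps = 2, 0

    def neighbors(self, i):
        return [j for edge in self.edges if i in edge for j in edge if j != i]

    def _move(self, i, x, eps):
        self.nodes[i] = [v + eps * (xv - v) for v, xv in zip(self.nodes[i], x)]

    def adapt(self, x):
        self.steps += 1
        ranked = sorted(self.nodes, key=lambda i: math.dist(self.nodes[i], x))
        s1, s2 = ranked[0], ranked[1]            # winner and runner-up
        self.error[s1] += math.dist(self.nodes[s1], x) ** 2
        self._move(s1, x, self.eps_winner)
        for n in self.neighbors(s1):
            self._move(n, x, self.eps_neighbor)
        for edge in self.edges:                  # age the winner's edges
            if s1 in edge:
                self.edges[edge] += 1
        self.edges[frozenset((s1, s2))] = 0      # create or refresh this edge
        self.edges = {e: age for e, age in self.edges.items()
                      if age <= self.max_age}
        linked = {i for e in self.edges for i in e}
        for i in [i for i in self.nodes if i not in linked]:
            del self.nodes[i], self.error[i]     # drop isolated nodes
        if self.steps % self.insert_every == 0:
            self._insert()
        for i in self.error:
            self.error[i] *= self.decay

    def _insert(self):
        """Add a node halfway between the worst node and its worst neighbor."""
        q = max(self.error, key=self.error.get)
        f = max(self.neighbors(q), key=self.error.get)
        r, self.next_id = self.next_id, self.next_id + 1
        self.nodes[r] = [(u + v) / 2 for u, v in zip(self.nodes[q], self.nodes[f])]
        del self.edges[frozenset((q, f))]
        self.edges[frozenset((q, r))] = 0
        self.edges[frozenset((r, f))] = 0
        self.error[q] *= self.alpha
        self.error[f] *= self.alpha
        self.error[r] = self.error[q]

rng = random.Random(0)
net = GrowingNet([0.0, 0.0], [1.0, 1.0], max_age=1000)
for _ in range(200):
    corner = rng.choice([0.0, 5.0])              # two clusters of input data
    net.adapt([corner + rng.random(), corner + rng.random()])
print(len(net.nodes))   # 2 initial nodes plus one insertion every 50 steps
```

Node density ends up following the data: regions that keep producing large matching errors attract new nodes, while, with a smaller max_age, connections and nodes that stop being useful would age out.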
Control and Interface Logic: An overall control system
coordinates the network's execution cycle, drives various cognitive and
housekeeping algorithms, and controls/ adapts system parameters. Via an
Interface Manager, it also communicates data and control information to
and from the probes.
Probes: The Interface Manager provides for dynamic addition
and configuration of probes. Key design features of the probe
architecture include the ability to have programmable feature
extractors, variable data resolution, and focus & selection
mechanisms. Such mechanisms for data selection are imperative for
general intelligence: even moderately complex environments have a
richness of data that far exceeds any system's ability to usefully
process.
The system handles a very wide variety of data types and control
signal requirements - including those for visual, sound, and raw data
(e.g., database, internet, keyboard), as well as various output
actuators. A novel 'system probe' provides the system with monitoring
and control of its internal states (a form of meta-cognition).
Additional probes - either custom interfaces with other systems or
additional real-world sensors/ actuators - can easily be added to the
system.
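The probe concept can be sketched as a small plug-in interface. The engine itself is implemented in C#; Python is used here only for brevity, and the class names, the sense() method, and the block-averaging feature extractor are hypothetical illustrations rather than the actual probe API.

```python
from abc import ABC, abstractmethod

class Probe(ABC):
    """Common interface for pluggable sensors (hypothetical)."""

    @abstractmethod
    def sense(self):
        """Return the current feature vector."""

class GridVisionProbe(Probe):
    """Visual probe over a 2D grid with adjustable focus and resolution."""

    def __init__(self, world, focus=(0, 0), window=4, block=1):
        self.world = world      # 2D list of intensities
        self.focus = focus      # top-left corner of the attended region
        self.window = window    # side length of the attended region
        self.block = block      # cells averaged per feature (larger = coarser);
                                # assumed to divide window, region assumed in bounds

    def sense(self):
        r0, c0 = self.focus
        features = []
        for r in range(r0, r0 + self.window, self.block):
            for c in range(c0, c0 + self.window, self.block):
                cells = [self.world[rr][cc]
                         for rr in range(r, r + self.block)
                         for cc in range(c, c + self.block)]
                features.append(sum(cells) / len(cells))
        return features

class InterfaceManager:
    """Registers probes at run time and polls them each cycle."""

    def __init__(self):
        self.probes = {}

    def add_probe(self, name, probe):
        self.probes[name] = probe

    def remove_probe(self, name):
        self.probes.pop(name, None)

    def poll(self):
        return {name: probe.sense() for name, probe in self.probes.items()}

world = [[r * 4 + c for c in range(4)] for r in range(4)]   # cell values 0..15
manager = InterfaceManager()
manager.add_probe("overview", GridVisionProbe(world, window=4, block=2))
manager.add_probe("detail", GridVisionProbe(world, focus=(1, 1), window=2, block=1))
print(manager.poll())
```

The two probes read the same world at different resolutions and focus points, which is the kind of data selection the text argues is imperative once the environment holds more detail than can usefully be processed.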
Development Environment/ Language/ Hardware. The complete AGI
engine plus associated support programs are implemented in (Object
Oriented) C# under Microsoft's .NET framework. The system is designed
for optional remoting of various components, thus allowing for some
distributed processing. Current tests show that practical
(proof-of-concept) prototype performance can be achieved on a single,
conventional PC (2 GHz, 512 MB). Even a non-performance-tuned
implementation can process several complex patterns per second on a
database of well over a million stored features.
6. From Algorithms to General Intelligence
This section covers some of our near-term research and development;
it aims to illustrate our expected path toward meaningful general
intelligence. While this work barely approaches higher-level animal
cognition (exceeding it in some aspects, but falling far short in
others such as sensory-motor skills), we take it to be a crucial step
in proving the validity and practicality of our model. Furthermore, the
actual functionality achieved should be highly competitive, if not
unique, in applications where significant autonomous adaptivity and
data selection, lack of brittleness, dynamic pattern processing,
flexible actuation, and self-supervised learning are central
requirements.
General intelligence doesn't comprise one single, brilliant
knock-out invention or design feature; instead, it emerges from the
synergetic integration of a number of essential fundamental components.
On the structural side, the system must integrate sense inputs, memory,
and actuators, while on the functional side various learning,
recognition, recall and action capabilities must operate seamlessly on
a wide range of static and dynamic patterns. In addition, these
cognitive abilities must be conceptual and contextual - they must be
able to generalize knowledge, and interpret it against different
backgrounds.
A key milestone in our project is testing the integrated
functionality of the basic cognitive components within our overall AGI
framework. A number of custom-developed, highly-configurable test
utilities are used to test the cohesive functioning of the whole
system. This automated training and evaluation is supplemented by
manual experimentation in numerous different environments and
applications. Experience gained by these tests helps to refine the
complex dynamics of interacting algorithms and parameters.
One of the general difficulties with AGI development is to determine
absolute measures of success. Part of the reason is that this field is
still nascent, and thus no agreed definitions, let alone tests or
measures of low-level general intelligence exist. As we proceed with
our project we expect to develop ever more effective protocols and
metrics for assessing cognitive ability. Our system's performance
evaluation is guided by this description: 'General intelligence
comprises the ability to acquire (and adapt) the knowledge and skills
required for achieving a wide range of goals in a variety of domains.'
-
In this context, 'acquisition' includes all of the following:
automatic, via sense inputs (feature/ data driven); explicitly taught;
discovered through exploration or experimentation; internal processes
(e.g., association, categorization, statistics, etc.).
-
'Adaptation' implies that new knowledge is integrated appropriately.
-
'Knowledge and skills' refer to all kinds of data and abilities
(states and behaviors) that the system acquires for the short or long
term.
Our initial protocol for evaluating AGIs aims to cover a wide
spectrum of domains and goals by simulating sample applications in 2D
virtual worlds. In particular, these tests should assess the degree to
which the foundational abilities operate as an integrated, mutually
supportive whole - and without programmer intervention! Here are three
examples:
Sample Test Domains for Initial Performance Criteria
Adaptive Security Monitor. This system scans video monitors
and alarm panels that oversee a secure area (say, factory, office
building, etc.), and responds appropriately to abnormal conditions.
Note, this is somewhat similar to a site monitoring application at MIT
(Grimson 1998).
This simulation calls for a visual environment that contains a lot
of detail but has only limited dynamic activity - this is its normal
state (green). Two levels of abnormality exist: (i) minor, or known
disturbance (yellow); (ii) major, or unknown disturbance (red).
The system must initially learn the normal state by simple exposure
(automatically scanning the environment) at different resolutions
(detail). It must also learn 'yellow' conditions by being shown a
number of samples (some at high resolution). All other states must
output 'red'.
Standard operation is to continuously scan the environment at low
resolution. If any abnormal condition is detected the system must learn
to change to higher resolution in order to discriminate between
'yellow' and 'red'.
The system must adapt to changes in the environment (and totally different environments) by simple exposure training.
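The target behavior of this test can be summarized in a few lines of Python. Note that the sketch hard-codes the escalation from low to high resolution, whereas the test requires the system to learn it; the scan callable, the distance-based comparison, and the tolerance value are assumptions for illustration.

```python
import math

def classify(scan, normal_low, known_high, tolerance=1.0):
    """Return 'green', 'yellow', or 'red' for the current scene.

    scan:        callable taking "low" or "high" and returning a feature vector
                 (a hypothetical stand-in for a visual probe)
    normal_low:  low-resolution samples of the normal state
    known_high:  high-resolution samples of known ('yellow') disturbances
    """
    low = scan("low")
    if any(math.dist(low, n) <= tolerance for n in normal_low):
        return "green"
    high = scan("high")   # escalate only when the coarse view looks abnormal
    if any(math.dist(high, k) <= tolerance for k in known_high):
        return "yellow"
    return "red"          # anything unrecognized is treated as a major disturbance

def make_scan(low, high):
    """Build a fake scene that returns fixed readings at each resolution."""
    return lambda resolution: low if resolution == "low" else high

normal_low = [[0.0, 0.0]]
known_high = [[1.0, 2.0, 3.0, 4.0]]
print(classify(make_scan([0.1, 0.0], [9.0, 9.0, 9.0, 9.0]), normal_low, known_high))
print(classify(make_scan([5.0, 5.0], [1.0, 2.1, 3.0, 4.0]), normal_low, known_high))
print(classify(make_scan([5.0, 5.0], [9.0, 9.0, 9.0, 9.0]), normal_low, known_high))
```

Adapting to a changed or entirely different environment then amounts to replacing the stored samples through further exposure, with no change to the decision logic.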
Sight Assistant. The system controls a movable 'eye' (by
voice command) that enables the identification (by voice output) of at
least a hundred different objects in the world. A trainer will
dynamically teach the system new names, associations, and eye movement
commands.
The visual probe can select among different scenes (simulating
rooms) and focus on different parts of each scene. The scenes depict
objects of varying attributes: color, size, shape, various dynamics,
etc. (and combinations of these), against different backgrounds.
Initial training will be to attach simple sound commands to maneuver
the 'eye', and to associate word labels with selected objects. The
system must then reliably execute voice commands and respond with
appropriate identification (if any). Additional functionality could be
to have the system scan the various scenes when idle, and to
automatically report selected important objects.
Object identification must cover a wide spectrum of different
attribute combinations and tolerances. The system must easily learn new
scenes, objects, words and associations, and also adapt to changes in
any of these variables.
Maze Explorer. A (virtual) entity explores a moderately
complex environment. It discovers what types of objects aid or hinder
its objectives, while learning to navigate this dynamic world. It can
also be trained to perform certain behaviors.
The virtual world is filled with a great number of different objects
(see previous example). In addition, some of these objects move in
space at varying speeds and dynamics, and may be solid and/ or
immovable. Groups of different kinds of objects have pre-assigned
attributes that indicate negative or positive. The AGI engine controls
the direction and speed of an entity in this virtual world. Its goal is
to learn to navigate around immovable and negative objects to reliably
reach hidden positives.
The system can also be trained to respond to operator commands to
perform behaviors of varying degrees of complexity (for example,
actions similar to 'tricks' one might teach a dog). This 'Maze
Explorer' can easily be set up to deal with fairly complex tasks.
Towards Increased Intelligence
Clearly, the tasks described above do not by themselves represent
any kind of breakthrough in artificial intelligence research. They have
been achieved many times before. However, what we do believe to
be significant and unique is the achievement of these various tasks
without any task-specific programming or parameterization. It is not what is being done, but how it is done.
Development beyond these basic proof-of-concept tests will advance
in two directions: 1) to significantly increase resolution, data
volume, and complexity in applications similar to the tests; 2) to add
higher-level functionality. In addition to work aimed at further
developing and proving our general intelligence model, there are also
numerous practical enhancements that can be done. These would include
implementing multi-processor and network versions, and integrating our
system with databases or with other existing AI technology such as
expert systems, voice recognition, robotics, or sense modules with
specialized feature extractors.
By far the most important of these future developments concern
higher-level ability. Here is a partial list of action items, all of
which are derived from lower-level foundations:
-
Spread activation and retain context over extended period
-
Support more complex internal temporal patterns, both for
enhanced recognition and anticipation, and for cognitive and action
sequences
-
Internal activation feedback for processing without input
-
Deduction, achieved through selective concept activation
-
Advanced categorization by arbitrary dimensions
-
Learning of more complex behavior
-
Abstract and merged concept formation
-
Structured language acquisition
-
Increased awareness and control of internal states (introspection)
-
Learning logic and other problem-solving methodologies.
7. Other Research [5]
Many different approaches to AI exist; some of the differences are
straightforward while others are subtle and hinge on difficult
philosophical issues. As such the exact placement of our work relative
to that of others is difficult and, indeed, open to debate. Our view
that 'intelligence is a property of an entity that engages in two-way
interaction with an external environment' technically puts us in the
area of 'agent systems' (Russell 1995). However, our emphasis on a
connectionist rather than classical approach to cognitive modeling
places our work in the field of 'embodied cognitive science'. (See
Pfeifer and Scheier 1999 for a comprehensive overview.)
While our approach is similar to other research in embodied cognitive science, in some respects our goals
are substantively different. A key difference is our belief that a core
set of cognitive abilities working together is sufficient to produce
general intelligence. This is in marked contrast to others in embodied
cognitive science who consider intelligence to be necessarily specific
to a set of problems within a given environment. In other words, they
believe that autonomous agents always exist in ecological niches. As
such they focus their research on building very limited systems that
effectively deal with only a small number of problems within a specific
limited environment. Almost all work in the area follows this approach -- see
Braitenberg (1984), Brooks (1994) or Arbib (1992) for just a few
well-known examples. Their stance contradicts the fact that humans possess
general intelligence; we are able to effectively deal with a wide range
of problems that are significantly beyond anything that could be called
our 'ecological niche'.
Perhaps the closest project to ours that is strictly in the area of
embodied cognitive science is the Cog project at MIT (Brooks 1993). The
project aims to understand the dynamics of human interaction by the
construction of a human-like robot complete with upper torso, a head,
eyes, arms and hands. While this project is significantly more
ambitious than other projects in terms of the level and complexity of
the system's dynamics and abilities, the system is still essentially
niche focused (elementary human social and physical interaction) when
compared to our own efforts at general intelligence.
Probably the closest work to ours in the sense that it also aims to
achieve general rather than niche intelligence is the Novamente project
under the direction of Ben Goertzel. (The project was formerly known as
Webmind -- see Goertzel 1997, 2001.) Novamente relies on a hybrid of
low-level neural net-like dynamics for activation spreading and concept
priming, coupled with high-level semantic constructs to represent a
variety of logical, causal and spatial-temporal relations. While the
semantics of the system's internal state are relatively easy to
understand compared to a strictly connectionist approach, the classical
elements in the system's design open the door to many of the
fundamental problems that have plagued classical AI over the last fifty
years. For example, high-level semantics require a complex meta-logic
contained in hard-coded high-level reasoning and other high-level
cognitive systems. These high-level systems contain significant
implicit semantics that may not be grounded in environmental
interaction but are rather hard-coded by the designer - thus causing
symbol grounding problems (Harnad 1990). The relatively fixed,
high-level methods of knowledge representation and manipulation that
this approach entails are also prone to 'frame of reference' (McCarthy
and Hayes 1969; Pylyshyn 1987) and 'brittleness' problems. In a
strictly embodied cognitive science approach, as we have taken, all
knowledge is derived from agent-environment interaction thus avoiding
these long-standing problems of classical AI.
Andy Clark (1997) is another researcher
whose model closely resembles our own, but there are no implementations
specifically based on his theoretical work. Igor Aleksander's
(now dormant) MAGNUS project (1996) also incorporated many key AGI
concepts that we have identified, but it was severely limited by a
classical AI, finite-state machine approach. Valeriy Nenov and Michael
Dyer of UCLA (1994) used 'massively' parallel hardware (a
CM-2 Connection Machine) to implement a virtual, interactive perceptual
design close to our own, but with a more rigid, pre-programmed
structure. Unfortunately, this ambitious, ground-breaking work has
since been abandoned. The project was probably severely hampered by
limited (at the time) hardware.
Moving further away from embodied cognitive science to purely
classical research in general intelligence, perhaps the best known
system is the Cyc project being pursued by Lenat (1990). Essentially
Lenat sees general intelligence as being 'common sense'. He hopes to
achieve this goal by adding many millions of facts about the world into
a huge database. After many years of work and millions of dollars in
funding there is still a long way to go as the sheer number of facts
that humans know about the world is truly staggering. We doubt that a
very large database of basic facts is enough to give a computer much
general intelligence - the mechanisms for autonomous knowledge
acquisition are missing. Being a classical approach to AI this also
suffers from the fundamental problems of classical AI listed above. For
example the symbol grounding problem again: if facts about cats and
dogs are just added to a database that the computer can use even though
it has never seen or interacted with an animal, are those concepts
really meaningful to the system? While his project also claims to
pursue 'general intelligence', it is really very different from our
own, both in its approach and in the difficulties it faces.
Analysis of AI's ongoing failure to overcome its long-standing
limitations reveals that it is not so much that Artificial General
Intelligence has been tried and that it has failed, but rather that the
field has largely been abandoned - be it for theoretical, historic, or
commercial reasons. Certainly, our particular type of approach, as
detailed in previous sections, is receiving scant attention.
8. Fast-track AGI - Why so Rare?
Widespread application of AI has been hampered by a number of core
limitations that have plagued the field since the beginning, namely:
-
The expense and delay of custom programming individual applications
-
Systems' inability to automatically learn from experience, or to be user teachable/ trainable
-
Reliability and performance issues caused by 'brittleness' (the
inability of systems to automatically adapt to changing requirements,
or data outside of a predefined range)
-
Their limited intelligence and common sense.
The most direct path to solving these long-standing problems is to
conceptually identify the fundamental characteristics common to all
high-level intelligence, and to engineer systems with this basic
functionality, in a manner that capitalizes on human and technological
strength.
General intelligence is the key to achieving robust autonomous
systems that can learn and adapt to a wide range of uses. It is also
the cornerstone of self-improving, or Seed AI - using basic abilities
to bootstrap higher-level ones. This essay identified foundational
components of general intelligence, as well as crucial considerations
particular to the effective development of the artificial variety. It
highlighted the fact that very few researchers are actually following
this most direct route to AGI.
If the approach outlined above is so promising, then why has it
received so little attention? Why is hardly anyone actually working on
it?
A short answer: Of all the people working in the field called 'AI',
-
80% don't believe in the concept of General Intelligence (but instead, in a large collection of specific skills and knowledge)
-
Of those that do, 80% don't believe that artificial, human-level
intelligence is possible - either ever, or for a long, long time
-
Of those that do, 80% work on domain-specific AI projects for
commercial or academic-political reasons (results are more immediate)
-
Of those left, 80% have a poor conceptual framework...
Even though the above is a caricature, it contains more than a grain of truth.
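Taken literally, each of the four steps leaves only 20% of the previous group, so the fraction of the field that remains can be worked out directly. The percentages are the caricature's own round figures, not measured data.

```latex
\[
  0.2 \times 0.2 \times 0.2 \times 0.2 \;=\; 0.2^{4} \;=\; 0.0016 \;=\; 0.16\%
\]
```

That is roughly one researcher in 625.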
A great number of researchers reject the validity or importance of
'general intelligence'. For many, controversies in psychology (such as
those stoked by The Bell Curve) make this an unpopular, if not
taboo subject. Others, conditioned by decades of domain-specific work,
simply do not see the benefits of Seed AI - solving the problems only
once.
Of those that do not in principle object to general intelligence,
many don't believe that AGI is possible - in their lifetime, or ever.
Some hold this position because they themselves tried and failed 'in
their youth'. Others believe that AGI is not the best approach
to achieving 'AI', or are at a total loss on how to go about it. Very
few researchers have actually studied the problem from our (the general
intelligence/ Seed AI) perspective. Some are actually trying to
reverse-engineer the brain - one function at a time. There are also
those who have moral objections, or who are afraid of it.
Of course, a great many are so focused on particular, narrow aspects
of intelligence that they simply don't get around to looking at the big
picture - they leave it to others to make it happen. It is also
important to note that there are often strong financial and
institutional pressures to pursue specialized AI.
All of the above combine to create a dynamic where Real AI is not
'fashionable' - getting little respect, funding, and support - further
reducing the number of people drawn into it!
These should be more than enough reasons to account for the dearth
of AGI progress. But it gets worse. Researchers actually trying to
build AGI systems are further hampered by a myriad of misconceptions,
poor choices, and lack of resources (funding and research). Many of the
technical issues were explored previously (see sections 3 and 7), but
a few others are worth mentioning:
Epistemology. Models of AGI can only be as good as their
underlying theory of knowledge - the nature of knowledge, and how it
relates to reality. The realization that high-level intelligence is
based on conceptual representation of reality underpins design
decisions such as adaptive, fuzzy vector encoding, and an interactive,
embodied approach. Other consequences are the need for sense-based
focus and selection, and contextual activation. The central importance
of a highly-integrated pattern network - especially including dynamic
ones - becomes obvious on understanding the relationship between
entities, attributes, concepts, actions, and thoughts. These and
several other insights lay the foundation for solving problems related
to grounding, brittleness, and common sense. Finally, there is still a
lot of unnecessary confusion about the relationship between concepts
and symbols. A dynamic that continues to handicap AI is the lingering
schism between traditionalists and connectionists. This unfortunately
helps to perpetuate a false dichotomy between explicit symbols/ schema,
and incomprehensible patterns.
Theory of Mind. Another area of concern is sloppy formulation
and poor understanding of several key concepts: consciousness,
intelligence, volition, meaning, emotions, common sense, and 'qualia'.
The fact that hundreds of AI researchers attend conferences every year
where key speakers proclaim that 'we don't understand consciousness (or
qualia, or whatever), and will probably never
understand it' indicates just how pervasive this problem is. Marvin
Minsky's characterization of consciousness as a 'suitcase word' is correct. Let's just unpack it!
Errors like these are often behind research going off at a tangent
relative to stated long-term goals. Two examples are an undue emphasis
on biological feasibility, and the belief that embodied intelligence
cannot be virtual, that it has to be implemented in physical robots.
Cognitive psychology. It goes without saying that a proper
understanding of the concept 'intelligence' is key to engineering it.
In addition to epistemology, several areas of cognitive psychology are
crucial to unraveling its meaning. Misunderstanding intelligence has
led to some costly disappointments, such as manually accumulating huge
amounts of largely useless data (knowledge without meaning), efforts to
achieve intelligence by combining masses of dumb agents, or trying to
obtain meaningful conversation from an isolated network of symbols.
Project focus. The few projects that do pursue AGI
based on relatively sound models run yet another risk: they can easily
lose focus. Some have their direction hijacked by commercial
considerations, while others get sidetracked by (relatively) irrelevant
technical issues, such as trying to match an unrealistically high level
of performance, fixating on biological feasibility of design, or
attempting to implement high-level functions before their time. A
clearly mapped-out developmental path to human-level intelligence can
serve as a powerful antidote to losing sight of 'the big picture'. A
vision of how to get from 'here' to 'there' also helps to maintain
motivation in such a difficult endeavor.
Research support. AGI utilizes, or more precisely is an
integration of, a large number of existing AI technologies.
Unfortunately, many of the most crucial areas are
under-researched. They include:
- Incremental, real-time, unsupervised/self-supervised learning (vs. back-propagation)
- Integrated support for temporal patterns
- Dynamically-adaptive neural network topologies
- Self-tuning of system parameters, integrating bottom-up (data driven) and top-down (goal/meta-cognition driven) auto-adaptation
- Sense probes with auto-adaptive feature extractors.
Naturally, these very limitations feed back to reduce support for AGI research.
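To make two of the items listed above more concrete, the following is a minimal sketch, not the project's implementation, of incremental unsupervised learning combined with a dynamically growing topology. It is an ordinary online prototype (clustering) learner: each sample is processed once, as it arrives, and a new node is added whenever nothing already stored matches closely enough. The class name `OnlinePrototypeLearner` and the parameters `novelty_threshold` and `learning_rate` are hypothetical, chosen purely for illustration; temporal patterns and self-tuning of parameters are not covered.

```python
import math


class OnlinePrototypeLearner:
    """Toy incremental, unsupervised learner whose set of nodes grows on demand.

    Illustrative only: the names and parameters are assumptions, not taken
    from the chapter.
    """

    def __init__(self, novelty_threshold=1.0, learning_rate=0.2):
        self.novelty_threshold = novelty_threshold  # max distance that counts as a match
        self.learning_rate = learning_rate          # how far a matching node moves per sample
        self.prototypes = []                        # one stored vector per node

    def observe(self, sample):
        """Process one sample as it arrives and return the index of its node."""
        sample = [float(x) for x in sample]
        if self.prototypes:
            distances = [math.dist(p, sample) for p in self.prototypes]
            best = min(range(len(distances)), key=distances.__getitem__)
            if distances[best] <= self.novelty_threshold:
                # Familiar input: nudge the winning node toward the sample.
                winner = self.prototypes[best]
                self.prototypes[best] = [
                    w + self.learning_rate * (s - w) for w, s in zip(winner, sample)
                ]
                return best
        # Novel input: grow the topology by adding a new node.
        self.prototypes.append(sample)
        return len(self.prototypes) - 1


learner = OnlinePrototypeLearner(novelty_threshold=1.0, learning_rate=0.2)
stream = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.2, 4.9), (0.0, 0.1)]
labels = [learner.observe(sample) for sample in stream]
print(labels, len(learner.prototypes))
```

No labels, separate training phase, or fixed network size are involved here, which is the contrast with back-propagation drawn in the list. A practical system would presumably also need to prune or merge nodes and to adapt the threshold itself.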
Cost and difficulty. Achieving high-level AGI will be hard.
However, it will not be nearly as difficult as most experts think. A
key element of 'Real AI' theory (and its implementation) is to
concentrate on the essentials of intelligence. Seed AI becomes a
manageable problem - in some respects much simpler than other
mainstream AI goals - by eliminating huge areas of difficult, but
inessential AI complexity. Once we get the crucial fundamental
functionality working, much of the additional 'intelligence' (ability)
required is taught or learned, not programmed. Having said this, I do
believe that very substantial resources will be required to scale up
the system to human-level storage and processing capacity. However, the
far more moderate initial prototypes will serve as proof-of-concept for
AGI while potentially seeding a large number of practical new
applications.
9. Conclusion
Understanding general intelligence and identifying its essential
components are key to building next-generation AI systems - systems
that are far less expensive, yet significantly more capable. In
addition to concentrating on general
learning abilities, a fast-track approach should also seek a path of
least resistance - one that capitalizes on human engineering strengths
and available technology. Sometimes, this involves selecting the AI
road less traveled.
We believe that the theoretical model, cognitive components, and
framework described above, together with our other strategic design
decisions, provide a solid basis for achieving practical AGI
capabilities in the foreseeable future. Successful implementation will
significantly address many traditional problems of AI. Potential
benefits include:
- Minimizing initial environment-specific programming (through self-adaptive configuration)
- Substantially reducing ongoing software changes, because a large amount of additional functionality and knowledge will be acquired autonomously via self-supervised learning
- Greatly increasing the scope of applications, as users teach and train additional capabilities
- Improving flexibility and robustness through systems' ability to adapt to changing data patterns, environments and goals.
AGI promises to make an important contribution toward realizing
software and robotic systems that are more usable, intelligent, and
human-friendly. The time seems ripe for a major initiative down this
new path of human advancement that is now open to us.