In this article I will survey the types and definitions of agents, eventually focusing on those useful for engineering. Because it is simply silly to discuss software agents without distinguishing them from other known types of software, I will venture to offer a definition. It will be iconoclastic and perhaps applicable only to a certain type of engineering agent. But it will be useful in identifying some technical implementation issues.
The Franklin and Graesser paper is a good paper because it 1) surveys various agents, 2) presents a reasoned taxonomy based on features, and 3) avoids assigning any meaning to the word "intelligent". However, it proposes a "mathematically formal" definition: "An autonomous agent is a system situated within and a part of an environment that senses that environment and acts on it, over time, in pursuit of its own agenda and so as to effect what it senses in the future." This is, of course, not a definition any mathematician would recognize as formal. The idea of "senses in the future" is just too open to interpretation to be an objective, much less formal, definition. Moreover, it equates being an agent with this quality of being "autonomous". [Only autonomous agents were defined - other kinds of agents may exist. - Private communication from Stan Franklin, June 1996.]
For Foner, an agent is necessarily "intelligent" and "autonomy" is just one crucial characteristic. His definition of autonomy comes a bit closer to an operational semantics: "This requires aspects of periodic action, spontaneous execution, and initiative, in that the agent must be able to take preemptive or independent actions that will eventually benefit the user."
There are three major problems with attempts to define "agents" as "intelligent". First, as I have alluded to above, the adjectives "intelligent" and "autonomous" are, so far, subjective labels. The Foner definition suggests that there might be a test for autonomy, but saying that some action is "preemptive" or "independent" does not get us far. This definition of intelligence, like all such definitions, depends upon the opinion of an intelligent observer after interacting with the candidate agent.
Furthermore, the example agent, Julia, does not exhibit much initiative. The fact that Julia maps a maze without direction from the users with whom she interacts does not distinguish Julia from almost any other software that performs a background task while answering queries from users and performing other tasks when directed, such as message forwarding. In fact, Julia never interrupts to volunteer information except to deliver a message as directed: she speaks only when spoken to. Julia's claim to intelligence is much more of the Eliza sort: Julia strikes users as a person. And indeed, the implementation and documentation suggest that Julia is intended to pass a Turing test just above the level of Eliza.
Second, these subjective labels are applicable only to an epiphenomenon rather than to a design objective. Except to pass a Turing test, no one sets out to build an "intelligent agent", as that is a poor target for software. One sets out to build an agent that accomplishes a task, in hopes that the task is so difficult, or is so well accomplished, that the agent might be considered intelligent or somehow self-directed. This begs the question of why the agent is an agent, and not some other kind of software.
Third, various definitions of intelligence exist, but the main deficiency of such a label is that it does not sufficiently distinguish the resulting software from other technologies that may also claim intelligence as an attribute. One can take any definition of intelligent software that covers the work in Artificial Intelligence and find that it does not serve to distinguish "agents" as a kind of software. The point is that if it is claimed that to be an agent is to be intelligent, then we have still begged the question of what is an "agent" apart from all of the other intelligent software that has been developed.
Nevertheless, autonomy seems to be central to agenthood. For instance, Pattie Maes' Autonomous Agents Group clearly has identified a group of good research projects under a common theme. And Reddy, Foner, Franklin, and Graesser all point to autonomy as critical to the notion of an agent. But what is an operational, objective definition of autonomy? Is there even a subjective Turing test for autonomy? Is there autonomy without intelligence?
When the terms "autonomous" or "intelligent" is used it is clear the user means the software to be something more than a mere server, mobile or not. Often, the term is only a reference to a context of a community and technology. With respect to agents, the "intelligent" label often refers to a concern with abstract, domain-independent theories of agent architecture and communication and/or aspects of human characteristics. That "autonomous" is emerging as an important characteristic does not mean that it is yet sufficiently well-defined a term to have a formal technical meaning.
Just to drive this into the ground, one might say that intelligent software that is accessible via the Internet is an agent, but this would then include a continuously running expert system to which one could open a remote display. Where's the novel technology in this? Very simply, the term "intelligence" does not sufficiently specify agenthood. The term "autonomy" gets at something more, something that expert systems are not. But how did I know that? What's the objective, operational definition? And what difference would it make in software design? These are the sorts of questions one needs to answer in designing agents for engineering.
Finally, we note that there are formal definitions of autonomy and agenthood. Wooldridge and Jennings give a comprehensive overview of theories of "strong" agenthood in their paper "Intelligent Agents: Theory and Practice". It is just that there are many theories, and this is why using subjective terms (e.g., "intention" and "belief") makes agenthood debatable. Similarly, as previously noted, the Franklin and Graesser paper uses subjective terms in its formal definition of autonomous agents. This is not to say that such terms should not be used or formal theories not developed: just that "intelligent/autonomous agents" is a term that, for the moment, is not of obvious utility, and competing theories are best left to the research literature. These definitions have nothing to do with the World-Wide Web and are not very helpful for the integration and interaction of engineering agents, as we shall see.
Some examples of these agents include the "BargainFinder Agent" and Cyber Yenta, which perform searches for the user. The first is incredibly simpleminded, and the latter is an incredibly simplified version of the original MIT Yenta work, which is still in progress. The claim to intelligence here is basically string matching. Along the same lines, but somewhat more "AIish", CompassWare offers an Intelligent News Filter that parses natural language to perform a search.
Certainly it is far from clear that any of this web-based software should be described as "intelligent", regardless of the definition of "agent". This point is well made in Foner's paper as well as in articles in the trade press [Griswold 96]. Rather than dwell on the fact that this software is not so clever, I would like to note that there are more useful descriptions of this kind of software than "agent".
BargainFinder, Yenta, and CompassWare are essentially one-time query answering mechanisms, much like the "MetaCrawler Multi-Threaded Web Search Service" (and the very useful Ahoy!). And even though AlphaConnect searches even legacy systems and translates the results into a variety of formats, it is still a search service. It is notable in that updating is automatic, but since this happens according to a user-defined schedule, much like the automatic timer in your house, it is also not very autonomous.
The term "agent" may connote that these software services contact other sources of information and compile it according to the parameters set by the user, but we already have a perfectly good word in computer engineering for such mechanisms: "server". Notice also that these servers also do not move far from the familiar database servers that answer carefully formatted questions. Calling these servers "agents" may be good marketing but obscures the technical understanding of the mechanisms. While I am against proscribing the use of the term "agent" in general, I find it helpful to understand that these are servers in the same sense that a (perhaps distributed) database server is: I send a query and get back a response. No other behavior is implied by "server" and none is exhibited by these "search agents".
Then there is the software claiming to be "intelligent agents" because the software is mobile and can go from machine to machine performing tasks on behalf of the human that spawned the agents. One of the better-known examples of enabling technology for these kinds of agents is General Magic's Telescript. Sun's Java is often also touted as this kind of agent development technology, though its "applets" are even less likely candidates for agenthood than Telescript's remote processes, and a characterization of these as "agents" is highly controversial among writers to the agents email list. However, at least one vendor has used the Java technology to build a competitor to Telescript: CyberAgent (not to mention research efforts such as Bill Li's Java-To-Go framework). Let us agree to call some software applications built upon this technology "mobile agents", but understand that the crucial technical meaning is an infrastructure (e.g., a "Listener" for CyberAgent) that allows processes to run securely on foreign machines. That this functionality has previously existed in other computer engineering mechanisms (e.g., RPCs, telnetting, and distributed computing) does not detract from the utility of these new mechanisms. Let us understand that this is what is meant by "mobile agent", though "mobile process" would be less confusing.
We follow Genesereth's approach but differ somewhat from the definition in that paper in light of our experience with Next-Link agents and comparisons with other KQML-like agents: our Typed-Message Agents are defined in terms of communities of agents. (We may also call these "ACL Agents" after Genesereth.) The community must exchange messages in order to accomplish a task. They must use a shared message protocol, such as KQML, in which some of the message semantics are typed and independent of the application. And the semantics of the message protocol necessitate that the transport protocol be not merely client/server but peer-to-peer. An individual software module is not an agent at all if it can communicate with the other candidate agents using only a client/server protocol without degradation of the collective task performance.
The typed-message requirement also differentiates agents from object-oriented programming technologies that also use message passing to collectively perform tasks. The difference is the commitment to an application-independent protocol of typed messages. This is not to say that, for example, KQML agents could not be implemented using OO techniques. But OO programming does not make any commitment to such a protocol. CORBA does not make such commitments, though it makes others. It is the commitments that define a technology. (Whether these are good or bad commitments is another issue.)
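To make this commitment concrete, the following is a minimal sketch, in Python, of a typed message in the general style of KQML: the performative types are fixed by the protocol and application-independent, while only the content is domain-specific. The class and field names here are illustrative, not the API of any actual KQML implementation.

```python
from dataclasses import dataclass
from typing import Optional

# Application-independent message types (performatives), in the spirit of KQML.
# The protocol fixes the semantics of these types; only the content varies
# with the application.
PERFORMATIVES = {"ask", "tell", "subscribe", "reply", "sorry"}

@dataclass
class TypedMessage:
    performative: str                  # typed, domain-independent semantics
    sender: str
    receiver: str
    content: str                       # domain-specific payload, e.g., in KIF
    reply_with: Optional[str] = None   # label a query so replies can cite it
    in_reply_to: Optional[str] = None  # tie a (possibly volunteered) reply back

    def __post_init__(self):
        if self.performative not in PERFORMATIVES:
            raise ValueError(f"unknown message type: {self.performative}")

# A domain-specific payload inside a domain-independent envelope:
query = TypedMessage("ask", "designer-1", "stress-analyzer",
                     "(max-stress beam-7 ?s)", reply_with="q42")
```

An object message such as a hypothetical `beam.getMaxStress()` call carries no such envelope: its meaning is fixed entirely by the application's class design, which is exactly the commitment that object-oriented messaging does not make.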
Such printer daemons as agents are still a counterexample - they go against the intuition, especially that of autonomy, which continues to lurk in the background. Agents should be more, but more what? Imagine a printer daemon that not only sends you a "Sorry" message but remembers why the request did not work. Suppose the fault lay with the inability to fetch a file on a remote machine that was down. Suppose the next day, the remote machine comes up and the printer daemon sends you a "Reply" or a notification referring to your previously denied request and asking whether you would like that file printed after all. This daemon begins to feel more correctly labeled an agent. What can one point to from a systems standpoint that makes a difference?
This last daemon/agent sent a message that was not a simple one-time response to a request. Instead, it seemed to volunteer information. It initiated a message. If it had been a mere server, it could not have done this. A client/server protocol, admitting one reply to one request, would not have permitted this transaction. The point is that client/server protocols do not allow servers to initiate messages, later volunteering surprising but useful information. The protocol must be peer-to-peer to allow this. Thus a peer-to-peer protocol is a necessary condition for at least typed-message agenthood.
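To pin down what changed from a systems standpoint, here is a minimal sketch, in Python, of the behavior just described. All of the names (the PrinterDaemon class, its send callback, on_host_up) are hypothetical illustrations, not any existing print spooler's interface.

```python
from typing import Callable, List, Tuple

class PrinterDaemon:
    """A daemon that remembers failed requests and later volunteers a follow-up."""

    def __init__(self, send: Callable[[str, str], None]):
        self.send = send   # peer-to-peer: the daemon may initiate messages
        self.pending: List[Tuple[str, str]] = []  # remembered failures: (user, path)

    def handle_print_request(self, user: str, path: str) -> None:
        if not self.fetch(path):
            # The one-time "Sorry" reply: this much a mere server could do.
            self.send(user, f"Sorry, could not fetch {path}")
            self.pending.append((user, path))

    def on_host_up(self, host: str) -> None:
        # A volunteered message, initiated by the daemon rather than prompted
        # by any request. A strict client/server protocol has no slot for it.
        for user, path in list(self.pending):
            if path.startswith(host + ":"):
                self.send(user, f"{path} is reachable again; print it after all?")
                self.pending.remove((user, path))

    def fetch(self, path: str) -> bool:
        return False  # stub: pretend the remote machine holding the file is down
```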
Behind this emphasis on protocol is the intuition that "real agents" save the state of part of the collective problem solving task and contribute to the task by reasoning about changes in that state. This is a very real criterion, relevant to autonomy, but difficult to demonstrate. However, the initiation of multiple messages that are relevant to an earlier query is obvious and can be objectively determined. It may not be a sufficient condition for "autonomy", but we claim, as a thesis, that it is an important necessary condition. Moreover, it is a useful condition, because this protocol criterion also has computational consequences.
As a technical example, anyone who has tried to use a CGI-bin program to communicate with a KQML agent will quickly discover the inadequacy of HTTP for this purpose. Such an example can be found in the Redux' Trip Agent demonstration. Multiple messages from the KQML agent are lost with the standard HTTP protocol of request/reply. The only remedy is to hold open the connection and use the advanced HTTP function of "server push". The addition of "client pull" and "server push" with HTML 3.0 effectively makes HTTP a peer-to-peer protocol and thus useful for agent communications, though this is an awkward "workaround" for a basically client/server protocol.
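For concreteness, the "server push" workaround amounts to holding the connection open and streaming successive message bodies as parts of one long response. A minimal Python sketch of the idea follows, using the multipart/x-mixed-replace content type that Netscape-era browsers understood for server push; it is an illustration of the mechanism, not a production server.

```python
import socket

BOUNDARY = "kqml-part"

def push_messages(conn: socket.socket, messages) -> None:
    """Hold one HTTP connection open and push each message as a new part."""
    conn.sendall(
        b"HTTP/1.0 200 OK\r\n"
        b"Content-Type: multipart/x-mixed-replace; boundary=" +
        BOUNDARY.encode() + b"\r\n\r\n"
    )
    # Under plain request/reply, every message after the first would be lost.
    for msg in messages:
        part = f"--{BOUNDARY}\r\nContent-Type: text/plain\r\n\r\n{msg}\r\n"
        conn.sendall(part.encode())
    conn.sendall(f"--{BOUNDARY}--\r\n".encode())
```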
There may be further requirements on agenthood not covered by this necessity for peer-to-peer protocols. For example, the use of a public explicit ontology that allows term usage to be reasoned about for collaboration makes for stronger agenthood. Perhaps others will develop more counterexamples that will sharpen the intuition about surprise and message exchange. For instance, database servers that have publish-subscribe notions with peer-to-peer message protocols, if implemented with message types such as "Subscribe", could be considered typed-message agents by this definition (see the sketch below), yet database systems are usually considered too simple to be agents. Whether this is because of ignorance of advanced database functionality or whether further sharpening is required, the omission of mere "servers" is sufficient for our purposes. We also note that our criterion is a continuum: an agent is an agent to the degree that it collaborates with other agents using volunteered messages.
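In terms of the TypedMessage sketch above, such a publish-subscribe server would qualify because a single "subscribe" message licenses many later volunteered "tell" messages. The example below is illustrative only.

```python
# One subscription, many later volunteered replies: impossible under a strict
# one-reply-per-request client/server protocol.
sub = TypedMessage("subscribe", "designer-1", "parts-db",
                   "(price bearing-204 ?p)", reply_with="q7")
# ... later, whenever the price changes, the server itself initiates:
update = TypedMessage("tell", "parts-db", "designer-1",
                      "(price bearing-204 12.40)", in_reply_to="q7")
```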
Engineering agents are typically research projects such as the Lockheed COSMOS system and the MACE application agents, the CONCUR examples, the CIFE agent projects (especially the ACL effort), the STRAND finite element analysis system of agents, and the Next-Link agent framework. All of these examples are typed-message agents; in fact, they are KQML agents, though the "flavor" of KQML varies. These agents also make varying degrees of commitment to a content language, such as KIF, and to ontologies. Perhaps most important, they all allow engineers in multiple disciplines to collaborate with other engineers and software services. The engineer is generally aware of these other agents and exchanges messages with them.
Most of these engineering agents are focused on a particular engineering project application, following the PACT example that generated the KQML infrastructure. The MACE and Next-Link applications have similar features to PACT. All of the engineering agents of these examples have one thing in common - none of them use web-based interfaces or agents, unlike the engineering servers. That is, the distinction between servers and agents is not merely academic - there has so far been a real schism between web-based servers and agents in at least the engineering domain.
A major reason for this is the requirement for peer-to-peer protocols as discussed in the previous section. Server push capabilities and Java are new and were not originally part of these projects. But there is another major factor as well.
Consider the successful MADEFAST experiment in collaborative design. Several universities and companies collaboratively designed, from scratch, and built a prototype missile seeker in six months. Several Internet-based collaborative tools were tried, and the WWW was clearly the most effective common technology. But no agents were used, nor have they been since.
Rather than protocol, in this case, the problem is the structure of information. Web pages are structured to the extent that HTML is, but none of the HTML tags correspond to the type of structure required by the engineering typed-message agents. HTML tags describe format. Agents need task-based semantics. Knowing that a word is bold-faced does no good to an agent that needs a task-level computable structure. The extensive web pages that document the MADEFAST design cannot be read by agents, and so agents cannot participate.
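The contrast can be made concrete with a small Python illustration. The semantic tag shown is hypothetical, in the spirit of the semantic HTML extensions discussed below, not an actual HTML tag.

```python
import re

# What the page says, versus what an agent can compute with.
format_markup = "<b>5000</b> N maximum load"             # HTML: a bold-faced number
semantic_markup = '<max-load units="N">5000</max-load>'  # hypothetical task-level tag

# From the format tag an agent learns only typography; from the semantic tag
# it can recover a task-level quantity.
m = re.search(r'<max-load units="(\w+)">([\d.]+)</max-load>', semantic_markup)
if m:
    units, value = m.group(1), float(m.group(2))
    print(f"maximum load = {value} {units}")   # -> maximum load = 5000.0 N
```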
The reverse is also true. Agents typically don't produce web pages. It's not that they could not. But doing so is not part of the commitment that defines the typed-message agent technology and is not needed for computation. Thus the web and these agents tend not to interact, but rather go their separate ways.
Servers use forms and values to obtain structure (and do generate web pages). But, in general, this approach is not very different from the OO programming paradigm. The messages being sent, consisting of named values, are entirely domain-specific. There is no shared common protocol, such as KQML, with some typed semantics that are domain-independent. Thus the web environment, with its fundamentally client/server nature and unstructured data, is not conducive to agents - it might even be called "hostile".
Some non-engineering agents have been connected to the web and do not have the problem of multiple messages from agents. The initiation of messages, for collaborative tasks, will be found in AI-style intelligent agents such as collaborative email filtering. The distinction between the previously discussed web servers and collaborative agents initiating messages shows up strikingly in the recent commercialization of some of the MIT Autonomous Agent research, such as "Firefly", an intelligent music recommendation service, "Webdoggie", a personalized document filtering system, and the Cyber Yenta. Each of these is built upon the notion that agents that help people can help better if they learn from each other. So an agent recommending music for you, The Similarity Engine, for example, will try to find agents that are recommending music for other people like you. Thus, in the background, messages are being exchanged and, presumably, volunteered in an autonomous manner, though this is not clear from the documentation. However, for the user, this appears to be just a server. A query is made and an answer is returned.
The nature of web communications, in some sense, trivializes the agent collaboration process. The web is client/server-based, and the autonomous agents require peer-to-peer communications. Thus agents become an underlying technology for a web server, but it is easy to distinguish the behavior on the basis of message exchange. The result is that there is no agent behavior observable to the user. Users are not aware of other users or of other agents. Thus there is no problem with multiple messages and clients.
The web-based problems remain with engineering agents that do need to connect users in a collaborative task. Java seems to offer a more advanced, flexible approach for web-based agents, especially Rob Frost's Java Agent Template (JAT), which facilitates writing Java agents that send KQML messages. This approach is very promising in that it will allow people to interact with agents through browsers with nice interfaces. (NOTE: JAT was superseded after publication of this article by JATLite.)
However, our group is already working on version 3.0 because we have found fundamental problems, mostly with respect to the need to have open peer-to-peer connections between the agents and browser clients. In particular, the client nature of browsers is reflected in the limitation that an applet can only open a connection to the server that spawned it. In order to send messages to multiple agents elsewhere on the Internet, it is necessary to write an agent router. This agent router must also keep connections open, another deviation from the web connection paradigm of one connection per request, and one that requires substantial changes to the first versions of the JAT.
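A minimal sketch of such an agent router in Python follows. The design (a registry of held-open connections, forwarding by receiver name) reflects the requirement just described; the class and method names are illustrative and are not the actual JAT interface.

```python
import socket
import threading
from typing import Dict

class AgentRouter:
    """Relays messages between browser applets and agents elsewhere on the net.

    Because an applet may only connect back to the server that spawned it,
    that server runs this router, which holds connections open and forwards
    messages onward by receiver name."""

    def __init__(self):
        self.connections: Dict[str, socket.socket] = {}  # name -> held-open socket
        self.lock = threading.Lock()

    def register(self, name: str, conn: socket.socket) -> None:
        # Keeping the socket open deviates from the web's
        # one-connection-per-request paradigm.
        with self.lock:
            self.connections[name] = conn

    def route(self, receiver: str, message: bytes) -> None:
        with self.lock:
            conn = self.connections.get(receiver)
        if conn is not None:
            conn.sendall(message)
        # A fuller router would queue messages for receivers not yet connected.
```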
After the JAT allows multiple client browsers to connect to multiple agents, we will have a first integration of the web and agents at the level of access to agents. We still will have no access by agents to web pages. That is, we will have solved most of the protocol problem and will have some nice demonstrations, but web pages will still be unreadable by agents. Something more is needed.
There are two promising approaches to the structure problem. One is the conversion of web pages to relational databases in the Stanford Infomaster project. (There is related work in the reverse direction [Dossick and Kaiser 96].) Another approach starts at the data authoring. The ABSML approach is to extend HTML with tags that carry semantics. ABSML is currently used in Lockheed's MECE engineering design documentation system. We are working with Lockheed to extend this approach to include tags that can also be meaningful to Next-Link agents. It is not clear that a mark-up language like HTML is sufficient for this purpose, and we may have to convert to a more specialized design language.
The same method of extending HTML is being used to document decision argumentation in Zeno. A similar but more flexible way to add semantics to HTML documents is to add tags that refer to ontologies. This seems the most promising approach for providing agents with access to web pages.
These two fundamental sources of incompatibility must be addressed for each technology to leverage the other. The JAT work seems to offer hope for overcoming the protocol problem. The lack of semantic structure in HTML documents is an even larger problem but may be addressed in the future by advanced authoring tools and by programs that can extract semantics from web documents. Relatively simple examples of both approaches exist today. However, much work remains before useful engineering agents emerge on the web.
Finally, we note that many of the agents being developed for engineering applications are of the "weak" kind in that there is no commitment to powerful reasoning by the individual agents. In fact, "dumb" legacy systems can be accommodated by the typed-message approach, which commits only to an application-independent protocol. The protocol may be derived from a "strong" theory of agents, as advocated by Haddadi [Haddadi 96], or from a theory of design, as with the Next-Link agent protocol. In both cases, the result is that typed-message agent-based systems can add value to engineering systems and even integrate heterogeneous services, though no individual agent might be characterized as "intelligent". In short, weak agents can be powerful as well as well-defined.
[Griswold 96] Griswold, S., "Unleashing Agents: The first wave of products incorporating software agent technology has hit the market. See what's afoot," Internet World, 7:5, May 1996.
[Haddadi 96] Haddadi, A., _Communication and Cooperation in Agent Systems: A Pragmatic Theory_, Springer Verlag, Lecture Notes in Computer Science, No. 1056, 1996.
[Reddy 96] Reddy, R., "To Dream the Possible Dream," 1996 Turing Award Lecture in Communications of the ACM, 39:5, May, 1996.