Nabil R. Adam
Rutgers University, CIMIC
180 University Avenue,
Newark, NJ 07102
adam@adam.rutgers.edu, (201)648-5239
-
Shamim Naqvi
Bellcore, 445 South St., Morristown, NJ 07960, shamim@bellcore.com
We are interested in creating multimedia information objects that contain within themselves their own interfaces. Such objects are thus capable of making decisions as to how to render themselves in different situations.
Why is this problem interesting? First, in many client-server network information systems such as the World Wide Web (WWW), many clients with differing capabilities connect to servers to acquire information objects. Indeed, the emphasis is on moving intelligence to the periphery of the network. The server in this case decides what object to serve to the client (in some cases the client may, as a part of the communication protocol, specify some of its capabilities to the server). In practice, today, most servers serve the same object to all clients, assuming that the client is smart or powerful enough to make use of the object. Thus, the fact that a client does not have audio capabilities may not prevent a server from sending an audio stream to that client.
Second, in the digital information world of today, there is a multiplicity of clients. Users may use TV sets, radios, PCs, PDAs, laptops, and cellular phones for accessing information. One of the goals of the National Information Infrastructure (NII) project is to allow universal access to distributed stores of information (Digital Libraries) and to provide this access at a reasonable cost to every citizen [3]. Widespread use of Digital Libraries is predicated on the universal availability of access for a large number of users [1].
The question in providing universal access is, What information appliance will the populace at large use to access information? There is no agreement on a particular kind of information appliance in the marketplace today. Neither is there agreement or consensus on what kind of information will be contained in Digital Libraries [2]. A partial list of possible objects includes all kinds of printed content (books, scientific journals, magazines, user manuals, product literature, etc.), entertainment objects such as videos and audio, legal information, safety regulations, government information, and so on. These objects may contain multimedia information, e.g., text, audio, images, and video.
We believe that there will be no agreement on an information appliance that will be used by the populace to access objects in a digital library because different appliances fulfill different user needs. No single appliance will possess all the modalities that one needs for effective and enjoyable information processing. For example, the experience of reading or listening to a newspaper is quite different from the experience of interacting with digital news in a hypermedia format on a PC monitor. Mobile users need a different appliance from fixed-base users, etc. Newer homes are being constructed with Tiny Area Networks (TAN) that will connect many household appliances. The TV set, the PC, the telephone, the Personal Digital Assistant (PDA), wireless devices, all will be internetworked throughout the house. It will be possible to freely move digital objects from one appliance to another. A user may first want to find a multimedia object through a workstation and then decide to play it on a TV set in a more relaxed setting.
It then follows that Digital Libraries will contain many different kinds of information objects and will have to contend with many different information appliances. In such a heterogeneous world of objects and clients, the issue of object interoperability in different client environments assumes paramount importance. We propose the construction of information objects that carry within themselves their own interface decisions. When confronted with a client possessing certain capabilities, the object decides how to render itself in that client's environment. This self-rendition is referred to as a manifestation, and it is important to bear in mind that a particular manifestation may exclude certain parts of the content of the object. For example, a particular manifestation may not employ the video components of the overall information object, restricting itself to audio and text data types only (this could be useful for a PDA client over a low-bandwidth connection).
We feel that computing objects that carry their interfaces within themselves show promise of solving interoperability in an increasingly heterogeneous computing world. Ultimately, we see the information networking future as consisting of information objects that not only control their own renderings but also control more and more of their interactions with other computing objects, i.e., autonomous computing agents.
Figure 1: Conceptual Descriptions of Manifestations
Figure 2: Java Approach of Resolving Manifestations
For users with a powerful information appliance, the manifestation of the news object may encompass all the different modalities of the object (the full manifestation in Figure 1). For terminal users, the news object can be rendered only in text (the terminal manifestation in Figure 1). For users with audio devices, the news object can be rendered only in audio (the audio manifestation in Figure 1). For users with printing devices, we may have a manifestation that emphasizes more artistic page layouts of text and image data (the printer manifestation in Figure 1). We may also have special manifestations for handicapped users; for example, closed-captioned news for hearing-handicapped users and audio only for visually handicapped users.
We may also have manifestation constraints based on user preferences. For example, a user may want to suppress the text portion of the digital news object and concentrate more on image and audio components (reduced text manifestation).
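The manifestations described above can be read as a filter over the modalities of the object. The following is a minimal sketch of that reading, not a definitive implementation: the class `ManifestationChooser`, the enum `Modality`, and the rule that a manifestation is the object's content restricted to what the client can render, minus what the user suppresses, are all assumptions introduced here for illustration.

```java
import java.util.EnumSet;
import java.util.Set;

public class ManifestationChooser {
    /** Media types a multimedia object may carry (text, audio, images, video). */
    public enum Modality { TEXT, AUDIO, IMAGE, VIDEO }

    /**
     * Hypothetical selection rule: keep every modality that the object
     * contains, that the client can render, and that the user has not
     * asked to suppress.
     */
    public static Set<Modality> choose(Set<Modality> objectContent,
                                       Set<Modality> clientCapabilities,
                                       Set<Modality> userSuppressed) {
        EnumSet<Modality> result = EnumSet.noneOf(Modality.class);
        result.addAll(objectContent);         // start from what the object offers
        result.retainAll(clientCapabilities); // drop what the client cannot render
        result.removeAll(userSuppressed);     // drop what the user does not want
        return result;
    }

    public static void main(String[] args) {
        Set<Modality> news = EnumSet.allOf(Modality.class);
        Set<Modality> nothingSuppressed = EnumSet.noneOf(Modality.class);

        // Terminal manifestation: the client renders text only.
        System.out.println(choose(news, EnumSet.of(Modality.TEXT), nothingSuppressed));

        // Reduced text manifestation: a capable client, but the user suppresses text.
        System.out.println(choose(news, EnumSet.allOf(Modality.class),
                                  EnumSet.of(Modality.TEXT)));
    }
}
```

A richer rule could also weigh bandwidth or layout quality, as the printer manifestation suggests; the set-based filter is only the simplest case.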
There are three basic approaches for addressing the manifestation problem. We discuss each approach below.
An advantage of this approach is that only useful information is sent to clients. A limitation of this approach, however, is that if a client does not have the corresponding player, the object cannot be rendered. For example, a user may have an SVGA card but no MPEG software, and would thus be unable to view MPEG-encoded objects. This approach puts most of the processing load onto the server and requires establishing the necessary conversion functions in advance. This makes manifestations static rather than dynamic.
The approach we propose is a variation of the client-based approach above. A client asks a server for an information object. This request results in the server sending the client a compact version of the object, which we call an oblet. An oblet can reside in the client and make manifestation decisions based on object characteristics, user profile and preferences, client hardware and software characteristics, etc. Once a manifestation has been decided, the oblet asks the server to transmit the data components of the object needed for that manifestation. For example, the major data components of the multimedia news object of the previous section may be the audio component of the news, the video presentation, news for the handicapped, etc. The oblet must also contain the reasoning components necessary to make a manifestation decision.
The oblet runs at the client and chooses a manifestation for this client based on the client's capabilities, the user profile, and the object characteristics. Only then is the server requested to transmit the data needed for the manifestation. It is important to note that no data is transmitted to the client unless it is needed for a manifestation and is requested as a part of the manifestation decision.
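The request flow described above (decide at the client first, then fetch only what the decision needs) might be sketched as follows. This is an illustrative sketch under stated assumptions: the names `ObletSketch`, `ContentServer`, `RecordingServer`, and `Oblet`, the string-valued capabilities, and the component names are all hypothetical, and the in-memory server stands in for a real network connection.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ObletSketch {
    /** Hypothetical server interface: returns the data of one named component. */
    public interface ContentServer {
        byte[] fetch(String component);
    }

    /** In-memory stand-in for a content server that records each request. */
    public static class RecordingServer implements ContentServer {
        public final List<String> requested = new ArrayList<>();

        @Override
        public byte[] fetch(String component) {
            requested.add(component);
            return new byte[0]; // placeholder payload
        }
    }

    /** Compact object description: component names and the capability each needs. */
    public static class Oblet {
        private final Map<String, String> requiredCapability = new LinkedHashMap<>();

        public Oblet component(String name, String capability) {
            requiredCapability.put(name, capability);
            return this;
        }

        /** Decide at the client, then request only the components that are needed. */
        public Map<String, byte[]> manifest(List<String> clientCapabilities,
                                            ContentServer server) {
            Map<String, byte[]> data = new LinkedHashMap<>();
            for (Map.Entry<String, String> e : requiredCapability.entrySet()) {
                if (clientCapabilities.contains(e.getValue())) {
                    data.put(e.getKey(), server.fetch(e.getKey()));
                }
            }
            return data;
        }
    }

    public static void main(String[] args) {
        Oblet news = new Oblet()
            .component("news-text", "text")
            .component("news-audio", "audio")
            .component("news-video", "video");
        RecordingServer server = new RecordingServer();

        // A client without video never causes the video component to be sent.
        news.manifest(Arrays.asList("text", "audio"), server);
        System.out.println(server.requested);
    }
}
```

The point of the sketch is the ordering: the server sees a request for a component only after the oblet has decided that the manifestation uses it.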
Our approach is novel in that:
The basic idea of our proposal is that the user request for information results in a program that is transmitted from the server to the client. This program can be thought of as a proxy server that resides in the client, makes manifestation decisions, and interacts with the content server, getting data that it needs for a manifestation. The proxy server is constructed dynamically for the client by the content server.
The idea of a proxy server controlling manifestations at a client has an interesting consequence that can be referred to as dynamic manifestations. Consider a scenario in which our multimedia news object is being manifested on a PC at a user's home. The user decides to stop the manifestation and to resume it later from his/her car on the car radio (or PDA). The point of this example is that a manifestation was started on one client, stopped, and is now being asked to resume on a different client with quite different capabilities. Whereas the PC manifestation may have utilized video, text, and audio, the car manifestation may use only audio. Moreover, how do we "resume" in the car from where the previous manifestation was stopped?
If we view this problem of dynamic manifestations from the point of view of the proxy server proposal, a solution presents itself in a straightforward manner. The proxy server maintains state while it is running. Once a manifestation is stopped, the proxy server can save this state on some server in the network. The next time the user invokes a "resume" command, the proxy server makes a new manifestation decision, re-establishes the saved state, and starts the new manifestation from the previous state information. Thus, the user sees a continuation of the previous rendition of the object.
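The stop-and-resume scenario might look as follows in code. This is a sketch only, under several assumptions that are not in the text: the state is reduced to a playback position in seconds, the network-side store is an in-memory map, the manifestation decision depends on a single `clientHasVideo` flag, and the names `ResumeSketch`, `ManifestationState`, and `StateStore` are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class ResumeSketch {
    /** State kept by the proxy while running; here only a playback position. */
    public static class ManifestationState {
        public final String objectId;
        public final int positionSeconds;

        public ManifestationState(String objectId, int positionSeconds) {
            this.objectId = objectId;
            this.positionSeconds = positionSeconds;
        }
    }

    /** Hypothetical network-side store, keyed by user and object. */
    public static class StateStore {
        private final Map<String, ManifestationState> saved = new HashMap<>();

        public void save(String user, ManifestationState state) {
            saved.put(user + "/" + state.objectId, state);
        }

        public ManifestationState load(String user, String objectId) {
            return saved.get(user + "/" + objectId);
        }
    }

    /**
     * Resume on a possibly different client: make a fresh manifestation
     * decision for this client, but start from the saved position.
     */
    public static String resume(StateStore store, String user, String objectId,
                                boolean clientHasVideo) {
        ManifestationState state = store.load(user, objectId);
        int start = (state == null) ? 0 : state.positionSeconds;
        String manifestation = clientHasVideo ? "full" : "audio";
        return manifestation + "@" + start;
    }

    public static void main(String[] args) {
        StateStore store = new StateStore();
        // Stopped on the home PC 95 seconds into the news object.
        store.save("user-1", new ManifestationState("news-1", 95));
        // Resumed in the car, where only audio is available.
        System.out.println(resume(store, "user-1", "news-1", false));
    }
}
```

Keeping the saved state independent of the modality in use is what lets the car manifestation continue from where the PC manifestation stopped.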
The notion of a proxy server can also be viewed as a generalization of the "install" programs shipped with most PC software, which check the PC for certain resources (availability of disk space, version of the operating system, etc.) before installing a copy of the software under consideration. In networked environments the install program may in fact download a copy of the software from a server on the network. We generalize the install methodology by making the install script a general reasoning engine with explicit programmability, by allowing the downloading to be parametric and resource dependent, and by generalizing the installation to a run-time execution of the downloaded object.