C.4 Multimedia Information Technologies

Principal Authors:

Dan Fishman and Hector Garcia-Molina

Additional Contributors:

William Ahearn, Gregory Allemann, Stephen Archer, Peggy Bair, Nathaniel Borenstein, James Foley, Shahram Ghaderharizadeh, Pat Hanrahan, Paula Hawthorn, Stephen F. Heil, Ronald B. Jennings, Monica Krueger, Andrew Lippman, Ajay Luthra, Bryan Lyles, Balas Kausik Natarajan, Scott Nelson, Dragutin Petkovic, James Romlein, Bruce L. Steger, Connie Stout, Lawrence J. Thorpe and Richard Watson


1. Introduction

Because the term "multimedia" may have a variety of meanings, we start by presenting our working definition. A medium is a way to convey information to or from a human. For example, black-and-white still images are a medium. Color images are a different medium because they carry different information. A sequence of moving images (e.g., a movie) is yet another type of medium. Multimedia refers to the use of multiple media (plural of medium) to convey information. Multimedia information is information meant to be conveyed via multiple media.

Multimedia systems offer a quantum leap over conventional systems, such as one that uses only plain text to convey information. There are three main reasons for this:

"All media are not created equal."

Some media are better for conveying certain information. For instance, a musical tune is best given via sound. Furthermore, even if a concept can be presented in multiple ways, some people can better absorb it via a particular medium. For example, some students may remember color patterns best, while others may remember musical patterns best. Thus, a system that can offer both media will be useful to a much wider variety of people.

"A picture is worth a thousand words."

The rate at which information can be conveyed is much higher for some media. In addition, using multiple media concurrently has the potential for an even higher overall rate.

"The whole is greater than the sum of its parts."

There is strong synergy between the media, so a multimedia presentation is much more realistic. It can be more of an "experience" than a simple transfer of information.

In principle, multimedia systems cover a very wide spectrum, from smoke signals to virtual reality. However, modern multimedia systems usually have some, if not all, of the following features:

Because of its high impact as a communication medium, multimedia is a very desirable form of communication, and it is certain to play a very important role in National Information Infrastructure applications ranging from health care and education to entertainment and multiperson games.

2. The Technical Challenges of a Multimedia Information Infrastructure

The multimedia research area cuts across most of the other research tracks covered in this report. Thus, it is important to understand what technological problems are inherent to the use of multimedia. For instance, how does multimedia impact the information access problems (Section C.3), the dependability issues (Section C.6) or the network components (Sections C.1 and C.2)? We can answer these questions by identifying four key requirements of multimedia information systems:

We have identified the following classes of technical areas that should be addressed in order to meet the above technical challenges:

In the next section we outline the key research problems in each of the above areas. Section 4 then gives our recommendations for the development of testbeds.

3. Research and Development Recommendations: Research Problems

For each research problem identified in this section, we give its priority and time frame. By priority, we mean the priority for government investment. All the problems we identify here are important; the only difference is that we expect that some of them will be addressed mainly by industry, while others require much more attention by funding agencies. The former problems will be labeled "low priority" (L), the latter "high priority" (H), with "medium priority" (M) in between. For the time frame, we indicate when one can expect significant results. We expect that problems labeled "short term" will be best addressed by industry, with solutions in the next two to four years, while problems labeled "long term" will be addressed mainly by universities and industrial advanced research laboratories, with solutions expected in four to eight years.

By high priority, we mean those pressing research problems that are not likely to receive sufficient research attention from private industry. We believe the funding for these research efforts should be directed primarily at universities and government labs, possibly in partnership with industrial R&D organizations.

3.1 Multimedia Information Capture

New technologies are needed for creating multimedia sources with a desired content, and for manipulating, editing and otherwise converting existing sources into desired forms. This includes initial conversion and processing of existing multimedia information (e.g., images and film) into digital format, as well as digital capture and transformation of live events. A further goal is to make it easy to tag and annotate this information for subsequent retrieval. Due to the richness of multimedia, simple (i.e., text) retrieval techniques will not suffice. Techniques for indexing by content (color, texture, motion, etc.) and by similarity need to be devised, together with easy-to-use automated and semi-automated tools for database population and index computation.
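As one illustration of indexing by content and by similarity, the sketch below indexes images by color alone. It is a minimal example under stated assumptions, not a recommended method: images are assumed to be plain lists of (r, g, b) tuples with values from 0 to 255, and the function names, the bin count and the choice of histogram intersection as the similarity measure are all illustrative.

```python
from collections import Counter


def color_histogram(pixels, levels=4):
    """Quantize (r, g, b) pixels into levels**3 bins and return normalized counts."""
    if not pixels:
        raise ValueError("an image needs at least one pixel")
    step = 256 // levels  # width of each quantization bucket per channel
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = len(pixels)
    return {bin_: n / total for bin_, n in counts.items()}


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1.0 means identical quantized color distributions."""
    return sum(min(v, h2.get(k, 0.0)) for k, v in h1.items())


def rank_by_color(query_pixels, collection):
    """Return (name, score) pairs for a dict of images, most similar first."""
    q = color_histogram(query_pixels)
    scored = [
        (name, histogram_intersection(q, color_histogram(px)))
        for name, px in collection.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    red = [(255, 0, 0)] * 4
    blue = [(0, 0, 255)] * 4
    mostly_red = [(250, 10, 10)] * 3 + [(0, 0, 255)]
    for name, score in rank_by_color(red, {"blue": blue, "mostly_red": mostly_red}):
        print(name, round(score, 2))
```

In a real system the histograms would be computed once at database population time and stored with each image, so that a query compares small index entries rather than raw pixels. Texture and motion features would need their own descriptors.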

Research Problems:

M -- Fast, inexpensive 24+ bit scanners, to capture various source materials, including deteriorating source materials, such as aging collections of photographic images and films. [medium term]

M -- Easy-to-use tools for image correction and restoration in digital format. [medium term]

H -- Media recognition and segmentation, automatic feature extraction and indexing for retrieval. [long term]

M -- Effective browsers as an integral part of query and retrieval systems. (Section 3.5) [medium term]

3.2 Multimedia Authoring and Generation

A limiting factor in the wide-scale use of multimedia communications is the inherent difficulty of creating multimedia presentations and the dearth of tools to assist in the process. There is a need to provide robust, yet inexpensive, multimedia authoring tools that enable non-experts to design and organize effective multimedia presentations, and to play back reasonably high-quality multimedia presentations from specifications. Such presentations may be interactive, guided by input from the intended user. For example, in educational applications, the tools will enable teachers to prepare training exercises and students to prepare multimedia reports and projects. These tools are analogous to current desktop publishing tools, except that they must address the layout, merging, synchronization and presentation of multiple streams of dynamic, time-varying information, in order to convey an intended communication experience. Techniques for simplifying the generation of multimedia presentations will accelerate the use of multimedia by non-experts.

Research Problems:

M -- Tools for designing multimedia presentations, such as for outlining non-linear presentations. [medium term]

M -- Design critics/design intelligence. [long term]

M -- Generation based upon standardized "style sheets." [medium term]

M -- Modalities for alternate senses, such as touch, taste and smell. [long term]

3.3 Ownership and Fair Use

Information property rights and economic models are very complex and are poorly understood. Current models and mechanisms are primitive and stand in the way of commercial distribution and use. Mechanisms are needed for protecting ownership and for charging, so that multimedia information becomes widely and commercially available. We note that while this is a critical issue for all types of information, multimedia introduces particular challenges. For example, a movie may be available in various formats, running times and resolutions. Each may have different use restrictions and pricing schemes. A multimedia system must understand all the variations, who can access what version, how one pays for each and how one format can be converted to another.
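The movie example above can be made concrete with a small data model. This is only a sketch of what "understanding all the variations" might involve. The class names, field names and the notion of a named "use" (such as "classroom") are hypothetical, and real rights and pricing models would be far richer.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class MovieVersion:
    fmt: str                  # hypothetical format label, e.g. "low-res preview"
    resolution: tuple         # (width, height) in pixels
    running_minutes: int
    price_cents: int
    allowed_uses: frozenset   # names of uses this version is licensed for


@dataclass
class MovieCatalogEntry:
    title: str
    versions: list = field(default_factory=list)

    def versions_for(self, use):
        """All versions whose use restrictions permit the given use."""
        return [v for v in self.versions if use in v.allowed_uses]

    def cheapest_for(self, use):
        """Lowest-priced permitted version, or None if the use is not allowed."""
        candidates = self.versions_for(use)
        return min(candidates, key=lambda v: v.price_cents) if candidates else None


if __name__ == "__main__":
    entry = MovieCatalogEntry("Example Film", [
        MovieVersion("preview", (320, 240), 5, 0, frozenset({"browse"})),
        MovieVersion("standard", (640, 480), 95, 300, frozenset({"home", "classroom"})),
        MovieVersion("full", (1920, 1080), 110, 900, frozenset({"home"})),
    ])
    print(entry.cheapest_for("classroom"))
    print(entry.cheapest_for("broadcast"))
```

The sketch covers only access and pricing lookups. Conversion between formats, also mentioned above, would need additional rules stating which conversions each license permits.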

Research Problems:

H -- Economic and legal models for multimedia in the NII. This is a multidisciplinary problem. [long term]

H -- Media registration facilities to protect intellectual property. [medium term]

M -- Capturing authorship, ownership and pedigree. [long term]

H -- Mechanisms for tracking use of information, for maintaining histories of modifications (version management) and for validating the integrity of information. [long term]

3.4 Multimedia Representation

A multimedia system must store and manipulate information. This information is more than just the bytes that make up the objects (e.g., images, waveforms) to be conveyed to the end users. The information must also describe the semantics of the multimedia objects, e.g., what each means, how it is related to others and how it can be manipulated. The system must also understand the internal structure and format of each object. For example, for an image, the system needs to know if the pixels (each point of the image) are stored left to right, top to bottom, or some other way; if each pixel is a simple "on" or "off," or if it represents an intensity or a color; if the image is compressed and how; and so on.
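The image example can be sketched as a small descriptor record kept alongside the raw bytes. The type names, the particular scan orders and the bit depths chosen here are assumptions made for illustration; they do not correspond to any specific file format.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ScanOrder(Enum):
    LEFT_RIGHT_TOP_DOWN = "left to right, top to bottom"
    LEFT_RIGHT_BOTTOM_UP = "left to right, bottom to top"


class PixelKind(Enum):
    BILEVEL = "on/off"
    INTENSITY = "intensity"
    COLOR = "color"


# Assumed bit depths for this sketch; real formats vary.
BITS_PER_PIXEL = {PixelKind.BILEVEL: 1, PixelKind.INTENSITY: 8, PixelKind.COLOR: 24}


@dataclass(frozen=True)
class ImageDescriptor:
    width: int
    height: int
    scan_order: ScanOrder
    pixel_kind: PixelKind
    compression: Optional[str] = None  # hypothetical scheme name; None means uncompressed

    def uncompressed_bytes(self):
        """Size of the raw pixel data, rounded up to a whole byte."""
        bits = self.width * self.height * BITS_PER_PIXEL[self.pixel_kind]
        return (bits + 7) // 8


if __name__ == "__main__":
    photo = ImageDescriptor(640, 480, ScanOrder.LEFT_RIGHT_TOP_DOWN, PixelKind.COLOR)
    print(photo.uncompressed_bytes())  # 921600
```

Even this small record lets a system answer questions without touching the pixel data, such as how much storage or bandwidth an uncompressed delivery would need.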

The representation problem is how to manage all the information about the multimedia objects (sometimes called meta-information). This includes knowing what this information is in the first place, how to store it and how to use it to make the multimedia system more effective. Representation is intimately connected to retrieval (Section 3.5); for example, representations that include pre-computed search information can speed up retrieval.

It is also desirable to create mechanisms by which multimedia systems can accommodate unanticipated, new media types, as well as unanticipated query types and manipulations without rendering existing services or information obsolete.

Given the complexity of multimedia objects, it is important to develop efficient, effective and flexible representations, both to reduce load on system components and to make it easier to meet real-time constraints, thus improving the quality of the information conveyed.

Research Problems:

M -- Information modeling: How to model and represent complex/multiple media, including annotations. [medium term]

H -- Understanding and standardizing formats for synchronization and coordination. [medium term]

L -- Multimedia representations that can accommodate changes, as well as tools for supporting such schema extensions. [short term]

H -- Extending the concept of information objects to include "interactions." An interaction object could be a script or a flow control structure that represents an exchange between users and/or systems. [long term]

M -- Resolution-independent representation that reduces storage overhead of multiple representations for the same object. [short term]

H -- Improved representations. One important example is compression schemes that exploit the semantics of a particular application or the content type. Another example is representations that can improve performance and effectiveness of searching and browsing. [medium term]

L -- Mechanisms for mapping between representations. [medium term]

L -- Mechanisms for implementing multiple service levels. [long term]
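One simple way to avoid storing several copies of the same object at different resolutions is to keep a single master and derive lower resolutions on demand. The sketch below illustrates that idea only; it is not a truly resolution-independent representation. It assumes a grayscale image held as a list of equal-length rows of integers, and the function names are hypothetical.

```python
def downsample_2x(image):
    """Halve width and height by averaging each 2x2 block (integer average).

    A trailing odd row or column is dropped.
    """
    height, width = len(image), len(image[0])
    return [
        [
            (image[y][x] + image[y][x + 1] + image[y + 1][x] + image[y + 1][x + 1]) // 4
            for x in range(0, width - 1, 2)
        ]
        for y in range(0, height - 1, 2)
    ]


def at_resolution(master, max_width):
    """Derive a version no wider than max_width from the single stored master."""
    image = master
    while len(image[0]) > max_width and len(image) >= 2 and len(image[0]) >= 2:
        image = downsample_2x(image)
    return image


if __name__ == "__main__":
    master = [
        [0, 0, 8, 8],
        [0, 0, 8, 8],
        [4, 4, 12, 12],
        [4, 4, 12, 12],
    ]
    print(at_resolution(master, 2))  # [[0, 8], [4, 12]]
```

The trade-off is computation at delivery time in exchange for storage, which is one reason the choice of representation interacts with the real-time constraints discussed above.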

3.5 Information Management Systems

The multimedia information management system is responsible for saving multimedia information and making it available to end users when they need it. Current commercial database systems will not be able to handle the requirements that multimedia introduces, mainly large data volume and synchronized, real-time delivery. Data storage and access software services that are adequate for multimedia must be developed. Furthermore, given the expected large amounts of multimedia information, it is critical for users to be able to focus on what is important for them. This requires the development of advanced querying, navigation and browsing techniques that enable users to find the information they require, when they require it.

Research Problems:

H -- Develop a scalable software architecture that covers different types of storage media and data volumes, from petabyte stores to desktop storage, from very fast access times to off-line storage. [medium term]

M -- Provide traditional database services such as reliability, access control and concurrency control in a way that scales to very large data collections distributed over many computers. [medium term]

H -- Deliver multiple data streams in an integrated fashion, with guaranteed timing constraints. [long term]

H -- Integration of data from heterogeneous sources, with possibly different models or formats. [long term]

H -- Content-based queries over multimedia information. For example, searching for images that contain a given shape or looking for scene changes in a movie. [long term]

H -- Queries over compressed or multi-representation objects. The goal is to search over objects without having to uncompress them or to change their representation. [long term]

H -- Optimization of queries that span distributed, heterogeneous information sources. [long term]

M -- Mechanisms for querying multimedia representations, including the relationships between objects, and the access restrictions and pricing information of objects. [medium term]

M -- Mechanisms for browsing and navigating (e.g., via hyperlinks) through multimedia information, possibly in conjunction with content-based queries. Good human interfaces that allow for interaction are critical here. [medium term]

H -- Strategies for displaying results of queries and searches, including ways to compactly (space- and time-wise) visualize large collections of multimedia information. [long term]
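To make the content-based query example of looking for scene changes in a movie more concrete, the sketch below flags frames whose intensity distribution differs sharply from the previous frame. It is a deliberately simple illustration: frames are assumed to be flat lists of grayscale values from 0 to 255, and the bin count, threshold and function names are arbitrary choices, not established parameters.

```python
def gray_histogram(frame, bins=8):
    """Normalized histogram of grayscale values (0 to 255) for one frame."""
    step = 256 // bins
    hist = [0] * bins
    for value in frame:
        hist[value // step] += 1
    total = len(frame)
    return [count / total for count in hist]


def scene_changes(frames, threshold=0.5):
    """Indices of frames whose histogram differs strongly from the previous frame.

    The difference is half the sum of absolute bin differences, so it lies in [0, 1].
    """
    changes = []
    previous = None
    for index, frame in enumerate(frames):
        current = gray_histogram(frame)
        if previous is not None:
            difference = 0.5 * sum(abs(a - b) for a, b in zip(previous, current))
            if difference > threshold:
                changes.append(index)
        previous = current
    return changes


if __name__ == "__main__":
    dark = [10] * 16
    bright = [240] * 16
    print(scene_changes([dark, dark, bright, bright, dark]))  # [2, 4]
```

A comparison of this kind misses gradual transitions such as fades, and it must decode every frame. This is part of why queries over compressed representations are listed above as a separate research problem.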

3.6 Storage Hardware

Multimedia requires large data volumes that will rapidly increase as resolutions, the types of media available and the number of sources increase. Storage systems must be developed that can handle the data volume and timing requirements of multimedia.

Research Problems:

M -- Higher density, higher bandwidth and lower latency storage devices. [short term]

L -- Storage arrays with parallel access to multiple storage devices. [short term]

M -- Storage hierarchies, from petabyte tertiary storage to gigabyte primary random-access storage. [medium term]

H -- Low power storage for mobile, multimedia access devices. [long term]

4. Research and Development Recommendations: Testbeds

In this section we recommend funding of testbeds as a way to explore issues of scalability, interoperability and usability. Because of the great demands placed by multimedia on all system resources, it is especially important to create large-scale deployments with significant multimedia content to understand its overall impact.

4.1 Criteria for Testbed Funding

Before presenting our testbed recommendations, we present a number of criteria that we believe should be considered in testbed selection, implementation and funding.

The chosen applications should drive critical research issues, such as:

The applications should themselves be important nationally, e.g., their success would represent progress in educational performance, manufacturing efficiency, health care delivery and so forth.

There should be a determined effort to leverage or reuse hardware and software components across separate application areas and testbeds.

The testbeds must be organized into effective experiments, with objective criteria and metrics used to evaluate results across multiple dimensions, including the technological and sociological factors. In assessing experiments, it will be important to consider such human factors as the quality of visual and auditory presentations as part of the overall assessment of effectiveness.

The testbed should not be just a "demo." It should involve real users and provide realistic uses of the content and technology. On the other hand, it must be regarded as only a pilot and not a real implementation of the NII, because it may be desirable to tear down the experiment in favor of trying something new or different.

Because of the nature and scale of the testbeds and the experiments that will be run on them, multiyear funding commitments will be required.

4.2 Recommended Testbeds

We recommend two testbeds, each of which represents a different point in the cost-versus-power curve. The first one is aimed at education, with the intent of pushing lower-cost solutions for less demanding applications. The second testbed is in health care. This testbed is recommended because of the very demanding requirements posed by health care.

4.2.1 Education Testbed

Middle school multimedia testbed: Phase A: Two- to three-year phased implementation, one school, 2,000 students, 100 computers at $5,000 each, deployed in learning centers. A research partnership should be established with a research university or an industrial research lab.

Phase B: Four to five years, one district, 10 schools, 20,000 students.

Challenges

4.2.2 Health Care Testbed

Multihospital testbed: Phase A: Two- to three-year phased implementation, three hospitals.

Phase B: Four to five years, 20 hospitals, richer functionality.

Challenges