C.4 Multimedia Information Technologies
Principal Authors:
Dan Fishman and Hector Garcia-Molina
Additional Contributors:
William Ahearn, Gregory Allemann, Stephen Archer, Peggy Bair, Nathaniel
Borenstein, James Foley, Shahram Ghaderharizadeh, Pat Hanrahan, Paula Hawthorn,
Stephen F. Heil, Ronald B. Jennings, Monica Krueger, Andrew Lippman, Ajay
Luthra, Bryan Lyles, Balas Kausik Natarajan, Scott Nelson, Dragutin Petkovic,
James Romlein, Bruce L. Steger, Connie Stout, Lawrence J. Thorpe and Richard
Watson
1. Introduction
Because the term "multimedia" may have a variety of meanings, we start by
presenting our working definition. A medium is a way to convey information to
or from a human. For example, black-and-white still images are a medium. Color
images are a different medium because they carry different information. A
sequence of moving images (e.g., a movie) is yet another type of medium.
Multimedia refers to the use of multiple media (plural of medium) to convey
information. Multimedia information is information meant to be conveyed via
multiple media.
Multimedia systems offer a substantial advance over conventional systems, such
as those that use only plain text to convey information. There are three main
reasons for this:
"All media are not created equal."
Some media are better for conveying certain information. For instance, a
musical tune is best given via sound. Furthermore, even if a concept can be
presented multiple ways, some people can better absorb it via a particular
medium. For example, some students may remember color patterns while others may
remember musical patterns best. Thus, a system that can offer both media will
be useful to a much wider variety of people.
"A picture is worth a thousand words."
The rate at which information can be conveyed is much higher for some media. In
addition, using multiple media concurrently has the potential for an even
higher overall rate.
"The whole is greater than the sum of its parts."
There is strong synergy between the media, so a multimedia presentation is much
more realistic. It can be more of an "experience" than a simple transfer of
information.
In principle, multimedia systems cover a very wide spectrum, from smoke signals
to virtual reality. However, modern multimedia systems usually have some, if
not all, of the following features:
- Simultaneous and integrated delivery: The multiple media are
delivered concurrently and in an integrated or synchronized fashion. For
example, the voices in a sound track must match the corresponding mouth
movements in the movie.
- Interactive delivery: A human interacts with the delivery system,
answering questions or indicating what information is desired next. Note that
the information path from human to computer system can also use a variety of
media, e.g., a mouse for pointing or voice input.
- Dynamic and real-time delivery: In modern systems, at least one of
the media is dynamic (i.e., conveying a stream of information), such as a movie
or a soundtrack. The stream of information usually has to be delivered with
some timing constraints, e.g., 24 frames a second.
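As an illustration of such a timing constraint, the sketch below computes the
presentation deadline of each frame in a stream and reports which frames
arrived too late. The function names and the sample arrival times are
assumptions made for this example; only the rate of 24 frames a second comes
from the text above.

```python
from fractions import Fraction


def frame_deadlines(frame_count, frames_per_second=24):
    """Return the presentation deadline, in seconds, of each frame.

    Frame i is due i / frames_per_second seconds after playback starts.
    Fractions avoid rounding drift over long streams.
    """
    period = Fraction(1, frames_per_second)
    return [i * period for i in range(frame_count)]


def late_frames(arrival_times, frames_per_second=24):
    """Return the indices of frames that arrived after their deadline."""
    deadlines = frame_deadlines(len(arrival_times), frames_per_second)
    return [
        index
        for index, (arrived, due) in enumerate(zip(arrival_times, deadlines))
        if arrived > due
    ]


# Hypothetical arrival times, in seconds since playback started.
arrivals = [0.0, 0.03, 0.10, 0.12]
print(late_frames(arrivals))  # frame 2 arrives after its deadline of 2/24 s
```

A real delivery system would also have to decide what to do with a late frame
(drop it, delay the stream, or degrade quality); that policy is outside the
scope of this sketch.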
Because of its high impact, multimedia is a very desirable form of
communication, and it is certain to play an important role in National
Information Infrastructure (NII) applications ranging from health care and
education to entertainment and multiperson games.
2. The Technical Challenges of a Multimedia Information Infrastructure
The multimedia research area cuts across most of the other research tracks
covered in this report. Thus, it is important to understand what technological
problems are inherent to the use of multimedia. For instance, how does
multimedia impact the information access problems (Section C.3), the
dependability issues (Section C.6) or the network components (Sections C.1 and
C.2)? We can answer these questions by identifying four key requirements of
multimedia information systems:
- Multimedia typically involves synchronized and real-time information
delivery. New technologies are required to meet these needs.
- Operations on multimedia information are often richer and more complex.
For example, end users may want to search images based on shapes or hues. They
may want to edit and merge videos. These operations present new computational
and user interface challenges.
- Multimedia technology is driven by a large consumer market. This leads to
requirements for low cost, ease of use, and the ability to accommodate large
numbers of users and information sources.
- Multimedia stresses all system components. Given the above requirements
and the large data volumes involved (e.g., a single high-quality image may
require millions of bytes), multimedia places large loads on all system
components. Just to mention a few examples, storage subsystems must have large
capacity and large transfer rates, networks must have high bandwidth and meet
real-time constraints, and displays must be high resolution.
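The data volume claim can be checked with simple arithmetic. The sketch below
computes uncompressed sizes and transfer rates; the resolutions and bit depths
are illustrative assumptions, not figures taken from this report.

```python
def image_bytes(width, height, bits_per_pixel):
    """Uncompressed size of one image, in bytes."""
    return width * height * bits_per_pixel // 8


def video_bytes_per_second(width, height, bits_per_pixel, frames_per_second):
    """Uncompressed transfer rate of a video stream, in bytes per second."""
    return image_bytes(width, height, bits_per_pixel) * frames_per_second


# Assumed example: a 1024 x 768 still image with 24-bit color.
print(image_bytes(1024, 768, 24))  # 2359296 bytes, about 2.4 million

# Assumed example: 640 x 480 video with 24-bit color at 24 frames a second.
print(video_bytes_per_second(640, 480, 24, 24))  # 22118400 bytes per second
```

Even at these modest assumed resolutions, a single still image occupies
millions of bytes and uncompressed video requires tens of millions of bytes
per second, which is why storage, networks and displays are all affected.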
We have identified the following technical areas that should be
addressed in order to meet the above technical challenges:
- Multimedia information capture.
- Multimedia authoring and generation.
- Ownership and fair use.
- Representation.
- Multimedia information management systems.
- Storage hardware.
In the next section we outline the key research problems in each of the above
areas. Section 4 then gives our recommendations for the development of testbeds.
3. Research and Development Recommendations: Research Problems
For each research problem identified in this section, we give its priority and
time frame. By priority, we mean the priority for government investment. All
the problems we identify here are important; the only difference is that we
expect that some of them will be addressed mainly by industry, while others
require much more attention by funding agencies. The former problems will be
labeled "low priority" (L), the latter "high priority" (H), with "medium
priority" (M) in between. For the time frame, we indicate when one can expect
significant results. We expect that problems labeled "short term" will be best
addressed by industry, with solutions in the next two to four years, while
problems labeled "long term" will be addressed mainly by universities and
industrial advanced research laboratories, with solutions expected in four to
eight years.
By high priority, we mean those pressing research problems that are not likely
to receive sufficient research attention from private industry. We believe that
funding for these research efforts should be directed primarily at universities
and government labs, possibly in partnership with industrial R&D
organizations.
3.1 Multimedia Information Capture
New technologies are needed for creating multimedia sources with a desired
content, and for manipulating, editing and otherwise converting existing
sources into desired forms. This includes initial conversion and processing of
existing multimedia information (e.g., images and film) into digital format, as
well as digital capture and transformation of live events. A further goal is to
make this information easy to tag and annotate for subsequent retrieval. Because
of the richness of multimedia, simple (i.e., text-based) retrieval techniques
will not
suffice. Techniques for indexing by content (color, texture, motion, etc.) and
by similarity need to be devised, together with easy-to-use automated and
semi-automated tools for database population and index computation.
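One simple way to index images by color content and compare them by
similarity is a coarse color histogram. The sketch below is only an
illustration of the idea, not a technique prescribed by this report; the
function names, the bin count, and the use of histogram intersection as the
similarity measure are all assumptions.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Return a normalized color histogram for a list of (r, g, b) pixels.

    Each channel value (0..255) is mapped to one of bins_per_channel bins,
    giving bins_per_channel ** 3 color buckets in total.
    """
    if not pixels:
        raise ValueError("at least one pixel is required")
    counts = [0] * bins_per_channel ** 3
    for r, g, b in pixels:
        r_bin = r * bins_per_channel // 256
        g_bin = g * bins_per_channel // 256
        b_bin = b * bins_per_channel // 256
        counts[(r_bin * bins_per_channel + g_bin) * bins_per_channel + b_bin] += 1
    return [count / len(pixels) for count in counts]


def histogram_intersection(first, second):
    """Similarity in [0, 1]: 1.0 for identical histograms, 0.0 for disjoint ones."""
    return sum(min(a, b) for a, b in zip(first, second))


# Tiny hypothetical "images", each a flat list of pixels.
red = color_histogram([(255, 0, 0)] * 4)
reddish = color_histogram([(250, 10, 5)] * 4)
blue = color_histogram([(0, 0, 255)] * 4)

print(histogram_intersection(red, reddish))  # 1.0: both fall in the same bucket
print(histogram_intersection(red, blue))     # 0.0: no buckets in common
```

Because the histogram is computed once when an image is added to the
collection, it can be stored as an index entry and compared against a query
without re-reading the image itself. Histograms ignore shape and texture, so
a practical system would need additional features.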
Research Problems:
M -- Fast, inexpensive 24+ bit scanners, to capture various
source materials, including deteriorating source materials, such as aging
collections of photographic images and films. [medium term]
M -- Easy-to-use tools for image correction and restoration in digital
format. [medium term]
H -- Media recognition and segmentation, automatic feature extraction
and indexing for retrieval. [long term]
M -- Effective browsers as an integral part of query and retrieval
systems. (Section 3.5) [medium term]
3.2 Multimedia Authoring and Generation
The wide-scale use of multimedia communications is limited by the inherent
difficulty of creating multimedia presentations and the dearth of tools to
assist in the process. There is a need to provide robust, yet inexpensive,
multimedia authoring tools that enable non-experts to design and organize
effective multimedia presentations, and to play back reasonably high-quality
multimedia presentations from specifications. Such presentations may be
interactive, guided by the responses of the intended user. For example, in
educational applications, the tools will enable teachers and students to
prepare training exercises, and multimedia reports and projects, respectively.
These tools are analogous to current desktop publishing tools, except that they
must address the layout, merging, synchronization and presentation of multiple
streams of dynamic, time-varying information, in order to convey an intended
communication experience. Techniques for simplifying the generation of
multimedia presentations will accelerate the use of multimedia by non-experts.
Research Problems:
M -- Tools for designing multimedia presentations, such as for
outlining non-linear presentations. [medium term]
M -- Design critics/design intelligence. [long term]
M -- Generation based upon standardized "style sheets." [medium
term]
M -- Modalities for alternate senses, such as touch, taste and smell.
[long term]
3.3 Ownership and Fair Use
Information property rights and economic models are very complex and are poorly
understood. Current models and mechanisms are primitive and stand in the way of
commercial distribution and use. Mechanisms are needed for protecting ownership
and for charging, so that multimedia information becomes widely and
commercially available. We note that while this is a critical issue for all
types of information, multimedia introduces particular challenges. For example,
a movie may be available in various formats, running times and resolutions.
Each may have different use restrictions and pricing schemes. A multimedia
system must understand all the variations, who can access what version, how one
pays for each and how one format can be converted to another.
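To make the movie example concrete, the sketch below models each version of a
work as a record carrying its format, running time, resolution, price and
permitted uses. Every name and value here is hypothetical; the report does not
define such a schema, and a real system would need a far richer rights model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rendition:
    """One version of a work. All field names are hypothetical."""
    format_name: str
    running_minutes: int
    width: int
    height: int
    price_cents: int
    allowed_uses: frozenset


def renditions_for(renditions, use):
    """Return the renditions that permit the given use, cheapest first."""
    permitted = [r for r in renditions if use in r.allowed_uses]
    return sorted(permitted, key=lambda r: r.price_cents)


# Hypothetical catalog entries for a single movie.
catalog = [
    Rendition("full length", 120, 1920, 1080, 500,
              frozenset({"home viewing"})),
    Rendition("classroom cut", 45, 640, 480, 0,
              frozenset({"home viewing", "classroom"})),
]

for rendition in renditions_for(catalog, "classroom"):
    print(rendition.format_name, rendition.price_cents)
```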
Research Problems:
H -- Economic and legal models for multimedia in the NII. This is
a multidisciplinary problem. [long term]
H -- Media registration facilities to protect intellectual property.
[medium term]
M -- Capturing authorship, ownership and pedigree. [long
term]
H -- Mechanisms for tracking use of information, for maintaining
histories of modifications (version management) and for validating the
integrity of information. [long term]
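One possible building block for the last problem above is a chain of
cryptographic digests, in which each history entry covers the content of a
version and the entry before it. The sketch below is an assumption about how
such a mechanism might look, not a design proposed by this report; the record
layout and function names are hypothetical, and it uses SHA-256 from Python's
standard hashlib module.

```python
import hashlib


def digest(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


def entry_digest(previous, content_digest, author):
    """Digest that binds an entry to its predecessor, content and author."""
    return digest("\n".join([previous, content_digest, author]).encode("utf-8"))


def append_version(history, content: bytes, author: str):
    """Append a new version record to the modification history."""
    previous = history[-1]["entry_digest"] if history else ""
    content_digest = digest(content)
    history.append({
        "author": author,
        "content_digest": content_digest,
        "entry_digest": entry_digest(previous, content_digest, author),
    })


def history_is_intact(history) -> bool:
    """Recompute every link; any altered or reordered entry breaks the chain."""
    previous = ""
    for entry in history:
        expected = entry_digest(previous, entry["content_digest"], entry["author"])
        if entry["entry_digest"] != expected:
            return False
        previous = entry["entry_digest"]
    return True


history = []
append_version(history, b"original footage", "alice")
append_version(history, b"edited footage", "bob")
print(history_is_intact(history))  # True

history[0]["author"] = "mallory"   # tamper with the recorded authorship
print(history_is_intact(history))  # False
```

A digest chain only detects tampering within the history it covers. Protecting
the most recent entry, for example through a trusted registration facility,
remains a separate problem.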
3.4 Multimedia Representation
A multimedia system must store and manipulate information. This information is
more than just the bytes that make up the objects (e.g., images, waveforms) to
be conveyed to the end users. The information must also describe the semantics
of the multimedia objects, e.g., what each means, how it is related to others
and how it can be manipulated. The system must also understand the internal
structure and format of each object. For example, for an image, the system
needs to know if the pixels (each point of the image) are stored left to right,
top to bottom, or some other way; if each pixel is a simple "on" or "off," or
if it represents an intensity or a color; if the image is compressed and how;
and so on.
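The image example can be sketched as a small record of format information
together with a function that depends on it. The field names and the two
pixel orderings shown are assumptions chosen for illustration; real image
formats carry many more attributes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageLayout:
    """Format information about a stored image (hypothetical fields)."""
    width: int
    height: int
    bits_per_pixel: int       # e.g., 1 = on/off, 8 = intensity, 24 = color
    row_major: bool = True    # True: left to right, then top to bottom
    compression: str = "none"


def pixel_bit_offset(layout, x, y):
    """Return the bit offset of pixel (x, y) within the stored pixel data."""
    if layout.compression != "none":
        raise ValueError("offsets are only defined for uncompressed data")
    if not (0 <= x < layout.width and 0 <= y < layout.height):
        raise IndexError("pixel coordinates are outside the image")
    if layout.row_major:
        index = y * layout.width + x
    else:
        index = x * layout.height + y
    return index * layout.bits_per_pixel


by_rows = ImageLayout(width=4, height=3, bits_per_pixel=24)
by_columns = ImageLayout(width=4, height=3, bits_per_pixel=24, row_major=False)

print(pixel_bit_offset(by_rows, 1, 2))     # 216
print(pixel_bit_offset(by_columns, 1, 2))  # 120
```

The same pixel lies at a different position depending on the layout, which
shows why the system cannot interpret the raw bytes without this information.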
The representation problem is how to manage all the information about the
multimedia objects (sometimes called meta-information). This includes knowing
what this information is in the first place, how to store it and how to use it
to make the multimedia system more effective. Representation is intimately
connected to retrieval (Section 3.5); for example, representations that include
pre-computed search information can speed up retrieval.
It is also desirable to create mechanisms by which multimedia systems can
accommodate unanticipated, new media types, as well as unanticipated query
types and manipulations without rendering existing services or information
obsolete.
Given the complexity of multimedia objects, it is important to develop
efficient, effective and flexible representations, both to reduce load on
system components and to make it easier to meet real-time constraints, thus
improving the quality of the information conveyed.
Research Problems:
M -- Information modeling: How to model and represent
complex/multiple media, including annotations. [medium term]
H -- Understanding and standardizing formats for synchronization and
coordination. [medium term]
L -- Multimedia representations that can accommodate changes, as well
as tools for supporting such schema extensions. [short term]
H -- Extending the concept of information objects to include
"interactions." An interaction object could be a script or a flow control
structure that represents an exchange between users and/or systems. [long
term]
M -- Resolution-independent representation that reduces storage
overhead of multiple representations for the same object. [short
term]
H -- Improved representations. One important example is compression
schemes that exploit the semantics of a particular application or the content
type. Another example is representations that can improve performance and
effectiveness of searching and browsing. [medium term]
L -- Mechanisms for mapping between representations. [medium
term]
L -- Mechanisms for implementing multiple service levels. [long
term]
3.5 Information Management Systems
The multimedia information management system is responsible for saving
multimedia information and making it available to end users when they need it.
Current commercial database systems will not be able to handle the requirements
that multimedia introduces, mainly large data volume and synchronized,
real-time delivery. Data storage and access software services that are adequate
for multimedia must be developed. Furthermore, given the expected large amounts
of multimedia information, it is critical for users to be able to focus on what
is important for them. This requires the development of advanced querying,
navigation and browsing techniques that enable users to find the information
they require, when they require it.
Research Problems:
H -- Develop a scalable software architecture that covers
different types of storage media and data volumes, from petabyte stores to
desktop storage, from very fast access times to off-line storage. [medium
term]
M -- Provide traditional database services such as reliability, access
control and concurrency control in a way that scales to very large data
collections distributed over many computers. [medium term]
H -- Deliver multiple data streams in an integrated fashion, with
guaranteed timing constraints. [long term]
H -- Integration of data from heterogeneous sources, with possibly
different models or formats. [long term]
H -- Content-based queries over multimedia information. For example,
searching for images that contain a given shape or looking for scene changes in
a movie. [long term]
H -- Queries over compressed or multi-representation objects. The goal
is to search over objects without having to uncompress them or to change their
representation. [long term]
H -- Optimization of queries that span distributed, heterogeneous
information sources. [long term]
M -- Mechanisms for querying multimedia representations, including the
relationships between objects, and the access restrictions and pricing
information of objects. [medium term]
M -- Mechanisms for browsing and navigating (e.g., via hyperlinks)
through multimedia information, possibly in conjunction with content-based
queries. Good human interfaces that allow for interaction are critical here.
[medium term]
H -- Strategies for displaying results of queries and searches,
including ways to compactly (space- and time-wise) visualize large collections
of multimedia information. [long term]
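As a small illustration of a content-based query, the sketch below looks for
scene changes in a movie by comparing consecutive frames. It is a deliberately
naive approach under stated assumptions: frames are equal-length lists of
grayscale values from 0 to 255, and the function names and the threshold are
hypothetical. It should not be read as the method this report recommends.

```python
def mean_abs_difference(first, second):
    """Average absolute difference between two equal-sized grayscale frames."""
    if len(first) != len(second) or not first:
        raise ValueError("frames must be non-empty and of equal size")
    return sum(abs(a - b) for a, b in zip(first, second)) / len(first)


def scene_changes(frames, threshold=50.0):
    """Return indices of frames that differ sharply from the preceding frame."""
    return [
        index
        for index in range(1, len(frames))
        if mean_abs_difference(frames[index - 1], frames[index]) > threshold
    ]


# Four tiny hypothetical frames: two dark ones followed by two bright ones.
movie = [[10] * 4, [12] * 4, [200] * 4, [198] * 4]
print(scene_changes(movie))  # [2]
```

This sketch operates on uncompressed frames. Performing the same query
directly on compressed data, as one of the problems above calls for, is
considerably harder.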
3.6 Storage Hardware
Multimedia requires large data volumes that will rapidly increase as
resolutions, the types of media available and the number of sources increase.
Storage systems must be developed that can handle the data volume and timing
requirements of multimedia.
Research Problems:
M -- Higher-density, higher-bandwidth and lower-latency storage
devices. [short term]
L -- Storage arrays with parallel access to multiple storage devices.
[short term]
M -- Storage hierarchies, from petabyte tertiary storage to gigabyte
primary random-access storage. [medium term]
H -- Low-power storage for mobile multimedia access devices. [long
term]
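The idea behind storage arrays with parallel access can be sketched with
simple round-robin striping, in which consecutive logical blocks are placed on
different devices so that a large sequential read is spread across all of
them. The function names are hypothetical, and round-robin placement is only
one of several possible layouts.

```python
def locate_block(logical_block, device_count):
    """Map a logical block number to (device index, block offset on device)."""
    if device_count < 1:
        raise ValueError("at least one device is required")
    if logical_block < 0:
        raise ValueError("block numbers must be non-negative")
    return logical_block % device_count, logical_block // device_count


def devices_touched(first_block, block_count, device_count):
    """Return the sorted device indices involved in a sequential read."""
    return sorted({
        locate_block(block, device_count)[0]
        for block in range(first_block, first_block + block_count)
    })


print(locate_block(5, 4))           # (1, 1)
print(devices_touched(2, 4, 4))     # [0, 1, 2, 3]
```

In this layout, a read of four consecutive blocks on a four-device array
touches every device once, so the transfers can proceed in parallel.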
4. Research and Development Recommendations: Testbeds
In this section we recommend funding of testbeds as a way to explore issues of
scalability, interoperability and usability. Because of the great demands
placed by multimedia on all system resources, it is especially important to
create large-scale deployments with significant multimedia content to
understand its overall impact.
4.1 Criteria for Testbed Funding
Before presenting our testbed recommendations, we present a number of criteria
that we believe should be considered in testbed selection, implementation and
funding.
The chosen applications should drive critical research issues, such as:
- Scale/size: For example, large databases, large volumes of data
transfer, and coverage of large geographic areas and large numbers of users.
- Interoperability: For example, across diverse networks, database
models, data types, operating systems and hardware platforms.
- Usability and human factors considerations: For example, ease of
creating multimedia presentations, and of finding and retrieving desired
information.
The applications should themselves be important nationally, e.g., their success
would represent progress in educational performance, manufacturing efficiency,
health care delivery and so forth.
- The testbed experiments should address issues of scalability.
- The testbed systems should be accessible to the research community, with
end-user hardware and software made available.
- Hardware and software platforms should be made available to researchers.
- Copyright-free content should be available for experimentation.
There should be a determined effort to leverage or reuse hardware and software
components across separate application areas and testbeds.
The testbeds must be organized into effective experiments, with objective
criteria and metrics used to evaluate results across multiple dimensions,
including technological and sociological factors. The overall assessment of
effectiveness should take into account human factors such as the quality of
visual and auditory presentations.
The testbed should not be just a "demo." It should involve real users and
provide realistic uses of the content and technology. On the other hand, it
must be regarded as only a pilot and not a real implementation of the NII,
because it may be desirable to tear down the experiment in favor of trying
something new or different.
Because of the nature and scale of the testbeds and the experiments that will
be run on them, multiyear funding commitments will be required.
4.2 Recommended Testbeds
We recommend two testbeds, each of which represents a different point on the
cost-versus-power curve. The first is aimed at education, with the intent of
pushing lower-cost solutions for less demanding applications. The second is in
health care and is recommended because of the very demanding requirements that
health care poses.
4.2.1 Education Testbed
Middle school multimedia testbed:
- Teachers prepare and deliver multimedia educational presentations.
- Students prepare multimedia project presentations.
- Support provided for interaction with and sharing of resources.
- Student portfolios (report cards, videos) are managed.
- Teacher-teacher, teacher-student and student-student collaborations are
facilitated.
Phase A: Two- to three-year phased implementation, one school,
2,000 students, 100 computers at $5,000 each, deployed in learning centers.
Research partnership should be established with a research university or an
industrial research lab.
Phase B: Four to five years, one district, 10 schools, 20,000 students.
Challenges
- Low cost.
- Privacy.
- Tools.
- How to implement.
- How to measure success.
4.2.2 Health Care Testbed
Multihospital testbed:
- Very large datasets.
- Sharing of data between hospitals.
- Integrated access to distributed patient data, including traditional
record-oriented data and multimedia data types such as images (CT, MRI),
signal data (EEG, ECG) and audio (e.g., voice annotations).
- Multiple (multivendor) sources of data, both real-time and within
heterogeneous database systems with different data models.
- Interactive access to and use of the data, with architectural support to
ensure fast response times.
- Controlled data access using a patient ID/smart card and other security
and encryption methods to prevent unauthorized access and snooping.
- Computer-supported remote consultation in patient care.
Phase A: Two- to three-year phased implementation, three
hospitals.
Phase B: Four to five years, 20 hospitals, richer functionality.
Challenges
- Privacy and security.
- Responsiveness to the user.
- Performance and scalability.
- Data modeling and integration.
- Dealing with legacy systems.