Patent No. 4305131 Dialog between TV movies and human viewers (Best, Dec 8, 1981)
Abstract
A video amusement system by which one or more viewers influence the course of a motion picture as if each viewer were a participant in a real-life drama or dialog. A speech-recognition unit recognizes a few spoken words such as "yes" and "run" spoken by a viewer at branch points in the movie, thus simulating a dialog between the screen actors and the viewer. The apparatus may read an optical videodisc containing independently addressable video frames, blocks of compressed audio, and/or animated cartoon graphics for the multiple story lines which the movie may take. A record retrieval circuit reads blocks of binary-coded control information comprising a branching structure of digital points specifying the frame sequence for each story line. A dispatcher circuit assembles a schedule of cueing commands specifying precisely which video frames, cartoon frames, and portions of audio are to be presented at which instant of time. A cueing circuit executes these commands by generating precisely timed video and audio signals, so that a motion picture with lip-synchronized sound is presented to the viewer. Recordings of the viewers' names may be inserted into the dialog so that the actors speak to each viewer using the viewer's own name. The apparatus can thus provide each viewer with an illusion of individualized and active participation in the motion picture.
Parent Case Text
This is a continuation of U.S. patent application Ser. No. 009,533, filed Feb.
5, 1979, now abandoned.
BACKGROUND OF THE INVENTION
1. Field of the Invention
The apparatus and methods of this invention relate to the following classes:
voice controlled television, electric amusement devices, motion picture and
sound synchronizing, videodisc retrieval, digital generating of animated cartoons,
and branching motion pictures.
2. Description of the Prior Art
Since the beginning of the motion picture industry, movies have generally been
constrained to a predetermined sequence of predetermined scenes. Although a
vicarious sense of involvement is often felt by each viewer, the immutability
of the sequence of scenes limits the viewer's actual participation to a few
primitive options such as cheering, commenting, and selecting what to watch.
This limitation in prior-art movies has not changed substantially with the advent
of television, video games, and audience-response systems.
Although the prior art includes devices capable of providing viewer participation,
such devices do not provide all of the following features in one entertainment
medium:
(1) vivid motion picture imagery;
(2) lip-synchronized sound;
(3) story lines (plots) which branch (have alternative sequences);
(4) elaborately developed story lines as in motion picture drama;
(5) scene changes responsive to inputs from each individual viewer;
(6) seamless transitions between shots;
(7) many hours of non-repetitive entertainment.
Furthermore, no prior-art device can conduct a voice dialog with each viewer
in which the screen actors respond to the viewer's voice in a natural conversational
manner.
Prior-art video game devices enable players to control video images via buttons,
knobs, and control sticks. But in these devices the images are limited to one
stereotyped scene such as a battlefield, an automobile race, a gun fight, or
a surface on which a ball is moved around. Such game devices generate simple
moving figures on a television screen, but the elaborate plot, dialog, characterization,
and most of the cinematic art is absent.
Another problem faced by the present invention is providing many hours of interactive
entertainment without obvious repetition. Prior-art video games can be played
for many hours only because they involve ritualistic cycles in their mechanism
of play. Such cycles lack the variety, suspense, and realism of conventional
movies.
The use of microcomputer-controlled videodiscs for interactive instruction has
been discussed in the literature (for instance see "Special Purpose Applications
of the Optical Videodisc System", by George C. Kenney, IEEE Transactions on
Consumer Electronics, November 1976, pages 327-338). Such computer-assisted
instructional devices present conventional movie portions and still frames with
narration in response to information entered by the student via push-buttons.
But this prior art does not teach how to synchronize multiple alternative motion
picture sequences with multiple alternative audio tracks so that spoken words
from any of the audio tracks are realistically synchronized with the moving
lips of the human actors in the video image. Nor does the prior art teach a
method for automatically inserting spoken names of the players into a prerecorded
spoken dialog so that lip-synchronization (lip-sync) is maintained. Nor does
the prior art teach a method for making a television movie responsive to spoken
words from the viewers/players so that an illusion of personal viewer participation
results.
Prior art systems for recognizing voice inputs and generating voice responses,
such as described in U.S. Pat. No. 4,016,540, do not present a motion picture
and therefore cannot simulate a face-to-face conversation.
Prior art voice controlled systems such as described in U.S. Pat. No. 3,601,530,
provide control of transmitted TV images of live people, but cannot provide
a dialog with pre-recorded images.
Prior-art systems have been used with educational television in which the apparatus
switches between two or more channels or picture quadrants depending on the
student's answers to questions. Such systems cannot provide the rapid response,
precise timing, and smooth transitions which the present invention achieves,
because the multi-channel broadcast proceeds in a rigid sequence regardless
of the student's choices.
The prior art also includes two-way "participatory television" which enables
each subscriber of a cable-TV system to communicate via push-buttons with the
broadcaster's central computer so that statistics may be gathered on the aggregate
responses of the viewers to broadcast questions and performances. Similar systems
use telephone lines to communicate viewers' preferences to the broadcaster's
computer. Although the central computer can record each viewer's response, it
is not possible for the computer to customize the subsequent picture and sound
for every individual viewer. The individual's response is averaged with the
responses from many other subscribers. Although such systems permit each person
to participate, the participation is not "individualized" in the sense used
herein, because the system cannot give each individual a response that is adapted
to him alone.
The prior art for synchronizing audio with motion pictures is largely concerned
with film and video tape editing. Such devices as described in U.S. Pat. No.
3,721,757, are based on the presumption that most of the editing decisions as
to which frames will be synchronized with which portions of the audio have been
made prior to the "final cut" or broadcast. If multiple audio tracks are to
be mixed and synchronized with a motion picture, such editing typically takes
many hours more than the show itself. It is not humanly possible to make the
editing decisions for frame-by-frame finecut editing and precise lip-sync dubbing
during the show. For this reason, prior-art editing and synchronizing apparatus
(whether preprogrammed or not) cannot provide each individual player with an
individualized dialog and story line, and are therefore not suitable for interactive
participatory movies and simulated voice conversations which are automatically
edited and synchronized by the apparatus during the show.
Another problem not addressed in the prior art is the automatic selection of
a portion of audio (from several alternative portions) which may be automatically
inserted into predetermined points in the audio signal by the apparatus during
the show. One example is the insertion of the names of the players, selected from
a catalog of thousands of common names, into a dialog so that the actors not
only respond to the players but also call them by name. Recording a separate audio
track for each of the thousands of names would require an impractically large
amount of disc space. But using a catalog of names requires that each name be
inserted in several points in the dialog, whenever an actor speaks the name
of the then current player. The task of synchronizing audio insertion so that
the dialog flows smoothly without gaps or broken rhythm at the splice is one
heretofore performed by skilled editors who know in advance of the editing procedure
which frames and audio tracks are to be assembled and mixed. In the present
apparatus this finecut editing cannot be done until after the show has started,
because no human editor can know in advance the name of each player and the
sequence of the dialog which will change from performance to performance. The
present invention solves these editing and synchronizing problems.
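As a rough illustration of the insertion problem described above, the following
Python sketch splices one name recording from a shared catalog into a line of
dialog at marked slots and starts each later segment exactly where the previous
one ends, so that no gap remains at the splice. The names Segment, NAME_SLOT,
splice_names, the catalog contents, and the millisecond durations are all
assumptions made for illustration; none of them come from the patent text, and
a real apparatus would also have to reschedule the video frames to keep lip-sync.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

NAME_SLOT = "<NAME>"  # hypothetical marker for a point where a name is spoken


@dataclass(frozen=True)
class Segment:
    clip: str          # identifier of a prerecorded audio portion, or NAME_SLOT
    duration_ms: int   # playing time; ignored when clip is NAME_SLOT


def splice_names(dialog: List[Segment], name_clip: Segment) -> List[Tuple[int, str]]:
    """Return (start_ms, clip) pairs with the name clip filling every slot."""
    schedule = []
    t = 0
    for seg in dialog:
        actual = name_clip if seg.clip == NAME_SLOT else seg
        schedule.append((t, actual.clip))
        t += actual.duration_ms  # next segment starts exactly where this one ends
    return schedule


# Hypothetical catalog: one recording per name, reused at every slot.
catalog: Dict[str, Segment] = {
    "ann": Segment("name_ann", 300),
    "christopher": Segment("name_christopher", 700),
}

dialog = [
    Segment("hello_there", 500),
    Segment(NAME_SLOT, 0),
    Segment("follow_me", 800),
    Segment(NAME_SLOT, 0),
]

for player in ("ann", "christopher"):
    print(player, splice_names(dialog, catalog[player]))
```

Because the start times are computed only after the player's name is known, a
longer name simply pushes the following segments later, which is the kind of
decision that cannot be made before the show begins.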
While watching a prior art branching movie as described in U.S. Pat. No. 3,960,380,
a viewer cannot talk with the screen actors and have them reply responsively.
Applying prior art speech-recognition techniques to control such branching movies
would not provide a realistic conversational dialog because of the following
problem: If the number of words which a viewer of any age and sex can speak
and be understood by the apparatus is sufficiently large to permit a realistic
conversation, then prior art speech-recognition techniques are unreliable. But,
if the vocabulary is restricted to only a few words to make speech recognition
reliable, then a realistic conversation would not result. This problem is resolved
in the present invention.
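The passage above does not say how the problem is resolved. One reading,
suggested by the abstract's mention of a few spoken words at branch points, is
that each branch point accepts only its own small vocabulary, so that the movie
as a whole can respond to many words while each recognition decision stays
small. The Python sketch below illustrates that reading only. BranchPoint,
next_scene, the scene names, and the assumption that a recognizer has already
turned the viewer's speech into a word string are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class BranchPoint:
    prompt: str
    # Small per-branch vocabulary: recognized word -> next scene (hypothetical).
    choices: Dict[str, str]


def next_scene(point: BranchPoint, heard: str) -> Optional[str]:
    """Match the recognizer's output against this branch point's words only."""
    return point.choices.get(heard.strip().lower())


story = {
    "cave": BranchPoint("Shall we go in?",
                        {"yes": "tunnel", "no": "forest", "run": "forest"}),
    "tunnel": BranchPoint("Fight or hide?",
                          {"fight": "duel", "hide": "alcove"}),
}

print(next_scene(story["cave"], "Yes"))    # a word that is valid at this point
print(next_scene(story["cave"], "fight"))  # valid elsewhere, not accepted here
```

In this sketch a word that is meaningful at one branch point is simply not a
candidate at another, which keeps each comparison limited to a handful of words.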
SUMMARY OF THE INVENTION
This invention provides a form of entertainment heretofore not provided by any
prior-art system. With this invention one or more people can participate in
a motion picture by steering it in a direction of their own choosing and with
the consequences of their participation explicitly performed by motion picture
images and voices of actors or cartoon characters. Users of the system can carry
on simulated conversations with the screen actors who may address each player
by the player's own name. The invention enables television viewers to participate
in simulated conversations with famous people, and choose the direction the
conversation takes as it progresses. The invention eliminates the need for the
ritualistic cycles characteristic of prior-art games, by permitting each show
to be significantly different from any recent show. This is accomplished by
a special-purpose microcomputer which may automatically schedule and control
presentation of video frames, and/or digitally-generated animated cartoons,
and digitized audio which is automatically lip-synced with the motion picture.
Some embodiments of the invention include voice-recognition circuitry so that
the course of the movie can be influenced by words or other sounds spoken by
each viewer to produce an illusion of individualized participation.
Some embodiments include processing of branching schedules of control commands
which specify precise sequences and timing of video, audio, and graphics to
provide a lip-synchronized movie having a seamless flow through alternative
story lines.
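A minimal Python sketch of such a branching schedule follows. It walks one story
line through a set of scenes and emits commands stamped with absolute frame
numbers, so that each scene begins on the frame immediately after the previous
one ends. The names Cue, Scene, and assemble_schedule, the frame ranges, and the
audio block labels are assumptions for illustration and are not taken from the
patent; the sketch also ignores hardware timing entirely.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Cue:
    offset: int   # frame count from the start of the scene
    kind: str     # "video", "audio", or "graphics"
    ref: str      # hypothetical identifier of a frame range or audio block


@dataclass
class Scene:
    length: int                                             # length in frames
    cues: List[Cue]
    branches: Dict[str, str] = field(default_factory=dict)  # choice -> scene


def assemble_schedule(scenes: Dict[str, Scene], start: str,
                      choices: List[str]) -> List[Tuple[int, str, str]]:
    """Walk one story line and emit (absolute_frame, kind, ref) commands."""
    schedule = []
    clock = 0
    name = start
    pending = iter(choices)
    while True:
        scene = scenes[name]
        for cue in sorted(scene.cues, key=lambda c: c.offset):
            schedule.append((clock + cue.offset, cue.kind, cue.ref))
        clock += scene.length  # the next scene begins on the very next frame
        choice = next(pending, None)
        if choice not in scene.branches:  # story line ends or no further input
            return schedule
        name = scene.branches[choice]


scenes = {
    "intro": Scene(90, [Cue(0, "video", "frames 1000-1089"),
                        Cue(12, "audio", "block A1")],
                   {"yes": "chase", "no": "talk"}),
    "chase": Scene(60, [Cue(0, "video", "frames 2000-2059"),
                        Cue(0, "audio", "block B1")]),
    "talk": Scene(120, [Cue(0, "video", "frames 3000-3119"),
                        Cue(6, "audio", "block C1")]),
}

for command in assemble_schedule(scenes, "intro", ["yes"]):
    print(command)
```

Keeping cue offsets relative to each scene means the same scene data can be
reused on any story line; only the running frame clock differs between paths.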
Some embodiments include synchronizing multiple video frames and/or animated
cartoon frames with alternative audio portions during the show, such as inserted
names of the players/viewers, while preserving lip-sync and seamless flow.
This invention comprises various apparatus and methods for performing the functions
or combination of functions which may provide individualized participation in
a motion picture and simulated conversations with people. Some of these functions
may in some embodiments be performed by microprocessors executing programs which
may be fixed as firmware incorporated into the same semiconductor chips as the
conventional processing circuits. These programmed microprocessors are in essence
special-purpose circuits. Microprocessors executing separately-stored programs
may also be used.
The claims appended hereto should be consulted for a complete definition of
the invention which is summarized in part in the present summary.