Declarative Camera Control for Automatic Cinematography

To appear, AAAI '96

David B. Christianson   Sean E. Anderson   Li-wei He
David H. Salesin   Daniel S. Weld   Michael F. Cohen

Department of Computer Science and Engineering, University of Washington, Seattle, Washington 98195
Microsoft Research, One Microsoft Way, Redmond, WA 98052
Abstract

Animations generated by interactive 3D computer graphics applications are typically portrayed either from a particular character's point of view or from a small set of strategically-placed viewpoints. By ignoring camera placement, such applications fail to realize important storytelling capabilities that have been explored by cinematographers for many years.

In this paper, we describe several of the principles of cinematography and show how they can be formalized into a declarative language, called the Declarative Camera Control Language (dccl). We describe the application of dccl within the context of a simple interactive video game and argue that dccl represents cinematic knowledge at the same level of abstraction as expert directors by encoding 16 idioms from a film textbook. These idioms produce compelling animations, as demonstrated on the accompanying videotape.

Introduction

The language of film is a complex one, which has evolved gradually through the efforts of talented filmmakers since the beginning of the century. As a result, the rules of film are now so common that they are nearly always taken for granted by audiences; nonetheless, they are every bit as essential as they are invisible. Most interactive 3D computer graphics applications (e.g., virtual chat managers, interactive fiction environments, and videogames) do not exploit established cinematographic techniques. In particular, most computer animations are portrayed either from a particular character's point of view or from a small set of strategically-placed viewpoints. By restricting camera placement, such applications fail to realize the expository capabilities developed by cinematographers over many decades. Unfortunately, while there are several textbooks that contain informal descriptions of numerous rules for filming various types of scenes (Arijon 1976; Lukas 1985; Mascelli 1965), it is difficult to encode this textbook knowledge in a manner that is precise enough for a computer program to manipulate.

In this paper, we describe several of the principles of filmmaking, show how they can be formalized into a declarative language, and then apply this language to the problem of camera control in an interactive video game. Specifically, we describe the Declarative Camera Control Language (dccl) and demonstrate that it is sufficient for encoding many of the heuristics found in a film textbook. We also present a Camera Planning System (cps), which accepts animation traces as input and returns complete camera specifications. The cps contains a domain-independent compiler that solves dccl constraints and calculates the dynamical control of the camera, as well as a domain-independent heuristic evaluator that ranks the quality of the candidate shot specifications that the compiler produces. We demonstrate a simple interactive video game that creates simulated animations in response to user input and then feeds these animations to cps in order to produce complete camera specifications, as shown on the accompanying videotape.

Our prototype video game serves as a testbed for applications of dccl and cps. However, there are a number of alternative applications to which both dccl and cps might be applied. Within the realm of video games, Multi-user Dungeons (MUDs), and interactive fiction, automated cinematography would allow an application to convey the subjective impression of a particular character without resorting to point-of-view shots.[1] Because many MUDs operate over long periods of time, an automated cinematography system could provide users with customized summaries of events they had missed while they were away. Alternatively, automated cinematography could be used to create natural interactions with the "intelligent agents" that are likely to take part in the next generation of user interfaces. Automated cinematography could also be used to assist naive users in the creation of desktop videos, or for building animated presentations. In the latter case, Karp and Feiner have shown (Karp & Feiner 1990; 1993) that animated presentations can be effectively designed on a computer, reducing costly human involvement and allowing presentations to be customized for a particular viewer or situation.

[1] Most current games, of which Doom is the classic example, still provide each participant with a single point-of-view shot; however, a number of games such as Alone in the Dark, Fade 2 Black, and Virtua Fighter have begun to employ a wider variety of perspectives.
Principles of Cinematography

Although a film can be considered to be nothing but a linear sequence of frames, it is often helpful to think of a film as having structure. At the highest level, a film is a sequence of scenes, each of which captures some specific situation or action. Each scene in the film is composed of one or more shots. A single shot covers the small portion of a movie between when a camera is turned on and when it is turned off. Typically, a film is comprised of a large number of individual shots, with each shot's duration lasting from a second or two in length to perhaps tens of seconds.[2]

[2] A notable exception is Alfred Hitchcock's Rope, which was filmed in a single shot, albeit with disguised breaks.

Camera Placement

Directors specify camera placements relative to the line of interest, an imaginary vector connecting two interacting actors, directed along the line of an actor's motion, or oriented in the direction the actor is facing. Figure 1 shows the line formed by two actors facing each other.

[Figure 1: Camera placement is specified relative to the "line of interest" between actors X and Y; positions a through g mark parallel, internal, external, and apex placements. (Adapted from figure 4.11 of (Arijon 1976))]

Shooting actor X from camera position b is called a parallel camera placement. Filming X from position c yields an internal reverse placement. Shooting from position d results in an apex shot that shows both actors. Finally, filming from g is called an external reverse placement.

Cinematographers have identified that certain "cutting heights" make for pleasing compositions while others yield ugly results (e.g., an image of a man cut off at the ankles). There is a set of (roughly) five useful camera distances (Arijon 1976, p. 18). An extreme closeup cuts at the neck; a closeup cuts under the chest or at the waist; a medium view cuts at the crotch or under the knees; a full view shows the entire person; and a long view provides a distant perspective.

Heuristics and Constraints

Filmmakers have articulated numerous heuristics for selecting good shots and have informally specified constraints to be placed on successive shots to lead to good scenes. Several of the more important rules include (Arijon 1976):

Parallel editing: Story lines (visualized as scenes) should alternate between different characters, locations, or times.

Only show peak moments of the story: Repetitive moments from a narrative should be deleted.

Don't cross the line: Once an initial shot is taken from the left or right side of the line, subsequent shots should maintain that side, unless a neutral, establishing shot is used to show the transition from one side to the other. This rule ensures that successive shots of a moving actor will maintain the direction of apparent motion.

Let the actor lead: The actor should initiate all movement, with the camera following; conversely, the camera should come to rest a little before the actor.

Break movement: A scene illustrating motion should be broken into at least two shots. Typically, each shot is cut so that the actor appears to move across half the screen area. A change of the camera-to-subject distance should also be made in the switch.

Idioms

Perhaps the most significant invention of cinematographers is the notion of an idiom: a stereotypical way to capture some specific action as a series of shots. For example, in a dialogue between two people, a filmmaker might begin with an apex view of both actors, and then alternate views of each, at times using internal reverse placements and at times using external reverse. While there is an infinite variety of idioms, film directors have learned to rely on a small subset of these. Indeed, film books (e.g., (Arijon 1976)) are primarily a compilation of idioms along with a discussion of the situations when a filmmaker should prefer one idiom over another. Figure 2 presents a three-shot idiom that serves as an extended example throughout the remainder of this paper. The idiom, adapted from Figure 13.2 of Arijon's text (1976), provides a method for depicting short-range motion of one actor approaching another. The first shot is a closeup; actor X begins in the center of the screen and exits left. The second shot begins with a long view of actor Y; actor X enters from off-screen right, and the shot ends when X reaches the center. The final shot begins with a medium view of Y, with actor X entering from off-screen right and stopping at center.

DCCL

This section provides an informal description of the Declarative Camera Control Language dccl. The specification of dccl is important because it allows cps to formalize, encode, and implement common film idioms, such as the one presented in Figure 2.
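The "don't cross the line" rule above reduces to a geometric test: take the line of interest as the vector from one actor to the other, and compare the side on which successive cameras fall. The following sketch is our own illustration (not part of dccl or cps), using 2D ground-plane coordinates:

```python
# Determine which side of the line of interest a camera lies on.
# The line runs from actor_x to actor_y; the sign of the 2D cross
# product of the line vector and the camera offset picks the side.
# Illustrative only: these helpers are not part of dccl or cps.

def line_side(actor_x, actor_y, camera):
    lx, ly = actor_y[0] - actor_x[0], actor_y[1] - actor_x[1]
    cx, cy = camera[0] - actor_x[0], camera[1] - actor_x[1]
    cross = lx * cy - ly * cx
    if cross > 0:
        return "left"
    return "right" if cross < 0 else "on-line"

def cut_crosses_line(actor_x, actor_y, cam_a, cam_b):
    """True when a cut from cam_a to cam_b violates the rule."""
    sides = {line_side(actor_x, actor_y, cam_a),
             line_side(actor_x, actor_y, cam_b)}
    return "on-line" not in sides and len(sides) == 2
```

A neutral (on-line) establishing shot legitimizes a side change, which is why the test treats on-line cameras as never crossing.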
(AcFilmIdiom name Arijon-13-2
  :parameter (AcParamApproach :actor1 :actor2 :start :stop)
  :line (AcLineIdiom :primary ?actor1 :other ?actor2 :side left)
  (AcFilmShot name shot1
    (AcFragGoBy name frag1
      :time ?start :primary-moment beginning :entry-pos center :exit-pos out-left
      :placement (AcPlaceInternal :primary ?actor1 :other ?actor2 :range closeup :primary-side center)))
  (AcFilmShot name shot2
    (AcFragGoBy name frag2
      :time ?frag3.first-tick :primary-moment end :entry-pos on-right :exit-pos center
      :placement (AcPlaceExternal :near ?actor1 :far ?actor2
                                  :primary-subject near :range longshot :primary-side center)))
  (AcFilmShot name shot3
    (AcFragGoBy name frag3
      :time ?stop :primary-moment end :entry-pos out-right :exit-pos right12
      :placement (AcPlaceApex :primary ?actor1 :other ?actor2 :range mediumshot :primary-side right12))))

Figure 3: dccl code corresponding to the idiom depicted in Figure 2.
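Read as data, the Figure 3 listing is an idiom holding three shots, each holding one placed go-by fragment. A host-language mirror of that structure might look as follows; the class and field names are ours (loosely following the dccl keywords) and are not the system's actual C++ classes:

```python
# A minimal host-language mirror of the idiom/shot/fragment nesting
# shown in Figure 3. Field names loosely follow the dccl keywords;
# they are illustrative, not the system's actual C++ classes.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Placement:
    kind: str          # internal | external | parallel | apex
    primary: str
    other: str
    range: str         # extreme | closeup | medium | full | long

@dataclass
class GoByFragment:
    name: str
    entry_pos: str     # a movement-endpoint keyword, e.g. "center"
    exit_pos: str      # e.g. "out-left"
    placement: Placement

@dataclass
class FilmShot:
    name: str
    fragments: List[GoByFragment] = field(default_factory=list)

@dataclass
class FilmIdiom:
    name: str
    shots: List[FilmShot] = field(default_factory=list)

# The first shot of the Arijon-13-2 idiom: a closeup go-by in which
# actor1 starts at screen center and exits left.
arijon_13_2 = FilmIdiom("Arijon-13-2", shots=[
    FilmShot("shot1", [GoByFragment(
        "frag1", "center", "out-left",
        Placement("internal", "actor1", "actor2", "closeup"))]),
])
```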
[Figure 2: A common idiom for depicting short-range movement of one actor approaching another, in three numbered shots. Camera positions are shown on the left side of the figure; the resulting image is shown on the right. Arrows indicate motion of actors into and out of the screen. (Adapted from figure 13.2 of (Arijon 1976))]

There are four basic primitive concepts in dccl: fragments, views, placements, and movement endpoints; these primitives are combined to specify higher-level constructs such as shots and idioms.

Fragments

In the previous section, we discussed how cinematographers treat a shot as the primitive building block in a film. For our automated system, we have found it useful to further decompose each shot into a collection of one or more fragments. A fragment specifies an interval of time during which the camera is in a static position and orientation or is performing a single simple motion. dccl defines five fragment types (illustrated schematically in Figure 4).

[Figure 4: dccl fragments specify the type of camera motion. Static: the actor stands still. Go-by: the actor moves across the screen. Panning: the actor stays near the center of the screen as the camera turns. Tracking: the actor stays near the center of the screen as the camera moves parallel to the line of action. Point of View (POV): actor and camera move together.]

A fully-specified fragment requires additional information. Some of these arguments are obvious; for example, to track the motion of an actor, one must specify which actor to track and over what time interval to roll the film. In addition, one must also specify the desired camera range (extreme, closeup, medium, full, or long), as well as the placement of the camera relative to the actor (or actors): internal, external, parallel, and apex.

Three of the fragments, go-by, panning, and tracking, require an additional argument, called the primary moment, which specifies the moment at which the placement command is to take effect during a shot in which there is motion.

Finally, two of these fragments, go-by and panning, require another argument called a movement endpoint, which is used to indicate the range of motion to be covered by the actor relative to the screen.[3] As Figure 5 illustrates, dccl recognizes seven movement-endpoint keywords.

[Figure 5: dccl allows the user to delimit the temporal duration of go-by and panning fragments by specifying the desired initial and terminal locations of the actor on the screen: out-left, on-left, in-left, center, in-right, on-right, and out-right.]

Note that although the movement-endpoint keywords refer to locations on the screen, they are used to calculate the temporal duration of go-by and panning fragments.[4] For example, the first shot of the idiom of Figure 2 can be defined as a go-by moving from center to out-left.

[3] To understand why this argument is necessary, recall that a go-by fragment results in a static camera position directed at an actor who moves across the field of view. An example of a go-by fragment is the first shot of Figure 2. Note that Arijon (1976) expresses the shot not by specifying the temporal duration, but rather by indicating (with arrows) the range of motion that the actor should cover while the film is rolling. dccl uses movement endpoints to allow the same type of declarative specification and relies upon the compiler to calculate the temporal bounds that will yield the proper range of motion.

[4] This explains the definition of out-right and out-left. Arijon (1976) specifies that shots in which an actor moves off-screen (or onto the screen) should be cut three frames after (or before) the actor disappears (or appears). As a result we define out-right and out-left in terms of the distance traveled while three frames transpire.

Shots and Idioms

In many cases, a shot is composed of a single fragment, but sometimes it is effective to merge several fragments together to form a more complex shot. For example, one can start with a panning fragment in which a running actor moves from out-left into the center of the screen, then shift to a tracking shot by terminating camera rotation and increasing its lateral motion to match that of the actor (Figure 6). Multi-fragment shots typically combine panning, tracking, and go-by fragments in different orders.

[Figure 6: Schematic illustration of a shot composed of two fragments: a panning fragment that melds imperceptibly into a tracking fragment.]

In multifragment shots, it is often important to be able to synchronize (in "simulation time") the end of one fragment with the beginning of the next. For this reason, dccl supports the ability to export computed variables, such as the starting or ending time of a fragment, for use by other fragments in the same idiom. The duration of a scene can decrease if fragments do not cover the entire scene, producing "time contraction."

To define an idiom, one must specify the activities for which the idiom is deemed appropriate, a name, arguments (e.g., actors and times), and a list of shot descriptions. For example, Figure 3 shows the actual dccl encoding of the idiom illustrated in Figure 2; this idiom is a good choice for showing one actor approaching another.

The Camera Planning System

The Camera Planning System (cps) codifies and implements the previously described cinematographic principles of camera placement for the case of simple movement sequences on the part of one or two actors. As input, cps requires an animation trace: information about the positions and activities of each character, as well as directives stating which actors are to be filmed over what intervals of time. In our interactive game, this information is produced by a simple computer simulation generated in response to a user command. Given this trace information, the cps chooses which subintervals of time to capture with the camera and from which vantage points and with which lenses (i.e., what field of view). The animation can then be played back to the user using the intervals and camera placements selected by the cps.

The primary data structure used by cps is the film tree (Figure 7), which represents the film being generated. Of primary consequence are the scene and candidate idiom levels of the tree: each scene in the film is associated with one or more possible idioms, with each idiom representing a particular method of filming the parent scene. The cps operates by expanding the film tree until it has compiled detailed solutions for each idiom, and then selecting appropriate candidate idioms and frames for each scene.

Internally, the cps is implemented as a three-stage pipeline involving a sequence planner, dccl compiler, and a heuristic evaluator, as shown in Figure 8. An idiom database (not shown) provides idiom specifications relevant to each scene in the animation being filmed.

The Sequence Planner

The current implementation of the cps sequence planner is quite simple. Unlike the other portions of the cps
[Figure 8: The cps is implemented as a three-stage pipeline.]

[Figure 7: Successive modules in the cps pipeline incrementally expand the film tree data structure (Film, Sequences, Scenes, Candidate Idioms, Candidate Frames).]

Table 1: Activity classifications for prototype game.

Activity             Solitary   Idioms
Stopping/Starting    Y          1
X-Approaches-Y       N          2
X-Retreats-From-Y    N          2
X-Follows-Y          N          2
Moving               Y          2
Turning              Y          1
HeadTurning          Y          1
Stationary           Y          1
Looking              Y          1
Noticing             N          1
Picking Up           N          1
Holding              N          1

the code implementing the sequence planner is specific to the domain (plot) of the application (e.g., chase and capture interactions). As input, the planner receives an animation trace describing the time-varying behavior of a group of actors. As output, the sequence planner produces a film tree (Figure 7) that is specified down to the scene level.

The animation trace specifies position, velocity, and joint positions for each actor for each frame of the animation. The trace also labels the activity being performed by each actor in each frame, as well as higher-level information encoded as a set of film sequences, with each film sequence including an interval of time, an actor to use as the protagonist, and (optionally) a second actor. In the current application, multiple film sequences are used to create parallel editing effects by having the cps intermix scenes featuring one set of actors with scenes featuring a different set of actors (see accompanying videotape).

Given the information in the animation trace, the sequence planner generates scenes by first partitioning each film sequence according to the activities performed by the protagonist during the given sequence. For the current application we have identified ten activity types (Table 1). After partitioning a sequence, the sequence planner generates scenes parameterized by the activity, actors, and time interval of each partition.

Once the sequence planner has created the scene nodes, the cps must instantiate the idiom schemata relevant for each scene. Relevance is determined by matching the scene activity against a list of applicable activities defined for each idiom. The current implementation of the database contains 16 idioms (Table 1). The planner instantiates idioms by substituting actual parameters (actor names, and scene start and ending times) for the placeholders specified in the idiom definitions. References to actor placements on the right or left sides will be automatically mirrored later, during the idiom solving process.

The dccl Compiler

The dccl compiler uses information about the movement of the actors to expand the fragments in each candidate idiom into an array of frame specifications, which can be played directly. Since a frame is fully constrained by the combination of its camera's 3D position, orientation, and field of view, the compiler need only generate an array of these values for each fragment in each shot in each candidate idiom.

In its simplest form, an idiom consists of a single shot that is composed of a single fragment, so we cover that case first. If the fragment has type pov, then compilation is trivial, so we assume that the fragment has type static, tracking, go-by, or panning. We decompose the compiler's job into four tasks:

1. Determine the appropriate primary moment(s).

2. Determine the set of possible frame specifications for each primary moment.

3. Calculate the temporal duration (length of the frame array) of the fragment given an initial frame specification.

4. Generate the interpolated specification of frame n from that of frame n-1.

Once these tasks have been completed, the compiler simply has to generate a frame array for each primary moment and frame specification. In the current version of cps there are typically only two frame arrays, corresponding to placing the camera on one side of the line of interest or the other. The task of choosing the
appropriate side is left up to the heuristic evaluator.[5]

[5] Typically, the camera is restricted to one side of the line of interest. However, opportunities to "switch sides" sometimes present themselves, such as when an actor turns to walk in a neutral direction.

The primary moment of a fragment defines the point in time at which the camera placement should conform to the placement specified for the fragment, and varies with the type of fragment. The go-by, tracking, and panning fragments specify the primary moment as either the first or last tick of the fragment. The static fragments, on the other hand, do not specify a primary moment, so in the current version of cps we solve the placement for the first, last, and midpoint ticks of the fragment's time interval; the heuristic evaluator will later determine which solution looks best and prune the rest.

The fragment's placement (e.g., internal, external, parallel, or apex), as specified in the idiom, combined with the location of the actors (from the animation trace), constrains the camera's initial position to lie on one of two vectors, according to the side of the line being used (Figure 1). The actual location on this vector is determined by the desired distance between the camera and the primary subject. This distance, in turn, is specified by the fragment's range (extreme, closeup, medium, full, or long) and the lens focal length. The compiler attempts to generate a set of appropriate placements using a normal lens (e.g., a 60-degree field of view). The vector algebra behind these calculations is explained in (Blinn 1988).

The temporal duration of static, tracking, and pov fragments is specified explicitly as part of the dccl specification. However, the duration of go-by and panning fragments must be computed from the movement-endpoint specification in conjunction with the actor's velocity.

The function used to update the camera position and orientation from one frame to the next depends on the type of fragment involved and the change in the actors' positions. Static and go-by fragments do not change camera position or orientation. The camera in tracking fragments maintains its orientation, but changes its position based on the actor's velocity vector. The camera in panning fragments maintains its position, but changes its orientation with angular velocity constrained by actor velocity and the distance at the closest approach to the camera (as determined by the primary moment).

Note that unlike the sequence planner, the dccl compiler is completely domain-independent in that the compiler depends only on geometry and not on the plot or subject of the animation. Furthermore, the dccl specifications in the idiom database are applicable across various animations; for example, the idioms in our database should apply to any animation with two-character interactions.

The Heuristic Evaluator

Since the film tree is an and/or graph, it represents many possible films.[6] The heuristic evaluator chooses the candidate idiom that renders a scene best, assigning each idiom a real-valued score by first scoring and pruning the frame arrays of its fragments. Note that it is not possible to estimate the quality of a fragment or idiom before it is compiled, because visual quality is determined by the complex, geometric interactions between the fragment type and the motions of the actors. A given idiom might work extremely well when the actors stand in one position relative to one another, yet work poorly given a different configuration.

[6] Indeed, if there are n scenes and each scene has k candidate idioms as children, then the film tree represents k^n possible idiom combinations.

The scoring mechanism is primarily focused towards evaluating inter-fragment criteria, namely:

- maintaining smooth transitions between camera placements;
- eliminating fragments which cause the camera to cross the line.

In addition, the scoring mechanism deals with certain intra-fragment behaviors that sometimes arise from the compilation phase of the cps, such as:

- penalizing very short or very long fragments;
- eliminating fragments in which the camera pans backwards.

After the evaluator has selected the best idiom for each scene to be included in the film, the camera planning process is complete. cps concatenates the frame arrays for all idiom nodes remaining in the film tree and outputs the corresponding sequence of frames to the player for rendering.

Note that while the evaluator's rules are heuristic, they are also domain-independent within the domain of film and animation: each rule encodes broadly-applicable knowledge about which types of shots look good on film, and which do not.

Sample Application

We are particularly interested in interactive uses of automatic cinematography. Therefore, we decided to build a simple interactive game that would use cps to film actions commanded by a human player. The basic plot of the game is very simple. The main character, Bob, must walk around a virtual world (in our case, SGI's Performer Town) looking for the Holy Grail. The game is made more interesting by the introduction of Fido, an evil dog that seeks to steal the Grail. From time to time, the game generates animations of Fido's activities and instructs cps to edit these animations into the action. Eventually, Bob and Fido interact: if Fido gets the Grail, Bob has to chase Fido in order to get it back. The user commands Bob through a pop-up menu. These commands fall into four basic categories:
telling Bob to look in a particular direction, to move to a particular point on the screen, to pick up an object, or to chase another actor.

[Figure 9: Overall context of cps (user commands, game engine, simulator, camera planning system, player, movies).]

The implementation of the game and its various animation/simulation modules was done in C++ using the IRIS Performer toolkit running on SGI workstations, and based partially on the perfly application provided by SGI (a Performer-based walkthrough application). The game operates as a finite-state machine that produces animation traces as the user issues commands to the game engine, with the cps acting as a separate library whose sole inputs are the animation trace and the database of idioms (Figure 9). The game itself (not counting cps or the code present in the existing perfly application) required approximately 10,000 lines of C++ code. The cps system is also written in C++ (despite the Lisp-like appearance of dccl) and implemented with about 19,000 lines of code.

The sample game interaction presented at the end of our video is intended to demonstrate a number of the activities possible in the game, as well as the various dccl idioms. For presentation purposes, the planning time required by cps was edited out of the video; Table 2 gives performance data taken from a similar run-through of the game on an SGI Onyx.

Table 2: Typical cps performance.

Command              Num. Frames   Scenes Generated   CPU Time (s)
Pick Up Grail        733           9                  47.63
Pick Up Net          451           4                  44.74
Catch Dog            603           4                  32.74
Walk (long range)    701           4                  11.52
Walk (med range)     208           4                  6.74
Look Right           87            3                  5.60
Walk (short range)   158           4                  5.12

Related Work

The subject of using principles from cinematography to control camera positions and scene structure has received relatively little attention in the computer graphics or AI communities. We survey most of the related work here.

He, Cohen, and Salesin (1996) have developed a system for controlling camera placement in real-time using some of the ideas behind dccl. Their work focuses on filming dialogues between multiple animated characters, and uses a finite state machine to select and generate camera positions.

A number of systems have been described for automatically placing the camera in an advantageous position when performing a given interactive task (Gleicher & Witkin 1992; Mackinlay, Card, & Robertson 1990; Phillips, Badler, & Granieri 1992). However, these systems neither attempt to create sequences of scenes, nor do they apply rules of cinematography in developing their specifications.

In work that is closer to our own, Karp and Feiner (Karp & Feiner 1990; 1993) describe an animation planning system for generating automatic presentations. Their emphasis is on the planning engine itself, whereas the work described in this paper is more concerned with the problem of defining a high-level declarative language for encoding cinematic expertise. Thus, the two approaches complement each other.

Strassman (Strassman 1994) reports on Divaldo, an ambitious experiment to create a prototype system for "desktop theatre." Unlike our focus on camera placement, Strassman attempts to create semi-autonomous actors who respond to natural language commands. cps is also complementary to Divaldo.

Drucker et al. (Drucker, Galyean, & Zeltzer 1992; Drucker & Zeltzer 1994; 1995) are concerned with the problem of setting up the optimal camera position for individual shots, subject to constraints. Specific camera parameters are automatically tuned for a given shot based on a general-purpose continuous optimization paradigm. In our work, a set of possible cameras is fully specified by the shot descriptions in dccl and the geometry of the scene. The final selection from among this set of different shots is made according to how well each shot covers the scene. Our approach avoids the need for generic optimization searches, and it is guaranteed to result in a common shot form. The cost is a greatly reduced set of possible camera specifications.

Several useful texts derive low-level camera parameters, given the geometry of the scene (Foley et al. 1990; Hearn & Baker 1994; Blinn 1988).

Conclusion

We close by summarizing the contributions of this paper and describing the directions we intend to pursue in future work. The main contributions of this paper include:

- Surveying established principles from filmmaking that can be used in a variety of computer graphics applications.

- Describing a high-level language, dccl, for specifying camera shots in terms of the desired positions and movements of actors across the screen. We have argued that dccl represents cinematic knowledge at the same abstraction level as expert directors and producers by encoding sixteen idioms from a film textbook (Arijon 1976) (e.g., Figure 2).

- Presenting a domain-independent compiler that solves dccl constraints and dynamically controls the camera.

- Describing a domain-independent heuristic evaluator that ranks the quality of a shot specification using detailed geometric information and knowledge of desirable focal lengths, shot durations, etc.

- Describing a fully-implemented film camera planning system (cps) that uses the dccl compiler and heuristic evaluator to synthesize short animated scenes from 3D data produced by an independent, interactive application.

- Incorporating cps into a prototype game, and demonstrating sample interactions in the game (see videotape).

Acknowledgements

This research was funded in part by Office of Naval Research grants N00014-94-1-0060 and N00014-95-1-0728, National Science Foundation grants IRI-9303461 and CCR-9553199, ARPA/Rome Labs grant F30602-95-1-0024, an Alfred P. Sloan Research Fellowship (BR-3495), and industrial gifts from Interval, Microsoft, Rockwell, and Xerox.

References

Arijon, D. 1976. Grammar of the Film Language. New York: Communication Arts Books, Hastings House, Publishers.

Blinn, J. 1988. Where am I? What am I looking at? IEEE Computer Graphics and Applications, 76-81.

Christianson, D. B., Anderson, S. E., He, L., Weld, D. S., Salesin, D. H., and Cohen, M. F. 1996. Declarative camera control for automatic cinematography. TR UW-CSE-96-02-01, University of Washington Department of Computer Science and Engineering.

Drucker, S. M., and Zeltzer, D. 1994. Intelligent camera control in a virtual environment. In Proceedings of Graphics Interface '94, 190-199. Banff, Alberta, Canada: Canadian Information Processing Society.

Drucker, S. M., and Zeltzer, D. 1995. Camdroid: A system for intelligent camera control. In Proceedings of the SIGGRAPH Symposium on Interactive 3D Graphics '95.

Drucker, S. M., Galyean, T. A., and Zeltzer, D. 1992. CINEMA: A system for procedural camera movements. In Zeltzer, D., ed., Computer Graphics (1992 Symposium on Interactive 3D Graphics), volume 25, 67-70.

Foley, J. D., van Dam, A., Feiner, S. K., and Hughes, J. F. 1990. Computer Graphics, Principles and Practice. Reading, Massachusetts: Addison-Wesley Publishing Company, second edition.

Gleicher, M., and Witkin, A. 1992. Through-the-lens camera control. In Catmull, E. E., ed., Computer Graphics (SIGGRAPH '92 Proceedings), volume 26, 331-340.

He, L., Cohen, M. F., and Salesin, D. H. 1996. Virtual cinematography: A paradigm for automatic real-time camera control and directing. To appear at SIGGRAPH '96.

Hearn, D., and Baker, M. P. 1994. Computer Graphics. Englewood Cliffs, New Jersey: Prentice Hall, second edition.

Karp, P., and Feiner, S. 1990. Issues in the automated generation of animated presentations. In Proceedings of Graphics Interface '90, 39-48.

Karp, P., and Feiner, S. 1993. Automated presentation planning of animation using task decomposition with heuristic reasoning. In Proceedings of Graphics Interface '93, 118-127. Toronto, Ontario, Canada: Canadian Information Processing Society.

Lukas, C. 1985. Directing for Film and Television. Garden City, N.Y.: Anchor Press/Doubleday.

Mackinlay, J. D., Card, S. K., and Robertson, G. G. 1990. Rapid controlled movement through a virtual 3D workspace. In Baskett, F., ed., Computer Graphics (SIGGRAPH '90 Proceedings), volume 24, 171-176.

Mascelli, J. V. 1965. The Five C's of Cinematography. Hollywood: Cine/Grafic Publications.

Phillips, C. B., Badler, N. I., and Granieri, J. 1992. Automatic viewing control for 3D direct manipulation. In Zeltzer, D., ed., Computer Graphics (1992 Symposium on Interactive 3D Graphics), volume 25, 71-74.

Strassman, S. 1994. Semi-autonomous animated actors. In Proceedings of AAAI-94, 128-134.