The idea of a computer-generated actor, or "synthespian," capable of passing for human, seems less futuristic every year. Of course, TV's fictional Max Headroom, although ostensibly intelligent and self-aware, looked like a cartoon and was never intended to fool us. And William Gibson's novel Idoru introduced us to Rei Toei, a virtual pop singer authentic enough to love and marry a human colleague. But the promises and pitfalls of artificial intelligence were discussed here last year ("Cyc Yourself Up for A.I."), and the near-term prognosis is not favorable. Putting the human mind, or even a reasonable facsimile, into a computer is not something we'll be doing in the near future.
But as hit films like Shrek amply demonstrate, getting a convincing human body to perform, puppetlike, on the silicon screen is increasingly feasible. The body has around 206 bones, 600 muscles (not counting the face) and, depending on how you count the hands and feet, somewhere between 100 and 200 joints. These pieces work together in very predictable ways, so with a decent physics engine it's not difficult to reproduce, or even synthesize, the movements of an actor or actress. In fact, recent movies like Harry Potter and The Fellowship of the Ring used computer-generated characters to fill out crowd scenes and to perform stunts which would be costly and dangerous in the real world. These characters didn't draw attention to themselves, didn't look computer-generated at all, and that's the point.
We start to run into trouble, though, when these motion-blurred long-distance scenes give way to close-ups and head shots. Why? Because while human beings will happily buy into the make-believe of cartoons and puppet shows, we're actually damned difficult to fool when the stuff we're looking at is supposed to be real. More than 50 percent of the neurons in our brains are involved in various aspects of vision, and we are exquisitely sensitive to minor discrepancies of lighting and color. Rendering human skin involves not only a realistic treatment of its elasticity and sag, but also of its subtle translucence, its pores and microtexture, its goosebumps and tiny hairs. Ask any painter: it's insanely difficult to portray even a motionless human being as more than a cartoon. And photo-realism is all but impossible, not to mention stylistically trite.
All the motherboard's a stage
The problems multiply when you consider the 44 muscles of the face, which are capable of producing about 5,000 distinct expressions, each representing a slightly different mix of subtle emotions. And in the movements involved with speech, lip readers recognize 20 to 50 "visemes," or facial positions associated with particular sounds from the vocal tract. And it turns out that simply picking the expressions and visemes you want, and "in-betweening" them with morphs or classical animation techniques, produces a ghastly effect which resembles nothing human.
Among the pickiest of human visual organs is the Fusiform Face Area in the brain's temporal lobe. This amazing bit of wetware can identify a symbolic human face in a design as simple as two dots above a half-circle. It can instantly recognize not only your friends and family, but thousands of other people you've met only once or twice, or seen only on TV. But this exquisite sensitivity also means that the FFA can pick out phony features and movements with equal ease. To fool it, animators have to model and control the facial muscles individually. This is beyond the real-time skills of even the greatest puppeteers, and even when you know what each muscle does, that doesn't tell you much about how and when and why they work together. Thus, getting it all to look right becomes an extremely time- and labor-intensive process, both technically and artistically. Which is ironic, considering how effortlessly a human actor can perform these same tricks. A director doesn't ask his cast members to sneer or smirk; he doesn't even ask them to be pensive or angry or guardedly optimistic. Instead, he simply explains the motivations of their characters for a particular scene, and the emotions and expressions flow naturally from that.
Which brings us back to A.I. again, because digital stars will never be cheaper or easier than human ones until they shed their large staffs of animators and puppeteers. And to do that, they'll need to generate their own emotions based on an understanding of the scenes they're acting out. They needn't be conscious per se, but at a minimum, they'll need to be idiot savants with a fine mathematical grasp of emotion and nuance.
And then there is the voice to consider. Try watching your favorite computer animation with the sound off, and you'll realize just how important voices are in making these characters seem lifelike. Why? Because they're the voices of real movie stars, alive with thought and feeling, impeccably timed for comedic or dramatic effect. In today's CGI movies, the voices actually come first, and the animation is worked in around them.
Voicing the future of film
Can these voice actors be digitally replaced? Not easily. We have speech synthesis packages today which model the throat and vocal cords and could almost pass for human drug zombies, but the eight laryngeal muscles which control the voice box, along with the diaphragm which controls the lungs, play together in extremely complex ways to produce the subtle, beautiful modulations of human speech and song. Mechanizing all that (matching each of those 5,000 emotional states to a particular motion pattern of the vocal muscles) will be difficult in the extreme, and even when we've done it we'll need another staff of people (or another A.I.) to drive the thing for us. And the results won't be better than a human voice, just comparable. Good enough to fool the listener.
In the future, as now, the most obvious application for digital characters will be for interactive experiences like games, and for bringing to life impossible or inhuman characters like dragons and ghosts. And for cheaply and safely fleshing out a battle or crowd scene, sure. The next frontier after that would be synthespian programs competent to serve as extras with minor speaking parts, although the film's producers would then lose out on one of the great privileges of moviemaking: putting their nephews and girlfriends and gangster acquaintances in the picture for fun and profit.
Anyway, yeah, I'm sure eventually we'll see some fully digital movie stars, and even desktop moviemaker software whose end products are nearly indistinguishable from the real thing. And this will not only threaten Hollywood's entrenched interests and lower the entry barriers for talented new players, but also usher in a brave new age of filmmaking, when any bozo with a computer and some time can slap together a picture starring anyone they like, and release it on the Internet for all to see. Most of these will be painfully stupid even by Hollywood standards, and some will no doubt be libelous and scandalous, pornographic and infringeful of others' copyrights. That's freedom of expression for you. We'll also get some masterpieces.
Still, through this haze of technology it pays to keep one thing in mind: No matter how the movies are made, the chairs and the popcorn will never be virtual, and neither, most especially, will the audience.
Wil McCarthy is a rocket guidance engineer, robot designer, science fiction author and occasional aquanaut. He has contributed to three interplanetary spacecraft, five communication and weather satellites, a line of landmine-clearing robots, and some other "really cool stuff" he can't tell us about. His short fiction has graced the pages of Analog, Asimov's, Science Fiction Age and other major publications, and his novel-length works include Aggressor Six, the New York Times notable Bloom, and The Collapsium.