How A.I. Is Changing Hollywood

Behind some of the coolest premium effects in Hollywood is the invisible aid of artificial intelligence. Machine learning is helping create previously unimaginable moments in media today. Let's examine how A.I. is changing Hollywood's creative workflow.

Released on 05/17/2022

Transcript

[Narrator] Behind some of the coolest premium effects

in Hollywood content is the invisible aid of AI.

Artificial intelligence.

It is just blowing the doors wide open

on opportunities for new ways to tell stories.

This is a good technology to hang our hat on

because it is getting so much better

every single year.

[Narrator] Machine learning is being baked into workflows

helping create previously unimaginable moments

from big blockbusters to non-fiction TV.

I think where AI really is impactful

is getting it to do things that human beings can't do.

[Narrator] Including raising the dead?

As if you know, you had Andy Warhol

standing in the studio right in front of you,

and you looked at him and said,

I want you to say it like this.

[AI Voice] I wasn't very close to anyone

although I guess I wanted to be.

[Narrator] Let's examine a few specific use cases

of how AI is changing Hollywood's creative workflow.

[gentle music]

The entertainment industry was spawned by new technology.

So it makes sense that from talkies to television

to digital video, Hollywood has a history

of leveraging new tech,

especially in the world of visual effects.

When I saw Jurassic Park

that was the moment that I realized

that computer graphics would change the face

of storytelling forever.

In the last 25 years that I've been working in film,

we've been conquering various challenges:

doing digital water for the first time in Titanic,

doing digital faces for the first time

in a movie like Benjamin Button.

[Narrator] And now the state of the art

is machine learning AI applications,

like the kind Matt's company MARZ develops in-house.

You can throw it, you know, an infinite amount of data

and it will find the patterns in that data naturally.

[Narrator] Thanks to thirsty streaming services,

Hollywood is scrambling to feed demand

for premium content rich in visual effects.

Budgets and time are not growing in a way

that corresponds to those rising quality expectations.

It's outpacing the number of artists

that are available to do the work.

[Narrator] And that's where AI comes in.

Tackling time-consuming, uncreative tasks

like de-noising, rotoscoping,

and motion-capture tracking removal.

This was our first time ever trying AI in a production.

We had a lot of footage just by virtue

of being on the project and doing 400 shots for Marvel.

When we received the footage, which we call the plates,

in order to manipulate Paul Bettany's face

there needed to be tracking markers applied

during principal photography.

We looked at it.

We said, okay, well, removing tracking markers

is going to take roughly one day per shot.

Replacing or partially replacing Vision's head

for each shot, and a shot is typically defined

as about five seconds of footage,

was roughly a ten-day job,

and the tracking marker removal itself was about a tenth of that.

So on a 10-day shot,

one day was simply removing tracking markers.

We developed a neural net that is able to identify

the dots on the face.

The artificial intelligence averaged out

the skin texture around each dot, removed the dot,

and then infilled it with the average

of the surrounding texture.
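
A rough, hypothetical Python sketch of that idea, not MARZ's actual pipeline: detect the small dark dots on a face plate, then fill each one from the surrounding skin texture. It stands in for the neural net with off-the-shelf OpenCV thresholding and inpainting, and every name and threshold below is an illustrative assumption.

import cv2
import numpy as np

def remove_tracking_markers(frame_bgr, dot_max_area=60):
    # Detect small dark tracking dots and infill them with nearby skin texture
    # (a classical stand-in for the neural net described above).
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    # Dark dots on lighter skin: invert-threshold locally, keep only tiny blobs.
    binary = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                                   cv2.THRESH_BINARY_INV, 15, 10)
    num, labels, stats, _ = cv2.connectedComponentsWithStats(binary)
    mask = np.zeros_like(gray)
    for i in range(1, num):
        if stats[i, cv2.CC_STAT_AREA] <= dot_max_area:
            mask[labels == i] = 255
    # Grow the mask slightly so the infill blends into the surrounding texture.
    mask = cv2.dilate(mask, np.ones((5, 5), np.uint8))
    # Fill each dot from its neighbouring pixels (the "averaging" idea above).
    return cv2.inpaint(frame_bgr, mask, inpaintRadius=5, flags=cv2.INPAINT_TELEA)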

Now Marvel loved it because it sped up production.

They saved money.

It's exactly what we wanted these solutions to do.

Where the solution was faltering

was whenever there was motion blur.

When Paul Bettany moves his head very quickly

to the right or to the left,

there are moments where those dots will reappear,

partially because in the dataset itself

we didn't have enough motion blur data.

Another example would be whenever the character

turned his head so that his eyes were out of the frame;

you would see those dots reappear as well.

The AI recognition is using the eyes

as a kind of crucial landmark to identify the face.

And so if I turn my head this way and you can't see my eyes,

well, the AI can't identify that as a face.

Again, you can fix those things with more data.

The more data you feed these things,

typically the better, right?
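
One common way to feed the model more of that missing data without reshooting is synthetic augmentation, for example generating motion-blurred variants of the plates you already have so the network sees fast head turns during training. The sketch below is an assumed, simplified Python illustration of that idea, not the team's actual tooling.

import cv2
import numpy as np

def motion_blur(image, length=15, angle_deg=0.0):
    # Simulate directional motion blur so training data covers fast head turns.
    kernel = np.zeros((length, length), np.float32)
    kernel[length // 2, :] = 1.0                      # horizontal streak
    center = ((length - 1) / 2.0, (length - 1) / 2.0)
    rot = cv2.getRotationMatrix2D(center, angle_deg, 1.0)
    kernel = cv2.warpAffine(kernel, rot, (length, length))
    kernel /= max(kernel.sum(), 1e-6)                 # keep brightness constant
    return cv2.filter2D(image, -1, kernel)

# Hypothetical usage: expand a set of face plates with blurred variants.
# for angle in (0, 45, 90, 135):
#     augmented = motion_blur(plate, length=21, angle_deg=angle)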

[gentle music]

[Narrator] There wasn't a lot of clean data

available for our next AI use case.

The star of the film had been dead for 25 years.

Yet the director wanted more than 30 pages of dialogue

read by the iconic artist Andy Warhol himself.

So what do you do?

You could hire, like, a voice actor

to do, like, a great impersonation,

but we found with his voice

you kind of wanted to retain that humanness

that Andy had himself.

You can get fairly close with a voice actor,

but you really can't get it.

And that's where AI technology really helps.

Generative audio is the ability for an artificial agent

to be able to reproduce a particular voice

but also reproduce the style, the delivery,

the tone of a real human being, and do it in real time.

[AI Voice] Welcome to Resemble, a generative audio engine.

When the team initially reached out to us,

they proposed what they were going to do.

We asked them, like, okay, well,

what kind of data are we working with?

And they sent us these audio files,

like recordings over a telephone.

They're all from the mid-to-late seventies.

The thing about machine learning

is that bad data hurts a lot more than good data.

So I remember looking at the data we had available

and thinking this is gonna be really, really difficult

to get right with three minutes of data.

We're being asked to produce six episodes' worth of content

with three minutes of his voice.

So with three minutes,

he hasn't said every word that's out there.

So we're able to extrapolate to other phonetics

and to other words, and our algorithm

is able to figure out how Andy would say those words.

That's where neural networks are really powerful.

They basically take that speech data

and they break it down and they understand hundreds

and thousands of different features from it.
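
A toy illustration of why three minutes can stretch that far: a voice model needs coverage of the phonemes in someone's speech rather than of every word, so new words can be assembled from sounds it has already heard. The Python snippet below is a hypothetical mini-lexicon, not Resemble's system; a real pipeline would use a learned grapheme-to-phoneme model for unseen words.

# Hypothetical mini-dictionary mapping words to phoneme sequences.
TOY_LEXICON = {
    "close": ["K", "L", "OW", "S"],
    "robot": ["R", "OW", "B", "AA", "T"],
    "anyone": ["EH", "N", "IY", "W", "AH", "N"],
}

def words_to_phonemes(text):
    phonemes = []
    for word in text.lower().split():
        # In a real system an unseen word would go through a learned
        # grapheme-to-phoneme model instead of a fixed dictionary.
        phonemes.extend(TOY_LEXICON.get(word, ["<unk>"]))
    return phonemes

print(words_to_phonemes("robot close anyone"))
# ['R', 'OW', 'B', 'AA', 'T', 'K', 'L', 'OW', 'S', 'EH', 'N', 'IY', 'W', 'AH', 'N']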

Once we have that voice that sounds like Andy

from those three minutes of data

then it's all about delivery.

It's all about performance.

[AI Voice] I went down to the office

because they're making a robot of me.

And Andy's voice, it's highly irregular.

And that's where the idea of style transfer really came in.

So style transfer is this ability

for our algorithm to take, as input,

someone else's speech.

[Voice Actor] I wasn't very close to anyone

although I guess I wanted to be.

So we have a voice actor say that line.

And then our algorithms are able to extract certain features

out of that delivery

and apply it to Andy's synthetic or target voice.

The first one was, like, automatically generated.

No touch-ups.

[AI Voice] I wasn't very close to anyone.

Although I guess I wanted to be.

The second one was, like, touched up by adding a pause.

[AI Voice] I wasn't very close to anyone,

although I guess I wanted to be.

And then the third one was basically

adding the final touch where it's like, okay, you know what?

I really want to place an emphasis

on this particular syllable.

So yeah, let's get a voice actor to do that part

to actually place that emphasis

on the right words and the right syllable.

And then the third output has those features extracted

from that voiceover actor and applied to Andy's voice.

[AI Voice] I wasn't very close to anyone

although I guess I wanted to be.
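
To sketch how such a pipeline might be wired together, the example below uses the librosa library to pull prosody features, pitch and loudness contours, from a voice actor's take; target_voice.synthesize is a made-up placeholder for whatever generative model actually speaks in the target voice, and none of this is Resemble's real API.

import librosa
import numpy as np

def extract_prosody(wav_path, sr=22050):
    # Pull pitch and energy contours out of a voice actor's performance.
    y, _ = librosa.load(wav_path, sr=sr)
    f0, voiced_flag, _ = librosa.pyin(y,
                                      fmin=librosa.note_to_hz("C2"),
                                      fmax=librosa.note_to_hz("C6"))
    energy = librosa.feature.rms(y=y)[0]
    return {"f0": np.nan_to_num(f0), "voiced": voiced_flag, "energy": energy}

# Hypothetical usage: drive a synthetic target voice with the actor's delivery.
# style = extract_prosody("voice_actor_take.wav")
# audio = target_voice.synthesize(text="I wasn't very close to anyone...",
#                                 pitch_contour=style["f0"],
#                                 energy_contour=style["energy"])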

You have definitely heard AI voices

being used in the past for touch ups

for a line here or there.

This is probably the first major project that's using it

so extensively.

Most VFX is still a very manual process.

Characters can be extremely challenging:

creatures, things like fur and hair.

Those things can be extremely challenging

and time-consuming.

[Narrator] One notable example of where the technology

is headed is the advanced 3D VFX work

in Avengers: Endgame.

Josh Brolin plays Thanos.

We capture tons and tons of data in this laboratory setting

with Josh.

And then we use that data to train neural networks

inside of a computer to learn how Josh's face moves.

They'll say lines, they'll look left, they'll look right.

They'll go through silly expressions.

And we capture an immense amount of detail

in that laboratory setting.

Then they can go to a movie set

and act like they normally would act.

They don't have to wear any special equipment.

Sometimes they wear a head camera

but it's really lightweight stuff, very unobtrusive

and allows the actors to act like they're in a normal movie.

Then later, when the animators go to animate

the digital character, they kind of tell the computer

what expression they want the actor to be in.

And the computer takes what it knows

based on this really dense set of data

and uses it to plus up,

to enhance what the visual effects animator has done

and make it look completely real.
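
One way to picture that training step, purely as an assumed illustration and not the studio's actual system, is a small network that learns to map coarse animator controls, such as blendshape weights, to the dense facial detail captured in the lab. The PyTorch sketch below uses made-up sizes and random placeholder data just to show the shape of the problem.

import torch
import torch.nn as nn

class FaceDetailNet(nn.Module):
    # Toy model: coarse rig controls in, dense per-vertex offsets out,
    # supervised by the high-detail lab capture of the actor.
    def __init__(self, num_controls=60, num_vertices=5000):
        super().__init__()
        self.num_vertices = num_vertices
        self.net = nn.Sequential(
            nn.Linear(num_controls, 256), nn.ReLU(),
            nn.Linear(256, 512), nn.ReLU(),
            nn.Linear(512, num_vertices * 3),
        )

    def forward(self, controls):
        return self.net(controls).view(-1, self.num_vertices, 3)

model = FaceDetailNet()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# Placeholder batch standing in for (animator pose, scanned detail) pairs.
controls = torch.rand(8, 60)
target_offsets = torch.rand(8, 5000, 3)

for _ in range(100):
    optimizer.zero_grad()
    loss = loss_fn(model(controls), target_offsets)
    loss.backward()
    optimizer.step()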

[gentle music]

So there will come a time in the future,

maybe it's 10 years, maybe it's 15 years,

when you will see networks that are going to be able to do

really creative stuff.

Again, that's not to suggest

that you remove talented artists from the equation,

but I mean, that's the bet

that we're taking as a business.

Is AI gonna take over my job?

What I see happening right now

is actually quite the opposite:

it is creating new opportunities

for us to spend our time doing things

that are creatively meaningful.

Rather than spending lots of time doing menial tasks,

we're actually able to focus on the creative things

and we have more time for iteration.

We can experiment more creatively

to find the best looking result.

I think that the more that AI can do the menial stuff

for us, the more we're gonna find ourselves

being creatively fulfilled.

Again, the argument for us is

like really creating content that isn't humanly possible.

So, you know, we're not interested in

like creating an ad spot that your real voice actor would do

because in all honesty,

that real voice actor would do way better

than the AI technology would do.

It would be way faster

if you're just delivering a particular sentence

or a particular line.

The technology to do deep fakes is so prevalent.

You can get apps on your phone now

that pretty much can do a rudimentary deep fake.

It's gonna be interesting in the future.

Are we gonna have to put limits on this technology?

How do we really verify what's authentic

and what isn't?

There are sort of social repercussions for it as well

that I think we don't quite understand yet.

I absolutely believe that this technology

could be misused.

Our number one priority is to make everyone feel comfortable

with what we're doing.

I think it comes down to educating

the general population eventually

and making them understand that they should think through

whatever they are looking at,

whatever they're reading, and now whatever they're hearing.

We feel we're directionally correct in our bet

that this is a good technology to hang our hat on

because it is getting so much better every single year.

And we don't wanna miss what we see

as like a once in a lifetime opportunity here.
