25

How are subtitles written for movies? Does a person simply write it down line by line, perfectly syncing with the assigned time-frame, or is there an easier way to do so?

Basically, subtitles consist of three components:

  • The order in which a given subtitle appears. For instance, if the line "This is the beginning" comes third, after two other lines, then its assigned order number is 3.
  • The time frame, which is arguably the most tedious part. In most subtitle files, the time frame is accurate to the third decimal place of the second (e.g. 01:44:12,145 --> 01:44:13,036), a precision that even the most acute human perception isn't capable of sensing.
  • The line itself, which is also tedious to write down. Many such lines don't even contain dialogue but sound cues, such as [SIREN WAILING] or [ECHOING].
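For reference, a single cue in the widespread SubRip (`.srt`) format bundles all three of these components. A short Python sketch that pulls them apart (the cue itself is a made-up example):

```python
cue = """3
01:44:12,145 --> 01:44:13,036
[SIREN WAILING]"""

lines = cue.splitlines()
index = int(lines[0])                 # the order number
start, end = lines[1].split(" --> ")  # the time frame
text = lines[2]                       # the line itself

def to_seconds(ts):
    """Convert an SRT timestamp 'HH:MM:SS,mmm' to seconds."""
    h, m, rest = ts.split(":")
    s, ms = rest.split(",")
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000

print(index, text)                                    # 3 [SIREN WAILING]
print(round(to_seconds(end) - to_seconds(start), 3))  # 0.891 seconds on screen
```

So the millisecond-level precision is just arithmetic on the file's timestamps, not something anyone perceives by ear.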

It is hard to believe that actual people, who aren't even credited as part of the film, sit down to the tedious task of writing out subtitles. So how exactly are subtitles written? Do people write them? Or is there software that can do it for you?

11
  • 3
    Yes, they're written manually, mostly by ordinary people (not professionals) who write them and then post them online. (This is probably against the rules.) But I feel I have to thank the people at "Addic7ed.com" for providing free and quick online subtitles for most TV shows, and the people at "subscene.com" for translating them into several languages (including my own)
    – madmada
    Commented Aug 5, 2017 at 21:14
  • 7
    @madmada Definitely mostly done by professional companies Commented Aug 5, 2017 at 22:40
  • 6
    @Rapid "who aren't even credited" They are always credited near the end of the film, in the form of subtitles themselves. Commented Aug 5, 2017 at 22:41
  • 12
    @madmada Hehe, that was kind of my point. When you say mostly, you really have to mention exactly in what domain. Every distributed bluray/DVD in history with subtitles was done professionally, same for any TV channel that broadcasts anything with subtitles, etc. If you turn your sights to online torrents and such, then suddenly amateur srt files become the norm :) Guess my point was: depends where you look. Commented Aug 5, 2017 at 23:38
  • 4
    Subtitling is considerably less work than writing, filming/animating, editing and scoring a movie… Even if it's tedious and not to be underestimated, it pales in comparison to the rest of the work already going into a movie. Not sure why it's so unfathomable that people do it.
    – deceze
    Commented Aug 6, 2017 at 7:13

4 Answers

25

There are two main types of subtitling and several ways of attaching them to videos.

Types of subtitling

As you've noted in your question, sometimes subtitles have sound cues and sometimes they do not. This is because some subtitles are designed for the hearing and some are designed for the deaf.

Subtitles designed for hearing people will not include these sound cues because the hearing people can... well... hear them. Generally these are used when the dialogue is translated from another language. In the US, at least, this is generally just called "subtitling".

Subtitles designed for deaf people will include these descriptions because they add details that explain why someone reacts to certain things. Because they can't hear the audio cues, they need textual versions. They add depth to the movie watching experience. These are usually subtitles written in the same language as the spoken language in the film. This is a specialized form of subtitling often referred to as "captioning".

"Captions" aim to describe to the deaf and hard of hearing all significant audio content - spoken dialogue and non-speech information such as the identity of speakers and, occasionally, their manner of speaking - along with any significant music or sound effects using words or symbols.

Ways of subtitling

Most good subtitles are done manually and are either stored in a separate file marked with time code cues or hard-coded into the video. The latter is more often done with films that have scenes in a language different from the bulk of the film or, occasionally, with fansubs; usually, though, they're a separate file. If they are a separate file, that is referred to as "closed captioning"; if they are burned in, that is "open captioning".

The term "closed" (versus "open") indicates that the captions are not visible until activated by the viewer, usually via the remote control or menu option. On the other hand, "open", "burned-in", "baked on", or "hard-coded" captions are visible to all viewers.

Creating these files can be done with a variety of available software options but they are painstaking and difficult to make so you should appreciate the people who take the time to make them (assuming they're done well).

Some websites have bots that do subtitling. They use voice recognition to approximate the words being spoken, and they're usually really bad. YouTube has this: if the creator of the video doesn't include a subtitle track, it will create one for you.

YouTube is constantly improving its speech recognition technology. However, automatic captions might misrepresent the spoken content due to mispronunciations, accents, dialects, or background noise. You should always review automatic captions and edit any parts that haven't been properly transcribed.

As to the timing, unless the dialogue is revelatory, the timing isn't really that important. A half-second lead or delay isn't going to annoy people much, so most of that granularity is simply a "because we can" thing. Multiple-second delays or leads are a problem, but you can usually sync the subtitles better (or edit them otherwise) if you have the software and a separate subtitle file.
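If the subtitles are a separate .srt file, re-syncing really is just a mechanical shift of every timecode. A minimal Python sketch of the idea (not any particular tool's method):

```python
import re

def shift_srt(text, offset_seconds):
    """Shift every 'HH:MM:SS,mmm' timestamp in an SRT string by a fixed offset."""
    def shift(match):
        h, m, s, ms = (int(g) for g in match.groups())
        total_ms = (h * 3600 + m * 60 + s) * 1000 + ms + int(offset_seconds * 1000)
        total_ms = max(total_ms, 0)  # clamp: never go before 00:00:00,000
        h, rem = divmod(total_ms, 3_600_000)
        m, rem = divmod(rem, 60_000)
        s, ms = divmod(rem, 1000)
        return f"{h:02}:{m:02}:{s:02},{ms:03}"
    return re.sub(r"(\d{2}):(\d{2}):(\d{2}),(\d{3})", shift, text)

# Delay all subtitles by half a second
print(shift_srt("00:00:01,500 --> 00:00:03,000", 0.5))
# 00:00:02,000 --> 00:00:03,500
```

Dedicated subtitle editors do essentially this (plus fancier options like stretching to fix frame-rate mismatches), which is why fixing a consistently-offset subtitle file takes seconds.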

4
  • 2
    People who do captioning for live TV like news reports often make lots of mistakes when trying to keep up in real time. I wonder if automatic speech recognition would do better.
    – Barmar
    Commented Aug 5, 2017 at 23:35
  • 1
    @Barmar: Maybe. You probably have much better sound quality for "real" TV (as opposed to random things that people have uploaded to YouTube). And while accents can vary, for well-known actors and newscasters it's likely you have a large corpus of the person's already-captioned speech to train your machine learning model on. But on the other hand, well-trained humans are surprisingly good at captioning in real time. The best people usually only mess up on uncommon words and proper nouns, which a machine might also struggle with.
    – Kevin
    Commented Aug 6, 2017 at 17:41
  • I love the way that YouTube's disclaimer basically says, "If the automatic captions are wrong, it's your fault for speaking wrong or speaking funny." *sigh* Commented Aug 6, 2017 at 20:15
  • Re: timing, in my experience of watching subbed foreign-language movies, if there's a joke, the audience waits until they can tell (from inflection, timing, and whatnot) that the actor has actually spoken that line, and then they laugh. I found myself doing to this too. It was very odd. Commented Oct 30, 2017 at 19:41
13

I have a friend who works at a company that does subtitles for TV.

He has dictation software (Dragon), trained specifically to his voice. He listens to the TV show, and as he listens, he repeats the lines clearly into a microphone. The computer transcribes what he says.

This is somewhat more accurate than getting the software to try and understand the audio track of the TV programme directly, but fast enough that it works for live broadcasts (with a short delay and the occasional wrong word).

2
  • Question is about movies, not live TV. While I cannot answer the question, I strongly suspect that the methods used for live TV (unscripted) are different from movies and pre-recorded TV. The difference is that non-live productions have a script available containing all of the words already written, with only a minority of the dialog improvised or incorrectly recited by the actors.
    – user9311
    Commented Aug 7, 2017 at 14:19
  • @Snowman Indeed, there are many examples of TV closed-captions that don't match the dialog because they were made from the original script, but there was improvisation or last-minute rewrites during filming.
    – Barmar
    Commented Aug 7, 2017 at 14:24
7

Many pieces of subtitling software help you sync stuff faster than doing it by trial-and-error, by allowing you to import your movie and showing you a waveform of the audio. It's easy to visually remember where a line starts and ends that way, so, once you've heard the audio once and you've transcribed/translated it (yes, manually), all you have to do is select the desired time frame by clicking and dragging and it sets the timecodes by itself.

Some programs also allow you to perform automatic quality checks on your subs. Now that Netflix is available almost everywhere on the planet, there is a lot of demand for subtitlers, so they started hiring freelancers. I was reading about Netflix's subtitling standards the other day, and they recommend a few add-ons that warn you when you've exceeded the maximum number of characters per line or the maximum reading speed, when you have useless leading/trailing spaces, or when you've timed your subtitles poorly near shot changes (!), etc. Those add-ons also allow you to export your subtitles to Netflix's preferred format.
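Those checks are simple to express in code. A sketch of the idea in Python (the 42-characters-per-line and 20-characters-per-second limits are illustrative assumptions here, not quoted from the actual style guide):

```python
def check_cue(lines, duration_seconds, max_chars_per_line=42, max_cps=20):
    """Return a list of quality-check warnings for one subtitle cue."""
    problems = []
    for i, line in enumerate(lines, 1):
        if line != line.strip():
            problems.append(f"line {i}: leading/trailing whitespace")
        if len(line) > max_chars_per_line:
            problems.append(f"line {i}: {len(line)} chars (max {max_chars_per_line})")
    # Reading speed: total characters shown divided by time on screen
    cps = sum(len(l) for l in lines) / duration_seconds
    if cps > max_cps:
        problems.append(f"reading speed {cps:.1f} cps (max {max_cps})")
    return problems

print(check_cue(["This line is fine."], duration_seconds=2.0))  # []
```

Running rules like these over every cue is cheap, which is why the add-ons can flag problems as you type.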

So yeah, subtitles are written manually because voice recognition and automatic translation are still unreliable, but at least today's software allows you to focus more on actually transcribing/translating the content and less on figuring out technical minutiae, like ensuring that all of the text fits on the screen or figuring out the perfect timing/reading speed for each line.

5

It's not that big a deal.

Ever sit through the credits of a movie? No? There's a reason for that. There are often literally a thousand people named, because they are all involved in the movie's production. A movie can easily be as big an operation as a space launch - in fact, India put an orbiter around Mars for less than the cost of the movie The Martian.

So closed captions are a very small task in a movie's production. It is no problem having a human do this. Especially when she has access to the shooting script in digital form.

Now, as for foreign languages, the initial pass of laying in the same-language captions establishes the timing and flow of when to subtitle. They can usually just "drop in" the foreign-language captions. Again, this is a human process, and again, a tiny fraction of the total cost of the movie.

Then you have the fan-made captioning. The economics on this are a little weird, because it's a labor of love by the volunteers involved. Remember, it only needs to be done once per language. There are also people doing captioning for pay as piecework, à la the Amazon Mechanical Turk, but this places the provider at serious risk of running afoul of labor laws, such as minimum wage.

Voice recognition that tries to recognize "any voice" is garbage; it's hard enough even when the background noise is very low. Against the background noise and music of a movie, it's even worse. You can see how YouTube tries to do it, and it's terrible. Computer voice recognition only works well when it's trained over time on a particular person's voice.
