1

There are many online services that analyze social media data and then display aggregate results, for instance the most frequent keywords in social media posts. Is it legal to do something similar for YouTube captions? Is it legal to display parts of the captions on your own website? I'm mainly interested in US jurisdiction, but advice for other countries is also appreciated.

From what I understand this use case would need to fall under "fair use" doctrine, but it seems quite vague. The YouTube support FAQ states:

Borrowing small bits of material from an original work is more likely to be considered fair use than borrowing large portions. However, if it's the “heart” of the work, even a small amount may weigh against fair use in some situations.

So if I understand correctly displaying some captions metrics would be fair use if I don't include any quotes? What about a summary? It could be considered the “heart” of the work and therefore use would be illegal?

4
  • Unless you want to type out the captions by hand, you would need some kind of script or bot to harvest them, which is forbidden by YouTube's TOS, so that would be the first problem you would have to solve.
    – user9838
    Commented Jul 29, 2020 at 21:18
  • @EikePierstorff That's part of my question, but I'm not sure what you wrote is necessarily true. A web scraper needs to respect a robots.txt file in order not to violate the rules, and there's a separate endpoint providing the subtitles under google.com which doesn't have any robots.txt file: video.google.com/timedtext. There's also an official API endpoint giving you access to subtitles, albeit a more restricted one. Commented Jul 30, 2020 at 14:22
  • 1
    Right, my bad. I looked at the general TOS, but the API is a separate product (which you are obviously allowed to use, else it would be pointless :-) ) . Sorry for the noise.
    – user9838
    Commented Jul 30, 2020 at 14:27
  • @EikePierstorff No worries, it's a valid concern as there's no unrestricted official access. It seems to be a grey area. Thanks for the input anyway :) Commented Jul 30, 2020 at 15:58

2 Answers

0
+100

There are many online services that analyze social media data and then display aggregate results...

Those online services may be doing the analysis and aggregation under the site's TOS; or under a specific, individual license agreement that they have negotiated; or without explicit permission, relying on what they believe is Fair Use. We don't know which.

From what I understand this use case would need to fall under "fair use" doctrine, but it seems quite vague.

It's vague because YouTube's overall TOS, written by YouTube's legal team, needs to be broad enough to cover many different situations: uses they see as valid as well as uses they consider infringing and could pursue legal action against. They're trying to explain Fair Use in the TOS without sounding like they condone all usage, while making it understood that Fair Use is a legal doctrine. The TOS is a legal document that both YouTube and its users abide by, and it would be used in a (civil) court case to interpret the contractual arrangement and the possible limits of how Fair Use applies.

So if I understand correctly displaying some captions metrics would be fair use if I don't include any quotes? What about a summary? It could be considered the “heart” of the work and therefore use would be illegal?

What really is "Fair Use" and what is not would ultimately be decided by a court. And it's up to YouTube to initiate legal action against you and other people that YouTube considers infringers. They may or may not take action, depending on whether they feel they could win. If they feel they couldn't win on the Fair Use merits, they might still sue to discourage your use by making you spend money defending yourself. Or they may just ignore your usage; that's also an option.

You can look at Fair Use case law, but even a case very close to your type of usage might be determined differently by a court.

You could try contacting YouTube for clarification on what you want to do, but due to the volume of email they get, you may not get a response. Whether to aggregate and analyze based on your own interpretation of the TOS and Fair Use, without direct permission from YouTube, is your choice.

0

I don't even think an analysis of the frequency of words raises a copyright issue at all. If I write a review of a particular book and talk about how the author really loves the word "implore", how does that copy his expression? Even if it somehow did, fair use covers commentary and criticism.
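To make concrete what such a frequency analysis produces, here is a minimal sketch. It assumes the caption text is already available as a plain string; how it is obtained is outside the sketch. The function name `keyword_counts`, its parameters, and the sample text are hypothetical, chosen only for illustration. The output is a list of words and counts rather than passages of the original text.

```python
import re
from collections import Counter


def keyword_counts(caption_text, top_n=3, min_length=4):
    """Return the top_n most frequent words of at least min_length letters.

    Hypothetical helper for illustration: it reports only (word, count)
    pairs and never reproduces sentences from the input.
    """
    # Lowercase the text and split it into runs of letters/apostrophes.
    words = re.findall(r"[a-z']+", caption_text.lower())
    # Drop short words as a crude stand-in for stop-word filtering.
    counts = Counter(w for w in words if len(w) >= min_length)
    return counts.most_common(top_n)


# Made-up sample text, not taken from any real video.
sample = "I implore you to listen. Again I implore you: please listen, I implore you."
print(keyword_counts(sample, top_n=2))
```

Whether publishing such counts raises any legal issue is the question discussed in this thread; the sketch only shows what the aggregate data looks like.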

1
  • Can the downvoters please explain their vote? While I do think this answer can be improved upon and expanded, it does answer (a part of) the question.
    – sharur
    Commented Jul 30, 2020 at 16:24
