4

Most news sites and tech blogs provide only a summary in their RSS feeds. If I write a script that scrapes their websites and extracts some useful data, can I share the script as open source? Does the answer depend on the individual website, and do I need prior permission from each one? If scraping is not allowed without permission, who is liable: the person who wrote the script or the person who used it?

Note 1: I am not redistributing the websites' content, just sharing a script that helps users read the content in a different form. I am specifically interested in Indian and US law.

Note 2: The script may be specific to a particular website and may access data that is accessible only with the user's login credentials.

Note 3: "Script" here means a program or software application.

For example, nytimes offers a subscription at $3.50 a week for web + smartphone access, but tablet access costs $5.00 a week. Can I sell a tablet app that parses the page from the web interface and converts it to a tablet-friendly format? The app would use the user's own login credentials to log in through a virtual browser, download the content, and reformat it so that it is more readable on a tablet. Do I have to get permission from nytimes to sell the app? I don't care about this special case as such; in general, can the provider legally restrict how the content is consumed by the end user?

1
  • Sounds like YQL.
    – ShemSeger
    Commented Jul 1, 2015 at 22:43

3 Answers

2

Without detailed knowledge of either US or Indian law, I can't imagine that this wouldn't be allowed. In some jurisdictions, distribution of software may be restricted or prohibited if any use of it would very likely, or automatically, lead to copyright violations (e.g. "hacker tools"). However, this is a very exceptional scenario and doesn't apply here for various reasons.

The owner of a website publishes content on the Internet with the intention that users can access it (either all users or only those who have subscribed to the content, etc.). The website owner then cannot prescribe exactly how users access the content. For example, different browsers may display a website differently, and some browsers, such as Safari, already provide text-only views. In the end, your script is not very different from a browser.
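To make the browser comparison concrete, here is a minimal sketch of what a "text-only view" amounts to technically. It is only an illustration, not the asker's actual script: the class name `TextOnlyView`, the helper `text_only`, and the hard-coded `SAMPLE_PAGE` are all hypothetical, and nothing is fetched from a real site. It uses only Python's standard `html.parser` module.

```python
from html.parser import HTMLParser


class TextOnlyView(HTMLParser):
    """Collects visible text and skips <script>/<style>, like a crude reader mode."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a skipped element
        self.chunks = []      # pieces of visible text, in document order

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def text_only(html):
    """Return the visible text of an HTML document, one chunk per line."""
    parser = TextOnlyView()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


# Hypothetical page, hard-coded so the sketch makes no network request.
SAMPLE_PAGE = """
<html><head><title>Example headline</title><style>p { color: red; }</style></head>
<body><h1>Example headline</h1><script>trackVisitor();</script>
<p>First paragraph of the article.</p><p>Second paragraph.</p></body></html>
"""

if __name__ == "__main__":
    print(text_only(SAMPLE_PAGE))
```

The point of the sketch is that the program only changes how markup the user has already received is presented; it republishes nothing, which is the same thing a browser's reader mode does.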

The only critical situation I can imagine is when somebody uses your script to scrape content from a website and then republishes it. Even then, this "somebody" would be responsible and liable, not you as the script's author. Think again of the browser analogy: if I download a copyright-protected movie from a website via Mozilla Firefox and then put it on my website, I can't imagine a legal principle under which Mozilla would be liable. Neither would you be.

3

Check the website's terms of service. Check to see if you're violating these terms, and check to see if the script you are making enables other people to violate them. Courts don't often look kindly on actions whose sole purpose is enabling someone else to do something that is prohibited. If you're making a script that helps people do something they're allowed to do, in a way that's better for at least somebody and makes nobody worse off, that's often a different story.

Major websites will generally indicate whether or not you're allowed to do this.

Some sites are fairly strict about prohibiting scraping (e.g. Craigslist, which at one point shut down Padmapper's alternative more-useful presentation of their content).

Others, like Wikipedia, much more actively encourage reusing content from their sites as long as you meet certain conditions such as a link back to the original source.

2
  • I think there is a difference. I'm not republishing content. I understand I don't own the copyright. Every User of the script will fetch the content for themselves
    – balki
    Commented Jun 24, 2015 at 18:33
  • For their personal use.
    – balki
    Commented Jun 24, 2015 at 18:52
0

Such scripts are all over GitHub, so even if it were illegal, it is de facto legal.

However, there's a more obvious proof of its legality; every web browser on earth does exactly what you're describing. As does every RSS reader. Every network / traffic analysis tool. Every screen reader. Every network proxy, cache, and (depending on your perspective) router and switch.

I'll admit that I'm unfamiliar with the legal precedent, but I do know the precedent in reality, and reality is overwhelmingly in your favor.

11
  • 1
    "Everyone does it" is not proof that something is legal.
    – Mark
    Commented Jun 22, 2015 at 23:21
  • @Mark Yes and no. If the probability of your being arrested for, and convicted of, jaywalking is equal to your probability of being arrested for, and convicted of, something that isn't a crime at all, then jaywalking isn't illegal. There's codified law, common law, and reality. Reality is influenced by the first two, but at the end of the day it defines what's legal.
    Commented Jun 23, 2015 at 2:19
  • @ParthianShot -- Almost zero is not zero. The probability of being fined for jaywalking is definitely greater than zero (I've gotten a ticket for it before), and I'm sure if you pissed off a cop enough he'd arrest you for it. The probability of being convicted for something that isn't a crime is zero (though the probability of being falsely or inaccurately convicted is definitely greater than zero). Torrenting is a perfect example of "everyone does it", but very few people get convicted for it.
    – None
    Commented Jun 24, 2015 at 19:08
  • @zyklus "Almost zero is not zero" Depends on how close to zero you are. That's a kind of fundamental principle of all of mathematics, physics, machine learning, risk assessment. That's why you'll never see any finding in a paper with a P value of exactly zero, and why (despite that) plenty of papers get their findings accepted yearly. "The probability of being convicted for something that isn't a crime is zero" Not true. "the probability of being... inaccurately convicted is definitely greater than zero" Contradicted yourself within two sentences. Nice.
    Commented Jun 24, 2015 at 20:53
  • 1
    @ParthianShot -- No kidding, opinions play a role in justice. What does that have to do with anything we are talking about?
    – None
    Commented Jun 24, 2015 at 21:14
