
The original data cannot be reproduced or tied to the model.

The data is used only for training. The model is also trained on other data sets.

Is it fair to use the model for commercial purposes post-training?

License: https://creativecommons.org/licenses/by-nc/4.0/

(example: I train a model to recognize handwritten characters, but the dataset used to train it is CC BY-NC 4.0). Can I sell the use of the model?

  • what is the purpose of the model? Just to be a test case or does the program come with the model?
    – Trish
    Commented Mar 18, 2019 at 16:34
  • @Trish The model is just used to take in other similar data and provide a report on that data. (example: I train a model to recognize handwritten characters, but the dataset used to train it is CC BY-NC 4.0). Can I sell the use of the model?
    – DmetrikX
    Commented Mar 18, 2019 at 16:43
  • This should not be closed as unclear. It is pretty clear what the OP is asking -- may a model builder use data obtained under a CC-BY-NC 4.0 license to train a model, and then sell the model without infringing copyright or database rights. Nor is this question off-topic or too broad, IMO. Commented Mar 18, 2019 at 23:41

1 Answer

This is an interesting question. I can find no case law on point, and the answer seems to depend on the jurisdiction involved.

The CC-BY-NC 4.0 license grants, in section 2 paragraph a.1.A, the right to

... reproduce and Share the Licensed Material, in whole or in part, for NonCommercial purposes only;

and in section 2 paragraph a.1.B, the right to:

produce, reproduce, and Share Adapted Material for NonCommercial purposes only.

In addition, under section 4 paragraph a, in those countries that recognize "Sui Generis Database Rights", the rights granted under 2.a.1 also include the right to:

... extract, reuse, reproduce, and Share all or a substantial portion of the contents of the database for NonCommercial purposes only;

These are the "Licensed Rights" under section 1 paragraph g, and the license restrictions (including the restriction to noncommercial use) apply only to the Licensed Rights.

Note that the Sui Generis Database Rights are recognized only in the EU, and in particular are not recognized under US copyright law. Such rights were rejected in the Supreme Court decision Feist Publications, Inc., v. Rural Telephone Service Co., 499 U.S. 340 (1991).

According to the US Copyright Office's "Statement of David O. Carson, General Counsel, United States Copyright Office before the Subcommittee on Courts, the Internet, and Intellectual Property Committee on the Judiciary, 2003":

What remains is a thin layer of copyright protection for qualifying databases. In order to qualify, they must exhibit some modicum of creativity in the selection, arrangement, or coordination of the data. The protection is thin in that only the creative elements (selection, arrangement, or coordination of data) are protected by copyright. Explanatory materials such as introductions or footnotes to databases may also be copyrightable. But in no case is the data itself (as distinguished from its selection, coordination or arrangement) copyrightable.

If I have understood the question correctly, the data used to train the model is "used", but it is not "reproduced" or "shared", as the question says:

The original data cannot be reproduced ... The data is simply used for training purposes.

Thus, in the US, it seems that the non-commercial restriction would not apply to this training data, which is used but not reproduced or shared. However, in the EU the non-commercial restriction would apply, and selling the model created in part by the use of the data, without an additional grant of permission, would seem to violate the license and thus infringe the copyright-holder's rights.

The CC Data FAQ says:

Under version 4.0, if an NC license has been applied then any use of the licensed database or its contents that is restricted by copyright law or sui generis database rights requires compliance with the NC term, even if the database is not publicly shared.

(emphasis added)

This seems to confirm the conclusion that protection of the data against commercial use by the CC-BY-NC 4.0 license is afforded only if the copyright holder is in a jurisdiction that grants the sui generis database rights, that is, in the EU.
