I have been looking for open source versions of the Bible written in Hebrew, with annotations such as the literal English meaning of each Hebrew term, the definition of each Hebrew term, the part of speech, etc. Some things I've found:

  • Hebrew Bible text: There is a public domain codex for the Hebrew Bible available online in text format.
  • Hebrew Bible dictionary: There is the "Strong's Hebrew Dictionary", made in the late 1800s, which is public domain as well.
  • Interlinear translation: I found the Berean Bible, which is not public domain. Its license basically says you can use it in software, as long as you attribute them, but you can't sell it directly. An interlinear translation is basically the literal meaning of each Hebrew word in English.
  • Part of speech mapping: The OpenScriptures project has mapped the Hebrew Bible text to the Hebrew dictionary, which I think you can do pretty easily and automatically, given that the two data sources are public domain and online in a text format usable by software. But they put a CC v4 license on it (attribution required, but commercial use is okay).
  • Combining it all: The OpenHebrewBible project combines all of that into one package and puts a CC v4 non-commercial license on it, partly because the interlinear translation says something to that effect in the free rendition of their product.

Now, the first part of my question is: say I want to create the same thing as "Combining it all", but I also want to release it as public domain. I can use the public domain Hebrew Bible text and dictionary to get started, basically mapping words to definitions. But I can't use the interlinear translation. I could hire some people to make an interlinear translation, but it would probably end up being pretty close to the Berean Bible, because an interlinear translation gives only about 1-3 English words per Hebrew word, so many of the translations are going to be identical. Then there is the part of speech mapping, which I could hire people to help write for each word.
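To make the "mapping words to definitions" step concrete, here is a minimal sketch in Python. It assumes each Hebrew word has already been tagged with a Strong's number, which is the hard part in practice. The tiny lexicon, the sample tokens, and the `annotate` helper are all made up for illustration and are not taken from any of the projects listed above.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Entry:
    """One dictionary entry: the Hebrew lemma and a short English gloss."""
    lemma: str
    gloss: str


# Illustrative sample only: three entries keyed by Strong's number.
# H853 is deliberately left out to show what happens on a missing entry.
LEXICON: Dict[str, Entry] = {
    "H7225": Entry("רֵאשִׁית", "beginning"),
    "H1254": Entry("בָּרָא", "to create"),
    "H430": Entry("אֱלֹהִים", "God"),
}

# Assumed input format: (surface form, Strong's number) pairs for one verse.
VERSE: List[Tuple[str, str]] = [
    ("בְּרֵאשִׁית", "H7225"),
    ("בָּרָא", "H1254"),
    ("אֱלֹהִים", "H430"),
    ("אֵת", "H853"),
]


def annotate(tokens: List[Tuple[str, str]],
             lexicon: Dict[str, Entry]) -> List[Dict[str, Optional[str]]]:
    """Attach a gloss to each tagged word; the gloss is None if not found."""
    rows = []
    for surface, strongs in tokens:
        entry = lexicon.get(strongs)
        rows.append({
            "word": surface,
            "strongs": strongs,
            "gloss": entry.gloss if entry else None,
        })
    return rows


if __name__ == "__main__":
    for row in annotate(VERSE, LEXICON):
        print(row["strongs"], row["word"], row["gloss"])
```

The join itself is trivial once the tags exist. The effort, and whatever is arguably creative, lies in assigning the right number, gloss, and part of speech to each inflected word.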

So say I hired people to make an interlinear version and do the part of speech tagging, and I released the result as public domain. How is that any different from just taking the "Combining it all" project and releasing that as public domain? I mean just the data (not any software related to it).

The second part of the question is: isn't all of this kind of material treated as "generic data" in legal terms, that is, "public common knowledge" that isn't copyrightable? For example, the part of speech tagging is common knowledge to anyone who speaks Hebrew. And the interlinear English translation is pretty much common knowledge to any translator, although different translators could come up with different end results.

So I don't understand how I can safely/effectively go about creating a public domain version of the annotated bible written in Hebrew, given this data is common knowledge and yet there are projects which have put licenses on various aspects of this common knowledge. What are the laws or high-level factors to consider in such a dilemma?

1 Answer


Limiting this to the legal question, it is actually extremely simple. The Hebrew Bible is in the public domain, end of story. Therefore you can freely create a derivative work, a translation to English if you like, based on that original text. You may have to use some copyrighted aids, such as a dictionary, but that's okay, because "using a dictionary" is not copyright infringement. You may want to use some copyrighted software to format the result, but again using software is not itself copyright infringement. It would be infringement to lift someone else's protected English translation, but it would not be infringement to consult competing translations as a means of better understanding the original text.

It is true that in the case of a translation from language to language, there is going to be a substantial similarity in translations to a particular language. This is especially true when dealing with a text that has been independently translated to English hundreds of times. You may therefore have to prove that there is only one possible English translation of בְּרֵאשִׁ֖ית בָּרָ֣א אֱלֹהִ֑ים אֵ֥ת הַשָּׁמַ֖יִם וְאֵ֥ת הָאָֽרֶץ׃ and that there are only two or three reasonable translations of וְהָאָ֗רֶץ הָיְתָ֥ה תֹ֙הוּ֙ וָבֹ֔הוּ וְחֹ֖שֶׁךְ עַל־פְּנֵ֣י תְהֹ֑ום וְר֣וּחַ אֱלֹהִ֔ים מְרַחֶ֖פֶת עַל־פְּנֵ֥י הַמָּֽיִם׃, in other words, that the similarity is a necessary coincidence, given the subject matter.
