68
$\begingroup$

The most commonly used word in English is "the", accounting for about 6% of all words used. The second most common word, "be", accounts for less than 1% of all words used (see the Ngram Viewer and Wikipedia > Most common words in English).

Imagine a human society with a widespread language whose most commonly used word accounts for over a third, or even over half, of all words used. How could such a language come into existence, and how could it sustain itself?

$\endgroup$
17
  • 7
    $\begingroup$ Note that Wikipedia's "be" includes all parts of the verb to be. In particular, if you add "is" to your ngram, you'll see that it's about twice as common as the literal word "be"; "was" is also slightly more common than "be". ngram for the/is/was/be $\endgroup$ Commented Oct 25, 2016 at 19:21
  • 30
    $\begingroup$ By smurfing, that's how. Smurf else? $\endgroup$
    – user26892
    Commented Oct 27, 2016 at 19:19
  • 4
    $\begingroup$ Scientific take: Chicken chicken chicken. $\endgroup$
    – yankeekilo
    Commented Oct 28, 2016 at 9:25
  • 3
    $\begingroup$ Obligatory XKCD link: xkcd.com/1007 $\endgroup$
    – Burgi
    Commented Oct 28, 2016 at 9:43
  • 5
    $\begingroup$ Modern popular media and informal, everyday conversation would sometimes make one think that the 4-letter f-word beats "the" hands down... $\endgroup$
    – frIT
    Commented Oct 28, 2016 at 10:57

18 Answers

95
$\begingroup$

Simple: make the language oligosynthetic. Oligosynthetic languages have only 50-200 words and simply combine them to create new concepts. For example, say I want to say "hell". In English I would say "hell"; in an oligosynthetic language, I would say inverse-help-place or hurt-place. In such a language, the word that inverts another word (like the English prefixes a-, un-, or in-) would occur in 50% of nouns.

This is similar to what o.m. suggests, but his words would appear only around 30% of the time.

$\endgroup$
7
  • $\begingroup$ This doesn't match the OP's description. You might as well say that English meets the description because "the" occurs in more than 50% of sentences. $\endgroup$
    – ruakh
    Commented Oct 29, 2016 at 5:54
  • 10
    $\begingroup$ Don't know why, Orwell came immediately to my mind :D $\endgroup$ Commented Oct 29, 2016 at 12:20
  • 8
    $\begingroup$ Doubleplusgood answer! $\endgroup$
    – Jules
    Commented Oct 30, 2016 at 21:44
  • $\begingroup$ According to the wikipedia article, no known human languages are oligosynthetic and linguists doubt such a language would be practical. Since the question is how could it come into existence, I'm not sure this answers the question. $\endgroup$
    – Nick
    Commented Oct 31, 2016 at 15:27
51
$\begingroup$

No kurwa

Lower classes often have a curse word that can mean anything from agreement to joy to disagreement. In Polish, one word ("kurwa") can also mean, with small modifiers, drunk, angry, thrown out of a bar, and many more.

If you are creating a dystopia, using a curse word like this could be the way to go.

Nice to read: https://workout4brain.wordpress.com/2015/09/07/oh-kurwa-reflection-about-bad-words-in-polish-is-it-really-possible-to-translate - of course this barely scratches the surface.

By the way, the first paragraph means "isn't it obvious?". Another use of the one word.

$\endgroup$
9
  • 10
    $\begingroup$ Forgive me my Polish, but to quote a classic: "Za mało kurwa, kurwa!" ("Not enough kurwa, damn it".) $\endgroup$ Commented Oct 25, 2016 at 22:13
  • 7
    $\begingroup$ @JakubKonieczny I'm kinda torn. I can provide examples, like "kurewsko wkurwił tą kurwę" (3 out of 4 words are essentially "kurwa"), but it would make my answer... rude? $\endgroup$
    – Mołot
    Commented Oct 25, 2016 at 22:23
  • 4
    $\begingroup$ What about the f-word? youtube.com/watch?v=JXk9EPxZw48 $\endgroup$ Commented Oct 26, 2016 at 4:59
  • $\begingroup$ @Cristian similar indeed, but as far as I know, it has fewer meanings. $\endgroup$
    – Mołot
    Commented Oct 26, 2016 at 6:29
  • 1
    $\begingroup$ Except in Australia, where the f-word has so effing many more meanings $\endgroup$
    – paulzag
    Commented Oct 26, 2016 at 21:45
42
$\begingroup$

Single-line answer:

Oook

  • the librarian of the Unseen University, Ankh-Morpork.

In Terry Pratchett's Discworld books, there is an orangutan who works as the librarian of the Unseen University's library. He has his own language, consisting entirely of "ook", "oook" and "eeek". Though his language has only 3 words, some people from the UU understand him, because each word has many, many meanings. It all depends on the intonation: for example, "Eeek." means "No.", while "Eeek!" means "I'm not a monkey!", and "Ook" can mean "Yes." or, for example, "Give me that banana.".

I actually tried something similar some time ago with a group of ~50 people, mainly kids, at a 22-day summer camp (no phones, no electricity, no connection to civilization): I let the people choose the only 20 words they could use during the whole camp. And believe me, it worked. For example, "yes" meant two things: "yes", and "no" (when said ironically). And so on; a single word can have tens of meanings.

$\endgroup$
11
  • 25
    $\begingroup$ I imagine the orangutan considers his language to have far more than three words. In Mandarin, for example, words that sound the same to me are considered clearly distinct words because of intonation differences. Not "the same word with intonation differences", but "different words". So this answer's effectiveness would depend largely on the definition of "word" being used. $\endgroup$
    – MichaelS
    Commented Oct 25, 2016 at 22:51
  • 32
    $\begingroup$ Tell me about it: mā, má, mà, mǎ. I called my mother-in-law a horse-mountain in my first attempt. $\endgroup$
    – JDługosz
    Commented Oct 25, 2016 at 23:38
  • 9
    $\begingroup$ He is most definitely an ape, and not a monkey. $\endgroup$
    – Burki
    Commented Oct 26, 2016 at 8:22
  • 2
    $\begingroup$ Nice try but I would consider Yes and "ironic" yes to be two different words even though they share the same letters. simple.wikipedia.org/wiki/Tone_language $\endgroup$
    – Pieter B
    Commented Oct 28, 2016 at 8:43
  • 5
    $\begingroup$ I want to go to your summer camp. $\endgroup$ Commented Oct 28, 2016 at 13:27
14
$\begingroup$

Assume a language where all nouns can be used as verbs or adjectives, and vice versa. There are special words to indicate the use. Those three words would be rather common.

$\endgroup$
6
  • $\begingroup$ You mean, like the "the" pronoun? $\endgroup$ Commented Oct 26, 2016 at 14:47
  • $\begingroup$ @JanDvorak, what I thought of was a language where "I sit on the chair" becomes "I (verb modifier) sit on the (noun modifier) sit." The word chair is replaced by thing for sitting. $\endgroup$
    – o.m.
    Commented Oct 26, 2016 at 15:05
  • 1
    $\begingroup$ "The" is a pronoun in what situation? $\endgroup$
    – The Nate
    Commented Oct 27, 2016 at 7:38
  • 2
    $\begingroup$ @TheNate The The $\endgroup$
    – Burgi
    Commented Oct 28, 2016 at 9:48
  • $\begingroup$ Well played, @Burgi: ht. $\endgroup$
    – The Nate
    Commented Oct 28, 2016 at 9:50
9
$\begingroup$

In Thai, you end pretty much every sentence with ครับ (if you are male), ค่ะ (if you are female) or คะ (if you are female and the sentence is a question).

These words have no English translation and don't alter the sentence's meaning in any way, but omitting them is considered impolite or, when talking to a person of superior status, even rude. Besides ending most sentences, they can also mean yes, OK, please, thank you, and I see.

I don't think ครับ et al. actually account for one third of all spoken words in Thai, but it wouldn't be hard to imagine a language that takes this extra step. If you don't require your most common word to carry a meaning (it could be a formality, a nearly universal response, a common interjection, or some kind of pronounceable punctuation), one third should be plausible.

$\endgroup$
8
$\begingroup$

While one can imagine a language where a single word is that common, it is hard to imagine the language staying that way.

People tend to shorten words that are used often; very common words may be shortened right out of existence. When everybody understands what you mean anyway, there is no need to actually say the word.

If you listen to people speak, you may notice that (in some dialects) they very often drop "the". In a hundred years, people will drop it while writing too. In two hundred years, only scholars will understand what "the" means.

$\endgroup$
4
  • $\begingroup$ I agree with this answer. Basically using a single word often seems inefficient and development of languages is driven by efficiency. You want to maximize the information content of your communication so you must use all possible codes available, not just one. The will probably not die out but be used much less often and my guess is that people may still understand it although they'll think it to be kind of awkward. $\endgroup$ Commented Oct 26, 2016 at 11:21
  • 2
    $\begingroup$ Even in two hundred years, athletes who graduated from The Ohio State University will know what "the" means. $\endgroup$ Commented Oct 26, 2016 at 18:37
  • $\begingroup$ @Trilarion I don't know if that's not entirely true. Aside from raw efficiency, there is the question of fidelity, fewer bits per word can make the language more resistant to transmission losses and better convey tone. For instance, a lot of languages have articles like 'the' and 'a', even where they are in many contexts entirely superfluous, and have been for a long time. $\endgroup$ Commented Oct 28, 2016 at 13:03
  • $\begingroup$ @Williham Totland I agree that the optimum of a very dense packing need not be attained, but having 50% of the language consist of the same word implies an extremely low density, which all existing and used languages exceed by far. That's why I agree with this answer. But your idea is good too. If there are a lot of transmission losses, for instance if lots of people are nearly deaf or if there is a lot of other noise, one could maybe have a very low information density, which might be implemented by using the same word over and over. $\endgroup$ Commented Oct 30, 2016 at 7:49
6
$\begingroup$

The boundary between words is arbitrary. Officially, linguists define the boundary between words to be "wherever native speakers think there are boundaries".

Thus, the easiest answer to this is to define a binary language, with two words. 100% of our computers use a language like this, so it's clearly effective and sustainable.
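
To make the two-word idea concrete, here is a rough sketch (my own illustration, not anything standard: the name `bit_word_shares` and the sample text are made up, and it assumes each bit of a UTF-8 encoding counts as one "word"). With only two words available, the more common one can never drop below half of all words used.

```python
from collections import Counter

def bit_word_shares(text: str) -> dict[str, float]:
    """Treat every bit of the UTF-8 encoding of `text` as one 'word'
    and return the share of all words taken by "0" and by "1"."""
    # Assumption for this sketch: one bit = one word, 8 bits per byte.
    bits = "".join(f"{byte:08b}" for byte in text.encode("utf-8"))
    counts = Counter(bits)
    return {word: counts[word] / len(bits) for word in ("0", "1")}

# Hypothetical sample text; any non-empty string works.
shares = bit_word_shares("the")
top_word = max(shares, key=shares.get)
print(top_word, round(shares[top_word], 3))
```

Whatever text you feed it, the two shares sum to 1, so the larger one is always at least 50%.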

$\endgroup$
7
  • 2
    $\begingroup$ The only problem with this is that almost no "native speakers" (i.e. microprocessors in this analogy) work with individual bits; their smallest units are bytes (say nothing of the 16-bit word, 32-bit doubleword, or 64-bit quadword!). Bits are more like letters or phonemes. $\endgroup$ Commented Oct 26, 2016 at 10:24
  • $\begingroup$ @2012rcampion True. There are interesting corollaries though. Consider the comma codes of 8b/10b, which protocols like InfiniBand spam whenever the link is idle, so that the link can maintain its timing. Perhaps an "idle word" might make sense in some environments. We have similar words like "um" in the English language. $\endgroup$
    – Cort Ammon
    Commented Oct 26, 2016 at 19:12
  • $\begingroup$ Turning to CPU code is not so bad. mov occurs really frequently. $\endgroup$
    – Joshua
    Commented Oct 27, 2016 at 20:06
  • $\begingroup$ @2012rcampion your distinction of a word is entirely arbitrary though. If a word is used commonly enough to have no inherent meaning without modifiers, then it effectively becomes the same as a letter and the combination of that 'word' and modifier becomes a word $\endgroup$
    – JamesRyan
    Commented Oct 28, 2016 at 10:34
  • $\begingroup$ @James I can't tell if you're agreeing or disagreeing with me $\endgroup$ Commented Oct 28, 2016 at 11:40
6
$\begingroup$

In addition to the other answers, further possibilities are:

Accentuation

This is only valid if you would regard words as the same if they only differ in accentuation: heihohi could mean like 9 different things depending on how the syllables are spoken (a bit like Chinese/Vietnamese). However, in written form you should then find differences.

heihòhi héihohi heihohí. - It's very nice today.

Position in sentence

Oi at the beginning of a sentence could mean "to" and at the end "not" and in the middle some form of be.

Oi oi or oi oi. - To be or be not.

Number of repeats

No could mean no and "no no" could mean really no.

A dog and a a cats make a a a animals?

Some empty phrases that are required for some reason

Xuxu might mean "Listen to me" and must be placed after every other word. Oki might mean "That's clear."

The earth oki and the sun oki are very big oki oki.

Some important religious concept

Om says, that the Omnious ways tell us that the highest flow of Om giving the best crop is in spring and Om will provide us with everything, Om will help us, Om, so let's Om.

All in all, it all sounds a bit strange to us; the biggest problem is surely keeping the language from seeming overly redundant, which might look too artificial.

$\endgroup$
4
  • $\begingroup$ On the point of repetition, this is actually a fairly common component of a lot of languages, called reduplication. It's especially common in Pacific languages. $\endgroup$
    – cbh
    Commented Oct 28, 2016 at 0:10
  • 2
    $\begingroup$ Re the first point: in tonal languages (e.g. Chinese/Vietnamese), words with different tones are different words! Calling them “the same word, spoken with different intonation” is like describing hat, hot, hit, and hut as “the same word, spoken with different vowels”. $\endgroup$ Commented Oct 28, 2016 at 11:24
  • $\begingroup$ -1 for saying "[a word] could mean like 9 different things depending on how the syllables are spoken" because that's what makes words different words. $\endgroup$ Commented Oct 30, 2016 at 7:07
  • $\begingroup$ @Azor-Ahai I think you are a bit overly critical here since the OP hasn't exactly given a definition of when two words are different to him, but I will add a comment to make that clear, so thanks for your comment. $\endgroup$ Commented Oct 30, 2016 at 7:37
5
$\begingroup$

Since it has not been mentioned yet: Phrasal containers. I can't point to a good natural language containing rigid ones, but in computer languages, they're omnipresent.

Imagine if ( ), { }, and the beginning and end of sentences were marked by some word or particle rather than by inflection or tone.

The stringing of adjectives to their nouns by “-e-” in Persian comes to mind as a sort of analog. I believe (?) Japanese has some similar particles like “o, wa, no” that serve similar grammatical purposes.

This sentence in English has a few segments that could be delimited in various ways.

(sentence {this-one) {in (language English)) has segments {how-many? few) {“that”-subordinating delimited-in {could-be) ways {how-many? various)))

Using α/ε and ω arbitrarily, and assuming that ε is an inflected/tonal variant of the “word” α, and that there's an inflected/tonal variant ω´ for closing all open phrases at once:

α sentence ε this-one ω ε in α language English ω-ω has segments ε how-many? few ω ε that-subordinating delimited-in ε could-be ω ways ε how-many? various ω´

That makes α/ε and ω collectively make up around half of all words.
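
For anyone who wants to check the "around half" figure, here is a quick counting sketch. The function name `marker_share` and the counting rule are my own assumptions: every α, ε, ω (or ω´) counts as one word, "ω-ω" counts as two, and hyphenated glosses like "this-one" count as one word each.

```python
MARKER_LETTERS = {"α", "ε", "ω"}

SENTENCE = (
    "α sentence ε this-one ω ε in α language English ω-ω has segments "
    "ε how-many? few ω ε that-subordinating delimited-in ε could-be ω "
    "ways ε how-many? various ω´"
)

def marker_share(sentence: str) -> tuple[int, int]:
    """Return (marker_words, total_words) under the counting rule above."""
    markers = 0
    total = 0
    for token in sentence.split():
        parts = token.split("-")
        # A token is a marker token if every hyphen-separated part
        # starts with one of the marker letters (covers "ω-ω" and "ω´").
        if all(part[:1] in MARKER_LETTERS for part in parts):
            markers += len(parts)
            total += len(parts)
        else:
            total += 1  # a content word, hyphenated or not
    return markers, total

markers, total = marker_share(SENTENCE)
print(f"{markers} of {total} words are markers ({markers / total:.0%})")
```

Under these assumptions the markers come to 14 of 29 words, a little under half; a different rule for what counts as one word would shift the number somewhat.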

$\endgroup$
5
$\begingroup$

The best example and explanation I can think of (right now) is from the TV show Rick and Morty.

Which, as you can see in that clip, can lead to confusion for someone new to the language (but presumably no problem for those fluent).

Essentially, context is the key. For example, if I said to you "I squanched my leg badly in soccer last night", you would likely understand that I hurt my leg last night (even more obvious if I were present and had a cast/splint/bandage on my leg).

Usage as a verb/noun/adverb would also be determined by context of the sentence:

  • Verb: "I squanched my leg badly in soccer last night"
  • Noun: "I hurt my squanch badly in soccer last night"
  • Adverb: "I hurt my leg squanchly in soccer last night"

As for it being a third or more of a language...

"I squanched my squanch squanchly in soccer last night" may not be so easily understandable, and that's just a third, without physical presence to provide extra context.

However, being more verbose could help to clarify it: "I squanched soccer last night, and squanched my squanch squanchly during the squanch" makes it more obvious what I mean ("I played soccer last night, and hurt my leg badly during the match/game"), particularly if I'm present with a cast/splint/bandage.

$\endgroup$
2
  • 3
    $\begingroup$ Alternately, of course, your more verbose version could read: "I watched soccer last night, and lost my crunchy peanuts during the bathroom break." $\endgroup$
    – subrunner
    Commented Oct 26, 2016 at 20:40
  • $\begingroup$ Very much so, yes. It's not perfect, but it could help. Or be of no help whatsoever :P. $\endgroup$
    – Daevin
    Commented Oct 26, 2016 at 20:45
3
$\begingroup$

Instead of using silence as the delimiter between phrases, make the delimiter another word, not unlike the use of 'over' rather than silence to delimit conversations over radio.

$\endgroup$
2
$\begingroup$

Depending on the sorority, the word "like" can take up anywhere from 10% to 40% of all words used. For fraternities, the same can be said for the word "bro".

$\endgroup$
2
$\begingroup$

Non-verbal communication (facial expression, body language/positioning, gestures, etc.) plays a huge part in many cultures, altering the meaning of words and phrases, and sometimes eliminating the need for spoken language at all.

Perhaps your humans evolved in extremely difficult/broken terrain or live in widely spaced trees, and most of their communication (outside their own family?) is done outside of shouting range but within visual range, with or without tools such as semaphore flags, Morse code, smoke signals, etc. Poor eyesight or mobility would severely limit your ability to communicate.


Others have already mentioned context (words can have different meanings depending on how, when, and where they're used) and intonation (similar words in tonal languages sound identical to the untrained ear).


Outside of movies (or audio books, I suppose), it would be very difficult to world-build with a focus on tonality: unless the rest of your story is extremely compelling, very few people are inclined to learn to read diacritics or speak Klingon (for example) in order to grasp the nuances of your story.

Same goes for completely non-verbal communication in writing, though it could be interesting for graphic novels - writing a gripping scene about exactly how a main character was "wiggling his elbows while tilting his head in the 3rd position to show sympathy" would be pretty hard to pull off. Supposedly all ballets tell an unspoken story through dance and music though, so it's not impossible.

$\endgroup$
1
$\begingroup$

Pure/good/virtuous

These sorts of words can be applied to practically anything. There may be a superstition (or even magical basis) for these words helping ward off evil and lead to favourable outcomes. It helps if the language makes no distinction between adjectives and adverbs ("pure" and "purely" are the same word).

E.g. My pure wife purely brought pure-home purely 4 pure baskets of pure fish!

The word can even be repeated a few times when the speaker is particularly concerned about something, or wants to show particular respect for someone/something. Additionally, the word (possibly repeated) could be used as a greeting and farewell. "Good good good good good!" could imply "This event [our meeting] is excellent!"

With-God

For religious reasons, it may be expected to say "with God" (which may be a single word), or something similar, about practically everything.

Um

Usually people who are searching for words mix up their filler words ("you know", "like", "um", "uh", "well") so as not to be too repetitive. But there could conceivably be a culture where conventional wisdom states that there should be one way to say something, and the simplest way should always be used, meaning that people always say "um" when they are searching for words, and they search for words a lot because they are trying to find the simplest way to say things. (Ha!)

If the word rolled off the tongue nicely (like "mala") it might be repeated a lot while the person is thinking.

Emphasis word

A word that adds emphasis might be thrown into sentences frequently, sometimes repeated several times. If intonation and stress already serve other purposes in the language, then this usage could be reasonably long-lived.

Overuse of not

It is conceivable for sentences to be formed using negatives galore. For instance, "I'm not sick" could be phrased "not-others not-is not-healthy". Why would people do this? It might start out as humour, then turn into tradition. Maybe some revered hero or wise man spoke in this way and generations of people studied his teachings and emulated him. Maybe a bit far-fetched but somewhat plausible.

$\endgroup$
1
$\begingroup$

This may be a bit less plausible than some answers, but how about music?

La-la, la-la, la - la-la, la-lah-la-la-la lala la la, la-la, la la-la, la-la-la la-la, la-la, lala-la la -la -la -la lah

So, when referencing music (tunes, rather than lyrics), some filler needs to be used, not for inherent meaning but just as a vehicle for the tone, pitch, and rhythm. The fact that it may be, essentially, pronounced differently (la, vs lala, vs la-la, or Lah) shouldn't make it a different word, because it itself doesn't have its own meaning, it is just filler, and the people using it wouldn't think of it as different words. Of course, to be the go-to for music, it (whatever filler it is, lala or dan-dahn dan, nana-na or something) should probably also be a word, perhaps a filler or emphatic or placeholder, since they tend towards short easy sounds anyway.

If a culture is pretty musical, and also high-context (where people are supposed to notice and reference, rather than spelling things out), some filler word might end up being a substantial part of their vocabulary - because they are essentially quoting bits of songs at each other (well known ones, for well known meanings, or obscure ones when sure of the audience), much as we use quotes or references, anywhere from in-jokes to obvious cultural references.

You would need to quote pretty often, to keep the percent at a third or a half of a conversation - but on the other hand, you can stack them up pretty much on top of each other, depending on how long the quoted music is.

$\endgroup$
1
$\begingroup$

Vietnamese can be quite confusing, as the sentence below shows:

Con hổ mang bò lên núi

The sentence above has 2 meanings: "A tiger brings a cow to a mountain" (meaning 1) or "A cobra goes up the mountain" (meaning 2). Why so many meanings? Because these are "same-sounding words": words that are pronounced the same way but have different meanings.

Another confusing sentence:

Con ngựa đá con ngựa đá, con ngựa đá không đá con ngựa.

So what does it mean? If you are a new Vietnamese learner, you will assume that the word đá is the verb "kick", so the sentence becomes this in English:

The horse kicks the horse kicks, the horse kicks doesn't kick the horse

Huh?? But đá isn't just a verb. It's also a noun. Since there is nothing (like the verb be) to mark whether it is a verb or a noun, Vietnamese can be very confusing.

$\endgroup$
0
$\begingroup$

It's pretty easy to do! Let's take English as base language:

  • add a new word "da" meaning that a given word is a verb
  • add a second word "na" meaning that a given word is not a verb

And let's use "We have never been to Asia driving cars, nor have we visited Africa riding bicycles" as test sentence:

"We-na have-da never-na been-da to-na Asia-na driving-da cars-na, nor-na have-da we-na visited-da Africa-na riding-da bicycles-na"

Based on this answer https://english.stackexchange.com/questions/55486/what-are-the-percentages-of-the-parts-of-speech-in-Rnglish you just need to create a word for #?*\$ parts of speech and another word meaning "part of speech different from all #?*\$" keeping in mind that the latter must sum up more than 50%.

Edit:

Another example could be a language spoken so fast that telling where one word ends and the next begins is difficult, so a word with a strange sound not used in any other word could be used as a "space":

"We-§-have-§-never-§-been-§-to-§-Asia-§-driving-§-cars-§-nor-§-have-§-we-§-visited-§-Africa-§-riding-§-bicycles"
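
As a back-of-the-envelope check on both schemes, here is a small sketch. The helper names `separator_share` and `tag_share` are made up for this illustration, the verb list is hand-labelled by me, and it assumes that the markers ("da", "na", "§") each count as separate words.

```python
from fractions import Fraction

def separator_share(n_words: int) -> Fraction:
    """Share of all tokens taken by a single separator word placed
    between n_words content words (n_words - 1 separators)."""
    return Fraction(n_words - 1, 2 * n_words - 1)

def tag_share(n_words: int, n_tagged: int) -> Fraction:
    """Share of all tokens taken by one tag word, when every content
    word is followed by exactly one tag and n_tagged of them get this tag."""
    return Fraction(n_tagged, 2 * n_words)

sentence = ("We have never been to Asia driving cars nor have we "
            "visited Africa riding bicycles")
words = sentence.split()
# Hand-labelled verbs for the test sentence (an assumption of this sketch).
verbs = {"have", "been", "driving", "visited", "riding"}
n_na = sum(1 for w in words if w.lower() not in verbs)

print("separator share:", separator_share(len(words)))
print("'na' share:", tag_share(len(words), n_na))
```

Counted this way, a tag that follows every word can reach at most half of all tokens, and a separator between n words takes (n - 1)/(2n - 1) of them, which approaches but never reaches 50% as sentences get longer.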

$\endgroup$
2
  • 1
    $\begingroup$ Why would this language support itself? Language redundancies tend to disappear over time. $\endgroup$
    – user8808
    Commented Oct 27, 2016 at 18:26
  • $\begingroup$ It wouldn't; as you said, redundancies tend to disappear, so this would be the case for any example we try to think of. Maybe a word to indicate verbs or nouns could remain in a language, but it would account for less than 20%. $\endgroup$ Commented Oct 28, 2016 at 8:34
0
$\begingroup$

The first rule, I think, is to make them speak less, so that the overall quantity of words is comparatively lower and, hence, the effective 30% can be composed of similar/nearly identical word(s).

Or you can just have a totalitarian government, where the ruler/authority needs to be addressed by the speaker every few words, irrespective of the case/mood of the sentence. If you have seen the movie The Dictator, you will get the point.

One more question I have is: do you want it to be the same word in different homophonic versions, or the same word, with the same pronunciation, meaning different things in different contexts?

$\endgroup$
1
  • $\begingroup$ If you have a question for the asker, you should make a comment on the question and wait for a response. Then, when the asker answers your question, you can give a more complete answer. $\endgroup$
    – Azuaron
    Commented Oct 28, 2016 at 11:40
