4
$\begingroup$

Computers were originally built and programmed by developers who wrote in the ISO Latin alphabet, and as a result all modern computers are biased towards human-readable input in that alphabet.

If the original programmers had instead written in Devanagari, Hangul, Hanzi, or another writing system that is represented in Unicode through clumsy pre-composed characters, how would they have devised human-readable programming languages and command-line inputs biased in favor of their own writing system?

$\endgroup$
2
  • $\begingroup$ +1 on the answers below that conclude that the premise of your question is entirely wrong. The first programmable machines were not programmed with anything even remotely resembling written language. They were programmed with "words" written in binary... ones and zeroes... on/off... hole/no hole... plug/no plug. Assemblers and compilers translate from human-readable words to these computer-readable words. And they would not be any different no matter what writing system you use; the task is still the same: translate the human words into computer words. $\endgroup$
    – MichaelK
    Commented Nov 10, 2016 at 11:22
  • $\begingroup$ So your question is actually easy to answer: look at the keyboards for these other writing systems. Hangul, for instance, is no problem at all: it consists of a mere 40 letters and there are keyboards for it. Similarly, there are keyboards for Devanagari. Hanzi is trickier because each character is an entire word, but there is a solution around that with the Pinyin phonetic "spelling". $\endgroup$
    – MichaelK
    Commented Nov 10, 2016 at 11:28

5 Answers

7
$\begingroup$

You are incorrect about the role of the Latin alphabet. The first computers took their input in the form of punched cards; in fact, punched cards were in use for programmable weaving machines long before computers were invented. I think it's safe to say that your computer pioneers would invent a language to communicate with their computers, not build computers and inputs to suit the language. Let's look at our own history of development for ideas:

First, if their computer is binary-based (likely) then they'll invent a language of ones and zeros, or holes and not-holes, or black and white, or ups and downs, or tones and silence. Perhaps several of these. It's the easiest (and at first the only) way to work with primitive machines.

The next stage is that the programming language no longer matches one-to-one with the program in memory. Instead, a sufficiently advanced program on the computer (a compiler) takes a more complicated language (easier for humans to read and write) and translates it into the binary language. Our history produced assembly: a set of names for the "slots" or storage spaces (registers) in the computer, and short names for the commands which work on them. Each instruction in the language and its arguments are literally translated into ones and zeros, which work directly in the CPU. Dealing with literal addresses in memory is pretty inflexible, so the concept of names and labels was introduced, along with a tool called a linker. It assigns addresses and keeps track of them, meaning you can refer to your variables and numbers by name, not location.
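To make that concrete, here is a minimal sketch in Python of the translation step, assuming an invented 8-bit instruction format; the opcodes and register names are made up purely for illustration, not taken from any real machine:

```python
# A toy assembler for an invented 3-instruction machine, showing how mnemonics
# translate one-to-one into bit patterns. Opcodes and register names are made up.
OPCODES   = {"LOAD": 0b01, "ADD": 0b10, "STORE": 0b11}
REGISTERS = {"R0": 0b00, "R1": 0b01, "R2": 0b10, "R3": 0b11}

def assemble(line: str) -> str:
    """Turn e.g. 'ADD R1 R2' into an 8-bit word: 2-bit opcode + two 2-bit register fields."""
    op, *args = line.split()
    word = OPCODES[op] << 6                     # opcode in the top two bits
    for i, reg in enumerate(args):
        word |= REGISTERS[reg] << (4 - 2 * i)   # register fields below it
    return f"{word:08b}"

print(assemble("ADD R1 R2"))   # -> 10011000
```

Nothing in this translation cares what the mnemonics look like; "ADD" could just as well be a Devanagari or Hangul word, as long as it maps to the same bits.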

You can see by this point that a few critical linguistic concepts are in play: verbs describing actions on data, which is named. A further class of instruction involves decisions: if the result of something is zero, go to a different memory address (a different area of the program) and start executing the instructions there; otherwise continue. This, more or less, is the full set of instructions you need to solve pretty much anything (a.k.a. being Turing complete). You will need your language and computer to be Turing complete if you want a proper, flexible computer, but further refinements to the commands and language are basically for convenience, ease of use and speed of execution, not really essential to solving problems.
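As a sketch of how little is needed, the toy interpreter below (with invented instruction names, not any real machine's) has only decrement, conditional jump and unconditional jump, yet it can already express a loop that runs until a value reaches zero:

```python
# A minimal interpreter with just enough instructions to show conditional
# branching ("if the result is zero, go to a different address"). The
# instruction names are invented for illustration.
def run(program, memory):
    pc = 0  # program counter: index of the instruction being executed
    while pc < len(program):
        op, *args = program[pc]
        if op == "DEC":            # subtract 1 from a memory cell
            memory[args[0]] -= 1
        elif op == "JZ":           # jump to args[1] if cell args[0] is zero
            if memory[args[0]] == 0:
                pc = args[1]
                continue
        elif op == "JMP":          # unconditional jump
            pc = args[0]
            continue
        pc += 1
    return memory

# Count cell 0 down to zero: loop until memory[0] == 0.
print(run([("JZ", 0, 3), ("DEC", 0), ("JMP", 0)], {0: 5}))  # -> {0: 0}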

Depending on how your language works, and how advanced the computers are, some kind of symbolic references for the instructions and names would be developed. Perhaps one key for each instruction is enough (a simple computer typically has some tens of instructions), plus some way to compose variable names (combinations of phonetic sounds?); using plain numbers as a worst case would be prevalent, much like the Latin character set got itself firmly embedded in modern computers. I find it hard to believe their primitive computers would be required to support thousands of characters just because their language demands it; they'd more likely develop a reduced, probably phonetic, character-set representation of their language as part of their computer development.
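Hangul is a good illustration of how a small phonetic key set can cover a huge precomposed repertoire: modern Unicode builds all 11,172 syllables arithmetically from roughly 40 jamo. The formula below is the modern Unicode one, offered purely to illustrate the principle, not as something an early machine would have implemented:

```python
# Compose a precomposed Hangul syllable from jamo indices using the standard
# Unicode arithmetic: 19 leading consonants, 21 vowels, 28 trailing consonants
# (index 0 meaning "no trailing consonant").
def compose_hangul(lead: int, vowel: int, tail: int = 0) -> str:
    if not (0 <= lead < 19 and 0 <= vowel < 21 and 0 <= tail < 28):
        raise ValueError("jamo index out of range")
    return chr(0xAC00 + (lead * 21 + vowel) * 28 + tail)

# Lead 0 (ㄱ), vowel 0 (ㅏ), tail 4 (ㄴ) -> the syllable '간'
print(compose_hangul(0, 0, 4))
```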

It's been said that had we discovered binary mathematics and Boolean logic sooner, our computer development would have occurred much sooner too. If your culture cannot find a cut-down way to represent commands and variables, that will likely prevent them from developing computers.

Only much, much later would computers be developed enough to handle natural language. Look at what we have today: you can (in a limited sense) say what you want and have the computer speak the response back to you. That took 50 years of breakneck-speed development to achieve. Natural language handling is an extremely advanced computer-science field; the fundamental computer design was long, long established before the input matured.

$\endgroup$
1
  • $\begingroup$ Punch cards (and yes, I've used them) are just a way of encoding (a limited set of) characters. Not Latin ones, but English alphabetic, numeric, and some punctuation symbols. Basically what's on your US keyboard. $\endgroup$
    – jamesqf
    Commented Nov 10, 2016 at 4:15
4
$\begingroup$

"Dead keys"

While the visual representation may (or may not; there are literally countless ways it could have been accomplished) rely on pre-composed characters, the input style for abugida languages would almost certainly have relied on a "dead key" or similar input system: typing a character that modifies the next character typed results in nothing happening on your screen, until you type the subsequent character and the modified representation appears.

For example, on the French (fr-fr) keyboard, the character '`' is a "dead key": You type it, and nothing shows up. Follow it with e.g. the letter 'e', however, and you get "e grave": 'è'. Type a key that the grave accent can't modify, though -- say, 'd' or ' ' (spacebar) -- and the '`' shows up as a separate character, just like when you type that on a US (en-us) keyboard.
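Here is a minimal sketch of that behaviour, assuming a tiny made-up compose table; real layouts have far more entries, but the state machine is the same:

```python
# Dead-key handling: a pending dead key either combines with the next key
# or falls through as a literal character. The compose table is illustrative only.
COMPOSE = {("`", "e"): "è", ("`", "a"): "à", ("`", "u"): "ù"}

def type_keys(keys: str) -> str:
    out, pending = [], None
    for key in keys:
        if pending is not None:
            # Combine if possible, otherwise emit the dead key plus this key.
            out.append(COMPOSE.get((pending, key), pending + key))
            pending = None
        elif key == "`":
            pending = key        # dead key: show nothing yet
        else:
            out.append(key)
    if pending:
        out.append(pending)      # a trailing dead key is emitted as itself
    return "".join(out)

print(type_keys("p`ere"))   # -> "père"
print(type_keys("`d"))      # -> "`d" (the grave accent cannot modify 'd')
```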

While modern examples of "dead keys" (to my knowledge) all follow this pattern of "dead key modifies the next letter typed", there's no reason why an input system couldn't flip that: A regular key that has been typed -- and is already displayed -- could of course be modified by a subsequent "dead key". For instance, the letter sequence 'ae' in some older languages used to be always represented as the grapheme 'æ'; in a "reverse-dead key" input system, typing an 'a' could display the character 'a' just as you expect, but a subsequent 'e' could result in that 'a' being replaced on your screen by 'æ' instead of showing 'ae'.

This could be made more complex by allowing the keyboard's modifier keys to modify the dead keys; for instance, maybe 'e' positions the diacritic to the right, while Shift+'e' positions it above, Ctrl+'e' below, and Alt+'e' positions it left. (NB: Your abugida-based keyboard would more likely have modifier keys specifically for this purpose, rather than trying to clumsily re-purpose keys that have specific purposes themselves as I am doing here purely for demonstration purposes.) Or, the sequence might matter: Typing 'ke' could put the diacritic on the right, while typing 'ek' puts it on the left; 'kE' (that is, 'k' followed by Shift+'e') could put it on the top, while 'Ek' would put it on the bottom.
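A sketch of the order-sensitive idea, using an entirely hypothetical two-key grammar and placeholder output names rather than any real orthography:

```python
# Order-sensitive composition: the relative order (and case) of the consonant
# and vowel keys selects where the vowel sign attaches. All mappings here are
# invented placeholders for demonstration.
PLACEMENT = {
    ("k", "e"): "k + e-sign-right",
    ("e", "k"): "k + e-sign-left",
    ("k", "E"): "k + e-sign-above",
    ("E", "k"): "k + e-sign-below",
}

def compose_pair(first: str, second: str) -> str:
    return PLACEMENT.get((first, second), first + second)

print(compose_pair("k", "e"))   # right placement
print(compose_pair("E", "k"))   # below placement
```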

The exact number, purpose, and description of modifier keys, as well as the possible function of varying the order of your key presses, will of course vary wildly depending on the specific needs of your language: Devanagari would need all 4 diacritic positions, for example, while Khmer may only need one. The complexity of your written language will necessarily inform the complexity of your input system.

$\endgroup$
1
  • $\begingroup$ That's still alphabetical or featural. $\endgroup$
    – JDługosz
    Commented Nov 10, 2016 at 12:12
2
$\begingroup$

Early computers used punch cards, tape or other primarily numerical inputs, so the language, alphabet and syntax of the programmer's native language were largely irrelevant.

Even now, actual programming languages are really more about symbolic logic than linguistic syntax, and even where you get recognisable English words it would be trivial to replace them with any arbitrary symbol.

Dealing with language proper happens at quite a high level and has relatively little to do with the basic architecture of a computer.

There is also the fact that, in terms of keyboard input, a language with a fairly small alphabet fundamentally lends itself better to mechanical input, but this applies as much to typewriters and movable-type printing and isn't special to computers, as it is easy to map one letter to one key or print slug.

This isn't limited to the Latin alphabet: any alphabet-based writing system, be it Cyrillic or Viking runes, would work just as well.

The key thing here is that on a fundamental level the 'true' language of computers is binary, as they are really just a very complicated arrangement of switches, so any alphabet ultimately needs to be mapped to binary numbers.

Also bear in mind that computers had a long history of pure number-crunching before they started dealing with proper alphabets as a direct input, in the sense of word processing as we understand it.

If we are limiting ourselves to relatively simple input systems, then there are a few possible solutions for languages with a large number of characters.

1) You have a keyboard with a large number of permutations: either just a very big keyboard or a more sophisticated arrangement of modifier keys, some variation on stenotype machines being one possible approach.

2) Use a phonetic system, so you are transcribing the spoken language and cutting out the alphabet/characters altogether; note that many languages developed some analogue of this long before computers were common, e.g. Japanese Katakana (see the sketch after this list).

3) A direct character input system, perhaps via a matrix of buttons; obviously this is a lot easier with modern touch-screen/graphics-tablet technology.

4) A contextual menu / pointer system to select characters from a table.
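As mentioned in option 2, a phonetic front end can drive character selection. Here is a minimal sketch, assuming a tiny made-up candidate table; real input methods use large dictionaries and frequency models, but the selection loop is the same idea:

```python
# Phonetic ("type the sound, pick the character") input: the user types a
# syllable, the system lists candidate characters, the user picks by number.
# The candidate table here is a tiny illustrative sample.
CANDIDATES = {
    "ma": ["妈", "马", "吗", "麻"],
    "shu": ["书", "树", "数"],
}

def suggest(syllable: str):
    """Return the candidate characters for a typed syllable."""
    return CANDIDATES.get(syllable, [])

def pick(syllable: str, choice: int) -> str:
    """Simulate the user pressing a number key to pick a candidate."""
    return suggest(syllable)[choice - 1]

print(suggest("ma"))      # ['妈', '马', '吗', '麻']
print(pick("shu", 1))     # '书'
```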

It may also be instructive to look at the computing developments made at Bletchley Park, the British code-breaking centre during WWII, where the first programmable computers were developed (although secrecy meant that they had relatively little impact on the wider development of computing). There is an irony here: although they were concerned with processing language and encryption, all of the actual inputs were made by punched cards and tape, patch leads and mechanical switches, and the actual encoding of the text was a manual operation.

$\endgroup$
1
  • $\begingroup$ bah, you beat me to the punched cards .. :) $\endgroup$
    – Innovine
    Commented Nov 9, 2016 at 20:14
1
$\begingroup$

The question has several erroneous assumptions.

  1. "Computers were originally built and programmed by developers who wrote in the ISO Latin alphabet".
  2. "...which is represented in Unicode..."

The first computers were originally built decades before ISO Latin existed. The same goes for #2. When the earliest computers were built, they were "programmed" with switches and plugs. They weren't much more than calculating machines. Internally, computers manipulate switches which are either on or off - binary. It's irrelevant what language you start with; you still have to find a way to reduce it to a set of ones and zeros that represent symbols the computer can manipulate.

https://en.wikipedia.org/wiki/Colossus_computer
https://en.wikipedia.org/wiki/History_of_programming_languages
https://en.wikipedia.org/wiki/ISO_basic_Latin_alphabet
https://en.wikipedia.org/wiki/Unicode

$\endgroup$
1
$\begingroup$

The real problem is memory. You can encode all the characters used in early programming languages - and still in most languages today, outside of quoted strings - in 8 (really 7) bits. To use a language with many more characters, you need to go to at least 16 bits, which doubles the size of your program code. When your mainframe computer has a whopping 8K or 16K bytes of memory, every bit counts.
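A quick back-of-the-envelope illustration of that doubling, using modern Python encodings purely to show the arithmetic:

```python
# The same short program text stored in a 7/8-bit encoding versus a
# 16-bit-per-character encoding: the wide encoding costs twice the memory.
source = 'PRINT "HELLO"\n'          # 14 characters

ascii_bytes = len(source.encode("ascii"))      # 1 byte per character
utf16_bytes = len(source.encode("utf-16-be"))  # 2 bytes per character

print(ascii_bytes, utf16_bytes)   # 14 vs 28 -- the program text doubles in size
```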

There's another problem with output. Early displays and printers typically used an 8x8 pixel matrix. You just can't make many recognizable characters with that. If you increase the pixel density (which manufacturers couldn't do affordably in those days), you'd still need sufficient memory in your hardware to store all the character bitmaps.
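To see how coarse an 8x8 cell is, here is one glyph rendered row by row; the bitmap itself is hand-drawn purely for demonstration:

```python
# An 8x8 character cell: one byte per row of pixels, most significant bit on
# the left. The 'A' shape below is invented for illustration.
GLYPH_A = [
    0b00011000,
    0b00100100,
    0b01000010,
    0b01111110,
    0b01000010,
    0b01000010,
    0b01000010,
    0b00000000,
]

for row in GLYPH_A:
    print("".join("#" if row & (1 << (7 - bit)) else "." for bit in range(8)))
```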

Bottom line is that they'd have had to develop their own symbolism using a limited character set, to deal with the limitations of their hardware.

$\endgroup$
