59

When did people start referring to an ordered group of characters as a "string"? Did this name come from before / outside of the computing field, or is it special to computing?

The metaphor is clear enough, I suppose: the characters are "strung together" like beads in a necklace. This comparison could describe anything in computing that takes place in sequence, which is a lot of things. Why did it get applied to sequences of keyboard characters in particular?

Does the name come from typesetting or something? I wonder why we didn't end up with a term like "text" or "phrase" that's more familiar to outsiders than "string" is.

9
  • 3
    It has to be older than computer terms being popular, since I remember the phrase “string a bunch of letters together” from the 1970s, before I knew anything about computers. And when I did take my first BASIC class (in 1980) the term was completely recognizable.
    – RonJohn
    Commented Jan 25, 2023 at 19:44
  • 17
    The earliest paper I’ve found in the ACM Digital Library referring to character strings was published in 1958 and doesn’t bother introducing the terminology, which suggests it was already in use by then. Commented Jan 25, 2023 at 20:02
  • 3
    stackoverflow.com/questions/880195/… has some guesses…
    – Jon Custer
    Commented Jan 25, 2023 at 20:08
  • 1
    Lots of things come in strings - flags, Christmas lights, hit records, ....
    – dave
    Commented Jan 26, 2023 at 1:29

7 Answers

32

The oldest occurrence I know of is from 1918, thus much older than those in the existing answers (at least for its use in mathematics and logic/computation). It is from the book:

C. I. Lewis. A Survey of Symbolic Logic. Berkeley: University of California Press, 1918.

For example, on p. 355, he writes (emphasis mine):

A mathematical system is any set of strings of recognisable marks in which some of the strings are taken initially and the remainder derived from these by operations performed according to rules which are independent of any meaning assigned to the marks. That a system should consist of ‘marks’ instead of sounds or odours is immaterial.

8
  • A great find from a fascinating book! Is the author quoting Whitehead and Russell? I can't quite tell if it's intended as a quotation. Commented Jan 26, 2023 at 15:57
  • 1
    @JohnSkilesSkinner No, I am quoting Lewis. He isn’t quoting anyone. Commented Jan 26, 2023 at 16:02
  • 6
    @JohnSkilesSkinner The oldest occurrence in a "computational" context I was able to find is from 1878, google.com/books/edition/… referring to a number so large that it is perceived just as a "string of digits" rather than a number with an understandable magnitude.
    – Leo B.
    Commented Jan 26, 2023 at 18:26
  • @LeoB. Your example of "string of digits" falls out of the category of abstract mathematical concept (there is nothing abstract about it) and into the category of "a number of objects arranged in a line". This sense was, as another answer notes, already recorded centuries ago. So your example is neither the first example of the use of "string" as an abstract mathematical concept nor as a term in English. Commented Jan 26, 2023 at 21:29
  • 4
    @Carl-FredrikNybergBrodda The question as formulated does not ask about the first example of the use of "string" as an abstract mathematical concept. You've invented that aspect yourself, and now you're forcing it on others. Don't do that.
    – Leo B.
    Commented Jan 27, 2023 at 9:12
28

This question was asked on Stack Overflow, but closed as off-topic there. Before it was closed, it received this answer (lightly edited by me):

I had guessed that "string" was in use by mathematicians long before its adoption in programming languages. Turing machines effectively operate on strings. Turing may not have used the term, but it is used everywhere in automata textbooks, going back decades.

The earliest reference I could find was a fragment in Google Books of a 1944 article, "Recursively enumerable sets of positive integers and their decision problems", by the logician Emil Post in the Bulletin of the AMS. I think there is little doubt that he is using "string" in the conventional sense used in computer science. Page 286 contains:

For working purposes, we introduce the letter b, and consider "strings" of 1's and b's such as 11b1bb1. An operation on such strings such as "b1bP produces P1bb1" we term a normal operation. This particular normal operation is applicable only to strings starting with b1b, and the derived string is then obtained from the given string by first removing the initial b1b, and then tacking on 1bb1 at the end. Thus b1bb becomes b1bb1.

Paul Callahan
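
To make the quoted rule concrete, here is a minimal sketch in Python (mine, not Post's or Callahan's) of that single normal operation; the function name and the hard-coded prefix/suffix are purely illustrative:

```python
# A normal operation in Post's sense: the rule "b1bP produces P1bb1" applies
# only to strings that start with "b1b"; the derived string drops that prefix
# and tacks "1bb1" on at the end.
def normal_operation(s, prefix="b1b", suffix="1bb1"):
    if not s.startswith(prefix):
        return None               # the rule is not applicable to this string
    remainder = s[len(prefix):]   # this is Post's P
    return remainder + suffix

print(normal_operation("b1bb"))     # -> "b1bb1", matching Post's own example
print(normal_operation("11b1bb1"))  # -> None; the string does not start with "b1b"
```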

2
  • 11
    The "scare quotes" are a useful indication that this usage was considered somewhat novel; you often see the same in the first few citations in etymologies in dictionaries
    – AakashM
    Commented Jan 26, 2023 at 10:13
  • 1
    The term was in use 25 years before Post's 1944 article, in the same sense. See my answer. Commented Jan 26, 2023 at 14:17
14

In computer science, it is sometimes deemed necessary to distinguish between the data and its representation, to be able to formulate thoughts like "lines of text are stored in computer memory as [explicit] strings of characters" (as opposed to, for example, their offsets in a file, etc.).

The word "string" comes naturally for that synonymic, but not quite equal, usage, as, for the English noun "string", Sense of "a number of objects arranged in a line" first recorded late 15c. (Emphasis mine - L.)

Explicit occurrences of the phrase "string of characters" in 19th-century books can be found in abundance; for example,

This literary trifling is obviously quite useless as a means of indexing for reference, unless the whole string of characters be learnt by rote

is quoted from Notes and Queries on China and Japan - Volumes 1-2 - Page 74, 1867.

7
  • Do you have a citation for its first use in a computing context? Not a problem if you don't - this answer is already useful to demonstrate that "string of characters" wasn't too uncommon prior to electronic computing. Commented Jan 26, 2023 at 5:28
  • @TobySpeight My attempts to use Google Books for that failed because, for example, a COBOL manual published after 1960 was returned by a search for "string of characters" restricted to publications between 1920 and 1950, attributed to 1934. Reverifying every found item would be tedious. However, it would be a fair assumption that cryptographic contexts preceded computing contexts by several decades.
    – Leo B.
    Commented Jan 26, 2023 at 6:30
  • 3
    @TobySpeight Re cryptographic contexts: google.com/books/edition/Everybody_s/… is indeed a 1925 publication
    – Leo B.
    Commented Jan 26, 2023 at 6:42
  • A worthy mention might be the idiom "(he came out with...) a string of expletives". String is not a very common word, but it's quite obvious that it has a broader meaning as a "series" or "sequence".
    – Steve
    Commented Jan 26, 2023 at 10:16
  • ‘String’ is a relatively common word, though not necessarily in the meaning of ‘sequence’. Commented Jan 26, 2023 at 11:54
11

The word is used in an 1834 treatise on the potential power of Charles Babbage's Difference Engine.

Mr Babbage's invention puts an engine in the place of the computer; the question is set to the instrument, or the instrument is set to the question, and by simply giving it motion the solution is wrought, and a string of answers is exhibited.

While this may not be quite the answer asked for, the word's use in a related sense at the very beginning of computing history suggests that looking for a strictly constrained origin of a term which still carries much of its original, non-computing meaning in contemporary usage may be an interesting but ultimately indeterminate exercise.

1
  • 2
    I too doubt it's the answer asked for, but upvoted because it is certainly an interesting exhibit!
    – davidbak
    Commented Jan 26, 2023 at 22:59
9

Knuth frequently gives a complete definition of a word, including its etymology, derivations, derivatives, and the names of the people who invented it, going back to Babylonian times or further, but in this case he disappoints - at least in v1 of TAOCP, Fundamental Algorithms, where "string" appears several times in the index but none of the references lead to a history.

But consider the very first use of "string" (according to the index) in TAOCP, v1 3e §1.1 pg 8:

If we wish to restrict the notion of algorithm so that only elementary operations are involved, we can place restrictions on 𝑄, 𝐼, Ω, and 𝑓, for example as follows: Let 𝐴 be a finite set of letters, and let 𝐴* be the set of all strings on 𝐴 (the set of all ordered sequences 𝑥1𝑥2 … 𝑥𝑛, where 𝑛 ≥ 0 and 𝑥𝑗 is in 𝐴 for 1 ≤ 𝑗 ≤ 𝑛). The idea is to encode the states of the computation so that they are represented by strings of 𝐴*.

The context here is that of an algorithm on sequences, finite and infinite, drawn from a finite set of letters, and of 𝐴*, "the set of all strings on 𝐴 (the set of all ordered sequences …)". (emphasis is mine)
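
For concreteness, here is a small illustration (mine, not Knuth's) of that construction in LaTeX notation, using the two-letter alphabet of 1's and b's from Post's quote above:

```latex
% A is a (hypothetical, illustrative) finite alphabet; A^* is the set of all
% finite strings over A, including the empty string \varepsilon.
A   = \{\, \mathtt{1},\ \mathtt{b} \,\}
A^* = \{\, \varepsilon,\ \mathtt{1},\ \mathtt{b},\ \mathtt{11},\ \mathtt{1b},\ \mathtt{b1},\ \mathtt{bb},\ \mathtt{111},\ \dots \,\}
% e.g. Post's string 11b1bb1 is an element of A^*.
```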

So that tells me that in his mind the word "string" comes from formal language theory, and I suspect the answer lies there. That theory uses common words as formal terms, such as "word", "alphabet", and "sentence". The study of formal languages goes way back, so it might even be the case that the word "string" comes from a translation of a word used by some investigator writing in another language. And of course, from a computer programming point of view, formal languages are the basis for a lot of the seminal work in parsing and compiling, and those were very early concerns in the history of programming.

(Sadly, I am not a research librarian, so I cannot complete this "answer" with the actual correct facts. But maybe someone else can "do a Knuth" here. See also Carl-Fredrik Nyberg Brodda's answer, which goes back to a possibly different mathematical progenitor.)

I would have to say that the otherwise completely excellent and enjoyable book Jewels of Stringology: Text Algorithms (Crochemore, Rytter, 2003) disappointed me here, too. It just asserts in the Preface:

The term stringology is a popular nickname for string algorithms as well as text algorithms. Usually text and string have the same meaning. More formally, a text is a sequence of symbols.

And then it goes on from there for 280 very interesting pages covering several dozen algorithms ...

(I just mentioned that book because I love the term "stringology".)

Update

Further to the idea that this comes from formal language theory by way of interest in parsing and compilers, consider the paper The Syntax And Semantics Of The Proposed International Algebraic Language Of The Zurich ACM-GAMM Conference (Backus, 1959). The language being discussed is, of course, what became ALGOL 60.

On page 16 we see a discussion using "strings of symbols" - that is, as used in formal language theory:

[image: excerpt from page 16 of the paper, using "strings of symbols"]

And then later, on page 17, we see "string" used in an informal description, in the grammar itself, of what we now call a "string":

[image: excerpt from page 17, where "string" appears in an informal description within the grammar]

and a few lines later as the name of a non-terminal in the very familiar usage "quoted string":

[image: excerpt showing "string" as the name of a non-terminal, in the usage "quoted string"]

(I just love these old papers where it was the responsibility of the department secretary to laboriously type the paper from a manuscript, taking care to leave extra blank space everywhere she (inevitably, then, a "she") would later have to go back and ink in a math symbol by hand. Actually, I grew up in that era and saw it happen in our Mathematics department at college. Those ladies were skilled. It was non-trivial to read the manuscripts of our college professors!)

2
  • 2
    Interesting! Does Knuth directly state it comes from formal language theory? Or, does he simply use it in that context without explicitly stating it? What page of TAOCP? Commented Jan 26, 2023 at 22:46
  • 1
    @JohnSkilesSkinner - edited answer to include photo of book - but answer is no, he doesn't state that, he just uses it immediately (pg8!) in that context.
    – davidbak
    Commented Jan 26, 2023 at 22:55
8

FORTRAN did not have strings: it had "Hollerith constants". A CHARACTER type was added in FORTRAN 77.

COBOL60 had 'characters', making up 'words' of up to 30 characters. http://www.bitsavers.org/pdf/codasyl/COBOL_Report_Apr60.pdf

Dartmouth BASIC did not have 'strings' when it was introduced in 1964. Pascal and C (1970 and sometime after 1970) did not have "strings" but did have "character arrays".

By 1974, BASIC did have strings and people knew what they were: the 1974 Pascal "User Manual and Report" references "Algol bit strings" (sets), but also notes the absence of (character) string operations "that users may expect". http://prog.vub.ac.be/~tjdhondt/ESL/Pascal_files/PASCAL%20user%20manual%20and%20report.pdf

Although the origin of the word is obvious and historical, it wasn't widely used in the modern sense until sometime between 1970 and 1975.

7
  • 1
    Don't limit yourself to variable types, consider literal strings as well. "Literal strings of characters" appears in COBOL manuals in the early 1960s.
    – Leo B.
    Commented Jan 26, 2023 at 8:33
    Neither Ritchie nor Wirth chose to use the term in their first iteration around 1970. I say "not widely used".
    – david
    Commented Jan 26, 2023 at 9:01
  • 2
    Algol 60 had strings by that name, so the 1970s is way too late a date; the Report was published in 1960. IAL (Algol 58) allowed for extralingual strings: procedures may require as parameters quantities outside the language; e.g., a string of characters (...).
    – dave
    Commented Jan 26, 2023 at 12:09
  • 2
    SNOBOL4, an early 1960s language explicitly designed for processing strings, uses that word to describe itself: the basic data element of SNOBOL4 is a string of characters. The '4' is 'version 4', the term was used in earlier versions, as this ACM citation demonstrates.
    – dave
    Commented Jan 26, 2023 at 12:27
  • 1
    Jean Sammet's 1960s survey, Programming Languages: History and Fundamentals, uses the term, and I infer it was commonplace by that date.
    – dave
    Commented Jan 26, 2023 at 12:38
8

'String' is - AFAIK - simply short for 'string of characters', shortened the same way as we say 'float' instead of 'floating point'.

As such it is a common-language image, independent of and much older than (modern) computers. It works the same way as a string of pearls, a string of turtles, or a string of stars.


Common Language vs. Expert Jargon

Words of a specialized jargon almost always grow out of common words, much like here. Later on, users of that jargon think first and foremost of that specialized meaning - unless forced by their environment or by additional information to consider a different one.

Just talk with an architect about a string without giving further clues. Both of you will think you understand what it's about, until the astonishing moment when it becomes obvious that he is talking about a line of bricks, while you were quite sure that characters were meant - and vice versa.

Jargon is most of the time created from common words. After all, the words needed don't yet exist at that point in time, so coining is usually done by description ... like 'a string of characters'. Later, such descriptive terms often get shortened in a way still understandable within that trade ('string of characters' -> 'character string' -> 'string').

This (re-)use of common words is by no means exclusive to English and becomes obvious when looking at the same term in other languages. German, for example, uses 'Zeichenkette', i.e. a 'chain of characters' (literally 'character chain' *1, *2), which uses the same image, albeit based on what is the common image in that language - here a chain instead of a string.

It's the natural thing to happen - just think what words you would use to explain a database key-entity relation to your great-great-aunt - in her understanding, a key is made of metal and you are her relation :))

But That Must be More Complex

A habit often seen when asking such questions - or discussing them, as seen in the comments - is that people are so firm in today's understanding of such expert jargon that they have a hard time imagining a world where that jargon was not yet settled; one where everyday words were used to describe the new thing. Surely there must have been a secret meeting defining the new word for very logical reasons - or at least an order with the force of law. It can't be that it just grew. Never :))


*1 - Interestingly, the same holds in French, with 'chaîne de caractères'.

*2 - Which, BTW, opens up a related and quite interesting historical distinction: in cryptographic analysis before computers, the German term for what we now call a string wasn't Zeichenkette but Zeichenfolge - literally 'character sequence'. So there was a settled term that could have been used, and for some time was used, but it vanished.

18
  • 4
    When you consistently set a high bar people come to expect you to meet it.
    – davidbak
    Commented Jan 26, 2023 at 4:11
  • 11
    Because it’s a guess/assumption more suited to comment than answer.
    – RonJohn
    Commented Jan 26, 2023 at 4:43
  • 4
    @Raffzahn But why not "array"? Or "sequence"? "Line"? "Sentence"?
    – wizzwizz4
    Commented Jan 26, 2023 at 8:23
  • 6
    @Raffzahn The answer is dismissive, and is basically repeating information which is already in the question. -- It also completely omits answering any of the explicitly articulated questions: When did this usage start? Did the usage with ordered groups of characters originate outside of computing? Why is "string" specific to characters (e.g. we say an "array" of ints, not a "string" of ints)? Were there particular reasons why other potential terms (like "text" or "phrase") weren't chosen?
    – R.M.
    Commented Jan 26, 2023 at 16:02
  • 2
    For a string to be also an array, it surely would have to be indexable. Algol 60 strings were not indexable.
    – dave
    Commented Jan 27, 2023 at 0:04

