99

The definition of "C-Style language" can practically be simplified down to "uses curly braces ({})." Why do we use that particular character (and why not something more reasonable, like [], which doesn't require the shift key at least on US keyboards)?

Is there any actual benefit to programmer productivity that comes from these braces, or should new language designers look for alternatives (as, for example, the people behind Python did)?

Wikipedia tells us that C uses said braces, but not why. A statement in the Wikipedia article List of C-based programming languages suggests that this syntax element is somewhat special:

Broadly speaking, C-family languages are those that use C-like block syntax (including curly braces to begin and end the block)...

5
  • 36
    The only person who can answer this is Dennis Ritchie and he's dead. A reasonable guess is that [] were already taken for arrays. Commented Feb 26, 2013 at 15:02
  • 2
    @DirkHolsopple So he left no reasoning behind? Drat. Also: two downvotes on something I'm genuinely curious about? Thanks guys.... Commented Feb 26, 2013 at 15:03
  • 1
    Please continue the discussion about this question in this Meta question.
    – Thomas Owens
    Commented Feb 26, 2013 at 19:46
  • 2
    I have unlocked this post. Please keep any comments about the question and discussion about appropriateness on the Meta question.
    – Thomas Owens
    Commented Feb 28, 2013 at 13:58
  • 5
    It probably also has something to do with the fact that curly braces are used in set notation in mathematics, making them somewhat awkward to use for array element access, rather than things like declaring "set"-ish things like structs, arrays, etc. Even modern languages like Python use curly braces to declare sets and dictionaries. The question then, is why did C also use curly braces to declare scope? Probably because the designers just didn't like the known alternatives, like BEGIN/END, and overloading array access notation ([]) was deemed less aesthetically sound than set notation. Commented Sep 25, 2013 at 7:03

3 Answers

103

Two of the major influences on C were the Algol family of languages (Algol 60 and Algol 68) and BCPL (from which C takes its name).

BCPL was the first curly bracket programming language, and the curly brackets survived the syntactical changes and have become a common means of denoting program source code statements. In practice, on limited keyboards of the day, source programs often used the sequences $( and $) in place of the symbols { and }. The single-line '//' comments of BCPL, which were not taken up in C, reappeared in C++, and later in C99.

From http://www.princeton.edu/~achaney/tmve/wiki100k/docs/BCPL.html

BCPL introduced and implemented several innovations which became quite common elements in the design of later languages. Thus, it was the first curly bracket programming language (one using { } as block delimiters), and it was the first language to use // to mark inline comments.

From http://progopedia.com/language/bcpl/

Within BCPL, one often sees curly braces, but not always. This was a limitation of the keyboards at the time. The character sequences $( and $) were lexically equivalent to { and }. C kept the idea of alternative spellings in the form of trigraphs and digraphs (though with a different set for curly brace replacement: ??< and ??> as trigraphs, <% and %> as digraphs).
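To make the alternative spellings concrete, here is a minimal sketch in C that uses the digraphs <%, %>, <: and :> in place of {, }, [ and ]. The function names sum_with_digraphs and sum_with_braces are made up for illustration, and the sketch assumes a compiler that accepts the digraphs added to ISO C in 1995.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: sum `count` ints, spelled with digraphs.
   The digraph pair for braces replaces the usual block delimiters,
   and the digraph pair for brackets replaces the subscript brackets. */
int sum_with_digraphs(const int *values, size_t count)
<%
    int total = 0;
    size_t i;

    assert(values != NULL);
    for (i = 0; i < count; i++)
        total += values<:i:>;
    return total;
%>

/* The same function in the usual spelling, for comparison. */
int sum_with_braces(const int *values, size_t count)
{
    int total = 0;
    size_t i;

    assert(values != NULL);
    for (i = 0; i < count; i++)
        total += values[i];
    return total;
}
```

Both functions should compile to the same thing; only the spelling of the delimiters differs.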

The use of curly braces was further refined in B (which preceded C).

From Users' Reference to B by Ken Thompson:

/* The following function will print a non-negative number, n, to
  the base b, where 2<=b<=10,  This routine uses the fact that
  in the ASCII character set, the digits 0 to 9 have sequential
  code values.  */

printn(n,b) {
        extern putchar;
        auto a;

        if(a=n/b) /* assignment, not test for equality */
                printn(a, b); /* recursive */
        putchar(n%b + '0');
}
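For readers who do not know B, the routine above might look roughly like this in present-day C. This is only a sketch, not Thompson's code: the name format_in_base is hypothetical, and it writes the digits into a caller-supplied buffer instead of calling putchar, purely so that the result is easy to inspect.

```c
#include <assert.h>

/* Hypothetical modern-C rendering of B's printn.
   Writes n in base b (2 <= b <= 10) into out as a NUL-terminated
   string and returns a pointer to the terminator. The caller must
   supply a buffer large enough for all digits plus the terminator. */
char *format_in_base(char *out, unsigned n, unsigned b)
{
    unsigned a;

    assert(out != 0 && b >= 2 && b <= 10);
    if ((a = n / b) != 0)                /* assignment, then test of the result */
        out = format_in_base(out, a, b); /* recursive: higher digits first */
    *out++ = (char)(n % b + '0');        /* digits '0'..'9' are consecutive */
    *out = '\0';
    return out;
}
```

The braces, the assignment inside the condition, and the recursion carry over from the B original almost unchanged; the visible differences are the type declarations.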

There are indications that curly braces were used as shorthand for begin and end within Algol.

I remember that you also included them in the 256-character card code that you published in CACM, because I found it interesting that you proposed that they could be used in place of the Algol 'begin' and 'end' keywords, which is exactly how they were later used in the C language.

From http://www.bobbemer.com/BRACES.HTM


The use of square brackets (as a suggested replacement in the question) goes back even further. As mentioned, the Algol family influenced C. Within Algol 60 and 68 (C was written in 1972 and BCPL in 1966), the square bracket was used to designate an index into an array or matrix.

BEGIN
  FILE F(KIND=REMOTE);
  EBCDIC ARRAY E[0:11];
  REPLACE E BY "HELLO WORLD!";
  WRITE(F, *, E);
END.

As programmers were already familiar with square brackets for arrays in Algol and BCPL, and curly braces for blocks in BCPL, there was little need or desire to change this when making another language.


The updated question adds a point about the productivity of curly brace usage and mentions Python. Some resources attempt to study this, though the answer boils down to "It's anecdotal, and what you are used to is what you are most productive with." Because programming skill and familiarity with different languages vary so widely, these factors are difficult to account for.

See also: Stack Overflow Are there statistical studies that indicates that Python is “more productive”?

Much of the gain would depend on the IDE (or lack of one) that is used. In vi-based editors, putting the cursor over one of a matching open/close pair and pressing % moves the cursor to the other matching character. This was very efficient with C-based languages back in the old days; it matters less now.
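The kind of matching that % performs can be sketched with a simple depth counter. The following is a minimal illustration in C, not vi's actual implementation: the function name find_matching_bracket is hypothetical, and brackets inside string literals or comments are counted like any others.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return the index of the bracket that pairs with
   text[pos], or -1 if text[pos] is not a bracket or has no partner.
   Simplification: string literals and comments get no special handling. */
long find_matching_bracket(const char *text, size_t pos)
{
    static const char openers[] = "([{";
    static const char closers[] = ")]}";
    const char *hit;
    char same, partner;
    size_t n;
    long i, len, step, depth = 0;

    assert(text != NULL);
    n = strlen(text);
    if (pos >= n)
        return -1;
    len = (long)n;

    same = text[pos];
    if ((hit = strchr(openers, same)) != NULL) {
        partner = closers[hit - openers];   /* opener: scan forward */
        step = 1;
    } else if ((hit = strchr(closers, same)) != NULL) {
        partner = openers[hit - closers];   /* closer: scan backward */
        step = -1;
    } else {
        return -1;                          /* cursor is not on a bracket */
    }

    for (i = (long)pos; i >= 0 && i < len; i += step) {
        if (text[i] == same)
            depth++;                        /* same kind nests one level deeper */
        else if (text[i] == partner)
            depth--;                        /* partner closes one level */
        if (depth == 0)
            return i;
    }
    return -1;                              /* unbalanced input */
}
```

Single-character delimiters keep this scan trivial. A begin/end language needs a tokenizer to do the same job, which may be part of why the feature felt so natural in C-family code.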

A better comparison would be between {} and begin/end, which were the options of the day (horizontal space was precious). Algol (mentioned above) and many Wirth languages, such as Pascal (which many are familiar with) and the Modula family, were based on a begin/end style.

I have difficulty finding any studies that isolate this specific language feature; the best I can do is show that curly brace languages are much more popular than begin/end languages and that the brace is a common construct. As mentioned in the Bob Bemer link above, the curly brace was used as shorthand to make programming easier.

From Why Pascal is Not My Favorite Programming Language

C and Ratfor programmers find 'begin' and 'end' bulky compared to { and }.

Which is about all that can be said: it's familiarity and preference.

6
  • 15
    Now everybody here is learning BCPL instead of working :) Commented Feb 26, 2013 at 15:36
  • The trigraphs (introduced in the 1989 ISO C standard) for { and } are ??< and ??>. The digraphs (introduced by the 1995 amendment) are <% and %>. Trigraphs are expanded in all contexts, in a very early translation phase. Digraphs are tokens, and are not expanded in string literals, character constants, or comments. Commented Feb 26, 2013 at 16:45
  • There existed something prior to 1989 for this in C (I'd have to dig out my first edition book to get a date on that). Not all EBCDIC code pages had a curly brace (or square brackets) in them, and there were provisions for this in the earliest C compilers.
    – user40980
    Commented Feb 26, 2013 at 18:31
  • @NevilleDNZ BCPL used curly braces in 1966. Where Algol68 got its notion from would be something to explore - but BCPL didn't get it from Algol68. The ternary operator is something I've been interested in and have tracked it back to CPL (1963) (the predecessor of BCPL) which borrowed the notion from Lisp (1958).
    – user40980
    Commented Feb 27, 2013 at 0:19
  • 1968: Algol68 permits round brackets ( ~ ) as a shorthand for begin ~ end bold symbol blocks. These are called brief symbols, cf. wp:Algol68 Bold symbols; this allows blocks of code to be treated just like expressions. A68 also has brief shorthands like C's ?: ternary operator, e.g. x:=(c|s1|s2) instead of C's x=c?s1:s2. Similarly this applies to if & case statements. ¢ BTW: A68 is where the shell got its esac & fi ¢
    – NevilleDNZ
    Commented Feb 27, 2013 at 0:22
24

Square brackets [] have been easier to type ever since the IBM 2741 terminal, which was "widely used on Multics", an OS that in turn had Dennis Ritchie, one of the creators of the C language, as a dev team member.

http://upload.wikimedia.org/wikipedia/commons/thumb/9/9f/APL-keybd2.svg/600px-APL-keybd2.svg.png

Note the absence of curly braces from the IBM 2741 layout!

In C, square brackets are "taken", as they are used for arrays and pointers. If the language designers expected arrays and pointers to be more important or more frequently used than code blocks (which sounds like a reasonable assumption on their part; more on the historical context of coding style below), then curly braces would go to the "less important" syntax.

The importance of arrays is pretty apparent in the article The Development of the C Language by Ritchie. There's even an explicitly stated assumption of the "prevalence of pointers in C programs".

...new language retained a coherent and workable (if unusual) explanation of the semantics of arrays... Two ideas are most characteristic of C among languages of its class: the relationship between arrays and pointers... The other characteristic feature of C, its treatment of arrays... has real virtues. Although the relationship between pointers and arrays is unusual, it can be learned. Moreover, the language shows considerable power to describe important concepts, for example, vectors whose length varies at run time, with only a few basic rules and conventions...
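The relationship between arrays and pointers that Ritchie describes can be illustrated with a short sketch: in C, the expression a[i] is defined as *(a + i), so subscripting and pointer arithmetic reach the same element. The function names below are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: read element i using the subscript operator. */
int element_by_index(const int *values, size_t i)
{
    assert(values != NULL);
    return values[i];
}

/* Hypothetical helper: read the same element using pointer arithmetic.
   values[i] and *(values + i) are two spellings of one operation. */
int element_by_pointer(const int *values, size_t i)
{
    assert(values != NULL);
    return *(values + i);
}
```

Note that both helpers take a plain pointer: an array passed to a function is converted to a pointer to its first element, which is why the square brackets show up so often around pointer-typed names in early C code.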


For a further understanding of the historical context and coding style of the time when the C language was created, one needs to take into account that the "origin of C is closely tied to the development of the Unix" and, specifically, that porting the OS to a PDP-11 "led to the development of an early version of C" (quotes source). According to Wikipedia, "in 1972, Unix was rewritten in the C programming language".

Source code of various old versions of Unix is available online, e.g. at The Unix Tree site. Of the various versions presented there, the most relevant seems to be Second Edition Unix, dated 1972-06:

The second edition of Unix was developed for the PDP-11 at Bell Labs by Ken Thompson, Dennis Ritchie and others. It extended the First Edition with more system calls and more commands. This edition also saw the beginning of the C language, which was used to write some of the commands...

You can browse and study the C source code on the Second Edition Unix (V2) page to get an idea of the typical coding style of the time.

A prominent example supporting the idea that back then it was rather important for a programmer to be able to type square brackets with ease can be found in the V2/c/ncc.c source code:

/* C command */

main(argc, argv)
char argv[][]; {
    extern callsys, printf, unlink, link, nodup;
    extern getsuf, setsuf, copy;
    extern tsp;
    extern tmp0, tmp1, tmp2, tmp3;
    char tmp0[], tmp1[], tmp2[], tmp3[];
    char glotch[100][], clist[50][], llist[50][], ts[500];
    char tsp[], av[50][], t[];
    auto nc, nl, cflag, i, j, c;

    tmp0 = tmp1 = tmp2 = tmp3 = "//";
    tsp = ts;
    i = nc = nl = cflag = 0;
    while(++i < argc) {
        if(*argv[i] == '-' & argv[i][1]=='c')
            cflag++;
        else {
            t = copy(argv[i]);
            if((c=getsuf(t))=='c') {
                clist[nc++] = t;
                llist[nl++] = setsuf(copy(t));
            } else {
            if (nodup(llist, t))
                llist[nl++] = t;
            }
        }
    }
    if(nc==0)
        goto nocom;
    tmp0 = copy("/tmp/ctm0a");
    while((c=open(tmp0, 0))>=0) {
        close(c);
        tmp0[9]++;
    }
    while((creat(tmp0, 012))<0)
        tmp0[9]++;
    intr(delfil);
    (tmp1 = copy(tmp0))[8] = '1';
    (tmp2 = copy(tmp0))[8] = '2';
    (tmp3 = copy(tmp0))[8] = '3';
    i = 0;
    while(i<nc) {
        if (nc>1)
            printf("%s:\n", clist[i]);
        av[0] = "c0";
        av[1] = clist[i];
        av[2] = tmp1;
        av[3] = tmp2;
        av[4] = 0;
        if (callsys("/usr/lib/c0", av)) {
            cflag++;
            goto loop;
        }
        av[0] = "c1";
        av[1] = tmp1;
        av[2] = tmp2;
        av[3] = tmp3;
        av[4] = 0;
        if(callsys("/usr/lib/c1", av)) {
            cflag++;
            goto loop;
        }
        av[0] = "as";
        av[1] = "-";
        av[2] = tmp3;
        av[3] = 0;
        callsys("/bin/as", av);
        t = setsuf(clist[i]);
        unlink(t);
        if(link("a.out", t) | unlink("a.out")) {
            printf("move failed: %s\n", t);
            cflag++;
        }
loop:;
        i++;
    }
nocom:
    if (cflag==0 & nl!=0) {
        i = 0;
        av[0] = "ld";
        av[1] = "/usr/lib/crt0.o";
        j = 2;
        while(i<nl)
            av[j++] = llist[i++];
        av[j++] = "-lc";
        av[j++] = "-l";
        av[j++] = 0;
        callsys("/bin/ld", av);
    }
delfil:
    dexit();
}
dexit()
{
    extern tmp0, tmp1, tmp2, tmp3;

    unlink(tmp1);
    unlink(tmp2);
    unlink(tmp3);
    unlink(tmp0);
    exit();
}

getsuf(s)
char s[];
{
    extern exit, printf;
    auto c;
    char t, os[];

    c = 0;
    os = s;
    while(t = *s++)
        if (t=='/')
            c = 0;
        else
            c++;
    s =- 3;
    if (c<=8 & c>2 & *s++=='.' & *s=='c')
        return('c');
    return(0);
}

setsuf(s)
char s[];
{
    char os[];

    os = s;
    while(*s++);
    s[-2] = 'o';
    return(os);
}

callsys(f, v)
char f[], v[][]; {

    extern fork, execv, wait, printf;
    auto t, status;

    if ((t=fork())==0) {
        execv(f, v);
        printf("Can't find %s\n", f);
        exit(1);
    } else
        if (t == -1) {
            printf("Try again\n");
            return(1);
        }
    while(t!=wait(&status));
    if ((t=(status&0377)) != 0) {
        if (t!=9)       /* interrupt */
            printf("Fatal error in %s\n", f);
        dexit();
    }
    return((status>>8) & 0377);
}

copy(s)
char s[]; {
    extern tsp;
    char tsp[], otsp[];

    otsp = tsp;
    while(*tsp++ = *s++);
    return(otsp);
}

nodup(l, s)
char l[][], s[]; {

    char t[], os[], c;

    os = s;
    while(t = *l++) {
        s = os;
        while(c = *s++)
            if (c != *t++) goto ll;
        if (*t++ == '\0') return (0);
ll:;
    }
    return(1);
}

tsp;
tmp0;
tmp1;
tmp2;
tmp3;

It is interesting to note how the pragmatic motivation of picking characters to denote language syntax elements based on their use in targeted practical applications resembles Zipf's Law, as explained in this terrific answer...

observed relationship between frequency and length is called Zipf's Law

...with the only difference being that length in the above statement is replaced by (or generalized as) speed of typing.

23
  • 5
    Anything in support of this "apparent" expectation by the language designers? It doesn't take much programming in C to notice that curly braces are much more common than array declarations. This hasn't really changed much since the olden days -- have a look at K&R.
    – user7043
    Commented Feb 26, 2013 at 15:09
  • 1
    I somehow doubt this explanation. We don't know what they expected, and they could easily have chosen it the other way around, since they were the people who decided on array notation too. We do not even know if they thought curly braces to be the "less important" option; maybe they liked curly braces more. Commented Feb 26, 2013 at 15:09
  • 3
    @gnat: Square braces are easier to type on modern keyboards, does this apply to the keyboards that were around when unix and c were first being implemented? I have no reason to suspect that they were using the same keyboard, or that they would assume that other keyboards would be like their keyboards, or that they would have thought typing speed would be worth optimizing by one character. Commented Feb 26, 2013 at 15:22
  • 1
    Also, Zipf's law is a generalization on what ends up happening in natural languages. C was artificially constructed, so there is no reason to think it would apply here unless the designers of C consciously decided to deliberately apply it. If it did apply, there's no reason to assume it would simplify something already as short as a single character. Commented Feb 26, 2013 at 15:28
  • 1
    @gnat FWIW, grep -Fo tells me the *.c files of the CPython source code (rev. 4b42d7f288c5 because that's what I have at hand), which includes libffi, contains 39511 { (39508 {, dunno why two braces aren't closed), but only 13718 [ (13702 [). That's counting occurrences in strings and in contexts unrelated to this question, so this is not really accurate, even if we ignore that the code base may not be representative (note that this bias could go in either direction). Still, a factor of 2.8?
    – user7043
    Commented Feb 26, 2013 at 18:02
1

C (and subsequently C++ and C#) inherited its bracing style from its predecessor B, which was written by Ken Thompson (with contributions from Dennis Ritchie) in 1969.

This example is from the Users' Reference to B by Ken Thompson (via Wikipedia):

/* The following function will print a non-negative number, n, to
   the base b, where 2<=b<=10,  This routine uses the fact that
   in the ASCII character set, the digits 0 to 9 have sequential
   code values.  */

printn(n,b) {
        extern putchar;
        auto a;

        if(a=n/b) /* assignment, not test for equality */
                printn(a, b); /* recursive */
        putchar(n%b + '0');
}

B itself was in turn based on BCPL, a language written by Martin Richards in 1966 for the Multics operating system. BCPL's bracing system used only round braces, modified by additional characters (Print factorials example by Martin Richards, via Wikipedia):

GET "LIBHDR"

LET START() = VALOF $(
        FOR I = 1 TO 5 DO
                WRITEF("%N! = %I4*N", I, FACT(I))
        RESULTIS 0
$)

AND FACT(N) = N = 0 -> 1, N * FACT(N - 1)

The curly brace style used in B and subsequent languages, "{...}", is an improvement Ken Thompson made over the original compound brace style in BCPL, "$(...$)".

2
  • 1
    No. Seems that Bob Bemer (en.wikipedia.org/wiki/Bob_Bemer) is responsible for this - "...you proposed that they could be used in place of the Algol 'begin' and 'end' keywords, which is exactly how they were later used in the C language." (from bobbemer.com/BRACES.HTM)
    – SChepurin
    Commented Feb 26, 2013 at 15:49
  • 1
    The $( ... $) format is equivalent to { ... } in the lexer in BCPL, just as ??< ... ??> is equivalent to { ... } in C. The improvement between the two styles is in the keyboard hardware - not the language.
    – user40980
    Commented Feb 26, 2013 at 16:30
