94

I'm trying to compile this piece of code from the book "The C Programming Language" (K & R). It is a bare-bones version of the UNIX program wc:

#include <stdio.h>

#define IN   1;     /* inside a word */
#define OUT  0;     /* outside a word */

/* count lines, words and characters in input */
main()
{
    int c, nl, nw, nc, state;

    state = OUT;
    nl = nw = nc = 0;
    while ((c = getchar()) != EOF) {
        ++nc;
        if (c == '\n')
            ++nl;
        if (c == ' ' || c == '\n' || c == '\t')
            state = OUT;
        else if (state == OUT) {
            state = IN;
            ++nw;
        }
    }
    printf("%d %d %d\n", nl, nw, nc);
}

And I'm getting the following error:

$ gcc wc.c 
wc.c: In function ‘main’:
wc.c:18: error: ‘else’ without a previous ‘if’
wc.c:18: error: expected ‘)’ before ‘;’ token

The 2nd edition of this book is from 1988 and I'm pretty new to C. Maybe it has to do with the compiler version or maybe I'm just talking nonsense.

I've seen in modern C code a different use of the main function:

int main()
{
    /* code */
    return 0;
}

Is this a new standard or can I still use a type-less main?

13
  • 4
    Not an answer, but another piece of code to look at more closely, || c = '\t'). Does that seem the same as the other code on that line?
    – user7116
    Commented Dec 27, 2011 at 3:19
  • 58
    32 upvotes for a debugging + typo question?! Commented Dec 27, 2011 at 15:48
  • 37
    @TomalakGeret'kal: you know, old stuff is valued more (wine, paintings, C code) Commented Dec 27, 2011 at 17:29
  • 16
    @César: I am quite within my rights to express my opinion, and I'll thank you not to try to censor it. As it happens, yes, this is not a website for debugging your code and solving your typographical errors, which are "localised" issues that will never help anybody else. It's a website for questions about programming languages, not for doing your basic debugging and reference work for you. Skill level is completely irrelevant. Read the FAQ, and perhaps also this meta question. Commented Dec 28, 2011 at 15:18
  • 11
    @TomalakGeret'kal of course you can express your opinion, and I won't censor your comment in spite of it being unconstructive. I've already read the FAQ. I'm an enthusiast programmer asking about an actual problem that I'm facing.
    – César
    Commented Dec 28, 2011 at 15:26

9 Answers

247

Your problem is with your preprocessor definitions of IN and OUT:

#define IN   1;     /* inside a word */
#define OUT  0;     /* outside a word */

Notice how you have a trailing semicolon in each of these. When the preprocessor expands them, your code will look roughly like:

    if (c == ' ' || c == '\n' || c == '\t')
        state = 0;; /* <--PROBLEM #1 */
    else if (state == 0;) { /* <--PROBLEM #2 */
        state = 1;;

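That second semicolon is an empty statement of its own. Because you are not using braces, it ends the if, so the else that follows has no previous if to match. So, remove the semicolons from the preprocessor definitions of IN and OUT.

The lesson learned here is that preprocessor directives are not terminated by a semicolon: everything after the macro name up to the end of the line, a trailing semicolon included, becomes part of the replacement text.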

Also, you should always use braces!

    if (c == ' ' || c == '\n' || c == '\t') {
        state = OUT;
    } else if (state == OUT) {
        state = IN;
        ++nw;
    }

There is no dangling-else ambiguity in the above code.

6
  • 9
    For clarity, the problem isn't the spacing, it's the semicolons. You don't need them in preprocessor statements.
    – Dan
    Commented Dec 27, 2011 at 3:21
  • @Dan thanks for the clarification! And the semicolons were indeed the problem! Thanks guys!
    – César
    Commented Dec 27, 2011 at 3:23
  • 2
    @César: you're welcome. The bracing suggestion will hopefully keep you out of trouble in the future, certainly has helped me!
    – user7116
    Commented Dec 27, 2011 at 3:33
  • 5
    @César: It's also a good idea to get used to putting parentheses around macros since you generally want the macro to be evaluated first. In this case it doesn't matter since the value is a single token, but leaving out parens can lead to unexpected results when defining an expression.
    – styfle
    Commented Dec 27, 2011 at 8:25
  • 7
    "don't need them" != "shouldn't have them". the former is always true; the latter is context-dependent and is the more pertinent issue in this scenario. Commented Dec 27, 2011 at 15:46
64

The main problem with this code is that it is not the code from K&R. It has semicolons after the macro definitions that are not in the book, and, as others have pointed out, they change the meaning.

Except when making a change in an attempt to understand the code, you should leave it alone until you do understand it. You can only safely modify code you understand.

This was probably just a typo on your part, but it does illustrate the need for understanding and attention to details when programming.

14
  • 9
    Your advice isn't terribly constructive for someone learning to program. Modifying code is precisely how you understand the details of programming.
    – user7116
    Commented Dec 27, 2011 at 21:49
  • 12
    @sixlettervariables: And when doing so, you should know what changes you've made, and make as few changes as possible. If the OP had made the changes deliberately, and made as few changes as possible, he probably wouldn't have asked this question, as it would have been clear to him what was going on. He would have changed the macro for IN, with no errors, and then the macro for OUT with the two errors, the second of which would be complaining about the semicolon he had just added.
    – jmoreno
    Commented Dec 27, 2011 at 22:07
  • 5
    It seems like unless you make the mistake of including a semicolon on the end of a preprocessor directive line, you likely wouldn't know that you aren't to include them. You could take it at face value, you could read lots of code and notice they never seem to be there. Or, the OP could mess up by including them, ask about the "bizarre" error, and find out: oops, no semicolons required for preprocessor directives! This is programming, not an episode of Scared Straight.
    – user7116
    Commented Dec 27, 2011 at 22:17
  • 14
    @sixlettervariables: Yes, but when the code doesn't work, the obvious first step is to go "oh, ok, then what I changed without any reason whatsoever from the code written in a book by the inventor of C, was probably the issue. I'll just undo that then." Commented Dec 28, 2011 at 0:42
34

There should not be any semicolons after the macros,

#define IN   1     /* inside a word */
#define OUT  0     /* outside a word */

and it should probably be

if (c == ' ' || c == '\n' || c == '\t')
4
  • Thanks, the semicolons were the problem. The 2nd one was a typo!
    – César
    Commented Dec 27, 2011 at 3:24
  • 21
    Next time please paste the exact code you use, directly from your text editor. Commented Dec 27, 2011 at 15:46
  • @TomalakGeret'kal well I did not and I will, but how did you find out?
    – onemach
    Commented Dec 28, 2011 at 0:34
  • 1
    @onemach: You said the ; was a typo that didn't affect the problem, which means a typo in your question rather than in the code that you actually used. Commented Dec 28, 2011 at 0:41
24

The definitions of IN and OUT should look like this:

#define IN   1     /* inside a word  */
#define OUT  0     /* outside a word */

The semicolons were causing the problem! The explanation is simple: IN and OUT are macros defined with the #define preprocessor directive. Essentially, the preprocessor replaces every occurrence of IN with a 1 and every occurrence of OUT with a 0 in the source code before it is compiled.

Since the original code had a semicolon after the 1 and the 0, when IN and OUT got replaced in the code, the extra semicolon after the number produced invalid code, for instance this line:

else if (state == OUT)

Ended up looking like this:

else if (state == 0;)

But what you wanted was this:

else if (state == 0)

Solution: remove the semicolon after the numbers in the original definition.

0
8

As you have seen, the problem was in the macros.

GCC has an option, -E, that stops after preprocessing and prints the result. This is useful for seeing exactly what the preprocessor produced. In fact, the technique is an important one if you are working with a large C/C++ code base. Typically, makefiles will have a target that stops after preprocessing.

For quick reference, this SO question covers the options: "How do I see a C/C++ source file after preprocessing in Visual Studio?" It starts with VC++, but the gcc options are mentioned further down.

7

Not exactly a problem, but the declaration of main() is also dated. It should be something like this:

int main(int argc, char** argv) {
    ...
    return 0;
}

Under the old rules the compiler assumes an int return type for a function declared without one (C99 dropped this implicit int rule), and I'm sure the compiler/linker will work around the missing argc/argv declarations and the missing return value, but they should be there.

2
  • 3
    That's a good book - one of the only two worthwhile books on C as far as I know. I'm pretty sure that newer editions are ANSI C compliant (probably pre-C99 ANSI C). The other worthwhile book on C is Expert C Programming: Deep C Secrets by Peter van der Linden. Commented Dec 27, 2011 at 4:18
  • I never said it was. I was simply commenting that, to bring it in line with the way things are done today, main should be changed. Commented Dec 27, 2011 at 15:11
4

Try adding explicit braces around code blocks. The K&R style can be ambiguous.

Look at line 18. The compiler is telling you where the issue is.

    if (c == '\n') {
        ++nl;
    }
    if (c == ' ' || c == '\n' || c == '\t') { // every comparison here must use "=="; a single "=" would assign instead
        state = OUT;
    }
    else if (state == OUT) {
        state = IN;
        ++nw;
    }
4
  • 2
    Thanks! Actually, the code worked without the braces in the second if :)
    – César
    Commented Dec 27, 2011 at 3:26
  • 5
    +1. Not just ambiguous but somewhat dangerous. When (if) you add a line to your if block later on, if you forget to add the braces because your block is now more than one line, it can take a while to debug that error...
    – The111
    Commented Dec 27, 2011 at 3:26
  • 8
    @The111 Never, ever, happened to me. I still don’t believe that this is a real problem. I’ve been using the brace-less style for over a decade, and I’ve never once forgotten to add the braces when expanding the body of a block. Commented Dec 27, 2011 at 9:59
  • 1
    @The111: In this case it took a few SO contributors a handful of minutes :P And if you're a programmer who is capable of adding statements to an if clause and "forgetting" to update the braces then, well, you're not a very good programmer. Commented Dec 27, 2011 at 15:47
3

A simple way is to use braces {} for each if and else:

if (c == '\n'){
    ++nl;
}
if (c == ' ' || c == '\n' || c == '\t')
{
    state = OUT;
}
else if (state == OUT) {
    state = IN;
    ++nw;
}
2

As other answers pointed out, the problem is the semicolons in the #define lines. To avoid this class of problem, I always prefer defining numeric constants as a const int:

const int IN = 1;
const int OUT = 0;

This way you avoid many actual and potential problems. It is limited by just two things:

  1. Your compiler has to support const, which in 1988 wasn't generally true, but now it's supported by all commonly used compilers. (AFAIK const was "borrowed" from C++.)

  2. You can't use these constants in the special places where C needs a true compile-time constant, such as a case label. But I think your program isn't such a case.

1
  • An alternative I prefer is enums - they can be used in the special places (like array declarations) that const int can't in C. Commented Feb 22, 2012 at 19:54
