37
\$\begingroup\$

As practice with the (ever so painful) C Strings I wrote a string reversal program. I am new to C, so I wanted to make sure that I am not using bad coding practices. It runs just fine (at least with expected input), but I am not sure if the logic is as good as it could be.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char* strreverse(char* c) 
{
    char input[500], output[500] = {}, buffer[2] = {' ', '\0'};
    strcpy(input, c);
    int i;
    for(i = strlen(input); i >= 0; i--)
    {
        buffer[0] = input[i];
        strcat(output, buffer);
    }
    return (char*) output;
}

int main(void)
{
    char* in = malloc(256);
    printf("Enter a string: ");
    scanf("%s", in);
    char* out = strreverse(in);
    printf("%s", out);
    return 0; 
}
\$\endgroup\$
4
  • \$\begingroup\$ Is there any way you could actually do the reversing in-place? You'd get some pretty good performance this way (swapping some characters around) \$\endgroup\$ Commented Jul 11, 2016 at 13:07
  • 6
    \$\begingroup\$ Your specification -- reverse a string -- is a good start, but at three words, is too short. Say whether your function reverses in-place or allocates a new string, whether there are restrictions on the string, what to do if the string is null, and so on. Then show us your specification and the test cases which verify it. \$\endgroup\$ Commented Jul 11, 2016 at 13:27
  • 4
    \$\begingroup\$ This code returns the address of a temporary output[]. This is undefined behavior and a major code flaw. Goods answer should mention that. \$\endgroup\$ Commented Jul 11, 2016 at 22:09
  • \$\begingroup\$ "ever so painful C Strings". I hear that! \$\endgroup\$ Commented Jul 12, 2016 at 17:38

8 Answers 8

43
\$\begingroup\$

Implementation:

As of now, the code is not practical, because the limit is 500 chars including zero termination. It does unnecessary copying. You need to determine the actual length of the string by relying on the fact that C strings are null terminated.

size_t length = 0;
while (*(str + length) != 0)
{
    ++length;
}

Notice that I'm using size_t, because it is meant to be used to describe size of the objects in memory. Despite it is easy to do it manually, you should always use strlen when possible:

size_t length = strlen(str);

It should be noted that the function returns the length without null terminator. Then we allocate memory for result, since we are creating new string, rather than making changes in place:

char* result = malloc(length + 1);
if (result == NULL)
{
    return NULL;
}

We check for succession of allocation, as was suggested in the comments. If allocation failed, further operations will definitely invoke undefined behavior, so we return NULL. length + 1 is for null terminator. Then, we immediately null terminate it:

result[length] = 0;

Now we need to capture the position of the last character in the original string, to start copying from there. It is right before null terminator, which is on index length, so needed position is length - 1. There is an edge case of length 0, so we should check for that first

if (length == 0)
{
    return result;
}
size_t last = length - 1;

Then we write algorithm that writes contents of the first string into the second in reverse order:

size_t it = 0;
while (it <= last)
{
    result[it] = str[last - it];
    ++it;
}

then simply return the result:

return result;

Although the implementation has good performance, there are some opportunities to make micro optimizations. The function input type should be const char*, since we don't modify original string. Additionally, it should be noted that the caller is responsible for freeing up malloc'd memory.

Cosmetics:

Name of the function is a little hard to read, so I'd suggest str_reverse. Also, c is not good name at all, so it would be better if it's name str

Put together:

#include <stdio.h>
#include <string.h>

char* str_reverse(const char* str)
{
    size_t length = strlen(str);

    char* result = malloc(length + 1);
    if (result == NULL)
    {
        return NULL;
    }

    result[length] = 0;

    if (length == 0)
    {
        return result;
    }

    size_t last = length - 1;
    size_t it = 0;
    while (it <= last)
    {
        result[it] = str[last - it];
        ++it;
    }

    return result;
}

int main(void) {
    const char* str = "ABCD";
    char* reversed = str_reverse(str);

    printf("%s\n", reversed);
    if (str[4] != 0)
    {
        printf("problems\n");
    }

    free(reversed);
    return 0;
}
\$\endgroup\$
1
  • \$\begingroup\$ Comments are not for extended discussion; this conversation has been moved to chat. \$\endgroup\$ Commented Jul 11, 2016 at 18:14
26
\$\begingroup\$

Returning a pointer to a local variable

Returning a pointer to the local variable output is incorrect. It may happen to work but it is wrong to do so, because the storage for that local variable goes out of scope the moment you return from the function. To be correct, you should either malloc a buffer and return it (preferred) or make output be static.

Empty initializer list

Initializing output to {} is nonstandard C and should be avoided. The standard way to initialize an array to zeroes is {0}.

Unnecessary variable

Your input variable is unnecessary because you can just use c wherever you used input.

Bad algorithm

Your string reversal function takes \$O(n^2)\$ time when it should only take \$O(n)\$ time. The problem is that you use strcat to append one character at a time, and strcat needs to find the end of the string each time. If you ever find yourself using strcat in a loop, it's probably not the best way.

Other things

  • Using scanf("%s",in); to read into a fixed size string is unsafe and could result in a buffer overflow.
  • You should also print a newline when you print the result otherwise your shell prompt will be on the same line as your output.
  • Your function argument c could be marked const since you do not modify it. Also, c sounds like a character not a string.
  • Since you are already using C99 style variable declarations, you can put the int i declaration inside the for loop.

Suggested rewrite

char *reverseString(const char *str)
{
    size_t len = strlen(str);
    char  *ret = calloc(len+1, sizeof(char));

    if (ret == NULL)
        return NULL;

    for (size_t i = 0, j = len-1; i < len; i++, j--)
        ret[i] = str[j];
    return ret;
}
\$\endgroup\$
6
  • 3
    \$\begingroup\$ While sizeof(char) may be nice if you want to change the type, sizeof(char) is defined to always return 1. \$\endgroup\$
    – pipe
    Commented Jul 11, 2016 at 9:14
  • 2
    \$\begingroup\$ @pipe I know but I like to have sizeof something there otherwise it "looks wrong" to me. \$\endgroup\$
    – JS1
    Commented Jul 11, 2016 at 9:17
  • 1
    \$\begingroup\$ Fine, I won't argue with that. Just thought it's worth commenting on, maybe it helps someone understand why you put it there. :) \$\endgroup\$
    – pipe
    Commented Jul 11, 2016 at 9:18
  • \$\begingroup\$ @JS1 At the time i was unaware u could loop through a char*. Can you edit individual chars in a char? \$\endgroup\$
    – Null Spark
    Commented Jul 11, 2016 at 16:58
  • 2
    \$\begingroup\$ Recommend some_type *ret = calloc(len+1, sizeof *ret); to avoid insure code uses the rightsizeof(char). sizeof *ret will always be right. (Best review so far). \$\endgroup\$ Commented Jul 11, 2016 at 21:37
18
\$\begingroup\$
    char input[500], output[500] = {}, buffer[2] = {' ', '\0'};

This is not a good way to allocate memory in C. What this means in C is that the memory is only guaranteed unique during the scope of the function. So if you have

str1 = strreverse(input1);
str2 = strreverse(input2);

Then str1 and str2 will often both point to the reverse of input2, as you wrote over the first call on input1.

This can happen in two different ways. Sometimes the compiler will allocate a small amount of memory to the function and leave it allocated for the entire program. You can force it to happen that way by declaring the variables as static which says that the same variable will be used throughout the program's duration.

The second way that this can happen is that the memory can be implicitly allocated and deallocated during the scope of the function call. Then the memory is available to be allocated to someone else. If you call the same function twice in a row, it is possible if not likely to allocate the same memory both times. But in some ways it is worse if you do other things in between. Then the memory that pointed to output the first time may be used for other variables and get overwritten wholly or partially by other operations.

Which method it uses is up to the compiler (unless you make the variable static in which case it will always do it the first way; or make the function recursive in which case it can't do it the first way). The key point is that you should not rely on this pattern producing correct or even predictable results. Unlike a garbage collected language, C does not track memory usage until it has no pointers to it. It deallocates static allocations based on scope.

The first method is static allocation on the .bss or data segments. The second method is automatic allocation (which is still static) using the stack. If you use malloc or another dynamic allocation method, you are using heap allocation. If you want to read more about how this works, you might start with What and where are the stack and heap? You can follow up with the Difference between static memory allocation and dynamic memory allocation.

The more normal way to handle this is to pass the output string into the function

strreverse(in, reversed, maximum_length);

Then the function doesn't have to worry about allocating memory. Another possibility would be to have the function allocate memory with malloc or similar, but then you have to explicitly free memory allocated implicitly by the function. By forcing the caller to do that, the caller can figure out how to get the memory allocated and deallocated.

    char* in = malloc(256);

Any time you explicitly allocate memory like this, you should also explicitly free the memory.

It won't matter in this program as the program will end and deallocate all the memory right after you would free it (and some would leave it out for that reason). But in a real program this could cause a memory leak.

    strcpy(input, c);

You should use strncpy instead.

    strncpy(input, c, 500);
    input[499] = '\0';

This will guarantee that c is never longer than input can hold and that input is always null terminated.

Alternately, you could check that c is no longer than will fit in input and do something if it is.

    for(i = strlen(input); i >= 0; i--)
    {
        buffer[0] = input[i];

Note that on the first iteration, you will copy the null byte at the end of the string. The strcat function handles this (by copying a zero length string into output), but this is pure waste. This should be

    for (i = strlen(input) - 1; i >= 0; i--)
    {
        buffer[0] = input[i];
\$\endgroup\$
2
  • \$\begingroup\$ oh... i was unaware that c would allocate the same memory everytime. may i ask how c decides where to store the array? \$\endgroup\$
    – Null Spark
    Commented Jul 11, 2016 at 5:10
  • 3
    \$\begingroup\$ Actually str1 = strreverse(input1); str2 = strreverse(input2); will not necessarily both point to the reverse of input2. Both will point to the stack address that output had at the time of the call to strreverse as the calls are at the same call-stack depth they will get the same pointer. Had the calls been at different depth, different pointers should have been returned. The contents of the pointer are likely to be overwritten with garbage the next time you call a function though. Had output been static then it would have happened as you described. \$\endgroup\$
    – Emily L.
    Commented Jul 11, 2016 at 7:29
11
\$\begingroup\$

Instead of building the reversed string in a separate buffer, I would recommend reversing the input string in place:

char* strreverse(char* c)
{
    /* construct pointers to beginning and end of string */
    char* cur = c;
    char* end = c + (strlen(c) - 1);

    /* loop until pointers cross each other (midpoint of string) */
    while (cur < end)
    {
        /* swap characters at current and end locations */
        const char tmp = *cur;
        *cur = *end;
        *end = tmp;

        /* move pointers inward */
        cur++;
        end--;
    }

    return c;
}

(Edit) Explanation of how this method works:

For simplicity, let's assume the input string to this function is "Hello".

The function begins by constructing two pointers cur and end directly into the string. cur is initialized to point the same memory location as the beginning of the string, and end is advanced to the length of the string minus 1 (effectively pointing to the last character in the string). Visually, this looks like the following in memory:

cur
v
Hello
    ^
    end

As the end pointer is located after the cur pointer in memory, the characters stored at the current and end memory locations are then swapped:

cur
v
oellH
    ^
    end

Then both pointers are moved inward; cur moves forward and end moves backward:

 cur
 v
oellH
   ^
   end

As end is still higher in memory than cur, the characters at those locations are then again swapped:

 cur
 v
olleH
   ^
   end

And then the pointers are again moved inward:

  cur
  v
olleH
  ^
  end

At this point, the end pointer is no longer higher in memory than the cur pointer, so the loop ends, and the string is reversed.

\$\endgroup\$
11
  • \$\begingroup\$ It might be useful to note the return value is always the same as the argument and that you must strcpy before calling if you wish to preserve the non-reversed string. \$\endgroup\$
    – rich remer
    Commented Jul 11, 2016 at 18:49
  • 1
    \$\begingroup\$ @richremer : Actually, that means the calling function already has the address, so there's no need to return it (e.g., Jeryl Vaz's answer. And, if this turned into a void, we really need both "cur" and "c". (Just rename "c" to "cur" in the top line, and remove the first variable declaration.) I'm still up-voting the current version of this answer because I do mostly like it. (However, sizeof('\0') may be better than the number 1, so that 1 isn't an undocumented magic number.) \$\endgroup\$
    – TOOGAM
    Commented Jul 11, 2016 at 20:38
  • \$\begingroup\$ char* end = c + (strlen(c) - 1); is undefined behavior should c == "". Use char* end = c + strlen(c); and move end--; to the beginning of the loop. \$\endgroup\$ Commented Jul 11, 2016 at 21:42
  • \$\begingroup\$ @Falken could u explain why this works? i am very new to c so i find pointer logic to be incredibly confusing. \$\endgroup\$
    – Null Spark
    Commented Jul 11, 2016 at 22:29
  • 1
    \$\begingroup\$ @chux After researching this a bit more, you're right: the code does invoke UB according to the standard, as range comparison operators between pointers (<, >, <=, >=) are not defined for arbitrary memory locations. My suggestion would indeed fail if a zero-length string (a single byte of \0) was unfortunately placed at an offset of 0000h within a given segment. Thanks for reminding me of how I wrote code 25 years ago. ;) \$\endgroup\$
    – falken
    Commented Jul 19, 2016 at 6:48
11
\$\begingroup\$

From your main(), it's obvious that you do not need to store the reversed string separately. Hence, you could reverse the string in place, swapping characters starting at the extremities until you reach the middle. This also has the benefit of using up only half the iterations.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void strreverse(char *str)
{
    size_t len = strlen(str);
    int lastCharPos = len - 1;
    int nrOfSwaps = len / 2;
    int i;
    for (i = 0; i < nrOfSwaps; ++i)
    {
        char temp = str[i];
        str[i] = str[lastCharPos - i];
        str[lastCharPos - i] = temp;
    }
}

int main(void)
{
    char* in = malloc(256);
    printf("Enter a string: ");
    scanf("%s", in);
    strreverse(in);
    printf("%s\n", in);
    return 0;
}

Your input is limited to a max of 255 chars; anything more and there is out-of-bounds memory access. You may want to look at the solutions proposed here: https://stackoverflow.com/questions/16870485/how-can-i-read-an-input-string-of-unknown-length

\$\endgroup\$
3
  • \$\begingroup\$ You divide by 2 for every loop iteration. It may be worthwhile to: A) use memory to store len/2, or B) replace "i < len/2" with " ( i << 1) < len" or "i+i < len" or "i*2 < len", as bitshift is faster than add which is faster than non-bit-shifting MUL which is notably faster than DIV, if I remember my x86 op code speeds right (GameDev question on opcode speed). Sure, the effect is CPU-dependent, but I'd guess DIV is usually slower. Subtracting 1 from len, twice in each loop, also looks slower \$\endgroup\$
    – TOOGAM
    Commented Jul 11, 2016 at 20:20
  • 1
    \$\begingroup\$ @TOOGAM "bitshift is faster than add" may be true with assembly but with C, a good compiler will use the optimal code should code be i*2 or i<<1 or in the other case i/2 or i>>1 - it will make no executable code difference. The preferred coding is then what makes most sense to humans or i*2 in this case. If a compiler makes sub-optimal code, then the issue is with the compiler and not a good reason for less than clear coding. \$\endgroup\$ Commented Jul 11, 2016 at 21:55
  • \$\begingroup\$ @chux : Correct. Compilers may re-arrange our code. Still, why not make the C code actually reflect what the compiler is more likely to do? I'm just feeling a bit more comfortable with the idea of the code I read (the C code) being more similar to what the computer actually ends up doing, not less similar. Well, bit shifting may be more unfamiliar for many people, but (i+i) or (2*i) instead of (len/2) may be just as readable, and likely to be closer to compiler's output, and easily readable. Plus, if code is ported to a compiler that doesn't optimize, using DIV may be expensive. \$\endgroup\$
    – TOOGAM
    Commented Jul 12, 2016 at 3:33
10
\$\begingroup\$

There are good points in other answers, just a couple more I didn't see mentioned:

char* strreverse(char* c) 

c is a perfectly fine name for a parameter that is a character, less so for a string. s (or str) would be more common. Of course both are rather brief, but still used when the context is obvious. (Even some standard functions like memset).

    char output[500] = {};
    return (char*) output;

This doesn't just point to the same array each time, but it's actually invalid. output falls out of scope when the function returns and the pointer is not valid after that. Declaring it static would make it valid but with the problems others have mentioned.

I agree with Jeryl in that a function like this could just reverse the string in-place, and let the caller worry about making a copy if they need it. If they don't, there's no need for another buffer, and if they do, they can do an strdup.

    buffer[0] = input[i];
    strcat(output, buffer);

I would suggest just copying the character to the end of output manually, just run another index variable from zero up. Here strcat has to find the end of the output string every single time, even though you know exactly where it will be on every iteration.

int main(void)
    char* in = malloc(256);
    scanf("%s", in);

Consider what happens if someone inputs a word longer than 256 characters. Also, scanf interacts badly with line buffering when reading from a terminal. I'd suggest fgets.

\$\endgroup\$
4
\$\begingroup\$

Critiques of some of the above solutions

Olzhas solution seems bloated. It's doing superfluous checks that a different structure would make shorter. Compare to falken's solution

Falken's solution is not a bad answer per-se but it seems likely to lead to confusion as is. I'd suggest either making it return void which would make it clear nothing new is being allocated. That would also still make it subtle so I'd suggest naming in something clearer like strreverseInPlace.

Whether reversing in place is better than making a new string is probably a matter of opinion. For example strcpy does not allocate a string. Maybe naming the function strcpyreverse would make it clearer it's the same as a strcpy

Jeryl's solution is similar to Falken's but it's doing un-needed math in 3 places in the inner loop; The 2 places it computes the index of the characters and the place it checks the length. A good compiler might optimize those out but why ask it to (see Falken's). If the compiler can't optimize it it will be slower.

JS1's solution uses calloc which means it's going to visit every byte in the destination twice. Once to clear it then again when copying from the src so it's going to be up to twice as slow

Here's another solution

/* like strcpy but in reverse. 
   Assumes dst is big enough to hold result. */
char* strcpyreverse(const char* src, char* dst) {
    // copy from end
    const char* s = src + strlen(src);
    char* d = dst;

    while (s > src) {
      *d++ = *--s;
    }
    *d = '\0';

    return dst;
}

Some processors have specific instructions for handling post increment and pre decrement. Of course even if the code isn't written this way an optimizing compiler might figure out a way to use those instructions anyway.

Note this may or may not be faster than using indices, again depending on the processor (and the compiler optimizations) although using direct pointers are arguably more C like. If you wanted to use indices it would be

/* like strcpy but in reverse. 
   Assumes dst is big enough to hold result. */
char* strcpyreverse(const char* src, char* dst) {
    // copy from end
    size_t s = strlen(src);
    size_t d = 0;

    while (s > 0) {
      dst[d++] = src[--s];
    }
    dst[d] = '\0';

    return dst;
}

In any case code that uses *somepointer++ is an extremely common pattern in C. If you're new to C it effectively means

value = *somepointer;
somepointer += 1;
return value

and I have a feeling it specifically exists because the processors that C was first written on had instructions that did exactly that.

If you really want to it to allocate the string there's a common function strdup which allocates and copies a string so maybe naming it strdupreverse or strreversedup

/* like strdup but in reverse. */
char* strdupreverse(const char* src) {
    size_t len = strlen(src);
    char* dst = malloc(len + 1);
    if (dst != NULL) { 
        // copy from end
        const char* s = src + len;
        char* d = dst;

        while (s > src) {
          *d++ = *--s;
        }
        *d = '\0';
    }

    return dst;
}

Note I'm not checking for src being null because neither do any of the standard C string functions IIRC.

\$\endgroup\$
11
  • \$\begingroup\$ This is the correct approach. If your goal is to learn C, you'll want to learn how to write idiomatic C. You find C-strings painful because you're trying to use them as though they were string objects, rather than pointers. C's low-level approach offers both advantages and disadvantages. If you don't take advantage of things like pointers, there's really no point in using the language. (One minor critique: src and s should be const char*, not char*.) \$\endgroup\$
    – Ray
    Commented Jul 12, 2016 at 10:37
  • \$\begingroup\$ @Ray What is the difference between referring to *char_pointer and char_pointer? \$\endgroup\$
    – Null Spark
    Commented Jul 12, 2016 at 17:15
  • \$\begingroup\$ Thanks for the const comment. It's been so long since I'd written straight C I forgot if it was in C or just C++ \$\endgroup\$
    – gman
    Commented Jul 13, 2016 at 0:04
  • \$\begingroup\$ @NullSpark My comment was about const char* vs char*, not s vs. *s. Is that what you were referring to? Regardless, here are the differences between both: Suppose you have char buf[] = "foo"; char *s = buf; const char *t = buf;. Then s and t refer to the address of the first character of the string (and are equal, since they're both pointing to the same physical memory in this case, buf). *s and *t refer to the character ('F') itself, however you can't modify the string through *t, since it points to const char: *s = 'F'; is legal, but *t = 'F' is not. \$\endgroup\$
    – Ray
    Commented Jul 13, 2016 at 18:58
  • 2
    \$\begingroup\$ Nice strcpyreverse(const char* src, char* dst). (Could use (const char* restrict src, char* restrict dst)) \$\endgroup\$ Commented Jul 14, 2016 at 5:04
4
\$\begingroup\$

Just a few points to add to the answers you've already received.

Names

The C standard reserves all names at global scope starting with str and a lower case letter, so your strreverse gives undefined behavior. Renaming it to something like reverse_string will fix this (i.e., it's purely the name that matters here).

Use of malloc

At least in my opinion, malloc should typically be reserved for situations where either you don't know the amount of space you're going to allocate until run-time, or situations where you're allocating so much that it's likely to overflow the stack if you allocate the space on the stack instead of the heap. Your char* in = malloc(256); in main doesn't qualify on either count. You might as well just define in as an array of char and be done with it.

If you were going to put this to real use, it could very well make sense to use malloc inside of strreverse itself though. In particular, if you couldn't modify the input string, you'd want to allocate a buffer you could return, and in this case malloc makes perfect sense:

char *reverse_string(char const *input) { 
    size_t size = strlen(input) + 1;

    char *temp = malloc(size);

    /* ... */

use of scanf

When you use scanf with the %s conversion, it's crucial to specify the size of the buffer you're reading into, like:

char in[256];
scanf("%255s", in);

Note that in this case, what you specify to scanf is the maximum number of input characters to accept. It always appends a NUL character after this, so you need to specify one fewer than the size of the buffer you pass (or smaller than that if you don't want to use the whole buffer, but that's rare).

As it is right now, with no length specified, your use of scanf is essentially identical to the widely-reviled gets.

unnecessary casts

I prefer cast-free code whenever possible, so adding casts that aren't necessary at all just grates on my nerves. In your case, you've hit this pet peeve with the return from your strreverse:

return (char*) output;

This is pointless--the name of an array decays to a pointer to the first element of the array in most cases (including this one) without any casting. You can just return output; and it's fine (other than the fact that you're returning the address of a local, but others have already pointed out the problem with that, and section on using malloc above shows how to correct it as well).

\$\endgroup\$
0

Not the answer you're looking for? Browse other questions tagged or ask your own question.