0

In Windows cmd, does grep use GetCommandLine(), and findstr use argv?

This is the grep I'm using:

C:\>where grep
c:\cygwin\bin\grep.exe

C:\>c:\cygwin\bin\grep.exe --version
grep (GNU grep) 3.7
Packaged by Cygwin (3.7-2)
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <https://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Mike Haertel and others; see
<https://git.sv.gnu.org/cgit/grep.git/tree/AUTHORS>.

C:\>

Let's suppose I want to look for a quote.

The following works, but is a bit incorrect.

C:\>echo a^"gg|grep "
a"gg

C:\>

It works, but it's not quite right, because if I were to run echo a^"gg|grep " >c:\blah\a.a it wouldn't create the file. It's like how echo ">c:\blah\a.a won't create the file.

It still worked, but if there were things after the quote it might not have. The quote should be escaped, so that grep still receives the quote but the quote has no special meaning to cmd.

C:\>echo a^"gg|grep ^"
a"gg

C:\>

And I want to be sure that grep isn't getting passed a caret, because if grep is passed a caret, that has a meaning in regular expressions. So e.g. in these two examples with ^^, grep is being passed a caret. (in regular expressions ^z would mean match if the letter z is at the beginning of the line)

C:\>echo a^"b | grep ^^a
a"b

C:\>echo a^"b | grep ^^b

C:\>
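
The anchoring seen in the two runs above can also be checked outside grep. The following is only a minimal sketch: it assumes POSIX regcomp()/regexec() as a stand-in for grep's own matcher (GNU grep does not necessarily go through these calls), and bre_matches is a hypothetical helper name, not something from grep.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper (not part of grep).
 * Returns 1 if `pattern`, read as a POSIX basic regular expression,
 * matches somewhere in `line`; 0 if it does not match;
 * -1 if the pattern fails to compile. */
int bre_matches(const char *pattern, const char *line)
{
    regex_t re;
    int rc;

    assert(pattern != NULL && line != NULL);

    /* cflags without REG_EXTENDED selects basic regular expressions;
     * REG_NOSUB because only a yes/no answer is needed. */
    if (regcomp(&re, pattern, REG_NOSUB) != 0)
        return -1;

    rc = regexec(&re, line, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

With this helper, the pattern ^a matches the line a"b because the line starts with a, while ^b does not, since b is not at the beginning of the line. A pattern consisting of just a quote matches anywhere the quote occurs.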

So I'm bearing some of those hazards in mind.

What I notice, though, is this:

C:\>echo a^"b | grep ^"
a"b

C:\>echo a^"b | findstr \^"
a"b

C:\>

So findstr needs the extra character, \, to escape the quote.

I'm wondering if we can determine that grep is using GetCommandLine() and findstr is using argv.

What I'm getting at is a concept from C.

I have these programs from a while back

C:\blah>type w.c
#include <stdio.h>

int main(int argc, char *argv[]) {
        int i = 0;
        while (argv[i]) {
                printf("argv[%d] = %s\n", i, argv[i]);
                i++;
        }
        return 0;
}

and

C:\blah>type w2.c
#include <stdio.h>
#include <windows.h>

int main(int argc, char *argv[]) {
    printf("%s", GetCommandLine());
    return 0;
}


C:\blah>

A Windows program can include the windows.h header and then call the GetCommandLine() function to get the arguments from the command line.

The other option, which a non-Windows implementation would use and which a Windows program can also use, is the argv array: the arguments are passed as an array to the program's main function.

And so one can see what a program would see, whether it used GetCommandLine(), or argv.

C:\>w abc def
argv[0] = w
argv[1] = abc
argv[2] = def

C:\>w2 abc def
w2  abc def
C:\>

If a program uses GetCommandLine(), then ^" is sufficiently escaping the quote.

C:\>w2 ^"
w2  "
C:\>
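
What cmd does to the line before any program sees it can be modelled roughly as follows. This is only a sketch under simplifying assumptions: outside double quotes a caret is dropped and the next character is kept literally, inside double quotes a caret is an ordinary character, and an unescaped quote toggles the quoted state. The function name cmd_strip_carets is hypothetical, and real cmd does much more (variable expansion, redirection, pipes), none of which is modelled here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified model of cmd's caret handling.
 * Copies `in` to `out` (at most outsize - 1 characters plus the
 * terminator) and returns the number of characters written. */
size_t cmd_strip_carets(const char *in, char *out, size_t outsize)
{
    int in_quotes = 0;
    size_t n = 0;

    assert(in != NULL && out != NULL && outsize > 0);

    for (; *in != '\0' && n + 1 < outsize; in++) {
        if (!in_quotes && *in == '^') {
            in++;                 /* drop the caret itself */
            if (*in == '\0')
                break;            /* trailing caret: nothing left to escape */
            out[n++] = *in;       /* escaped character is kept literally and
                                     does not toggle the quoted state */
            continue;
        }
        if (*in == '"')
            in_quotes = !in_quotes;   /* unescaped quote toggles quoting */
        out[n++] = *in;
    }
    out[n] = '\0';
    return n;
}
```

Under this model, w2 ^" becomes w2 ", findstr \^" becomes findstr \" and grep ^^a becomes grep ^a. That agrees with the transcripts here, apart from details the sketch does not model, such as the doubled space in the w2 output.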

If, on the other hand, a program uses argv, then ^" isn't sufficiently escaping the quote.

C:\>echo a^"b | findstr ^"
FINDSTR: No search strings

C:\>echo a^"b | findstr \^"
a"b

C:\>

and

C:\>w ^"
argv[0] = w
argv[1] =

C:\>w \^"
argv[0] = w
argv[1] = "

C:\>

So it looks like with argv, if the program just receives a ", which it would if ^" was written at the shell, then assigning that to an element of an array is similar to assigning it to a variable: it would need to be escaped with \. And rather than add the \, it treats the lone quote as invalid and deletes it.

I think that's what might be happening, but I'd like to verify whether that's correct.

And is it correct that this Windows implementation of grep uses GetCommandLine(), and findstr uses argv?

Added

To answer a question asked in a comment about why I am interested in which method the app uses (e.g. maybe they used GetCommandLine(), maybe CommandLineToArgv()), and why I am looking into that:

I am interested in which method is used at the first stage because it helps in understanding what is going on when the program is given something on the command line, and in making sense of a bunch of cases where a program gives an error when passed something. E.g. in understanding echo, it helps a lot to know whether it uses, or seems to use, GetCommandLine(), because I can have a program that uses GetCommandLine() and shows what it sees. And especially with a program like grep, which could interpret some things as regular expressions, it really helps to see what it first receives and whether and how it breaks that down at the start, e.g. into argv. The program becomes much easier to use then. And when one knows the method by which the line is parsed at the start (which can be seen in a program like w.c or w2.c that I mention in my question), it is easier to make sense of the syntax and remember it, too.

3
  • I don't understand why you are worried about which method the app uses. Are you thinking that the authors might not have written their parser correctly if they were using GetCommandLine() and did not use CommandLineToArgv() to break up the commands? I am simply curious. Commented Jul 28, 2022 at 3:15
  • Aha! .. thanks for the explanation @barlop Commented Jul 28, 2022 at 14:45
  • @SeñorCMasMas no prob, i've incorporated that into my question now at the end.
    – barlop
    Commented Jul 28, 2022 at 15:18

2 Answers

0

You're using the Cygwin grep, which is built upon Linux libraries that were ported to Windows.

It doesn't use GetCommandLine, as you can see by searching the grep 3.7 source.

findstr itself is a part of Windows, which is proprietary code, so we can't know what it uses.

2
  • I don't know much C. But if grep doesn't use GetCommandLine, then how does it get the quote? If you look at the w.c/w.exe program in my question, with w ^" the quote doesn't get into argv, unless it's got the backslash before the caret.
    – barlop
    Commented Jul 27, 2022 at 18:57
  • It uses Linux software to parse the command line. How it gets the command line requires searching the sources of the Linux libraries, as there is more than one way of getting the line. GetCommandLine is not the only method and not even the lowest-level one, as I have used some of these other methods when I had to (but that relates more to StackOverflow than here).
    – harrymc
    Commented Jul 27, 2022 at 19:12
0

All Windows processes only get a single string as their command line, so generally they have to use GetCommandLine() at some point and parse it into arguments (or not). This is true even if your program has a main(argc, argv), because main() isn't the executable's real "entry point" – it is actually the C runtime library (such as msvcrt [2]) that gets called first and parses the received command line into argv[] before calling your main().

Software available through Cygwin works the same way – the Cygwin grep is more or less a direct recompilation of the original Linux/Unix-oriented source code, which relies on main(argc, argv). But in this case, it is the Cygwin runtime (cygwin1.dll) whose dll_crt0_1() entry point function performs this conversion.

(This commandline-to-argv conversion is also where Windows MSVCRT expands file wildcards (globs), and where Cygwin translates Windows-style paths to Cygwin-style paths.)

So both the GNU grep (which is what you have in Cygwin) and the Windows findstr.exe [1] actually have the same kind of main(argc, argv) and neither of them explicitly calls GetCommandLine() – they both rely on their corresponding C runtime libraries to do it…differently.
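
To make the "differently" part more concrete, here is a rough sketch of the quoting rules that Microsoft documents for its C runtime's argument parsing: blanks separate arguments, double quotes group text, a backslash is only special when it precedes a quote, 2n backslashes before a quote give n backslashes and the quote acts as a delimiter, and 2n+1 backslashes before a quote give n backslashes and a literal quote. It is an approximation, not the runtime's actual code: the names split_msvcrt_style and struct parsed_args and the fixed buffer sizes are made up for illustration, and the doubled-quote rule inside quoted text and wildcard expansion are left out. Cygwin's runtime follows its own rules, which this does not attempt to model.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ARGS 16        /* arbitrary limits chosen for the sketch */
#define MAX_ARG_LEN 128

/* Hypothetical container for the result; not a Windows type. */
struct parsed_args {
    int argc;
    char argv[MAX_ARGS][MAX_ARG_LEN];
};

static int is_blank(char c) { return c == ' ' || c == '\t'; }

/* Appends c to dst if room remains (silently truncates otherwise). */
static void put(char *dst, size_t *n, char c)
{
    if (*n + 1 < MAX_ARG_LEN)
        dst[(*n)++] = c;
}

/* Splits cmdline into out->argv, approximating the documented rules. */
void split_msvcrt_style(const char *cmdline, struct parsed_args *out)
{
    const char *p = cmdline;
    int in_quotes = 0;
    size_t n = 0;

    assert(cmdline != NULL && out != NULL);

    /* argv[0], the program name: quotes group blanks and are removed,
     * backslashes are not treated specially. */
    while (*p != '\0' && (in_quotes || !is_blank(*p))) {
        if (*p == '"')
            in_quotes = !in_quotes;
        else
            put(out->argv[0], &n, *p);
        p++;
    }
    out->argv[0][n] = '\0';
    out->argc = 1;

    while (out->argc < MAX_ARGS) {
        char *dst = out->argv[out->argc];
        n = 0;
        in_quotes = 0;

        while (is_blank(*p))
            p++;
        if (*p == '\0')
            break;

        /* Any non-blank character starts an argument, even a lone quote
         * that ends up contributing no characters. */
        while (*p != '\0' && (in_quotes || !is_blank(*p))) {
            size_t slashes = 0, i;

            while (*p == '\\') {
                slashes++;
                p++;
            }
            if (*p == '"') {
                /* Backslashes before a quote are halved. */
                for (i = 0; i < slashes / 2; i++)
                    put(dst, &n, '\\');
                if (slashes % 2 == 1)
                    put(dst, &n, '"');        /* odd count: literal quote */
                else
                    in_quotes = !in_quotes;   /* even count: delimiter */
                p++;
            } else {
                /* Backslashes not followed by a quote stay as they are. */
                for (i = 0; i < slashes; i++)
                    put(dst, &n, '\\');
                if (*p != '\0' && (in_quotes || !is_blank(*p)))
                    put(dst, &n, *p++);
            }
        }
        dst[n] = '\0';
        out->argc++;
    }
}
```

With this model the command line w " yields an empty argv[1], because the lone quote opens a quoted section that never receives any characters, while w \" yields an argv[1] holding a single quote character. Both are in line with the w.exe output shown in the question.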


[1] I checked findstr.c within the NT5 source code that had been leaked last year.

[2] The source code for msvcrt was actually public somewhere, if I remember correctly, but I'm not exactly sure where it's supposed to be found. It might have been part of the Windows SDKs, perhaps?
