0

I have a text file that's about 300KB in size. I want to remove all lines from this file that begin with the letter "P". This is what I've been using:

> cat file.txt | egrep -v P*

That isn't outputting anything to the console. I can use cat on the file without any other commands and it prints out fine. My final intention is to:

> cat file.txt | egrep -v P* > new.txt

No error appears; it just doesn't print anything, and if I run the second command, new.txt is empty.

edit: I should say I'm running Windows 7 with Cygwin installed.

1
  • No need for cat file | grep pattern; grep pattern file does fine on its own.
    – pataluc
    Commented Jun 20, 2013 at 8:50

2 Answers

2

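Try this command instead (the ^ anchors the match to the start of the line, and there is no * after the P, so only lines that really begin with P are matched and removed):

cat file.txt | egrep -v '^P' > new.txt

An alternative that avoids the useless use of cat would be:

egrep -v '^P' file.txt > new.txt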

You need to put quotes around your regexes in egrep, otherwise bash may expand them as globs before egrep ever sees them (in your case, an unquoted P* would be replaced by the names of all files in the current directory that begin with an uppercase P, if there are any).
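To see the expansion for yourself, here is a minimal sketch, assuming a POSIX-like shell (bash under Cygwin should behave the same way). The scratch directory and the file names Pfile1, Pfile2 and other.txt are made up for the demo.

```shell
# Work in a throwaway directory so nothing real is touched.
dir=$(mktemp -d)
cd "$dir" || exit 1
touch Pfile1 Pfile2 other.txt

# Unquoted: the shell replaces P* with the matching file names
# before the command runs.
unquoted=$(echo P*)

# Quoted: the pattern reaches the command untouched.
quoted=$(echo 'P*')

echo "unquoted: $unquoted"   # prints "unquoted: Pfile1 Pfile2"
echo "quoted:   $quoted"     # prints "quoted:   P*"
```

So with unquoted P*, egrep would receive Pfile1 as its pattern and Pfile2 as a file to search, which is almost certainly not what you meant.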

1
  • [Obligatory "useless use of cat is a silly complaint" complaint.]
    Commented Jun 20, 2013 at 16:19
0

P* as a regex means "any number of Ps, including 0". So it will always match, since every line contains at least 0 Ps. That explains why egrep -v P* prints nothing: every line matches, and -v selects the lines which don't match. (Actually, it might do something else, since P* will be expanded by bash into the list of files starting with the letter P in the current directory, if there happen to be any. You should quote the pattern, as in egrep -v "P*", but that's not your main problem.)
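A quick way to check this is to count the matches. This is only a sketch, assuming a POSIX-like shell and any standard grep; the four sample lines are invented for the demo.

```shell
# Build a small throwaway sample file.
sample=$(mktemp)
printf '%s\n' 'Paris' 'apple' 'Peru' 'grape' > "$sample"

# 'P*' can match the empty string, so every line counts as a match.
total=$(( $(wc -l < "$sample") ))
matched=$(grep -c 'P*' "$sample")

# With -v, nothing is left over. grep exits non-zero when it selects
# no lines, hence the "|| true".
inverted=$(grep -v 'P*' "$sample" || true)

echo "lines: $total, matching 'P*': $matched"   # prints "lines: 4, matching 'P*': 4"
echo "output of grep -v 'P*': [$inverted]"      # prints "output of grep -v 'P*': []"
rm -f "$sample"
```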

You want to match one P at the beginning of the line. So you need to specify that the regex is "anchored" (only matches at the beginning), which you do by putting a ^ at the beginning:

grep -v '^P' file.txt > new.txt

By the way, egrep is deprecated; you should use grep -E, but in this case there's no difference between basic and extended regex.

grep does not use "globs", it uses regular expressions. And it does not force the regular expression to match the entire line; it's sufficient if a string matching the regular expression appears somewhere in the line.
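The difference between an anchored and an unanchored pattern shows up as soon as a line has a P somewhere other than the first position. A small sketch, again assuming a POSIX-like shell, with made-up sample lines:

```shell
# Build a small throwaway sample file.
sample=$(mktemp)
printf '%s\n' 'Paris' 'apple' 'Peru' 'iPad' > "$sample"

# Unanchored: drops every line containing a P anywhere, including "iPad".
anywhere=$(grep -v 'P' "$sample" || true)

# Anchored: drops only the lines whose first character is P.
at_start=$(grep -v '^P' "$sample" || true)

# grep -E with the same pattern gives the same result here.
extended=$(grep -E -v '^P' "$sample" || true)

printf 'grep -v P keeps:\n%s\n\n' "$anywhere"    # keeps only "apple"
printf 'grep -v ^P keeps:\n%s\n' "$at_start"     # keeps "apple" and "iPad"
rm -f "$sample"
```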
