Delete n lines following a pattern (and the line matching the pattern)

Question

How can I delete a line containing a matching pattern and the following n lines using a tool supporting regular expressions?

Said differently, how can I write a regular expression matching a line containing a matching pattern and the following n lines, so that I can replace them with nothing?

For example, if the matching pattern is bbbb and I also want to delete the 5 lines that follow it, then for the input file:

aldjflajdkl
aaaabbbbaaaa
1l;adfjl
2aldfjl
3adlflkdas
4aldfjd
5aldfkld
6dlafjlkdas

The output would be:

aldjflajdkl
6dlafjlkdas

It probably simplifies things that, in my specific case, the matching pattern (bbbb) can never appear in the following 5 lines.

A solution already exists for sed, but it relies only partially on regular expressions, and uses custom replacement commands which are not portable.

You wrote "I am aware of how you can do this in sed. How could I do the same with a tool supporting regular expressions?" <-- I would note that sed is a tool that supports regular expressions. e.g. sed "s/a/z/g" file.ext replaces 'a' with 'z', and does it with all occurrences of 'a'. The regex is in the find portion of that, that's where the 'a' is. Though sed can't see new lines in the find portion and some more advanced regex features are missing from it, so there are better tools than sed for regex support. — barlop, Commented Apr 8, 2015 at 19:45
@barlop I see your point. I made a major reshuffling, let me know if this addresses your concerns. — Antonio, Commented Apr 8, 2015 at 21:37
It's good, it was pretty good even before that correction too. IMO It's fine even if a question has one (easily made) mistake and a comment corrects it. So I upvoted your question even with that mistake that I corrected in comment. The reason I upvoted it was that it was very clear, the description and showing the input and the output you wanted. — barlop, Commented Apr 8, 2015 at 22:21
I would note that sometimes if correcting a question, it can make a comment nonsensical.. But in this case your correction is fine 'cos my comment quoted you so there's no questions about what the comment is or was referring to. — barlop, Commented Apr 8, 2015 at 22:24

Community · Accepted Answer · 2017-05-23 12:41:51Z

3

A possible solution is:

.*<matching pattern>(.*\r?\n){<N+1>}

where N is the number of lines I want to remove after the line containing the pattern.

For the example given, this translates to:

.*bbbb(.*\r?\n){6}
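As a sketch, the same pattern can be tried outside grepWin with Python's re module. Python is an assumption here (the answer itself uses grepWin), and the helper name delete_match_and_following is hypothetical. It builds the pattern from N and applies it to the example input from the question.

```python
import re


def delete_match_and_following(text: str, pattern: str, n: int) -> str:
    """Remove every line containing `pattern` plus the n lines after it.

    `pattern` is inserted as a regex fragment; pass re.escape(...) for
    literal text that contains regex metacharacters.
    """
    # .*<pattern>(.*\r?\n){N+1}: the first repetition consumes the rest of
    # the matching line, the remaining N consume the lines that follow.
    regex = re.compile(r".*" + pattern + r"(.*\r?\n){" + str(n + 1) + r"}")
    return regex.sub("", text)


sample = (
    "aldjflajdkl\n"
    "aaaabbbbaaaa\n"
    "1l;adfjl\n"
    "2aldfjl\n"
    "3adlflkdas\n"
    "4aldfjd\n"
    "5aldfkld\n"
    "6dlafjlkdas\n"
)

result = delete_match_and_following(sample, "bbbb", 5)
print(result, end="")
```

With the question's input this should leave only the first and last lines. Other regex engines may differ in details such as how `.` treats `\r`.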

This is how the pattern looks in grepWin: [image: grepWin screenshot]

Side notes:

In the tab "The regex search string matches", the 5aldfkld line is also marked as matched; it is just out of view, which is why a scroll bar is visible on the right.
(grepWin specific) Because of a small bug, when applying this search to files, the Matches count increases by 7 for each match. That is probably because the counter counts matched lines, and here the pattern covers 7 lines: the matched line, the following 5 lines, and the line reached by the last line feed.
(sed specific) This regex does not work as-is in sed, which processes its input line by line and so has no easy way to match or replace newlines.
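One behavior worth checking before relying on the pattern: every repetition of (.*\r?\n) needs a line terminator, so a block that runs into the end of the file is left untouched. The snippet below is a sketch in Python's re module (an assumption, not the tool used in this answer) with made-up sample strings.

```python
import re

PATTERN = re.compile(r".*bbbb(.*\r?\n){6}")

# Hypothetical inputs: the match line followed by 5 lines, with and
# without a final newline, and a case with fewer than 5 lines after it.
complete = "keep\nxbbbbx\n1\n2\n3\n4\n5\n"
truncated = "keep\nxbbbbx\n1\n2\n3\n4\n5"
too_short = "keep\nxbbbbx\n1\n2\n"

after_complete = PATTERN.sub("", complete)
after_truncated = PATTERN.sub("", truncated)
after_too_short = PATTERN.sub("", too_short)

for name, text in [
    ("complete", after_complete),
    ("truncated", after_truncated),
    ("too_short", after_too_short),
]:
    print(name, repr(text))
```

Only the first input is changed. If the block to delete can sit at the very end of a file, the pattern would need adjusting.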

The following explains how I got to the solution.

I started from:

.*bbbb.*\n.*\n.*\n.*\n.*\n.*\n

which did not work on my system. The following, however, did:

.*bbbb.*\r\n.*\r\n.*\r\n.*\r\n.*\r\n.*\r\n

So I am working on a system with CRLF line endings. However, this is neither pretty nor portable.

I can make it a little bit more portable (and uglier :-) ) by doing:

.*bbbb.*\r?\n.*\r?\n.*\r?\n.*\r?\n.*\r?\n.*\r?\n

(The carriage return becomes optional). It still looks ugly, but I can collect the repetitive term:

.*bbbb(.*\r?\n){6}
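The claim that the collected form matches the same text as the written-out form can be checked quickly. The following is a sketch in Python's re module (an assumption; the answer was developed in grepWin), run against the question's example with both LF and CRLF line endings. The names UNROLLED, COMPACT and strip_block are hypothetical.

```python
import re

UNROLLED = r".*bbbb.*\r?\n.*\r?\n.*\r?\n.*\r?\n.*\r?\n.*\r?\n"
COMPACT = r".*bbbb(.*\r?\n){6}"

LINES = [
    "aldjflajdkl", "aaaabbbbaaaa", "1l;adfjl", "2aldfjl",
    "3adlflkdas", "4aldfjd", "5aldfkld", "6dlafjlkdas",
]


def strip_block(pattern: str, eol: str) -> str:
    """Join the sample lines with `eol`, then delete whatever `pattern` matches."""
    text = eol.join(LINES) + eol
    return re.sub(pattern, "", text)


for eol in ("\n", "\r\n"):
    # Both spellings should remove exactly the same text.
    assert strip_block(UNROLLED, eol) == strip_block(COMPACT, eol)
    print(repr(strip_block(COMPACT, eol)))
```

Note that in Python `.` matches `\r`, so a plain `\n` version would also match CRLF input there. That differs from what is reported above for grepWin, so the `\r?` matters more in some engines than in others.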

This guide was very handy.

edited May 23, 2017 at 12:41

CommunityBot

1

answered Apr 8, 2015 at 13:19

Antonio


Here is a similar one done with Notepad++, though you make a good point that the one in my picture won't remove the last line and would need another \r\n. Also, it's good how you used \r? to make the \r optional. You should add a screenshot of yours in whatever editor or program you use for your regexes. (i.sstatic.net/rfLHQ.png) You could also write .*bbbb.*\r?\n(.*\r?\n){5}, so that the 5 reads as "5 lines after", though yours is better: more compact while doing the same.
– barlop
Commented Apr 8, 2015 at 14:18
Can you include a screenshot from your favorite regex supporting program?
– barlop
Commented Apr 8, 2015 at 14:23
@barlop Yep, done.
– Antonio
Commented Apr 8, 2015 at 14:28
And I'm curious: I'm assuming you knew about repetition when you began answering and not just at the end, so why did you begin typing \n.*\n.*\n.* and then simplify afterwards? Why not straight away type something like (\n.*){5}, or switch to (\n.*){5} as soon as you got up to two of them? Why keep going as far as \n.*\n.*\n.*\n.*\n.*\n and then do the repetition after? If you know the multiplication operator and you want five fives, you wouldn't first write 5+5+5+5+5=5*5, you'd just write 5*5.
– barlop
Commented Apr 8, 2015 at 14:30
@barlop I simply didn't know about repetition :) You know about my first approach to regular expressions, a few days ago.
– Antonio
Commented Apr 8, 2015 at 14:36


Scott - Слава Україні · Accepted Answer · 2015-04-09 04:38:52Z

1

An awk solution:

awk '/bbbb/ {i=5; next} {if (i>0) i--; else print}'

When it detects the pattern you're looking for, it sets i (which is a countdown counter) to 5, and skips the rest of the processing (i.e., skips to the next line of input). In particular, it does not print the line. (Saying /bbbb/ {i=5+1} for the first part would be equivalent; choose one based on your style preference.) Then, if the counter is positive, decrement it (subtract 1) so as to count the lines that are being deleted (skipped), and do not print; otherwise, print the line.
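The countdown logic described above can be sketched in Python as well. This is a port for illustration, not the awk one-liner from the answer, and the function name drop_match_and_following is hypothetical. Like the awk version, it restarts the countdown if the pattern shows up again inside the skipped lines.

```python
import re


def drop_match_and_following(lines, pattern, n):
    """Return `lines` without each line matching `pattern` and the n lines after it."""
    remaining = 0  # how many upcoming lines still have to be skipped
    kept = []
    for line in lines:
        if re.search(pattern, line):
            remaining = n      # start (or restart) the countdown
            continue           # the matching line itself is not kept
        if remaining > 0:
            remaining -= 1     # skip this line and count it
        else:
            kept.append(line)
    return kept


sample = [
    "aldjflajdkl", "aaaabbbbaaaa", "1l;adfjl", "2aldfjl",
    "3adlflkdas", "4aldfjd", "5aldfkld", "6dlafjlkdas",
]
kept = drop_match_and_following(sample, "bbbb", 5)
print("\n".join(kept))
```

Because it works line by line, this approach does not depend on the line-ending convention or on a final newline being present.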

answered Apr 9, 2015 at 4:38

Scott - Слава Україні


I think it would be worth posting your answer here as well, although there they also want the option to keep the line with the matching pattern.
– Antonio
Commented Apr 9, 2015 at 7:45


Stack Exchange Network
