How to select lines between two marker patterns which may occur multiple times with awk/sed

Question

Using awk or sed, how can I select the lines that occur between two different marker patterns? There may be multiple sections marked with these patterns.

For example, suppose the file contains:

abc
def1
ghi1
jkl1
mno
abc
def2
ghi2
jkl2
mno
pqr
stu

The starting pattern is abc and the ending pattern is mno, so I need the output to be:

def1
ghi1
jkl1
def2
ghi2
jkl2

I am using sed to match the pattern once:

sed -e '1,/abc/d' -e '/mno/,$d' <FILE>

Is there any way in sed or awk to do it repeatedly until the end of file?

Community · Accepted Answer · 2017-05-23 12:02:56Z

254

Use awk with a flag to trigger the print when necessary:

$ awk '/abc/{flag=1;next}/mno/{flag=0}flag' file
def1
ghi1
jkl1
def2
ghi2
jkl2

How does this work?

/abc/ and /mno/ each match lines that contain the given text.
/abc/{flag=1;next} sets the flag when a line containing abc is found, then skips that line so it is not printed.
/mno/{flag=0} unsets the flag when a line containing mno is found.
The final flag is a pattern with the default action, which is to print $0: if flag equals 1, the line is printed.
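
The rules above can also be laid out one per line. This is a minimal sketch, assuming the question's sample data is piped in on standard input; the shell variable names `data` and `result` are chosen here for illustration only:

```shell
#!/bin/sh
# Sample input from the question.
data=$(printf '%s\n' abc def1 ghi1 jkl1 mno abc def2 ghi2 jkl2 mno pqr stu)

# Same logic as the one-liner, one rule per line.
result=$(printf '%s\n' "$data" | awk '
    /abc/ { flag = 1; next }   # start marker: raise the flag and skip this line
    /mno/ { flag = 0 }         # end marker: lower the flag before the print rule runs
    flag                       # bare pattern: print the line while flag is nonzero
')

printf '%s\n' "$result"
```

The order of the rules matters: because the /mno/ rule comes before the bare flag pattern, the end marker itself is never printed.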

For a more detailed description and examples, together with cases when the patterns are either shown or not, see How to select lines between two patterns?.

edited May 23, 2017 at 12:02

CommunityBot

answered Aug 1, 2013 at 8:29

fedorqui

47

If you want to print everything between and including the pattern then you can use awk '/abc/{a=1}/mno/{print;a=0}a' file.
– scai
Commented Nov 7, 2013 at 8:08
8

Yes, @scai ! or even awk '/abc/{a=1} a; /mno/{a=0}' file - with this, putting a condition before the /mno/ we make it evaluate the line as true (and print it) before setting a=0. This way we can avoid writing print.
– fedorqui
Commented Nov 7, 2013 at 9:43
21

@scai @fedorqui For including pattern output, you can do awk '/abc/,/mno/' file
– Jotne
Commented Dec 4, 2013 at 6:44
2

@EirNym that is a weird scenario that can be handled in very different ways: which lines would you like to print? Probably awk 'flag; /PAT1/{flag=1; next} /PAT1/{flag=0}' file would do.
– fedorqui
Commented Apr 24, 2017 at 8:28
3

For newbies like me, there is a doc. 1. An awk "rule" contains a "pattern" and an "action", either of which (but not both) may be omitted. So [pattern] { action } or pattern [{ action }]. 2. An action consists of one or more awk statements, enclosed in braces (‘{…}’). So the trailing flag is short for flag {print $0}
– Weekend
Commented Jan 7, 2021 at 8:40
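
The three marker-inclusive variants from the comments above can be compared side by side. This is a small sketch, assuming the question's sample data; the shell variable names are illustrative:

```shell
#!/bin/sh
# Sample input from the question.
data=$(printf '%s\n' abc def1 ghi1 jkl1 mno abc def2 ghi2 jkl2 mno pqr stu)

# Variant 1: print the end marker explicitly, then clear the flag.
with_print=$(printf '%s\n' "$data" | awk '/abc/{a=1} /mno/{print; a=0} a')

# Variant 2: test the flag before the end-marker rule clears it.
flag_first=$(printf '%s\n' "$data" | awk '/abc/{a=1} a; /mno/{a=0}')

# Variant 3: awk range pattern, which includes both marker lines.
range=$(printf '%s\n' "$data" | awk '/abc/,/mno/')

printf '%s\n' "$range"
```

On this input all three print the same lines, markers included; the range pattern is the shortest, while the flag forms are easier to adapt when only one of the markers should be printed.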

Jonathan Leffler · Accepted Answer · 2013-08-01 08:47:40Z

56

Using sed:

sed -n -e '/^abc$/,/^mno$/{ /^abc$/d; /^mno$/d; p; }'

The -n option means do not print by default.

The pattern looks for lines containing just abc to just mno, and then executes the actions in the { ... }. The first action deletes the abc line; the second the mno line; and the p prints the remaining lines. You can relax the regexes as required. Any lines outside the range of abc..mno are simply not printed.
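
To see what relaxing the regexes changes, here is a short sketch on a hypothetical input in which one line (abcdef) contains abc without being a marker line. The input and the variable names are assumptions made for illustration:

```shell
#!/bin/sh
# Hypothetical input: "abcdef" contains abc but is not a marker line.
data=$(printf '%s\n' abc abcdef ghi1 mno xyz)

# Anchored regexes: only lines that are exactly abc or mno act as markers.
strict=$(printf '%s\n' "$data" |
    sed -n -e '/^abc$/,/^mno$/{ /^abc$/d; /^mno$/d; p; }')

# Relaxed regexes: any line containing abc or mno is treated as a marker.
relaxed=$(printf '%s\n' "$data" |
    sed -n -e '/abc/,/mno/{ /abc/d; /mno/d; p; }')

printf 'strict:\n%s\n' "$strict"
printf 'relaxed:\n%s\n' "$relaxed"
```

With the anchored form, abcdef is kept as ordinary content; with the relaxed form it is deleted because it matches /abc/.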

answered Aug 1, 2013 at 8:47

Jonathan Leffler

1

@JonathanLeffler may I ask what the purpose of using -e is?
– Kasun Siyambalapitiya
Commented Dec 6, 2016 at 4:33
1

@KasunSiyambalapitiya: Mostly it means I like to use it. Formally, it specifies that the next argument is (part of) the script that sed should execute. If you want or need to use several arguments to include the entire script, then you must use -e before each such argument; otherwise, it's optional (but explicit).
– Jonathan Leffler
Commented Dec 6, 2016 at 4:41
Nice! (I prefer sed over awk.) When using complex regular expressions, it would be nice not to have to repeat them. Isn't it possible to delete the first / last line of the "selected" range? Or to first apply the d to all lines up to the first match, and then another d to all lines starting with the second match?
– hans_meine
Commented Dec 8, 2016 at 10:12
(Replying to my own comment.) If there's only one section to be cut, I could tentatively solve this e.g. for LaTeX using sed -n '1,/\\begin{document}/d;/\\end{document}/d;p'. (This is cheating a little bit, since the second part does not delete up to the document end, and I would not know how to cut multiple parts as the OP asked for.)
– hans_meine
Commented Dec 8, 2016 at 10:50
@JonathanLeffler what is the reason for adding the $ sign, as in /^abc$/ and the others?
– Kasun Siyambalapitiya
Commented Jan 25, 2017 at 4:58

potong · Accepted Answer · 2013-08-01 09:39:57Z

20

This might work for you (GNU sed):

sed '/^abc$/,/^mno$/{//!b};d' file

Delete all lines except those between the lines consisting of exactly abc and mno; the marker lines themselves are deleted as well.

answered Aug 1, 2013 at 9:39

potong

!d;//d golfs 2 characters better :-) stackoverflow.com/a/31380266/895245
– Ciro Santilli OurBigBook.com
Commented Jul 13, 2015 at 9:54
1

This is awesome. The {//!b} prevents the abc and mno from being included in the output, but I can't figure out how. Could you explain?
– Brendan
Commented Feb 16, 2017 at 17:44
2

@Brendan the instruction //!b reads: if the current line is neither of the lines that match the range addresses, branch to the end of the script, which prints the line. All other lines are deleted.
– potong
Commented Feb 17, 2017 at 1:14

Irfan Latif · Accepted Answer · 2020-01-02 05:53:44Z

16

From the previous response's links, the one that did it for me, running ksh on Solaris, was this:

sed '1,/firstmatch/d;/secondmatch/,$d'

1,/firstmatch/d: from line 1 until the first time you find firstmatch, delete.
/secondmatch/,$d: from the first occurrence of secondmatch until the end of file, delete.
Semicolon separates the two commands, which are executed in sequence.
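
Note that this is the same single-section approach the question already uses, so it keeps only the first section. The following sketch contrasts it with the flag-based awk on a hypothetical input with two sections; the input lines and variable names are assumptions for illustration:

```shell
#!/bin/sh
# Hypothetical input with two abc..mno sections and surrounding lines.
data=$(printf '%s\n' header abc def1 mno abc def2 mno trailer)

# Range-delete approach: everything from the first mno onward is deleted.
first_only=$(printf '%s\n' "$data" | sed '1,/abc/d;/mno/,$d')

# Flag-based awk from the answer above: every section is kept.
all_sections=$(printf '%s\n' "$data" |
    awk '/abc/{flag=1;next}/mno/{flag=0}flag')

printf 'sed:\n%s\n' "$first_only"
printf 'awk:\n%s\n' "$all_sections"
```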

edited Jan 2, 2020 at 5:53

Irfan Latif

answered Jul 12, 2017 at 16:38

FanDeLaU

Just curious, why does the range limiter (1,) come before /firstmatch/? I'm guessing this could also be phrased '/firstmatch/1,d;/secondmatch,$d'?
– Luke Davis
Commented Jun 25, 2018 at 0:40
3

With "1,/firstmatch/d" you are saying "from line 1 until the first time you find 'firstmatch', delete". Whereas, with "/secondmatch/,$d" you say "from the first occurrence of 'secondmatch' until the end of file, delete". The semicolon separates the two commands, which are executed in sequence.
– FanDeLaU
Commented Dec 20, 2018 at 17:18

Community · Accepted Answer · 2017-05-23 12:26:33Z

15

sed '/^abc$/,/^mno$/!d;//d' file

golfs two characters better than potong's {//!b};d

The empty forward slashes // mean "reuse the last regular expression used", and the command does the same as the more understandable:

sed '/^abc$/,/^mno$/!d;/^abc$/d;/^mno$/d' file

This seems to be POSIX:

If an RE is empty (that is, no pattern is specified) sed shall behave as if the last RE used in the last command applied (either as an address or as part of a substitute command) was specified.
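
The equivalence of the short and long forms can be checked on the question's sample data. This is a sketch that assumes GNU sed behavior for the empty regex; other sed implementations may track the "last RE" differently, and the variable names are illustrative:

```shell
#!/bin/sh
# Sample input from the question.
data=$(printf '%s\n' abc def1 ghi1 jkl1 mno abc def2 ghi2 jkl2 mno pqr stu)

# Short form: // reuses whichever regex sed applied last on the current line.
short=$(printf '%s\n' "$data" | sed '/^abc$/,/^mno$/!d;//d')

# Long form: both marker regexes spelled out.
long=$(printf '%s\n' "$data" | sed '/^abc$/,/^mno$/!d;/^abc$/d;/^mno$/d')

printf '%s\n' "$short"
```

On the opening line of a range the last regex applied is the start address, and on every later line of the range it is the end address, which is why a single //d removes both markers.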

edited May 23, 2017 at 12:26

CommunityBot

answered Jul 13, 2015 at 9:53

Ciro Santilli OurBigBook.com

1

I think the second solution will end up with nothing as the second command is also a range. However kudos for the first.
– potong
Commented Jul 13, 2015 at 14:20
@potong true! I have to study more why the first one works. Thanks!
– Ciro Santilli OurBigBook.com
Commented Jul 13, 2015 at 14:22

pataluc · Accepted Answer · 2014-06-11 11:32:58Z

3

something like this works for me:

file.awk:

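BEGIN {
    record = 0          # 1 while inside an abc..mno section
}

# Start marker: begin collecting lines.
/^abc$/ {
    record = 1
}

# End marker: stop collecting, print the buffer, and reset it.
/^mno$/ {
    record = 0
    print "s=" s
    s = ""
}

# Any other line: append it to the buffer while inside a section.
# The alternation is parenthesized so both anchors apply to both markers.
!/^(abc|mno)$/ {
    if (record == 1) {
        s = s "\n" $0
    }
}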

using: awk -f file.awk data...

edit: O_o fedorqui's solution is way better/prettier than mine.

edited Jun 11, 2014 at 11:32

answered Aug 1, 2013 at 8:44

pataluc

3

In GNU awk if (record=1) should be if (record==1), i.e. double = - see gawk comparison operators
– George Hawkins
Commented May 26, 2014 at 8:53

2 revs · Accepted Answer · 2017-04-13 12:36:28Z

3

Don_crissti's answer from Show only text between 2 matching pattern?

firstmatch="abc"
secondmatch="cdf"
sed "/$firstmatch/,/$secondmatch/!d;//d" infile

which, according to the linked comparison, is much more efficient than the awk approach; see here.

edited Apr 13, 2017 at 12:36

community wiki

2 revs
Léo Léopold Hertz 준영

I don't think linking the time comparisons makes much sense here, since the requirements of the questions are quite different, hence the solutions.
– fedorqui
Commented Sep 11, 2015 at 15:11
2

I disagree, because we should have some criteria for comparing answers. Only a few of them use sed.
– Léo Léopold Hertz 준영
Commented Sep 11, 2015 at 16:10

Vijay · Accepted Answer · 2013-08-01 09:13:08Z

2

perl -lne 'print if((/abc/../mno/) && !(/abc/||/mno/))' your_file

answered Aug 1, 2013 at 9:13

Vijay

Good to know perl equivalent as it is a pretty good alternative to both awk and sed.
– akhan
Commented Mar 8, 2017 at 23:46

Weekend · Accepted Answer · 2019-01-02 09:14:01Z

I tried to use awk to print the lines between two patterns where pattern2 also matches pattern1, and the pattern1 line should be printed as well.

For example, given this source:

package AAA
aaa
bbb
ccc
package BBB
ddd
eee
package CCC
fff
ggg
hhh
iii
package DDD
jjj

the output should be

package BBB
ddd
eee

Where pattern1 is package BBB, pattern2 is package \w*. Note that CCC isn't a known value so can't be literally matched.

In this case, neither @scai's awk '/abc/{a=1}/mno/{print;a=0}a' file nor @fedorqui's awk '/abc/{a=1} a; /mno/{a=0}' file works for me.

Finally, I managed to solve it with awk '/package BBB/{flag=1;print;next}/package \w*/{flag=0}flag' file, haha

A little more effort results in awk '/package BBB/{flag=1;print;next}flag;/package \w*/{flag=0}' file, which also prints the pattern2 line, that is,

package BBB
ddd
eee
package CCC
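
Since \w is a GNU awk extension, and \w* can also match nothing, the end pattern can be written as a plain /package / test. The following sketch shows both variants in that form; the ^ and $ anchors and the variable names are assumptions added here, not part of the answer:

```shell
#!/bin/sh
# Sample source from the answer above.
data=$(printf '%s\n' 'package AAA' aaa bbb ccc 'package BBB' ddd eee \
    'package CCC' fff ggg hhh iii 'package DDD' jjj)

# Stop before the next package line.
exclusive=$(printf '%s\n' "$data" |
    awk '/^package BBB$/{flag=1; print; next} /^package /{flag=0} flag')

# Also print the package line that ends the section.
inclusive=$(printf '%s\n' "$data" |
    awk '/^package BBB$/{flag=1; print; next} flag; /^package /{flag=0}')

printf 'exclusive:\n%s\n' "$exclusive"
printf 'inclusive:\n%s\n' "$inclusive"
```

The next in the first rule is what keeps the package BBB line from being treated as its own end marker.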

blhsing · Accepted Answer · 2021-03-05 20:56:07Z

0

This can also be done with logical operations and increment/decrement operations on a flag:

awk '/mno/&&--f||f||/abc/&&f++' file
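
For readers who want to see what the expression does, here is an unrolled sketch that should behave the same way on the question's sample data. It assumes every mno is preceded by an abc (a stray mno would make f negative in both forms); the variable names are illustrative:

```shell
#!/bin/sh
# Sample input from the question.
data=$(printf '%s\n' abc def1 ghi1 jkl1 mno abc def2 ghi2 jkl2 mno pqr stu)

# The one-liner from the answer.
golfed=$(printf '%s\n' "$data" | awk '/mno/&&--f||f||/abc/&&f++')

# The same logic, one test per line.
unrolled=$(printf '%s\n' "$data" | awk '{
    if (/mno/ && --f) { print; next }  # end marker: decrement f, print only if f is still nonzero
    if (f)            { print; next }  # inside a section: print the line
    if (/abc/)        { f++ }          # start marker: f++ yields the old value 0, so nothing prints
}')

printf '%s\n' "$unrolled"
```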

edited Mar 5, 2021 at 20:56

answered Mar 5, 2021 at 20:50

blhsing

I'm absolutely certain that I've used awk in the past for this problem, and it was nothing like this complex.
– Owl
Commented Mar 28, 2022 at 10:45
1

Obviously the accepted answer in awk that predates my answer by more than 7 years is much more readable, and I saw that answer before I posted mine. I'm just throwing this one here because it is one byte shorter than the accepted answer even after renaming its variable flag to f, in the spirit of some good ol' code golf fun. :-)
– blhsing
Commented Mar 30, 2022 at 7:14
