How to recreate grep -f with awk

Question

I have a large file with lots of unnecessary info. I am only interested in the sections between edit and next, and I want to handle each of them as one entry. I managed to filter it down like this:

awk  'BEGIN {FS = "\n"; RS = ""; OFS = "\n" ;} {if (/intf/ && /addr/) { print $0"\n"}}' > outputfile

A sample of the output file is shown below.

edit 114
set uuid 6cb43
set action accept
set srcintf "Port-ch40.1657"
set dstintf "any"
set srcaddr "1.1.1.1"
set dstaddr "all"
set schedule "always"
set service "ALL_ICMP" "icmp-echo-reply" "icmp-source-quench" "icmp-time-exceeded" "icmp-unreachable"
set logtraffic all
next

edit 330
set uuid 6d3d
set action accept
set srcintf "Po40.28"
set dstintf "any"
set srcaddr "all"
set dstaddr "2.2.2.2"
set schedule "always"
set service "ALL_ICMP" "icmp-echo-reply" "icmp-source-quench" "icmp-time-exceeded" "icmp-unreachable"
set logtraffic all
next

grep has an option that reads its patterns from a file (grep -f filterfile textfile). Let's assume filterfile contains the values:

1.1.1.1
3.3.3.3

In reality there would be many more, so typing them in manually, as below, is not practical.

awk  'BEGIN {FS = "\n"; RS = ""; OFS = "\n" ;} {if (/intf/ && /addr/ &&(/1.1.1.1/||/3.3.3.3/)) { print $0"\n"}}' > outputfile

Can the awk command be modified to handle values coming from a file, along these lines?

awk  'BEGIN {FS = "\n"; RS = ""; OFS = "\n" ;} {if (/intf/ && /addr/ &&(values_from_filterfile)) { print $0"\n"}}' > outputfile

Olivier Dulac · Accepted Answer · 2023-09-08 13:02:26Z

If your own command:

awk 'BEGIN {FS = "\n"; RS = ""; OFS = "\n" ;} {if (/intf/ && /addr/ &&(/1.1.1.1/||/3.3.3.3/)) { print $0"\n"}}' > outputfile

is really what you need, you can match against a "match.list" file containing one IP address per line with:

gawk '
  ( NR==FNR ) { # NR==FNR only when parsing the first file...
      ipreg=$0; # get one ip from the first file
      gsub(/[.]/, "[.]", ipreg); # ensure each "." only matches a literal dot in the regex
      ipreg= "\\<" ipreg "\\>" # add beginning of word / end of word delimiters (gawk extensions)
      # that way 1.2.3.4 will NOT match: 11.2.3.42
      ipsreg=ipsreg sep ipreg; sep="|" # add it to the list of ipsreg
      # and sep only added before the 2+ elements as it is an empty string for the 1st
      next # skip everything else, we are parsing the first file...
    }

  
    ( /intf/ && /addr/ && ( $0 ~ ipsreg ) ) # default action will be print $0 if it matches...
    # and as ORS at that point will have been set to "\n\n",
    # it will print the record + an empty line after it
 ' match.list FS="\n" RS="" OFS="\n" ORS="\n\n" - > outputfile
   # the things between match.list and - will be seen as definitions to be done at that time,
   # as they contain a "=", and not be interpreted as filenames
   #  - : is STDIN, and will be the 2nd "file" parsed, where NR>FNR (FNR=for the current file, NR=from the beginning)
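If the addresses should be treated as fixed strings rather than regexps (closer to grep -F -f), a literal substring test avoids the escaping altogether. The following is only a sketch under assumptions: the sample files are shortened, hypothetical versions of the ones in the question, and every address is assumed to appear in double quotes in the config, which is what stops 1.1.1.1 from matching inside "11.1.1.12".

```shell
# Hypothetical sample data, modelled on the files in the question.
cat > match.list <<'EOF'
1.1.1.1
3.3.3.3
EOF

cat > textfile <<'EOF'
edit 114
set srcintf "Port-ch40.1657"
set srcaddr "1.1.1.1"
next

edit 330
set srcintf "Po40.28"
set srcaddr "11.1.1.12"
next
EOF

awk '
  NR == FNR {                             # first file: one address per line
      if ($0 != "") ips["\"" $0 "\""]     # store it wrapped in double quotes
      next
  }
  /intf/ && /addr/ {                      # second file: one edit..next block per record
      for (ip in ips)
          if (index($0, ip)) { print; break }   # literal substring test, no regex
  }
' match.list RS="" ORS="\n\n" textfile > outputfile
```

With this sample input only the edit 114 block should end up in outputfile, because the quoted string "1.1.1.1" does not occur in the second block.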

Ed Morton · Accepted Answer · 2023-09-08 11:24:38Z

FWIW I wouldn't use regexps for this. I'd create an array (v[] below) that maps tags such as srcaddr to their values such as "1.1.1.1" or "all"; then you can do a hash lookup of the array indices to find which tags are present in the current block and what values the tags you're interested in have. For example, using any POSIX awk:

$ cat tst.awk
NR==FNR {
    ips["\"" $0 "\""]
    next
}
$1 == "edit" {
    lineNr = 1
}
lineNr {
    tagFld = (NF > 2 ? 2 : 1)
    tag = $tagFld
    match($0,"^([[:space:]]*[^[:space:]]+){" tagFld "}[[:space:]]*")
    heads[tag] = substr($0,1,RLENGTH)
    v[tag] = substr($0,RLENGTH+1)
    tags[lineNr++] = tag

    if ( $1 == "next" ) {
        if (    (("srcintf" in v) && (v["srcaddr"] in ips)) \
             || (("dstintf" in v) && (v["dstaddr"] in ips)) \
           ) {
            for ( i=1; i<lineNr; i++ ) {
                tag = tags[i]
                print heads[tag] v[tag]
            }
            print ""
        }
        delete v
        lineNr = 0
    }
}

$ awk -f tst.awk filterfile textfile
edit 114
set uuid 6cb43
set action accept
set srcintf "Port-ch40.1657"
set dstintf "any"
set srcaddr "1.1.1.1"
set dstaddr "all"
set schedule "always"
set service "ALL_ICMP" "icmp-echo-reply" "icmp-source-quench" "icmp-time-exceeded" "icmp-unreachable"
set logtraffic all
next

With that structure you can trivially test or change whatever values of whatever tags you like and write much more precise tests for what you want the contents of each tag in each block to be rather than just doing a regexp comparison across the whole block. For example if you ever wanted to find/output the blocks where uuid is 6cb43, schedule is always and service includes "icmp-time-exceeded" you can just change this:

        if (    (("srcintf" in v) && (v["srcaddr"] in ips)) \
             || (("dstintf" in v) && (v["dstaddr"] in ips)) \

to this:

        if (    (v["uuid"] == "6cb43") \
             && (v["schedule"] == "\"always\"") \
             && (v["service"] ~ /"icmp-time-exceeded"/) \

and if you want to set any tag to some other value before printing, you can just populate it in v[] before the print loop:

$ cat tst.awk
NR==FNR {
    ips["\"" $0 "\""]
    next
}
$1 == "edit" {
    lineNr = 1
}
lineNr {
    tagFld = (NF > 2 ? 2 : 1)
    tag = $tagFld
    match($0,"^([[:space:]]*[^[:space:]]+){" tagFld "}[[:space:]]*")
    heads[tag] = substr($0,1,RLENGTH)
    v[tag] = substr($0,RLENGTH+1)
    tags[lineNr++] = tag

    if ( $1 == "next" ) {
        if (    (("srcintf" in v) && (v["srcaddr"] in ips)) \
             || (("dstintf" in v) && (v["dstaddr"] in ips)) \
           ) {
            v["action"] = "reject"
            v["dstaddr"] = "\"127.0.0.1\""
            for ( i=1; i<lineNr; i++ ) {
                tag = tags[i]
                print heads[tag] v[tag]
            }
            print ""
        }
        delete v
        lineNr = 0
    }
}

$ awk -f tst.awk filterfile textfile
edit 114
set uuid 6cb43
set action reject
set srcintf "Port-ch40.1657"
set dstintf "any"
set srcaddr "1.1.1.1"
set dstaddr "127.0.0.1"
set schedule "always"
set service "ALL_ICMP" "icmp-echo-reply" "icmp-source-quench" "icmp-time-exceeded" "icmp-unreachable"
set logtraffic all
next
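The uuid/schedule/service test mentioned earlier in this answer can also be tried in isolation. This is a simplified sketch, not the script above: it assumes blocks are separated by blank lines, that every tag line has the form "set tag value", and that the original spacing does not need to be preserved. The sample data is hypothetical. Note that the stored values keep their double quotes, so the comparison for schedule has to include them.

```shell
# Hypothetical, shortened sample data.
cat > textfile <<'EOF'
edit 114
set uuid 6cb43
set schedule "always"
set service "ALL_ICMP" "icmp-time-exceeded"
next

edit 330
set uuid 6d3d
set schedule "always"
set service "ALL_ICMP" "icmp-time-exceeded"
next
EOF

awk '
  BEGIN { RS = ""; FS = "\n"; ORS = "\n\n" }   # one blank-line separated block per record
  {
      split("", v)                             # empty the tag-to-value map
      for (i = 1; i <= NF; i++) {              # each field is one line of the block
          line = $i
          if (split(line, w, " ") < 3 || w[1] != "set") continue
          sub(/^[ \t]*set[ \t]+[^ \t]+[ \t]+/, "", line)   # drop the leading: set tag
          v[w[2]] = line                       # the value keeps any double quotes
      }
      if (    (v["uuid"] == "6cb43") \
           && (v["schedule"] == "\"always\"") \
           && (v["service"] ~ /"icmp-time-exceeded"/) )
          print
  }
' textfile > outputfile
```

Only the edit 114 block satisfies all three tests here; the edit 330 block fails on uuid.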

meuh · Accepted Answer · 2023-09-08 14:08:31Z

To answer just the specific point, "can awk handle values coming from a file": yes, you can redirect input for a getline command using < and the filename. Add to the end of your BEGIN block:

getline <"filterfile";
fromfilter = $0;
gsub("\n","|",fromfilter);

Since you have already set FS and RS, getline will read the whole file into $0 (provided it contains no blank lines), so you simply have to replace the line separators with the regexp | operator. Use the resulting variable with match:

if (/intf/ && /addr/ && match($0,fromfilter)) { print $0"\n"}
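For reference, here is one way the pieces might fit together into a complete command. It is a variant sketch rather than the exact code above: it reads filterfile line by line while RS still has its default value, skips blank lines, and rewrites each dot as [.] so that it only matches a literal dot. The sample files are hypothetical, shortened versions of those in the question.

```shell
# Hypothetical sample data, modelled on the files in the question.
cat > filterfile <<'EOF'
1.1.1.1
3.3.3.3
EOF

cat > textfile <<'EOF'
edit 114
set srcintf "Port-ch40.1657"
set srcaddr "1.1.1.1"
next

edit 330
set srcintf "Po40.28"
set dstaddr "2.2.2.2"
next
EOF

awk '
  BEGIN {
      # Read the filter file one line at a time (RS is still the default here).
      while ((getline line < "filterfile") > 0) {
          if (line == "") continue          # ignore blank lines
          gsub(/[.]/, "[.]", line)          # make each dot match only a literal dot
          fromfilter = fromfilter sep line  # build addr1|addr2|...
          sep = "|"
      }
      close("filterfile")
      FS = "\n"; RS = ""                    # paragraph mode for the main input
  }
  /intf/ && /addr/ && fromfilter != "" && $0 ~ fromfilter { print $0 "\n" }
' textfile > outputfile
```

As with the original one-liner, there are no word boundaries in the pattern, so an entry such as 1.1.1.1 would still match inside 11.1.1.12.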
