1

I have a text file which contains (among others) the following lines:

{chapter}{{1}Einleitung}{27}{chapter.1}  
{chapter}{{2}Grundlagen}{35}{chapter.2}

How can I

  • get the 2 lines from this text file (they will always contain }Einleitung and }Grundlagen respectively),
  • extract the 2 page numbers (in this case 27 and 35),
  • calculate the difference 35-27 = 8 and
  • save the difference (8) of the two numbers in a variable

Perhaps with a bash script in Mac OS X?

1
  • var=$({ grep -Eo '(Einleitung|Grundlagen)\}.[0-9]+.'|sort -r|tr '\n' ' '| tr -d -c '0-9 '|awk '{print $1 - $2}'; }</tmp/inputfile)
    – wnrph
    Commented Dec 12, 2011 at 12:52

4 Answers

3

I do not know if Mac OS X has awk. If it does, this should work:

DIFFERENZ=$(awk 'BEGIN {
  FS="[{}]+"
 } {
  if ($4=="Einleitung")
   EINLEITUNG=$5
  if ($4=="Grundlagen")
   GRUNDLAGEN=$5
 } END {
   print GRUNDLAGEN-EINLEITUNG
 }' textfile)

How it works:

  • FS="[{}]+" sets the field separator to any run of one or more curly brackets.
  • $4 refers to the chapter title. Each line begins with a separator, so $1 is empty, $2 is chapter, $3 is the chapter number, and $5 is the page number.
  • DIFFERENZ=$(...) evaluates the command ... and stores the output in DIFFERENZ.
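
To see how that separator splits a line, here is a minimal sketch that prints every field with its index. The sample line (a title with spaces and page 41) is made up for illustration and is fed on standard input rather than read from textfile:

```shell
# Split one sample line with the same field separator and list the fields.
# The title and page number below are illustrative sample data.
fields=$(printf '%s\n' '{chapter}{{3}Ergebnisse und Diskussion}{41}{chapter.3}' |
  awk 'BEGIN { FS="[{}]+" } { for (i = 1; i <= NF; i++) printf "$%d=[%s]\n", i, $i }')
echo "$fields"
```

Because spaces are not part of this separator, a title such as Ergebnisse und Diskussion stays in a single field and can be compared with $4 directly.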
6
  • 2
    it has: developer.apple.com/library/mac/#documentation/Darwin/Reference/…
    – akira
    Commented Dec 12, 2011 at 12:05
  • thanks, that works well with my example. How do I have to write a chapter title which contains a space like Ergebnisse und Diskussion? I tried with if ($3=="Ergebnisse und Diskussion"), but that does not seem to find the correct line Commented Dec 12, 2011 at 12:24
  • @Martin: Spaces are treated as separators. if ($3=="Ergebnisse" && $4=="und" && $5=="Diskussion") should work. But the page number will no longer be stored in $4. I'll update my answer.
    – Dennis
    Commented Dec 12, 2011 at 12:29
  • thank you for your help - sorry, I should have directly asked for the more complicated string, but I did not think about this possible complication Commented Dec 12, 2011 at 12:32
  • 1
    @Dennis: and now your answer looks like mine :)
    – akira
    Commented Dec 12, 2011 at 13:41
3

calc.awk:

BEGIN {
    FS="}{";           # split lines by '}{'
    e=0;               # set variable 'e' to 0
    g=0;               # set variable 'g' to 0
}

/Einleitung/ { e=$3; } # 'Einleitung' matches, extract the page
/Grundlagen/ { g=$3;}  # 'Grundlagen' matches, extract the page

END {
    print g-e;         # print difference
}

You can call it via:

$> awk -f calc.awk < in.txt

It will print 8. You can store that number in a bash variable like this:

$> nr=`awk -f calc.awk < in.txt` 

If you want it more compact, you can also inline calc.awk as a one-liner instead of keeping it in a separate file:

$> nr=`awk 'BEGIN{FS="}{";g=0;e=0}/Einleitung/{e=$3;}/Grundlagen/{g=$3;}END{print g-e;}' < in.txt`
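
If the chapter titles vary, one option is to pass them in as awk variables instead of hard-coding the patterns. This is a minimal sketch, assuming the same line layout as in the question; the variable names first and second and the temporary sample file are illustrative and not part of calc.awk. The separator is written as "[}][{]", which should match the same literal text "}{" without relying on how a particular awk treats bare braces in a regular expression:

```shell
# Sample input with the layout from the question, written to a temp file for the demo.
in=$(mktemp)
cat > "$in" <<'EOF'
{chapter}{{1}Einleitung}{27}{chapter.1}
{chapter}{{2}Grundlagen}{35}{chapter.2}
EOF

# index() does a plain substring search, so titles containing spaces
# or regex metacharacters need no escaping.
nr=$(awk -v first='}Einleitung}' -v second='}Grundlagen}' '
  BEGIN { FS = "[}][{]" }          # split on the literal text "}{"
  index($0, first)  { e = $3 }     # page of the first chapter
  index($0, second) { g = $3 }     # page of the second chapter
  END { print g - e }
' "$in")
echo "$nr"
rm -f "$in"
```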
1

Pure bash 4.x; this shows the page difference between each pair of consecutive chapters:

unset page_last title_last page_cur title_cur
re='\{chapter\}\{\{[[:digit:]]+\}([^}]+)\}\{([[:digit:]]+)\}'
while IFS= read -r line; do
    if [[ $line =~ $re ]]; then
        # Capture group 1 is the chapter title, group 2 the page number.
        title_cur=${BASH_REMATCH[1]} page_cur=${BASH_REMATCH[2]}
        # Skip the first chapter: there is no previous page to subtract.
        if [[ -n $page_last ]]; then
            diff=$((page_cur-page_last))
            echo "${diff} pages between \"${title_last}\" and \"${title_cur}\""
        fi
        title_last=$title_cur page_last=$page_cur
    fi
done < "$myfile"
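
To get only the single number the question asks for, a small helper can look up one chapter at a time. This is a minimal sketch, not the method above: it swaps the regular expression for plain parameter expansion, which should also work in older shells. The function name page_of and the temporary sample file are hypothetical:

```shell
# Print the page number of the chapter titled $1, read from file $2.
# Uses only shell parameter expansion, no external commands.
page_of() {
    marker="}$1}{"   # text between the chapter number and the page number
    close='}'
    while IFS= read -r line; do
        case $line in
            *"$marker"*)
                rest=${line#*"$marker"}    # drop everything up to the page number
                page=${rest%%"$close"*}    # drop everything after the page number
                printf '%s\n' "$page"
                return 0 ;;
        esac
    done < "$2"
    return 1
}

# Demo with the two lines from the question.
sample=$(mktemp)
printf '%s\n' '{chapter}{{1}Einleitung}{27}{chapter.1}' \
              '{chapter}{{2}Grundlagen}{35}{chapter.2}' > "$sample"
difference=$(( $(page_of Grundlagen "$sample") - $(page_of Einleitung "$sample") ))
echo "$difference"
rm -f "$sample"
```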
0
$ DIFFERENCE=$(( $( cat FILENAME | grep Grundlagen | head -n1 | cut -c26-27 ) - $( cat FILENAME | grep Einleitung  | head -n1 | cut -c26-27 ) ))
$ echo $DIFFERENCE
8

Because of the cut, this requires that the page number always sits at exactly the same character position (columns 26 to 27), i.e. the same title length and a two-digit page number.
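
A sketch of a variant that does not depend on character positions: it splits on the opening brace instead, assuming the line layout stays as in the question and the title itself contains no opening brace. The helper name page and the temporary file are hypothetical:

```shell
# Sample input with the two lines from the question.
file=$(mktemp)
printf '%s\n' '{chapter}{{1}Einleitung}{27}{chapter.1}' \
              '{chapter}{{2}Grundlagen}{35}{chapter.2}' > "$file"

# Field 5 (delimited by "{") is "<page>}"; tr strips the closing brace.
page() { grep -F "}$1}" "$file" | head -n1 | cut -d'{' -f5 | tr -d '}'; }

DIFFERENCE=$(( $(page Grundlagen) - $(page Einleitung) ))
echo "$DIFFERENCE"
rm -f "$file"
```

Since the page number is taken as a whole field, one-digit and three-digit page numbers are handled the same way as two-digit ones.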

5
  • 1
    it won't even work with different numbers, let's say 1 or 100
    – akira
    Commented Dec 12, 2011 at 12:03
  • @akira If there are that many pages between introduction and fundamentals chapter headlines, he's doing something wrong :-) But you're right of course.
    – Daniel Beck
    Commented Dec 12, 2011 at 12:10
  • @DanielBeck: Thank you for your answer! As you already state (and @akira says), the usage of this solution is quite limited because the numbers have to be exactly at the same position each time. The solutions with awk are more flexible. Commented Dec 12, 2011 at 12:28
  • @Martin While you're right, you never even hinted that e.g. you want to apply a solution to other chapter names. Quite the opposite with your first list item...
    – Daniel Beck
    Commented Dec 12, 2011 at 13:28
  • @DanielBeck: this is true - my question was incomplete. Commented Dec 12, 2011 at 15:12
