A file:

a
b

I can read it one byte at a time with dd:

dd if=file count=1 skip=0 bs=1 # show a
dd if=file count=1 skip=1 bs=1 # show "newline"
dd if=file count=1 skip=2 bs=1 # show b

I want to find the offset of the first newline before a given offset, searching backwards. Here is a naive attempt using an 'if' statement in a bash script:

para1=$1
while true
do
    c=$(dd if=file count=1 bs=1 skip=$para1)
    if [ $c -eq "\n" ]   # How to write this line?
    then
        break
    fi
    para1=`expr $para1 - 1`
done
echo $para1

bash fun.sh 2
# the output should be 1

Actually, I have found a solution here: How do i compare if my variable holds a newline character in shell Script

if [ ${#str} -eq 0 ] 

But I wonder whether it is robust enough, or whether there is a more elegant way to do it.

  • I don't understand your task: what's a "newline offset", what's the "given offset", what do you mean by "reversely"? Notice that the $ is only to illustrate a linebreak. It's not really there. Commented Oct 30, 2019 at 3:57
  • for example: a file with "1234567", the offset of "1" is 0, the offset of "3" is 2, given the offset 2, I need to search "7" reversely, so there is none, but if I need to search "1" reversely, the found offset is 0. "$" did take a character. Use "dd skip=xxx count=1 bs=1" you can find a gap between the last char of line and the first char of the next line @BenjaminW.
    – YNX
    Commented Oct 30, 2019 at 4:08
  • Would something like awk do? Ex. awk -v ndx=3 'sum+length($0) < ndx {sum+=length($0); next} {print sum; exit}' file where the value of ndx is the character in the file you are looking for and the character number of the newline prior to the index is the result? (1 in this case). Or given the file containing "hello\nworld\nthis\nis\na\ntest\n" and ndx=12 (the 'h' in "this"), the result is 10 (the newline before 't') Commented Oct 30, 2019 at 4:45
  • The only place where $ means newline is in regular expressions. I don't see how that's relevant to this task.
    – Barmar
    Commented Oct 30, 2019 at 5:18
  • $( ) always trims any newlines at the end of what it reads, so the result will never match \n or \r\n. One way around this is to add a protective non-newline at the end, then remove it: c=$(dd if=file count=1 bs=1 skip=$para1; echo x); c=${c%x}. Also, for the test, -eq does numeric comparisons, not string comparison. Also, comparing to "\n" will compare to a literal backslash followed by the letter "n", not a newline. For a newline, use $'\n'. Commented Oct 30, 2019 at 6:55
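
Putting the three points of the last comment together, the loop from the question could be sketched roughly as follows. The function name newline_before and the sentinel character x are illustrative choices, not part of the original post, and the search is assumed to start at the given offset itself, as the question's loop does.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the offset of the nearest newline at or before
# offset $2 in file $1, or return 1 if there is none.
newline_before() {
    local file=$1 pos=$2 c
    while (( pos >= 0 )); do
        # Append a sentinel "x" so $( ) cannot strip the newline, then remove it.
        c=$(dd if="$file" bs=1 count=1 skip="$pos" 2>/dev/null; echo x)
        c=${c%x}
        # String comparison (=), and $'\n' for a real newline character.
        if [ "$c" = $'\n' ]; then
            echo "$pos"
            return 0
        fi
        pos=$(( pos - 1 ))
    done
    return 1
}
```

With the two-line file from the question, newline_before file 2 prints 1. Running one dd process per byte is slow, so this is only reasonable for small offsets.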

1 Answer

Please focus on the code:

c=$(dd if=test1 skip=2 bs=1 count=1)

The Command Substitution section of man bash describes:

Bash performs the expansion by executing command ... with any trailing newlines deleted.

Because of this, a newline returned by the dd command above is removed before it is assigned to c. The test code below shows it:

for (( i=1; i<=3; i++ )); do
    c="$(dd if=test1 skip="$i" bs=1 count=1 2>/dev/null)"
    echo "skip = $i"
    echo -n "$c" | xxd
done

In general, bash is not well suited to handling the newline character explicitly, because it sometimes removes or adds one automatically.
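
As a small illustration of both directions (the variable names are arbitrary), command substitution drops trailing newlines, while a here-string appends one:

```shell
#!/usr/bin/env bash
# Command substitution strips every trailing newline from the output.
stripped=$(printf 'a\n\n\n')
echo "${#stripped}"                  # length of what is left, i.e. of "a"

# A here-string appends one newline to the data it feeds to the command.
piped=$(printf 'abc' | wc -c)        # byte count through a pipe
with_here=$(wc -c <<< 'abc')         # byte count through a here-string
echo "$(( with_here - piped ))"      # the difference is the added newline
```

Both echo commands print 1.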

If Perl is an option, please try the following:

perl -0777 -ne '
    $given = 3;     # an example of the given offset
    printf "character at offset %d = %s\n", $given, substr($_, $given, 1);
    # last newline among the bytes before the given offset
    $pos = rindex(substr($_, 0, $given), "\n");
    if ($pos < 0) {
        print "not found\n";
    } else {
        printf "newline found at offset %d\n", $pos;
    }
' file

If you prefer bash, here is the alternative in bash:

file="./file"
given=3                               # an example of the given offset

str="$(xxd -ps "$file" | tr -d '\n')" # the whole file as one hexadecimal string
for (( i=given-1; i>=0; i-- )); do    # search backwards, just before the given offset
    j=$(( i * 2 ))
    c="${str:$j:2}"                   # two hex digits = the byte at offset i
    if [[ $c = "0a" ]]; then          # 0a is the newline character
        printf "newline found at offset %d\n" "$i"
        exit
    fi
done
echo "not found"

The concept is the same as in the Perl version: it first converts the whole file into a hexadecimal string, then searches backwards from the given offset for the byte "0a".
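
On systems without xxd, the same hexadecimal idea can be sketched with od, which POSIX specifies. The function name last_newline_before is hypothetical, and searching strictly before the given offset is an assumption made for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the absolute offset of the last newline strictly
# before offset $2 in file $1, or return 1 if there is none.
last_newline_before() {
    local file=$1 given=$2 hex i
    # Dump only the first $given bytes as one hex string, two digits per byte.
    hex=$(od -An -v -tx1 -N "$given" "$file" | tr -d '[:space:]')
    for (( i = ${#hex} / 2 - 1; i >= 0; i-- )); do
        if [[ ${hex:i*2:2} == 0a ]]; then   # 0a is the newline byte
            echo "$i"
            return 0
        fi
    done
    return 1
}
```

With the file from the question, last_newline_before file 2 prints 1. Reading only the bytes before the given offset also avoids converting the whole file.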

Hope this helps.

  • I've just updated my answer with a bash alternative. Please enjoy!
    – tshiono
    Commented Oct 30, 2019 at 7:30