
I have a script in which I need to test a string variable for every possible combination of several substrings. Each combination requires different handling of the variable's contents, so something like this:

if $a contains "a" AND "b" AND "c"; then do x
elif $a contains "a" AND "b" but NOT "c"; then do y
elif $a contains "b" AND "c" but NOT "a"; then do z
...

As far as I know, the way of doing that is constructing an if conditional like this one:

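if [[ $A == *"b"* ]] && [[ $A == *"c"* ]] && [[ $A == *"d"* ]]; then : # b, c and d
elif [[ $A == *"b"* ]] && [[ $A == *"c"* ]]; then : # b and c only
elif [[ $A == *"c"* ]] && [[ $A == *"d"* ]]; then : # c and d only
elif [[ $A == *"b"* ]] && [[ $A == *"d"* ]]; then : # b and d only
elif [[ $A == *"b"* ]]; then : # b only
elif [[ $A == *"c"* ]]; then : # c only
elif [[ $A == *"d"* ]]; then : # d only
fi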

But that is of course too complicated to read, understand and write without mistakes, given that the name of my variable ($A) and the substrings (b, c, d) are way longer than that. So I wanted to see if there was a way of saving the contents of a conditional expression to a variable:

contains_b= *a condition for [[ $A == *"b"* ]]*; echo $contains_b -> true | false

I only found the response here [contains_b=$(! [ "$A" = *"b"* ]; echo $?)]; however, $contains_b -> 0 won't work in subsequent conditionals because:

if $contains_b; then do x; fi -> bash: [0]: command not found

So the only solution that I could think of is doing it manually:

if [[ $A == *"b"* ]]; then
    contains_b=true
else
    contains_b=false
fi

However, I would end up writing three if statements to set the three variables, plus another 7 comparisons to handle the different combinations.

I am wondering if there is a different/more efficient way of doing it. If not, do you have a suggestion for another way of doing these multiple comparisons? I feel that I am making it overly complicated...

Thank you for any help.

2
  • How many strings do you need to compare against $A? In your question you have 3, but how many is "way more"?
    – Kusalananda
    Commented Sep 28, 2022 at 6:28
  • @Kusalananda I want to see if $A contains three different substrings ($a, $b, $c). The comparisons are the ones that make it long: if it has a AND b AND c, vs just one, or two.
    – Zaida
    Commented Sep 28, 2022 at 6:31

4 Answers

8

Create a binary mask and then act on it. This has the benefit of only performing each test once, and it separates the testing from acting on the result of the tests.

Note that the code uses the patterns as extended regular expressions. To compare them as strings, use

[[ $string == *"$pattern"* ]]

in place of

[[ $string =~ $pattern ]]

in the code below.

patterns=( a b c )
string='abba'

mask=0; i=0
for pattern in "${patterns[@]}"; do
        if [[ $string =~ $pattern ]]; then
                # setting the i:th bit from the right to one
                mask=$(( mask | (1 << i) ))
        fi
        i=$(( i + 1 ))
done

case $mask in
        0) echo no match ;;
        1) echo first pattern matched ;;
        2) echo second pattern matched ;;
        3) echo first and second pattern matched ;;
        4) echo third pattern matched ;;
        5) echo first and third pattern matched ;;
        6) echo second and third pattern matched ;;
        7) echo all patterns matched ;;
        *) echo error
esac

Or use a string mask made of ones and zeros (zero denoting no match and one denoting a match). Note that the string in mask below is the reverse of the binary representation of the numbers used in the code above.

patterns=( a b c )
string='abba'

unset -v mask
for pattern in "${patterns[@]}"; do
        ! [[ $string =~ $pattern ]]
        # string concatenation of the exit status of the previous command
        mask+=$?
done

case $mask in
        000) echo no match ;;
        100) echo first pattern matched ;;
        010) echo second pattern matched ;;
        110) echo first and second pattern matched ;;
        001) echo third pattern matched ;;
        101) echo first and third pattern matched ;;
        011) echo second and third pattern matched ;;
        111) echo all patterns matched ;;
        *) echo error
esac

The output from each of these scripts would be

first and second pattern matched

... since the string abba matches the first two patterns, a and b.
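
The testing loop could also be packaged as a reusable function. The following is only a sketch: the function name match_mask and its argument order (the string first, then the substrings) are assumptions made for illustration, and it uses the literal substring comparison rather than the regular expression match.

```shell
#!/usr/bin/env bash

# match_mask STRING SUBSTRING...   (hypothetical helper, not part of the answer)
# Prints a decimal bit mask: bit i, counted from the right starting at 0,
# is set when the i-th substring occurs in STRING.
match_mask() {
    local string=$1 mask=0 i=0 pattern
    shift
    for pattern in "$@"; do
        # Quoting "$pattern" makes this a literal substring test.
        if [[ $string == *"$pattern"* ]]; then
            mask=$(( mask | (1 << i) ))
        fi
        i=$(( i + 1 ))
    done
    printf '%s\n' "$mask"
}

# Same data as in the examples above: "a" and "b" occur, "c" does not.
mask=$(match_mask 'abba' a b c)
echo "$mask"    # prints 3
```

The printed value can then be fed to the same case statement as in the first script.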

9
  • This is definitely the most clever answer. Just notice that in your first solution, since you don't pre-set mask with zero, if no matches are found, the result would be error and not no match. In addition, it would be nice to make a more generic function (that gets a string and an array of characters) and returns the mask. But the solution itself is brilliant, and anyone could make a function out of it.
    – aviro
    Commented Sep 28, 2022 at 8:28
  • @aviro Not setting mask=0 was an oversight that I will fix straight away. Thanks for spotting that. Packaging the initial loop in a function would be easy, but since it adds to the complexity of the answer, and since the user's request was to make the code easy to read, I will leave that to the original user to request.
    – Kusalananda
    Commented Sep 28, 2022 at 8:39
  • See also case $(( [##2] mask )) in zsh to get the number in binary. Or typeset -i2 mask in ksh or zsh.
    Commented Sep 28, 2022 at 8:42
  • Or use $((2#$mask)) in bash to convert a binary number to decimal. @Kusalananda
    Commented Sep 28, 2022 at 15:52
  • Of course, you need to use mask="$?$mask" instead of mask+=$? to get a correct binary number in mask. @Kusalananda
    Commented Sep 28, 2022 at 15:55
6

With bash 4.0 or above, or mksh R40 or above, you can do it all with two case statements:

a=n b=n c=n
case $string in
  (*a*) a=y ;;&
  (*b*) b=y ;;&
  (*c*) c=y
esac
case $a$b$c in
  (nnn) echo none   ;;
  (ynn) echo a only ;;
  (yyn) echo ab     ;;
  (yyy) echo all    ;;
    (*) echo other
esac

The zsh equivalent of bash's ;;& is ;|, also supported by mksh.
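
To see the effect of ;;& in isolation, here is a small sketch for bash 4.0 or above. The function name flags and the sample strings are made up for illustration; the three patterns are the same single letters used above.

```shell
#!/usr/bin/env bash

# flags STRING   (hypothetical helper)
# Prints three y/n letters, one each for the substrings "a", "b" and "c".
flags() {
    local a=n b=n c=n
    case $1 in
        (*a*) a=y ;;&   # ;;& goes on to test the remaining patterns
        (*b*) b=y ;;&
        (*c*) c=y
    esac
    printf '%s\n' "$a$b$c"
}

for s in abba cab xyz; do
    printf '%s -> %s\n' "$s" "$(flags "$s")"
done
# abba -> yyn
# cab -> yyy
# xyz -> nnn
```

With a plain ;; terminator instead, the case statement would stop at the first matching pattern, so abba would yield ynn.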

2

I only found the response here [contains_b=$(! [ "$A" = *"b"* ]; echo $?)]; however, $contains_b -> 0 won't work in subsequent conditionals because:

if $contains_b; then do x; fi -> bash: [0]: command not found

Once you have the result as a number, you should use arithmetic operations on it, instead of running it as a command:

# quote the pattern variables so that they are compared as literal substrings
contains_a=$(! [[ $A == *"$pattern_a"* ]]; echo $?)
contains_b=$(! [[ $A == *"$pattern_b"* ]]; echo $?)
contains_c=$(! [[ $A == *"$pattern_c"* ]]; echo $?)

if (( contains_a && contains_b && contains_c )); then
    echo all
elif (( contains_a && contains_b )); then
    echo ab
...
fi
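
For completeness, here is one way the full chain might look once every combination is spelled out. This is a sketch only: the function name classify, its argument order and the labels it prints are assumptions for illustration, not part of the question.

```shell
#!/usr/bin/env bash

# classify STRING SUB_A SUB_B SUB_C   (hypothetical helper)
# Prints which of the three substrings occur in STRING.
classify() {
    local A=$1 pattern_a=$2 pattern_b=$3 pattern_c=$4
    local contains_a contains_b contains_c

    # Each variable becomes 1 when the substring is found, 0 otherwise.
    contains_a=$(! [[ $A == *"$pattern_a"* ]]; echo $?)
    contains_b=$(! [[ $A == *"$pattern_b"* ]]; echo $?)
    contains_c=$(! [[ $A == *"$pattern_c"* ]]; echo $?)

    # Most specific combinations first, so that each branch means "exactly these".
    if   (( contains_a && contains_b && contains_c )); then echo all
    elif (( contains_a && contains_b )); then echo ab
    elif (( contains_a && contains_c )); then echo ac
    elif (( contains_b && contains_c )); then echo bc
    elif (( contains_a )); then echo a
    elif (( contains_b )); then echo b
    elif (( contains_c )); then echo c
    else echo none
    fi
}

classify 'abba' a b c    # prints ab
```

Because the branches are ordered from most to least specific, none of them needs an explicit "but NOT" test.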
1

It is possible to build a binary number by testing each pattern only once with a regex match (=~) in bash. To build the binary number in the correct order (the MSB is on the left and corresponds to the last test performed), we need to prepend the (negated) exit code to a variable.

#!/bin/bash --

string=${1:-"abba"}

patterns=( a b c )

unset bin
for re in "${patterns[@]}"; do
    ! [[ $string =~ $re ]]; bin="$?$bin"
done

case $((2#$bin)) in
        0) echo no match ;;
        1) echo first pattern matched ;;
        2) echo second pattern matched ;;
        3) echo first and second pattern matched ;;
        4) echo third pattern matched ;;
        5) echo first and third pattern matched ;;
        6) echo second and third pattern matched ;;
        7) echo all patterns matched ;;
        *) echo error
esac

Just an alternative solution to a previous answer.
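
The effect of prepending can be checked with a short sketch that prints both the binary string and its decimal value. The function name match_bits is made up for illustration, and the sketch assumes that at least one pattern is given (an empty string would not be a valid operand for 2#).

```shell
#!/usr/bin/env bash

# match_bits STRING REGEX...   (hypothetical helper)
# Prints the binary string followed by its decimal value.
match_bits() {
    local string=$1 bin='' re
    shift
    for re in "$@"; do
        # Prepend 1 on a match and 0 otherwise, so the first pattern
        # ends up as the rightmost (least significant) digit.
        ! [[ $string =~ $re ]]; bin="$?$bin"
    done
    printf '%s %d\n' "$bin" "$((2#$bin))"
}

match_bits abba a b c    # prints 011 3
match_bits cc a b c      # prints 100 4
```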
