
Bash has a nice feature for localization (message translation):

TEXTDOMAIN=coreutils
LANG=fr_CH.utf8
echo $"system boot"
démarrage système

(Note: for this to work, the fr_CH.utf8 locale must already be generated on your system. Otherwise, try with your own locale, or install locales and generate one.)

The problem:

This works fine with simple strings, but when the string contains a \n (or worse, a backtick `), things get more complicated:

echo $"Written by %s, %s, %s,\nand %s.\n"
Written by %s, %s, %s,\nand %s.\n

This is not the expected output.

(Note 2: for this to work, the exact message must exist in a .mo message file. In this test I use the existing coreutils.mo files, which can be decompiled with the msgunfmt command.)

In the end, the only way I've found to do the translation is:

eval echo \$\"$'Written by %s, %s, %s,\nand %s.\n'\"
Écrit par %s, %s, %s,
et %s.

or

msg=$'Written by %s, %s, %s,\nand %s.\n'
eval echo \$\""$msg"\"
Écrit par %s, %s, %s,
et %s.

(Note the two pairs of double quotes... not very sexy...)

And finally I can do:

WRITTERS=(Hans Pierre Jackob Heliott)
eval printf \$\""$msg"\" ${WRITTERS[@]}
Écrit par Hans, Pierre, Jackob,
et Heliott.

But as I've heard recently that eval is evil... ;-)

In fact, I have no problem with an eval that runs only on hard-coded parts, but I would appreciate a way to avoid this eval and to write this kind of thing in a more natural or readable manner.

Also, @techno's answer made me realize that my first idea is dangerous, for example if WRITTERS contains something like ;ls.
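To make that risk concrete, here is a small sketch with a harmless payload. The `echo INJECTED` string is purely illustrative and not taken from the question; the sketch contrasts passing the array through eval with passing it straight to printf.

```bash
#!/bin/bash
# Hypothetical payload: a "name" that contains a shell command separator.
WRITTERS=('Hans;echo INJECTED')
msg='Written by %s.\n'

# Unsafe: eval re-parses the expanded words, so ";echo INJECTED" runs as a command.
unsafe=$(eval printf "'$msg'" ${WRITTERS[@]})

# Safe: the array elements reach printf as plain arguments and are never re-parsed.
safe=$(printf "$msg" "${WRITTERS[@]}")

printf 'unsafe: %s\n' "$unsafe"
printf 'safe:   %s\n' "$safe"
```

In the unsafe case the word INJECTED is printed by a separate echo command, which is exactly what an attacker-controlled value such as ;ls would exploit.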

Edit: So the question is:

How can I avoid this eval and/or write this in a more sexy fashion?

Note:

$ printf "I use bash %s on Debian %s\n" $BASH_VERSION $(</etc/debian_version)
I use bash 4.1.5(1)-release on Debian 6.0.6
  • ... "I would appreciate a way to keep this eval out" may not really sound like a question, but it is. Commented Dec 25, 2012 at 7:05
  • Ah OK. For \n, try echo -e. It is still not very clear what exactly is being asked in other parts of the question. Why are you using echo at all? What's wrong with printf $"message key" $var1 $var2? Commented Dec 25, 2012 at 7:21
  • @n.m. My complaint is that $"$msg" works only if $msg doesn't contain a \n. If it does, I need to write something ugly like eval... \$\""$msg"\"... Commented Dec 25, 2012 at 7:34
  • 1
    I'd suggest trying the help-bash mailing list. This is a feature that's rarely used both because it's obscure, and because of security bugs. Even people that do a lot of scripting don't tend to use it.
    – ormaaj
    Commented Dec 25, 2012 at 9:19
  • Aha, it's more clear now. stackoverflow.com/questions/9139401/… Commented Dec 25, 2012 at 10:04

5 Answers

6
+50

I've played a little bit with this feature and this is what I came up with: you can include the newline verbatim as:

$ echo $"Written by %s.
> "
Écrit par %s.
$ 

In a script:

#!/bin/bash

message=$"Written by %s.
"

printf "$message" Gniourf

This script will output:

Écrit par Gniourf.

Ok, this is not really an answer, but it might help a little bit (at least, we're not using the evil eval).
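The same literal-newline syntax also works for a table of messages. Here is a minimal sketch, assuming bash 4 or later for associative arrays. The array name Messages and its keys are illustrative, and without an installed coreutils catalogue for the active locale the English strings are simply stored unchanged.

```bash
#!/bin/bash
# Sketch only: Messages and its keys (sysBoot, writtenBy) are illustrative names.
TEXTDOMAIN=coreutils
declare -A Messages

# Each $"..." is looked up once, here, under the locale active at this point.
Messages[sysBoot]=$"system boot"
Messages[writtenBy]=$"Written by %s, %s, %s,
and %s.
"

printf '%s\n' "${Messages[sysBoot]}"
printf "${Messages[writtenBy]}" Athos Portos Aramis Dartagnan
```

Run it with, for example, LANG=fr_CH.utf8 in the environment to see the translated text.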

Personal remark: I find this feature really clunky!

5
  • Yes, thanks, I've seen that too, but as you said, this is not really -the- answer ;-) But +1, as your way of storing and re-using $message is cleaner than what I had already tested. (My initial idea was to use a bash associative array to store all messages, so I can imagine a nice way of doing that with your syntax.) Commented Dec 25, 2012 at 14:05
  • 1
    ...cleaner and more efficient... if all messages have to be printed at least one time; your way runs the translation when setting the variable, while mine runs it when the message is used... More or less, depending on what, when and how... Commented Dec 25, 2012 at 14:20
  • 1
    You can probably mix $"..." and $'...' for the desired effect. msg=$'Written by %s.\n'; echo $"$msg"
    – tripleee
    Commented Jan 1, 2013 at 17:54
  • @tripleee And what if the message contains backticks? Try: msg="$(msgunfmt /usr/share/locale/fr/LC_MESSAGES/coreutils.mo | sed -ne '/missing character class/s/^.* "\(.*\)"/\1/p')" Commented Jan 3, 2013 at 20:37
  • @tripleee (copy) Thanks to everyone. (My bounty will go to gniourf_gniourf unless best answer in 8 hours. But thanks to techno too, I like your lPrintf! ) Commented Jan 3, 2013 at 21:22
3

Since using eval on arbitrary variables is bad, here is a way to translate only when called/needed, by running eval only on the message part:

function lPrintf() {
    local sFormat="$(
        eval 'echo $"'"${1}"'"'.
    )"
    shift
    printf "${sFormat%.}" "$@"
}

lPrintf "system boot"
démarrage système

lPrintf  $'Written by %s, %s, %s,\nand %s.\n' techno moi lui-même bibi
Écrit par techno, moi, lui-même,
et bibi.

(The dot at the end of the translated string ensures that the whole string, including the trailing line break, is passed to the variable sFormat. The dot is then dropped with ${sFormat%.}.)
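The sentinel trick can be checked in isolation, without any translation involved. This small sketch (variable names are illustrative) compares a plain command substitution with one that appends a dot and strips it afterwards:

```bash
#!/bin/bash
# A string ending in two newlines: 8 visible characters plus 2 newlines.
text=$'line one\n\n'

# Command substitution silently drops every trailing newline.
plain=$(printf '%s' "$text")

# Appending a sentinel dot keeps the newlines; the dot is removed afterwards.
kept=$(printf '%s.' "$text")
kept=${kept%.}

printf 'without sentinel: %d characters\n' "${#plain}"
printf 'with sentinel:    %d characters\n' "${#kept}"
```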

2
  • Nice way to limit use of eval. And to make them a little more sexy! +1, but there is an eval anyway... Commented Dec 27, 2012 at 22:47
  • Thanks to everyone. (My bounty will go to gniourf_gniourf unless best answer in 8 hours. But thanks to techno too, I like your lPrintf! ) Commented Jan 3, 2013 at 21:31
2

OK, I think I finally got it right.

iprintf() {
    msg="$2"
    domain="$1"
    shift
    shift
    imsg=$(gettext -ed "$domain" "$msg" ; echo EOF)
    imsg="${imsg%EOF}"
    printf "$imsg" "$@"
}

Usage example:

LANG=fr_CH.utf8 iprintf coreutils "If FILE is not specified, use %s.  %s as FILE is common.\n\n" foo bar
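
A variant of the same idea, sketched with local variables and a single shift 2. The name tprintf is hypothetical. Unlike the function above, it calls gettext without -e, so the msgid is expected to carry real newlines (written with $'...'), and falling back to the untranslated text when gettext is not installed is an assumption of this sketch.

```bash
#!/bin/bash
# Sketch of a gettext-based printf wrapper; tprintf is a hypothetical name.
tprintf() {
    local domain=$1 msgid=$2 fmt
    shift 2
    if command -v gettext >/dev/null 2>&1; then
        # The appended "x" protects trailing newlines from command substitution.
        fmt=$(gettext -d "$domain" "$msgid"; printf x)
        fmt=${fmt%x}
    else
        fmt=$msgid   # assumption: fall back to the untranslated text
    fi
    printf "$fmt" "$@"
}

tprintf coreutils $'Written by %s, %s, %s,\nand %s.\n' Athos Portos Aramis Dartagnan
```

The format string still comes from the message catalogue, so this is only as trustworthy as the .mo file it reads.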
6
  • Yes, sure, but while this avoids eval and the ugly form, it adds a fork and is not as quick as invoking $"...". But thanks for the contribution! Commented Dec 25, 2012 at 21:41
  • If you can produce profiling data that pinpoint this fork as the performance bottleneck in your system, you probably should not have written it in bash in the first place. Commented Dec 25, 2012 at 21:51
  • The difference exists (tiny as it is). The feature exists. So if it exists, why would I need an expensive fork? Since every cost should be reduced where possible, I think your comment is not constructive. Commented Dec 26, 2012 at 10:29
  • 2
    More electrons are probably wasted on this thread than ever will be on all these forks... Commented Dec 26, 2012 at 11:26
  • 1
    A shell script that doesn't invoke any external programs probably shouldn't be a shell script. Note the subshell doesn't really add any cost. It's a single fork+exec either way. @n.m. I'd use local variables, or better, use the parameters directly. It's also not a very good idea to expand a variable into the first argument of printf, especially in Bash, especially when it results from calling an external program. The -v option can result in executing arbitrary code. Also, shift can take an argument to indicate the number of shifts.
    – ormaaj
    Commented Dec 29, 2012 at 7:02
1

Simple solution for building a translation function:

f() {
    eval 'local msg=$"'"${1//[\"\$\`]}"\"
    shift
    printf "${msg}" "$@"
}
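
The parameter expansion inside the eval line removes the three characters that could break out of the double-quoted string. A quick sketch of what it does to a sample input (the input string is illustrative):

```bash
#!/bin/bash
# Illustrative input containing the three characters the function removes.
raw='say "hi" to $USER and `id`'

# Same pattern as in f(): delete every double quote, dollar sign and backtick.
clean=${raw//[\"\$\`]}

printf '%s\n' "$clean"
```

Backslashes are not part of the pattern, so they are left in place.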

Test:

TEXTDOMAIN=coreutils
LANG="fr_CH.utf8"
f system boot
démarrage système

f $'Written by %s, %s, %s,\nand %s.\n' Athos Portos Aramis Shreck
Écrit par Athos, Portos, Aramis
et Shreck.

But since I prefer setting a variable over forking a subshell to capture the function's output:

f() {
    eval 'local msg=$"'"${1//[\"\$\`]}"\"
    local -n variable=$2
    shift 2
    printf -v variable "$msg" "$@"
}

Then

f $'Written by %s, %s, %s,\nand %s.\n' string Huey Dewey Louie Batman
echo ${string@Q}
$'Écrit par Huey, Dewey, Louie\net Batman.\n'

echo "$string"
Écrit par Huey, Dewey, Louie
et Batman.

Or even better as a full translation function:

f() {
    local store=false OPTIND OPTARG OPTERR varname
    while getopts 'd:v:' opt ;do
        case $opt in
            d ) local TEXTDOMAIN=$OPTARG ;;
            v ) varname=$OPTARG ;;
        esac
    done
    shift $((OPTIND-1))
    eval 'local msg=$"'"${1//[\"\$\`]}"\"
    shift
    printf ${varname+-v} $varname "$msg" "$@"
}

Then

f -d libc -v string "Permission denied"
echo $string
Permission non accordée

f -d coreutils $'Written by %s, %s, %s,\nand %s.\n' Riri Fifi Loulou Georges
Écrit par Riri, Fifi, Loulou
et Georges.

Old answer (Jan 2013)

Well, here is my own answer:

This does not seem well implemented at present. It works in many situations, but while

echo "$(gettext 'missing character class name `[::]'\')"
caractère de nom de classe « [::] » manquant

simply works, the same string seems impossible to translate using this bashism:

echo $"missing character class name `[::]'"
> 

The console stays blocked, waiting for the end of the string. Adding `" then plunges bash into a complex interpretation process :->>

> `"
bash: command substitution: line 1: Caractère de fin de fichier (EOF) prématuré lors de la recherche du « ' » correspondant
bash: command substitution: line 2: Erreur de syntaxe : fin de fichier prématurée
missing character class name 

And, of course:

echo $"missing character class name \`[::]'"
missing character class name `[::]'

does no translation. :-p

Yet translating this string, which contains two backticks, works fine:

echo $"%s}: integer required between `{' and `}'"
%s} : entier requis entre « { » et « } »

Here is a script where you can see some of my unsuccessful attempts.

#!/bin/bash

echo "Localized tests"
export TEXTDOMAIN=coreutils
export LANG=fr_CH.UTF-8
export WRITTERS=(Athos Portos Aramis Dartagnan\ Le\ Beau)

echo '#First method# whitout eval'

declare -A MyMessages;
MyMessages[sysReboot]=$"system boot"
MyMessages[writtenBy]=$"Written by %s, %s, %s,
and %s.
"
MyMessages[intReq]=$"%s}: integer required between `{' and `}'"
MyMessages[trClass]=$"when translating, the only character classes that may appear in
string2 are `upper' and `lower'"
# MyMessages[missClass]=$"missing character class name `[::]'" 

for msgIdx in ${!MyMessages[@]} ;do
    printf "\n--- Test chain '%s' ---\n" $msgIdx
    case $msgIdx in
    writ* )
        printf "${MyMessages[$msgIdx]}\n" "${WRITTERS[@]}"
        ;;
    intReq )
        printf "ARRAY{${MyMessages[$msgIdx]}\n" NaN
        ;;
    * )
        printf "${MyMessages[$msgIdx]}\n"
        ;;
    esac
  done

echo $'###\n#Second method# whith limited eval'
unset MyMessages;

declare -A MyMessages;

lPrintf() {
    local sFormat="$(
        eval 'echo $"'"${1}"'"'.
    )"
    shift
    printf "${sFormat%.}" "$@"
}

MyMessages[sysReboot]="system boot"
MyMessages[writtenBy]=$'Written by %s, %s, %s,\nand %s.\n'
MyMessages[intReq]="%s}: integer required between \`{' and \`}'"
MyMessages[trClass]="when translating, the only character classes that "
MyMessages[trClass]+=$'may appear in\nstring2 '
MyMessages[trClass]+="are \`upper' and \`lower'"
MyMessages[missClass]="missing character class name \`[::]'"

for msgIdx in ${!MyMessages[@]} ;do
    printf "\n--- Test chain '%s' ---\n" $msgIdx
    case $msgIdx in
    writ* )
        lPrintf "${MyMessages[$msgIdx]}" "${WRITTERS[@]}"
        ;;
    intReq )
        lPrintf "${MyMessages[$msgIdx]}" NaN
        ;;
    * )
        lPrintf "${MyMessages[$msgIdx]}"
        ;;
    esac
  done

and its output:

Localized tests
#First method# whitout eval

--- Test chain 'trClass' ---
à la traduction, les seules classes de caractères qui peuvent apparaître
dans string2 sont « upper » ou « lower »

--- Test chain 'intReq' ---
ARRAY{NaN} : entier requis entre « { » et « } »

--- Test chain 'sysReboot' ---
démarrage système

--- Test chain 'writtenBy' ---
Écrit par Athos, Portos, Aramis,
et Dartagnan Le Beau.

###
#Second method# whith limited eval

--- Test chain 'trClass' ---
à la traduction, les seules classes de caractères qui peuvent apparaître
dans string2 sont « upper » ou « lower »
--- Test chain 'missClass' ---
./localized.sh: eval: line 44: Caractère de fin de fichier (EOF) prématuré lors de la recherche du « ` » correspondant
./localized.sh: eval: line 45: Erreur de syntaxe : fin de fichier prématurée

--- Test chain 'intReq' ---
NaN} : entier requis entre « { » et « } »
--- Test chain 'sysReboot' ---
démarrage système
--- Test chain 'writtenBy' ---
Écrit par Athos, Portos, Aramis,
et Dartagnan Le Beau.

Could anyone help me remove the commented-out line and/or the error message in this script? (In less than 8 hours?!)

Anyway, thanks to everyone. (My bounty will go to @gniourf_gniourf unless a better answer appears within 8 hours. But thanks to @techno too, I like your lPrintf!)

0

Talking you out of it

Fundamentally, you probably should not be concerned about this issue, because printf works differently in C and in Bash: C's printf does not translate backslash escapes, while Bash's does. So in an ideal world, all you should really be doing is printf $"%s, %s,\n%s" some thing more, with the template string retaining the raw backslash escape (so it might look like msgid "%s, %s,\\n%s" in the po-file).
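
A quick way to see this, with no translation involved: the format below is single-quoted, so it holds a literal backslash followed by n, and it is bash's printf itself that turns the pair into a newline. The variable name fmt is illustrative.

```bash
#!/bin/bash
# Single quotes keep "\n" as two literal characters: a backslash and an n.
fmt='%s, %s,\n%s\n'

# 13 characters in total, so no real newline is stored in the variable.
printf 'format length: %d\n' "${#fmt}"

# printf interprets the escapes in its format argument when it runs.
printf "$fmt" some thing more
```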

As you have already realized, the $"" construct also disallows the use of msgids invalid to bash's double-quotation syntax. There simply is no way to use this entry:

msgid "`"
msgstr "« "

and stripping these problematic characters away only masks the problem. (Again, it's fine for bash, because you would have written echo $"\`" with msgid "\\`".)

On the other hand, there really is a good reason not to use the $"" construct: it allows translators to run arbitrary commands, which adds one more real level of insecurity compared to eval. The gettext.sh functions are free from this problem, as any variable substitution is handled by a separate envsubst program. They also let you use $'' as much as you like.
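
A minimal sketch of the gettext.sh route, assuming GNU gettext's gettext.sh is on PATH. The msgid used here is illustrative and is not expected to exist in any real catalogue, and the else branch is only a fallback so the sketch still runs where gettext is missing.

```bash
#!/bin/bash
# Sketch assuming GNU gettext's gettext.sh is installed; the msgid is illustrative.
export TEXTDOMAIN=coreutils
author='Athos'

if command -v gettext.sh >/dev/null 2>&1; then
    . gettext.sh
    # Single quotes: $author must reach eval_gettext unexpanded, so that
    # envsubst (not the shell) performs the substitution after translation.
    line=$(eval_gettext 'Written by $author.')
else
    line="Written by $author."   # fallback so the sketch runs without gettext
fi

printf '%s\n' "$line"
```

Because the substitution is done by envsubst on named variables only, a translated string cannot smuggle in a command substitution.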
