1816

Is there a way in Bash to convert a string to lowercase?

For example, if I have:

a="Hi all"

I want to convert it to:

"hi all"
2

30 Answers

3027

There are various ways:

POSIX standard

tr

$ echo "$a" | tr '[:upper:]' '[:lower:]'
hi all

AWK

$ echo "$a" | awk '{print tolower($0)}'
hi all

Non-POSIX

You may run into portability issues with the following examples:

Bash 4.0

$ echo "${a,,}"
hi all

sed

$ echo "$a" | sed -e 's/\(.*\)/\L\1/'
hi all
# this also works:
$ sed -e 's/\(.*\)/\L\1/' <<< "$a"
hi all

Perl

$ echo "$a" | perl -ne 'print lc'
hi all

Bash

lc(){
    case "$1" in
        [A-Z])
        n=$(printf "%d" "'$1")
        n=$((n+32))
        printf \\$(printf "%o" "$n")
        ;;
        *)
        printf "%s" "$1"
        ;;
    esac
}
word="I Love Bash"
for((i=0;i<${#word};i++))
do
    ch="${word:$i:1}"
    lc "$ch"
done

Note: YMMV on this one. It doesn't work for me (GNU Bash 4.2.46 and 4.0.33; 2.05b.0 behaves the same, although nocasematch is not implemented there), even with shopt -u nocasematch;. With nocasematch unset, [[ "fooBaR" == "FOObar" ]] is handled correctly, BUT inside case, [b-z] are, weirdly, still matched by [A-Z]. Bash is confused by the double negative ("unsetting nocasematch")! :-)
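The behaviour described in the note is consistent with locale-dependent collation: in many non-C locales a range such as [A-Z] also covers most lowercase letters (a comment further down this page makes the same point about tr). Below is a minimal sketch of one possible workaround, assuming ASCII-only input. The name lc_ascii is made up for illustration, and pinning LC_ALL=C inside the function is an assumption about what the loop above needs, not something the answer's author confirmed.

```bash
# lc_ascii is a hypothetical name; ASCII-only by design.
lc_ascii() {
    local LC_ALL=C            # assumption: C collation makes [A-Z] mean exactly A..Z
    local word=$1 out='' ch n oct i
    for (( i = 0; i < ${#word}; i++ )); do
        ch=${word:i:1}
        case $ch in
            [A-Z])
                printf -v n '%d' "'$ch"               # character -> numeric code
                printf -v oct '%03o' "$(( n + 32 ))"  # shift into a..z, as octal
                printf -v ch "\\$oct"                 # octal escape -> character
                ;;
        esac
        out+=$ch
    done
    printf '%s\n' "$out"
}

lc_ascii "Hi All"   # prints: hi all
```

Using printf -v keeps the loop free of subshells, at the cost of requiring a Bash that supports printf -v.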

29
  • 12
    Am I missing something, or does your last example (in Bash) actually do something completely different? It works for "ABX", but if you instead make word="Hi All" like the other examples, it returns ha, not hi all. It only works for the capitalized letters and skips the already-lowercased letters.
    – jangosteve
    Commented Jan 14, 2012 at 21:58
  • 31
    Note that only the tr and awk examples are specified in the POSIX standard. Commented Feb 3, 2012 at 18:55
  • 211
    tr '[:upper:]' '[:lower:]' will use the current locale to determine uppercase/lowercase equivalents, so it'll work with locales that use letters with diacritical marks. Commented Feb 3, 2012 at 18:58
  • 18
    How does one get the output into a new variable? Ie say I want the lowercased string into a new variable? Commented Sep 25, 2012 at 18:01
  • 88
    @Adam: b="$(echo $a | tr '[A-Z]' '[a-z]')"
    – Tino
    Commented Nov 14, 2012 at 15:39
571

In Bash 4:

To lowercase

$ string="A FEW WORDS"
$ echo "${string,}"
a FEW WORDS
$ echo "${string,,}"
a few words
$ echo "${string,,[AEIUO]}"
a FeW WoRDS

$ string="A Few Words"
$ declare -l string
$ string=$string; echo "$string"
a few words

To uppercase

$ string="a few words"
$ echo "${string^}"
A few words
$ echo "${string^^}"
A FEW WORDS
$ echo "${string^^[aeiou]}"
A fEw wOrds

$ string="A Few Words"
$ declare -u string
$ string=$string; echo "$string"
A FEW WORDS

Toggle (undocumented, but optionally configurable at compile time)

$ string="A Few Words"
$ echo "${string~~}"
a fEW wORDS
$ string="A FEW WORDS"
$ echo "${string~}"
a FEW WORDS
$ string="a few words"
$ echo "${string~}"
A few words

Capitalize (undocumented, but optionally configurable at compile time)

$ string="a few words"
$ declare -c string
$ string=$string
$ echo "$string"
A few words

Title case:

$ string="a few words"
$ string=($string)
$ string="${string[@]^}"
$ echo "$string"
A Few Words

$ declare -c string
$ string=(a few words)
$ echo "${string[@]}"
A Few Words

$ string="a FeW WOrdS"
$ string=${string,,}
$ string=${string~}
$ echo "$string"
A few words

To turn off a declare attribute, use +. For example, declare +c string. This affects subsequent assignments and not the current value.

The declare options change the attribute of the variable, but not the contents. The reassignments in my examples update the contents to show the changes.
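To make the "attribute, not contents" point concrete, here is a small sketch, assuming Bash 4 or later. The variable name and sample value are arbitrary.

```bash
declare -l name            # lowercase attribute: applies to future assignments
name="MiXeD Case"
echo "$name"               # mixed case

declare +l name            # attribute removed; the stored value is not changed back
echo "$name"               # mixed case

name="MiXeD Case"          # a new assignment is now stored as written
echo "$name"               # MiXeD Case
```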

Edit:

Added "toggle first character by word" (${var~}) as suggested by ghostdog74.

Edit: Corrected tilde behavior to match Bash 4.3.

11
  • 6
    Quite bizarre: the "^^" and ",," operators don't work on non-ASCII characters, but "~~" does... So string="łódź"; echo ${string~~} will return "ŁÓDŹ", but echo ${string^^} returns "łóDź". Even in LC_ALL=pl_PL.utf-8. That's using bash 4.2.24. Commented Jul 12, 2012 at 16:48
  • 2
    @HubertKario: That's weird. It's the same for me in Bash 4.0.33 with the same string in en_US.UTF-8. It's a bug and I've reported it. Commented Jul 12, 2012 at 18:20
  • 1
    @HubertKario: Try echo "$string" | tr '[:lower:]' '[:upper:]'. It will probably exhibit the same failure. So the problem is at least partly not Bash's. Commented Jul 13, 2012 at 0:44
  • 1
    @DennisWilliamson: Yeah, I've noticed that too (see comment to Shuvalov answer). I'd just say, "this stuff is only for ASCII", but then it's the "~~" operator that does work, so it's not like the code and translation tables aren't already there... Commented Jul 14, 2012 at 14:13
  • 4
    @HubertKario: The Bash maintainer has acknowledged the bug and stated that it will be fixed in the next release. Commented Jul 14, 2012 at 14:27
164
echo "Hi All" | tr "[:upper:]" "[:lower:]"
6
  • 5
    @RichardHansen: tr doesn't work for me for non-ASCII characters. I do have the correct locale set and locale files generated. Any idea what I could be doing wrong? Commented Jul 12, 2012 at 16:56
  • 10
    Why is [:upper:] needed?
    – mgutt
    Commented Aug 8, 2019 at 22:47
  • 1
    The same question why [:upper:] is needed.
    – MaXi32
    Commented Jul 17, 2021 at 14:16
  • 1
    @mgutt and MaXi32: It specifies to translate characters from upper to lower. Commented Jan 7, 2022 at 14:29
  • 1
    tr needs two arguments - the list of from characters, and the list of to characters. [:upper:] is a character class, which is just a fancy way of saying it's a list of all uppercase characters (though there's some question over whether tr properly considers Unicode rather than just ASCII). [:lower:], then, is the list of all lowercase characters. tr then takes both lists and translates characters in the from list to the corresponding characters in the to list. So [:upper:] is needed so tr knows what it's supposed to change, while [:lower:] tells it what to change to. Commented Feb 28, 2022 at 0:14
115

tr:

a="$(tr [A-Z] [a-z] <<< "$a")"

AWK:

{ print tolower($0) }

sed:

y/ABCDEFGHIJKLMNOPQRSTUVWXYZ/abcdefghijklmnopqrstuvwxyz/
9
  • 2
    +1 a="$(tr [A-Z] [a-z] <<< "$a")" looks easiest to me. I am still a beginner... Commented Feb 2, 2011 at 11:12
  • 2
    I strongly recommend the sed solution; I've been working in an environment that for some reason doesn't have tr but I've yet to find a system without sed, plus a lot of the time I want to do this I've just done something else in sed anyway so can chain the commands together into a single (long) statement.
    – Haravikk
    Commented Oct 19, 2013 at 12:54
  • 2
    The bracket expressions should be quoted. In tr [A-Z] [a-z] A, the shell may perform filename expansion if there are filenames consisting of a single letter or nullglob is set. tr "[A-Z]" "[a-z]" A will behave properly.
    – Dennis
    Commented Nov 6, 2013 at 19:49
  • 3
    @CamiloMartin it's a BusyBox system where I'm having that problem, specifically Synology NASes, but I've encountered it on a few other systems too. I've been doing a lot of cross-platform shell scripting lately, and with the requirement that nothing extra be installed it makes things very tricky! However I've yet to encounter a system without sed
    – Haravikk
    Commented Jun 15, 2014 at 10:51
  • 3
    Note that tr [A-Z] [a-z] is incorrect in almost all locales. for example, in the en-US locale, A-Z is actually the interval AaBbCcDdEeFfGgHh...XxYyZ.
    – fuz
    Commented Jan 31, 2016 at 14:54
100

I know this is an oldish post but I made this answer for another site so I thought I'd post it up here:

UPPER -> lower: use python:

b=`echo "print('$a'.lower())" | python`

Or Ruby:

b=`echo "print '$a'.downcase" | ruby`

Or Perl:

b=`perl -e "print lc('$a');"`

Or PHP:

b=`php -r "print strtolower('$a');"`

Or Awk:

b=`echo "$a" | awk '{ print tolower($0) }'`

Or Sed:

b=`echo "$a" | sed 's/./\L&/g'`

Or Bash 4:

b=${a,,}

Or NodeJS:

b=`node -p "\"$a\".toLowerCase()"`

You could also use dd:

b=`echo "$a" | dd  conv=lcase 2> /dev/null`

lower -> UPPER:

use python:

b=`echo "print('$a'.upper())" | python`

Or Ruby:

b=`echo "print '$a'.upcase" | ruby`

Or Perl:

b=`perl -e "print uc('$a');"`

Or PHP:

b=`php -r "print strtoupper('$a');"`

Or Awk:

b=`echo "$a" | awk '{ print toupper($0) }'`

Or Sed:

b=`echo "$a" | sed 's/./\U&/g'`

Or Bash 4:

b=${a^^}

Or NodeJS:

b=`node -p "\"$a\".toUpperCase()"`

You could also use dd:

b=`echo "$a" | dd  conv=ucase 2> /dev/null`

Also when you say 'shell' I'm assuming you mean bash but if you can use zsh it's as easy as

b=$a:l

for lower case and

b=$a:u

for upper case.
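Several of the one-liners above paste $a into the text of a Python, Ruby, Perl, PHP or Node program, so a value containing a quote changes the program itself. A minimal sketch of the safer pattern, assuming any POSIX awk: the program stays a fixed single-quoted literal and the data travels on stdin. The sample value is made up.

```bash
a="It's A 'Quoted' STRING"

# The awk program is a constant; $a is only ever treated as data.
b=$(printf '%s\n' "$a" | awk '{ print tolower($0) }')
echo "$b"

# Same idea for upper case.
c=$(printf '%s\n' "$a" | awk '{ print toupper($0) }')
echo "$c"
```

The same pattern applies to the other interpreters: pass the string on stdin or as an argument rather than interpolating it into the script.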

7
  • @JESii both work for me upper -> lower and lower-> upper. I'm using sed 4.2.2 and Bash 4.3.42(1) on 64bit Debian Stretch.
    – nettux
    Commented Nov 20, 2015 at 14:33
  • 1
    Hi, @nettux443... I just tried the bash operation again and it still fails for me with the error message "bad substitution". I'm on OSX using homebrew's bash: GNU bash, version 4.3.42(1)-release (x86_64-apple-darwin14.5.0)
    – JESii
    Commented Nov 21, 2015 at 17:34
  • 11
    Do not use! All of the examples which generate a script are extremely brittle; if the value of a contains a single quote, you have not only broken behavior, but a serious security problem.
    – tripleee
    Commented Jan 16, 2016 at 11:45
  • I like the sed solution the most, since sed is always ubiquitous.
    – Dudi Boy
    Commented Dec 9, 2017 at 13:47
  • I prefer using the dd solution. Please note that you need to be root to get it working Commented Mar 6, 2019 at 14:05
89

Bash 5.1 provides a straightforward way to do this with the L parameter transformation:

${var@L}

So for example you can say:

v="heLLo"
echo "${v@L}"
# hello

You can also do uppercase with U:

v="hello"
echo "${v@U}"
# HELLO

And uppercase the first letter with u:

v="hello"
echo "${v@u}"
# Hello
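Because ${var@L} needs Bash 5.1, a script that must also run on older shells could pick a method at run time. This is a rough sketch, not a tested compatibility layer: to_lower is a hypothetical helper name, and it assumes that older Bash versions only reject the newer expansions when the branch is actually executed.

```bash
# to_lower is a hypothetical helper name.
to_lower() {
    if (( BASH_VERSINFO[0] > 5 || (BASH_VERSINFO[0] == 5 && BASH_VERSINFO[1] >= 1) )); then
        printf '%s\n' "${1@L}"                            # Bash 5.1+
    elif (( BASH_VERSINFO[0] >= 4 )); then
        printf '%s\n' "${1,,}"                            # Bash 4.0+
    else
        printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]'   # older Bash
    fi
}

to_lower "heLLo"   # prints: hello
```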
4
  • 1
    Absolutely deserves more upvotes than it currently has!
    – Umlin
    Commented Jan 11, 2021 at 9:04
  • 1
    @Umlin it is a brand new feature, so it is normal that didn't get much attention yet.
    – fedorqui
    Commented Jan 11, 2021 at 9:26
  • 1
    I can't use this yet, but glad to know it's a thing! Commented Oct 4, 2021 at 19:17
  • 9
    Oh weird. I wonder why they introduced this when ${v^^} and ${v,,} already exist.
    – Gostega
    Commented Feb 17, 2022 at 1:20
36

In zsh:

echo $a:u

Gotta love zsh!

5
  • 5
    or $a:l for lower case conversion Commented Jan 27, 2011 at 5:39
  • 2
    Add one more case: echo ${(C)a} #Upcase the first char only Commented Jul 24, 2015 at 23:26
  • @biocyberman is there a way to lowercase the first char only? Commented Nov 15, 2022 at 22:31
  • 1
    Not something I know of. You can read about available flags here: zsh.sourceforge.io/Doc/Release/… Commented Nov 17, 2022 at 13:25
  • 1
    @karlahrnndz Actually you can do something like this: ${a:0:1:l}${a:1} to lowercase only the first character. Commented Nov 17, 2022 at 13:36
20

Using GNU sed:

sed 's/.*/\L&/'

Example:

$ foo="Some STRIng";
$ foo=$(echo "$foo" | sed 's/.*/\L&/')
$ echo "$foo"
some string
18

Pre Bash 4.0

Lowercase a string in Bash and assign it to a variable

VARIABLE=$(echo "$VARIABLE" | tr '[:upper:]' '[:lower:]') 

echo "$VARIABLE"
3
  • 7
    No need for echo and pipes: use $(tr '[:upper:]' '[:lower:]' <<<"$VARIABLE")
    – Tino
    Commented Dec 11, 2015 at 16:23
  • 3
    @Tino The here string is also not portable back to really old versions of Bash; I believe it was introduced in v3.
    – tripleee
    Commented Jan 16, 2016 at 12:28
  • 1
    @tripleee You are right, it was introduced in bash-2.05b - however that's the oldest bash I was able to find on my systems
    – Tino
    Commented Jan 17, 2016 at 14:28
15
[dev@localhost ~]$ TEST=STRESS2
[dev@localhost ~]$ echo ${TEST,,}
stress2
15

For the Bash command line and depending on locale and international letters, this might work (assembled from the answers from others):

$ echo "ABCÆØÅ" | python -c "print(open(0).read().lower())"
abcæøå
$ echo "ABCÆØÅ" | sed 's/./\L&/g'
abcæøå
$ a="ABCÆØÅ"; echo "${a,,}"
abcæøå
$ echo "ABCÆØÅ" | ruby -pe '$_.downcase!'
abcæøå

Whereas these variations might NOT work:

$ echo "ABCÆØÅ" | tr "[:upper:]" "[:lower:]"
abcÆØÅ
$ echo "ABCÆØÅ" | awk '{print tolower($1)}'
abcÆØÅ
$ echo "ABCÆØÅ" | perl -ne 'print lc'
abcÆØÅ
$ echo 'ABCÆØÅ' | dd conv=lcase 2> /dev/null
abcÆØÅ
1
  • Probably not BSD sed.
    – Nakilon
    Commented Oct 2, 2022 at 1:31
13

You can try this

s="Hello World!" 

echo $s  # Hello World!

a=${s,,}
echo $a  # hello world!

b=${s^^}
echo $b  # HELLO WORLD!


ref : http://wiki.workassis.com/shell-script-convert-text-to-lowercase-and-uppercase/

1
  • 1
    great! Was about to make an answer like this. Many answers adding lots of unneeded info
    – yosefrow
    Commented Feb 16, 2021 at 13:52
13

Simple way

echo "Hi all" | awk '{ print tolower($0); }'
2
  • It is better to use gawk which properly handles UTF8-encoded characters (and different languages charset). 'Awk tolower' will fail on something like "ЛШТШФУМ АЩЬФ".
    – Vit
    Commented Jul 27, 2020 at 21:59
  • the awk available on macOS 11.6 works flawlessly: echo 'Đêm lưu trú năm nay' | awk '{ print tolower($0); }' => đêm lưu trú năm nay, and echo 'ЛШТШФУМ АЩЬФ' | awk '{ print tolower($0); }' => лштшфум ащьф Commented Sep 29, 2021 at 15:02
12

For a shell using only builtins (note that the ${1:$i:1} substring expansion used below is a Bashism, so this is not strictly POSIX sh):

uppers=ABCDEFGHIJKLMNOPQRSTUVWXYZ
lowers=abcdefghijklmnopqrstuvwxyz

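lc(){ #usage: lc "SOME STRING" -> "some string"
    i=0
    OUTPUT=    # reset, otherwise repeated calls keep appending to the old result
    while [ $i -lt ${#1} ]; do
        CUR=${1:$i:1}
        case $uppers in
            # quote $CUR so characters like * or ? are matched literally
            *"$CUR"*)CUR=${uppers%"$CUR"*};OUTPUT="${OUTPUT}${lowers:${#CUR}:1}";;
            *)OUTPUT="${OUTPUT}$CUR";;
        esac
        i=$((i+1))
    done
    echo "${OUTPUT}"
}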

And for upper case:

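uc(){ #usage: uc "some string" -> "SOME STRING"
    i=0
    OUTPUT=    # reset, otherwise repeated calls keep appending to the old result
    while [ $i -lt ${#1} ]; do
        CUR=${1:$i:1}
        case $lowers in
            # quote $CUR so characters like * or ? are matched literally
            *"$CUR"*)CUR=${lowers%"$CUR"*};OUTPUT="${OUTPUT}${uppers:${#CUR}:1}";;
            *)OUTPUT="${OUTPUT}$CUR";;
        esac
        i=$((i+1))
    done
    echo "${OUTPUT}"
}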
3
  • I wonder if you didn't let some bashism in this script, as it's not portable on FreeBSD sh: ${1:$...}: Bad substitution
    – Dereckson
    Commented Nov 23, 2014 at 19:52
  • 2
    Indeed; substrings with ${var:1:1} are a Bashism.
    – tripleee
    Commented Apr 14, 2015 at 7:09
  • This approach has pretty bad performance metrics. See my answer for metrics. Commented Jul 28, 2018 at 23:39
12

In bash 4 you can use typeset

Example:

A="HELLO WORLD"
typeset -l A=$A
1
  • 1
    Ah, we poor macOS users, it's 2020 and Apple has dropped support for bash which is 'stuck' at 3.2.57(1)... (Note: aye, I'm aware we can always install a more recent bash from homebrew...) Commented Sep 9, 2020 at 16:37
9

From the bash manpage:

${parameter^pattern}

${parameter^^pattern}

${parameter,pattern}

${parameter,,pattern}

Case modification. This expansion modifies the case of alphabetic characters in parameter. The pattern is expanded to produce a pattern just as in pathname expansion. Each character in the expanded value of parameter is tested against pattern, and, if it matches the pattern, its case is converted. The pattern should not attempt to match more than one character. The ^ operator converts lowercase letters matching pattern to uppercase; the , operator converts matching uppercase letters to lowercase. The ^^ and ,, expansions convert each matched character in the expanded value; the ^ and , expansions match and convert only the first character in the expanded value. If pattern is omitted, it is treated like a ?, which matches every character. If parameter is @ or *, the case modification operation is applied to each positional parameter in turn, and the expansion is the resultant list. If parameter is an array variable subscripted with @ or *, the case modification operation is applied to each member of the array in turn, and the expansion is the resultant list.
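The last sentence of the manpage excerpt (arrays subscripted with @ or *) is easy to overlook. A short sketch, assuming Bash 4 or later; the array contents are made-up sample values.

```bash
files=("README.TXT" "Notes.MD")

lower=("${files[@],,}")            # every element fully lowercased
first=("${files[@],}")             # only the first character of each element
vowels=("${files[@],,[AEIOU]}")    # only characters that match the pattern

printf '%s\n' "${lower[@]}" "${first[@]}" "${vowels[@]}"
```

Each expansion yields one word per array element, so quoting "${files[@]...}" keeps elements with spaces intact.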

7

Regular expression

I would like to take credit for the command I'm about to share, but the truth is I got it from http://commandlinefu.com for my own use. Its advantage is that if you cd to any directory (within your own home folder, that is), it changes all files and folders below it to lower case recursively, so please use it with caution. It is a brilliant command-line fix, and especially useful for those multitudes of albums you have stored on your drive.

find . -depth -exec rename 's/(.*)\/([^\/]*)/$1\/\L$2/' {} \;

You can specify a directory (or a full path) in place of the dot (.) after find; the dot denotes the current directory.

I hope this solution proves useful. The one thing this command does not do is replace spaces with underscores; oh well, another time perhaps.
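The rename used above is the Perl-based one, which is not installed everywhere. Here is a rough sketch of the same idea using only find, sh, tr and mv. It assumes a case-sensitive filesystem and an mv that supports -n (GNU and BSD both do, but it is not in POSIX); lowercase_tree is a hypothetical name.

```bash
# lowercase_tree is a hypothetical helper; the directory argument defaults to "."
lowercase_tree() (
    cd -- "${1:-.}" || exit 1
    # -depth visits children before their parent, so each path is still
    # valid at the moment it gets renamed.
    find . -depth -name '*[[:upper:]]*' -exec sh -c '
        for path do
            dir=${path%/*}
            base=${path##*/}
            lower=$(printf "%s" "$base" | tr "[:upper:]" "[:lower:]")
            # -n: never overwrite an existing entry with the same lowercase name
            [ "$base" = "$lower" ] || mv -n -- "$path" "$dir/$lower"
        done
    ' sh {} +
)
```

Running it on a copy of the directory first is a sensible precaution, since the renames are not reversible.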

2
  • This didn't work for me for whatever reason, though it looks fine. I did get this to work as an alternative though: find . -exec /bin/bash -c 'mv {} `tr [A-Z] [a-z] <<< {}`' \;
    – John Rix
    Commented Jun 26, 2013 at 15:58
  • This needs prename from perl: dpkg -S "$(readlink -e /usr/bin/rename)" gives perl: /usr/bin/prename
    – Tino
    Commented Dec 11, 2015 at 16:27
6

Many answers using external programs, which is not really using Bash.

If you know you will have Bash 4 available, you should really just use the ${VAR,,} notation (it is easy and cool). For Bash before 4 (my Mac still uses Bash 3.2, for example), I used the corrected version of @ghostdog74's answer to create a more portable version.

You can call it as lowercase 'my STRING' and get a lowercase version. I read comments about setting the result to a var, but that is not really portable in Bash, since we can't return strings. Printing it is the best solution; it is easy to capture with something like var="$(lowercase "$str")".

How this works

The way this works is by getting the ASCII integer representation of each char with printf and then adding 32 if upper-to->lower, or subtracting 32 if lower-to->upper. Then use printf again to convert the number back to a char. From 'A' -to-> 'a' we have a difference of 32 chars.

Using printf to explain:

$ printf "%d\n" "'a"
97
$ printf "%d\n" "'A"
65

97 - 65 = 32
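The printf lines above only show the character-to-number direction. The way back goes through an octal escape, roughly like this (a sketch of the mechanism only, with arbitrary variable names):

```bash
n=$(printf '%d' "'A")          # 65
n=$(( n + 32 ))                # 97
oct=$(printf '%o' "$n")        # 141, the octal form of 97
ch=$(printf "\\$oct")          # printf turns the escape \141 into a character
echo "$ch"                     # a
```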

And this is the working version with examples.
Please note the comments in the code, as they explain a lot of stuff:

#!/bin/bash

# lowerupper.sh

# Prints the lowercase version of a char
lowercaseChar(){
    case "$1" in
        [A-Z])
            n=$(printf "%d" "'$1")
            n=$((n+32))
            printf \\$(printf "%o" "$n")
            ;;
        *)
            printf "%s" "$1"
            ;;
    esac
}

# Prints the lowercase version of a sequence of strings
lowercase() {
    word="$@"
    for((i=0;i<${#word};i++)); do
        ch="${word:$i:1}"
        lowercaseChar "$ch"
    done
}

# Prints the uppercase version of a char
uppercaseChar(){
    case "$1" in
        [a-z])
            n=$(printf "%d" "'$1")
            n=$((n-32))
            printf \\$(printf "%o" "$n")
            ;;
        *)
            printf "%s" "$1"
            ;;
    esac
}

# Prints the uppercase version of a sequence of strings
uppercase() {
    word="$@"
    for((i=0;i<${#word};i++)); do
        ch="${word:$i:1}"
        uppercaseChar "$ch"
    done
}

# The functions will not add a new line, so use echo or
# append it if you want a new line after printing

# Printing stuff directly
lowercase "I AM the Walrus!"$'\n'
uppercase "I AM the Walrus!"$'\n'

echo "----------"

# Printing a var
str="A StRing WITH mixed sTUFF!"
lowercase "$str"$'\n'
uppercase "$str"$'\n'

echo "----------"

# Not quoting the var should also work, 
# since we use "$@" inside the functions
lowercase $str$'\n'
uppercase $str$'\n'

echo "----------"

# Assigning to a var
myLowerVar="$(lowercase $str)"
myUpperVar="$(uppercase $str)"
echo "myLowerVar: $myLowerVar"
echo "myUpperVar: $myUpperVar"

echo "----------"

# You can even do stuff like
if [[ 'option 2' = "$(lowercase 'OPTION 2')" ]]; then
    echo "Fine! All the same!"
else
    echo "Ops! Not the same!"
fi

exit 0

And the results after running this:

$ ./lowerupper.sh 
i am the walrus!
I AM THE WALRUS!
----------
a string with mixed stuff!
A STRING WITH MIXED STUFF!
----------
a string with mixed stuff!
A STRING WITH MIXED STUFF!
----------
myLowerVar: a string with mixed stuff!
myUpperVar: A STRING WITH MIXED STUFF!
----------
Fine! All the same!

This should only work for ASCII characters though.

For me it is fine, since I know I will only pass ASCII chars to it.
I am using this for some case-insensitive CLI options, for example.

1
  • i use a similar trick to 97 - 65 = 32 for determining the ASCII offset for first 52 letters within base64 : 3^4 - 2^4 = 65, 3^4 + 2^4 = 97 Commented Dec 22, 2022 at 10:07
6

Case conversion only applies to letters, so this should work neatly.

I am focusing on converting the letters a-z between upper case and lower case. Any other characters are just printed to stdout as they are.

Converts all the text in path/to/file/filename within the a-z range to A-Z

For converting lower case to upper case

cat path/to/file/filename | tr 'a-z' 'A-Z'

For converting from upper case to lower case

cat path/to/file/filename | tr 'A-Z' 'a-z'

For example,

filename:

my name is xyz

gets converted to:

MY NAME IS XYZ

Example 2:

echo "my name is 123 karthik" | tr 'a-z' 'A-Z'
# Output:
# MY NAME IS 123 KARTHIK

Example 3:

# Single quotes keep the shell from expanding $# inside the string
echo 'my name is 123 &&^&& #@$#@%%& kAR2~thik' | tr 'a-z' 'A-Z'
# Output:
# MY NAME IS 123 &&^&& #@$#@%%& KAR2~THIK
6

For Bash3.2.+ | Mac:

read -r -p 'What is your email? ' email
email=$(echo "$email" | tr '[:upper:]' '[:lower:]')
echo "$email"
4

If using v4, this is baked-in. If not, here is a simple, widely applicable solution. Other answers (and comments) on this thread were quite helpful in creating the code below.

# Like echo, but converts to lowercase
echolcase () {
    tr '[:upper:]' '[:lower:]' <<< "${*}"
}

# Takes one arg by reference (var name) and makes it lowercase
lcase () { 
    eval "${1}"=\'$(echo ${!1//\'/"'\''"} | tr '[:upper:]' '[:lower:]' )\'
}

Notes:

  • Doing: a="Hi All" and then: lcase a will do the same thing as: a=$( echolcase "Hi All" )
  • In the lcase function, using ${!1//\'/"'\''"} instead of ${!1} allows this to work even when the string has quotes.
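For the by-reference case, printf -v can assign to a named variable without eval, which sidesteps the quote escaping altogether. A minimal sketch, assuming a Bash that has printf -v (3.1 or later); lcase_var and its internal variable name are made up here.

```bash
# lcase_var is a hypothetical name; it takes a variable NAME, not a value.
lcase_var() {
    local __lc_tmp
    __lc_tmp=$(printf '%s' "${!1}" | tr '[:upper:]' '[:lower:]')
    printf -v "$1" '%s' "$__lc_tmp"    # assign to the named variable, no eval
}

a="Hi 'All'"
lcase_var a
echo "$a"   # prints: hi 'all'
```

One caveat: the command substitution strips trailing newlines from the value, and a caller variable that is itself called __lc_tmp would be shadowed by the local.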
4

This is a far faster variation of JaredTS486's approach that uses native Bash capabilities (including Bash versions <4.0) to optimize his approach.

I've timed 1,000 iterations of this approach for a small string (25 characters) and a larger string (445 characters), both for lowercase and uppercase conversions. Since the test strings are predominantly lowercase, conversions to lowercase are generally faster than to uppercase.

I've compared my approach with several other answers on this page that are compatible with Bash 3.2. My approach is far more performant than most approaches documented here, and is even faster than tr in several cases.

Here are the timing results for 1,000 iterations of 25 characters:

Timing results for 1,000 iterations of 445 characters (consisting of the poem "The Robin" by Witter Bynner):

  • 2s for my approach to lowercase; 12s for uppercase
  • 4s for tr to lowercase; 4s for uppercase
  • 20s for Orwellophile's approach to lowercase; 29s for uppercase
  • 75s for ghostdog74's approach to lowercase; 669s for uppercase. It's interesting to note how dramatic the performance difference is between a test with predominant matches vs. a test with predominant misses
  • 467s for technosaurus' approach to lowercase; 449s for uppercase
  • 660s for JaredTS486's approach to lowercase; 660s for uppercase. It's interesting to note that this approach generated continuous page faults (memory swapping) in Bash

Solution:

#!/bin/bash
set -e
set -u

declare LCS="abcdefghijklmnopqrstuvwxyz"
declare UCS="ABCDEFGHIJKLMNOPQRSTUVWXYZ"

function lcase()
{
  local TARGET="${1-}"
  local UCHAR=''
  local UOFFSET=''

  while [[ "${TARGET}" =~ ([A-Z]) ]]
  do
    UCHAR="${BASH_REMATCH[1]}"
    UOFFSET="${UCS%%${UCHAR}*}"
    TARGET="${TARGET//${UCHAR}/${LCS:${#UOFFSET}:1}}"
  done

  echo -n "${TARGET}"
}

function ucase()
{
  local TARGET="${1-}"
  local LCHAR=''
  local LOFFSET=''

  while [[ "${TARGET}" =~ ([a-z]) ]]
  do
    LCHAR="${BASH_REMATCH[1]}"
    LOFFSET="${LCS%%${LCHAR}*}"
    TARGET="${TARGET//${LCHAR}/${UCS:${#LOFFSET}:1}}"
  done

  echo -n "${TARGET}"
}

The approach is simple: while the input string has any remaining uppercase letters present, find the next one, and replace all instances of that letter with its lowercase variant. Repeat until all uppercase letters are replaced.

Some performance characteristics of my solution:

  1. Uses only shell builtin utilities, which avoids the overhead of invoking external binary utilities in a new process
  2. Avoids sub-shells, which incur performance penalties
  3. Uses shell mechanisms that are compiled and optimized for performance, such as global string replacement within variables, variable suffix trimming, and regex searching and matching. These mechanisms are far faster than iterating manually through strings
  4. Loops only the number of times required by the count of unique matching characters to be converted. For example, converting a string that has three different uppercase characters to lowercase requires only 3 loop iterations. For the preconfigured ASCII alphabet, the maximum number of loop iterations is 26
  5. UCS and LCS can be augmented with additional characters
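The functions above take the string as their first argument and never read stdin, so piping data into them produces nothing. Also, since the script sets set -e and set -u at the top, sourcing it applies those options to the calling shell, which may make an interactive shell exit on the first error. Here is a hedged sketch of a wrapper that accepts either arguments or piped input. lcase_filter is a hypothetical name, and the lcase defined here is only a stand-in so the example runs on its own.

```bash
# Stand-in for the answer's lcase (assumption: any function that prints the
# lowercase form of its first argument can be dropped in here).
lcase() { printf '%s' "$1" | tr '[:upper:]' '[:lower:]'; }

# Hypothetical wrapper: use the arguments if there are any, else read stdin.
lcase_filter() {
    if (( $# > 0 )); then
        lcase "$*"
    else
        lcase "$(cat)"    # slurps all of stdin into one string
    fi
}

printf 'Hello World\n' | lcase_filter
```

Because $(cat) loads the whole input into memory and strips trailing newlines, this suits short strings rather than large files.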
2
  • how am i supposed to use these function of yours with the pipe ? i pasted your code, verbatim, and it keeps exiting the shell without returning anything : . ./testchgcase.bash ; pvE0 < "${m3t}" | ucase | ghead [Process completed] Commented Dec 22, 2022 at 10:02
  • @RARE Kpop Manifesto You need to catch the piped data and call a function Add as the last line: lcase "$(cat)" Commented Jan 2 at 1:47
4

So I attempted to perform some updated benchmarking using the consensus approach for each utility, but instead of repeating a tiny set many times, I ...

  1. fed in a 1.85 GB .txt file that's filled to the brim w/ multi-byte Unicode chars in UTF-8 encoding,
  2. via the pipe in order to equalize I/O aspect,
  3. while also enforcing LC_ALL=C for all to ensure level playing field

————————————————————————————————————————

  • Both bsd-sed and gnu-sed are rather mediocre, to put it very nicely.

    • I don't even know what bsd-sed was trying to do, as their xxhash doesn't match
  • was python3 trying to do Unicode letter-casing ?

    • (even though I already forced the locale setting LC_ALL=C )
  • tr is the most extreme

    1. gnu-tr is, by far, the fastest among all
    2. bsd-tr utterly atrocious
  • perl5 is faster than any awk variant I have, unless you're okay with loading the whole file at once using mawk2 in order to gain a tiny bit over perl5 :

      2.935s mawk2 
    vs
      3.081s perl5
    
  • within awk, gnu-gawk appears slowest among the 3 , mawk 1.3.4 in the middle, and mawk 1.9.9.6 fastest : more than 50% time savings over gawk

    (I didn't waste my time with the useless macOS nawk.)

     out9: 1.85GiB 0:00:03 [ 568MiB/s] [ 568MiB/s] [ <=> ]
      in0: 1.85GiB 0:00:03 [ 568MiB/s] [ 568MiB/s] [============>] 100%            
( pvE 0.1 in0 < "${m3t}" | LC_ALL=C mawk2 '{ print tolower($_) }' FS='^$'; )  

mawk 1.9.9.6 (mawk2-beta)

3.07s user 0.66s system 111% cpu 3.348 total
85759a34df874966d096c6529dbfb9d5  stdin


     out9: 1.85GiB 0:00:06 [ 297MiB/s] [ 297MiB/s] [ <=> ]
      in0: 1.85GiB 0:00:06 [ 297MiB/s] [ 297MiB/s] [============>] 100%            
( pvE 0.1 in0 < "${m3t}" | LC_ALL=C mawk '{ print tolower($_) }' FS='^$'; )  

 mawk 1.3.4

6.01s user 0.83s system 107% cpu 6.368 total
85759a34df874966d096c6529dbfb9d5  stdin

     out9: 23.8MiB 0:00:00 [ 238MiB/s] [ 238MiB/s] [ <=> ]
      in0: 1.85GiB 0:00:07 [ 244MiB/s] [ 244MiB/s] [============>] 100%            
     out9: 1.85GiB 0:00:07 [ 244MiB/s] [ 244MiB/s] [ <=>                             ]
( pvE 0.1 in0 < "${m3t}" | LC_ALL=C gawk -be '{ print tolower($_) }' FS='^$'; )

GNU Awk 5.1.1, API: 3.1 (GNU MPFR 4.1.0, GNU MP 6.2.1) 

7.49s user 0.78s system 106% cpu 7.763 total
85759a34df874966d096c6529dbfb9d5  stdin


     out9: 1.85GiB 0:00:03 [ 616MiB/s] [ 616MiB/s] [ <=> ]
      in0: 1.85GiB 0:00:03 [ 617MiB/s] [ 617MiB/s] [============>] 100%            
( pvE 0.1 in0 < "${m3t}" | LC_ALL=C perl -ne 'print lc'; )  

perl5 (revision 5 version 34 subversion 0)

2.70s user 0.85s system 115% cpu 3.081 total
85759a34df874966d096c6529dbfb9d5  stdin


     out9: 1.85GiB 0:00:32 [57.4MiB/s] [57.4MiB/s] [ <=> ]
      in0: 1.85GiB 0:00:32 [57.4MiB/s] [57.4MiB/s] [============>] 100%            
( pvE 0.1 in0 < "${m3t}" | LC_ALL=C gsed 's/.*/\L&/'; )  # GNU-sed


gsed (GNU sed) 4.8

32.57s user 0.97s system 101% cpu 32.982 total
85759a34df874966d096c6529dbfb9d5  stdin


     out9: 1.86GiB 0:00:38 [49.7MiB/s] [49.7MiB/s] [ <=> ]
      in0: 1.85GiB 0:00:38 [49.4MiB/s] [49.4MiB/s] [============>] 100%            
( pvE 0.1 in0 < "${m3t}" | LC_ALL=C sed 's/.*/\L&/'; )   # BSD-sed



37.94s user 0.86s system 101% cpu 38.318 total
d5e2d8487df1136db7c2334a238755c0  stdin



      in0:  313MiB 0:00:00 [3.06GiB/s] [3.06GiB/s] [=====>] 16% ETA 0:00:00
     out9: 1.85GiB 0:00:11 [ 166MiB/s] [ 166MiB/s] [ <=>]
      in0: 1.85GiB 0:00:00 [3.31GiB/s] [3.31GiB/s] [============>] 100%            
( pvE 0.1 in0 < "${m3t}" | LC_ALL=C python3 -c "print(open(0).read().lower())"; )

Python 3.9.12 

9.04s user 2.18s system 98% cpu 11.403 total
7ddc0b5cbcfbbfac3c2b6da6731bd262  stdin

     out9: 2.51MiB 0:00:00 [25.1MiB/s] [25.1MiB/s] [ <=> ]
      in0: 1.85GiB 0:00:11 [ 171MiB/s] [ 171MiB/s] [============>] 100%            
     out9: 1.85GiB 0:00:11 [ 171MiB/s] [ 171MiB/s] [ <=> ]
( pvE 0.1 in0 < "${m3t}" | LC_ALL=C ruby -pe '$_.downcase!'; )


ruby 2.6.8p205 (2021-07-07 revision 67951) [universal.arm64e-darwin21]

10.46s user 1.23s system 105% cpu 11.073 total
85759a34df874966d096c6529dbfb9d5  stdin


      in0: 1.85GiB 0:00:01 [1.01GiB/s] [1.01GiB/s] [============>] 100%            
     out9: 1.85GiB 0:00:01 [1.01GiB/s] [1.01GiB/s] [ <=> ]
( pvE 0.1 in0 < "${m3t}" | LC_ALL=C gtr '[A-Z]' '[a-z]'; )  # GNU-tr


gtr (GNU coreutils) 9.1

1.11s user 1.21s system 124% cpu 1.855 total
85759a34df874966d096c6529dbfb9d5  stdin


     out9: 1.85GiB 0:01:19 [23.7MiB/s] [23.7MiB/s] [ <=> ]
      in0: 1.85GiB 0:01:19 [23.7MiB/s] [23.7MiB/s] [============>] 100%            
( pvE 0.1 in0 < "${m3t}" | LC_ALL=C tr '[A-Z]' '[a-z]'; ) # BSD-tr

78.94s user 1.50s system 100% cpu 1:19.67 total
85759a34df874966d096c6529dbfb9d5  stdin


( time ( pvE0 < "${m3t}" | LC_ALL=C   gdd  conv=lcase ) | pvE9 )  | xxh128sum | lgp3; sleep 3; 
     out9: 0.00 B 0:00:01 [0.00 B/s] [0.00 B/s] [<=> ]
      in0: 1.85GiB 0:00:06 [ 295MiB/s] [ 295MiB/s] [============>] 100%            
     out9: 1.81GiB 0:00:06 [ 392MiB/s] [ 294MiB/s] [ <=>   ]
3874110+1 records in
3874110+1 records out
     out9: 1.85GiB 0:00:06 [ 295MiB/s] [ 295MiB/s] [ <=>  ]
( pvE 0.1 in0 < "${m3t}" | LC_ALL=C gdd conv=lcase; )  # GNU-dd


gdd (coreutils) 9.1

1.93s user 4.35s system 97% cpu 6.413 total
85759a34df874966d096c6529dbfb9d5  stdin



%  ( time ( pvE0 < "${m3t}" | LC_ALL=C   dd  conv=lcase ) | pvE9 )  | xxh128sum | lgp3; sleep 3; 
     out9: 36.9MiB 0:00:00 [ 368MiB/s] [ 368MiB/s] [ <=> ]
      in0: 1.85GiB 0:00:04 [ 393MiB/s] [ 393MiB/s] [============>] 100%            
     out9: 1.85GiB 0:00:04 [ 393MiB/s] [ 393MiB/s] [ <=>   ]
3874110+1 records in
3874110+1 records out
     out9: 1.85GiB 0:00:04 [ 393MiB/s] [ 393MiB/s] [ <=>  ]
( pvE 0.1 in0 < "${m3t}" | LC_ALL=C dd conv=lcase; )  # BSD-dd


1.92s user 4.24s system 127% cpu 4.817 total
85759a34df874966d096c6529dbfb9d5  stdin

————————————————————————————————————————

mawk2 can be made artificially faster than perl5 by loading the whole file at once and calling tolower() on all 1.85 GB in a single function call:

( time ( pvE0 < "${m3t}" | 

  LC_ALL=C mawk2 '
           BEGIN {            FS = RS = "^$"  } 
             END { print tolower($(ORS = "")) }' 

 ) | pvE9 ) | xxh128sum| lgp3 

      in0: 1.85GiB 0:00:00 [3.35GiB/s] [3.35GiB/s] [============>] 100%            
     out9: 1.85GiB 0:00:02 [ 647MiB/s] [ 647MiB/s] [ <=> ]
( pvE 0.1 in0 < "${m3t}" | LC_ALL=C mawk2 ; )


1.39s user 1.31s system 91% cpu 2.935 total
85759a34df874966d096c6529dbfb9d5  stdin
3

For Bash versions earlier than 4.0, this version should be fastest (as it doesn't fork/exec any commands):

function string.monolithic.tolower
{
   local __word=$1
   local __len=${#__word}
   local __char
   local __octal
   local __decimal
   local __result
   local i   # keep the loop counter from leaking into the caller's scope

   for (( i=0; i<__len; i++ ))
   do
      __char=${__word:$i:1}
      case "$__char" in
         [A-Z] )
            printf -v __decimal '%d' "'$__char"
            printf -v __octal '%03o' $(( $__decimal ^ 0x20 ))
            printf -v __char \\$__octal
            ;;
      esac
      __result+="$__char"
   done
   REPLY="$__result"
}

technosaurus's answer had potential too, although it did run properly for me.
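A possible reason why case-based approaches like these misbehave on some systems: in certain locales the range [A-Z] can also match lowercase letters, and XOR-ing 0x20 then flips those to uppercase. Below is a minimal sketch that sidesteps both issues, assuming plain ASCII input. The name ascii_tolower is made up for illustration; it lists the letters explicitly instead of using a range, and uses OR instead of XOR so that a letter that is already lowercase stays unchanged.

```shell
#!/usr/bin/env bash
# Hypothetical variant of the function above (ascii_tolower is an assumed name).
# Needs Bash 3.1+ for printf -v and +=; handles ASCII letters only.
ascii_tolower() {
   local word=$1 out= ch dec oct i
   for (( i = 0; i < ${#word}; i++ )); do
      ch=${word:i:1}
      case "$ch" in
         # Explicit list instead of [A-Z], so locale collation cannot
         # pull lowercase letters into the match.
         [ABCDEFGHIJKLMNOPQRSTUVWXYZ])
            printf -v dec '%d' "'$ch"              # character -> code point
            printf -v oct '%03o' $(( dec | 0x20 )) # set bit 5 (lowercase)
            printf -v ch "\\$oct"                  # octal escape -> character
            ;;
      esac
      out+=$ch
   done
   REPLY=$out   # result is returned in REPLY, as in the function above
}

ascii_tolower "Hi All"
printf '%s\n' "$REPLY"   # prints: hi all
```

Setting a bit rather than toggling it makes the conversion idempotent, so an accidental match on a lowercase letter is harmless.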

1
  • Not bad! For an analysis of the performance of this approach, please see my answer for metrics. Commented Jul 28, 2018 at 23:40
3

Despite how old this question is, and similar to this answer by technosaurus, I had a hard time finding a solution that was portable across most platforms (that I use) as well as older versions of Bash. I have also been frustrated with arrays, functions, and the use of prints, echoes, and temporary files to retrieve trivial variables. This has worked very well for me so far, so I thought I would share it. My main testing environments are:

  1. GNU bash, version 4.1.2(1)-release (x86_64-redhat-linux-gnu)
  2. GNU bash, version 3.2.57(1)-release (sparc-sun-solaris2.10)
lcs="abcdefghijklmnopqrstuvwxyz"
ucs="ABCDEFGHIJKLMNOPQRSTUVWXYZ"
input="Change Me To All Capitals"
for (( i=0; i<"${#input}"; i++ )) ; do :
    for (( j=0; j<"${#lcs}"; j++ )) ; do :
        if [[ "${input:$i:1}" == "${lcs:$j:1}" ]] ; then
            input="${input/${input:$i:1}/${ucs:$j:1}}" 
        fi
    done
done

A simple C-style for loop iterates through the strings. If you have not seen anything like the line below before, it is Bash pattern substitution, ${var/pattern/replacement}. Here the line checks whether the character ${input:$i:1} (lower case) exists in input and, if so, replaces its first occurrence with the character ${ucs:$j:1} (upper case) and stores the result back into input.

input="${input/${input:$i:1}/${ucs:$j:1}}"
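Note that the loop above converts to upper case, while the question asks for lower case. A minimal sketch of the lower-case direction follows, assuming ASCII letters only. It swaps the roles of lcs and ucs and uses the replace-all form ${var//pattern/replacement}, so it needs just one pass over the 26 letters regardless of the input length.

```shell
#!/usr/bin/env bash
# Sketch: same lookup strings as above, but mapping upper -> lower.
lcs="abcdefghijklmnopqrstuvwxyz"
ucs="ABCDEFGHIJKLMNOPQRSTUVWXYZ"
input="Hi All"

for (( j = 0; j < ${#ucs}; j++ )); do
    # // replaces every occurrence of the j-th uppercase letter at once
    input="${input//${ucs:$j:1}/${lcs:$j:1}}"
done

printf '%s\n' "$input"   # prints: hi all
```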
2
  • This is wildly inefficient, looping 650 times in your example above, and taking 35 seconds to execute 1000 invocations on my machine. For an alternative that loops just 11 times and takes less than 5 seconds to execute 1000 invocations, see my alternative answer. Commented Jul 28, 2018 at 17:20
  • 1
    Thanks, although that should be obvious just from looking at it. Perhaps the page faults are from the input size and the number of iterations you are executing. Nevertheless I like your solution.
    – JaredTS486
    Commented Aug 9, 2018 at 19:24
3

To store the transformed string in a variable, the following worked for me ($SOURCE_NAME to $TARGET_NAME):

TARGET_NAME=$(printf '%s' "$SOURCE_NAME" | tr '[:upper:]' '[:lower:]')
0

Based on Dejay Clayton's excellent solution, I've generalized the uppercase/lowercase conversion into a transpose function (independently useful), returned the result in a variable (faster/safer), and added a Bash 4+ optimization:

pkg::transpose() { # <retvar> <string> <from> <to>
  local __r=$2 __m __p
  while [[ ${__r} =~ ([$3]) ]]; do
    __m="${BASH_REMATCH[1]}"; __p="${3%${__m}*}"
    __r="${__r//${__m}/${4:${#__p}:1}}"
  done
  printf -v "$1" "%s" "${__r}"
}
pkg::lowercase() { # <retvar> <string>
  if (( BASH_VERSINFO[0] >= 4 )); then
    printf -v "$1" "%s" "${2,,}"
  else
    pkg::transpose "$1" "$2" "ABCDEFGHIJKLMNOPQRSTUVWXYZ" \
                   "abcdefghijklmnopqrstuvwxyz"
  fi
}
pkg::uppercase() { # <retvar> <string>
  if (( BASH_VERSINFO[0] >= 4 )); then
    printf -v "$1" "%s" "${2^^}"
  else
    pkg::transpose "$1" "$2" "abcdefghijklmnopqrstuvwxyz" \
                   "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
  fi
}

To keep things simple I didn't add any set -e support (or any error checking, really), but otherwise it generally follows shellguide, and pkg::transpose() tries to avoid any likely variable name clashes for the printf -v.
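To illustrate the return-in-a-variable convention and the name clash mentioned above, here is a small sketch. demo_lower is a hypothetical stand-in, not one of the pkg:: functions, and it uses tr internally only to keep the example short.

```shell
#!/usr/bin/env bash
# Hypothetical helper: demo_lower <retvar> <string>
demo_lower() {
  local __r=$2
  __r=$(printf '%s' "$__r" | tr '[:upper:]' '[:lower:]')
  # printf -v assigns to the variable whose name is in $1 (Bash 3.1+).
  printf -v "$1" '%s' "$__r"
}

demo_lower result "Hi All"   # normal use: result is set in the caller's scope

__r="untouched"
demo_lower __r "Hi All"      # clash: printf -v writes the function's local __r

printf '%s\n' "$result"      # prints: hi all
printf '%s\n' "$__r"         # prints: untouched
```

This is why the helper's locals carry an unusual prefix: a caller is unlikely to pick the same name for its result variable.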

0

Isn't this cleaner than a full chain of shell variables + declare + eval + single-quote escapes + echo + pipes + tr, just to avoid a subshell or external process?

# ***MUCH*** faster for ASCII only

mawk '{ print toupper($0) }' <<< 'abcxyz'
ABCXYZ
gawk '{ print tolower($0) }' <<< 'FAB-EDC'
fab-edc

Unicode is just as easy to work with (shown here with gawk), without having to "unpack", "encode", or "decode" bytes:

printf '%s' "${test_utf8}" | ……
 1  ÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõö÷øù
    úûüýþÿĀāĂăĄąĆćĈĉĊċČčĎďĐđĒēĔĕĖėĘęĚěĜĝĞğĠġĢģĤĥĦħĨĩĪī
    ĬĭĮįİıIJijĴĵĶķĸĹĺĻļĽľĿŀŁłŃńŅņŇňʼnŊŋŌōŎŏŐőŒœŔŕŖŗŘřŚśŜŝ
    ŞşŠšŢţŤťŦŧŨũŪūŬŭŮůŰűŲųŴŵŶŷŸŹźŻżŽžſƀƁƂƃƄƅƆƇƈƉƊƋƌƍƎƏ
    ƐƑƒƓƔƕƖƗƘƙƚƛƜƝƞƟƠơƢƣƤƥƦƧƨƩƪƫƬƭƮƯưƱƲƳƴƵƶƷƸƹƺƻƼƽƾƿǀǁǂ
    ǃDŽDždžLJLjljNJNjnjǍǎǏǐǑǒǓǔǕǖǗǘǙǚǛǜǝǞǟǠǡǢǣǤǥǦǧǨǩǪǫǬǭǮǯǰDZDzdzǴ
…… | gawk '$_ = toupper($_)'
 1  ÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞßÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ÷ØÙ
    ÚÛÜÝÞŸĀĀĂĂĄĄĆĆĈĈĊĊČČĎĎĐĐĒĒĔĔĖĖĘĘĚĚĜĜĞĞĠĠĢĢĤĤĦĦĨĨĪĪ
    ĬĬĮĮİIIJIJĴĴĶĶĸĹĹĻĻĽĽĿĿŁŁŃŃŅŅŇŇʼnŊŊŌŌŎŎŐŐŒŒŔŔŖŖŘŘŚŚŜŜ
    ŞŞŠŠŢŢŤŤŦŦŨŨŪŪŬŬŮŮŰŰŲŲŴŴŶŶŸŹŹŻŻŽŽSƀƁƂƂƄƄƆƇƇƉƊƋƋƍƎƏ
    ƐƑƑƓƔǶƖƗƘƘƚƛƜƝȠƟƠƠƢƢƤƤƦƧƧƩƪƫƬƬƮƯƯƱƲƳƳƵƵƷƸƸƺƻƼƼƾǷǀǁǂ
    ǃDŽDžDŽLJLjLJNJNjNJǍǍǏǏǑǑǓǓǕǕǗǗǙǙǛǛƎǞǞǠǠǢǢǤǤǦǦǨǨǪǪǬǬǮǮǰDZDzDZǴ
0

Easy way (Bash 4.0 and later):

Upper ${string^^}

Lower ${string,,}
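A short sketch of these expansions in use, assuming Bash 4.0 or later (older versions such as the 3.2 shipped with macOS do not support them). The single-character forms ${string,} and ${string^} change only the first character.

```shell
#!/usr/bin/env bash
# Requires Bash 4.0+ for the case-modification expansions.
string="Hi All"

lower=${string,,}        # every character to lower case
upper=${string^^}        # every character to upper case
first_lower=${string,}   # only the first character to lower case

printf '%s\n' "$lower"        # prints: hi all
printf '%s\n' "$upper"        # prints: HI ALL
printf '%s\n' "$first_lower"  # prints: hi All
```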
-16

Use this command to do the same; it will convert uppercase strings into lowercase:

sed 's/[A-Z]/[a-z]/g' <filename>
1
  • 7
    That doesn't work at all. It replaces each upper case letter by the 5-character string [a-z]!
    – joanis
    Commented Dec 14, 2021 at 13:54
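For reference, a sketch that reproduces the behaviour described in the comment and shows one way to get the intended result with sed alone. It assumes ASCII input and uses the y (transliterate) command, which maps each character of the first list to the character at the same position in the second; LC_ALL=C is set only to keep the [A-Z] range predictable.

```shell
#!/usr/bin/env bash
# The s command treats "[a-z]" in the replacement as literal text.
broken=$(printf '%s\n' "Hi All" | LC_ALL=C sed 's/[A-Z]/[a-z]/g')

# The y command transliterates character by character instead.
fixed=$(printf '%s\n' "Hi All" |
  sed 'y/ABCDEFGHIJKLMNOPQRSTUVWXYZ/abcdefghijklmnopqrstuvwxyz/')

printf '%s\n' "$broken"   # prints: [a-z]i [a-z]ll
printf '%s\n' "$fixed"    # prints: hi all
```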
