Added note about a performance metric
Dejay Clayton
  • 2s for my approach to lowercase; 12s for uppercase
  • 4s for tr to lowercase; 4s for uppercase
  • 20s for Orwellophile's approach to lowercase; 29s for uppercase
  • 75s for ghostdog74's approach to lowercase; 669s for uppercase. Note the dramatic performance difference between a test with predominant matches and a test with predominant misses
  • 467s for technosaurus' approach to lowercase; 449s for uppercase
  • 660s for JaredTS486's approach to lowercase; 660s for uppercase. Notably, this approach generated continuous page faults (memory swapping) in Bash
Augmented description with timing information


This is a far faster variation of JaredTS486's approach that relies on native Bash capabilities and also works in Bash versions <4.0.

I've timed 1,000 iterations of this approach for a small string (25 characters) and a larger string (445 characters), both for lowercase and uppercase conversions. Since the test strings are predominantly lowercase, conversions to lowercase are generally faster than to uppercase.

I've compared my approach with several other answers on this page that are compatible with Bash 3.2. My approach is far more performant than most approaches documented here, and is even faster than tr in several cases.
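A comparison like this could be reproduced with a harness along the following lines. This is only a sketch: the names `ITERATIONS`, `SAMPLE`, and `lcase_inline` are illustrative assumptions, the locale is pinned to `C` as an added precaution so that `[A-Z]` means the ASCII range, and the exact way the timings in this answer were gathered is not stated here.

```shell
#!/bin/bash
# Hypothetical timing harness (not the original measurement script).
LC_ALL=C   # assumption: pin the locale so [A-Z] matches ASCII uppercase only

LCS="abcdefghijklmnopqrstuvwxyz"
UCS="ABCDEFGHIJKLMNOPQRSTUVWXYZ"

# Same algorithm as lcase, but the result is left in the global RESULT so the
# timed loop does not pay for a command-substitution subshell.
lcase_inline()
{
  local UCHAR='' UOFFSET=''
  RESULT="${1-}"
  while [[ "${RESULT}" =~ ([A-Z]) ]]
  do
    UCHAR="${BASH_REMATCH[1]}"
    UOFFSET="${UCS%%${UCHAR}*}"
    RESULT="${RESULT//${UCHAR}/${LCS:${#UOFFSET}:1}}"
  done
}

ITERATIONS="${ITERATIONS:-1000}"
SAMPLE='Change Me To All Capitals'   # 25 characters

echo "builtin approach:"
time for (( i = 0; i < ITERATIONS; i++ )); do
  lcase_inline "${SAMPLE}"
done

echo "tr:"
time for (( i = 0; i < ITERATIONS; i++ )); do
  RESULT_TR="$( tr '[:upper:]' '[:lower:]' <<< "${SAMPLE}" )"
done
```

The `time` keyword reports to standard error, so the two timings appear after each label. Absolute numbers will vary by machine and Bash version.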

Here are the timing results for 1,000 iterations of 25 characters:

Timing results for 1,000 iterations of 445 characters (consisting of the poem "The Robin" by Witter Bynner):

  • 2s for my approach to lowercase; 12s for uppercase
  • 4s for tr to lowercase; 4s for uppercase
  • 20s for Orwellophile's approach to lowercase; 29s for uppercase
  • 75s for ghostdog74's approach to lowercase; 669s for uppercase
  • 467s for technosaurus' approach to lowercase; 449s for uppercase
  • 660s for JaredTS486's approach to lowercase; 660s for uppercase. Notably, this approach generated continuous page faults (memory swapping) in Bash
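#!/bin/bash
set -e
set -u

# Lookup alphabets: the letter at a given index in one string maps to the
# letter at the same index in the other.
declare LCS="abcdefghijklmnopqrstuvwxyz"
declare UCS="ABCDEFGHIJKLMNOPQRSTUVWXYZ"

# Print the first argument converted to lowercase.
function lcase()
{
  local TARGET="${1-}"
  local UCHAR=''
  local UOFFSET=''

  # Loop while any uppercase letter remains; the regex captures the first one.
  while [[ "${TARGET}" =~ ([A-Z]) ]]
  do
    UCHAR="${BASH_REMATCH[1]}"
    # Everything in UCS before UCHAR; its length is UCHAR's index.
    UOFFSET="${UCS%%${UCHAR}*}"
    # Replace every occurrence of UCHAR with the letter at that index in LCS.
    TARGET="${TARGET//${UCHAR}/${LCS:${#UOFFSET}:1}}"
  done

  echo -n "${TARGET}"
}

# Print the first argument converted to uppercase (mirror image of lcase).
function ucase()
{
  local TARGET="${1-}"
  local LCHAR=''
  local LOFFSET=''

  while [[ "${TARGET}" =~ ([a-z]) ]]
  do
    LCHAR="${BASH_REMATCH[1]}"
    LOFFSET="${LCS%%${LCHAR}*}"
    TARGET="${TARGET//${LCHAR}/${UCS:${#LOFFSET}:1}}"
  done

  echo -n "${TARGET}"
}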

The approach is simple: while the input string has any remaining uppercase letters present, find the next one, and replace all instances of that letter with its lowercase variant. Repeat until all uppercase letters are replaced.
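To make the loop described above concrete, here is a small tracing sketch. `lcase_trace` is a hypothetical copy of `lcase` that prints the string after each replacement pass; the `LC_ALL=C` line is an added assumption to keep `[A-Z]` limited to ASCII uppercase letters.

```shell
#!/bin/bash
LC_ALL=C   # assumption: ASCII-only ranges for [A-Z]

LCS="abcdefghijklmnopqrstuvwxyz"
UCS="ABCDEFGHIJKLMNOPQRSTUVWXYZ"

# Hypothetical tracing variant of lcase: one output line per loop pass.
lcase_trace()
{
  local TARGET="${1-}"
  local UCHAR='' UOFFSET=''

  while [[ "${TARGET}" =~ ([A-Z]) ]]
  do
    UCHAR="${BASH_REMATCH[1]}"                         # first remaining uppercase letter
    UOFFSET="${UCS%%${UCHAR}*}"                        # prefix of UCS before that letter
    TARGET="${TARGET//${UCHAR}/${LCS:${#UOFFSET}:1}}"  # replace all of its occurrences
    echo "replace ${UCHAR}: ${TARGET}"
  done
}

lcase_trace 'Hello World, Hi'
# replace H: hello World, hi
# replace W: hello world, hi
```

Both occurrences of `H` disappear in the first pass, which is why only two passes are needed for three uppercase letters.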

Some performance characteristics of my solution:

  1. Uses only shell builtin utilities, which avoids the overhead of invoking external binary utilities in a new process
  2. Avoids sub-shells, which incur performance penalties
  3. Uses shell mechanisms that are compiled and optimized for performance, such as global string replacement within variables, variable suffix trimming, and regex searching and matching. These mechanisms are far faster than iterating manually through strings
  4. Loops only the number of times required by the count of unique matching characters to be converted. For example, converting a string that has three different uppercase characters to lowercase requires only 3 loop iterations. For the preconfigured ASCII alphabet, the maximum number of loop iterations is 26
  5. UCS and LCS can be augmented with additional characters
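The loop-count claim in item 4 can be checked with an instrumented sketch. `lcase_counted`, `RESULT`, and `LOOPS` are hypothetical names introduced for illustration, and `LC_ALL=C` is an added assumption so that `[A-Z]` covers only the 26 ASCII uppercase letters.

```shell
#!/bin/bash
LC_ALL=C   # assumption: ASCII-only ranges for [A-Z]

LCS="abcdefghijklmnopqrstuvwxyz"
UCS="ABCDEFGHIJKLMNOPQRSTUVWXYZ"

# Hypothetical instrumented copy of lcase: leaves the converted string in
# RESULT and the number of loop passes in LOOPS.
lcase_counted()
{
  local UCHAR='' UOFFSET=''
  RESULT="${1-}"
  LOOPS=0

  while [[ "${RESULT}" =~ ([A-Z]) ]]
  do
    UCHAR="${BASH_REMATCH[1]}"
    UOFFSET="${UCS%%${UCHAR}*}"
    RESULT="${RESULT//${UCHAR}/${LCS:${#UOFFSET}:1}}"
    LOOPS=$(( LOOPS + 1 ))
  done
}

lcase_counted 'AAAA BBBB'
echo "${RESULT} (${LOOPS} loops)"          # aaaa bbbb (2 loops)

lcase_counted "${UCS}${UCS}${UCS}"         # every uppercase letter, three times
echo "${#RESULT} characters (${LOOPS} loops)"   # 78 characters (26 loops)
```

The pass count depends on how many distinct letters need converting, not on the length of the input.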
Appended more detailed timing information


This is a faster variation of JaredTS486's answer that uses native Bash capabilities (including Bash versions <4.0) to optimize his approach. It even seems to perform faster than tr '[:lower:]' '[:upper:]' on my machine!

#!/bin/bash
set -e
set -u

declare LCS="abcdefghijklmnopqrstuvwxyz"
declare UCS="ABCDEFGHIJKLMNOPQRSTUVWXYZ"

function ucase()
{
  local TARGET="${1-}"
  local LCHAR=''
  local LOFFSET=''

  while [[ "${TARGET}" =~ ([a-z]) ]]
  do
    LCHAR="${BASH_REMATCH[1]}"
    LOFFSET="${LCS%%${LCHAR}*}"
    TARGET="${TARGET//${LCHAR}/${UCS:${#LOFFSET}:1}}"
  done

  echo -n "${TARGET}"
}

echo "OUTPUT: [$( ucase 'Change Me To All Capitals' )]"

The approach is simple: while the input string has any remaining lowercase letters present, find the first one, and replace all instances of that letter with its uppercase variant. Repeat until all lowercase letters are replaced.

On my machine, the test string 'Change Me To All Capitals' requires 11 loop iterations (one per distinct lowercase letter) and takes less than 6 seconds for 1,000 executions. Surprisingly, that is faster than invoking tr '[:lower:]' '[:upper:]', which takes over 8 seconds for 1,000 executions. JaredTS486's answer requires 650 loop iterations and over 35 seconds for 1,000 executions.

Note that the execution time drops from less than 6 seconds to less than 5 seconds when the logic is inlined directly within the source code, instead of being wrapped in a Bash function that is invoked via command substitution $( ), which runs in a subshell.
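One way to keep the logic in a function while still avoiding the subshell is to have the function store its result in a caller-named variable. The sketch below is an assumption-laden illustration rather than part of the measured code: `ucase_into` is a hypothetical name, `printf -v` requires Bash 3.1 or later, and `LC_ALL=C` is added so that `[a-z]` means the ASCII range.

```shell
#!/bin/bash
LC_ALL=C   # assumption: ASCII-only ranges for [a-z]

LCS="abcdefghijklmnopqrstuvwxyz"
UCS="ABCDEFGHIJKLMNOPQRSTUVWXYZ"

# ucase_into VARNAME STRING
# Hypothetical variant of ucase that writes the uppercased STRING into the
# variable named VARNAME, so no $( ) subshell is needed to capture it.
# Locals carry a __ prefix to avoid clashing with the caller's variable name.
ucase_into()
{
  local __TARGET="${2-}"
  local __LCHAR='' __LOFFSET=''

  while [[ "${__TARGET}" =~ ([a-z]) ]]
  do
    __LCHAR="${BASH_REMATCH[1]}"
    __LOFFSET="${LCS%%${__LCHAR}*}"
    __TARGET="${__TARGET//${__LCHAR}/${UCS:${#__LOFFSET}:1}}"
  done

  printf -v "$1" '%s' "${__TARGET}"
}

ucase_into OUTPUT 'Change Me To All Capitals'
echo "OUTPUT: [${OUTPUT}]"   # OUTPUT: [CHANGE ME TO ALL CAPITALS]
```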

This is a faster variation of JaredTS486's approach that uses native Bash capabilities (including Bash versions <4.0) to optimize his approach.

I've timed 1,000 iterations of each approach for a small string (25 characters) and a larger string (445 characters, consisting of the poem "The Robin" by Witter Bynner).

Timing results for 1,000 iterations of 25 characters:

Timing results for 1,000 iterations of 445 characters:

  • 9 seconds for tr '[:lower:]' '[:upper:]'
  • 17 seconds for my approach
  • 25 seconds for Orwellophile's approach
  • 829 seconds for JaredTS486's approach

Solution:

#!/bin/bash
set -e
set -u

declare LCS="abcdefghijklmnopqrstuvwxyz"
declare UCS="ABCDEFGHIJKLMNOPQRSTUVWXYZ"

function ucase()
{
  local TARGET="${1-}"
  local LCHAR=''
  local LOFFSET=''

  while [[ "${TARGET}" =~ ([a-z]) ]]
  do
    LCHAR="${BASH_REMATCH[1]}"
    LOFFSET="${LCS%%${LCHAR}*}"
    TARGET="${TARGET//${LCHAR}/${UCS:${#LOFFSET}:1}}"
  done

  echo -n "${TARGET}"
}

echo "OUTPUT: [$( ucase 'Change Me To All Capitals' )]"

The approach is simple: while the input string has any remaining lowercase letters present, find the first one, and replace all instances of that letter with its uppercase variant. Repeat until all lowercase letters are replaced.
