What's happening is that the file is in Windows format, where newlines are represented by the two-character combination CR, LF. You're using Unix tools, which expect newlines to be represented the Unix way, with just the LF character. The shell treats CR (carriage return) as an ordinary character, so it becomes part of the value of the address variable. When you print out the result on a terminal, the terminal interprets the CR character as “go back to the beginning of the current line”, which is why the .html bit that comes after the CR overwrites the beginning of the line.
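If you want to confirm this diagnosis, you can look at the raw bytes. The following is a minimal sketch that builds a throwaway sample file with Windows line endings (the temporary file and the variable names are made up for the demonstration; substitute your own addresses.txt):

```shell
# Create a sample file with Windows (CR LF) line endings.
# The temporary file stands in for addresses.txt; it is only an illustration.
sample=$(mktemp)
printf 'example.com\r\nexample.org\r\n' > "$sample"

# od -c prints each byte; a CR shows up as \r just before the \n.
od -c "$sample"

# Count the lines that contain a CR character.
cr=$(printf '\r')
crlf_lines=$(grep -c "$cr" "$sample")
echo "lines containing CR: $crlf_lines"

rm -f "$sample"
```

If od shows \r before each \n, or the count is not zero, the file has Windows newlines.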
You can convert the file to use Unix newlines.
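One way to do the conversion, sketched here with tr, is to delete every CR byte. This assumes the file contains no CR characters that you want to keep; the temporary files below are placeholders for your real input and output files:

```shell
# Hypothetical sample input with Windows (CR LF) line endings.
src=$(mktemp)
dst=$(mktemp)
printf 'example.com\r\nexample.org\r\n' > "$src"

# Delete every CR byte, writing the result to a new file.
tr -d '\r' < "$src" > "$dst"

# The original loop now works unchanged on the converted file.
result=$(while read -r address; do echo "${address}.html"; done < "$dst")
echo "$result"

rm -f "$src" "$dst"
```

If the dos2unix utility is installed on your system, it does the same job in place on the file you name.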
If you want your code to be robust to input files with Windows newline encoding, you can tell the shell to treat CR as a separator character by adding it to the IFS variable, so that read strips it from the end of the line.
while IFS="$IFS$(printf '\r')" read address; do
  echo "${address}.html"
done <addresses.txt
Another solution is to strip the CR character from the end of the value, if it's there, using a parameter expansion. Note that backslash-newline for line continuation won't work if the file actually contains backslash-CR-newline, so you should turn backslash processing off with read -r to avoid confusion.
CR=$(printf '\r')
while read -r address; do
  # Remove one trailing CR, if present
  address=${address%"$CR"}
  echo "${address}.html"
done <addresses.txt
In ksh93, bash and zsh, you can use $'\r' instead of $(printf '\r').
while read -r addr; do echo "${addr%$'\r'}.html"; done < addresses.txt