How to avoid the two most common pitfalls when using the read command

These are easy mistakes to make, as the bash builtin read command returns a non-zero exit status on the end-of-file condition and uses the Internal Field Separator (IFS) to split the line into words. It is kind of tricky, but it is worth knowing how to deal with such problems.

Skip to the third step (while loop) or the fourth step (until loop) to see the final solution.

I will use the following input sample to show you each issue separately.

$ printf "first\tline\n\nthird line \n  fourth line"
first	line

third line 
  fourth line$ _

Notice that the first line contains a tab character in the middle, the second line is empty (a single newline character), the third line contains a trailing white space character, and the fourth line contains two leading white space characters and does not end with a newline character.

It contains 4 lines and 37 characters.
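You can double-check these numbers with wc. This is only a quick sketch, and the variable names (sample, n_bytes, n_newlines) are my own, not part of the scripts below. Keep in mind that wc -l counts newline characters, so it reports 3 for this sample because the fourth line is not terminated.

```shell
#!/bin/bash
# Hypothetical variable holding the same printf format string as above.
sample='first\tline\n\nthird line \n  fourth line'

# wc -c counts bytes.
n_bytes=$(printf "$sample" | wc -c)
# wc -l counts newline characters, so the unterminated fourth line is not counted.
n_newlines=$(printf "$sample" | wc -l)

# $(( )) strips the padding that some wc implementations add.
echo "bytes: $((n_bytes)), newlines: $((n_newlines))"
# prints: bytes: 37, newlines: 3
```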

First step - look at the while loop

Create the simplest possible shell script to count lines and characters.

#!/bin/bash

# set initial number of lines
n_lines=0
# set initial number of characters
n_chars=0

while read -r -u 0 line; do
  n_lines=$(expr $n_lines \+ 1)
  l_chars=$(printf "%s\n" "$line" | wc -c)
  n_chars=$(expr $n_chars \+ $l_chars)
  printf "Parsed line '%s'\n" "$line"
done

echo "Lines: $n_lines"
echo "Characters: $n_chars"

Check it out.

$ printf "first\tline\n\nthird line \n  fourth line" | bash ./read_inside_while_loop_1.sh 
Parsed line 'first	line'
Parsed line ''
Parsed line 'third line'
Lines: 3
Characters: 23

There are two problems. The last line is missing because it does not end with a newline character, and the leading/trailing white space has disappeared (look at the third line).
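The missing last line comes down to the exit status of read. Here is a minimal sketch that shows it in isolation; the variable names result_complete and result_partial are made up for this example.

```shell
#!/bin/bash
# A line terminated by a newline: read succeeds.
result_complete=$(printf 'complete\n' | { read -r line; echo "status=$? line='$line'"; })

# A line without a trailing newline: read reaches end-of-file and fails,
# but the text read so far is still stored in $line.
result_partial=$(printf 'partial' | { read -r line; echo "status=$? line='$line'"; })

echo "$result_complete"   # prints: status=0 line='complete'
echo "$result_partial"    # prints: status=1 line='partial'
```

Since the while loop only runs its body when read succeeds, the unterminated line never gets there, even though its content has been read.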

Second step - fix the last line without newline character

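Update the shell script to handle the last line. When it does not end with a newline character, read returns a non-zero exit status and the loop body is skipped, but the text read so far is still stored in the variable.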

#!/bin/bash

# set initial number of lines
n_lines=0
# set initial number of characters
n_chars=0

while read -r -u 0 line; do
  n_lines=$(expr $n_lines \+ 1)
  l_chars=$(printf "%s\n" "$line" | wc -c)
  n_chars=$(expr $n_chars \+ $l_chars)
  printf "Parsed line '%s'\n" "$line"
done

if [ -n "$line" ]; then
  n_lines=$(expr $n_lines \+ 1)
  l_chars=$(printf "%s" "$line" | wc -c)
  n_chars=$(expr $n_chars \+ $l_chars)
  printf "Parsed line '%s'\n" "$line"
fi

echo "Lines: $n_lines"
echo "Characters: $n_chars"

Check it out.

$ printf "first\tline\n\nthird line \n  fourth line" | bash ./read_inside_while_loop_2.sh 
Parsed line 'first	line'
Parsed line ''
Parsed line 'third line'
Parsed line 'fourth line'
Lines: 4
Characters: 34

It looks better. One problem is solved, but the leading/trailing white space is still missing.
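To see the white space issue on its own, compare read with the default IFS against read with an empty IFS. This is just a sketch with made-up variable names (with_default_ifs, with_empty_ifs); the IFS= prefix changes the variable only for that single read command.

```shell
#!/bin/bash
# Default IFS (space, tab, newline): leading and trailing white space is stripped.
with_default_ifs=$(printf '  padded  \n' | { read -r line; printf '[%s]' "$line"; })

# Empty IFS, set for this one read command only: the line is kept as it is.
with_empty_ifs=$(printf '  padded  \n' | { IFS= read -r line; printf '[%s]' "$line"; })

echo "default IFS: $with_default_ifs"   # prints: default IFS: [padded]
echo "empty IFS:   $with_empty_ifs"     # prints: empty IFS:   [  padded  ]
```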

Third step - fix the missing leading/trailing white spaces

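Update the shell script to preserve leading/trailing white space characters. All you need to do now is set the Internal Field Separator to an empty string, as it contains the characters that are used to split the line into words. White space characters listed in it are also stripped from both ends of the line.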

#!/bin/bash

# set initial number of lines
n_lines=0
# set initial number of characters
n_chars=0

OLDIFS="$IFS"
IFS=""
while read -r -u 0 line; do
  n_lines=$(expr $n_lines \+ 1)
  l_chars=$(printf "%s\n" "$line" | wc -c)
  n_chars=$(expr $n_chars \+ $l_chars)
  printf "Parsed line '%s'\n" "$line"
done

if [ -n "$line" ]; then
  n_lines=$(expr $n_lines \+ 1)
  l_chars=$(printf "%s" "$line" | wc -c)
  n_chars=$(expr $n_chars \+ $l_chars)
  printf "Parsed line '%s'\n" "$line"
fi

IFS="$OLDIFS"

echo "Lines: $n_lines"
echo "Characters: $n_chars"

Check it out.

$ printf "first\tline\n\nthird line \n  fourth line" | bash ./read_inside_while_loop_3.sh 
Parsed line 'first	line'
Parsed line ''
Parsed line 'third line '
Parsed line '  fourth line'
Lines: 4
Characters: 37

Success. The shell script using the while loop displays the correct results.
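If you only need to visit every line, a more compact variant is commonly used: prefix read with IFS= so the change applies to that command only, and add a test on the variable so the unterminated last line still enters the loop. Treat it as a sketch, not a replacement for the script above. The function name parse_lines is hypothetical, and it only counts lines, because inside the merged loop you can no longer tell whether the last line ended with a newline.

```shell
#!/bin/bash
# parse_lines is a hypothetical helper name used only for this sketch.
parse_lines() {
  local line n_lines=0
  # IFS= affects only this read; the || test catches a last line without newline.
  while IFS= read -r line || [ -n "$line" ]; do
    n_lines=$((n_lines + 1))
    printf "Parsed line '%s'\n" "$line"
  done
  echo "Lines: $n_lines"
}

output=$(printf "first\tline\n\nthird line \n  fourth line" | parse_lines)
echo "$output"
```

With this form there is no need to save and restore IFS, as the shell-wide value is never changed.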

Fourth step - alternative solution

If you are looking for something slightly different, use the until loop. I will skip the adjustment process as it is very similar to the previous one.

#!/bin/bash

# set initial number of lines
n_lines=0
# set initial number of characters
n_chars=0

OLDIFS="$IFS"
IFS=""

file_read=false
until $file_read; do
  read -r -u 0 line || file_read=true
  if [ "$file_read" = false ]; then
    n_lines=$(expr $n_lines \+ 1)
    l_chars=$(printf "%s\n" "$line" | wc -c)
    n_chars=$(expr $n_chars \+ $l_chars)
    printf "Parsed line '%s'\n" "$line"
  elif [ -n "$line" ]; then
    n_lines=$(expr $n_lines \+ 1)
    l_chars=$(printf "%s" "$line" | wc -c)
    n_chars=$(expr $n_chars \+ $l_chars)
    printf "Parsed line '%s'\n" "$line"
  fi
done
IFS="$OLDIFS"

echo "Lines: $n_lines"
echo "Characters: $n_chars"

Check it out.

$ printf "first\tline\n\nthird line \n  fourth line" | bash ./read_inside_while_loop_3.sh 
Parsed line 'first	line'
Parsed line ''
Parsed line 'third line '
Parsed line '  fourth line'
Lines: 4
Characters: 37

Success. The shell script using the until loop displays the correct results.

Want to know more? Read the bash manual page and look for the IFS special variable and the read builtin command.