How to calculate how fast data is copied to the specified directory

Estimate how fast data is arriving in a directory, and from that how long a copy between directories is likely to take.

Create a shell script that will calculate how much data was copied to the specified directory in one minute and use that value to create a simple prognosis for the whole hour and day.

#!/bin/bash
# calculate how fast data is copied to specified directory

# base 1000 or 1024
base=1024

# GNU du suffixes: M = 1024*1024 bytes, MB = 1000*1000 bytes
if [ "$base" -eq 1000 ]; then
  block_size=1MB
else
  block_size=1M
fi

# usage
usage() {
  echo "Usage:"
  echo " $0 directory"
  echo ""
}

# pretty print total amount of data using the same unit as du
pretty_print() {
  amount_of_data="$1"

  # du-style suffixes: M/G/T for base 1024, MB/GB/TB for base 1000
  if [ "$base" -eq 1000 ]; then
    suffix="B"
  else
    suffix=""
  fi
  unit="M${suffix}"

  if [ "$amount_of_data" -ge "$base" ]; then
    amount_of_data=$((amount_of_data / base))  # gigabytes
    unit="G${suffix}"
  fi

  if [ "$amount_of_data" -ge "$base" ]; then
    amount_of_data=$((amount_of_data / base))  # terabytes
    unit="T${suffix}"
  fi

  # print output using the same units as du
  printf "%4s %s\n" "${amount_of_data}" "${unit}"
}


if [ "$#" -eq 1 ] && [ -d "$1" ]; then
  directory="$1"

  # du params:
  #   use defined block size (1M/1MB),
  #   use apparent size,
  #   display only total summary
  # (left unquoted below on purpose so it splits into separate options)
  disk_usage_params="--block-size=${block_size} --apparent-size --summarize"

  # measure the size twice, one minute apart, then subtract:
  # tac puts the later size first, paste joins both into "later-earlier"
  difference_per_minute=$( (du $disk_usage_params "$directory"; \
                            sleep 1m; \
                            du $disk_usage_params "$directory") 2>/dev/null | \
                              cut -f 1 | tac | paste --serial --delimiters=- | bc)

  if [ "${difference_per_minute:-0}" -gt 0 ]; then
    difference_per_hour=$((difference_per_minute * 60))
    difference_per_day=$((difference_per_hour * 24))

    # keep only the size column, du also prints the directory name
    current_directory_size=$(du $disk_usage_params "$directory" 2>/dev/null | cut -f 1)

    echo "Amount of data copied per minute         is $(pretty_print "$difference_per_minute")"
    echo "---------------------------------------------------"
    echo "Prognosis for a whole hour               is $(pretty_print "$difference_per_hour")"
    echo "Prognosis for a whole day                is $(pretty_print "$difference_per_day")"
    echo "---------------------------------------------------"
    echo "Current directory size                   is $(pretty_print "$current_directory_size")"
  else
    echo "The amount of data did not change during one minute"
  fi
else
  usage
fi

Sample usage.

$ check.sh 
Usage:
 check.sh directory

Sample scenario when data is not copied to the specified directory.

$ check.sh /var/backups/
The amount of data did not change during one minute

Sample scenario when data is copied to the specified directory over the network.

$ check.sh /srv/backup
Amount of data copied per minute         is  243 M
---------------------------------------------------
Prognosis for a whole hour               is   14 G
Prognosis for a whole day                is  341 G
---------------------------------------------------
Current directory size                   is  241 G
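
Once the per-minute rate is known, the remaining copy time can be roughly estimated as well. The following is only a minimal sketch: eta_minutes is a hypothetical helper that is not part of the script above, the expected final size (500000 here) is an invented figure for illustration, and the estimate assumes the rate stays constant. All three arguments must use the same unit; 246784 is the 241 G from the sample output expressed in M (241 * 1024).

```shell
#!/bin/bash
# Sketch: estimate remaining copy time from a measured per-minute rate.
# eta_minutes is a hypothetical helper, not part of the original script.

eta_minutes() {
  total="$1"     # expected final size (assumed to be known in advance)
  current="$2"   # current directory size
  rate="$3"      # amount of data copied per minute

  if [ "$rate" -le 0 ]; then
    echo "unknown"   # no progress measured, so no estimate is possible
    return 1
  fi

  if [ "$current" -ge "$total" ]; then
    echo 0           # nothing left to copy
    return 0
  fi

  # integer division rounded up, so a partial last minute still counts
  echo $(( (total - current + rate - 1) / rate ))
}

# Example values: 500000 is an assumed target size, the rest comes from the sample run
minutes=$(eta_minutes 500000 246784 243)
printf "About %d minutes (roughly %d hours) remaining\n" "$minutes" $((minutes / 60))
```

Because the rate is sampled over a single minute, it is worth repeating the measurement a few times before trusting the estimate.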

Hint about the most exciting part of this shell script

cut (get the first column), tac (swap lines), paste (create a subtraction equation) and bc (calculate the equation) can be replaced by a single awk command, but I like it more this way.
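
For comparison, here is one way the awk variant could look. This is a sketch, not the author's code: both function names are hypothetical, and the input is simulated du output (size, tab, directory) instead of a real one-minute measurement.

```shell
#!/bin/bash
# Sketch: compute "later size minus earlier size" from two lines of du output.
# Both function names are hypothetical helpers used only for this comparison.

# Approach used in the script: cut | tac | paste | bc
size_difference_pipeline() {
  cut -f 1 | tac | paste --serial --delimiters=- | bc
}

# awk approach: remember the first and the last size column, print the difference
size_difference_awk() {
  awk 'NR == 1 { first = $1 } { last = $1 } END { print last - first }'
}

# Simulated du output taken one minute apart
sample=$(printf '1000\t/srv/backup\n1243\t/srv/backup\n')

printf '%s\n' "$sample" | size_difference_pipeline   # prints 243
printf '%s\n' "$sample" | size_difference_awk        # prints 243
```

In the script itself, size_difference_awk would take the place of everything after the closing parenthesis of the du/sleep/du group.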