How to automate mouse and keyboard

Recently I was wondering whether it is possible to automate mouse and keyboard actions. The answer is yes: the xautomation and xdotool tools do exactly that. I will describe them using a couple of simple examples.

Each script shown here is meant to be assigned to a keyboard shortcut. Treat the examples as a source of ideas, as the possibilities are almost endless, and do not forget to read the available manual pages.

Have fun!

Preparations

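Install the following packages to automate mouse and keyboard (on Debian, xwd is part of x11-apps, xprop and xwininfo are part of x11-utils, and xwdtopnm and pnmtopng are part of netpbm):

$ sudo apt-get install xdotool
$ sudo apt-get install xautomation
$ sudo apt-get install x11-apps x11-utils netpbm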

Example 1

The first script is very simple: it types the current date into whatever window has the keyboard focus when you run it.

[Image: kwrite_date]

#!/bin/sh

# used command
xte_command=`which xte`

if [ -n "$xte_command" ]; then
  # set desired date format
  date=`date +%d.%m.%Y`

  # generate fake input
  $xte_command "str ${date}"
fi
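
The date format is the part of this script you are most likely to change. The following is a minimal sketch, assuming a hypothetical helper named build_date_command (it is not part of xautomation), that separates building the xte argument from sending it, so the format can be passed as a parameter and checked without generating any input:

```shell
#!/bin/sh

# Hypothetical helper: print the xte argument that types the current date.
# $1 is an optional date(1) format; it defaults to the format used above.
build_date_command() {
  format="${1:-%d.%m.%Y}"
  printf 'str %s\n' "$(date +"$format")"
}
```

To actually type an ISO-style date you would run something like xte "$(build_date_command %Y-%m-%d)" from the keyboard shortcut.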

Example 2A

This script adds the "https://" string to the beginning of the URL in the address bar. It works only in Google Chrome and is useless in other programs, but the next example extends it.

[Image: google_chrome_auto_https]

Note that you need xte's sleep (seconds) and usleep (microseconds) commands to give the application time to react between steps.

#!/bin/sh

# used command
xte_command=`which xte`

if [ -n "$xte_command" ]; then
  # press F6, Home, enter "https://" string and press return
  $xte_command "sleep 1" "key F6" "usleep 10000" "key Home" "str https://" "key Return"
fi
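
xte's sleep command takes seconds and usleep takes microseconds, so the 10000 used above is a 10 ms pause. A small sketch, assuming a hypothetical helper called pause_ms, that makes such delays easier to read and tune:

```shell
#!/bin/sh

# Hypothetical helper: print an xte pause command for a delay in milliseconds.
# Whole seconds are expressed with "sleep", anything else with "usleep".
pause_ms() {
  ms="$1"
  if [ "$ms" -gt 0 ] && [ $(($ms % 1000)) -eq 0 ]; then
    printf 'sleep %d\n' $(($ms / 1000))
  else
    printf 'usleep %d\n' $(($ms * 1000))
  fi
}
```

With it, the call above could be written as xte "$(pause_ms 1000)" "key F6" "$(pause_ms 10)" "key Home" "str https://" "key Return".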

Example 2B

This is an extended version of the previous script: it sends the defined keys only if Google Chrome is the active window. You can extend it to support other browsers, or even to perform different tasks in different applications using the same keyboard shortcut.

#!/bin/sh

# used commands
xprop_command=`which xprop`
xdotool_command=`which xdotool`
xte_command=`which xte`

if [ -n "$xprop_command" -a -n "$xdotool_command" -a -n "$xte_command" ]; then
  # get active window id
  active_window_id=`$xdotool_command getactivewindow`

  # get class of the active window
  window_class=`$xprop_command -id $active_window_id | sed -n -e "s/^WM_CLASS(STRING).*\"\(.*\)\", \".*\"/\1/ p"`

  # execute only when active window is Google Chrome
  if [ "$window_class" = "google-chrome" ]; then
    # press F6, Home, enter "https://" string and press return
    $xte_command "sleep 1" "key F6" "usleep 10000" "key Home" "str https://" "key Return"
  fi
fi
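
To perform different tasks in different applications with one shortcut, the class check can be turned into a lookup. The sketch below is only an illustration: wm_class_instance and commands_for_class are hypothetical helpers, and every class name other than "google-chrome" (as well as the konsole action) is an assumption you should verify with xprop on your own system.

```shell
#!/bin/sh

# Read `xprop -id <window>` output on stdin and print the first WM_CLASS
# string (the instance name), for example "google-chrome".
wm_class_instance() {
  sed -n 's/^WM_CLASS(STRING) = "\([^"]*\)", "[^"]*"$/\1/p'
}

# Map a window class to the xte commands to send, one command per line.
# Returns 1 and prints nothing for an unknown application.
commands_for_class() {
  case "$1" in
    google-chrome|chromium)   # "chromium" is an assumed class name
      printf '%s\n' "key F6" "usleep 10000" "key Home" "str https://" "key Return"
      ;;
    konsole)                  # assumed class name and example action
      printf '%s\n' "str date" "key Return"
      ;;
    *)
      return 1
      ;;
  esac
}
```

The commands can then be sent one by one, for example with commands_for_class "$window_class" | while read -r c; do xte "$c"; done.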

Example 2C

This script will open debian.org, lwn.net, debian-news.net and freecode.com in new tabs and then return to the first opened tab.

[Image: google_chrome_multiple_tabs]

#!/bin/sh

# used commands
xprop_command=`which xprop`
xdotool_command=`which xdotool`
xte_command=`which xte`

if [ -n "$xprop_command" -a -n "$xdotool_command" -a -n "$xte_command" ]; then
  # get active window id
  active_window_id=`$xdotool_command getactivewindow`

  # get class of the active window
  window_class=`$xprop_command -id $active_window_id | sed -n -e "s/^WM_CLASS(STRING).*\"\(.*\)\", \".*\"/\1/ p"`

  # execute only when active window is Google Chrome
  if [ "$window_class" = "google-chrome" ]; then
    # open debian.org in new tab
    $xte_command "sleep 1" "keydown Control_L" "key t" "keyup Control_L"
    $xte_command "str debian.org" "key Return"

    # open lwn.net in new tab
    $xte_command "keydown Control_L" "key t" "keyup Control_L"
    $xte_command "str lwn.net" "key Return"

    # open debian-news.net in new tab
    $xte_command "keydown Control_L" "key t" "keyup Control_L"
    $xte_command "str debian-news.net" "key Return"

    # open freecode.com in new tab
    $xte_command "keydown Control_L" "key t" "keyup Control_L"
    $xte_command "str freecode.com" "key Return"

    # execute 'open previous tab' three times
    for i in $(seq 1 3); do
      $xte_command "keydown Control_L" "keydown Shift_L" "key Tab" "keyup Shift_L" "keyup Control_L"
    done
  fi
fi
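
The four nearly identical blocks can also be generated from a list of sites. This is a minimal sketch, assuming a hypothetical helper tab_commands that only prints the xte commands (one per line) instead of sending them, which makes it easy to inspect what would be typed:

```shell
#!/bin/sh

# Print the xte commands that open each given site in a new tab and then
# step back to the first opened tab.
tab_commands() {
  for site in "$@"; do
    printf '%s\n' "keydown Control_L" "key t" "keyup Control_L" \
                  "str $site" "key Return"
  done

  # one 'previous tab' step for every site after the first
  i=1
  while [ "$i" -lt "$#" ]; do
    printf '%s\n' "keydown Control_L" "keydown Shift_L" "key Tab" \
                  "keyup Shift_L" "keyup Control_L"
    i=$((i + 1))
  done
}
```

Inside the Google Chrome check you could then run tab_commands debian.org lwn.net debian-news.net freecode.com | while read -r c; do xte "$c"; done, preceded by the same "sleep 1" as in the script above.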

Example 3A

This script clicks the full-screen button on an embedded YouTube video. It takes a screenshot of the active window, scans it for the full-screen button pattern, and then moves the mouse to that button and clicks it.

First, create a screenshot of the YouTube video.

[Image: YouTube video]

Extract the full-screen button into a separate image:

[Image: youtube_fs_button]

Convert it to PAT format:

$ png2pat youtube_fs_button.png > youtube_fs_button.pat

Now you can search the screenshot for the button using the created pattern:

$ visgrep youtube.png youtube_fs_button.pat youtube_fs_button.pat
614,369 0
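
Each result line holds the position of the match inside the screenshot followed by a pattern index. The sketch below, built around a hypothetical helper named parse_match, splits such a line into separate fields; it tolerates an optional space after the comma and prints nothing for a line it does not recognise.

```shell
#!/bin/sh

# Parse one visgrep result line ("x,y index") into "x y index".
parse_match() {
  printf '%s\n' "$1" |
    sed -n 's/^\([0-9][0-9]*\), *\([0-9][0-9]*\) \(-\{0,1\}[0-9][0-9]*\)$/\1 \2 \3/p'
}
```

A script can then pick up the fields with set -- $(parse_match "$line") and read the coordinates from $1 and $2.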

Complete shell script:

#!/bin/sh

# used commands
xprop_command=`which xprop`
xdotool_command=`which xdotool`
xte_command=`which xte`
visgrep_command=`which visgrep`
xwd_command=`which xwd`
xwdtopnm_command=`which xwdtopnm`
pnmtopng_command=`which pnmtopng`
xwininfo_command=`which xwininfo`

# PAT file
pat_file="/home/milosz/Documents/youtube_fs_button.pat"

if [ -n "$xprop_command" -a -n "$xdotool_command" -a -n "$xte_command" -a -n "$visgrep_command" -a -n "$xwd_command" -a -n "$xwdtopnm_command" -a -n "$pnmtopng_command" -a -n "$xwininfo_command" ]; then
  # get active window id
  active_window_id=`$xdotool_command getactivewindow`

  # get class of the active window
  window_class=`$xprop_command -id $active_window_id | sed -n -e "s/^WM_CLASS(STRING).*\"\(.*\)\", \".*\"/\1/ p"`

  # execute only when active window is Google Chrome
  if [ "$window_class" = "google-chrome" ]; then
    # create temporary file to store screen shot
    tmp_file=`mktemp`

    # move active window to foreground
    $xdotool_command windowactivate $active_window_id

    # create window screen shot
    $xwd_command -id $active_window_id | $xwdtopnm_command | $pnmtopng_command > $tmp_file

    # search for a pattern (keep only the first match)
    rpos=`${visgrep_command} $tmp_file $pat_file $pat_file | head -n 1`
    if [ -n "$rpos" ]; then
      # get window position
      x1=`$xwininfo_command -id $active_window_id | awk '/Absolute upper-left X/ {print $NF}'`
      y1=`$xwininfo_command -id $active_window_id | awk '/Absolute upper-left Y/ {print $NF}'`

      # get found pattern position (inside window)
      x2=`echo $rpos | sed "s/\(.*\),.*/\1/"`
      y2=`echo $rpos | sed "s/.*,\(.*\) 0/\1/"`

      # calculate resulting position (add '5' to be inside)
      x=`expr $x1 + $x2 + 5`
      y=`expr $y1 + $y2 + 5`

      $xte_command "mousemove $x $y" "mouseclick 1" "sleep 1" "key space"
    fi
    unlink $tmp_file
  fi
fi
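
The click position is plain arithmetic: the absolute position of the window plus the position of the match inside the window plus a small inset so that the pointer lands inside the button. A sketch of that calculation, using two hypothetical helpers (window_origin and click_point) that work on text only and can therefore be tried without an X session:

```shell
#!/bin/sh

# Read `xwininfo -id <window>` output on stdin and print "X Y" of the
# window's absolute upper-left corner.
window_origin() {
  awk '/Absolute upper-left X/ {x = $NF}
       /Absolute upper-left Y/ {y = $NF}
       END {print x, y}'
}

# Print the absolute "x y" to click.
# $1 $2: window origin, $3 $4: match position, $5: optional inset (default 5)
click_point() {
  inset="${5:-5}"
  printf '%d %d\n' $(($1 + $3 + $inset)) $(($2 + $4 + $inset))
}
```

For a window at 100,50 and a match at 614,369, click_point 100 50 614 369 prints "719 424", which is what the script passes to mousemove.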

Example 3B

This script searches all visible Google Chrome windows for one whose open tab contains a YouTube video, then clicks the full-screen button and starts playback.

#!/bin/sh

# used commands
xprop_command=`which xprop`
xdotool_command=`which xdotool`
xte_command=`which xte`
visgrep_command=`which visgrep`
xwd_command=`which xwd`
xwdtopnm_command=`which xwdtopnm`
pnmtopng_command=`which pnmtopng`
xwininfo_command=`which xwininfo`

# PAT file
pat_file="/home/milosz/Documents/youtube_fs_button.pat"

if [ -n "$xprop_command" -a -n "$xdotool_command" -a -n "$xte_command" -a -n "$visgrep_command" -a -n "$xwd_command" -a -n "$xwdtopnm_command" -a -n "$pnmtopng_command" -a -n "$xwininfo_command" ]; then
  # get visible windows
  window_list=`$xprop_command -root | sed -n -e "/_NET_CLIENT_LIST_STACKING(WINDOW): window id #/ {s/.* # \(.*\)/\1/;s/,//g;p}"`

  # create temporary file to store screen shot
  tmp_file=`mktemp`

  # check all visible windows
  for window_id in $window_list; do
    # get window class
    window_class=`$xprop_command -id $window_id | sed -n -e "s/^WM_CLASS(STRING).*\"\(.*\)\", \".*\"/\1/ p"`

    # check window class
    if [ "$window_class" = "google-chrome" ]; then
      # move window to foreground
      $xdotool_command windowactivate $window_id

      # create screen shot
      $xwd_command -id $window_id | $xwdtopnm_command | $pnmtopng_command > $tmp_file

      # search for a pattern (keep only the first match)
      rpos=`${visgrep_command} $tmp_file $pat_file $pat_file | head -n 1`
      if [ -n "$rpos" ]; then
        # get window position
        x1=`$xwininfo_command -id $window_id | awk '/Absolute upper-left X/ {print $NF}'`
        y1=`$xwininfo_command -id $window_id | awk '/Absolute upper-left Y/ {print $NF}'`

        # get found pattern position (inside window)
        x2=`echo $rpos | sed "s/\(.*\),.*/\1/"`
        y2=`echo $rpos | sed "s/.*,\(.*\) 0/\1/"`

        # calculate resulting position (add '5' to be inside)
        x=`expr $x1 + $x2 + 5`
        y=`expr $y1 + $y2 + 5`

        $xte_command "mousemove $x $y" "mouseclick 1" "sleep 1" "key space"
        unlink $tmp_file
        break
      fi
    fi
  done

  if [ -e "$tmp_file" ]; then
    unlink $tmp_file
  fi
fi
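
The window list comes from the _NET_CLIENT_LIST_STACKING property of the root window, which the EWMH specification defines as ordered from bottom to top. The script therefore visits the lowest window first. The following sketch, with two hypothetical helpers, shows how the list could be extracted and reversed if you would rather try the topmost window first:

```shell
#!/bin/sh

# Read `xprop -root` output on stdin and print the stacked window ids
# on one line, separated by spaces (bottom-most window first).
stacking_window_ids() {
  sed -n 's/^_NET_CLIENT_LIST_STACKING(WINDOW): window id # //p' | tr -d ','
}

# Print the given words in reverse order on one line.
reverse_words() {
  out=""
  for w in "$@"; do
    out="$w${out:+ $out}"
  done
  printf '%s\n' "$out"
}
```

In the script this would read window_list=`reverse_words $($xprop_command -root | stacking_window_ids)`.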

Example 4

This script selects all folders in a Konqueror window. It can easily be modified to perform different tasks based on file icons.

Create a screenshot of the Konqueror window.

[Image: konqueror]

Extract the folder icon:

[Image: folder]

Convert it to a PAT file:

$ png2pat folder.png > folder.pat

Complete shell script:

#!/bin/sh

# used commands
xprop_command=`which xprop`
xdotool_command=`which xdotool`
xte_command=`which xte`
visgrep_command=`which visgrep`
xwd_command=`which xwd`
xwdtopnm_command=`which xwdtopnm`
pnmtopng_command=`which pnmtopng`
xwininfo_command=`which xwininfo`

# PAT file
pat_file="/home/milosz/Documents/folder.pat"

if [ -n "$xprop_command" -a -n "$xdotool_command" -a -n "$xte_command" -a -n "$visgrep_command" -a -n "$xwd_command" -a -n "$xwdtopnm_command" -a -n "$pnmtopng_command" -a -n "$xwininfo_command" ]; then
  # get active window id
  active_window_id=`$xdotool_command getactivewindow`

  # get class of the active window
  window_class=`$xprop_command -id $active_window_id | sed -n -e "s/^WM_CLASS(STRING).*\"\(.*\)\", \".*\"/\1/ p"`

  # execute only when active window is Konqueror
  if [ "$window_class" = "konqueror" ]; then
    # create temporary file to store screen shot
    tmp_file=`mktemp`

    # move active window to foreground
    $xdotool_command windowactivate $active_window_id

    # create window screen shot
    $xwd_command -id $active_window_id | $xwdtopnm_command | $pnmtopng_command > $tmp_file

    # search for a pattern (remove the pattern index at the end of each line)
    rpos=`${visgrep_command} $tmp_file $pat_file $pat_file | sed 's/ [0-9-]*$//'`

    # get window position (it does not change between clicks)
    x1=`$xwininfo_command -id $active_window_id | awk '/Absolute upper-left X/ {print $NF}'`
    y1=`$xwininfo_command -id $active_window_id | awk '/Absolute upper-left Y/ {print $NF}'`

    # press shift and click on each folder
    for folder_pos in $rpos; do
      # get found pattern position (inside window)
      x2=`echo $folder_pos | sed "s/\(.*\),.*/\1/"`
      y2=`echo $folder_pos | sed "s/.*,\(.*\)/\1/"`

      # calculate resulting position (add '5' to be inside)
      x=`expr $x1 + $x2 + 5`
      y=`expr $y1 + $y2 + 5`

      $xte_command "keydown Shift_L" "mousemove $x $y" "mouseclick 1" "keyup Shift_L"
    done

    # remove temporary file
    unlink $tmp_file
  fi
fi
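
Because visgrep prints one line per match, the whole loop can be expressed as a filter that turns match lines into xte commands. This is a sketch under that assumption; shift_click_commands is a hypothetical helper that only prints the commands, so you can review them before sending anything.

```shell
#!/bin/sh

# Read visgrep result lines ("x,y index") on stdin and print, for each match,
# the xte commands for a Shift+click at the absolute screen position.
# $1 $2: absolute window origin, $3: optional inset (default 5)
shift_click_commands() {
  origin_x="$1"
  origin_y="$2"
  inset="${3:-5}"
  while IFS=', ' read -r match_x match_y match_index; do
    [ -n "$match_y" ] || continue   # skip blank or incomplete lines
    printf '%s\n' "keydown Shift_L" \
      "mousemove $(($origin_x + $match_x + $inset)) $(($origin_y + $match_y + $inset))" \
      "mouseclick 1" "keyup Shift_L"
  done
}
```

Changing the printed commands (for example to a double click or a different modifier key) is then enough to perform another task on every matched icon.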

Konqueror window after the script is executed:

[Image: konqueror_selected_folders]

Notes

As you can see, there are many ways to automate mouse and keyboard. You can automatically feed data to any application, automate tedious tasks or even hide specific windows. I have barely scratched the surface. The rest is up to you.

You are not forced to use xwd to create screenshots, as the ImageMagick import command will do fine.
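
As a rough sketch of that alternative, the hypothetical helper below takes a window screenshot with import when it is available and falls back to the xwd pipeline used in the examples otherwise. It assumes that an import command is on the PATH (newer ImageMagick releases may only provide it through the magick front end) and that it writes PNG based on the file extension.

```shell
#!/bin/sh

# Save a screenshot of the window with id $1 to the PNG file $2.
capture_window() {
  window_id="$1"
  out_file="$2"
  if command -v import >/dev/null 2>&1; then
    # ImageMagick: -window selects the window to capture by id
    import -window "$window_id" "$out_file"
  else
    # fallback: the xwd pipeline from the examples above
    xwd -id "$window_id" | xwdtopnm | pnmtopng > "$out_file"
  fi
}
```

Since mktemp creates a file without an extension, you may need to give the temporary file a .png suffix or prefix the name with png: when using import.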


About Milosz Galazka

Milosz is a Linux Foundation Certified Engineer working for a successful Polish company as a system administrator and a long time supporter of Free Software Foundation and Debian operating system. He is also open for new opportunities and challenges.

Gdansk, Poland https://sleeplessbeastie.eu