How to merge multiple known_hosts entries into a single row

The global known_hosts file, /etc/ssh/ssh_known_hosts, and the user-managed ~/.ssh/known_hosts both contain the public keys of known hosts. Over time these files can accumulate multiple entries that use the same key. This is not a problem in itself, but merging such entries is a necessary first step if you want to inspect the file visually and verify host entries.
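Before merging anything, it can help to see how many keys are actually shared between entries. The following is a minimal sketch, not part of the original script: count_shared_keys is a hypothetical helper name, and it assumes a POSIX awk and the usual field layout (optional marker, host names, key type, key).

```shell
#!/bin/sh
# count_shared_keys FILE (hypothetical helper): print "<count> <type> <key>"
# for every key that appears on more than one line of FILE.
count_shared_keys() {
  awk '
    /^[ \t]*$/ || /^[ \t]*#/ { next }          # skip blank lines and comments
    {
      i = (substr($1, 1, 1) == "@") ? 3 : 2    # a marker shifts the key type by one field
      count[$i " " $(i + 1)]++
    }
    END {
      for (k in count)
        if (count[k] > 1) print count[k], k
    }
  ' "$1"
}
```

Calling count_shared_keys ~/.ssh/known_hosts prints nothing when every key occurs only once, in which case there is nothing to merge.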

I will use mawk to merge multiple known_hosts entries that use the same key into a single row. It has some limitations, especially regarding multidimensional arrays, which it only simulates by joining the subscripts into one string, but I like it and it is installed by default on Debian/Ubuntu Linux.

$ awk -W version
mawk 1.3.3 Nov 1996, Copyright (C) Michael D. Brennan

compiled limits:
max NF             32767
sprintf buffer      2040
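The multidimensional-array limitation mentioned above comes down to this: like any POSIX awk, mawk stores arr[a, b] under a single string index built by joining the subscripts with SUBSEP, so the parts have to be recovered with split(). A small sketch follows; subsep_demo and the sample values are made up for illustration.

```shell
#!/bin/sh
# subsep_demo (hypothetical name): show how a two-subscript index is stored
# as one string and how split() recovers the individual subscripts.
subsep_demo() {
  awk 'BEGIN {
    hosts["ssh-rsa", "AAAA"] = "a.example"   # index is "ssh-rsa" SUBSEP "AAAA"
    for (k in hosts) {
      n = split(k, part, SUBSEP)             # n is the number of subscripts
      print n, part[1], part[2], hosts[k]
    }
    if (("ssh-rsa", "AAAA") in hosts) print "found"
  }'
}
```

This is the same pattern the script relies on in its END block.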

mawk script

This is the mawk script. I have simplified it to make it more readable and added support for the optional marker (the @cert-authority and @revoked keywords).

#!/usr/bin/mawk -f

BEGIN {
  # Default field splitting on runs of blanks.
  FS=" "
}

# Join fields n..NF with single spaces; returns "" if there are none.
function join_from(n,    i, s) {
  s = ""
  for (i = n; i <= NF; i++) {
    s = s (i > n ? " " : "") $i
  }
  return s
}

# Skip empty lines.
/^[ \t]*$/ {next}

# Skip comments.
/^[ \t]*#/ {next}

{
  # An optional marker (@cert-authority or @revoked) shifts all fields by one.
  if (substr($1,1,1) == "@") {
    key_marker = $1
    first      = 2
  } else {
    key_marker = ""
    first      = 1
  }
  key_address = $first
  key_type    = $(first + 1)
  key_value   = $(first + 2)
  key_comment = join_from(first + 3)

  # The marker is part of the index, so marked and unmarked entries
  # that share a key are never merged into one row.
  id = key_marker SUBSEP key_type SUBSEP key_value

  # Append each distinct address once.
  if (seen_address[id, key_address]++ == 0) {
    if (id in address) {
      address[id] = address[id] "," key_address
    } else {
      address[id] = key_address
    }
  }

  # Append each distinct, non-empty comment once.
  if (key_comment != "" && seen_comment[id, key_comment]++ == 0) {
    if (id in comment) {
      comment[id] = comment[id] ", " key_comment
    } else {
      comment[id] = key_comment
    }
  }
}

END {
  for (id in address) {
    split(id, part, SUBSEP)
    line = address[id] " " part[2] " " part[3]
    if (part[1] != "") {
      line = part[1] " " line
    }
    if (id in comment) {
      line = line " " comment[id]
    }
    print line
  }
}
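One way to check that merging loses nothing is to expand both the original and the merged file back into one host per line and compare the results. The sketch below is an illustration only; list_host_key_pairs is a hypothetical helper, and it ignores comments because those are deliberately combined by the script.

```shell
#!/bin/sh
# list_host_key_pairs FILE (hypothetical helper): print one
# "[marker ]host type key" line per host name, sorted and de-duplicated.
list_host_key_pairs() {
  awk '
    /^[ \t]*$/ || /^[ \t]*#/ { next }          # skip blank lines and comments
    {
      m = ""; i = 1
      if (substr($1, 1, 1) == "@") { m = $1 " "; i = 2 }
      n = split($i, host, ",")                 # host names are comma-separated
      for (j = 1; j <= n; j++)
        print m host[j], $(i + 1), $(i + 2)
    }
  ' "$1" | sort -u
}
```

If the output for the original file and for the merged file is identical, every host is still associated with the same key and marker.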

Usage

Initial known_hosts file.

$ cat ~/.ssh/known_hosts
@revoked www.sleeplessbeastie.space ssh-rsa AAAAB3Nz...jzHuFCSGR [email protected]
wiki.sleeplessbeastie.space ecdsa-sha2-nistp256 AAAAE2Vj...OBuhu+9k=
192.0.2.20,192.0.2.21 ecdsa-sha2-nistp256 AAAAE2Vj...OBuhu+9k=
wiki.example.org ecdsa-sha2-nistp256 AAAAE2Vj...OBuhu+9k=
blog.sleeplessbeastie.eu ecdsa-sha2-nistp256 AAAAE2Vj...CGuhu-7k=
#wiki.sleeplessbeastie.space ssh-ed25519 AAAAC3Nz.../7q/1M1TB

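Squeezed known_hosts file. Redirect the output to a temporary file, then replace the known_hosts file with it. Note that awk does not guarantee the order in which it walks an array, so the rows may come out in a different order than shown here.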

$ squeeze_known_hosts.awk ~/.ssh/known_hosts
@revoked www.sleeplessbeastie.space ssh-rsa AAAAB3Nz...jzHuFCSGR [email protected]
wiki.sleeplessbeastie.space,192.0.2.20,192.0.2.21,wiki.example.org ecdsa-sha2-nistp256 AAAAE2Vj...OBuhu+9k=
blog.sleeplessbeastie.eu ecdsa-sha2-nistp256 AAAAE2Vj...CGuhu-7k=
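The replacement step can be wrapped in a small shell function. This is a sketch under a few assumptions: squeeze_in_place is a hypothetical name, the script is run through awk -f rather than through its shebang line, and a .bak copy next to the original is an acceptable backup.

```shell
#!/bin/sh
# squeeze_in_place SCRIPT FILE (hypothetical helper): run the awk SCRIPT on
# FILE, keep the original as FILE.bak, and replace FILE with the result.
squeeze_in_place() {
  script=$1
  file=$2
  # Temporary file in the same directory, so mv does not cross filesystems.
  tmp=$(mktemp "${file}.XXXXXX") || return 1
  # Only replace the file if awk succeeded, produced output, and the backup worked.
  if awk -f "$script" "$file" > "$tmp" && [ -s "$tmp" ] &&
     cp -p "$file" "${file}.bak"; then
    mv "$tmp" "$file"
  else
    rm -f "$tmp"
    return 1
  fi
}
```

For example, squeeze_in_place ./squeeze_known_hosts.awk ~/.ssh/known_hosts leaves the previous contents in ~/.ssh/known_hosts.bak, so the change is easy to review or undo.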