How to convert manual pages to DokuWiki

I have created a simple shell script to convert manual pages to DokuWiki, as I want to have them always at hand and easily link to them from other pages.

Initial notes

Distinction between original and translated manual pages

National characters were garbled during the conversion process, so I had to handle original (non-translated) and translated manual pages separately.

Original manual pages are converted to HTML using the groff utility, then to DokuWiki using the pandoc converter.

$ zcat /usr/share/man/man1/ls.1.gz | \
  groff -Thtml -P -l -mmandoc | \
  pandoc -f html -t dokuwiki -o /tmp/ls.1.txt

Translated manual pages are converted to HTML using the roffit utility, then to DokuWiki using the pandoc converter.

$ zcat /usr/share/man/pl/man1/ls.1.gz | \
  roffit --bare | \
  pandoc -f html -t dokuwiki -o /tmp/
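
The two pipelines above differ only in the tool that produces HTML. A minimal sketch of that choice follows, assuming an empty language code means an original page. The function names html_tool and roff_to_dokuwiki are hypothetical and do not come from the script below.

```shell
# Print the name of the roff-to-HTML tool for a language code.
# An empty code is assumed to mean an original (untranslated) page.
html_tool() {
  if [ -z "$1" ]; then
    echo groff
  else
    echo roffit
  fi
}

# Read a roff page on stdin and write DokuWiki markup to stdout.
# $1 is the language code: "" for original pages, "pl", "fr", ...
roff_to_dokuwiki() {
  case $(html_tool "$1") in
    groff)  groff -Thtml -P -l -mmandoc ;;
    roffit) roffit --bare ;;
  esac | pandoc -f html -t dokuwiki
}
```

For an original page this could be called as: zcat /usr/share/man/man1/ls.1.gz | roff_to_dokuwiki "" > /tmp/ls.1.txt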

Required pandoc version

You need at least pandoc 1.13, released on 15 August 2014; earlier versions do not support DokuWiki as an output format.

At the time of writing it is not available in the Debian Wheezy package repositories, so install it directly from the pandoc website.
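
One way to check the installed version before running a conversion is sketched below. It assumes the first line of pandoc --version has the form "pandoc X.Y", and version_at_least is a hypothetical helper written for this example.

```shell
# Succeed when dot-separated version $1 is at least version $2.
version_at_least() {
  a=$1
  b=$2
  while [ -n "$a" ] || [ -n "$b" ]; do
    x=${a%%.*}   # leading component of each version
    y=${b%%.*}
    [ "${x:-0}" -gt "${y:-0}" ] && return 0
    [ "${x:-0}" -lt "${y:-0}" ] && return 1
    # drop the component just compared
    case $a in *.*) a=${a#*.} ;; *) a= ;; esac
    case $b in *.*) b=${b#*.} ;; *) b= ;; esac
  done
  return 0       # all components equal
}

if command -v pandoc >/dev/null 2>&1; then
  # assumption: the version is the second word of the first line
  installed=$(pandoc --version | head -n 1 | awk '{print $2}')
  if version_at_least "$installed" "1.13"; then
    echo "pandoc $installed supports DokuWiki output"
  else
    echo "pandoc $installed is too old, 1.13 or newer is required" >&2
  fi
else
  echo "pandoc is not installed" >&2
fi
```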

How to install localized manual pages

Install the manpages-pl package to get manual pages from sections 1, 4, 5, 6, 7, 8 translated into Polish, and manpages-pl-dev to get manual pages from sections 2 and 3.

$ sudo apt-get install manpages-pl manpages-pl-dev

Use the same method to install the desired translation: fr for French, de for German, hu for Hungarian and so on.
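
To see which translations are actually present after installation, something like the following sketch can help. It assumes translated pages live in per-language subdirectories such as /usr/share/man/pl/man1, as in the example above. The function name list_man_languages is hypothetical.

```shell
# Print language codes that have at least one manN directory under $1.
list_man_languages() {
  for dir in "$1"/*/; do
    [ -d "$dir" ] || continue
    code=$(basename "$dir")
    # skip the untranslated section directories (man1, man2, ...)
    case $code in man?) continue ;; esac
    for sect in "$dir"man?; do
      if [ -d "$sect" ]; then
        echo "$code"
        break
      fi
    done
  done
}

list_man_languages /usr/share/man
```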

Shell script

#!/bin/sh
# Convert manual pages to DokuWiki
# dependencies: groff, roffit, pandoc (at least 1.13)

# language code
# EN (default): ""
# PL          : "pl"
langcode=""

# output directory
# (assumed example path, point it at your wiki's pages directory)
outputdir="/tmp/dokuwiki-man"

if [ ! -d "$outputdir" ]; then
  mkdir -p "${outputdir}"
  for n in $(seq 1 8); do
    mkdir "${outputdir}/man${n}"
  done
fi

# store old IFS value
OIFS=$IFS

IFS=":" # manpath uses ":" as separator
for manpath in $(manpath); do
  IFS=$OIFS # restore IFS inside this loop

  # man paths
  # (assumed layout: translated pages live in a per-language subdirectory)
  mandir="${manpath}/${langcode}"

  # for each available section
  for section in $(find "${mandir}" -maxdepth 1 -type d -name 'man?' -exec basename {} \; 2>/dev/null); do
    # for each available manual page
    for manpage in $(find "${mandir}/${section}/" -type f); do
      if [ -z "$langcode" ]; then # en
        zcat "${manpage}" | groff -Thtml -P -l -mmandoc 2>/dev/null | pandoc -f html -t dokuwiki -o "${outputdir}/${section}/$(basename "$manpage" .gz).txt"
      else                        # pl, fr, de, ...
        zcat "${manpage}" | roffit --bare 2>/dev/null | pandoc -f html -t dokuwiki -o "${outputdir}/${section}/$(basename "$manpage" .gz).txt"
      fi
    done
  done
done
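
The script changes IFS temporarily so that the colon-separated output of manpath is split into single directories. A minimal, standalone sketch of that technique follows. The function name split_paths and the sample path list are made up for illustration.

```shell
# Print each entry of a colon-separated path list on its own line.
split_paths() {
  old_ifs=$IFS      # remember the current separator
  IFS=":"           # split on colons only
  set -f            # do not expand globs in the entries
  for entry in $1; do
    printf '%s\n' "$entry"
  done
  set +f
  IFS=$old_ifs      # restore normal word splitting
}

split_paths "/usr/local/man:/usr/share/man"
```

Restoring IFS right after the split matters because the inner loops depend on the default whitespace splitting of the find output.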

Ending notes

You can use the nspages plugin to automatically generate a list of namespaces and pages.
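
As a rough sketch, an index page for one section could be generated as shown below. It assumes the converted pages end up in a namespace such as man:man1 (this depends on where the output directory sits in your wiki) and that the plugin's basic tag takes the namespace as its first argument. The function name section_index is hypothetical.

```shell
# Print DokuWiki markup for an index page of one namespace.
# $1 is a namespace such as "man:man1" (assumed layout).
section_index() {
  printf '====== %s ======\n\n<nspages %s>\n' "$1" "$1"
}

section_index "man:man1"
```

Check the nspages documentation for the options supported by your plugin version before relying on this markup.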

About Milosz Galazka

Milosz is a Linux Foundation Certified Engineer working for a successful Polish company as a system administrator, and a long-time supporter of the Free Software Foundation and the Debian operating system.

Gdansk, Poland