So I started with

wget --timestamping -r --page-requisites --continue --convert-links http://deedbot.org/

which mirrored the site nicely (if a bit slowly). At the end of the run I noticed output like this (at least it was not done silently):

Converting links in deedbot.org/deed-387343-14.txt... 3-0
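A quick way to see the damage (a sketch; run from the directory holding the mirror, using the deed from the output above):

#Fetch a pristine copy straight from the source and diff it
#against what --convert-links left behind
wget -q -O /tmp/deed-387343-14.txt http://deedbot.org/deed-387343-14.txt
diff /tmp/deed-387343-14.txt deedbot.org/deed-387343-14.txt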

Yes, someone decided that what is worth doing is worth overdoing, and thus text/plain is considered for link conversion and crawling! Obviously, this breaks the signatures in the process[1]. I ended up with a two-step script:

cd /var/www/html
#Fetch deeds without converting links
wget --timestamping -r --continue http://deedbot.org/
#Fetch only HTML with requisites and convert links
wget --timestamping -r --page-requisites --convert-links --reject .txt http://deedbot.org/
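To check that this actually leaves the deeds intact, something along these lines can be run afterwards (a sketch; it assumes the deeds are clearsigned and the signers' public keys are already in the local keyring):

#Verify every mirrored deed; print only the ones gpg rejects
for deed in deedbot.org/deed-*.txt; do
    gpg --verify "$deed" >/dev/null 2>&1 || echo "FAILED: $deed"
done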

I have not found anything in the documentation about this behavior: which other content types it deems suitable for this treatment, or how to switch it off. The wget in question reports itself as "GNU Wget 1.17.1". It is likely I will end up coding my own spider.
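Should it come to that, the skeleton would not be much work (a rough sketch, assuming every deed is linked from the front page with names like the one above; curl and grep stand in for real HTML parsing):

#Hand-rolled spider: pull the index, extract deed names,
#fetch each deed byte-for-byte with no post-processing
curl -s http://deedbot.org/ \
    | grep -o 'deed-[0-9][0-9-]*\.txt' \
    | sort -u \
    | while read -r deed; do
        curl -s -o "$deed" "http://deedbot.org/$deed"
done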

Oh, and I almost forgot - the mirror can be found here.


[1] To whoever is now going to say "it's the server's fault, deedbot should output GPG-signed material with a proper mimetype like application/pgp-signature": I can only recommend a frontal lobotomy by a robot that fetches its instructions with wget.