About 8 hours ago I submitted a page on how to confuse SSH bots.
Just for fun I also set up a cron job that updates a text file, which auto-refreshes every 60 seconds, to display all the user agents that are apparently crawling HN nonstop and landing on the pages I submitted as a result. Perhaps I am the only person who finds this interesting, but I figured I would share it anyway.
It seems drakma is the bot that HN uses to read the submitted site. There is now quite a collection of AI agents hitting the site. I redirected most of them to YTMND earlier today but have since disabled those redirects so that AI can slurp up this page. I want to see if it really puts a load on the VM. It's not as overwhelming as I had heard it would be, but the landscape has changed a bit.
The leftmost column shows how many times that user-agent has shown up today. After that is whatever the user-agent identifies itself as. The text file auto-refreshes every 60 seconds.
Edit: I should add that all links from HN carry rel=nofollow, so clearly the bots ignore that.
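For anyone who wants to build a similar count column, here is a minimal sketch in Python. It is not necessarily the script behind the text file. It assumes an access log in the common "combined" format, where the user agent is the last double-quoted field on each line. The names `count_user_agents`, `format_report`, and the `SAMPLE` log lines are hypothetical and made up for illustration, and the sketch does not filter by date.

```python
import re
from collections import Counter

# Assumption: "combined" log format, where the user agent is the last
# double-quoted field on each line.
UA_PATTERN = re.compile(r'"([^"]*)"\s*$')


def count_user_agents(lines):
    """Return a Counter mapping user-agent string -> number of hits."""
    counts = Counter()
    for line in lines:
        match = UA_PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def format_report(counts):
    """Render one 'count user-agent' row per agent, most frequent first."""
    return "\n".join(f"{n:7d} {ua}" for ua, n in counts.most_common())


# Hypothetical sample lines, invented for illustration only.
SAMPLE = [
    '203.0.113.5 - - [01/Jan/2025:00:00:01 +0000] "GET / HTTP/1.1" 200 512 "-" "ExampleBot/1.0"',
    '203.0.113.9 - - [01/Jan/2025:00:00:02 +0000] "GET / HTTP/2.0" 200 512 "-" "Mozilla/5.0 (X11; Linux x86_64)"',
    '203.0.113.5 - - [01/Jan/2025:00:00:03 +0000] "GET /a HTTP/1.1" 200 128 "-" "ExampleBot/1.0"',
]

if __name__ == "__main__":
    print(format_report(count_user_agents(SAMPLE)))
```

A cron entry could run something like this once a minute against the real log and redirect the output to the published text file.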
Current load to static pages:
load average: 0.00, 0.00, 0.00
Peak network throughput: 193kb/s out of a 2.4gb/s cap
Protocol counts thus far:
HTTP/2.0: 550
HTTP/1.1: 819
Most real people are on HTTP/2.0 and most (but not all) bots are on HTTP/1.1. I doubt bots outnumber humans; rather, bots crawl everything and humans click only on things that are interesting to them.
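Protocol counts like the ones above could be tallied from the same kind of log. This is a rough sketch, assuming the request line is logged as a quoted `"METHOD PATH PROTOCOL"` field, as in the common and combined formats. `count_protocols` and the `SAMPLE_REQUESTS` lines are hypothetical. The code only counts protocol versions; deciding which hits are bots or humans is a separate judgment it does not attempt.

```python
import re
from collections import Counter

# Assumption: the request is logged as a quoted "METHOD PATH PROTOCOL" field.
REQUEST_PATTERN = re.compile(r'"[A-Z]+ \S+ (HTTP/\d(?:\.\d)?)"')


def count_protocols(lines):
    """Return a Counter mapping protocol string (e.g. 'HTTP/1.1') -> hits."""
    counts = Counter()
    for line in lines:
        match = REQUEST_PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical sample lines, invented for illustration only.
SAMPLE_REQUESTS = [
    '203.0.113.5 - - [01/Jan/2025:00:00:01 +0000] "GET / HTTP/1.1" 200 512 "-" "ExampleBot/1.0"',
    '203.0.113.9 - - [01/Jan/2025:00:00:02 +0000] "GET / HTTP/2.0" 200 512 "-" "Mozilla/5.0"',
    '203.0.113.7 - - [01/Jan/2025:00:00:03 +0000] "GET /b HTTP/2.0" 200 256 "-" "Mozilla/5.0"',
]

if __name__ == "__main__":
    for protocol, hits in count_protocols(SAMPLE_REQUESTS).most_common():
        print(f"{protocol}: {hits}")
```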
Only 3 connections are using HTTP KeepAlive. There are a lot of DNS requests for the HTTPS resource record type.
Not to dunk on you, but maybe write up your format in a human-comprehensible way? A blog post? This tells me, well, nothing really.