CronyAkatsuki
5d

Try crowdsec.

You can set it up with lists that are updated frequently and have it read your Caddy proxy logs, and then it can easily block AI/bot-like traffic.

I have it blocking over 100k IPs at the moment.

https://www.crowdsec.net/
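
To give a rough idea of what "looking at Caddy proxy logs" for bot traffic means, here is a minimal Python sketch. It is not CrowdSec's parser or scenario logic. It assumes Caddy's JSON access log layout (`request.remote_ip` and `request.headers`, with header values stored as lists), and the marker list and sample lines are made up for illustration.

```python
import json
from collections import Counter

# Assumption: a small, non-exhaustive set of AI-crawler user-agent substrings.
AI_AGENT_MARKERS = ("gptbot", "claudebot", "ccbot", "bytespider")


def suspicious_ips(log_lines, markers=AI_AGENT_MARKERS):
    """Count requests per client IP whose User-Agent contains a known marker."""
    hits = Counter()
    for line in log_lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not JSON
        if not isinstance(entry, dict):
            continue  # skip JSON values that are not log objects
        request = entry.get("request", {})
        # Caddy logs each header value as a list of strings.
        agents = request.get("headers", {}).get("User-Agent", [])
        agent = " ".join(agents).lower()
        if any(marker in agent for marker in markers):
            hits[request.get("remote_ip", "unknown")] += 1
    return hits


# Hypothetical log lines using documentation-reserved IP ranges.
sample = [
    '{"request": {"remote_ip": "203.0.113.7", "headers": {"User-Agent": ["Mozilla/5.0 (compatible; GPTBot/1.0)"]}}}',
    '{"request": {"remote_ip": "198.51.100.4", "headers": {"User-Agent": ["Mozilla/5.0 Firefox/128.0"]}}}',
    '{"request": {"remote_ip": "203.0.113.7", "headers": {"User-Agent": ["GPTBot/1.0"]}}}',
    "not json",
]
print(suspicious_ips(sample))
```

In a real setup CrowdSec does the parsing and the bouncer does the blocking; the sketch only shows the kind of signal that is sitting in the logs.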

zoey
creator
185d

Not gonna lie, the $3900/mo at the top of the /pricing page is pretty wild.
Searched “crowdsec docker” and they have docs and all that. Thank you very much! I’ve heard of CrowdSec before but never paid much attention. Absolutely will check this out!

K3CAN
54d

The paid plans get you the “premium” blocklists, which include one specially made to prevent AI scrapers, but a free account will still get you the actual software, the community blocklist, plus up to three "basic" lists.

CronyAkatsuki
4d

And the community blocklist is updated when more than a handful of CrowdSec instances (I think the number is something like 10-50) block an IP within a short timeframe.

The AI blocklist just adds an IP when even one instance catches an AI trying to scrape, judging directly from the user agent.

So even if the community blocklist has fewer AI IPs, it does eventually include them.
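
The difference between the two lists can be sketched as a threshold rule. This is a toy model of the behavior described in this comment, not CrowdSec's actual consensus algorithm. The function name, the default of 10 instances, and the one-hour window are assumptions for illustration.

```python
from collections import defaultdict


def consensus_blocklist(reports, min_instances=10, window_seconds=3600):
    """Return IPs reported by at least `min_instances` distinct instances
    within any span of `window_seconds`.

    `reports` is an iterable of (timestamp_seconds, instance_id, ip) tuples.
    """
    by_ip = defaultdict(list)
    for ts, instance, ip in reports:
        by_ip[ip].append((ts, instance))

    blocked = set()
    for ip, events in by_ip.items():
        events.sort()
        start = 0
        counts = defaultdict(int)  # instance -> reports inside the window
        for ts, instance in events:
            counts[instance] += 1
            # Drop reports from the left that fall outside the time window.
            while ts - events[start][0] > window_seconds:
                old = events[start][1]
                counts[old] -= 1
                if counts[old] == 0:
                    del counts[old]
                start += 1
            if len(counts) >= min_instances:
                blocked.add(ip)
                break
    return blocked
```

With `min_instances=1` the same function behaves like the single-report rule described for the AI list, while the default only blocks an IP once enough separate instances agree in a short time.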

@Starfarer@lemmy.today
24d

Which CrowdSec blocklists are you using?

CronyAkatsuki
14d

I’m using the default list alongside Firehol BotScout list and Firehol cybercrime tracker list set to ban.

Also using the Firehol cruzit.com list set to do captcha, just in case it’s not actually a bot.

I’m also using the cs-firewall-bouncer and a custom bouncer, shown in CrowdSec’s tutorials, to detect privilege escalation in case anybody actually manages to get inside.

Alongside that I’m using a lot of scenario collections for the specific software I run, like nextcloud, grafana, ssh, … which helps a lot with attacks aimed directly at a service and not just general scraping or bot path traversal.

It’s all free and I’ve been using it for a year. My only complaint is that I had to make a cron job to restart the crowdsec service every day, because it would stop working after a couple of days due to the number of requests it has to process.

You don’t have to pay to use it.
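
As a variation on the daily restart workaround mentioned above, a small watchdog could restart the service only when systemd reports it as down. This is a sketch under assumptions, not the setup described in this comment: it assumes the unit is named `crowdsec` and is managed by systemd, and `ensure_running` is a hypothetical helper. It would not catch a process that hangs while still reported as active, in which case the unconditional restart is the simpler fix.

```python
import subprocess


def ensure_running(service="crowdsec", run=subprocess.run):
    """Restart `service` only if systemd reports it as not active.

    Returns True if a restart was triggered, False otherwise.
    `run` is injectable so the logic can be exercised without systemd.
    """
    # `systemctl is-active --quiet` exits with status 0 when the unit is active.
    status = run(["systemctl", "is-active", "--quiet", service])
    if status.returncode == 0:
        return False
    # check=True raises CalledProcessError if the restart itself fails.
    run(["systemctl", "restart", service], check=True)
    return True
```

Calling `ensure_running()` from a script that cron runs every few minutes (as root) would replace the fixed daily restart with one that only fires when needed.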
