Reddit says that it doesn’t want companies scraping the site for AI. Microsoft says it’s not doing that.
mozz

I know this because I wrote a page that IP-bans anything that visits it, and I also listed it as a disallowed path in the robots.txt file.

This is fuckin GENIUS

only if you don’t want any visits except from yourself, because this removes your site from any search engine

should write a “Disallow: /juicy-content” rule in robots.txt and then block anything that tries to access that page (only bad bots would follow that path)

That’s exactly what was described…?

Oops. As a non-native English speaker I misunderstood what he meant. I wrongly understood that he set the server to ban everything that asked for robots.txt.

Just in case it makes you feel any better: I’m a native English speaker who always aced the reading comprehension tests back in school, and I read it the exact same way. Lol! I’m glad I wasn’t the only one. :)

mozz

You need to read again the thing that was described, more carefully. Imagine for example that by “a page,” the person means a page called /juicy-content or something.
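
For anyone who wants to see the trap spelled out, here is a minimal sketch in Python. It is not the original commenter's code. The path /juicy-content is only the example name used in this thread, and `handle`, `banned_ips`, and `run` are hypothetical names made up for illustration. It assumes an in-memory ban list (lost on restart) and a server that sees the real client IP, which would not hold behind a reverse proxy.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical trap path, borrowed from the example name used in the thread.
TRAP_PATH = "/juicy-content"

# robots.txt is served normally, so well-behaved crawlers learn to stay away.
ROBOTS_TXT = "User-agent: *\nDisallow: " + TRAP_PATH + "\n"

# Assumption: an in-memory ban list; a real setup might ban at the firewall.
banned_ips = set()


def handle(ip, path):
    """Return (status, body) for one GET request and update the ban list."""
    path = path.split("?", 1)[0]  # ignore any query string
    if ip in banned_ips:
        return 403, "Forbidden\n"
    if path == "/robots.txt":
        return 200, ROBOTS_TXT
    if path == TRAP_PATH or path.startswith(TRAP_PATH + "/"):
        # Only a client that ignores the Disallow rule ends up here.
        banned_ips.add(ip)
        return 403, "Forbidden\n"
    return 200, "Regular page\n"


class TrapHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, body = handle(self.client_address[0], self.path)
        self.send_response(status)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.end_headers()
        self.wfile.write(body.encode("utf-8"))


def run(host="127.0.0.1", port=8000):
    """Start the demo server (blocks until interrupted)."""
    HTTPServer((host, port), TrapHandler).serve_forever()
```

Calling `run()` starts the demo on localhost. Note that requesting robots.txt never triggers a ban; only a request for the disallowed path does, which is why crawlers that honor robots.txt can still index the rest of the site.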
