Right now, robots.txt on lemmy.ca is configured this way:

User-Agent: *
  Disallow: /login
  Disallow: /login_reset
  Disallow: /settings
  Disallow: /create_community
  Disallow: /create_post
  Disallow: /create_private_message
  Disallow: /inbox
  Disallow: /setup
  Disallow: /admin
  Disallow: /password_change
  Disallow: /search/
  Disallow: /modlog

Would it be a good idea, privacy-wise, to block GPTBot from scraping content from the server?

User-agent: GPTBot
Disallow: /
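One way to sanity-check the proposed rule before deploying it is a minimal sketch with Python's standard-library `urllib.robotparser`. It parses the existing rules plus the GPTBot block and reports which agents may fetch which paths. The helper name `is_allowed`, the agent `SomeOtherBot`, and the path `/post/123` are made up for illustration. Keep in mind that robots.txt is only a request, so this shows what a well-behaved crawler would do, not what any particular bot actually does.

```python
from urllib.robotparser import RobotFileParser

# Existing lemmy.ca rules from above, plus the proposed GPTBot block.
ROBOTS_TXT = """\
User-Agent: *
Disallow: /login
Disallow: /login_reset
Disallow: /settings
Disallow: /create_community
Disallow: /create_post
Disallow: /create_private_message
Disallow: /inbox
Disallow: /setup
Disallow: /admin
Disallow: /password_change
Disallow: /search/
Disallow: /modlog

User-agent: GPTBot
Disallow: /
"""


def is_allowed(user_agent: str, path: str) -> bool:
    """Return True if the rules above let user_agent fetch path.

    Hypothetical helper for illustration; it only evaluates the rules
    locally and makes no network requests.
    """
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # "/post/123" and "SomeOtherBot" are assumed examples.
    for agent in ("GPTBot", "SomeOtherBot"):
        for path in ("/post/123", "/login"):
            print(agent, path, is_allowed(agent, path))
```

With these rules, GPTBot is refused everywhere, while other crawlers still fall under the generic `*` block and are only kept out of the listed paths.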

Thanks!

  • Elise@beehaw.org
    1 year ago

    Just out of curiosity, why is everyone so up in arms about this? I mean, sure, it’s just another corp, but are there any other reasons?

    • corsicanguppy@lemmy.ca
      1 year ago

      Server load spent on a bot scraping our contributions to be used to make money.

      There’s so much there that it’s gonna offend someone.

      • Elise@beehaw.org
        1 year ago

        Wouldn’t it just be scraped once (per company)? That doesn’t sound like such a problem.