this post was submitted on 24 Jul 2024
182 points (96.0% liked)
Technology
59099 readers
3195 users here now
This is a most excellent place for technology news and articles.
Our Rules
- Follow the lemmy.world rules.
- Only tech related content.
- Be excellent to each another!
- Mod approved content bots can post up to 10 articles per day.
- Threads asking for personal tech support may be deleted.
- Politics threads may be removed.
- No memes allowed as posts, OK to post as comments.
- Only approved bots from the list below, to ask if your bot can be added please contact us.
- Check for duplicates before posting, duplicates may be removed
Approved Bots
founded 1 year ago
MODERATORS
you are viewing a single comment's thread
view the rest of the comments
view the rest of the comments
I guess Reddit is permitted to only let Google index it
Are they though?
I don't know of any law that says that they can't.
I also don’t know if a law that says search engines have to honor a robots.txt file. I guess we will see what happens if Bing or some other service decides to ignore it.
You can just require a log in to view content, or just flat out auto ban indexing robots.
I guess they can make it hard to index by scraping by rate limiting or requiring login to view content etc and only provide Google the api to bypass the restrictions
There's probably a lot of ways to do it