Robots and discouraging search engine indexing

I have a question about the Robots setting that lets users discourage crawlers and search engines from indexing a book. I realize that selecting this will prevent the scraping tools used by most large language models from scraping your book content. Would selecting it also make a book less likely to show up in search engine results? I want to encourage use of this setting, but I also want to describe it accurately to authors who may be concerned, for promotional reasons, that it will suppress their published work. Thanks!


Hi Lauren,

There are 3 settings in Sharing & Privacy that do different things:

Book visibility will make the book private and require credentials to view the content.

Under ‘Robots’, the first setting (Discourage AI…) will set a robots meta tag with the values ‘noai’ and ‘noimageai’. Good AI net citizens (debatable if they exist at all) should honour that and not ingest their content for training LLMs.

The ‘Discourage crawlers’ setting will also add ‘noindex’ and ‘nofollow’ to the robots meta tag, which should tell a wider range of well-behaved bots not to crawl and index the content. This includes search engines like Google, etc.

Unfortunately, this is something that cannot be enforced on the server end and relies on the bots themselves to honour these tags. So even with these settings some bad actors can and will crawl the content and use it for whatever purposes.
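To make the two Robots settings concrete, here is a rough Python sketch of how they could map onto a single robots meta tag. The function name `robots_meta_tag`, its parameters, and the exact tag formatting are my own assumptions for illustration, not the platform's actual code; only the four values (noai, noimageai, noindex, nofollow) come from the description above.

```python
def robots_meta_tag(discourage_ai: bool, discourage_crawlers: bool) -> str:
    """Hypothetical helper: build a robots meta tag from the two settings.

    Assumption: each setting contributes its own values independently.
    """
    values = []
    if discourage_ai:
        # Asks AI crawlers not to use text or images for training.
        values += ["noai", "noimageai"]
    if discourage_crawlers:
        # Asks search engines not to index the page or follow its links.
        values += ["noindex", "nofollow"]
    if not values:
        # Neither setting enabled: no robots meta tag in this sketch.
        return ""
    return '<meta name="robots" content="{}">'.format(", ".join(values))


if __name__ == "__main__":
    print(robots_meta_tag(discourage_ai=True, discourage_crawlers=False))
    print(robots_meta_tag(discourage_ai=True, discourage_crawlers=True))
```

Either way, the tag is only a request in the page's HTML. Nothing in it blocks a bot that chooses to ignore it.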

Thanks Christopher,
Would turning on the Discourage crawlers setting make a book less likely to show up in a Google search? I’m wondering if turning this setting on will make a book less findable by humans looking for it (in addition to telling bots not to crawl/index the content). I could see some book authors being concerned that checking this box will make their book less findable, and I want to advise them properly.

Hi Lauren,

Yes, enabling “Discourage crawlers” will make the book much harder to find via search engines.

It adds a noindex tag, so Google and Bing generally will not show it in results. The book is still public, but only accessible via direct link or shared URLs like a syllabus or social media.
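If you want to confirm what a given book page is telling search engines, one option is to look for the robots meta tag in the page source. Below is a minimal sketch using only Python's standard library; the names `RobotsMetaParser` and `robots_directives` and the sample HTML are made up for illustration, and the sketch assumes the directives live in a `<meta name="robots">` tag as described earlier.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from any <meta name="robots"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            # The content attribute is a comma-separated list of directives.
            for item in attr.get("content", "").split(","):
                item = item.strip().lower()
                if item:
                    self.directives.add(item)


def robots_directives(html: str) -> set:
    """Return the set of robots directives found in the given HTML."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


if __name__ == "__main__":
    # Hypothetical page source for a book with "Discourage crawlers" enabled.
    sample = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
    found = robots_directives(sample)
    print("Asks search engines not to index:", "noindex" in found)
```

You would paste in (or download) the real page source in place of `sample`. If `noindex` is present, well-behaved search engines should leave the page out of their results.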

Bottom line: leave it unchecked if you want the book to be discoverable.

Use case: Drafts, private course materials, internal content, testing, or anything you want public by link but not searchable.