The Website source crawls a domain you control and ingests its pages into the AI's training index. Use it when the content you want the AI to know about is reachable at a public URL (a help center, a product docs site, a set of marketing pages) and you don't have a direct integration for the system that serves it.
Documentation Index
Fetch the complete documentation index at: https://docs.open.cx/llms.txt
Use this file to discover all available pages before exploring further.
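If you want to script that discovery step, here is a minimal sketch in TypeScript. It assumes a Node 18+ runtime (for the global `fetch`) and that llms.txt uses standard Markdown links; neither assumption comes from this page.

```ts
// Minimal sketch: fetch the documentation index and list the pages it links to.
// Assumes Node 18+ (global fetch) and Markdown-style links in llms.txt.

async function listDocPages(): Promise<string[]> {
  const res = await fetch("https://docs.open.cx/llms.txt");
  if (!res.ok) throw new Error(`Failed to fetch index: ${res.status}`);
  const text = await res.text();

  // Pull every URL that appears inside a Markdown link target.
  return [...text.matchAll(/\((https?:\/\/[^)\s]+)\)/g)].map((m) => m[1]);
}

listDocPages().then((pages) =>
  console.log(`${pages.length} pages found`, pages.slice(0, 5))
);
```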
When to use the crawler vs a direct integration
| Situation | Use |
|---|---|
| You publish your help center through Zendesk, Intercom, or Front | The dedicated Zendesk / Intercom / Front source |
| Your content lives in Confluence, Notion, GitBook, or Freshdesk | The dedicated source for that tool |
| Your content is reachable at a public URL and nothing above applies | The crawler |
| The site is gated behind auth | Not today — the crawler only fetches public URLs |
What the crawler does
- Kicks off a crawl against the URL you provide.
- Follows links within the domain, up to `page_limit` pages (default 100, max 5000).
- Extracts each page's main content as Markdown, skipping navigation chrome and binary assets.
- Tracks a content hash per page, so re-crawls re-index only the pages whose content changed.
- Re-runs on a schedule you pick (`crawl_interval_hours`, default 168 hours / 7 days).
- Exposes every discovered page so you can exclude individual URLs, re-include ones you excluded, or force-resync a single page; see the sketch after this list.
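The Crawl API page linked under Related Documentation covers programmatic control. As a rough illustration of how these knobs fit together, here is a hedged TypeScript sketch: the base URL, endpoint paths, auth scheme, and `OPEN_API_KEY` variable are all assumptions for illustration, while `page_limit` and `crawl_interval_hours` are the parameters described above.

```ts
// Illustrative sketch only: the endpoint paths and payload shapes below are
// assumptions, not the documented Crawl API. page_limit and crawl_interval_hours
// are the real parameters described above (defaults: 100 pages, 168 hours).

const BASE = "https://api.open.cx"; // assumed base URL
const headers = {
  Authorization: `Bearer ${process.env.OPEN_API_KEY}`, // hypothetical auth scheme
  "Content-Type": "application/json",
};

// Kick off a crawl of a public docs site.
async function createWebsiteDatasource(url: string) {
  const res = await fetch(`${BASE}/v1/datasources`, { // assumed path
    method: "POST",
    headers,
    body: JSON.stringify({
      type: "website",
      url,                       // must be publicly reachable; no auth-gated sites
      page_limit: 500,           // default 100, max 5000
      crawl_interval_hours: 168, // default: re-crawl weekly
    }),
  });
  if (!res.ok) throw new Error(`Create failed: ${res.status}`);
  return res.json();
}

// Exclude a discovered page from indexing (assumed path).
async function excludePage(datasourceId: string, pageId: string) {
  await fetch(`${BASE}/v1/datasources/${datasourceId}/pages/${pageId}/exclude`, {
    method: "POST",
    headers,
  });
}
```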
Related Documentation
- Connect a website: URL, limits, include/exclude paths, crawl interval.
- Troubleshooting: stuck crawls, locale scoping, pages not indexing.
- Crawl API: programmatic control of datasources, crawls, and pages.
- Connect a knowledge source: decision matrix for all sources.