Resource discovery

Let your HEA read your website safely

Your Human-Enhanced Agent learns from your website’s resources: the public pages, blog posts, articles, FAQs and documents you choose to expose.


Most sites work out of the box. Some use a Web Application Firewall (WAF) or “bot protection” that blocks new crawlers by default. In those cases, you may need to allow the HEA-World crawler.

What you allow (and what you don’t):
Our crawler makes read-only GET requests to public pages only, never logs in, and never posts data.
Allow this user agent in your firewall / security tools:

User-Agent: HEA-Crawler/1.0
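
For reference, a request from the crawler looks roughly like this (the host and path are placeholders for one of your own public pages):

  GET /faq HTTP/1.1
  Host: www.example.com
  User-Agent: HEA-Crawler/1.0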

1. Do I need this?

Only if your site shows “blocked” or “bot detected” errors when we try to scan it, or if your host uses an aggressive WAF.
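
Not sure whether you are blocked? One quick, optional way to check is to request one of your public pages with the crawler’s user agent and look at the response. Here is a minimal sketch in Python (it assumes the requests library is installed; the URL is a placeholder for one of your own pages):

  import requests

  # Placeholder: replace with one of your own public pages.
  url = "https://www.example.com/faq"

  # The same kind of read-only GET request the crawler performs, with its user agent.
  response = requests.get(url, headers={"User-Agent": "HEA-Crawler/1.0"}, timeout=10)

  print(response.status_code)  # 200 usually means the page is readable
  if response.status_code in (403, 429, 503):
      print("Likely blocked by a WAF or bot protection; whitelisting is needed.")

A 200 status with your normal page content usually means there is nothing to do; a 403, 429 or 503 (or a captcha/challenge page) usually means the firewall is blocking this user agent.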

2. Why do it?

So your HEA can generate itself from your site, keep answers up to date, and avoid copy-pasting long texts by hand.

3. What stays in your control?

You decide which domain and content are exposed, you can pause or revoke access at any time, and nothing gets published automatically.

Where does this usually apply?

Your webmaster or hosting provider will know exactly which layer is active, but in practice whitelisting is usually needed if you use:

  • A WordPress site with security plugins (Wordfence, etc.) or a host-level WAF (SiteGround, Cloudflare, Sucuri…).
  • A hosted builder such as Wix, Shopify, Squarespace, Webflow or Ghost with built-in bot protection or firewall add-ons.
  • A custom site behind a CDN / WAF such as Cloudflare, Fastly, Akamai, etc.

If you’re not sure, simply forward the example message below to your webmaster or host.

How to whitelist the crawler on common platforms

Here are short, practical instructions for the most common setups. Exact labels may vary a bit depending on your plan and UI version.

Cloudflare WAF
  1. Go to Security > WAF.
  2. Create a new Custom Rule (or “User Agent rule”).
  3. Set the condition: Field = User Agent, Operator = contains, Value = HEA-Crawler/1.0
  4. Set the action to Allow.
  5. Optional: under Bots, add an exception so Bot Fight Mode is skipped for this user agent.
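
If you or your admin prefer Cloudflare’s expression editor over the form fields, the equivalent match expression looks like this (a sketch; combine it with an Allow or Skip action as in step 4):

  (http.user_agent contains "HEA-Crawler/1.0")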

If your team prefers IP rules, your admin can also whitelist the static IP range used by your HEA integration.

SiteGround / SGCaptcha WordPress hosting
  1. Open Site Tools > Security for your site.
  2. Check modules such as Bot Protection / Captcha / AI anti-bot.
  3. If an option like Allowed User Agents exists, add: HEA-Crawler/1.0.
  4. If you don’t see it, contact SiteGround support and ask them to whitelist that user agent.

Example message: “Please whitelist the user-agent HEA-Crawler/1.0 so our AI assistant can read our public pages.”

Sucuri Firewall
  1. Log in to the Sucuri Firewall Dashboard.
  2. Go to Access Control > Whitelist.
  3. Add a new rule where User-Agent = HEA-Crawler/1.0 and set the action to Whitelist / Allow.
  4. If you are using IP-based rules, your admin can also whitelist the HEA integration IP range.

Wordfence WordPress plugin
  1. In the WordPress admin, go to Wordfence > Firewall.
  2. Open Blocking or Advanced Firewall Options (depending on version).
  3. In the section for Allowed / Whitelisted User Agents, add HEA-Crawler/1.0.
  4. Save changes and clear any caching layers.

If you see “bot / human traffic” modes, ensure our user agent is not flagged as a generic bot.

Other platforms (Wix, Shopify, Squarespace, Webflow, Ghost…)
  • Look for sections called Firewall, Bot Protection, or WAF.
  • Search for Allowlist / Whitelist / Exceptions / Trusted bot.
  • Add an allow rule for the user agent HEA-Crawler/1.0.
  • If unsure, send them the example message above, with us in copy, and they can coordinate with us.

Tip: after whitelisting, trigger a “Refresh website content” from your HEA backoffice to confirm that resources are discovered correctly.
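
If you used the Python check above, running it again after whitelisting should now return your normal page with a 200 status.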