Web Scrape API

Discover the pages of any website by crawling its sitemaps.

Pass a domain name and get back up to 500 deduplicated page URLs. Sitemap index files are crawled recursively. Non-page resources are filtered out automatically.

Ideal for building content indexes, seeding crawlers, or auditing a competitor's full site structure in seconds.


What You Get

Each request crawls a domain's sitemaps and returns up to 500 of the page URLs it discovers.

  • Up to 500 page URLs — Deduplicated, page-only results with images and PDFs filtered out
  • Sitemap index support — Recursively crawls nested sitemap index files automatically
  • Crawl metadata — Know how many sitemaps were discovered, fetched, skipped, and errored
  • Normalized domain input — Pass just the domain name; protocol handling is automatic
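
The page does not say how domain input is normalized. As a rough illustration only, the sketch below reduces common input forms to a bare lowercase host using Python's standard library. The function name `normalize_domain` is hypothetical, and the real service may apply different rules (for example around `www.` prefixes, which this sketch leaves untouched).

```python
from urllib.parse import urlsplit


def normalize_domain(raw: str) -> str:
    """Reduce input such as 'https://Example.com/pricing' to 'example.com'.

    Hypothetical helper: the actual API's normalization rules are not documented here.
    """
    text = raw.strip()
    # urlsplit only populates the host when a "//" authority marker is present,
    # so prepend one for bare domains like "example.com".
    if "://" not in text:
        text = "//" + text
    host = urlsplit(text).hostname  # hostname is lowercased and excludes any port
    if not host:
        raise ValueError(f"no domain found in {raw!r}")
    return host.rstrip(".")
```

With a helper like this, a client can accept either a bare domain or a pasted URL and still send the domain-only form the API expects.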

How It Works

  1. Send a domain

    Pass the domain name (e.g., “example.com”); no protocol is needed.

  2. Sitemap files discovered

    The API checks robots.txt and common sitemap paths, then recursively follows sitemap index files.

  3. URLs extracted and deduplicated

    Non-page resources (images, PDFs) are filtered; duplicate URLs are removed.

  4. Clean URL list returned

    Up to 500 page URLs with metadata on how many sitemaps were crawled.
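
The steps above can be sketched locally to show the general technique. This is not the service's implementation: the function `crawl_sitemaps`, the injected `fetch` callable, the extension list, and the way the counters are incremented are all assumptions made for illustration. A queue stands in for recursion when following sitemap index files, and the robots.txt lookup is left out.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict, Iterable, List, Optional, Tuple
from urllib.parse import urlsplit

MAX_URLS = 500  # cap stated on this page
# Assumed extension list; the page only names images and PDFs.
NON_PAGE_EXTENSIONS = (".png", ".jpg", ".jpeg", ".gif", ".svg", ".webp", ".pdf")


def _local(tag: str) -> str:
    """Strip the XML namespace from a tag such as '{ns}loc'."""
    return tag.rsplit("}", 1)[-1]


def crawl_sitemaps(
    start_urls: Iterable[str],
    fetch: Callable[[str], Optional[str]],  # returns XML text, or None on failure
    max_urls: int = MAX_URLS,
) -> Tuple[List[str], Dict[str, int]]:
    urls: List[str] = []
    seen_urls = set()
    seen_sitemaps = set()
    # Counter names mirror the sample response; their exact meaning is assumed.
    meta = {"sitemapsDiscovered": 0, "sitemapsFetched": 0,
            "sitemapsSkipped": 0, "errors": 0}
    pending = list(start_urls)
    while pending:
        sitemap_url = pending.pop(0)
        if sitemap_url in seen_sitemaps:
            continue
        seen_sitemaps.add(sitemap_url)
        meta["sitemapsDiscovered"] += 1
        if len(urls) >= max_urls:  # cap reached: skip remaining sitemaps
            meta["sitemapsSkipped"] += 1
            continue
        body = fetch(sitemap_url)
        if body is None:
            meta["errors"] += 1
            continue
        try:
            root = ET.fromstring(body)
        except ET.ParseError:
            meta["errors"] += 1
            continue
        meta["sitemapsFetched"] += 1
        locs = [el.text.strip() for el in root.iter()
                if _local(el.tag) == "loc" and el.text]
        if _local(root.tag) == "sitemapindex":
            pending.extend(locs)  # nested sitemaps are crawled in later iterations
            continue
        for loc in locs:
            is_asset = urlsplit(loc).path.lower().endswith(NON_PAGE_EXTENSIONS)
            if is_asset or loc in seen_urls:
                continue
            seen_urls.add(loc)
            urls.append(loc)
            if len(urls) >= max_urls:
                break
    return urls, meta


# Usage with in-memory documents instead of real HTTP requests.
NS = 'xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"'
FAKE_SITE = {
    "https://example.com/sitemap.xml": (
        f"<sitemapindex {NS}>"
        "<sitemap><loc>https://example.com/pages.xml</loc></sitemap>"
        "<sitemap><loc>https://example.com/missing.xml</loc></sitemap>"
        "</sitemapindex>"
    ),
    "https://example.com/pages.xml": (
        f"<urlset {NS}>"
        "<url><loc>https://example.com/</loc></url>"
        "<url><loc>https://example.com/pricing</loc></url>"
        "<url><loc>https://example.com/</loc></url>"
        "<url><loc>https://example.com/guide.pdf</loc></url>"
        "</urlset>"
    ),
}
page_urls, crawl_meta = crawl_sitemaps(["https://example.com/sitemap.xml"], FAKE_SITE.get)
```

Injecting `fetch` keeps the parsing, filtering, and deduplication logic testable without network access; a real client would pass a function that performs the HTTP request.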

API Response

GET /v1/web/scrape/sitemap?domain=brand.dev
{
  "success": true,
  "domain": "brand.dev",
  "urls": [
    "https://brand.dev/",
    "https://brand.dev/pricing",
    "https://brand.dev/blog",
    "https://brand.dev/data/logo-api",
    "https://brand.dev/use-cases/logo-link",
    "... up to 500 URLs"
  ],
  "meta": {
    "sitemapsDiscovered": 3,
    "sitemapsFetched": 3,
    "sitemapsSkipped": 0,
    "errors": 0
  }
}
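
As a hedged sketch of how a client might consume this response, the code below builds the request URL and parses the JSON into a typed record. The base URL is a deliberate placeholder because only the path appears above. Authentication is omitted because it is not described here. The `SitemapResult` class, the function names, and the handling of a false `success` flag are assumptions.

```python
import json
from dataclasses import dataclass
from typing import List
from urllib.parse import urlencode

# Placeholder host (assumption): this page shows only the request path.
API_BASE = "https://api.example.invalid"
ENDPOINT = "/v1/web/scrape/sitemap"


def build_request_url(domain: str, base: str = API_BASE) -> str:
    """Return the GET URL with the domain passed as a query parameter."""
    return f"{base}{ENDPOINT}?{urlencode({'domain': domain})}"


@dataclass(frozen=True)
class SitemapResult:
    domain: str
    urls: List[str]
    sitemaps_discovered: int
    sitemaps_fetched: int
    sitemaps_skipped: int
    errors: int


def parse_response(payload: str) -> SitemapResult:
    """Parse a response body shaped like the sample above."""
    data = json.loads(payload)
    # Assumed error handling: the failure shape is not documented on this page.
    if not data.get("success"):
        raise ValueError("API response did not report success")
    meta = data["meta"]
    return SitemapResult(
        domain=data["domain"],
        urls=list(data["urls"]),
        sitemaps_discovered=meta["sitemapsDiscovered"],
        sitemaps_fetched=meta["sitemapsFetched"],
        sitemaps_skipped=meta["sitemapsSkipped"],
        errors=meta["errors"],
    )


# Usage with a trimmed copy of the sample response.
SAMPLE = json.dumps({
    "success": True,
    "domain": "brand.dev",
    "urls": ["https://brand.dev/", "https://brand.dev/pricing"],
    "meta": {"sitemapsDiscovered": 3, "sitemapsFetched": 3,
             "sitemapsSkipped": 0, "errors": 0},
})
result = parse_response(SAMPLE)
```

Comparing `sitemaps_fetched` against `sitemaps_discovered` gives a quick signal of whether the returned URL list is likely to be incomplete.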

