How many URLs does the sitemap API return?

The API returns up to 500 deduplicated page URLs per domain. Non-page resources like images, PDFs, and video files are automatically filtered out so you only get navigable page URLs.

Does it support sitemap index files?

Yes. The API handles sitemap index files recursively: it discovers child sitemaps and fetches them in parallel with concurrency control. The response metadata reports how many sitemaps were discovered, fetched, and skipped, along with an error count.
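
To make the recursion concrete, here is a minimal Python sketch of how a sitemap index could be walked. It is not the API's actual implementation: the `collect_urls` name, the injected `fetch` callable, the `max_sitemaps` cap, and the exact meaning of each counter are assumptions made for illustration, and child sitemaps are visited one at a time rather than in parallel to keep the example short.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Optional

# XML namespace defined by the sitemaps.org protocol.
NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def collect_urls(start_url: str,
                 fetch: Callable[[str], Optional[str]],
                 max_sitemaps: int = 50):
    """Walk a sitemap or sitemap index and return (page_urls, stats).

    `fetch` is a hypothetical callable that returns the XML text for a
    URL, or None when the download fails.
    """
    stats = {"discovered": 0, "fetched": 0, "skipped": 0, "errors": 0}
    pages: list[str] = []
    seen: set[str] = set()

    def visit(url: str) -> None:
        if url in seen:  # guard against indexes that reference each other
            return
        seen.add(url)
        stats["discovered"] += 1
        if stats["fetched"] >= max_sitemaps:  # assumed meaning of "skipped"
            stats["skipped"] += 1
            return
        body = fetch(url)
        if body is None:
            stats["errors"] += 1
            return
        try:
            root = ET.fromstring(body)
        except ET.ParseError:
            stats["errors"] += 1
            return
        stats["fetched"] += 1
        if root.tag == NS + "sitemapindex":
            # An index lists child sitemaps; recurse into each one.
            for loc in root.iterfind(f"{NS}sitemap/{NS}loc"):
                if loc.text and loc.text.strip():
                    visit(loc.text.strip())
        else:
            # A regular sitemap (urlset) lists page URLs directly.
            for loc in root.iterfind(f"{NS}url/{NS}loc"):
                if loc.text and loc.text.strip():
                    pages.append(loc.text.strip())

    visit(start_url)
    return pages, stats
```

Passing `fetch` in as a parameter keeps the traversal independent of any HTTP client, so it can be exercised against an in-memory dictionary of XML documents (for example, `collect_urls(url, docs.get)`).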

What is the domain parameter format?

Pass the domain without the protocol—e.g., 'example.com' or 'blog.example.com'. The API automatically normalizes and validates the domain before crawling.
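
If your input comes from users who paste full URLs, you may want to reduce it to a bare domain before calling the API. The sketch below shows one way to do that client-side; `normalize_domain` is a hypothetical helper, and its rules (lowercasing, dropping the path and port, requiring at least one dot) are assumptions rather than a description of the API's own validation.

```python
from urllib.parse import urlsplit


def normalize_domain(raw: str) -> str:
    """Reduce input such as 'https://Example.com/pricing' to 'example.com'."""
    text = raw.strip()
    # urlsplit only recognizes the host when a '//' authority marker is
    # present, so add one for bare domains like 'example.com'.
    if "://" not in text:
        text = "//" + text
    host = urlsplit(text).hostname  # lowercased, without port or path
    if not host or "." not in host:
        raise ValueError(f"not a usable domain: {raw!r}")
    return host.rstrip(".")
```

For instance, `normalize_domain("https://Example.com/pricing")` yields `"example.com"`, and `normalize_domain("blog.example.com")` is returned unchanged.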

Web Scrape {API}

Discover every page on any website by crawling its sitemap.

Pass a domain name and get back up to 500 deduplicated page URLs. Sitemap index files are crawled recursively. Non-page resources are filtered out automatically.

Ideal for building content indexes, seeding crawlers, or auditing a competitor's full site structure in seconds.

What You Get

Each request crawls a domain's sitemaps and returns the discoverable page URLs, capped at 500.

Up to 500 page URLs: Deduplicated, page-only results with images and PDFs filtered out
Sitemap index support: Nested sitemap index files are crawled recursively and automatically
Crawl metadata: Know how many sitemaps were discovered, fetched, skipped, and errored
Normalized domain input: Pass just the domain name; protocol handling is automatic

How It Works

01. Send a domain: Pass the domain name (e.g., “example.com”); no protocol is needed.
02. Sitemap files discovered: The API checks robots.txt and common sitemap paths, then recursively follows sitemap index files.
03. URLs extracted and deduplicated: Non-page resources (images, PDFs) are filtered out and duplicate URLs are removed.
04. Clean URL list returned: Up to 500 page URLs, with metadata on how many sitemaps were crawled.
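
The filtering, deduplication, and 500-URL cap from steps 03 and 04 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the API's actual logic: `clean_urls` and the extension list are hypothetical, and the real service may identify non-page resources differently.

```python
from urllib.parse import urlsplit

# Assumed list of non-page file extensions; the API's real rules may differ.
NON_PAGE_EXTENSIONS = (
    ".jpg", ".jpeg", ".png", ".gif", ".webp", ".svg",
    ".pdf", ".mp4", ".webm", ".mov",
)


def clean_urls(urls, limit=500):
    """Drop non-page resources and duplicates, keeping at most `limit` URLs."""
    seen = set()
    result = []
    for url in urls:
        if len(result) >= limit:
            break
        # Compare only the path so query strings do not hide the extension.
        path = urlsplit(url).path.lower()
        if path.endswith(NON_PAGE_EXTENSIONS):
            continue
        if url in seen:
            continue
        seen.add(url)
        result.append(url)
    return result
```

The function preserves the order in which URLs were first seen, which keeps the output stable across runs on the same input.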

API Response

GET /v1/web/scrape/sitemap?domain=brand.dev

{
  "success": true,
  "domain": "brand.dev",
  "urls": [
    "https://brand.dev/",
    "https://brand.dev/pricing",
    "https://brand.dev/blog",
    "https://brand.dev/data/logo-api",
    "https://brand.dev/use-cases/logo-link",
    "... up to 500 URLs"
  ],
  "meta": {
    "sitemapsDiscovered": 3,
    "sitemapsFetched": 3,
    "sitemapsSkipped": 0,
    "errors": 0
  }
}
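
A request like the one above could be issued from Python roughly as follows. Only the path, the `domain` query parameter, and the response fields come from the example; the `API_BASE` host is a placeholder, the bearer-token header is an assumed authentication scheme, and the function names are hypothetical, so check the documentation for the real values.

```python
import json
import urllib.request
from urllib.parse import urlencode

API_BASE = "https://api.example.com"  # placeholder host, not the real one


def build_request(domain: str, api_key: str) -> urllib.request.Request:
    """Build the GET request for the sitemap endpoint."""
    query = urlencode({"domain": domain})
    return urllib.request.Request(
        f"{API_BASE}/v1/web/scrape/sitemap?{query}",
        headers={"Authorization": f"Bearer {api_key}"},  # assumed auth scheme
    )


def parse_response(body: str):
    """Return (urls, meta) from a response body shaped like the example."""
    payload = json.loads(body)
    if not payload.get("success"):
        raise RuntimeError("sitemap request was not successful")
    return payload["urls"], payload.get("meta", {})


def fetch_sitemap_urls(domain: str, api_key: str):
    """Call the endpoint and return (urls, meta)."""
    request = build_request(domain, api_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        return parse_response(response.read().decode("utf-8"))
```

Comparing `meta["sitemapsFetched"]` with `meta["sitemapsDiscovered"]` is a quick way to tell whether the returned list reflects every sitemap that was found.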

Personalize at scale

Join 4,000+ businesses using Brand.dev to personalize their products.
