Dynamic Sitemap & Robots.txt for Next.js Multi-Tenant
Step-by-step guide to tenant detection, scoping Payload queries, and runtime sitemaps/robots for Next.js on Vercel

Last week, I was deploying a multi-tenant Payload CMS application to Vercel when the build suddenly failed with a cryptic error: "[queryAllPageSlugs] Tenant is required but was not provided." The issue was that my sitemap.xml was trying to query the database without any tenant context, breaking the entire build process. After hours of debugging through the Next.js App Router and Payload's multi-tenant system, I discovered a clean solution that maintains full tenant isolation while keeping the build process efficient. This guide shows you exactly how to configure dynamic sitemap.xml and robots.txt files that work flawlessly across all tenants.
Understanding the Challenge
In a single-tenant setup, sitemap.xml and robots.txt are straightforward - you hardcode your domain and generate URLs from your database. But in a multi-tenant Payload CMS setup, each tenant has its own domain and content subset, making this approach problematic. The challenge is threefold:
- Next.js generates these files at build time, before any tenant context is available
- Payload's database layer requires tenant parameters for all queries to prevent cross-tenant data leaks
- The files must respond differently based on the incoming request hostname
The typical workarounds, generating multiple static files or switching to API routes, each carry significant drawbacks: maintenance overhead or SEO implications. The ideal solution is to make these files truly dynamic, responding appropriately to the tenant that is requesting them.
Setting Up Tenant Detection
The first step is creating a reliable way to detect which tenant is requesting the sitemap or robots file. We'll use the request hostname to identify the tenant, matching on the exact domain first and then falling back to a subdomain pattern.
Create a helper function that queries Payload's tenants collection to find the matching tenant:
// File: src/app/(frontend)/sitemap.ts
import { headers } from 'next/headers'
import { getPayload } from 'payload'
import configPromise from '@payload-config'
import { unstable_cache } from 'next/cache'

const getTenantByDomain = async (domain: string) => {
  return await unstable_cache(
    async () => {
      const payload = await getPayload({ config: configPromise })
      const tenants = await payload.find({
        collection: 'tenants',
        where: {
          or: [
            { domain: { equals: domain } },
            { slug: { equals: domain.split('.')[0] } }, // Fallback to slug for subdomain patterns
          ],
        },
        limit: 1,
      })
      return tenants.docs[0] || null
    },
    [`tenant-by-domain-${domain}`],
    {
      tags: ['tenants'],
      revalidate: 3600, // Revalidate every hour
    },
  )()
}
This function does two important things: it queries Payload for a tenant matching either the exact domain or the first part of a subdomain, and it caches the result for one hour to avoid repeated database hits. The fallback logic allows setups like example-app.vercel.app to match the tenant with slug example-app.
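This assumes the tenants collection stores both a slug and an optional custom domain. Your schema will differ, but a minimal sketch of such a collection (the file path and field names here are illustrative, not taken from the project above) might look like:
// File: src/payload/collections/Tenants.ts (hypothetical path and field names)
import type { CollectionConfig } from 'payload'

export const Tenants: CollectionConfig = {
  slug: 'tenants',
  admin: { useAsTitle: 'name' },
  fields: [
    { name: 'name', type: 'text', required: true },
    // Matched against the first hostname segment, e.g. "example-app" in example-app.vercel.app
    { name: 'slug', type: 'text', required: true, unique: true },
    // Optional custom domain, e.g. "tenant-a.com"
    { name: 'domain', type: 'text' },
  ],
}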
Building the Dynamic Sitemap
With tenant detection in place, we can now build a sitemap that responds differently based on the requesting tenant. The key is to use Next.js' headers() function to get the current request hostname, then generate URLs using the tenant's specific domain.
// File: src/app/(frontend)/sitemap.ts (continued: headers() and getTenantByDomain are defined above in the same file)
import type { MetadataRoute } from 'next'
import {
  queryAllPageSlugs,
  queryAllPostSlugs,
  queryAllProductSlugs,
  queryAllCaseStudySlugs,
  queryAllJobOpeningSlugs,
} from '@/payload/db'

export default async function sitemap(): Promise<MetadataRoute.Sitemap> {
  // Get hostname from request headers
  const hostname = (await headers()).get('host') || 'www.example.com'

  // Try to find tenant by domain or subdomain
  const tenant = await getTenantByDomain(hostname)

  // If no tenant found, use fallback (example)
  const baseUrl = tenant?.domain ? `https://${tenant.domain}` : `https://${hostname}`
  const tenantSlug = tenant?.slug || 'example'

  const [pages, posts, products, caseStudies, jobOpenings] = await Promise.all([
    queryAllPageSlugs(tenantSlug),
    queryAllPostSlugs(tenantSlug),
    queryAllProductSlugs(tenantSlug),
    queryAllCaseStudySlugs(tenantSlug),
    queryAllJobOpeningSlugs(tenantSlug),
  ])

  const entries: MetadataRoute.Sitemap = []

  // Home page
  entries.push({
    url: baseUrl,
    lastModified: new Date(),
    changeFrequency: 'yearly',
    priority: 1,
  })

  // Dynamic pages
  pages.forEach((slug) => {
    if (slug && slug !== 'home') {
      entries.push({
        url: `${baseUrl}/${slug}`,
        lastModified: new Date(),
        changeFrequency: 'monthly',
        priority: 0.8,
      })
    }
  })

  // Blog posts
  posts.forEach((slug) => {
    if (slug) {
      entries.push({
        url: `${baseUrl}/blog/${slug}`,
        lastModified: new Date(),
        changeFrequency: 'weekly',
        priority: 0.6,
      })
    }
  })

  // Products
  products.forEach((slug) => {
    if (slug) {
      entries.push({
        url: `${baseUrl}/products/${slug}`,
        lastModified: new Date(),
        changeFrequency: 'weekly',
        priority: 0.7,
      })
    }
  })

  // Case studies
  caseStudies.forEach((slug) => {
    if (slug) {
      entries.push({
        url: `${baseUrl}/case-studies/${slug}`,
        lastModified: new Date(),
        changeFrequency: 'monthly',
        priority: 0.7,
      })
    }
  })

  // Job openings
  jobOpenings.forEach((slug) => {
    if (slug) {
      entries.push({
        url: `${baseUrl}/careers/${slug}`,
        lastModified: new Date(),
        changeFrequency: 'weekly',
        priority: 0.6,
      })
    }
  })

  return entries
}
The critical insight here is that we're passing the tenant slug to all the database query functions. This ensures each query is properly scoped to the correct tenant, maintaining the security boundary that Payload's multi-tenant system provides. The baseUrl is constructed using the tenant's configured domain when available, falling back to the request hostname if needed.
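The query helpers in @/payload/db are specific to this project and aren't shown here, but the pattern they follow is simple: each one takes the tenant slug and applies it as a where clause. A rough sketch of what queryAllPageSlugs could look like (the 'pages' collection name and the tenant.slug relationship path are assumptions):
// File: src/payload/db.ts (sketch; adjust collection and field names to your schema)
import { getPayload } from 'payload'
import configPromise from '@payload-config'

export const queryAllPageSlugs = async (tenantSlug?: string): Promise<string[]> => {
  if (!tenantSlug) {
    throw new Error('[queryAllPageSlugs] Tenant is required but was not provided')
  }
  const payload = await getPayload({ config: configPromise })
  const pages = await payload.find({
    collection: 'pages',
    where: { 'tenant.slug': { equals: tenantSlug } }, // scope the query to one tenant
    depth: 0,
    pagination: false, // return all matching documents
  })
  return pages.docs
    .map((doc) => doc.slug)
    .filter((slug): slug is string => typeof slug === 'string')
}
The other helpers (posts, products, case studies, job openings) follow the same shape, which is why passing a single tenantSlug down from the sitemap is all the scoping you need.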
Implementing the Robots.txt File
The robots.txt implementation follows the same pattern but with a different structure since it needs to return a single object rather than an array of URLs.
// File: src/app/(frontend)/robots.ts
import type { MetadataRoute } from 'next'
import { headers } from 'next/headers'
import { getPayload } from 'payload'
import configPromise from '@payload-config'
import { unstable_cache } from 'next/cache'

// getTenantByDomain is the same cached helper shown in sitemap.ts;
// define it here as well, or extract it into a shared module and import it in both files.

export default async function robots(): Promise<MetadataRoute.Robots> {
  // Get hostname from request headers
  const hostname = (await headers()).get('host') || 'www.example.com'

  // Try to find tenant by domain or subdomain
  const tenant = await getTenantByDomain(hostname)

  // If no tenant found, use fallback (example)
  const baseUrl = tenant?.domain ? `https://${tenant.domain}` : `https://${hostname}`

  return {
    rules: [
      {
        userAgent: '*',
        allow: '/',
        disallow: [
          '/admin',
          '/api',
        ],
        crawlDelay: 1,
      },
      {
        userAgent: 'Googlebot',
        allow: '/',
        disallow: [
          '/admin',
          '/api',
        ],
      },
    ],
    sitemap: `${baseUrl}/sitemap.xml`,
    host: baseUrl,
  }
}
The key difference here is the return structure - robots.txt returns a single object with crawling rules and a reference to the tenant-specific sitemap. This ensures search engines get the correct sitemap URL for each tenant while maintaining consistent crawling rules.
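For a tenant whose configured domain is, say, tenant-a.com (a placeholder), the rendered /robots.txt comes out roughly like this:
User-Agent: *
Allow: /
Disallow: /admin
Disallow: /api
Crawl-delay: 1

User-Agent: Googlebot
Allow: /
Disallow: /admin
Disallow: /api

Host: https://tenant-a.com
Sitemap: https://tenant-a.com/sitemap.xml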
Common Pitfalls and Solutions
During implementation, I encountered several critical issues that you'll want to avoid:
Headers API is Asynchronous
The Next.js headers() function returns a Promise (as of Next.js 15), but it's easy to forget this and write synchronous code. This causes TypeScript errors and runtime failures. Always remember to await the headers call:
// ❌ This will fail
const headersList = headers()
const hostname = headersList.get('host')
// ✅ This works
const hostname = (await headers()).get('host') || 'www.example.com'
Build-Time vs Runtime Context
Initially, I tried to access request context during build time, which fails because there's no actual request. The solution is to keep the files dynamic and let Next.js handle the runtime execution. This is why the files work perfectly in production but may show fallback content during static analysis.
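Reading headers() already opts these routes into dynamic rendering, but if you want to make that explicit, or the build still attempts to prerender the files, the standard route segment config can be added; a minimal sketch:
// File: src/app/(frontend)/sitemap.ts (and robots.ts)
// Force runtime rendering so the tenant is resolved per request, never at build time
export const dynamic = 'force-dynamic'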
Tenant Parameter Enforcement
Payload's database layer is designed to prevent cross-tenant data leaks by requiring tenant parameters. This caused our initial build error. The solution isn't to bypass this security but to properly provide the tenant context:
// ❌ This throws "Tenant is required but was not provided"
await queryAllPageSlugs()
// ✅ This works and maintains tenant isolation
await queryAllPageSlugs(tenantSlug)
Cache Key Collisions
When caching tenant queries, ensure your cache keys include the tenant identifier. Otherwise, a request for tenant A might return cached data from tenant B:
// ❌ Cache key doesn't include tenant
['tenants']
// ✅ Tenant-specific cache key
[`tenant-by-domain-${domain}`]
Testing and Verification
To verify your implementation works correctly, test both the sitemap and robots endpoints for each tenant:
# Test sitemap for different tenants
curl -H "Host: tenant-a.com" http://localhost:3000/sitemap.xml
curl -H "Host: tenant-b.com" http://localhost:3000/sitemap.xml
# Test robots.txt for different tenants
curl -H "Host: tenant-a.com" http://localhost:3000/robots.txt
curl -H "Host: tenant-b.com" http://localhost:3000/robots.txt
Each request should return URLs and configuration specific to the respective tenant. The sitemap should only include URLs for pages belonging to that tenant, and the robots.txt should reference the correct sitemap URL.
Performance Considerations
The caching strategy we implemented ensures that tenant lookups don't become a bottleneck. By caching for one hour with tenant-specific keys, we balance freshness with performance. You can adjust the revalidation period based on how frequently your tenant domains change.
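Because the lookup is cached under the 'tenants' tag, you can also invalidate it on demand instead of waiting up to an hour, for example from a Payload afterChange hook on the tenants collection. A sketch (the file path and hook name are illustrative):
// File: src/payload/hooks/revalidateTenants.ts (hypothetical)
import { revalidateTag } from 'next/cache'
import type { CollectionAfterChangeHook } from 'payload'

// Bust every cached getTenantByDomain() result whenever a tenant document changes
export const revalidateTenants: CollectionAfterChangeHook = ({ doc }) => {
  revalidateTag('tenants')
  return doc
}
Register it under hooks.afterChange in the tenants collection config so a domain change is reflected in the sitemap and robots output without a redeploy.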
For high-traffic sites, consider implementing a more aggressive caching strategy or using a CDN edge function to handle these endpoints, but for most applications, the built-in Next.js caching with our tenant-specific keys provides excellent performance.
Conclusion
By implementing dynamic tenant detection and properly scoping all database queries, we've solved the core challenge of multi-tenant sitemap and robots.txt generation in Payload 3 with Next.js. The solution maintains security boundaries, eliminates hardcoded URLs, and scales efficiently across any number of tenants.
You now have a complete understanding of how to configure dynamic sitemap.xml and robots.txt files that respond correctly to each tenant's domain while maintaining proper data isolation and performance. This approach works seamlessly with Payload's multi-tenant system and follows Next.js best practices for metadata file generation.
Let me know in the comments if you have questions, and subscribe for more practical development guides.
Thanks, Matija