How to Generate SEO-Friendly Sitemaps for Headless Shopify with Next.js

A Practical Guide to Building Efficient Sitemaps for Better Indexing, Performance, and Organic Growth in your Headless Shopify & Next.js 15 Store

·Matija Žiberna·

While building a headless e-commerce solution with Shopify and Next.js 15, I discovered that my sitemap.xml was missing all product and collection URLs. This was a critical SEO issue. Search engines couldn't discover my client's 150+ products and 5 collections, severely impacting organic visibility.

Next.js provides native sitemap support through the app/sitemap.ts file convention as outlined in their documentation. However, the challenge with headless Shopify implementations is efficiently fetching product and collection data without overwhelming your build process.

The core problem was that my existing GraphQL queries were fetching complete product objects (variants, images, metafields) totaling ~3KB per product, when sitemaps only need handles and update dates. This resulted in 450KB+ data transfers and slow, unreliable builds.

This article demonstrates how to create ultra-lightweight GraphQL queries specifically for sitemap generation, reducing data transfer by 95% while supporting large product catalogs. As a bonus, we'll also implement app/robots.ts to complete your SEO setup.

The Challenge: Heavy Queries vs. Sitemap Requirements

Headless Shopify implementations typically use comprehensive GraphQL queries designed for page rendering. These queries fetch everything needed to display a product page: variants, pricing, images, metafields, and SEO data.

For sitemaps, we only need:

  • Product/collection handles (for URL construction)
  • Update timestamps (for lastModified field)
  • Featured images (optional, for enhanced SEO)

The performance difference is dramatic:

  • Standard product query: ~3KB per product
  • Sitemap-optimized query: ~150 bytes per product

For a store with 150 products, this means reducing the payload from roughly 450KB to around 22KB.
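The arithmetic behind those figures can be checked with a few lines of TypeScript. This is only a back-of-the-envelope sketch: the per-product byte counts are the approximate values quoted above, not measurements, and estimatePayload is a hypothetical helper name.

```typescript
// Approximate per-product payload sizes quoted in this article (not measured values).
const FULL_QUERY_BYTES = 3000; // ~3KB per product with variants, images, metafields
const SITEMAP_QUERY_BYTES = 150; // ~150 bytes per product with handle + updatedAt

// Hypothetical helper: estimates total transfer for a catalog of a given size.
export function estimatePayload(productCount: number) {
  const fullBytes = productCount * FULL_QUERY_BYTES;
  const leanBytes = productCount * SITEMAP_QUERY_BYTES;
  return {
    fullKB: fullBytes / 1000,
    leanKB: leanBytes / 1000,
    reductionPercent: Math.round((1 - leanBytes / fullBytes) * 100),
  };
}

console.log(estimatePayload(150));
// { fullKB: 450, leanKB: 22.5, reductionPercent: 95 }
```

Because both sizes scale linearly with the product count, the 95% reduction holds for any catalog size under these assumptions.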

Tech Stack Requirements

This solution works with:

  • Next.js 13+ with App Router
  • Shopify Storefront API access
  • TypeScript (recommended for type safety)
  • GraphQL client (we'll use a custom shopifyFetch function)

Implementation Guide

Step 1: Create Lightweight GraphQL Queries

Objective: Design minimal GraphQL queries that fetch only essential sitemap data, reducing payload size by ~95% compared to standard product queries.

Why this matters: The Storefront API does not bill you per query, but every extra field makes a request slower and heavier. Lightweight queries improve build times, keep you well within Shopify's API limits, and prevent timeouts with large catalogs.

📦 File path: lib/shopify/queries/sitemap.ts

// Lightweight GraphQL queries specifically for sitemap generation
// These queries only fetch essential data needed for XML sitemap

export const getSitemapProductsQuery = `
  query getSitemapProducts($first: Int = 250, $after: String, $query: String) {
    products(first: $first, after: $after, query: $query) {
      pageInfo {
        hasNextPage
        hasPreviousPage
        endCursor
        startCursor
      }
      edges {
        node {
          handle
          updatedAt
          featuredImage {
            url
          }
        }
      }
    }
  }
`;

export const getSitemapCollectionsQuery = `
  query getSitemapCollections($first: Int = 100) {
    collections(first: $first) {
      edges {
        node {
          handle
          updatedAt
        }
      }
    }
  }
`;

Key Concepts:

The getSitemapProductsQuery uses cursor-based pagination ($after: String) essential for handling large product catalogs. The $first: Int = 250 parameter uses Shopify's maximum batch size for optimal API performance. The $query parameter enables filtering - we'll use this to exclude hidden products with syntax like -tag:hidden.

Each product node only fetches three fields:

  • handle - Constructs the product URL (/products/{handle})
  • updatedAt - Provides the lastModified timestamp for SEO
  • featuredImage.url - Optional field for enhanced image sitemaps

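The collections query is simpler since most stores have fewer collections. We fetch up to 100 collections without pagination, requesting only the handle and updatedAt fields. A store with more than 100 collections would need the same cursor-based pagination used for products.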


Step 2: Create TypeScript Interfaces

Objective: Define precise TypeScript interfaces for our minimal data structures and GraphQL operation types.

Why this matters: Strong typing prevents runtime errors and provides excellent developer experience with autocomplete and compile-time validation.

📦 File path: lib/shopify/types.ts (add to existing file)

// Sitemap-specific types for lightweight data fetching
export interface SitemapProduct {
  handle: string;
  updatedAt: string;
  featuredImage?: {
    url: string;
  };
}

export interface SitemapCollection {
  handle: string;
  updatedAt: string;
}

// Sitemap operation types for GraphQL responses
export type ShopifySitemapProductsOperation = {
  data: {
    products: Connection<SitemapProduct> & {
      pageInfo: {
        hasNextPage: boolean;
        hasPreviousPage: boolean;
        endCursor: string | null;
        startCursor: string | null;
      };
    };
  };
  variables: {
    first?: number;
    after?: string;
    query?: string;
  };
};

export type ShopifySitemapCollectionsOperation = {
  data: {
    collections: Connection<SitemapCollection>;
  };
  variables: {
    first?: number;
  };
};

Key Concepts:

The SitemapProduct and SitemapCollection interfaces represent the minimal data structures needed for sitemap generation. The featuredImage property is optional since not all products have featured images, while updatedAt comes from Shopify as an ISO date string.

The operation types (ShopifySitemapProductsOperation and ShopifySitemapCollectionsOperation) define the complete GraphQL response structure. These follow the standard pattern where:

  • data contains the actual response from Shopify
  • variables defines the input parameters for type safety
  • Connection<T> represents GraphQL's pagination pattern with edges and nodes
  • pageInfo provides pagination metadata (hasNextPage, endCursor, etc.)

Important: Ensure your existing codebase has a Connection<T> type defined. This is standard in Shopify integrations and represents the GraphQL connection pattern for paginated data.
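If your codebase does not define Connection<T> yet, a minimal sketch could look like the one below. The Edge type and the removeEdgesAndNodes helper are assumptions made for illustration; align them with whatever your Shopify integration already exports.

```typescript
// Hypothetical shapes for GraphQL's connection pattern; adjust to your codebase.
export type Edge<T> = { node: T };
export type Connection<T> = { edges: Array<Edge<T>> };

// Flattens a connection into a plain array of nodes.
export function removeEdgesAndNodes<T>(connection: Connection<T>): T[] {
  return connection.edges.map((edge) => edge.node);
}

// Usage with the SitemapCollection shape from this step (sample data only)
const sampleCollections: Connection<{ handle: string; updatedAt: string }> = {
  edges: [
    { node: { handle: "shoes", updatedAt: "2024-01-01T00:00:00Z" } },
    { node: { handle: "hats", updatedAt: "2024-02-01T00:00:00Z" } },
  ],
};

const sampleHandles = removeEdgesAndNodes(sampleCollections).map((c) => c.handle);
console.log(sampleHandles); // ["shoes", "hats"]
```

With a typed helper like this, the edge-mapping callbacks in the fetchers would no longer need an any annotation.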


Step 3: Create Data Fetcher Functions

Objective: Implement API functions that fetch sitemap data with pagination support, error handling, and performance optimizations.

Why this matters: These functions handle the complex logic of paginating through large product catalogs while maintaining build reliability and preventing API timeouts.

📦 File path: lib/shopify/fetchers/storefront/sitemap.ts

import { shopifyFetch } from "../../utils/clients";
import {
  getSitemapProductsQuery,
  getSitemapCollectionsQuery,
} from "../../queries/sitemap";
import { HIDDEN_PRODUCT_TAG } from "../../../constants"; // Your hidden product tag constant
import {
  SitemapProduct,
  SitemapCollection,
  ShopifySitemapProductsOperation,
  ShopifySitemapCollectionsOperation,
} from "../../types";

export async function getSitemapProducts(): Promise<SitemapProduct[]> {
  console.log("🗺️ Fetching products for sitemap...");

  try {
    const allProducts: SitemapProduct[] = [];
    let hasNextPage = true;
    let cursor: string | undefined = undefined;

    // Filter out hidden products using Shopify's query syntax
    const query = `-tag:${HIDDEN_PRODUCT_TAG}`;

    // Safety limit: stop once 500 or more products are collected (raise this for larger catalogs)
    while (hasNextPage && allProducts.length < 500) {
      const res: { status: number; body: ShopifySitemapProductsOperation } =
        await shopifyFetch<ShopifySitemapProductsOperation>({
          query: getSitemapProductsQuery,
          variables: {
            first: 250,
            after: cursor,
            query,
          },
        });

      if (!res?.body?.data?.products?.edges) {
        console.warn("⚠️ No products returned from Shopify");
        break;
      }

      const products = res.body.data.products.edges.map(
        (edge: any) => edge.node
      );
      allProducts.push(...products);

      hasNextPage = res.body.data.products.pageInfo.hasNextPage;
      cursor = res.body.data.products.pageInfo.endCursor || undefined;

      console.log(
        `📦 Fetched ${products.length} products (total: ${allProducts.length})`
      );
    }

    console.log(
      `✅ Successfully fetched ${allProducts.length} products for sitemap`
    );
    return allProducts;
  } catch (error) {
    console.error("❌ Error fetching sitemap products:", error);
    throw error; // Re-throw to handle in sitemap.ts
  }
}

export async function getSitemapCollections(): Promise<SitemapCollection[]> {
  console.log("🗺️ Fetching collections for sitemap...");

  try {
    const res: { status: number; body: ShopifySitemapCollectionsOperation } =
      await shopifyFetch<ShopifySitemapCollectionsOperation>({
        query: getSitemapCollectionsQuery,
        variables: {
          first: 100, // More than enough for most stores
        },
      });

    if (!res?.body?.data?.collections?.edges) {
      console.warn("⚠️ No collections returned from Shopify");
      return [];
    }

    const collections = res.body.data.collections.edges
      .map((edge: any) => edge.node)
      .filter(
        (collection: any) => collection.handle && collection.handle !== ""
      ); // Filter out empty handles

    console.log(
      `✅ Successfully fetched ${collections.length} collections for sitemap`
    );
    return collections;
  } catch (error) {
    console.error("❌ Error fetching sitemap collections:", error);
    throw error;
  }
}

Key Concepts:

The getSitemapProducts function implements cursor-based pagination to handle large product catalogs efficiently. The pagination logic while (hasNextPage && allProducts.length < 500) continues fetching until either no more pages exist or a safety limit is reached to prevent infinite loops.

Shopify uses cursor-based pagination rather than offset-based. The cursor tracks position in the dataset, starting as undefined and updating with each page's endCursor. The || undefined conversion is needed because Shopify returns null when no more pages exist, while the cursor variable is typed as string | undefined.

Each function includes comprehensive error handling:

  • Try-catch blocks provide specific error logging for debugging
  • Errors are re-thrown to allow the sitemap to implement fallback strategies
  • Defensive checks (if (!res?.body?.data?.products?.edges)) prevent crashes from unexpected API responses

The shopifyFetch function returns a wrapper object { status: number; body: T } where the actual GraphQL data resides in res.body.data. This structure allows checking HTTP status codes while maintaining type safety with the generic T parameter.
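Since shopifyFetch is a custom function that this article does not show, here is one possible shape for it, as a sketch under stated assumptions. The createShopifyFetch factory, the ShopifyConfig fields, and the injectable fetchImpl are all hypothetical; your own client probably reads its endpoint and token from environment variables instead. The X-Shopify-Storefront-Access-Token header is the one the Storefront API expects.

```typescript
// Minimal fetch signature so the sketch can be exercised with a stub.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string }
) => Promise<{ status: number; json(): Promise<unknown> }>;

// Hypothetical configuration object; names are assumptions for this sketch.
type ShopifyConfig = {
  endpoint: string; // your store's Storefront API GraphQL endpoint
  accessToken: string; // Storefront API access token
  fetchImpl: FetchLike; // pass the global fetch in real code
};

type GraphQLError = { message: string };

export function createShopifyFetch(config: ShopifyConfig) {
  return async function shopifyFetch<T extends { data: unknown; variables?: object }>(
    request: { query: string; variables?: T["variables"] }
  ): Promise<{ status: number; body: T }> {
    const res = await config.fetchImpl(config.endpoint, {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "X-Shopify-Storefront-Access-Token": config.accessToken,
      },
      body: JSON.stringify({ query: request.query, variables: request.variables }),
    });

    const body = (await res.json()) as T & { errors?: GraphQLError[] };
    const errors = body.errors ?? [];
    if (errors.length > 0) {
      // GraphQL errors can arrive with HTTP 200, so surface them explicitly.
      throw new Error(errors.map((e) => e.message).join("; "));
    }
    return { status: res.status, body };
  };
}
```

Injecting fetchImpl keeps the wrapper testable without network access. In an application you would create the client once and call it the way the fetchers above do.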

Performance optimizations:

  • Batch size of 250 items uses Shopify's maximum for optimal API throughput
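  • The 500-product safety limit prevents runaway pagination, but it also silently truncates larger catalogs, so raise it if your store has more than 500 products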
  • Early returns on empty responses avoid unnecessary processing
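The pagination loop described above can also be factored into a reusable helper. The following is a minimal sketch, not the code used in this project: fetchAllPages, the Page shape, and the maxItems parameter are hypothetical names, and the page fetcher is passed in so the loop works for products or collections alike.

```typescript
// Hypothetical page shape: one batch of items plus the cursor metadata.
type Page<T> = { items: T[]; hasNextPage: boolean; endCursor: string | null };

export async function fetchAllPages<T>(
  fetchPage: (cursor: string | undefined) => Promise<Page<T>>,
  maxItems = 500 // safety cap; the result may overshoot it by up to one page
): Promise<T[]> {
  const all: T[] = [];
  let cursor: string | undefined = undefined;
  let hasNextPage = true;

  while (hasNextPage && all.length < maxItems) {
    const page = await fetchPage(cursor);
    all.push(...page.items);
    hasNextPage = page.hasNextPage;
    cursor = page.endCursor ?? undefined; // null from the API becomes undefined

    // A page that reports more data but supplies no cursor would loop forever.
    if (hasNextPage && cursor === undefined) break;
  }

  if (hasNextPage) {
    // Make truncation visible instead of dropping URLs silently.
    console.warn(`Stopped at ${all.length} items; the result may be truncated.`);
  }
  return all;
}
```

Logging a warning when the cap is hit is a deliberate choice here: a sitemap that quietly omits products is harder to notice than a failed build.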

Step 4: Update the Sitemap Implementation

Objective: Replace existing heavy queries with our optimized functions while implementing robust error handling and SEO best practices.

Why this matters: The sitemap.ts file is Next.js's entry point for sitemap generation. Proper implementation ensures reliable builds and optimal SEO with correct priorities and change frequencies.

📦 File path: app/sitemap.ts

import {
  getSitemapProducts,
  getSitemapCollections,
} from "lib/shopify/fetchers/storefront/sitemap";
import { getPages } from "lib/shopify"; // Keep existing pages function
import { validateEnvironmentVariables } from "lib/utils";
import type { MetadataRoute } from "next";

export const dynamic = "force-dynamic";

export default async function sitemap(): Promise<MetadataRoute.Sitemap> {
  validateEnvironmentVariables();

  const baseUrl =
    process.env.NODE_ENV === "production"
      ? "https://your-domain.com" // Replace with your actual domain
      : process.env.NEXT_PUBLIC_BASE_URL || "http://localhost:3000";

  console.log("🗺️ Generating sitemap for:", baseUrl);

  // Static routes with SEO best practices priorities
  const staticRoutes = [
    // Homepage - highest priority
    { url: "/", priority: 1.0, changeFrequency: "weekly" as const },

    // Main category pages - high priority
    { url: "/collections", priority: 0.8, changeFrequency: "daily" as const },
    { url: "/search", priority: 0.4, changeFrequency: "monthly" as const },

    // About/business pages - medium priority
    { url: "/about", priority: 0.5, changeFrequency: "monthly" as const },
    { url: "/contact", priority: 0.5, changeFrequency: "monthly" as const },

    // Legal pages - low priority but necessary
    { url: "/terms", priority: 0.3, changeFrequency: "yearly" as const },
    { url: "/privacy", priority: 0.3, changeFrequency: "yearly" as const },
  ];

  const routesMap = staticRoutes.map((route) => ({
    url: `${baseUrl}${route.url}`,
    lastModified: new Date().toISOString(),
    changeFrequency: route.changeFrequency,
    priority: route.priority,
  }));

  let dynamicRoutes: MetadataRoute.Sitemap = [];

  try {
    // Use Promise.all for parallel fetching (faster than sequential)
    const [sitemapProducts, sitemapCollections, pages] = await Promise.all([
      getSitemapProducts(),
      getSitemapCollections(),
      getPages().catch((error) => {
        console.error("❌ Error fetching pages:", error);
        return []; // Continue without pages if they fail
      }),
    ]);

    // Transform products for sitemap
    const productRoutes = sitemapProducts.map((product) => ({
      url: `${baseUrl}/products/${product.handle}`,
      lastModified: product.updatedAt,
      changeFrequency: "weekly" as const,
      priority: 0.6,
      // Include product image for better SEO (Google supports this)
      images: product.featuredImage?.url
        ? [product.featuredImage.url]
        : undefined,
    }));

    // Transform collections for sitemap
    const collectionRoutes = sitemapCollections.map((collection) => ({
      url: `${baseUrl}/collections/${collection.handle}`,
      lastModified: collection.updatedAt,
      changeFrequency: "weekly" as const,
      priority: 0.7, // Higher than products for category pages
    }));

    // Transform pages for sitemap
    const pageRoutes = pages.map((page) => ({
      url: `${baseUrl}/${page.handle}`,
      lastModified: page.updatedAt,
      changeFrequency: "monthly" as const,
      priority: 0.4,
    }));

    dynamicRoutes = [...productRoutes, ...collectionRoutes, ...pageRoutes];

    console.log(`✅ Sitemap generated successfully:`);
    console.log(`   📄 ${staticRoutes.length} static routes`);
    console.log(`   📦 ${productRoutes.length} product routes`);
    console.log(`   📂 ${collectionRoutes.length} collection routes`);
    console.log(`   📋 ${pageRoutes.length} page routes`);
    console.log(
      `   🔗 ${staticRoutes.length + dynamicRoutes.length} total URLs`
    );
  } catch (error) {
    console.error("❌ Error generating dynamic sitemap routes:", error);
    console.log("🔄 Falling back to static routes only");
    // Return only static routes if dynamic fetching fails
    return routesMap;
  }

  return [...routesMap, ...dynamicRoutes];
}

Key Concepts:

The implementation uses Promise.all() for concurrent API calls rather than sequential awaits, significantly reducing total fetch time. Each API call happens simultaneously: products, collections, and pages.

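SEO Priority Implementation: The priority structure expresses the relative importance of pages within the site. Note that Google's documentation states it ignores the priority and changefreq values, so treat them as hints for other crawlers rather than as ranking levers: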

  • Homepage (1.0) - Your most important page
  • Collections (0.7) - Category pages are crucial for product discovery
  • Products (0.6) - Individual product pages
  • Static pages (0.3-0.5) - Based on business importance

Robust Error Handling: The .catch() on getPages() prevents a pages failure from breaking the entire sitemap: if pages fail to fetch, the sitemap continues with products and collections. Product and collection errors are re-thrown by the fetchers, and the surrounding try-catch then falls back to static routes only.
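If you would rather degrade per data source than drop every dynamic route when products or collections fail, the same .catch() idea can be generalized. The sketch below is an assumption-laden illustration: withFallback and buildDemoRoutes are hypothetical names, and the inline fetchers stand in for the real Shopify calls.

```typescript
// Runs a task and returns the fallback value instead of throwing.
export async function withFallback<T>(
  label: string,
  task: () => Promise<T>,
  fallback: T
): Promise<T> {
  try {
    return await task();
  } catch (error) {
    console.error(`Error fetching ${label}:`, error);
    return fallback;
  }
}

type HandleOnly = { handle: string };

// Usage sketch with stand-in fetchers: pages fail, products still come through.
export async function buildDemoRoutes() {
  const [products, pages] = await Promise.all([
    withFallback<HandleOnly[]>("products", async () => [{ handle: "tee" }], []),
    withFallback<HandleOnly[]>(
      "pages",
      async () => {
        throw new Error("pages endpoint unavailable");
      },
      []
    ),
  ]);
  return { products, pages };
}
```

The trade-off is that a partial sitemap gets served without any signal other than the log line, so this only makes sense if you monitor those logs.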

Image Sitemap Enhancement: Including images: product.featuredImage?.url ? [product.featuredImage.url] : undefined creates an enhanced sitemap that Google can use for image search results. This optional field improves SEO without affecting basic sitemap functionality.

The detailed logging provides visibility into the generation process, helpful for debugging build issues and monitoring sitemap health over time.


Step 5: Export the New Functions

Objective: Expose the sitemap functions through your main Shopify module exports for clean import paths.

Why this matters: Following the barrel export pattern keeps imports clean and maintains consistency with your existing codebase architecture.

📦 File path: lib/shopify/index.ts (add to existing exports)

// Re-export sitemap functions
export {
  getSitemapProducts,
  getSitemapCollections,
} from "./fetchers/storefront/sitemap";

Key Concepts:

This implements the barrel export pattern standard in TypeScript projects. Instead of importing from deep file paths, consumers can import directly from the main module:

// Clean: import { getSitemapProducts } from "lib/shopify";
// Verbose: import { getSitemapProducts } from "lib/shopify/fetchers/storefront/sitemap";

This pattern improves maintainability - if you restructure your internal files, only the barrel exports need updating rather than every import throughout your codebase.


Bonus: Adding robots.txt Support

Objective: Complete your SEO setup by implementing app/robots.ts to reference your sitemap and control crawler access.

Why this matters: The robots.txt file tells search engines where to find your sitemap and which areas of your site to crawl or avoid.

📦 File path: app/robots.ts

import type { MetadataRoute } from "next";

export default function robots(): MetadataRoute.Robots {
  const baseUrl =
    process.env.NODE_ENV === "production"
      ? "https://your-domain.com" // Replace with your actual domain
      : process.env.NEXT_PUBLIC_BASE_URL || "http://localhost:3000";

  return {
    rules: {
      userAgent: "*",
      allow: "/",
      disallow: [
        "/account",
        "/account/*",
        "/auth/*",
        "/api/*",
        "/admin/*",
        "/_next/*",
        "/private/*",
      ],
    },
    sitemap: `${baseUrl}/sitemap.xml`,
  };
}

The robots.txt implementation follows e-commerce best practices by allowing general crawling while protecting sensitive areas. The disallow array prevents search engines from indexing user accounts, authentication pages, API endpoints, and admin interfaces.

Most importantly, the sitemap property points directly to your optimized sitemap.xml, ensuring search engines can efficiently discover all your products and collections.
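Both app/sitemap.ts and app/robots.ts resolve the base URL with the same expression, so one option is to share it. The helper below is a hypothetical sketch (getBaseUrl and its parameters are not part of the original code); it takes the environment as an argument so it can be tested, and in the app you would pass process.env.

```typescript
// Only the two variables the article's code reads.
type BaseUrlEnv = { NODE_ENV?: string; NEXT_PUBLIC_BASE_URL?: string };

export function getBaseUrl(
  env: BaseUrlEnv,
  productionUrl = "https://your-domain.com" // Replace with your actual domain
): string {
  const raw =
    env.NODE_ENV === "production"
      ? productionUrl
      : env.NEXT_PUBLIC_BASE_URL || "http://localhost:3000";

  // Strip trailing slashes so `${baseUrl}/sitemap.xml` never contains "//".
  return raw.replace(/\/+$/, "");
}

console.log(
  getBaseUrl({ NODE_ENV: "development", NEXT_PUBLIC_BASE_URL: "https://preview.example.com/" })
);
// https://preview.example.com
```

Keeping the logic in one place means the Sitemap line in robots.txt and the URLs inside sitemap.xml cannot drift apart.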

Results & Performance

Project Structure

lib/shopify/
├── queries/
│   └── sitemap.ts              # Lightweight GraphQL queries
├── fetchers/storefront/
│   └── sitemap.ts              # Data fetching functions
├── types.ts                    # Type definitions (updated)
└── index.ts                    # Exports (updated)

app/
├── sitemap.ts                  # Sitemap implementation (updated)
└── robots.ts                   # Robots.txt implementation (new)

Architecture Benefits

The implementation offers four architectural benefits:

  • Separation of concerns: sitemap queries stay isolated from page rendering logic
  • GraphQL efficiency: only essential fields (handle, updatedAt, and featuredImage) are fetched, reducing payload size while preserving SEO functionality
  • Error resilience: fallback strategies guarantee successful sitemap generation even if individual API calls fail
  • Type safety: complete TypeScript coverage prevents runtime errors and enhances the developer experience

Production Optimizations

Caching Implementation

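import { unstable_cache } from "next/cache";
import { getSitemapProducts } from "lib/shopify";

export const getCachedSitemapProducts = unstable_cache(
  async () => getSitemapProducts(),
  ["sitemap-products"],
  { revalidate: 3600, tags: ["sitemap"] } // Cache for 1 hour; the tag allows on-demand revalidation
);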

Incremental Regeneration

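import { revalidateTag } from "next/cache";

// Trigger sitemap updates when products change.
// Only affects cache entries that were created with tags: ["sitemap"].
export async function revalidateSitemap() {
  revalidateTag("sitemap");
}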

Large Catalog Support

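For stores with 10,000+ products, implement sitemap splitting. The sitemaps.org protocol caps a single sitemap file at 50,000 URLs, so very large catalogs need a sitemap index that points to several smaller files: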

<!-- sitemap-index.xml -->
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://your-domain.com/sitemap-products.xml</loc>
    <lastmod>2024-01-01T00:00:00Z</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://your-domain.com/sitemap-collections.xml</loc>
    <lastmod>2024-01-01T00:00:00Z</lastmod>
  </sitemap>
</sitemapindex>
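An index like the one above can be generated from code. The sketch below shows one way to split entries into files and render the index; chunkEntries, buildSitemapIndex, and escapeXml are hypothetical helpers written for this illustration. Next.js also documents a generateSitemaps convention for splitting sitemaps, which may be a better fit, so check the documentation for the version you run.

```typescript
const MAX_URLS_PER_SITEMAP = 50000; // per-file limit from the sitemaps.org protocol

// Splits a list of entries into groups that each fit in one sitemap file.
export function chunkEntries<T>(entries: T[], size = MAX_URLS_PER_SITEMAP): T[][] {
  if (size < 1) throw new RangeError("size must be at least 1");
  const chunks: T[][] = [];
  for (let i = 0; i < entries.length; i += size) {
    chunks.push(entries.slice(i, i + size));
  }
  return chunks;
}

// URLs placed in XML must have these characters entity-escaped.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Renders a sitemap index that points at the individual sitemap files.
export function buildSitemapIndex(sitemapUrls: string[], lastmod: string): string {
  const items = sitemapUrls
    .map(
      (loc) =>
        `  <sitemap>\n    <loc>${escapeXml(loc)}</loc>\n    <lastmod>${escapeXml(lastmod)}</lastmod>\n  </sitemap>`
    )
    .join("\n");
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    items,
    "</sitemapindex>",
  ].join("\n");
}
```

Remember that the product fetcher's safety limit has to be raised as well, otherwise there will never be enough URLs to split.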

Conclusion

This implementation provides a robust foundation for SEO-friendly sitemaps in headless Shopify applications. The lightweight query approach scales efficiently with large product catalogs while maintaining build reliability.

Key takeaways:

  • Always optimize GraphQL queries for their specific use case
  • Implement comprehensive error handling for build reliability
  • Use TypeScript for type safety and better developer experience
  • Follow SEO best practices with proper priorities and metadata

Start with this implementation and add complexity only when needed. Most e-commerce sites will find this approach sufficient for their entire product lifecycle.

Thanks, Matija
