How to Upgrade Your Payload Website with RAG and Vector Search

Use Upstash + OpenAI embeddings and Payload Native Jobs to add RAG, semantic search, and context-aware chatbots to…

25th December 2025 · Updated on: 3rd January 2026 · Matija Žiberna

Context: This guide assumes you have a running Payload CMS 3.0 project.

Imagine if your CMS didn't just store content, but actually understood it. By integrating a Vector Store (Upstash) with Payload, you unlock Chatbots, RAG (Retrieval Augmented Generation), and Semantic Search.

But there is a trap: AI operations are slow. Generating embeddings and syncing to Upstash can take 2-3 seconds, which is too long for a user to wait when saving a post.

This guide shows you how to implement a Background Job pipeline to sync your content asynchronously using Payload's native Jobs queue.


0. Prerequisites

Before writing code, we need to set up our environment.

Install Dependencies

npm install @upstash/vector openai

Environment Variables

Add these to your .env file:

# Get keys from https://console.upstash.com/vector
UPSTASH_VECTOR_REST_URL="https://your-index-url.upstash.io"
UPSTASH_VECTOR_REST_TOKEN="your-token"

# Get key from https://platform.openai.com/
OPENAI_API_KEY="sk-..."

Create the Upstash Index

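CRITICAL: When creating your index in the Upstash Console, you MUST set the dimensions to 1024. OpenAI's text-embedding-3-small returns 1536-dimensional vectors by default; this guide shortens them to 1024 through the dimensions parameter, and the index must match that number exactly.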

  • Metric: Cosine (recommended)
  • Dimensions: 1024
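
To make the Cosine metric concrete, here is a minimal sketch of the comparison it is based on. Upstash runs the similarity search for you on the server; the cosineSimilarity function below is purely illustrative and is not part of @upstash/vector. It also shows why the dimensions have to match: two vectors of different lengths cannot be compared at all.

```typescript
// Illustrative helper, not part of @upstash/vector.
// Returns a value between -1 (opposite) and 1 (same direction).
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`)
  }

  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }

  // A zero vector has no direction, so treat it as "not similar".
  if (normA === 0 || normB === 0) return 0
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}
```

Two embeddings that point in the same direction score close to 1, unrelated ones score close to 0.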

1. Vector Infrastructure

Let's set up the basics first: a shared Upstash client and an embedding helper.

File: src/lib/vector/client.ts

import { Index } from '@upstash/vector'

if (!process.env.UPSTASH_VECTOR_REST_URL || !process.env.UPSTASH_VECTOR_REST_TOKEN) {
  throw new Error('Missing Upstash Vector env vars')
}

export const vectorIndex = new Index({
  url: process.env.UPSTASH_VECTOR_REST_URL,
  token: process.env.UPSTASH_VECTOR_REST_TOKEN,
})

File: src/lib/vector/embedding.ts

import OpenAI from 'openai'

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY })

export async function generateEmbedding(text: string): Promise<number[]> {
  const sanitizedText = text.replace(/\n/g, ' ')
  // IMPORTANT: dimensions must match your Upstash index
  const response = await openai.embeddings.create({
    model: 'text-embedding-3-small',
    input: sanitizedText,
    dimensions: 1024, 
  })
  return response.data[0].embedding
}
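
The embedding model only accepts input up to a maximum length, and very long posts also tend to retrieve poorly as a single vector. The sketch below shows one way to split text into smaller pieces before calling generateEmbedding. It is an assumption layered on top of this guide, not part of it: chunkText and the maxChars budget of 2000 characters are hypothetical, and a character budget is only a rough stand-in for a token limit.

```typescript
// Hypothetical helper: split text into chunks of at most maxChars characters,
// preferring paragraph boundaries (blank lines) as split points.
function chunkText(text: string, maxChars = 2000): string[] {
  const paragraphs = text
    .split(/\n{2,}/)
    .map((p) => p.trim())
    .filter(Boolean)

  const chunks: string[] = []
  let current = ''

  for (const paragraph of paragraphs) {
    // A single oversized paragraph is hard-split at the character budget.
    if (paragraph.length > maxChars) {
      if (current) {
        chunks.push(current)
        current = ''
      }
      for (let i = 0; i < paragraph.length; i += maxChars) {
        chunks.push(paragraph.slice(i, i + maxChars))
      }
      continue
    }

    // Otherwise keep appending paragraphs until the budget would be exceeded.
    const candidate = current ? `${current}\n\n${paragraph}` : paragraph
    if (candidate.length > maxChars) {
      chunks.push(current)
      current = paragraph
    } else {
      current = candidate
    }
  }

  if (current) chunks.push(current)
  return chunks
}
```

If you adopt something like this, each chunk needs its own vector ID (for example by appending the chunk index), otherwise later chunks overwrite earlier ones on upsert.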

2. The Logic: Operations

We need a function to handle the actual embedding logic. This is the code that will run inside our job.

File: src/lib/vector/operations.ts

import { vectorIndex } from './client'
import { generateEmbedding } from './embedding'

export async function embedDocument({ id, collection, text }: { id: string, collection: string, text: string }) {
  const embedding = await generateEmbedding(text)
  
  await vectorIndex.upsert([{
    id: `${collection}-${id}`,
    vector: embedding,
    metadata: {
      docId: id,
      collection,
      // Add other metadata here
    }
  }])
  
  console.log(`[Vector] Synced ${collection}/${id}`)
}
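
Writing vectors is only half of the picture; semantic search and RAG read them back. The following is a minimal sketch of the read side, under stated assumptions: semanticSearch, SearchHit and the structural types are hypothetical names, written so the example runs without network access. In a real project you would pass generateEmbedding as embed and vectorIndex as index; the query arguments (vector, topK, includeMetadata) mirror the shape of the Upstash query call.

```typescript
// Minimal structural types so the sketch runs without the real SDKs.
type EmbedFn = (text: string) => Promise<number[]>

interface VectorMatch {
  id: string | number
  score: number
  metadata?: Record<string, unknown>
}

interface QueryableIndex {
  query(args: {
    vector: number[]
    topK: number
    includeMetadata?: boolean
  }): Promise<VectorMatch[]>
}

interface SearchHit {
  docId: string
  collection: string
  score: number
}

// Embed the question, ask the index for the nearest vectors, and map the
// matches back to the docId/collection metadata written by embedDocument.
async function semanticSearch(
  question: string,
  embed: EmbedFn,
  index: QueryableIndex,
  topK = 5,
): Promise<SearchHit[]> {
  const vector = await embed(question)
  const matches = await index.query({ vector, topK, includeMetadata: true })

  const hits: SearchHit[] = []
  for (const match of matches) {
    const docId = match.metadata?.docId
    const collection = match.metadata?.collection
    // Skip vectors that do not carry the expected metadata.
    if (typeof docId !== 'string' || typeof collection !== 'string') continue
    hits.push({ docId, collection, score: match.score })
  }
  return hits
}
```

The returned docId and collection are enough to load the full documents from Payload and hand them to a chatbot prompt as context.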

3. The Job: Upsert Handler

Now we create the Payload Task Handler. This runs in the background.

Task Input Interface: First, define what data we pass to the job.

export interface VectorUpsertInput {
  docId: string
  collection: string
}

File: src/payload/jobs/vector/upsert.ts

import type { TaskHandler } from 'payload'
import { embedDocument } from '@/lib/vector/operations'

export interface VectorUpsertInput {
  docId: string
  collection: string
}

export const vectorUpsertHandler: TaskHandler<VectorUpsertInput> = async ({ input, req }) => {
  const { docId, collection } = input

  req.payload.logger.info(`[Job] Starting vector sync for ${collection}/${docId}`)

  try {
    const doc = await req.payload.findByID({ collection, id: docId })

    if (doc._status && doc._status !== 'published') {
      return { output: { message: 'Skipped: Not published' } }
    }

    // Extract text content based on collection
    // Note: For richText fields, you'd want a lexicalToMarkdown utility here
    const content = (doc as any).content || (doc as any).description || ''
    
    if (!content) return { output: { message: 'Skipped: No content' } }

    await embedDocument({
      id: docId.toString(),
      collection,
      text: typeof content === 'string' ? content : JSON.stringify(content) 
    })

    return { output: { message: 'Success' } }
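  } catch (error) {
    // `error` is `unknown` under strict TypeScript, so narrow it before reading .message
    const message = error instanceof Error ? error.message : String(error)
    req.payload.logger.error(`[Job] Failed: ${message}`)
    throw error // Re-throw so Payload retries the task
  }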
}
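
The handler above falls back to JSON.stringify for rich text, which embeds a lot of structural noise along with the words. As a stopgap until you wire in a proper lexicalToMarkdown utility, the sketch below pulls plain text out of a serialized Lexical value. It is an assumption, not the method used in this guide: lexicalToPlainText is a hypothetical helper, it only relies on the root, children and text fields of the serialized state, and it discards all formatting.

```typescript
// Loose shape of a serialized Lexical node; real nodes carry more fields.
interface LexicalNode {
  type?: string
  text?: string
  children?: LexicalNode[]
}

// Hypothetical helper: collect the text of every node.
// Inline runs are concatenated; nested blocks each get their own line.
function lexicalToPlainText(value: unknown): string {
  if (typeof value === 'string') return value

  const root = (value as { root?: LexicalNode } | null)?.root
  if (!root) return ''

  const collect = (node: LexicalNode): string => {
    if (typeof node.text === 'string') return node.text
    const children = node.children ?? []
    const parts = children.map(collect).filter(Boolean)
    // A container holding text nodes is an inline run (paragraph, list item).
    const isInline = children.some((child) => typeof child.text === 'string')
    return parts.join(isInline ? '' : '\n')
  }

  return collect(root).trim()
}
```

In the handler, the text passed to embedDocument could then be lexicalToPlainText(content) instead of the JSON.stringify fallback.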

4. Register the Job

Tell Payload about the job.

File: payload.config.ts

import { buildConfig } from 'payload'
import { vectorUpsertHandler } from '@/payload/jobs/vector/upsert'

export default buildConfig({
  // ...
  jobs: {
    // Determine who can manually trigger jobs via API (if needed)
    access: { run: ({ req }) => !!req.user }, 
    tasks: [
      {
        slug: 'vector-upsert',
        handler: vectorUpsertHandler,
        retries: 3,
      },
    ],
  },
})

5. The Trigger: Collection Hook

Attach a hook to your collections to verify the publishing state and dispatch the job.

File: src/payload/hooks/syncToVectorStore.ts

import type { CollectionAfterChangeHook } from 'payload'

export const syncToVectorStoreAfterChange: CollectionAfterChangeHook = async ({
  doc,
  req,
  collection,
}) => {
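  // Skip drafts. Collections without drafts enabled have no _status and are always synced,
  // which matches the check inside the job handler.
  if (doc._status && doc._status !== 'published') return doc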

  await req.payload.jobs.queue({
    task: 'vector-upsert',
    input: {
      docId: String(doc.id), // IDs can be numbers on SQL adapters; the task input expects a string
      collection: collection.slug,
    },
  })

  return doc
}

CRITICAL STEP: Attach to Collection

You must add this hook to every collection you want indexed!

File: src/collections/Posts.ts

import type { CollectionConfig } from 'payload'
import { syncToVectorStoreAfterChange } from '@/payload/hooks/syncToVectorStore'

export const Posts: CollectionConfig = {
  slug: 'posts',
  hooks: {
    afterChange: [syncToVectorStoreAfterChange], // <--- Add this!
  },
  // ...
}

6. Running the Jobs

Defining the job isn't enough; something needs to run it.

Local Development

In a separate terminal window, run:

npx payload jobs:run

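This processes the jobs currently waiting in the payload-jobs collection and then exits. To keep a worker polling during development, pass a cron expression, for example npx payload jobs:run --cron "*/5 * * * *".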

Production (Vercel/Serverless)

Since you don't have a long-running server, use Vercel Cron or an external cron service to poke Payload's job endpoint.

  1. Enable Vercel Cron.
  2. Payload automatically configures the endpoint at /api/payload-jobs/run.
  3. Ensure your vercel.json calls this endpoint periodically.
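
One thing to watch for: the run access function from section 4 only lets logged-in users through, and a cron request has no user. When a CRON_SECRET environment variable is set on the Vercel project, Vercel Cron sends it as an Authorization: Bearer header, so the access check can accept that as well. The sketch below isolates that decision in a pure function; canRunJobs is a hypothetical name, and CRON_SECRET is something you have to define yourself.

```typescript
// Hypothetical helper: decide whether a request may run queued jobs.
function canRunJobs(
  hasUser: boolean,
  authorizationHeader: string | null,
  cronSecret: string | undefined,
): boolean {
  // Logged-in users keep the access they had before.
  if (hasUser) return true
  // Without a configured secret, never accept anonymous requests.
  if (!cronSecret) return false
  // Vercel Cron sends "Authorization: Bearer <CRON_SECRET>".
  return authorizationHeader === `Bearer ${cronSecret}`
}
```

In payload.config.ts the run function could then call canRunJobs(!!req.user, req.headers.get('authorization'), process.env.CRON_SECRET) instead of checking req.user alone.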

Summary

  1. Dependencies: Installed @upstash/vector & openai.
  2. Config: Created Index (1024 dims) & .env.
  3. Code: Added client, embedding, operations, and upsert job handler.
  4. Registration: Registered Job in payload.config.ts.
  5. Trigger: Added hook to Posts collection.
  6. Runner: Started npx payload jobs:run.

Now, when you publish a post, Payload queues the task, your worker picks it up, and your Vector Store stays perfectly in sync, all without making users wait.
