How to Seed Payload CMS with CSV Files: A Complete Guide
Replace hardcoded seed data with maintainable CSV files for better content management

I was building a client project with Payload CMS when I hit a familiar frustration: managing seed data scattered across multiple JavaScript files. Every time content needed updating, I found myself digging through hardcoded objects, making changes, and hoping I didn't break anything. After implementing a CSV-based seeding system, I discovered how much cleaner and more maintainable this approach could be.
This guide shows you exactly how to build a comprehensive CSV seeding system for Payload CMS that handles everything from simple text fields to complex relationships and nested data structures.
The Problem with Hardcoded Seed Data
Traditional Payload seeding typically looks like this:
// File: src/lib/payload/seed/collections/testimonials.ts
const testimonials = [
  {
    name: "Jane Doe",
    content: "Great service!",
    rating: 5,
    // ... more fields
  },
  // ... more objects
]
This approach becomes unwieldy quickly. Content updates require code changes, non-technical team members can't contribute, and managing relationships between collections becomes a nightmare.
Building the CSV Seeding Foundation
Let's start by creating the core infrastructure. First, we need a CSV reader utility that can parse our data files consistently.
// File: src/lib/payload/seed/csvReader.ts
import Papa from 'papaparse'
import fs from 'fs'
import path from 'path'

export async function readCsvFile(filePath: string): Promise<Array<Record<string, any>>> {
  try {
    const fullPath = path.resolve(filePath)
    const csvData = fs.readFileSync(fullPath, 'utf8')
    const result = Papa.parse<Record<string, any>>(csvData, {
      header: true,
      skipEmptyLines: true,
      transformHeader: (header) => header.trim(),
      transform: (value) => {
        const trimmed = value.trim()
        // Handle boolean conversion
        if (trimmed === 'true') return true
        if (trimmed === 'false') return false
        // Handle empty values
        if (trimmed === '') return undefined
        return trimmed
      },
    })
    if (result.errors.length > 0) {
      console.warn('CSV parsing warnings:', result.errors)
    }
    return result.data
  } catch (error) {
    console.error(`Error reading CSV file ${filePath}:`, error)
    throw error
  }
}
This utility handles the common CSV parsing challenges you'll encounter: trimming whitespace, converting boolean strings, and managing empty values. The transformHeader function ensures consistent column naming even if your CSV has extra spaces.
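To sanity-check the reader in isolation, you can run a quick script like this (the file name and path are my own; the CSV is the one we create below). Note that numeric values stay strings at this stage; only booleans and empty cells are converted:
// File: src/lib/payload/seed/checkCsv.ts (hypothetical smoke test)
import { readCsvFile } from './csvReader'

async function main() {
  const rows = await readCsvFile('src/lib/payload/seed/csv-data/testimonials.csv')
  console.log(`Parsed ${rows.length} rows`)
  console.log(rows[0]) // e.g. { csv_id: 'testimonial_jane', name: 'Jane Doe', rating: '5', ... }
}

main().catch(console.error)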
Next, let's set up our directory structure for organizing CSV data:
mkdir -p src/lib/payload/seed/csv-data
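Once the collections from this guide are in place, the seed directory will look roughly like this:
src/lib/payload/seed/
├── csvReader.ts
├── index.ts
├── csv-data/
│   ├── testimonials.csv
│   ├── faq-items.csv
│   ├── services.csv
│   ├── machinery.csv
│   └── projects.csv
└── collections/
    ├── testimonials.ts
    ├── faq-items.ts
    ├── services.ts
    ├── machinery.ts
    └── projects.ts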
Phase 1: Simple Collections
We'll start with the simplest case - flat data structures with basic field types. Let's implement testimonials seeding.
Create your first CSV file:
// File: src/lib/payload/seed/csv-data/testimonials.csv
csv_id,name,testimonialDate,source,location,service,content,rating
testimonial_jane,"Jane Doe","2023-10-26","google","Ljubljana","Bathroom renovation","Amazing work, highly recommend!",5
testimonial_john,"John Smith","2023-11-15","website","Maribor","Plumbing","Quick and professional service.",4
testimonial_maja,"Maja Novak",,"manual",,"Consultation","Very helpful advice.",5
Now implement the seeding function:
// File: src/lib/payload/seed/collections/testimonials.ts
import { Payload } from 'payload'
import { readCsvFile } from '../csvReader'
import path from 'path'

export async function seedTestimonials(payload: Payload): Promise<any[]> {
  const csvPath = path.join(process.cwd(), 'src/lib/payload/seed/csv-data/testimonials.csv')
  const csvData = await readCsvFile(csvPath)
  const createdTestimonials = []
  for (const row of csvData) {
    try {
      const testimonial = await payload.create({
        collection: 'testimonials',
        data: {
          name: row.name,
          testimonialDate: row.testimonialDate ? new Date(row.testimonialDate) : undefined,
          source: row.source,
          location: row.location,
          service: row.service,
          content: row.content,
          rating: row.rating ? parseInt(row.rating, 10) : undefined,
          // Store csv_id so collections seeded later can resolve relationships
          // (requires a csv_id text field on the collection; see Phase 3)
          csv_id: row.csv_id,
        },
      })
      createdTestimonials.push(testimonial)
      console.log(`Created testimonial: ${testimonial.name}`)
    } catch (error) {
      console.error(`Error creating testimonial from row:`, row, error)
    }
  }
  return createdTestimonials
}
This function demonstrates the core pattern: read CSV data, iterate through rows, map fields to Payload's expected structure, and handle type conversions. Notice how we convert the date string to a Date object and parse the rating as an integer.
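If you find yourself repeating these conversions across collections, a pair of small helpers keeps the mapping code terse. This is a sketch; the file and function names are my own:
// File: src/lib/payload/seed/convert.ts (hypothetical helpers)
export function toInt(value: string | undefined): number | undefined {
  if (value === undefined || value === '') return undefined
  const parsed = parseInt(value, 10)
  return Number.isNaN(parsed) ? undefined : parsed
}

export function toDate(value: string | undefined): Date | undefined {
  if (!value) return undefined
  const parsed = new Date(value)
  return Number.isNaN(parsed.getTime()) ? undefined : parsed
}
With these, the mapping above becomes `rating: toInt(row.rating)` and `testimonialDate: toDate(row.testimonialDate)`.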
Let's add one more simple collection to reinforce the pattern - FAQ items:
// File: src/lib/payload/seed/csv-data/faq-items.csv
csv_id,question,category,answer_html
faq_1,"What are your working hours?","general","<p>Our regular working hours are Monday to Friday, 8:00 to 16:00. For urgent interventions, we are available outside working hours as well.</p>"
faq_2,"In which area do you provide services?","general","<p>We provide services mainly in central Slovenia, including Ljubljana and surroundings, Domžale, Kamnik and Kranj.</p>"
// File: src/lib/payload/seed/collections/faq-items.ts
import { Payload } from 'payload'
import { readCsvFile } from '../csvReader'
import path from 'path'

export async function seedFaqItems(payload: Payload): Promise<any[]> {
  const csvPath = path.join(process.cwd(), 'src/lib/payload/seed/csv-data/faq-items.csv')
  const csvData = await readCsvFile(csvPath)
  const createdFaqs = []
  for (const row of csvData) {
    try {
      const faqItem = await payload.create({
        collection: 'faqItems',
        data: {
          question: row.question,
          category: row.category,
          answer: {
            root: {
              children: [
                {
                  children: [
                    {
                      detail: 0,
                      format: 0,
                      mode: "normal",
                      style: "",
                      text: row.answer_html?.replace(/<[^>]*>/g, '') || '',
                      type: "text",
                      version: 1,
                    },
                  ],
                  direction: "ltr",
                  format: "",
                  indent: 0,
                  type: "paragraph",
                  version: 1,
                },
              ],
              direction: "ltr",
              format: "",
              indent: 0,
              type: "root",
              version: 1,
            },
          },
        },
      })
      createdFaqs.push(faqItem)
      console.log(`Created FAQ: ${faqItem.question}`)
    } catch (error) {
      console.error(`Error creating FAQ from row:`, row, error)
    }
  }
  return createdFaqs
}
The FAQ implementation shows how to handle Payload's richText fields. For simplicity, we're converting HTML to plain text and wrapping it in Payload's expected lexical structure. This creates a basic paragraph with the content stripped of HTML tags.
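Since the projects collection later needs the same conversion, you could extract it into a shared helper. The sketch below is a direct lift of the structure above and still only produces a single plain-text paragraph; if you need faithful HTML conversion, check whether your Payload version's lexical rich text package ships an HTML-to-lexical converter before rolling your own:
// File: src/lib/payload/seed/htmlToLexical.ts (hypothetical helper)
// Strips tags and wraps the remaining text in a single lexical paragraph.
export function htmlToLexicalParagraph(html: string | undefined) {
  return {
    root: {
      children: [
        {
          children: [
            {
              detail: 0,
              format: 0,
              mode: 'normal',
              style: '',
              text: html?.replace(/<[^>]*>/g, '') || '',
              type: 'text',
              version: 1,
            },
          ],
          direction: 'ltr',
          format: '',
          indent: 0,
          type: 'paragraph',
          version: 1,
        },
      ],
      direction: 'ltr',
      format: '',
      indent: 0,
      type: 'root',
      version: 1,
    },
  }
}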
Phase 2: Complex Data with JSON Arrays
Now we'll tackle collections that require more sophisticated data structures. Services with feature arrays are a perfect example:
// File: src/lib/payload/seed/csv-data/services.csv
csv_id,title,description,priceDisplay,features_json
service_plumbing,"Plumbing","We provide comprehensive solutions for plumbing installations, from planning to implementation and maintenance.","By agreement","[{""featureText"":""New buildings""},{""featureText"":""Renovations""},{""featureText"":""Repairs""}]"
service_installation,"Sanitary Equipment Installation","Professional installation of shower cabins, bathtubs, toilets, sinks and other sanitary equipment.","From €150 onwards","[{""featureText"":""Installation""},{""featureText"":""Connection""},{""featureText"":""Consultation""}]"
// File: src/lib/payload/seed/collections/services.ts
import { Payload } from 'payload'
import { readCsvFile } from '../csvReader'
import path from 'path'

export async function seedServices(payload: Payload): Promise<any[]> {
  const csvPath = path.join(process.cwd(), 'src/lib/payload/seed/csv-data/services.csv')
  const csvData = await readCsvFile(csvPath)
  const createdServices = []
  for (const row of csvData) {
    try {
      // Parse JSON features
      let features = []
      if (row.features_json) {
        try {
          features = JSON.parse(row.features_json)
        } catch (jsonError) {
          console.warn(`Invalid JSON in features for ${row.title}:`, jsonError)
          features = []
        }
      }
      const service = await payload.create({
        collection: 'services',
        data: {
          title: row.title,
          description: row.description,
          priceDisplay: row.priceDisplay,
          features: features,
          // Store csv_id so projects can resolve relatedServices later
          csv_id: row.csv_id,
        },
      })
      createdServices.push(service)
      console.log(`Created service: ${service.title}`)
    } catch (error) {
      console.error(`Error creating service from row:`, row, error)
    }
  }
  return createdServices
}
The key insight here is using JSON strings within CSV cells for complex data structures. We parse the features_json column into an actual JavaScript array before passing it to Payload. This approach scales to any level of complexity while keeping the CSV format manageable.
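This parse-with-fallback pattern recurs in every collection that uses JSON columns, so it's worth extracting into a tiny helper. A sketch, with names of my own choosing:
// File: src/lib/payload/seed/parseJsonColumn.ts (hypothetical helper)
export function parseJsonColumn<T>(raw: string | undefined, fallback: T, label: string): T {
  if (!raw) return fallback
  try {
    return JSON.parse(raw) as T
  } catch (jsonError) {
    console.warn(`Invalid JSON in ${label}:`, jsonError)
    return fallback
  }
}

// Usage: const features = parseJsonColumn(row.features_json, [], `features for ${row.title}`)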
For even more complex nested structures, like machinery specifications:
// File: src/lib/payload/seed/csv-data/machinery.csv
csv_id,tabName,name,description,notes,specifications_json
machine_excavator,"Excavators","Volvo EL70","Light excavator for smaller excavations.","Suitable for urban construction sites.","[{""specName"":""Dimensions"",""specDetails"":[{""detail"":""Length: 5.4m""},{""detail"":""Width: 2.1m""}]},{""specName"":""Weight"",""specDetails"":[{""detail"":""7 tons""}]}]"
// File: src/lib/payload/seed/collections/machinery.ts
import { Payload } from 'payload'
import { readCsvFile } from '../csvReader'
import path from 'path'

export async function seedMachinery(payload: Payload): Promise<any[]> {
  const csvPath = path.join(process.cwd(), 'src/lib/payload/seed/csv-data/machinery.csv')
  const csvData = await readCsvFile(csvPath)
  const createdMachinery = []
  for (const row of csvData) {
    try {
      let specifications = []
      if (row.specifications_json) {
        try {
          specifications = JSON.parse(row.specifications_json)
        } catch (jsonError) {
          console.warn(`Invalid JSON in specifications for ${row.name}:`, jsonError)
          specifications = []
        }
      }
      const machine = await payload.create({
        collection: 'machinery',
        data: {
          tabName: row.tabName,
          name: row.name,
          description: row.description,
          notes: row.notes,
          specifications: specifications,
        },
      })
      createdMachinery.push(machine)
      console.log(`Created machine: ${machine.name}`)
    } catch (error) {
      console.error(`Error creating machine from row:`, row, error)
    }
  }
  return createdMachinery
}
This demonstrates handling deeply nested JSON structures within CSV files. The specifications field contains an array of objects, where each object has its own array of details. By using JSON strings, we maintain the full data structure while keeping it manageable in spreadsheet software.
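For reference, the parsed specifications_json column corresponds to this shape (type names are my own, matching the machinery array field from the CSV above):
// Shape of the parsed specifications_json column (illustrative types)
interface SpecDetail {
  detail: string // e.g. "Length: 5.4m"
}

interface Specification {
  specName: string // e.g. "Dimensions"
  specDetails: SpecDetail[]
}

// JSON.parse(row.specifications_json) yields Specification[]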
Phase 3: Collection Relationships
The most complex scenario involves relationships between collections. Let's implement projects that reference both services and testimonials:
// File: src/lib/payload/seed/csv-data/projects.csv
csv_id,title,description_html,projectStatus,location,metadata_json,tags_json,service_ids,testimonial_ids,project_type
project_renovation,"Novak Bathroom Renovation","<p>Complete bathroom renovation in the Novak family apartment.</p>","completed","Ljubljana","{""startDate"":""2023-09-01"",""completionDate"":""2023-10-15"",""client"":""Novak Family"",""budget"":""10000 EUR""}","[{""tag"":""Renovation""},{""tag"":""Bathroom""}]","service_plumbing","testimonial_jane","renovation"
project_newbuild,"Podlipnik House New Construction","<p>Implementation of all plumbing installations in newly built single-family house.</p>","completed","Domžale","{""completionDate"":""2024-01-20"",""client"":""Mr. Podlipnik""}","[{""tag"":""New Construction""},{""tag"":""House""}]","service_plumbing","","newbuild"
// File: src/lib/payload/seed/collections/projects.ts
import { Payload } from 'payload'
import { readCsvFile } from '../csvReader'
import path from 'path'

// Helper function to look up documents by CSV ID
async function lookupDocumentsByCsvIds(
  payload: Payload,
  collection: string,
  csvIds: string[]
): Promise<string[]> {
  if (!csvIds.length) return []
  const results: string[] = []
  for (const csvId of csvIds) {
    try {
      const docs = await payload.find({
        collection,
        where: {
          // Assumes you store csv_id in your documents for lookup
          csv_id: { equals: csvId },
        },
        limit: 1,
      })
      if (docs.docs.length > 0) {
        results.push(docs.docs[0].id)
      }
    } catch (error) {
      console.warn(`Could not find ${collection} with csv_id ${csvId}`)
    }
  }
  return results
}

export async function seedProjects(payload: Payload): Promise<any[]> {
  const csvPath = path.join(process.cwd(), 'src/lib/payload/seed/csv-data/projects.csv')
  const csvData = await readCsvFile(csvPath)
  const createdProjects = []
  for (const row of csvData) {
    try {
      // Parse JSON fields
      let metadata = {}
      let tags = []
      if (row.metadata_json) {
        try {
          metadata = JSON.parse(row.metadata_json)
        } catch (jsonError) {
          console.warn(`Invalid JSON in metadata for ${row.title}:`, jsonError)
        }
      }
      if (row.tags_json) {
        try {
          tags = JSON.parse(row.tags_json)
        } catch (jsonError) {
          console.warn(`Invalid JSON in tags for ${row.title}:`, jsonError)
          tags = []
        }
      }
      // Handle relationships (trim each ID in case the CSV uses "a, b" spacing)
      const serviceIds = row.service_ids
        ? await lookupDocumentsByCsvIds(
            payload,
            'services',
            row.service_ids.split(',').map((id: string) => id.trim())
          )
        : []
      const testimonialIds = row.testimonial_ids
        ? await lookupDocumentsByCsvIds(
            payload,
            'testimonials',
            row.testimonial_ids.split(',').map((id: string) => id.trim())
          )
        : []
      // Convert HTML description to richText (simplified)
      const description = {
        root: {
          children: [
            {
              children: [
                {
                  detail: 0,
                  format: 0,
                  mode: "normal",
                  style: "",
                  text: row.description_html?.replace(/<[^>]*>/g, '') || '',
                  type: "text",
                  version: 1,
                },
              ],
              direction: "ltr",
              format: "",
              indent: 0,
              type: "paragraph",
              version: 1,
            },
          ],
          direction: "ltr",
          format: "",
          indent: 0,
          type: "root",
          version: 1,
        },
      }
      const project = await payload.create({
        collection: 'projects',
        data: {
          title: row.title,
          description: description,
          projectStatus: row.projectStatus,
          location: row.location,
          metadata: metadata,
          tags: tags,
          relatedServices: serviceIds,
          relatedTestimonials: testimonialIds,
          // Store csv_id for future lookups
          csv_id: row.csv_id,
        },
      })
      createdProjects.push(project)
      console.log(`Created project: ${project.title}`)
    } catch (error) {
      console.error(`Error creating project from row:`, row, error)
    }
  }
  return createdProjects
}
The relationship handling here introduces a lookup system. We use CSV IDs to reference documents across collections, then resolve them to actual Payload document IDs. This approach maintains referential integrity while keeping the CSV format readable.
The lookupDocumentsByCsvIds helper function demonstrates how to find previously seeded documents. It assumes the original csv_id is stored in your documents, which is why the earlier seeding functions save it alongside the content and why each participating collection's schema needs a matching field.
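In practice that means adding a csv_id field to every collection that takes part in a relationship. A minimal, reusable field definition might look like this (a sketch; in Payload 3.x the Field type is exported from 'payload', in 2.x from 'payload/types'):
// Sketch: a reusable csv_id field for seeded collections
import type { Field } from 'payload'

export const csvIdField: Field = {
  name: 'csv_id',
  type: 'text',
  index: true, // speeds up the where: { csv_id: { equals: ... } } lookups
  admin: {
    readOnly: true, // editors shouldn't change seed identifiers
    description: 'Identifier from the seed CSV, used to resolve relationships',
  },
}
Drop csvIdField into the fields array of each seeded collection config.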
Orchestrating the Complete Seeding Process
Finally, let's tie everything together in a main seeding function:
// File: src/lib/payload/seed/index.ts
import { Payload } from 'payload'
import { seedTestimonials } from './collections/testimonials'
import { seedFaqItems } from './collections/faq-items'
import { seedServices } from './collections/services'
import { seedMachinery } from './collections/machinery'
import { seedProjects } from './collections/projects'

export async function seedDatabase(payload: Payload): Promise<void> {
  console.log('Starting CSV-based database seeding...')
  try {
    // Phase 1: Simple collections (no dependencies)
    console.log('Phase 1: Seeding simple collections...')
    await seedTestimonials(payload)
    await seedFaqItems(payload)
    // Phase 2: Collections with complex data structures
    console.log('Phase 2: Seeding complex collections...')
    await seedServices(payload)
    await seedMachinery(payload)
    // Phase 3: Collections with relationships (depend on previous collections)
    console.log('Phase 3: Seeding collections with relationships...')
    await seedProjects(payload)
    console.log('Database seeding completed successfully!')
  } catch (error) {
    console.error('Error during database seeding:', error)
    throw error
  }
}
The order matters here. Collections with relationships must be seeded after their dependencies. This orchestration ensures that when we try to look up related services or testimonials, they already exist in the database.
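To actually run the seeder, you need an initialized Payload instance. A standalone runner might look like this on Payload 3.x (the getPayload call and the @payload-config alias come from the standard Payload 3 template; on 2.x you would use payload.init instead):
// File: src/lib/payload/seed/run.ts (hypothetical runner, Payload 3.x style)
import { getPayload } from 'payload'
import config from '@payload-config' // path alias from the standard Payload template
import { seedDatabase } from './index'

async function run() {
  const payload = await getPayload({ config })
  await seedDatabase(payload)
  process.exit(0)
}

run().catch((error) => {
  console.error(error)
  process.exit(1)
})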
Advanced Features and Best Practices
For production use, consider these enhancements:
Error Handling and Validation:
// Add to your seeding functions
function validateRow(row: any, requiredFields: string[]): boolean {
  for (const field of requiredFields) {
    if (!row[field]) {
      console.error(`Missing required field ${field} in row:`, row)
      return false
    }
  }
  return true
}
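You can then skip bad rows at the top of each loop, for example:
// Inside the for...of loop, before calling payload.create
if (!validateRow(row, ['name', 'content'])) continue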
Progress Tracking:
// Add progress indicators for large datasets
console.log(`Processing ${csvData.length} records...`)
for (let i = 0; i < csvData.length; i++) {
  const row = csvData[i]
  // ... processing
  if ((i + 1) % 10 === 0) {
    console.log(`Processed ${i + 1}/${csvData.length} records`)
  }
}
Relationship Caching:
// Cache relationship lookups to improve performance
const relationshipCache = new Map<string, string | null>()

async function cachedLookup(payload: Payload, collection: string, csvId: string): Promise<string | null> {
  const cacheKey = `${collection}:${csvId}`
  const cached = relationshipCache.get(cacheKey)
  if (cached !== undefined) {
    return cached
  }
  // Perform lookup and cache the result (including misses, so we don't re-query)
  const result = await lookupDocumentsByCsvIds(payload, collection, [csvId])
  const id = result.length > 0 ? result[0] : null
  relationshipCache.set(cacheKey, id)
  return id
}
Installation Requirements
Before implementing this system, install the required dependencies:
npm install papaparse @types/papaparse
# or
pnpm add papaparse @types/papaparse
# or
yarn add papaparse @types/papaparse
Conclusion
This CSV-based seeding approach transforms how you manage Payload CMS data. Instead of hunting through JavaScript files, your content lives in organized, version-controlled CSV files that anyone on your team can edit. You've learned how to handle simple fields, complex JSON structures, and cross-collection relationships while maintaining data integrity.
The phased implementation approach—starting with simple collections and progressively adding complexity—ensures you can adopt this system incrementally. Whether you're seeding a small blog or a complex e-commerce platform, this foundation scales to meet your needs.
Your CSV files become a single source of truth for seed data, your seeding process becomes predictable and maintainable, and your team gains the ability to manage content without touching code.
Let me know in the comments if you have questions, and subscribe for more practical development guides.
Thanks, Matija