Payload CMS Logging: Queue-Based Production Best Practices
Design a resilient queue-based logging system in Payload CMS using a dedicated logs collection and Jobs queue.

If you are building anything non-trivial in Payload CMS, logging is one of the first things that looks simple and then starts hurting under real traffic. I ran into this while scaling request and hook-heavy workflows, and the fix was not just code changes. The important part was making the architecture decision first.
This guide is intentionally problem-aware before it is implementation-heavy: why queue-based logging matters in Payload, what goes wrong when you do not do it, and what the architecture should look like before you write a line of code.
Only after that do we implement the queue-based approach using a dedicated logs collection and Payload Jobs.
The Problem Before the Code
Payload makes it easy to write logs directly, but synchronous logging in request paths does not scale well. The symptoms are predictable:
- request blocking: API and hook execution time increases because each log write is in-band
- data loss under pressure: failures or timeouts during incidents can drop the very logs you need
- collection bloat and noise: operational events mixed with domain data become harder to manage and query
In practice, your API response time gets tied to log writes. Hooks feel slower than expected because each error/info payload waits on persistence. During dependency or DB instability, logging itself can fail and create secondary failures exactly when you need telemetry the most.
There is also a schema problem. If logs are mixed into business collections or scattered ad hoc, operational data becomes noisy and hard to query. You lose clean boundaries between domain data and platform observability.
The real issue is architectural: logging is an operational workload, but synchronous logging treats it like inline business logic. That is why this decision needs to be made before implementation details.
Architecture Decision: Synchronous vs Queue-Based Logging
You should explicitly choose between two models.
Synchronous logging is simpler to start with. It is fine for very low traffic or one-off scripts where latency and failure isolation do not matter.
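For contrast, the synchronous pattern is a direct, in-band write through Payload's Local API: the handler cannot respond until the insert completes. A purely illustrative snippet (collection slug and field values mirror the setup built later in this article):

```typescript
// Synchronous logging (the pattern to avoid in production request paths):
// the request blocks until the log row is persisted
await req.payload.create({
  collection: 'logs',
  data: {
    level: 'info',
    source: 'api',
    description: 'Payment processed',
  },
});
```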
Queue-based logging adds one more moving part, but it separates request execution from log persistence. That tradeoff is usually correct for production Payload systems because it gives you reliability boundaries:
- requests and hooks stay fast
- logging failures do not break business workflows
- log persistence can be retried and monitored independently
For a production setup, the target architecture is:
- a dedicated logs collection for structured log data
- a dedicated Jobs queue (logs) for the persistence workload
- a dedicated task (persistLog) that writes queued input to the collection
- a safe enqueue utility used everywhere instead of legacy synchronous calls
Once this decision is made, implementation becomes straightforward. But before jumping to code, lock in a few design rules.
Non-Negotiable Design Rules
If you adopt queue-based logging, enforce these rules from day one.
First, queueing logs must never break business flow. Logging is important, but it should not be allowed to take down order creation, webhook handling, or admin actions.
Second, every log payload must be normalized before enqueue. That includes bounded string lengths, safe JSON serialization, and stable field names.
Third, use one dedicated queue for logs. Mixing logging with unrelated tasks makes priority and capacity planning harder.
Fourth, design retention early. Logs are high-volume by nature. If you do not define retention windows and archival rules up front, cost and query performance will degrade.
Architecture Checklist Before Coding
Use this checklist before implementation starts:
- Have we agreed that synchronous request-path logging is not our production default?
- Do we have a dedicated logs collection schema?
- Do we have a dedicated Jobs queue for logs?
- Do we have a dedicated persistence task contract (input and failure behavior)?
- Do we have fallback behavior if enqueue fails?
- Do we have retention and queue monitoring defined?
If any answer is "no," resolve it before writing utilities. Now let's implement.
Step 1: Define a Dedicated Logs Collection
Create a collection designed for operational records, not business entities.
// File: src/collections/Logs/index.ts
import { superAdminOrTenantAdminAccess } from '@/access/superAdminOrTenantAdmin';
import type { CollectionConfig } from 'payload';

export const Logs: CollectionConfig = {
  slug: 'logs',
  labels: {
    singular: 'Log',
    plural: 'Logs',
  },
  admin: {
    group: 'System & Logs',
    useAsTitle: 'description',
    // Note: the 'tenant' column assumes a tenant field added elsewhere (e.g. by a multi-tenant plugin)
    defaultColumns: ['timestamp', 'level', 'source', 'description', 'tenant'],
    description: 'Store arbitrary JSON data, error information, and metadata for debugging and auditing purposes.',
    components: {
      Description: '/src/components/payload/custom/CollectionDescription',
    },
    hidden: false,
  },
  access: {
    read: superAdminOrTenantAdminAccess,
    create: superAdminOrTenantAdminAccess,
    update: superAdminOrTenantAdminAccess,
    delete: superAdminOrTenantAdminAccess,
  },
  fields: [
    {
      name: 'timestamp',
      type: 'date',
      label: 'Timestamp',
      required: true,
      defaultValue: () => new Date().toISOString(),
      admin: {
        description: 'When the log entry was created (auto-set)',
        readOnly: true,
      },
    },
    {
      name: 'level',
      type: 'select',
      label: 'Log Level',
      required: true,
      defaultValue: 'info',
      options: [
        { label: 'Debug', value: 'debug' },
        { label: 'Info', value: 'info' },
        { label: 'Warning', value: 'warning' },
        { label: 'Error', value: 'error' },
      ],
      admin: {
        description: 'Severity level of the log entry',
      },
    },
    {
      name: 'source',
      type: 'select',
      label: 'Source',
      required: true,
      defaultValue: 'manual',
      options: [
        { label: 'Webhook', value: 'webhook' },
        { label: 'Hook', value: 'hook' },
        { label: 'API', value: 'api' },
        { label: 'Migration', value: 'migration' },
        { label: 'Manual', value: 'manual' },
      ],
      admin: {
        description: 'Where the log entry originated from',
      },
    },
    {
      name: 'description',
      type: 'textarea',
      label: 'Description',
      admin: {
        description: 'Optional human-readable summary of the log entry',
      },
    },
    {
      name: 'data',
      type: 'json',
      label: 'Data',
      admin: {
        description: 'Arbitrary JSON data dump for debugging (no schema validation)',
      },
    },
    {
      name: 'errorMessage',
      type: 'text',
      label: 'Error Message',
      admin: {
        description: 'Optional error message if this is an error log',
      },
    },
    {
      name: 'errorLocation',
      type: 'text',
      label: 'Error Location',
      admin: {
        description: 'Optional location where error occurred (file:line format)',
      },
    },
  ],
};
This gives you stable, queryable structure for operational events while keeping logs decoupled from business tables. It also establishes a clear contract for what the queue task should persist.
Step 2: Route Log Writes Through the Jobs Queue
Now implement queue-first logging utilities and stop writing logs inline in request/hook paths.
// File: src/utilities/createLog.ts
import { getPayloadClient } from '@/lib/payloadClient';
import type { Payload, PayloadRequest } from 'payload';

export interface LogEntry {
  level: 'debug' | 'info' | 'warning' | 'error';
  source: 'webhook' | 'hook' | 'api' | 'migration' | 'manual';
  description?: string;
  data?: Record<string, any>;
  errorMessage?: string;
  errorLocation?: string;
  tenant: number; // Tenant ID only
}

const LOG_PERSIST_TASK_SLUG = 'persistLog';
const LOG_QUEUE_NAME = 'logs';

function fallbackConsoleLog(logEntry: Partial<LogEntry>, reason: string): void {
  try {
    const timestamp = new Date().toISOString();
    const level = logEntry?.level ?? 'info';
    const source = logEntry?.source ?? 'unknown';
    const description = logEntry?.description ?? 'No description';
    const tenant = logEntry?.tenant ?? 'unknown';
    console.log(
      `[Logging Fallback] [${timestamp}] [${level.toUpperCase()}] [${source}] tenant=${tenant} | ${description} | Reason: ${reason}`
    );
    if (logEntry?.data) {
      try {
        const dataStr = JSON.stringify(logEntry.data);
        const truncated = dataStr.length > 1000 ? dataStr.slice(0, 1000) + '...[truncated]' : dataStr;
        console.log(`[Logging Fallback] Data: ${truncated}`);
      } catch {
        console.log('[Logging Fallback] Data: [unable to serialize]');
      }
    }
  } catch {
    // Do nothing as absolute last resort
  }
}

function serializeForStorage(obj: any): any {
  if (!obj) return obj;
  try {
    const seen = new WeakSet();
    const serialized = JSON.parse(
      JSON.stringify(obj, (key, value) => {
        if (typeof value === 'object' && value !== null) {
          if (seen.has(value)) return '[Circular Reference]';
          seen.add(value);
        }
        if (value instanceof Error) {
          return {
            message: value.message,
            stack: value.stack,
            name: value.name,
          };
        }
        if (value instanceof Date) return value.toISOString();
        return value;
      })
    );
    return serialized;
  } catch (e) {
    console.warn('[Logging] Failed to serialize object, returning safe representation:', e);
    return {
      _serializationError: 'Failed to serialize full object',
      _type: typeof obj,
      _keys: Array.isArray(obj) ? `[${obj.length} items]` : Object.keys(obj || {}).slice(0, 10),
    };
  }
}

function buildSafeLogData(logEntry: LogEntry, safeTenant: number) {
  let serializedData: any;
  try {
    serializedData = logEntry.data ? serializeForStorage(logEntry.data) : undefined;
  } catch {
    serializedData = { _error: 'Failed to serialize data' };
  }
  return {
    timestamp: new Date().toISOString(),
    level: (logEntry.level || 'info') as 'debug' | 'info' | 'warning' | 'error',
    source: (logEntry.source || 'api') as 'webhook' | 'hook' | 'api' | 'migration' | 'manual',
    description: String(logEntry.description || '').slice(0, 10000),
    data: serializedData,
    errorMessage: logEntry.errorMessage ? String(logEntry.errorMessage).slice(0, 5000) : undefined,
    errorLocation: logEntry.errorLocation ? String(logEntry.errorLocation).slice(0, 1000) : undefined,
    tenant: safeTenant,
  };
}

export interface QueueLogResult {
  queued: boolean;
}

export type QueueHookLogEntry = Omit<LogEntry, 'source' | 'tenant'>;

export async function queueLog(
  req: PayloadRequest | undefined,
  logEntry: LogEntry
): Promise<QueueLogResult> {
  try {
    if (!logEntry || typeof logEntry !== 'object') {
      fallbackConsoleLog({}, 'Invalid logEntry provided for queueing');
      return { queued: false };
    }
    const safeTenant = typeof logEntry.tenant === 'number' && !isNaN(logEntry.tenant)
      ? logEntry.tenant
      : 1;
    let payload: Payload | null = null;
    try {
      if (req?.payload && typeof req.payload.create === 'function') {
        payload = req.payload;
      } else {
        payload = await getPayloadClient();
      }
    } catch (clientErr) {
      fallbackConsoleLog(
        { ...logEntry, tenant: safeTenant },
        `Failed to get Payload client for queueing: ${clientErr instanceof Error ? clientErr.message : String(clientErr)}`
      );
      return { queued: false };
    }
    if (!payload || !payload.jobs || typeof payload.jobs.queue !== 'function') {
      fallbackConsoleLog(
        { ...logEntry, tenant: safeTenant },
        'No valid payload jobs queue available'
      );
      return { queued: false };
    }
    const queueInput = buildSafeLogData({ ...logEntry, tenant: safeTenant }, safeTenant);
    await payload.jobs.queue({
      task: LOG_PERSIST_TASK_SLUG,
      input: queueInput,
      queue: LOG_QUEUE_NAME,
    });
    return { queued: true };
  } catch (err) {
    const errorMsg = err instanceof Error ? err.message : String(err);
    fallbackConsoleLog(logEntry || {}, `Failed to queue log entry: ${errorMsg}`);
    return { queued: false };
  }
}

export async function queueLogHook(
  req: PayloadRequest,
  entry: QueueHookLogEntry
): Promise<void> {
  const result = await queueLog(req, {
    ...entry,
    source: 'hook',
    tenant: 1,
  });
  if (!result.queued) {
    req.payload.logger.error({
      msg: 'Failed to queue hook log',
      description: entry.description,
      errorLocation: entry.errorLocation,
    });
  }
}
This code does four production-critical things. It normalizes input into a stable log schema, serializes unsafe payload data defensively, enqueues logs into a dedicated queue, and never lets logging failures crash the main flow. That is the practical reliability improvement over legacy synchronous logging.
With this in place, every API route and hook can call queueLog or queueLogHook and stay non-blocking.
Step 3: Use Queue Logging in Request/Hook Paths
Keep usage simple and consistent so teams do not drift back to inline writes.
// File: src/app/api/example/route.ts
import { queueLog } from '@/utilities/createLog';

// Inside your route handler:
await queueLog(req, {
  level: 'info',
  source: 'api',
  description: 'Order webhook processed',
  data: { orderId: 12345 },
  tenant: 1,
});

// File: src/collections/Orders/hooks/example.ts
import { queueLogHook } from '@/utilities/createLog';

// Inside an afterChange hook:
await queueLogHook(req, {
  level: 'error',
  description: 'afterChange failed to enqueue email task',
  errorLocation: 'orders/afterChange.ts:42',
  data: { orderId: doc.id },
});
At this point, your implementation is aligned with the architecture decision: operational telemetry is off the hot path and persisted asynchronously.
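One piece the steps above reference but do not show is the persistLog task itself. A minimal registration sketch, assuming Payload 3's Jobs Queue configuration; the input schema mirrors the Logs collection fields, and the retry count and autoRun schedule are illustrative values to tune for your traffic:

```typescript
// File: payload.config.ts (jobs section only, illustrative)
import { buildConfig } from 'payload';

export default buildConfig({
  // ...collections, db adapter, secret, etc.
  jobs: {
    tasks: [
      {
        slug: 'persistLog', // must match LOG_PERSIST_TASK_SLUG in createLog.ts
        retries: 2, // illustrative: retry transient DB failures before marking the job failed
        inputSchema: [
          { name: 'timestamp', type: 'date', required: true },
          { name: 'level', type: 'text', required: true },
          { name: 'source', type: 'text', required: true },
          { name: 'description', type: 'textarea' },
          { name: 'data', type: 'json' },
          { name: 'errorMessage', type: 'text' },
          { name: 'errorLocation', type: 'text' },
          { name: 'tenant', type: 'number' },
        ],
        handler: async ({ input, req }) => {
          // Persist the normalized queue input into the dedicated logs collection
          await req.payload.create({
            collection: 'logs',
            data: input,
          });
          return { output: {} };
        },
      },
    ],
    // Drain the dedicated queue on a schedule; tune cron/limit to your volume
    autoRun: [{ cron: '* * * * *', queue: 'logs', limit: 100 }],
  },
});
```

With this registered, every job queued by queueLog is picked up from the logs queue and written to the collection independently of the original request.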
Operational Considerations After Deploy
A queue-based design solves request-path strain, but production quality depends on operations.
First, define retention. Logs are operational data, so set policy by value and cost instead of keeping everything forever. If a class of logs has no debugging or audit value after a time window, expire or archive it.
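A retention job can be a simple scheduled cleanup. A sketch, assuming the same getPayloadClient helper used in createLog.ts and Payload's Local API bulk delete by where; the 30-day window is an illustrative policy, not a recommendation:

```typescript
// File: src/tasks/pruneLogs.ts (illustrative)
import { getPayloadClient } from '@/lib/payloadClient';

const RETENTION_DAYS = 30; // illustrative: set per log class and audit requirements

export async function pruneOldLogs(): Promise<void> {
  const payload = await getPayloadClient();
  const cutoff = new Date(Date.now() - RETENTION_DAYS * 24 * 60 * 60 * 1000).toISOString();
  // Bulk delete everything older than the cutoff
  await payload.delete({
    collection: 'logs',
    where: { timestamp: { less_than: cutoff } },
  });
}
```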
Second, monitor the queue itself. Your logging system is now a pipeline, so backlog depth and processing latency are the key health signals. If queue depth grows faster than workers drain it, your "working" logging setup is already degraded.
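A minimal depth check can query the internal collection Payload creates for the Jobs queue. The collection slug and field names here (payload-jobs, queue, completedAt) follow Payload's jobs schema as I understand it, so verify them against your version before relying on this:

```typescript
// File: src/utilities/logsQueueDepth.ts (illustrative)
import { getPayloadClient } from '@/lib/payloadClient';

// Number of queued-but-unprocessed jobs in the logs queue.
// Alert when this grows faster than workers drain it.
export async function getLogsQueueDepth(): Promise<number> {
  const payload = await getPayloadClient();
  const { totalDocs } = await payload.count({
    collection: 'payload-jobs', // internal collection backing the Jobs queue
    where: {
      queue: { equals: 'logs' },
      completedAt: { exists: false },
    },
  });
  return totalDocs;
}
```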
Third, plan for backlog behavior. During incident spikes, queue delay will increase. That is expected. What matters is that business requests still succeed and logs eventually persist once pressure drops. This is exactly why queue isolation is worth the extra moving part.
Conclusion
The problem was never just "how to write a log in Payload." The real production problem was coupling log persistence to request execution. The solution is a design choice first: dedicated logs collection plus a dedicated Jobs queue and task for asynchronous persistence.
With this architecture, you can keep request paths fast, preserve logs through transient failures, and operate logging as a system instead of a helper function.
Let me know in the comments if you have questions, and subscribe for more practical development guides.
Thanks, Matija