We hit the same issue: full and incremental schedules competing for the pgBackRest lock, failed PerconaPGBackup objects piling up, and eventually all scheduled backups stopping. This persists beyond v2.4.x; similar behavior is reported in "Can't start backup. Previous backup is still in progress", where a stuck PerconaPGBackup in "Starting" state blocks all future backups.
Our workaround was to bypass the operator’s native scheduling entirely and use external Kubernetes CronJobs that kubectl exec into the repo-host pod. The key difference: the script checks if a backup
is already running before attempting one, so it never creates a conflicting backup that would generate a stuck PerconaPGBackup object.
Here’s the core logic (same for both full and incremental, just change --type=full to --type=incr):
```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: pgbackrest-full-weekly
  namespace: postgres-operator
spec:
  schedule: "30 2 * * 0"
  concurrencyPolicy: Forbid
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 3
  jobTemplate:
    spec:
      ttlSecondsAfterFinished: 86400
      template:
        spec:
          serviceAccountName: pgbackrest-cronjob-sa
          containers:
            - name: backup-trigger
              image: bitnami/kubectl:latest
              command:
                - /bin/sh
                - -c
                - |
                  set -e
                  # Find the repo-host pod
                  REPO_HOST=$(kubectl get pod -n postgres-operator \
                    -l postgres-operator.crunchydata.com/data=pgbackrest \
                    --field-selector=status.phase=Running \
                    -o jsonpath='{.items[0].metadata.name}' 2>/dev/null)
                  if [ -z "$REPO_HOST" ]; then
                    echo "SKIP: No running repo-host pod found"
                    exit 0
                  fi
                  # Check that the stanza is initialized
                  STANZA_INFO=$(kubectl exec -n postgres-operator "$REPO_HOST" \
                    -c pgbackrest -- pgbackrest info --stanza=db \
                    --output=json 2>/dev/null) || {
                    echo "SKIP: stanza not ready yet"
                    exit 0
                  }
                  # KEY: check whether another backup holds the lock
                  if echo "$STANZA_INFO" | grep -q '"held":true'; then
                    echo "SKIP: Another backup is already running"
                    exit 0
                  fi
                  # Safe to run
                  kubectl exec -n postgres-operator "$REPO_HOST" \
                    -c pgbackrest -- pgbackrest backup --stanza=db \
                    --type=full --log-level-console=info
          restartPolicy: OnFailure
```
You’ll need a ServiceAccount + Role + RoleBinding with pods: [get, list] and pods/exec: [create] in the namespace.
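For completeness, a minimal sketch of that RBAC setup. The resource names (`pgbackrest-cronjob-role`, `pgbackrest-cronjob-rb`) are placeholders of my choosing; only `pgbackrest-cronjob-sa` must match the `serviceAccountName` in the CronJob above:

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: pgbackrest-cronjob-sa
  namespace: postgres-operator
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pgbackrest-cronjob-role   # placeholder name
  namespace: postgres-operator
rules:
  # Needed to find the repo-host pod
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list"]
  # Needed for kubectl exec into it
  - apiGroups: [""]
    resources: ["pods/exec"]
    verbs: ["create"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: pgbackrest-cronjob-rb     # placeholder name
  namespace: postgres-operator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: pgbackrest-cronjob-role
subjects:
  - kind: ServiceAccount
    name: pgbackrest-cronjob-sa
    namespace: postgres-operator
```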
Why this works where native scheduling doesn’t:
- concurrencyPolicy: Forbid — Kubernetes prevents overlapping CronJob runs
- ttlSecondsAfterFinished: 86400 — completed Jobs cleaned up after 24h, no backlog
- successfulJobsHistoryLimit/failedJobsHistoryLimit: 3 — caps Job object history
- The `"held":true` lock check — gracefully skips instead of failing and creating stuck PerconaPGBackup objects
- Bypasses operator state tracking entirely — no pgv2.percona.com/backup-in-progress annotation issues
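The lock check is easy to verify in isolation. A minimal sketch, assuming the `status.lock.backup.held` field that recent pgBackRest releases include in `info --output=json`; the JSON here is a hand-made stand-in, not real cluster output:

```shell
# Standalone check of the lock-detection logic from the CronJob script.
# ASSUMPTION: pgbackrest info --output=json reports lock state under
# status.lock.backup.held (present in recent pgBackRest versions).
STANZA_INFO='[{"name":"db","status":{"code":0,"lock":{"backup":{"held":true}},"message":"ok"}}]'

if echo "$STANZA_INFO" | grep -q '"held":true'; then
  RESULT="SKIP: Another backup is already running"
else
  RESULT="RUN"
fi
echo "$RESULT"
```

A plain `grep` is deliberately loose — it matches any `"held":true` in the document, not just the backup lock. If you want a stricter check and have `jq` in the image, parse the path explicitly instead.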
When switching to this approach, remove the `schedules` block from your PerconaPGCluster CR so the operator and the CronJobs don't both trigger backups. We've been running this in production (weekly full + 4-hourly incremental to S3) for months with zero backlog issues.
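For reference, this is the kind of block to delete from the CR. The repo name and cron expressions below are illustrative placeholders; match them against your own spec:

```yaml
# PerconaPGCluster CR fragment: delete the schedules key so the operator
# stops creating its own backup jobs (repo name and crons are examples)
spec:
  backups:
    pgbackrest:
      repos:
        - name: repo1
          schedules:            # <- remove this whole block
            full: "30 2 * * 0"
            incremental: "0 */4 * * *"
```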