PostgreSQL cluster creation doesn't work on a different namespace (than pgo)

Hello,

I’m following the PostgreSQL installation guide: Install Percona Distribution for PostgreSQL on Kubernetes

It works as expected with the default configuration in the “pgo” namespace, but I’m unable to deploy to another namespace.

For background: I tested PostgreSQL Operator version 0.2.0 on the same Kubernetes cluster before trying this 1.0.0 release.

Commands that I run on shell: commands - Pastebin.com
Operator configuration file (changed namespaces): operator conf - Pastebin.com
Cluster configuration file (changed namespace): cr conf - Pastebin.com
Operator logs: operator logs - Pastebin.com (output of the command kubectl logs postgres-operator-bcddf4647-pnsz5 operator -n possu2)

So the Operator logs say that some permissions are missing, and the operator reports “version=0.2.0”. How can I fix this?

OK, it seems the instructions work just fine on a fresh Kubernetes cluster. This one is deployed on GKE. pgbouncer is in the “Pending” state because the cluster ran out of vCPUs.
So I think the deployment fails on the other Kubernetes cluster because of the older Operator version (0.2.0). Is it possible to upgrade it? Am I looking in the right direction?


hi @katajistok ,

You can find the documentation on how to update the operator at the following link: Update Percona Distribution for PostgreSQL Operator

If you want to install several operators in one k8s cluster, you need to set the following options properly.

E.g. operatorA will have:

namespace: "pgo-a"
pgo_operator_namespace: "pgo-a"

operatorB will have:

namespace: "pgo-b"
pgo_operator_namespace: "pgo-b"
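
As a rough sketch of the settings above, the two options could be rewritten per operator with sed before the configuration is applied. The function name render_operator_conf and the two-line sample input are hypothetical, and the assumption that both keys appear as `key: "value"` lines should be checked against your own operator configuration file.

```shell
#!/bin/sh
# Hypothetical helper: read an operator configuration on stdin and set both
# namespace options to the namespace given as the first argument.
# Note: any other line that starts with "namespace:" is rewritten as well.
render_operator_conf() {
  ns="$1"
  sed -e "s/^\([[:space:]]*namespace:\).*/\1 \"$ns\"/" \
      -e "s/^\([[:space:]]*pgo_operator_namespace:\).*/\1 \"$ns\"/"
}

# Example with a two-line sample instead of a real configuration file.
printf '%s\n' 'namespace: "pgo"' 'pgo_operator_namespace: "pgo"' \
  | render_operator_conf pgo-b
```

With a real file, the same function could be fed from the operator configuration file and its output reviewed before it is applied to the cluster.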

Thank you. I was able to deploy PostgreSQL on a fresh cluster just as your instructions say.

But I’m unable to upgrade/patch the old Operator.

“kubectl describe pod cluster1-5b9f4c4645-58xpt” says:
Events:
Type Reason Age From Message


Normal Scheduled 10m default-scheduler Successfully assigned pgo/cluster1-5b9f4c4645-58xpt to fi-he-hdc-z1-g13-40
Normal SuccessfulAttachVolume 10m attachdetach-controller AttachVolume.Attach succeeded for volume “pvc-894ef55e-b4a9-4b61-b308-7ec4a6141802”
Normal Pulling 8m45s (x4 over 10m) kubelet Pulling image “percona/percona-postgresql-operator:v1.0.0-ppg13-postgres-ha”
Warning Failed 8m44s (x4 over 10m) kubelet Failed to pull image “percona/percona-postgresql-operator:v1.0.0-ppg13-postgres-ha”: rpc error: code = NotFound desc = failed to pull and unpack image “docker.io/percona/percona-postgresql-operator:v1.0.0-ppg13-postgres-ha”: failed to resolve reference “docker.io/percona/percona-postgresql-operator:v1.0.0-ppg13-postgres-ha”: docker.io/percona/percona-postgresql-operator:v1.0.0-ppg13-postgres-ha: not found
Warning Failed 8m44s (x4 over 10m) kubelet Error: ErrImagePull
Warning Failed 8m17s (x6 over 10m) kubelet Error: ImagePullBackOff
Normal BackOff 7s (x42 over 10m) kubelet Back-off pulling image “percona/percona-postgresql-operator:v1.0.0-ppg13-postgres-ha”
[root@dbaasjump002 deploy]# podman pull percona/percona-postgresql-operator:v1.0.0-ppg13-postgres-ha
✔ docker.io/percona/percona-postgresql-operator:v1.0.0-ppg13-postgres-ha
Trying to pull docker.io/percona/percona-postgresql-operator:v1.0.0-ppg13-postgres-ha
manifest unknown: manifest unknown
Error: Error initializing source docker://percona/percona-postgresql-operator:v1.0.0-ppg13-postgres-ha: Error reading manifest v1.0.0-ppg13-postgres-ha in docker.io/percona/percona-postgresql-operator: manifest unknown: manifest unknown


hi @katajistok,

It was a documentation issue, and we have fixed it. The tag for the Docker images should not have the ‘v’ prefix.
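
For illustration, a minimal shell sketch of that fix, using the tag from the failing pull in this thread. The function name strip_v_prefix is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: strip a single leading "v" from an image tag, if present.
strip_v_prefix() {
  printf '%s\n' "${1#v}"
}

tag="v1.0.0-ppg13-postgres-ha"   # tag taken from the failing pull in this thread
image="percona/percona-postgresql-operator:$(strip_v_prefix "$tag")"
echo "$image"
```

The resulting image name is the one to use in the cluster configuration and in a manual podman pull.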


Hi guys. Our Kubernetes cluster was re-installed, so it is a fresh k8s cluster. I was able to deploy a PostgreSQL cluster in another namespace, but I’m not able to deploy clusters to any further namespaces. We are using local storage with OpenEBS.

Operator logs in the broken environments suggest that there is an issue with permissions. One row from the logs (“kubectl logs postgres-operator-6d8c847594-skhwd operator -n possu2”):
time=“2021-12-01T07:24:56Z” level=error msg=“Controller Manager: Controller Group for namespace possu2 does not have the required list privileges for resource pods in the Core API” func=“github.com/percona/percona-postgresql-operator/internal/controller/manager.(*ControllerManager).hasListerPrivs()” file="/go/src/github.com/percona/percona-postgresql-operator/internal/controller/manager/controllermanager.go:370" version=0.2.0

Details of one environment over here: postgredetails - Pastebin.com.

What could be done in this case? I might try to reproduce the same on Google Cloud.

I have collected some information below:

PostgreSQL clusters:
[root@dbaasjump002 deploy]# kubectl get PerconaPGCluster -A
NAMESPACE NAME AGE
dbaas-kimmotestipg-013 dbaas-kimmotestipg-013 17h
dbaas-postgresql-possu01 possu01 25h
dbaas-postgresql-possu02 possu02 25h
possu2 cluster1 58m

The only one that is working:
[root@dbaasjump002 postgresql_production_configuration]# kubectl get pods -n dbaas-postgresql-possu01
NAME READY STATUS RESTARTS AGE
backrest-backup-possu01-jcg2t 0/1 Completed 0 24h
pgo-deploy-s9hdl 0/1 Completed 0 25h
possu01-5f4c6b955b-l9kxt 1/1 Running 0 24h
possu01-backrest-shared-repo-665bfd758c-wf897 1/1 Running 0 24h
possu01-pgbouncer-54d678db55-jhfl8 1/1 Running 0 24h
possu01-repl1-ffdfcfd5c-g7xsv 1/1 Running 0 24h
possu01-repl2-57d8fc4d6f-bjgfj 1/1 Running 0 24h
postgres-operator-64cd8c59f6-jjwl9 4/4 Running 1 25h

Configuration files: possu01 - Pastebin.com

Non-working environment:
[root@dbaasjump002 postgresql_production_configuration]# kubectl get pods -n dbaas-postgresql-possu02
NAME READY STATUS RESTARTS AGE
pgo-deploy-wktd7 0/1 Completed 0 25h
postgres-operator-d9bfdd95c-qfblv 4/4 Running 0 25h

Configuration files: possu02 - Pastebin.com

Non-working environment:
[root@dbaasjump002 deploy]# kubectl get pods -n possu2
NAME READY STATUS RESTARTS AGE
pgo-deploy-lp5td 0/1 Completed 0 60m
postgres-operator-6d8c847594-skhwd 4/4 Running 0 60m

Configuration files: postgresql - Pastebin.com
Details about this environment: postgredetails - Pastebin.com


I reproduced the test on Google Cloud and got the same situation:
kimkat@cloudshell:~/percona-postgresql-operator/deploy (fi-katajistok-test-project)$ kubectl get PerconaPGCluster -A
NAMESPACE NAME AGE
pgo cluster1 34m
pgo2 cluster1 4m21s

kimkat@cloudshell:~/percona-postgresql-operator/deploy (fi-katajistok-test-project)$ kubectl get pods -n pgo
NAME READY STATUS RESTARTS AGE
backrest-backup-cluster1-9llvb 0/1 Completed 0 32m
cluster1-566f8f9977-kr8cp 1/1 Running 0 33m
cluster1-backrest-shared-repo-7b5bb89b87-hcnq7 1/1 Running 0 34m
cluster1-pgbouncer-5b59b7dc67-lpjdd 1/1 Running 0 32m
cluster1-repl1-5d88984f58-hn2ph 1/1 Running 0 31m
cluster1-repl2-7bc7bfc65b-b5qz8 1/1 Running 0 31m
pgo-deploy-596f8 0/1 Completed 0 43m
postgres-operator-bf9d88f79-dpc54 4/4 Running 1 43m

kimkat@cloudshell:~/percona-postgresql-operator/deploy (fi-katajistok-test-project)$ kubectl get pods -n pgo2
NAME READY STATUS RESTARTS AGE
pgo-deploy-k29vk 0/1 Completed 0 11m
postgres-operator-66b6ccd7c5-jtvl8 4/4 Running 0 11m

Logs from Operator in pgo2 namespace:
time=“2021-12-01T10:55:38Z” level=error msg=“Controller Manager: Controller Group for namespace pgo2 does not have the required list privileges for resource pgpolicies in the pg.percona.com API” func=“github.com/percona/percona-postgresql-operator/internal/controller/manager.(*ControllerManager).hasListerPrivs()” file="/go/src/github.com/percona/percona-postgresql-operator/internal/controller/manager/controllermanager.go:357" version=0.2.0
time=“2021-12-01T10:55:38Z” level=error msg=“Controller Manager: Controller Group for namespace pgo2 does not have the required list privileges for resource pods in the Core API” func=“github.com/percona/percona-postgresql-operator/internal/controller/manager.(*ControllerManager).hasListerPrivs()” file="/go/src/github.com/percona/percona-postgresql-operator/internal/controller/manager/controllermanager.go:370" version=0.2.0
time=“2021-12-01T10:55:38Z” level=error msg=“Controller Manager: Controller Group for namespace pgo2 does not have the required list privileges for resource jobs in the Batch API” func=“github.com/percona/percona-postgresql-operator/internal/controller/manager.(*ControllerManager).hasListerPrivs()” file="/go/src/github.com/percona/percona-postgresql-operator/internal/controller/manager/controllermanager.go:382" version=0.2.0
time=“2021-12-01T10:55:38Z” level=error msg=“Namespace Controller: error syncing Namespace ‘pgo2’: Controller Manager: cannot start controller group for namespace pgo2 because it does not have the required privs, will attempt to start on the next ns refresh interval” func=“github.com/percona/percona-postgresql-operator/internal/controller/namespace.(*Controller).processNextWorkItem()” file="/go/src/github.com/percona/percona-postgresql-operator/internal/controller/namespace/namespacecontroller.go:151" version=0.2.0
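
To summarize which list privileges the operator reports as missing, log lines like the ones above could be filtered with a small shell sketch. The function name missing_list_privs is hypothetical, and the pattern only assumes the message wording shown in these logs.

```shell
#!/bin/sh
# Hypothetical helper: read operator log lines on stdin and print
# "<namespace> <resource>" for every "required list privileges" error.
missing_list_privs() {
  sed -n 's/.*for namespace \([^ ]*\) does not have the required list privileges for resource \([^ ]*\) .*/\1 \2/p' \
    | sort -u
}

# Example with a shortened sample line, so no cluster access is needed.
echo 'level=error msg="Controller Manager: Controller Group for namespace pgo2 does not have the required list privileges for resource pods in the Core API"' \
  | missing_list_privs
```

Against a live cluster, the input would come from the kubectl logs command for the operator container. Each printed pair could then be checked with kubectl auth can-i list <resource> -n <namespace> --as=system:serviceaccount:<namespace>:<service-account>, where the service account name depends on the deployment and is not given in this thread.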

I would like to go through this with you in a meeting. Would that be possible?
