PostgreSQL cluster creation doesn't work in a namespace other than pgo

Hello,

I’m following the PostgreSQL installation guide: Install Percona Distribution for PostgreSQL on Kubernetes

It works as expected with the default configuration in the “pgo” namespace, but I’m unable to deploy to another namespace.

For background: I tested PostgreSQL Operator version 0.2.0 on the same Kubernetes cluster before this 1.0.0.

Commands that I run on shell: commands - Pastebin.com
Operator configuration file (changed namespaces): operator conf - Pastebin.com
Cluster configuration file (changed namespace): cr conf - Pastebin.com
Operator logs: operator logs - Pastebin.com (kubectl logs postgres-operator-bcddf4647-pnsz5 operator -n possu2 -command)

The Operator logs say that some permissions are missing, and the operator reports “version=0.2.0”. How can I fix this?

OK, it seems the instructions work just fine on a fresh Kubernetes cluster. This is deployed on GKE. pgbouncer is in the “Pending” state because the vCPUs ran out…
So I think the deployment fails on the other Kubernetes cluster because of the older Operator version (0.2.0). Is it possible to upgrade? Or am I looking in the right direction?

hi @katajistok ,

You can find the documentation on how to update the operator at the following link: Update Percona Distribution for PostgreSQL Operator

If you want to install several operators in one k8s cluster, you need to set the following options properly.

E.g. operatorA will have:

namespace: "pgo-a"
pgo_operator_namespace: "pgo-a"

operatorB will have:

namespace: "pgo-b"
pgo_operator_namespace: "pgo-b"
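
As a rough illustration of the per-operator settings above, the following shell sketch renders a copy of an operator manifest for a given namespace. It assumes the stock file uses pgo as the value of every namespace and pgo_operator_namespace key. The function name render_operator_yaml and the sample input are hypothetical, and this is not an official Percona procedure.

```shell
#!/bin/sh
# Hypothetical helper: read an operator manifest on stdin and write a copy
# in which every `namespace` / `pgo_operator_namespace` key whose value is
# pgo (quoted or not) is set to the namespace given as $1.
render_operator_yaml() {
  ns="$1"
  sed -E "s/^([[:space:]]*(namespace|pgo_operator_namespace):[[:space:]]*)\"?pgo\"?[[:space:]]*\$/\\1\"${ns}\"/"
}

# Example with a made-up fragment (not the real operator.yaml):
printf '%s\n' \
  'metadata:' \
  '  name: pgo-deploy' \
  '  namespace: pgo' \
  '    pgo_operator_namespace: "pgo"' |
  render_operator_yaml pgo-a
```

With a real file this could be used as `render_operator_yaml pgo-b < deploy/operator.yaml > operator-pgo-b.yaml`. The output file name is only an example, and the result should be reviewed before it is applied.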

Thank you. I was able to deploy PostgreSQL on a fresh cluster just as your instructions say.

But I’m unable to upgrade/patch the old Operator.

“kubectl describe pod cluster1-5b9f4c4645-58xpt” says:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 10m default-scheduler Successfully assigned pgo/cluster1-5b9f4c4645-58xpt to fi-he-hdc-z1-g13-40
Normal SuccessfulAttachVolume 10m attachdetach-controller AttachVolume.Attach succeeded for volume "pvc-894ef55e-b4a9-4b61-b308-7ec4a6141802"
Normal Pulling 8m45s (x4 over 10m) kubelet Pulling image "percona/percona-postgresql-operator:v1.0.0-ppg13-postgres-ha"
Warning Failed 8m44s (x4 over 10m) kubelet Failed to pull image "percona/percona-postgresql-operator:v1.0.0-ppg13-postgres-ha": rpc error: code = NotFound desc = failed to pull and unpack image "docker.io/percona/percona-postgresql-operator:v1.0.0-ppg13-postgres-ha": failed to resolve reference "docker.io/percona/percona-postgresql-operator:v1.0.0-ppg13-postgres-ha": docker.io/percona/percona-postgresql-operator:v1.0.0-ppg13-postgres-ha: not found
Warning Failed 8m44s (x4 over 10m) kubelet Error: ErrImagePull
Warning Failed 8m17s (x6 over 10m) kubelet Error: ImagePullBackOff
Normal BackOff 7s (x42 over 10m) kubelet Back-off pulling image "percona/percona-postgresql-operator:v1.0.0-ppg13-postgres-ha"
[root@dbaasjump002 deploy]# podman pull percona/percona-postgresql-operator:v1.0.0-ppg13-postgres-ha
✔ docker.io/percona/percona-postgresql-operator:v1.0.0-ppg13-postgres-ha
Trying to pull docker.io/percona/percona-postgresql-operator:v1.0.0-ppg13-postgres-ha
manifest unknown: manifest unknown
Error: Error initializing source docker://percona/percona-postgresql-operator:v1.0.0-ppg13-postgres-ha: Error reading manifest v1.0.0-ppg13-postgres-ha in docker.io/percona/percona-postgresql-operator: manifest unknown: manifest unknown

hi @katajistok,

It was a documentation issue and we have fixed it. The tag for the Docker images should not have the ‘v’ prefix.
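
To make the tag fix concrete, here is a small shell sketch that drops a leading "v" from the tag of an image reference. The helper name strip_v_prefix is hypothetical, and the sketch assumes the reference actually carries a tag after its last colon. It only rewrites the string and does not check that the resulting tag exists in the registry.

```shell
#!/bin/sh
# Hypothetical helper: drop a leading "v" from the tag part of an image
# reference (the text after the last ":"), leaving the repository untouched.
strip_v_prefix() {
  image="$1"
  repo="${image%:*}"    # everything before the last ":"
  tag="${image##*:}"    # everything after the last ":"
  printf '%s:%s\n' "$repo" "${tag#v}"
}

# The image reference from the failed pull above:
strip_v_prefix "percona/percona-postgresql-operator:v1.0.0-ppg13-postgres-ha"
```

The printed reference can then be used in the image fields of the cluster configuration in place of the "v"-prefixed one.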

Hi guys. Our Kubernetes cluster was re-installed, so it is a fresh k8s cluster. I was able to deploy a PostgreSQL cluster in another namespace, but after that I’m not able to deploy clusters to other namespaces. We are utilizing local storage with OpenEBS.

Operator logs in the broken environments suggest that there might be some issues with permissions. One row from the logs (kubectl logs postgres-operator-6d8c847594-skhwd operator -n possu2):
time="2021-12-01T07:24:56Z" level=error msg="Controller Manager: Controller Group for namespace possu2 does not have the required list privileges for resource pods in the Core API" func="github.com/percona/percona-postgresql-operator/internal/controller/manager.(*ControllerManager).hasListerPrivs()" file="/go/src/github.com/percona/percona-postgresql-operator/internal/controller/manager/controllermanager.go:370" version=0.2.0

Details of one environment over here: postgredetails - Pastebin.com.

What could be done in this case? I might try to reproduce the same on Google Cloud.

I have collected some information below:

PostgreSQL clusters:
[root@dbaasjump002 deploy]# kubectl get PerconaPGCluster -A
NAMESPACE NAME AGE
dbaas-kimmotestipg-013 dbaas-kimmotestipg-013 17h
dbaas-postgresql-possu01 possu01 25h
dbaas-postgresql-possu02 possu02 25h
possu2 cluster1 58m

Only one that is working:
[root@dbaasjump002 postgresql_production_configuration]# kubectl get pods -n dbaas-postgresql-possu01
NAME READY STATUS RESTARTS AGE
backrest-backup-possu01-jcg2t 0/1 Completed 0 24h
pgo-deploy-s9hdl 0/1 Completed 0 25h
possu01-5f4c6b955b-l9kxt 1/1 Running 0 24h
possu01-backrest-shared-repo-665bfd758c-wf897 1/1 Running 0 24h
possu01-pgbouncer-54d678db55-jhfl8 1/1 Running 0 24h
possu01-repl1-ffdfcfd5c-g7xsv 1/1 Running 0 24h
possu01-repl2-57d8fc4d6f-bjgfj 1/1 Running 0 24h
postgres-operator-64cd8c59f6-jjwl9 4/4 Running 1 25h

Configuration files: possu01 - Pastebin.com

Not working environment:
[root@dbaasjump002 postgresql_production_configuration]# kubectl get pods -n dbaas-postgresql-possu02
NAME READY STATUS RESTARTS AGE
pgo-deploy-wktd7 0/1 Completed 0 25h
postgres-operator-d9bfdd95c-qfblv 4/4 Running 0 25h

Configuration files: possu02 - Pastebin.com

Not working environment:
[root@dbaasjump002 deploy]# kubectl get pods -n possu2
NAME READY STATUS RESTARTS AGE
pgo-deploy-lp5td 0/1 Completed 0 60m
postgres-operator-6d8c847594-skhwd 4/4 Running 0 60m

Configuration files: postgresql - Pastebin.com
Details about this environment: postgredetails - Pastebin.com

I reproduced the test on Google Cloud and got the same situation:
kimkat@cloudshell:~/percona-postgresql-operator/deploy (fi-katajistok-test-project)$ kubectl get PerconaPGCluster -A
NAMESPACE NAME AGE
pgo cluster1 34m
pgo2 cluster1 4m21s

kimkat@cloudshell:~/percona-postgresql-operator/deploy (fi-katajistok-test-project)$ kubectl get pods -n pgo
NAME READY STATUS RESTARTS AGE
backrest-backup-cluster1-9llvb 0/1 Completed 0 32m
cluster1-566f8f9977-kr8cp 1/1 Running 0 33m
cluster1-backrest-shared-repo-7b5bb89b87-hcnq7 1/1 Running 0 34m
cluster1-pgbouncer-5b59b7dc67-lpjdd 1/1 Running 0 32m
cluster1-repl1-5d88984f58-hn2ph 1/1 Running 0 31m
cluster1-repl2-7bc7bfc65b-b5qz8 1/1 Running 0 31m
pgo-deploy-596f8 0/1 Completed 0 43m
postgres-operator-bf9d88f79-dpc54 4/4 Running 1 43m

kimkat@cloudshell:~/percona-postgresql-operator/deploy (fi-katajistok-test-project)$ kubectl get pods -n pgo2
NAME READY STATUS RESTARTS AGE
pgo-deploy-k29vk 0/1 Completed 0 11m
postgres-operator-66b6ccd7c5-jtvl8 4/4 Running 0 11m

Logs from Operator in pgo2 namespace:
time="2021-12-01T10:55:38Z" level=error msg="Controller Manager: Controller Group for namespace pgo2 does not have the required list privileges for resource pgpolicies in the pg.percona.com API" func="github.com/percona/percona-postgresql-operator/internal/controller/manager.(*ControllerManager).hasListerPrivs()" file="/go/src/github.com/percona/percona-postgresql-operator/internal/controller/manager/controllermanager.go:357" version=0.2.0
time="2021-12-01T10:55:38Z" level=error msg="Controller Manager: Controller Group for namespace pgo2 does not have the required list privileges for resource pods in the Core API" func="github.com/percona/percona-postgresql-operator/internal/controller/manager.(*ControllerManager).hasListerPrivs()" file="/go/src/github.com/percona/percona-postgresql-operator/internal/controller/manager/controllermanager.go:370" version=0.2.0
time="2021-12-01T10:55:38Z" level=error msg="Controller Manager: Controller Group for namespace pgo2 does not have the required list privileges for resource jobs in the Batch API" func="github.com/percona/percona-postgresql-operator/internal/controller/manager.(*ControllerManager).hasListerPrivs()" file="/go/src/github.com/percona/percona-postgresql-operator/internal/controller/manager/controllermanager.go:382" version=0.2.0
time="2021-12-01T10:55:38Z" level=error msg="Namespace Controller: error syncing Namespace 'pgo2': Controller Manager: cannot start controller group for namespace pgo2 because it does not have the required privs, will attempt to start on the next ns refresh interval" func="github.com/percona/percona-postgresql-operator/internal/controller/namespace.(*Controller).processNextWorkItem()" file="/go/src/github.com/percona/percona-postgresql-operator/internal/controller/namespace/namespacecontroller.go:151" version=0.2.0
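
To summarize errors like these, a shell sketch such as the following could pull the namespace, resource and API group out of each "required list privileges" line. The function name missing_list_privs is hypothetical, and the pattern assumes the message wording shown in these logs.

```shell
#!/bin/sh
# Hypothetical helper: read operator log lines on stdin and print one
# "namespace resource api" triple for every "does not have the required
# list privileges" error, with duplicates removed.
missing_list_privs() {
  sed -nE 's/.*Controller Group for namespace ([^ ]+) does not have the required list privileges for resource ([^ ]+) in the ([^ ]+) API.*/\1 \2 \3/p' | sort -u
}

# Example with a shortened copy of one of the log lines:
printf '%s\n' 'level=error msg="Controller Manager: Controller Group for namespace pgo2 does not have the required list privileges for resource pods in the Core API" version=0.2.0' |
  missing_list_privs
```

Against a live operator the input could come from `kubectl logs <operator-pod> operator -n pgo2 | missing_list_privs`. Each reported resource can then be probed with `kubectl auth can-i list pods -n pgo2 --as=system:serviceaccount:pgo2:<service-account>`, where the service account name is a placeholder that depends on the installation.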

I would like to go through this with you in a meeting. Would that be possible?

Hi Team,

I verified the same issue in a Google Cloud environment. The following steps were executed:

  1. git clone -b v1.1.0 https://github.com/percona/percona-postgresql-operator
    cd percona-postgresql-operator/
    kubectl create namespace pgo
    kubectl config set-context $(kubectl config current-context) --namespace=pgo

  2. kubectl apply -f deploy/operator.yaml
    kubectl get pods
    NAME READY STATUS RESTARTS AGE
    pgo-deploy-5xpm9 1/1 Running 0 34s
    postgres-operator-59f88b784c-q6v4t 0/4 ContainerCreating 0 8s

  3. kubectl apply -f deploy/cr.yaml
    perconapgcluster.pg.percona.com/cluster1 created
    kubectl get pods
    NAME READY STATUS RESTARTS AGE
    backrest-backup-cluster1-z86lc 0/1 Completed 0 2m38s
    cluster1-backrest-shared-repo-5987db49c-s5ltl 1/1 Running 0 4m54s
    cluster1-c68d56c6-hks8h 1/1 Running 0 4m5s
    cluster1-pgbouncer-78d9967c89-bc54k 1/1 Running 0 3m3s
    cluster1-pgbouncer-78d9967c89-k52pw 1/1 Running 0 3m3s
    cluster1-pgbouncer-78d9967c89-l7n6f 1/1 Running 0 3m3s
    cluster1-repl1-5479dc5b49-pmmdm 1/1 Running 0 2m21s
    cluster1-repl2-7486fb9fd7-fdwrn 1/1 Running 0 2m20s
    pgo-deploy-5xpm9 0/1 Completed 0 7m34s
    postgres-operator-59f88b784c-q6v4t 4/4 Running 1 7m8s

  4. kubectl get secret cluster1-pguser-secret -o yaml, then tested the PostgreSQL connectivity:
    pgdb=> \dn
    List of schemas
    Name | Owner
    -----------+----------
    pgbouncer | postgres
    pguser | pguser
    public | postgres
    (3 rows)

  5. Then created another namespace, pgo1:
    kubectl create namespace pgo1
    kubectl config set-context $(kubectl config current-context) --namespace=pgo1

  6. Edited operator.yaml, setting the new namespace value on the following lines:
    line 5: namespace: pgo1
    line 123: namespace: pgo1
    line 192: pgo_operator_namespace: "pgo1"
    line 263: namespace: pgo1
    line 269: namespace: pgo1

  7. kubectl apply -f deploy/operator.yaml
    kubectl get pods -n pgo2 (on checking, only the pgo-deploy pod is listed; the Percona operator pod itself was not created)
    NAME READY STATUS RESTARTS AGE
    pgo-deploy-npzs9 0/1 Completed 0 122m

Please check why the PostgreSQL Operator pod does not start in a namespace other than pgo. Kindly share your input.

Thanks,
Sivapriya

Hi @katajistok @sivapriyas ,

If you want to install several operators in one k8s cluster, you need to change the option at percona-postgresql-operator/operator.yaml at main · percona/percona-postgresql-operator · GitHub to ‘disabled/readonly’. However, I have found an issue connected with it, so it can’t be used for now. We are working on a fix, which will be available in the next release. If you want, I can let you know as soon as we merge our changes into main so that you can test it on your end too.

Thank you for your feedback.

Hi Sarzhan - Thanks for your update! Can you please let us know when the next release will be available for us to do the testing?

@sivapriyas this was already fixed in the main branch. Feel free to try it out (just clone the main branch and deploy operator and DB from it) :slight_smile:

Thanks for the update, Spronin. Can you please let me know which branch is the main one, or update the command below?
git clone -b v1.1.0 https://github.com/percona/percona-postgresql-operator
cd percona-postgresql-operator/

Which branch name should I select instead of v1.1.0? I see 70 branches available on the website.

Thanks,
Sivapriya

Main is the main or master :slight_smile:

Just do git clone https://github.com/percona/percona-server-mongodb-operator/ without any branch.

OK, got it. Thanks, I will check and update the post!

Hi @sivapriyas, please let us know your test results. Thank you.

I tested the same steps and it is working as expected in Google Cloud. Thanks for your input!

I believe not “mongodb” :slight_smile:

@katajistok We are planning to release 1.2.0 in one month, and this fix will be included in that release.

Hi @SlavaSarzhan. Is there any new information about the release date?

Hey @katajistok - the 28th of March is set as the release date.
