
.status.conditions of CRD PerconaServerMongoDB

Oleksii (Contributor)
I'm using the AWS EKS service. The etcd parameter max-request-bytes is set to 1.5 MB and is not editable (even support cannot change it).
I started the Percona Operator on a cluster that was not yet ready for it. The PerconaServerMongoDB resource was created and kept unsuccessfully trying to start pods. Eventually I fixed the issues with the cluster and the pods started.

Recently I tried to update the operator but got the error `etcd request too large`. It turns out that every attempt to start the cluster is logged in the perconaservermongodbs object's `.status.conditions`.

$ kubectl get perconaservermongodbs project-dev-app1-pmongo -o yaml | grep lastTransitionTime | wc -l
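As a hedged alternative to the grep pipeline above (assuming `jq` is installed; the resource name is the one from the command), the condition count and the object's size in bytes — which is what counts against etcd's max-request-bytes limit — can be read like this:

```shell
# Dump the object once, then inspect it locally.
kubectl get perconaservermongodbs project-dev-app1-pmongo -o json > /tmp/psmdb.json

# How many condition entries have accumulated.
jq '.status.conditions | length' /tmp/psmdb.json

# Size of the whole object in bytes (compare against the 1.5 MB limit).
wc -c < /tmp/psmdb.json
```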

The resource is full of lines like these:

- lastTransitionTime: "2020-01-22T08:43:23Z"
  status: "True"
  type: ClusterInitializing
- lastTransitionTime: "2020-01-22T08:43:26Z"
  status: "True"
  type: ClusterInitializing
- lastTransitionTime: "2020-01-22T08:43:29Z"
  status: "True"
  type: ClusterInitializing

The size of the object is slightly more than 1.5 MB and I cannot edit it. I found no way to remove these lines from the object...
Has anyone had the same issue? Is there a way to edit the CRD object without deleting it?

PS: `kubectl edit perconaservermongodbs project-dev-app1-pmongo` does not work. It reports that the object was edited, but nothing changes.
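One likely reason `kubectl edit` reports success without changing anything is that the CRD exposes a `status` subresource, so edits to `.status` made through the main resource are silently dropped. A minimal sketch of patching the status subresource directly, assuming a kubectl version recent enough to have the `--subresource` flag (1.24+) and that `jq` is installed; the resource name is the one from this post:

```shell
# Build a status patch locally that keeps only the most recent condition.
kubectl get perconaservermongodbs project-dev-app1-pmongo -o json \
  | jq '{status: {conditions: [.status.conditions[-1]]}}' > /tmp/status-patch.json

# Apply it against the status subresource, which plain `kubectl edit` cannot reach.
kubectl patch perconaservermongodbs project-dev-app1-pmongo \
  --subresource=status --type=merge \
  --patch-file /tmp/status-patch.json
```

On older kubectl versions without `--subresource`, the same patch could in principle be sent to the object's `/status` endpoint via the raw API, but I have not verified that path.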


  • Oleksii (Contributor)
    For those who might have the same issue.

    AWS support agreed to increase the etcd limit. After they applied the change, new lines began to appear:
    - lastTransitionTime: "2020-03-02T05:32:21Z"
      status: "True"
      type: ClusterInitializing
    - lastTransitionTime: "2020-03-02T05:32:22Z"
      status: "True"
      type: ClusterInitializing

    and the object will hit the new limit in a while...

    Note: I use Terraform to maintain the infrastructure. I have a Helm module that creates the Percona Operator in Kubernetes. There are 5 MongoDB clusters created from the same module.

    I re-created the faulty cluster and even re-created the PVC of the MongoDB data folder; basically, I deleted everything related to the faulty CRD instance and created it from scratch, but the issue was not fixed. BTW, the cluster state was OK throughout: after the recreation, once the replica set got synced, the state became OK again, with no errors in any logs (operator\coordinator\mongodb pods or Kubernetes events). I tried to debug but found nothing... so I deleted the faulty cluster one more time (only one cluster has this issue). The next day, using the same Terraform module, I created the cluster again and the issue was fixed. I have no idea what it was or why it got fixed...

    If someone faces the same issue and finds the root cause, it would be useful to mention it here.

    I have an assumption but did not get a chance to check it: the operator container version might be a faulty one. Maybe `if status.Status != currentRSstatus.Status` in a controller had different logic... this is just my guess...
  • lorraine.pocklington (Percona Community Manager)
    Thanks for updating this post with your solution; it is much appreciated.
    I will share your post with the Kubernetes team so that they are aware, in case they would like to add anything.
