Let me try to explain more methodically.
-
Our application uses postgres Full Text Search. See here: PostgreSQL: Documentation: 15: Chapter 12. Full Text Search
-
When you configure Postgres for ‘Full Text Search’. you very often need to create (and maintain) context specific dictionaries). See here: PostgreSQL: Documentation: 15: 12.6. Dictionaries
-
We use customized iSpell dictionaries.
-
To install a dictionary, you create it and the copy it to the $SHAREDIR/tsearch_data
folder. The value of SHAREDIR is compiled into Postgres and you can get it using:
pg_config --sharedir
On Percona Operator instances SHAREDIR is : /usr/pgsql-15/share/
On a default install of Postgres (including Percona’s bare metal install) this is : /usr/share/postgresql/15/
Which suggests that Percona have compiled thier own Postgres and when doing so, changed this folder. But that is irrelevant.
What is relevant is that to add directories you need to copy the dictionaries that you need to : $SHAREDIR/tsearch_data
- The issue
problem 1) that folder is owned by root with only read permissions for everyone else:
bash-4.4$ ls -l /usr/pgsql-15/share
total 1312
drwxr-xr-x 2 root root 4096 Jun 9 07:50 contrib
-rw-r--r-- 1 root root 33458 Mar 6 2023 errcodes.txt
drwxr-xr-x 2 root root 12288 Jun 9 07:50 extension
-rw-r--r-- 1 root root 115044 Mar 6 2023 information_schema.sql
drwxr-xr-x 5 root root 4096 Jun 9 07:50 man
-rw-r--r-- 1 root root 4703 Mar 6 2023 pg_hba.conf.sample
-rw-r--r-- 1 root root 1636 Mar 6 2023 pg_ident.conf.sample
-rw-r--r-- 1 root root 604 Mar 6 2023 pg_service.conf.sample
-rw-r--r-- 1 root root 929375 Mar 6 2023 postgres.bki
-rw-r--r-- 1 root root 19 Mar 6 2023 postgresql-15-libs.conf
-rw-r--r-- 1 root root 29383 Mar 6 2023 postgresql.conf.sample
-rw-r--r-- 1 root root 278 Mar 6 2023 psqlrc.sample
-rw-r--r-- 1 root root 44176 Mar 6 2023 snowball_create.sql
-rw-r--r-- 1 root root 35376 Mar 6 2023 sql_features.txt
-rw-r--r-- 1 root root 8816 Mar 6 2023 system_constraints.sql
-rw-r--r-- 1 root root 22914 Mar 6 2023 system_functions.sql
-rw-r--r-- 1 root root 49466 Mar 6 2023 system_views.sql
drwxr-xr-x 2 root root 4096 Jun 9 07:50 timezonesets
drwxr-xr-x 2 root root 4096 Jun 9 07:50 tsearch_data
and the files in it are similarly set:
bash-4.4$ ls -l /usr/pgsql-15/share/tsearch_data/
total 116
-rw-r--r-- 1 root root 424 Mar 6 2023 danish.stop
-rw-r--r-- 1 root root 453 Mar 6 2023 dutch.stop
-rw-r--r-- 1 root root 622 Mar 6 2023 english.stop
-rw-r--r-- 1 root root 1579 Mar 6 2023 finnish.stop
-rw-r--r-- 1 root root 805 Mar 6 2023 french.stop
-rw-r--r-- 1 root root 1349 Mar 6 2023 german.stop
-rw-r--r-- 1 root root 1227 Mar 6 2023 hungarian.stop
-rw-r--r-- 1 root root 243 Mar 6 2023 hunspell_sample.affix
-rw-r--r-- 1 root root 633 Mar 6 2023 hunspell_sample_long.affix
-rw-r--r-- 1 root root 98 Mar 6 2023 hunspell_sample_long.dict
-rw-r--r-- 1 root root 462 Mar 6 2023 hunspell_sample_num.affix
-rw-r--r-- 1 root root 129 Mar 6 2023 hunspell_sample_num.dict
-rw-r--r-- 1 root root 465 Mar 6 2023 ispell_sample.affix
-rw-r--r-- 1 root root 81 Mar 6 2023 ispell_sample.dict
-rw-r--r-- 1 root root 1654 Mar 6 2023 italian.stop
-rw-r--r-- 1 root root 4261 Mar 6 2023 nepali.stop
-rw-r--r-- 1 root root 851 Mar 6 2023 norwegian.stop
-rw-r--r-- 1 root root 1267 Mar 6 2023 portuguese.stop
-rw-r--r-- 1 root root 1235 Mar 6 2023 russian.stop
-rw-r--r-- 1 root root 2178 Mar 6 2023 spanish.stop
-rw-r--r-- 1 root root 559 Mar 6 2023 swedish.stop
-rw-r--r-- 1 root root 73 Mar 6 2023 synonym_sample.syn
-rw-r--r-- 1 root root 473 Mar 6 2023 thesaurus_sample.ths
-rw-r--r-- 1 root root 260 Mar 6 2023 turkish.stop
-rw-r--r-- 1 root root 9939 Mar 6 2023 unaccent.rules
-rw-r--r-- 1 root root 139 Mar 6 2023 xsyn_sample.rules
So… when you try to copy to the instance using ‘kubectl cp’ or when you log in to the instance using something like:
kubectl exec -it pg-cluster1-instance1-WHATEVER bash
you enter as user ‘postgres’ so you can not write to that folder. Which is somewhat expected because the contents of that folder need to match in ever instance and the contents will not be persisted.
So … that creates a real issue if you need to add dictionaries.
You suggested a sidecar… but as noted… a sidecar is a separate pod and has no access to files inside the postgres instance… only to share volumes.
So… what is really needed is a way to create a shared volume for folders like : $SHAREDIR/tsearch_data so that:
a) all instances can see the dictionaries and
b) db administrators can place dictionary files in one instance (or in a folder on the host server or storage system) and all instances can see the files
c) adding dictionaries DOES NOT require that postgres is restarted. if can be done while live.
The method employed by Zalando SU (the patromi team) in thier Postgres Operator is as follows… In the resource configuration file one can add the following:
additionalVolumes:
- name: shared-data
mountPath: /usr/pgsql-15/share/tsearch_data/
volumeSource:
emptyDir: {}
Which will create a storage volume (if necessary and with appropriate permissions) and mount it in every postgres instance… in the example above it would mount to $SHAREDIR/tsearch_data (/usr/pgsql-15/share/tsearch_data/ in the Percona container).
So… to get the files where they need to be, the options seem to be:
- modify the operator so that it creates additional shares so I can share that folder. Ideal but seems like a lot of work.
- create my own instance image based on the Percona image but with that folder writable… not sure if that is possible. I can not see where the instance image name is configured. Can you tell me? Not ideal because it means managing ‘n’ instances separately.
- find the root password. Not ideal but I could manually copy over the files that I need to each instance. Can you guys provide the password or is there a way to set it when configuring the cluster?
- Create a sidecar that has it’s own shared volume and with root access copy the files over myself using something like ssh or scp. I still need root access and this is not a great way to do this because these files are meant to be installable without a database restart. Can you guys provide the password or is there a way to set it when configuring the cluster?
- Ask you guys to add this new feature to the Percona Operator so that people with text search databases can use the Percona Operator. Something like what the Zalando Postgres Operator has. Best long term solution but my guess is that this will take AGES.
- Perhaps there is another way? Is there something in the PGO that I have not yet discovered? This lack of functionality seems like a HUGE oversight since the shared folder is supposed to be modifiable.
Does that help?