Score:1

Docker PostgreSQL change database encoding to UTF-8


I want to run a postgres container via docker-compose with COLLATE and CTYPE set to 'C' and the database encoding set to 'UTF-8', but this appears to be impossible.

This is the relevant part of the docker-compose.yml:

database:
    image: postgres:latest
    volumes:
        - db:/var/lib/postgresql/data
    environment:
        POSTGRES_PASSWORD: test
        LC_COLLATE: C
        LC_CTYPE: C
        LANG: C.UTF-8

And this is the log output:

The database cluster will be initialized with locales.
The default text search configuration will be set to "english".
  COLLATE:  C
  CTYPE:    C
  MESSAGES: C.UTF-8
  MONETARY: C.UTF-8
  NUMERIC:  C.UTF-8
  TIME:     C.UTF-8
The default database encoding has accordingly been set to "SQL_ASCII".

I need the database encoding to be UTF-8 and COLLATE and CTYPE to be 'C' (not 'C.UTF-8'); otherwise a dependent application cannot connect.

I didn't find anything in any documentation or anywhere else.

Score:2

You need to combine two pieces of the puzzle here:

https://www.postgresql.org/docs/9.5/app-initdb.html

The initdb documentation explains how to pass encoding and locale options when the database cluster is created.

The documentation for the official postgres Docker image states that you can pass options to initdb:

https://hub.docker.com/_/postgres

Ergo, the answer would be something like:

database:
    image: postgres:latest
    volumes:
        - db:/var/lib/postgresql/data
    environment:
        POSTGRES_PASSWORD: test
        POSTGRES_INITDB_ARGS: '--encoding=UTF-8 --lc-collate=C --lc-ctype=C'

Or similar arguments. I left out the LANG variable, as it is not listed as an initdb option on the man page (the first link above).

I did not test this with docker compose; I ran it on the command line using the -e option. The concept is exactly the same, however: "environment" in docker compose corresponds to -e on the command line. To wit:

https://docs.docker.com/engine/reference/commandline/run/

--env , -e Set environment variables

Test #1 with only the password env set:

docker run -e POSTGRES_PASSWORD=test postgres:latest

Here's the output of the default run:

postgres@cbf23636dabc:~$ psql
psql (13.4 (Debian 13.4-1.pgdg100+1))
Type "help" for help.

postgres=# \l
                                 List of databases
   Name    |  Owner   | Encoding |  Collate   |   Ctype    |   Access privileges   
-----------+----------+----------+------------+------------+-----------------------
 postgres  | postgres | UTF8     | en_US.utf8 | en_US.utf8 | 
 template0 | postgres | UTF8     | en_US.utf8 | en_US.utf8 | =c/postgres          +
           |          |          |            |            | postgres=CTc/postgres
 template1 | postgres | UTF8     | en_US.utf8 | en_US.utf8 | =c/postgres          +
           |          |          |            |            | postgres=CTc/postgres

Test #2, with the environment variables from the suggested docker compose file passed on the CLI instead:

docker run -e POSTGRES_PASSWORD=test -e POSTGRES_INITDB_ARGS='--encoding=UTF-8 --lc-collate=C --lc-ctype=C' postgres:latest

And then the output:

postgres@b6b80c876f3e:~$ psql 
psql (13.4 (Debian 13.4-1.pgdg100+1))
Type "help" for help.

postgres=# \l
                             List of databases
   Name    |  Owner   | Encoding | Collate | Ctype |   Access privileges   
-----------+----------+----------+---------+-------+-----------------------
 postgres  | postgres | UTF8     | C       | C     | 
 template0 | postgres | UTF8     | C       | C     | =c/postgres          +
           |          |          |         |       | postgres=CTc/postgres
 template1 | postgres | UTF8     | C       | C     | =c/postgres          +
           |          |          |         |       | postgres=CTc/postgres

Note also the section on the official PostgreSQL Docker image page that describes initialization scripts. That is something you may want to look into as well.
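
The \l listing from Test #2 can also be reproduced without an interactive psql session. The following is only a sketch: the function name show_pg_locale and its container argument are hypothetical, and it assumes the image's default superuser "postgres" (i.e. POSTGRES_USER was not changed). It reads the encoding, collation and ctype of every database from the pg_database catalog.

```shell
# SQL that lists encoding, collation and ctype for every database.
LOCALE_SQL="SELECT datname, pg_encoding_to_char(encoding) AS encoding, datcollate, datctype FROM pg_database;"

# Hypothetical helper: $1 is the container name or ID as shown by `docker ps`.
show_pg_locale() {
    # Assumes the default superuser "postgres"; adjust -U if POSTGRES_USER is set.
    docker exec "$1" psql -U postgres -c "$LOCALE_SQL"
}
```

It could be called as show_pg_locale b6b80c876f3e, using whatever name or ID your own container has.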

Philipp
Thank you. This is my first time working with PostgreSQL. I wasn't aware of how database creation works in detail.
cn flag
No problem! Postgres is a bit of a bear, but I'd definitely suggest finding a GUI to work with it if you can. There are several options, although my personal favorite is pgAdmin, which is open source.
Score:0

You can actually combine the build into docker compose:

services:
  postgres:
    container_name: postgres
    image: postgres-de
    build: ./pg-docker-config  # a Dockerfile must reside in this directory
    environment:
      - POSTGRES_USER=${POSTGRES_USER}
      - POSTGRES_PASSWORD=${POSTGRES_PW}
      - POSTGRES_DB=${POSTGRES_DB} #optional (specify default database instead of $POSTGRES_DB)
      - PGDATA=/var/lib/postgresql/data/pgdata
      - POSTGRES_INITDB_ARGS='--locale=de_DE.UTF8'
    ports:
      - "5432:5432"
    # volumes:
      # - /postgres/dev:/var/lib/postgresql/data
    restart: unless-stopped

The Dockerfile in ./pg-docker-config looks like this:

FROM postgres:latest
RUN localedef -i de_DE -c -f UTF-8 -A /usr/share/locale/locale.alias de_DE.UTF-8
ENV LANG de_DE.utf8

Make sure to delete your existing database volume(s) before composing again by running ...

docker compose down --volumes

... or else you will see this message, and the database will not initialize the locale settings

PostgreSQL Database directory appears to contain a database; Skipping initialization

Now you can launch the container(s) with only one command :)

docker compose up -d
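
If you are not sure whether the old volume was really removed, you can look for the skip message quoted above in the container logs. This is a minimal sketch; the function name initdb_was_skipped is made up for illustration, and it only matches the "Skipping initialization" text of that message.

```shell
# Reads log text on stdin; returns 0 if it contains the message the
# entrypoint prints when an existing database directory is found.
initdb_was_skipped() {
    grep -q "Skipping initialization"
}

# Example use with the "postgres" service from the compose file above:
#   docker compose logs postgres | initdb_was_skipped && echo "old volume still in use"
```

If the check succeeds, the locale settings were not applied and the volume has to be removed first.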
Score:0
xji

I tried the approach above, and another issue I encountered was that the default Debian base that comes with the official Postgres Docker image doesn't have any other languages installed.

# locale -a
C
C.UTF-8
en_US.utf8
POSIX

To set another language, e.g. Chinese, it was necessary to run e.g. localedef -i zh_CN -c -f UTF-8 -A /usr/share/locale/locale.alias zh_CN.UTF-8.

Then setting the environment variables worked.

Because of this, the default Postgres image does not seem to cover every use case, and it is better to build a custom image, as documented in this post.

FROM postgres
RUN localedef -i zh_CN -c -f UTF-8 -A /usr/share/locale/locale.alias zh_CN.UTF-8
ENV LANG zh_CN.utf8

Then run: docker build -t your-custom-image-name .

Then you can use the custom image in your docker-compose.yml instead of the official postgres image, without needing to set any additional locale-related environment variables.
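
Before wiring the image into docker-compose.yml, it may be worth checking that the new locale really exists inside it. The sketch below is an assumption-laden helper rather than anything from the image itself: the name has_locale is hypothetical, and the comparison ignores case and hyphens because locale -a typically prints the codeset as utf8 (as in the en_US.utf8 line above) even when the locale was defined as UTF-8.

```shell
# Reads `locale -a` output on stdin; returns 0 if the locale named in $1
# is listed. Case and hyphens are ignored, so zh_CN.UTF-8 matches zh_CN.utf8.
has_locale() {
    wanted=$(printf '%s' "$1" | tr 'A-Z' 'a-z' | tr -d '-')
    tr 'A-Z' 'a-z' | tr -d '-' | grep -Fqx "$wanted"
}

# Example use (runs `locale -a` in a throwaway container of the custom image):
#   docker run --rm your-custom-image-name locale -a | has_locale zh_CN.UTF-8 && echo "locale available"
```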
