Обсуждение: What is the accepted practice to automate initdb (PostgreSQL 9.6) toa non-default directory?

Поиск
Список
Период
Сортировка

What is the accepted practice to automate initdb (PostgreSQL 9.6) toa non-default directory?

От
Mike Lonergan
Дата:
I'm attempting to fully script the build of a PostgreSQL server in an EC2 machine with EBS to house the default database cluster in the data_directory (aka $PGDATA?).  I'm doing this as a volunteer for HackOregon, thus I rationalize my lack of experience wrangling PostgreSQL.  My apologies up front for this lack of knowledge.

My problem: it appears that while initdb is still a shipping tool, it doesn't immediately result in a working database server, and I can't find a clean explanation in documentation (or google) of what additional commands I'm supposed to run to get the server to launch successfully.


The internet has the following pattern documented in many places (with varying install locations for "initdb", but always the same command) through at least PostgreSQL 9.3:
sudo -u postgres postgres /opt/pg/bin/initdb -D /data/

However while this command can complete successfully (which is encouraging, as it's still documented in the 9.6 docs), running sudo postgresql service start results in:

Redirecting to /bin/systemctl start postgresql.service
Job for postgresql.service failed because the control process exited with error code. See "systemctl status postgresql.service" and "journalctl -xe" for details.
When I run systemctl status postgresql.service -l  it reports:

● postgresql.service - PostgreSQL database server  Loaded: loaded (/usr/lib/systemd/system/postgresql.service; disabled; vendor preset: disabled)  Active: failed (Result: exit-code) since Thu 2018-03-01 05:41:09 UTC; 2min 8s ago Process: 3196 ExecStartPre=/usr/libexec/postgresql-check-db-dir %N (code=exited, status=1/FAILURE)

Mar 01 05:41:09 ip-172-31-31-191.us-west-2.compute.internal systemd[1]: Starting PostgreSQL database server...
Mar 01 05:41:09 ip-172-31-31-191.us-west-2.compute.internal postgresql-check-db-dir[3196]: Directory "/var/lib/pgsql/data" is missing or empty.
Mar 01 05:41:09 ip-172-31-31-191.us-west-2.compute.internal postgresql-check-db-dir[3196]: Use "/usr/bin/postgresql-setup --initdb"
Mar 01 05:41:09 ip-172-31-31-191.us-west-2.compute.internal postgresql-check-db-dir[3196]: to initialize the database cluster.
Mar 01 05:41:09 ip-172-31-31-191.us-west-2.compute.internal postgresql-check-db-dir[3196]: See /usr/share/doc/postgresql-9.6.6/README.rpm-dist for more information.
Mar 01 05:41:09 ip-172-31-31-191.us-west-2.compute.internal systemd[1]: postgresql.service: control process exited, code=exited status=1
Mar 01 05:41:09 ip-172-31-31-191.us-west-2.compute.internal systemd[1]: Failed to start PostgreSQL database server.
Mar 01 05:41:09 ip-172-31-31-191.us-west-2.compute.internal systemd[1]: Unit postgresql.service entered failed state.
Mar 01 05:41:09 ip-172-31-31-191.us-west-2.compute.internal systemd[1]: postgresql.service failed.
When I run sudo ls -la /var/lib/pgsql/data it reports:

total 0
drwx------ 2 postgres postgres  6 Dec 11 21:42 .
drwx------ 4 postgres postgres 54 Feb 25 21:38 ..
and when I look at /data, it contains all the files:

$ sudo ls -la /databases

total 52

drwx------ 19 postgres postgres  4096 Mar  2 16:33 .

dr-xr-xr-x 19 root     root       274 Mar  2 16:33 ..

drwx------  5 postgres postgres    41 Mar  2 16:33 base

drwx------  2 postgres postgres  4096 Mar  2 16:33 global

drwx------  2 postgres postgres    18 Mar  2 16:33 pg_clog

drwx------  2 postgres postgres     6 Mar  2 16:33 pg_commit_ts

drwx------  2 postgres postgres     6 Mar  2 16:33 pg_dynshmem

-rw-------  1 postgres postgres  4468 Mar  2 16:33 pg_hba.conf

-rw-------  1 postgres postgres  1636 Mar  2 16:33 pg_ident.conf

drwx------  4 postgres postgres    39 Mar  2 16:33 pg_logical

drwx------  4 postgres postgres    36 Mar  2 16:33 pg_multixact

drwx------  2 postgres postgres    18 Mar  2 16:33 pg_notify

drwx------  2 postgres postgres     6 Mar  2 16:33 pg_replslot

drwx------  2 postgres postgres     6 Mar  2 16:33 pg_serial

drwx------  2 postgres postgres     6 Mar  2 16:33 pg_snapshots

drwx------  2 postgres postgres     6 Mar  2 16:33 pg_stat

drwx------  2 postgres postgres     6 Mar  2 16:33 pg_stat_tmp

drwx------  2 postgres postgres    18 Mar  2 16:33 pg_subtrans

drwx------  2 postgres postgres     6 Mar  2 16:33 pg_tblspc

drwx------  2 postgres postgres     6 Mar  2 16:33 pg_twophase

-rw-------  1 postgres postgres     4 Mar  2 16:33 PG_VERSION

drwx------  3 postgres postgres    60 Mar  2 16:33 pg_xlog

-rw-------  1 postgres postgres    88 Mar  2 16:33 postgresql.auto.conf

-rw-------  1 postgres postgres 22301 Mar  2 16:33 postgresql.conf


The recommended command "postgresql-setup --initdb" doesn't seem to accept -D or --pgdata parameters, so that seems a non-starter for my scenario.

There's another hint along the way to use "postgresql-new-systemd-unit" instead, but the thin documentation seems to indicate that it's intended only for creating a second non-default database cluster, not for configuring the default cluster.


I've seen various unofficial solutions that use various combinations of manual setup and hand-editing of .conf files, and I'm trying hard to avoid that so that my ops team can build and rebuild this server in an emergency situation without any special knowledge of the behaviour of PostgreSQL or its tools.


Is there anywhere I can find the missing explanation for how to automate a PostgreSQL setup/config/start while placing the default database cluster in a non-default directory?

Thanks in advance for any hints you could provide.

Mike the Canuck
Mike Lonergan <mikethecanuck@gmail.com> writes:
> The internet has the following pattern documented in many places (with
> varying install locations for "initdb", but always the same command)
> through at least PostgreSQL 9.3:
> sudo -u postgres postgres /opt/pg/bin/initdb -D /data/

> However while this command can complete successfully (which is encouraging,
> as it's still documented in the 9.6 docs
> <https://www.postgresql.org/docs/9.6/static/app-initdb.html>), running sudo
> postgresql service start results in:

> Redirecting to /bin/systemctl start postgresql.service
> Job for postgresql.service failed because the control process exited
> with error code. See "systemctl status postgresql.service" and
> "journalctl -xe" for details.

Evidently you're relying on somebody's systemd unit script to launch
Postgres.  Almost certainly, that unit script has the location of the
data directory hard-wired into it.  If you want to use a non-default
directory location, you'll have to change the unit script (and possibly
then mutter some incantation to get systemd to notice you've changed
it; I've not worked with systemd in awhile so I forget).

> The recommended command "postgresql-setup --initdb" doesn't seem to accept
> -D or --pgdata parameters, so that seems a non-starter for my scenario.

While it's certainly possible to use a manual invocation of initdb
to get things going, when your distro provides a wrapper like
postgresql-setup you're better off using that, because it's certain
to get the data directory location, ownership, etc set up the way
the launch script expects.  The reason there's no -D option is that
it scrapes the datadir location out of the launch script; again,
you *must* change that script if you're going to use this launch
method.

> I've seen various unofficial solutions that use various combinations of
> manual setup and hand-editing of .conf files, and I'm trying hard to avoid
> that so that my ops team can build and rebuild this server in an emergency
> situation without any special knowledge of the behaviour of PostgreSQL or
> its tools.

If that's what you're striving for, then the very first recommendation
would be to *not* use a non-default data dir location, but just go along
with what the distro wants to do out-of-the-box.  You're just adding
another layer of complexity and things-to-go-wrong.  (An example of
the sort of thing I'm worried about is that you're likely also going to
need to have a conversation with SELinux about whether the postgres
daemon is allowed to use files in the nonstandard location.  It's doable,
certainly, but it's one more thing to configure.)

            regards, tom lane


Re: What is the accepted practice to automate initdb (PostgreSQL 9.6)to a non-default directory?

От
Mike Lonergan
Дата:
Thanks so much Tom, this makes a ton of sense now that you've pointed out all the things that sounded over my head when/if I bumped into them in my travails.

I'd sure prefer to not mess with the default database cluster location, but unfortunately given our deployment to AWS (on an EC2 machine for now, due to cost vs RDS), we can't rely on ephemeral storage - have to use an external EBS storage device.  That means one way or another - at time of running initdb or later (before creating the first database instance), we'll have to move to a non-default data dir.  [Someone suggested the use of Tablespaces to me recently, but that seems fragile - when we are considering the use of EBS 'snapshots' as well as 'pgdump' to enable a robust data preservation strategy.]

I'll definitely look into the launch script, see what I can do there.  Any other thoughts folks have would be more than welcome.

Thanks!
Mike

On 2 March 2018 at 09:14, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Mike Lonergan <mikethecanuck@gmail.com> writes:
> The internet has the following pattern documented in many places (with
> varying install locations for "initdb", but always the same command)
> through at least PostgreSQL 9.3:
> sudo -u postgres postgres /opt/pg/bin/initdb -D /data/

> However while this command can complete successfully (which is encouraging,
> as it's still documented in the 9.6 docs
> <https://www.postgresql.org/docs/9.6/static/app-initdb.html>), running sudo
> postgresql service start results in:

> Redirecting to /bin/systemctl start postgresql.service
> Job for postgresql.service failed because the control process exited
> with error code. See "systemctl status postgresql.service" and
> "journalctl -xe" for details.

Evidently you're relying on somebody's systemd unit script to launch
Postgres.  Almost certainly, that unit script has the location of the
data directory hard-wired into it.  If you want to use a non-default
directory location, you'll have to change the unit script (and possibly
then mutter some incantation to get systemd to notice you've changed
it; I've not worked with systemd in awhile so I forget).

> The recommended command "postgresql-setup --initdb" doesn't seem to accept
> -D or --pgdata parameters, so that seems a non-starter for my scenario.

While it's certainly possible to use a manual invocation of initdb
to get things going, when your distro provides a wrapper like
postgresql-setup you're better off using that, because it's certain
to get the data directory location, ownership, etc set up the way
the launch script expects.  The reason there's no -D option is that
it scrapes the datadir location out of the launch script; again,
you *must* change that script if you're going to use this launch
method.

> I've seen various unofficial solutions that use various combinations of
> manual setup and hand-editing of .conf files, and I'm trying hard to avoid
> that so that my ops team can build and rebuild this server in an emergency
> situation without any special knowledge of the behaviour of PostgreSQL or
> its tools.

If that's what you're striving for, then the very first recommendation
would be to *not* use a non-default data dir location, but just go along
with what the distro wants to do out-of-the-box.  You're just adding
another layer of complexity and things-to-go-wrong.  (An example of
the sort of thing I'm worried about is that you're likely also going to
need to have a conversation with SELinux about whether the postgres
daemon is allowed to use files in the nonstandard location.  It's doable,
certainly, but it's one more thing to configure.)

                        regards, tom lane

Re: What is the accepted practice to automate initdb (PostgreSQL 9.6)to a non-default directory?

От
"David G. Johnston"
Дата:
On Fri, Mar 2, 2018 at 5:03 PM, Mike Lonergan <mikethecanuck@gmail.com> wrote:

I'll definitely look into the launch script, see what I can do there.  Any other thoughts folks have would be more than welcome.
 
​This recent Planet PostgreSQL link​ seems worth a read.


David J.

Re: What is the accepted practice to automate initdb (PostgreSQL 9.6)to a non-default directory?

От
Mike Lonergan
Дата:
David, thank you as well - that article is fantastic.  I'll give this a whirl, but on its face it looks like exactly the solution we're after.  That approach to embedded the env var for the service is new to me - addresses a complete blank in my knowledge.

Thanks!
Mike

On 2 March 2018 at 16:13, David G. Johnston <david.g.johnston@gmail.com> wrote:
On Fri, Mar 2, 2018 at 5:03 PM, Mike Lonergan <mikethecanuck@gmail.com> wrote:

I'll definitely look into the launch script, see what I can do there.  Any other thoughts folks have would be more than welcome.
 
​This recent Planet PostgreSQL link​ seems worth a read.


David J.


Re: What is the accepted practice to automate initdb (PostgreSQL 9.6)to a non-default directory?

От
Mike Lonergan
Дата:
For reference by any future novices who might stumble across this thread (and for the morbid curiosity of anyone on the list), here's the shell script I've generated to solve this challenge in an AWS EC2 environment:

On 2 March 2018 at 16:18, Mike Lonergan <mikethecanuck@gmail.com> wrote:
David, thank you as well - that article is fantastic.  I'll give this a whirl, but on its face it looks like exactly the solution we're after.  That approach to embedded the env var for the service is new to me - addresses a complete blank in my knowledge.

Thanks!
Mike

On 2 March 2018 at 16:13, David G. Johnston <david.g.johnston@gmail.com> wrote:
On Fri, Mar 2, 2018 at 5:03 PM, Mike Lonergan <mikethecanuck@gmail.com> wrote:

I'll definitely look into the launch script, see what I can do there.  Any other thoughts folks have would be more than welcome.
 
​This recent Planet PostgreSQL link​ seems worth a read.


David J.