Re: pg_dump is broken for partition tablespaces

Поиск
Список
Период
Сортировка
От Alvaro Herrera
Тема Re: pg_dump is broken for partition tablespaces
Дата
Msg-id 20190417213901.GA6555@alvherre.pgsql
обсуждение исходный текст
Ответ на Re: pg_dump is broken for partition tablespaces  (David Rowley <david.rowley@2ndquadrant.com>)
Ответы Re: pg_dump is broken for partition tablespaces  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
On 2019-Apr-17, David Rowley wrote:

> On Mon, 15 Apr 2019 at 15:26, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> >
> > On 2019-Apr-15, David Rowley wrote:
> >
> > > To be honest, if I'd done a better job of thinking through the
> > > implications of this tablespace inheritance in ca4103025d, then I'd
> > > probably have not bothered submitting a patch for it.  We could easily
> > > revert that, but we'd still be left with the same behaviour in
> > > partitioned indexes, which is in PG11.
> >
> > Well, I suppose if we do decide to revert it for tables, we should do it
> > for both tables and indexes.  But as I said, I'm not yet convinced that
> > this is the best way forward.
> 
> Ok.  Any ideas or suggestions on how we move on from here?  It seems
> like a bit of a stalemate.

Well, here's my proposed patch.  I'm now fairly happy with how this
looks now, concerning partitioned tables.

This is mostly what was already discussed:

1. pg_dump now uses regular CREATE TABLE followed by ALTER TABLE / ATTACH
   PARTITION when creating partitions, rather than CREATE TABLE
   PARTITION OF.  pg_dump --binary-upgrade was already doing that, so
   this part mostly removes some code.  In order to make the partitions
   reach the correct tablespace, the "default_tablespace" GUC is used.
   No TABLESPACE clause is added to the dump.  This is David's patch
   near the start of the thread.

2. When creating a partition using the CREATE TABLE PARTITION OF syntax,
   the TABLESPACE clause has highest precedence; if that is not given,
   the partitioned table's tablespace is used; if that is set to 0 (the
   default), default_tablespace is used; if that's set to empty or a
   nonexistant tablespace, the database's default tablespace is used.
   This is (I think) what Andres proposed in
   https://postgr.es/m/20190306223741.lolaaimhkkp4kict@alap3.anarazel.de

3. Partitioned relations can have the database tablespace in
   pg_class.reltablespace, as opposed to storage-bearing relations which
   cannot.  This is useful to be able to put partitions in the database
   tablespace even if the default_tablespace is set to something else.

4. For partitioned tables, ALTER TABLE .. SET TABLESPACE DEFAULT is
   available as suggested by David, which makes future partition
   creations target default_tablespace or the database's tablespace.  

5. Recreating indexes during table-rewriting ALTER TABLE resulted in
   broken indexes.  We already had some adhesive tape in place to make
   that work for regular indexes (commit bd673e8e864a); my approach to
   fix it for partitioned indexes is to temporarily reset
   default_tablespace to empty.


As for Tom's question in https://postgr.es/m/12678.1555252685@sss.pgh.pa.us :

> It's possible that Alvaro's patch is also broken, but I haven't had time
> to review it.  The immediate question is what happens when somebody makes
> a partitioned table in template1 and then does CREATE DATABASE with a
> tablespace option.  Does the partitioned table end up in the same
> tablespace as ordinary tables do?

Note that partitioned don't have any files, so they don't end up
anywhere; when a partition is created, the target tablespace is
determined using four rules instead of three (see #2 above) so yes, they
do end up in the same places as ordinary tables.  Note that even if you
do put a partitioned table in some tablespace, you will not later run
afoul of this check:
                ereport(ERROR,
                        (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                         errmsg("cannot assign new default tablespace \"%s\"",
                                tablespacename),
                         errdetail("There is a conflict because database \"%s\" already has some tables in this
tablespace.",
                                   dbtemplate)));
src/backend/commands/dbcommands.c:435

because that check uses ReadDir() and raise an error if any entry is
found; but partitioned tables don't have files in directory, so nothing
happens.  (Of course, it will hit if you have a partition in that
tablespace, but that's also the case for regular tables.)

(I propose to commit both 0002 and 0003 as a single unit that fixes the
whole problem, rather than attacking backend and pg_dump separately.
0001 appears logically separate and I would push on its own.)

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Stephen Frost
Дата:
Сообщение: Re: block-level incremental backup
Следующее
От: Fabien COELHO
Дата:
Сообщение: Re: pgsql: Fix unportable code in pgbench.