Re: Declarative partitioning

Поиск
Список
Период
Сортировка
От Ashutosh Bapat
Тема Re: Declarative partitioning
Дата
Msg-id CAFjFpRepk6LThUivaySe42iWFVS=63Vo=s-7izwvF_JibbRc7Q@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Declarative partitioning  (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>)
Ответы Re: Declarative partitioning  (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>)
Список pgsql-hackers
With the new set of patches, I am getting following warnings but no compilation failures. Patches apply smoothly.
partition.c:1216:21: warning: variable ‘form’ set but not used [-Wunused-but-set-variable]
partition.c:1637:20: warning: variable ‘form’ set but not used [-Wunused-but-set-variable]

For creating partition the documentation seems to suggest the syntax
create table t1_p1 partition of t1 for values {start {0} end {100} exclusive;
but following syntax works
create table t1_p1 partition of t1 for values start (0) end (100) exclusive;





> 5. In expand_partitioned_rtentry(), instead of finding all the leaf level
> relations, it might be better to build the RTEs, RelOptInfo and paths for
> intermediate relations as well. This helps in partition pruning. In your
> introductory write-up you have mentioned this. It might be better if v1
> includes this change, so that the partition hierarchy is visible in EXPLAIN
> output as well.
>
> 6. Explain output of scan on a partitioned table shows Append node with
> individual table scans as sub-plans. May be we should annotate Append node
> with
> the name of the partitioned table to make EXPLAIN output more readable.

I got rid of all the optimizer changes in the new version (except a line
or two).  I did that by switching to storing parent-child relationships in
pg_inherits so that all the existing optimizer code for inheritance sets
works unchanged.  Also, the implementation detail that required to put
only leaf partitions in the append list is also gone.  Previously no
storage was allocated for partitioned tables (either root or any of the
internal partitions), so it was being done that way.

With this set of patches we are still flattening all the partitions.
postgres=# create table t1 (a int, b int) partition by range(a);
CREATE TABLE
postgres=# create table t1_p1 partition of t1 for values start (0) end (100) exclusive partition by range(b);
CREATE TABLE
postgres=# create table t1_p1_p1 partition of t1_p1 for values start (0) end (100) exclusive;
CREATE TABLE
postgres=# explain verbose select * from t1;
                               QUERY PLAN                               
-------------------------------------------------------------------------
 Append  (cost=0.00..32.60 rows=2262 width=8)
   ->  Seq Scan on public.t1  (cost=0.00..0.00 rows=1 width=8)
         Output: t1.a, t1.b
   ->  Seq Scan on public.t1_p1  (cost=0.00..0.00 rows=1 width=8)
         Output: t1_p1.a, t1_p1.b
   ->  Seq Scan on public.t1_p1_p1  (cost=0.00..32.60 rows=2260 width=8)
         Output: t1_p1_p1.a, t1_p1_p1.b
(7 rows)
Retaining the partition hierarchy would help to push-down join across partition hierarchy effectively.
 

Regarding 6, it seems to me that because Append does not have a associated
relid (like scan nodes have with scanrelid).  Maybe we need to either fix
Append or create some enhanced version of Append which would also support
dynamic pruning.

Right, I think, Append might store the relid of the parent table as well as partitioning information at that level along-with the subplans.

Some more comments
Would it be better to declare PartitionDescData as
{
int nparts;
PartitionInfo *partinfo; /* array of partition information structures. */
}

The new syntax allows CREATE TABLE to be specified as partition of an already partitioned table. Is it possible to do the same for CREATE FOREIGN TABLE? Or that's material for v2? Similarly for ATTACH PARTITION.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Ashutosh Bapat
Дата:
Сообщение: Re: Declarative partitioning
Следующее
От: Magnus Hagander
Дата:
Сообщение: checkpoint_flush_after documentation inconsistency