Обсуждение: Partitioning Tables

Поиск
Список
Период
Сортировка

Partitioning Tables

От
"Campbell, Lance"
Дата:

PostgreSQL 12

 

I have a challenge.  I have two tables, group and group_member.  A group table has a type indicator telling me which of three ways the table can be used.  The group member table is made up of 50 million records that have a foreign key to the group table.

 

I really need the queries to be fast for one particular type of group.  This type has less than a million members in it.  So my first thought was to create a separate group_member table just for members of this type of group.  But I have to change a lot of SQL.

 

The other idea I thought of is there a way to use table partitions?  If the query goes against a group of type A then it would pull from the small partition but if it is a group of some other type it would query against the other partition. 

 

So if I do a query like the below it would ideally pull from partition group_member_a rather than group_member_other.  However, how would PostgreSQL know on insert a group_member into the proper partition? 

 

Select group_member.* from group, group_member WHERE group.type=’A’ and group.id=group_member.user=’bob smith’;

 

Also, what happens if I did a query like this.  Will it know to scan both group_member_a and group_member_other:

 

Select group_member.* from group_member where group_member.user=’bob smith’;

 

Thoughts?

 

Thanks,

 

Lance Campbell

Re: Partitioning Tables

От
Scott Ribe
Дата:
> On Feb 5, 2021, at 11:27 AM, Campbell, Lance <lance@illinois.edu> wrote:
>
> So if I do a query like the below it would ideally pull from partition group_member_a rather than group_member_other.
However, how would PostgreSQL know on insert a group_member into the proper partition?  
>
> Select group_member.* from group, group_member WHERE group.type=’A’ and group.id=group_member.user=’bob smith’;

When you define the partition, you declare what values go into it. With PG 12, you can insert into the parent and the
rowwill get redirected into the right place. Some older versions (I don't remember how old) required you to insert into
theright table, or some voodoo with rewrite rules or triggers. 

In other words, smarts about which partitions to scan in your example pre-dates smarts about inserting.

I don't remember for sure, but I think maybe PG 13 adds the feature where if you update the group type, the row gets
moved.Prior, you'd have to delete and re-insert. Someone correct me? 

> Also, what happens if I did a query like this.  Will it know to scan both group_member_a and group_member_other:
>
> Select group_member.* from group_member where group_member.user=’bob smith’;

Yes.

Anyway, you are on the right track, in that what you're describing could work.

There are less invasive things you could try, which wouldn't give as much benefit--like partial indexes

for instance

create index ... on group_member(user) where type = 'A'

that gets you fast location of a single user, but if you looked for a range of users, reading the index would be fast,
butyou'd still wind up with the matching rows scattered all over the single big tablee 





Re: Partitioning Tables

От
Ron
Дата:
On 2/5/21 12:42 PM, Scott Ribe wrote:
>> On Feb 5, 2021, at 11:27 AM, Campbell, Lance <lance@illinois.edu> wrote:
>>
>> So if I do a query like the below it would ideally pull from partition group_member_a rather than
group_member_other. However, how would PostgreSQL know on insert a group_member into the proper partition?
 
>>   
>> Select group_member.* from group, group_member WHERE group.type=’A’ and group.id=group_member.user=’bob smith’;
> When you define the partition, you declare what values go into it. With PG 12, you can insert into the parent and the
rowwill get redirected into the right place. Some older versions (I don't remember how old) required you to insert into
theright table, or some voodoo with rewrite rules or triggers.
 
>
> In other words, smarts about which partitions to scan in your example pre-dates smarts about inserting.
>
> I don't remember for sure, but I think maybe PG 13 adds the feature where if you update the group type, the row gets
moved.Prior, you'd have to delete and re-insert. Someone correct me?
 

I don't know about v13, but v12 (and below) error on updating the partition 
key.  (That's a common limitation which helps the optimizer /know/ what's in 
a partition.)

-- 
Angular momentum makes the world go 'round.



Re: Partitioning Tables

От
"Campbell, Lance"
Дата:
This is what I really want to do.  But it does not look like PostgreSQL supports referencing the parent tables for
decidingwhat partition to put the data in:
 

CREATE TABLE group_manager.group (id INTEGER, type TEXT, name TEXT, CONSTRAINT group_pkey PRIMARY KEY (id) );
-- type would have values of A , B or C

CREATE TABLE group_member (fk_group_id INTEGER, id INTEGER, user_id TEXT) PARTITION BY LIST(group.type);

-- Above PARTITION BY LIST(group.type) does not work.

CREATE TABLE group_member_A PARTITION OF group_member FOR VALUES IN ('A');
CREATE TABLE group_member_B PARTITION OF group_member FOR VALUES IN ('B'); 
CREATE TABLE group_member_C PARTITION OF group_member FOR VALUES IN ('C');

It does look like it would work if I added the "type" from the group table to the group_member table.  But otherwise
thePARITION BY LIST does not support referencing a parent table.
 

Lance 

On 2/5/21, 12:42 PM, "Scott Ribe" <scott_ribe@elevated-dev.com> wrote:

    > On Feb 5, 2021, at 11:27 AM, Campbell, Lance <lance@illinois.edu> wrote:
    > 
    > So if I do a query like the below it would ideally pull from partition group_member_a rather than
group_member_other. However, how would PostgreSQL know on insert a group_member into the proper partition? 
 
    >  
    > Select group_member.* from group, group_member WHERE group.type=’A’ and group.id=group_member.user=’bob smith’;
    
    When you define the partition, you declare what values go into it. With PG 12, you can insert into the parent and
therow will get redirected into the right place. Some older versions (I don't remember how old) required you to insert
intothe right table, or some voodoo with rewrite rules or triggers.
 
    
    In other words, smarts about which partitions to scan in your example pre-dates smarts about inserting.
    
    I don't remember for sure, but I think maybe PG 13 adds the feature where if you update the group type, the row
getsmoved. Prior, you'd have to delete and re-insert. Someone correct me?
 
    
    > Also, what happens if I did a query like this.  Will it know to scan both group_member_a and group_member_other:
    >  
    > Select group_member.* from group_member where group_member.user=’bob smith’;
    
    Yes.
    
    Anyway, you are on the right track, in that what you're describing could work.
    
    There are less invasive things you could try, which wouldn't give as much benefit--like partial indexes
    
    for instance
    
    create index ... on group_member(user) where type = 'A'
    
    that gets you fast location of a single user, but if you looked for a range of users, reading the index would be
fast,but you'd still wind up with the matching rows scattered all over the single big tablee
 
    
    
    


Re: Partitioning Tables

От
"Campbell, Lance"
Дата:
I had a typo:

This is what I really want to do.  But it does not look like PostgreSQL supports referencing the parent tables for
decidingwhat partition to put the data in:
 

CREATE TABLE group (id INTEGER, type TEXT, name TEXT, CONSTRAINT group_pkey PRIMARY KEY (id) );
-- type would have values of A , B or C

CREATE TABLE group_member (fk_group_id INTEGER, id INTEGER, user_id TEXT) PARTITION BY LIST(group.type);

-- Above PARTITION BY LIST(group.type) does not work.

CREATE TABLE group_member_A PARTITION OF group_member FOR VALUES IN ('A');
CREATE TABLE group_member_B PARTITION OF group_member FOR VALUES IN ('B');
CREATE TABLE group_member_C PARTITION OF group_member FOR VALUES IN ('C');

It does look like it would work if I added the "type" from the group table to the group_member table.  But otherwise
thePARITION BY LIST does not support referencing a parent table.
 

Lance

On 2/5/21, 1:18 PM, "Campbell, Lance" <lance@illinois.edu> wrote:

    This is what I really want to do.  But it does not look like PostgreSQL supports referencing the parent tables for
decidingwhat partition to put the data in:
 
    
    CREATE TABLE group_manager.group (id INTEGER, type TEXT, name TEXT, CONSTRAINT group_pkey PRIMARY KEY (id) );
    -- type would have values of A , B or C
    
    CREATE TABLE group_member (fk_group_id INTEGER, id INTEGER, user_id TEXT) PARTITION BY LIST(group.type);
    
    -- Above PARTITION BY LIST(group.type) does not work.
    
    CREATE TABLE group_member_A PARTITION OF group_member FOR VALUES IN ('A');
    CREATE TABLE group_member_B PARTITION OF group_member FOR VALUES IN ('B'); 
    CREATE TABLE group_member_C PARTITION OF group_member FOR VALUES IN ('C');
    
    It does look like it would work if I added the "type" from the group table to the group_member table.  But
otherwisethe PARITION BY LIST does not support referencing a parent table.
 
    
    Lance 
    
    On 2/5/21, 12:42 PM, "Scott Ribe" <scott_ribe@elevated-dev.com> wrote:
    
        > On Feb 5, 2021, at 11:27 AM, Campbell, Lance <lance@illinois.edu> wrote:
        > 
        > So if I do a query like the below it would ideally pull from partition group_member_a rather than
group_member_other. However, how would PostgreSQL know on insert a group_member into the proper partition? 
 
        >  
        > Select group_member.* from group, group_member WHERE group.type=’A’ and group.id=group_member.user=’bob
smith’;
        
        When you define the partition, you declare what values go into it. With PG 12, you can insert into the parent
andthe row will get redirected into the right place. Some older versions (I don't remember how old) required you to
insertinto the right table, or some voodoo with rewrite rules or triggers.
 
        
        In other words, smarts about which partitions to scan in your example pre-dates smarts about inserting.
        
        I don't remember for sure, but I think maybe PG 13 adds the feature where if you update the group type, the row
getsmoved. Prior, you'd have to delete and re-insert. Someone correct me?
 
        
        > Also, what happens if I did a query like this.  Will it know to scan both group_member_a and
group_member_other:
        >  
        > Select group_member.* from group_member where group_member.user=’bob smith’;
        
        Yes.
        
        Anyway, you are on the right track, in that what you're describing could work.
        
        There are less invasive things you could try, which wouldn't give as much benefit--like partial indexes
        
        for instance
        
        create index ... on group_member(user) where type = 'A'
        
        that gets you fast location of a single user, but if you looked for a range of users, reading the index would
befast, but you'd still wind up with the matching rows scattered all over the single big tablee
 
        
        
        
    
    


Re: Partitioning Tables

От
Scott Ribe
Дата:
> On Feb 5, 2021, at 12:19 PM, Campbell, Lance <lance@illinois.edu> wrote:
>
> But it does not look like PostgreSQL supports referencing the parent tables for deciding what partition to put the
datain 

correct, I missed that in your first post





Re: Partitioning Tables

От
"David G. Johnston"
Дата:
On Friday, February 5, 2021, Campbell, Lance <lance@illinois.edu> wrote:

It does look like it would work if I added the "type" from the group table to the group_member table.  But otherwise the PARITION BY LIST does not support referencing a parent table.

Replace “parent table”  with “some random table only related to the table in question by a foreign key” and it probably will make more sense.

David J.

Re: Partitioning Tables

От
Laurenz Albe
Дата:
On Fri, 2021-02-05 at 18:27 +0000, Campbell, Lance wrote:
> I have a challenge.  I have two tables, group and group_member.  A group table has a
>  type indicator telling me which of three ways the table can be used.  The group member
>  table is made up of 50 million records that have a foreign key to the group table.
> 
> I really need the queries to be fast for one particular type of group.  This type has
>  less than a million members in it.  So my first thought was to create a separate
>  group_member table just for members of this type of group.  But I have to change a lot of SQL.
> 
> The other idea I thought of is there a way to use table partitions?  If the query goes
>  against a group of type A then it would pull from the small partition but if it is a group
>  of some other type it would query against the other partition. 

If you want to partition "group_member", the partitioning key cannot be in "group".

So you would have to (redundantly) add "group.type" to "group_member".
You could ensure consistency by including that column in the foreign key.

If you want partition pruning to take effect, you will have to include
a condition on "group_member.type" in the WHERE condition.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com