Re: Could postgres12 support millions of sequences? (like 10 million)

From: Adrian Klaver
Subject: Re: Could postgres12 support millions of sequences? (like 10 million)
Date:
Msg-id: 032b7a7f-e08a-fb7d-abf9-4a227f815f7a@aklaver.com
In response to: Re: Could postgres12 support millions of sequences? (like 10 million)  (Adrian Klaver <adrian.klaver@aklaver.com>)
Responses: Re: Could postgres12 support millions of sequences? (like 10 million)
List: pgsql-general
On 3/20/20 9:59 AM, Adrian Klaver wrote:
> On 3/19/20 10:31 PM, pabloa98 wrote:
>> I see.
>>
>> Any suggestions? It should behave like a sequence in the sense that 
>> concurrent transactions will get different numbers from this 
>> alternative, sequence-like solution.
>>
>> In our case, we will need to call nextval('some_seq') (or similar) 
>> from different processes no more than twice every minute.
>>
>>
>> It would be nice to have a sequence data type. Correct me if I am 
>> wrong, but it seems to me that a sequence data type would cost the 
>> same as or less than the current sequence implementation.
>>
>> The schema would be clearer too. We could have a table like:
>>
>> CREATE TABLE pair(
>> group INT NOT NULL,
>> element INT NOT NULL,
>> seq SEQUENCE INCREMENT 1
>>                START 1
>>                CACHE 1
>>                MINVALUE 1
>>                MAXVALUE 99999999
>>          NOT NULL,
>> CONSTRAINT PRIMARY KEY (group, element)
>> );
>>
>> And then:
>>
>> INSERT INTO event(group, element, code)
>> VALUES (
>>           1,
>>           1,
>>           ( SELECT seq.nextval('code_1_1_seq') FROM pair p WHERE 
>> p.group=1 and p.element=1 )
>> );
>>
>> Or perhaps storing all the sequences as rows in a single table would 
>> behave the same way.
> 
> If code is just something to show the sequence of insertion for group, 
> element combinations, then maybe something like below:
> 
> CREATE TABLE event(
> group_id INT NOT NULL, --changed as group is reserved word
> element INT NOT NULL,
> insert_ts timestamptz NOT NULL DEFAULT clock_timestamp(),
> PRIMARY KEY(group_id, element, insert_ts)
> );
> 
> 
> insert into event(group_id, element) VALUES
> (1, 1),
> (1, 1),
> (1, 1),
> (2, 1),
> (1, 1),
> (1, 3),
> (1, 1),
> (1, 3),
> (2, 1),
> (2, 1);
> 
> 
> select * from event ;
>   group_id | element |           insert_ts
> ----------+---------+--------------------------------
>          1 |       1 | 03/20/2020 09:51:12.675926 PDT
>          1 |       1 | 03/20/2020 09:51:12.675985 PDT
>          1 |       1 | 03/20/2020 09:51:12.675991 PDT
>          2 |       1 | 03/20/2020 09:51:12.675996 PDT
>          1 |       1 | 03/20/2020 09:51:12.676 PDT
>          1 |       3 | 03/20/2020 09:51:12.676004 PDT
>          1 |       1 | 03/20/2020 09:51:12.676008 PDT
>          1 |       3 | 03/20/2020 09:51:12.676012 PDT
>          2 |       1 | 03/20/2020 09:51:12.676016 PDT
>          2 |       1 | 03/20/2020 09:51:12.67602 PDT
> (10 rows)
> 
> 
> select group_id, element, row_number() OVER (partition by (group_id, 
> element) order by (group_id, element)) AS code from event;
>   group_id | element | code
> ----------+---------+------
>          1 |       1 |    1
>          1 |       1 |    2
>          1 |       1 |    3
>          1 |       1 |    4
>          1 |       1 |    5
>          1 |       3 |    1
>          1 |       3 |    2
>          2 |       1 |    1
>          2 |       1 |    2
>          2 |       1 |    3
> (10 rows)

Oops, the above actually returned:

select group_id, element, row_number() OVER (partition by (group_id, 
element) order by (group_id, element)) AS code, insert_ts from event;
  group_id | element | code |           insert_ts
----------+---------+------+--------------------------------
         1 |       1 |    1 | 03/20/2020 09:51:12.675926 PDT
         1 |       1 |    2 | 03/20/2020 09:51:12.675985 PDT
         1 |       1 |    3 | 03/20/2020 09:51:12.675991 PDT
         1 |       1 |    4 | 03/20/2020 09:51:12.676008 PDT
         1 |       1 |    5 | 03/20/2020 09:51:12.676 PDT
         1 |       3 |    1 | 03/20/2020 09:51:12.676004 PDT
         1 |       3 |    2 | 03/20/2020 09:51:12.676012 PDT
         2 |       1 |    1 | 03/20/2020 09:51:12.67602 PDT
         2 |       1 |    2 | 03/20/2020 09:51:12.676016 PDT
         2 |       1 |    3 | 03/20/2020 09:51:12.675996 PDT
(10 rows)


Needs to be:

select group_id, element, row_number() OVER (partition by (group_id, 
element) order by (group_id, element, insert_ts)) AS code, insert_ts 
from event;
  group_id | element | code |           insert_ts
----------+---------+------+--------------------------------
         1 |       1 |    1 | 03/20/2020 09:51:12.675926 PDT
         1 |       1 |    2 | 03/20/2020 09:51:12.675985 PDT
         1 |       1 |    3 | 03/20/2020 09:51:12.675991 PDT
         1 |       1 |    4 | 03/20/2020 09:51:12.676 PDT
         1 |       1 |    5 | 03/20/2020 09:51:12.676008 PDT
         1 |       3 |    1 | 03/20/2020 09:51:12.676004 PDT
         1 |       3 |    2 | 03/20/2020 09:51:12.676012 PDT
         2 |       1 |    1 | 03/20/2020 09:51:12.675996 PDT
         2 |       1 |    2 | 03/20/2020 09:51:12.676016 PDT
         2 |       1 |    3 | 03/20/2020 09:51:12.67602 PDT
(10 rows)
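
Note that row_number() assigns codes at query time; they are not stored 
anywhere and will shift if earlier rows are ever deleted. If the code 
has to be fixed at insert time instead, one commonly suggested 
alternative is a single counter table keyed by the pair, updated in the 
same transaction as the insert. A minimal sketch (the table and column 
names are made up):

CREATE TABLE pair_counter(
group_id INT NOT NULL,
element INT NOT NULL,
last_code BIGINT NOT NULL DEFAULT 0,
PRIMARY KEY(group_id, element)
);

-- seed one row per (group_id, element) pair, then claim values with:
UPDATE pair_counter
SET last_code = last_code + 1
WHERE group_id = 1 AND element = 1
RETURNING last_code;

The row lock taken by the UPDATE serializes concurrent claimers of the 
same pair, so each transaction gets a distinct value; and because the 
increment rolls back with its transaction, the numbering stays gap-free, 
at the cost of blocking concurrent writers on the same pair.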

> 
>>
>> Pablo
>>
>>
>> On Thu, Mar 19, 2020 at 7:56 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>
>>     Michael Lewis <mlewis@entrata.com> writes:
>>      > On Thu, Mar 19, 2020, 5:48 PM David G. Johnston
>>      > <david.g.johnston@gmail.com> wrote:
>>      >> However, one other consideration with sequences: do you care that
>>      >> PostgreSQL will cache/pin (i.e., no release) every single
>>     sequence you
>>      >> touch for the lifetime of the session? (I do not think DISCARD
>>     matters here
>>      >> but I'm just guessing)
>>
>>      > Would you expand on this point or is there someplace specific in
>>      > the documentation on this?
>>
>>     I think what David is worried about is that a sequence object is a
>>     one-row table in PG's implementation.  Thus
>>
>>     (1) each sequence requires a dozen or two rows in assorted system
>>     catalogs (not sure exactly how many offhand).
>>
>>     (2) each sequence uses up 8KB on disk for its table file.
>>
>>     (3) each sequence you actually access within a particular session
>>     results in creation of relcache and catcache entries in that
>>     session's local memory.  I'm not sure offhand how big those are
>>     either, but a few KB per sequence would be a reasonable guess.
>>
>>     (4) each sequence competes for space in the shared-buffer arena,
>>     since its 8K block has to be swapped into there whenever you try
>>     to access/increment the sequence's value.
>>
>>     This is all mighty inefficient of course, and there's been talk
>>     of trying to reduce the per-sequence overhead; but I don't know
>>     of anyone actively working on that.  As things stand, I think
>>     having millions of sequences would be quite painful performance-
>>     wise, especially if your workload were actively hitting a lot
>>     of them concurrently.  It would work, for some value of "work",
>>     but it wouldn't perform very well.
>>
>>     Also, as multiple people mentioned already, this still wouldn't
>>     guarantee gap-free sequences of ID values.
>>
>>                              regards, tom lane
>>
> 
> 
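
FWIW, points (1) and (2) above are easy to demonstrate (the sequence 
name here is just for illustration):

CREATE SEQUENCE demo_seq;

-- a sequence is a one-row relation occupying a full 8kB block on disk:
SELECT pg_relation_size('demo_seq');   -- 8192

-- plus assorted catalog rows, e.g. its pg_class entry and:
SELECT * FROM pg_sequence WHERE seqrelid = 'demo_seq'::regclass;

Multiply that by 10 million and the disk space and catalog bloat add up 
quickly, before even counting the per-session cache entries.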


-- 
Adrian Klaver
adrian.klaver@aklaver.com


