Re: [HACKERS] Replication/backup defaults

Поиск
Список
Период
Сортировка
От Tomas Vondra
Тема Re: [HACKERS] Replication/backup defaults
Дата
Msg-id c2d7db7e-7e05-51cf-206d-07a8cafe1c98@2ndquadrant.com
обсуждение исходный текст
Ответ на [HACKERS] Replication/backup defaults  (Magnus Hagander <magnus@hagander.net>)
Ответы Re: [HACKERS] Replication/backup defaults  (Jim Nasby <Jim.Nasby@BlueTreble.com>)
Список pgsql-hackers
On 01/05/2017 05:37 PM, Stephen Frost wrote:
> Tomas,
>
> * Tomas Vondra (tomas.vondra@2ndquadrant.com) wrote:
>> On 01/05/2017 02:23 PM, Magnus Hagander wrote:
>>> It's easy enough to construct a benchmark specifically to show the
>>> difference, but of any actual "normal workload" for it. Typically the
>>> optimization applies to things like bulk loading, which typically never
>>> done alone and does not lend itself to that type of benchmarking very
>>> easily.
>>
>> Not sure if I understand correctly what you're saying. You're saying
>> that although it'd be easy to construct a benchmark showing
>> significant performance impact, it won't represent a common
>> workload. Correct?
>
> I think he's saying that it's not very easy to construct a good example
> of typical bulk-loading workloads using just pgbench.  Bulk loading
> certainly happens with PG and I don't think we'll make very many friends
> if we break optimizations when wal_level is set to minimal like those
> you get using:
>
> BEGIN;
> CREATE TABLE x (c1 int);
> COPY x FROM STDIN;
> COMMIT;
>
> or:
>
> BEGIN;
> TRUNCATE x;
> COPY x FROM STDIN;
> COMMIT;
>
> Changing the wal_level from 'minimal' to 'replica' or 'logical' with
> such a benchmark is going to make the WAL go from next-to-nothing to
> size-of-database.

Sure, I do know how to construct such workloads - and it's trivial even 
with pgbench custom scripts. The question is whether such workloads are 
common or not.

Most importantly, no one is proposing to break the optimizations, but 
changing the defaults - users relying on the optimizations are free to 
switch back to wal_level=minimal if needed.
>
> One doesn't typically *just* do bulk loads, however,
> often it's a bulk load into a table and then the contents of that table
> are merged with another table or perhaps joined to it to produce some
> report or something along those lines.  In many of those cases, our
> more-recently added capability to have UNLOGGED tables will work, but
> not all (in particular, it can be very handy to load everything in using
> the above technique and then switch the wal_level to replica, which
> avoids having to have the bulk of the data sent through WAL, something
> you can't avoid if you want to turn an unlogged table into a logged
> one).
>

Ultimately, the question is whether the number of people running into 
"Hey, I can't take pg_basebackup or setup a standby with the default 
config!" is higher or lower than number of people running into "Hey, 
CREATE TABLE + COPY is slower now!"

I haven't seen many systems relying on such load optimizations, for a 
number of reasons:

1) The important/critical systems usually have replicas, so are 
inherently incompatible with wal_level=minimal.

2) The batch jobs usually don't truncate the main table, but load the 
increment into a temporary/unlogged table first, then merge it into the 
main one.

That is not to say there are no other cases benefiting from those 
optimizations, but we're talking about the default value - we're not 
removing the wal_level=minimal.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Ants Aasma
Дата:
Сообщение: Re: [HACKERS] Replication slot xmin is not reset if HS feedback isturned off while standby is shut down
Следующее
От: Tom Lane
Дата:
Сообщение: Re: [HACKERS] [COMMITTERS] pgsql: Fix possible crash reading pg_stat_activity.