Обсуждение: [COMMITTERS] pgsql: Mark pg_start_backup and pg_stop_backup as parallel-restricted.
[COMMITTERS] pgsql: Mark pg_start_backup and pg_stop_backup as parallel-restricted.
От
Robert Haas
Дата:
Mark pg_start_backup and pg_stop_backup as parallel-restricted. They depend on backend-private state that will not be synchronized by the parallel machinery, so they should not be marked parallel-safe. This issue also exists in 9.6, but we obviously can't do anything about 9.6 clusters that already exist. Possibly this could be back-patched so that future 9.6 clusters would come out OK, or possibly we should back-patch some other fix, but that would need more discussion. David Steele, reviewed by Michael Paquier Discussion: http://postgr.es/m/CA+TgmoYCWfO2UM-t=HUMFJyxJywLDiLL0nAJpx88LKtvBvNECw@mail.gmail.com Branch ------ master Details ------- http://git.postgresql.org/pg/commitdiff/9fe3c644a73198941e9a502958c24727dc4a6434 Modified Files -------------- src/include/catalog/catversion.h | 2 +- src/include/catalog/pg_proc.h | 4 ++-- 2 files changed, 3 insertions(+), 3 deletions(-)
Re: [COMMITTERS] pgsql: Mark pg_start_backup and pg_stop_backup asparallel-restricted.
От
David Steele
Дата:
On 3/6/17 12:48 PM, Robert Haas wrote: > Mark pg_start_backup and pg_stop_backup as parallel-restricted. > > They depend on backend-private state that will not be synchronized by > the parallel machinery, so they should not be marked parallel-safe. > This issue also exists in 9.6, but we obviously can't do anything > about 9.6 clusters that already exist. Possibly this could be > back-patched so that future 9.6 clusters would come out OK, or > possibly we should back-patch some other fix, but that would need more > discussion. I think it would be worth back-patching the catalog fix for future 9.6 clusters as a start. Parallelism is off by default in 9.6 so that mitigates some of the problem. I don't have any regression tests that cover backups when parallelism is enabled in 9.6, but I'm willing to do that and see if this is a realistic issue or not. -- -David david@pgmasters.net
David Steele <david@pgmasters.net> writes: > On 3/6/17 12:48 PM, Robert Haas wrote: >> This issue also exists in 9.6, but we obviously can't do anything >> about 9.6 clusters that already exist. Possibly this could be >> back-patched so that future 9.6 clusters would come out OK, or >> possibly we should back-patch some other fix, but that would need more >> discussion. > I think it would be worth back-patching the catalog fix for future 9.6 > clusters as a start. Yes, I think it's rather silly not to do so. We have made comparable backpatched fixes multiple times in the past. What is worth discussing is whether there are *additional* things we ought to do in 9.6 to prevent misbehavior in installations initdb'd pre-9.6.3. If there's a cheap way of testing "AmInParallelWorker", I'd be in favor of adding a quick-n-dirty test and ereport(ERROR) to these functions in the 9.6 branch, so that at least you get a clean error and not some weird misbehavior. Not sure if there's anything more we can do than that. regards, tom lane
Re: [COMMITTERS] pgsql: Mark pg_start_backup and pg_stop_backup asparallel-restricted.
От
Stephen Frost
Дата:
Tom, * Tom Lane (tgl@sss.pgh.pa.us) wrote: > David Steele <david@pgmasters.net> writes: > > On 3/6/17 12:48 PM, Robert Haas wrote: > >> This issue also exists in 9.6, but we obviously can't do anything > >> about 9.6 clusters that already exist. Possibly this could be > >> back-patched so that future 9.6 clusters would come out OK, or > >> possibly we should back-patch some other fix, but that would need more > >> discussion. > > > I think it would be worth back-patching the catalog fix for future 9.6 > > clusters as a start. > > Yes, I think it's rather silly not to do so. We have made comparable > backpatched fixes multiple times in the past. What is worth discussing is > whether there are *additional* things we ought to do in 9.6 to prevent > misbehavior in installations initdb'd pre-9.6.3. > > If there's a cheap way of testing "AmInParallelWorker", I'd be in favor of > adding a quick-n-dirty test and ereport(ERROR) to these functions in the > 9.6 branch, so that at least you get a clean error and not some weird > misbehavior. Not sure if there's anything more we can do than that. That's more-or-less what I was thinking (and suggested to David over IM a little while ago, actually). I don't know if there's an easy way to do such a check, but I don't think it would really need to be particularly cheap, just not overly complex. These code paths are certainly not ones that need to be high-performance. Thanks! Stephen
Вложения
On Mon, Mar 6, 2017 at 3:23 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Yes, I think it's rather silly not to do so. We have made comparable > backpatched fixes multiple times in the past. What is worth discussing is > whether there are *additional* things we ought to do in 9.6 to prevent > misbehavior in installations initdb'd pre-9.6.3. > > If there's a cheap way of testing "AmInParallelWorker", I'd be in favor of > adding a quick-n-dirty test and ereport(ERROR) to these functions in the > 9.6 branch, so that at least you get a clean error and not some weird > misbehavior. Not sure if there's anything more we can do than that. Sounds like you want IsParallelWorker(). -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [COMMITTERS] pgsql: Mark pg_start_backup and pg_stop_backup asparallel-restricted.
От
David Steele
Дата:
On 3/6/17 3:28 PM, Stephen Frost wrote: > > * Tom Lane (tgl@sss.pgh.pa.us) wrote: >> David Steele <david@pgmasters.net> writes: >>> On 3/6/17 12:48 PM, Robert Haas wrote: >>>> This issue also exists in 9.6, but we obviously can't do anything >>>> about 9.6 clusters that already exist. Possibly this could be >>>> back-patched so that future 9.6 clusters would come out OK, or >>>> possibly we should back-patch some other fix, but that would need more >>>> discussion. >> >>> I think it would be worth back-patching the catalog fix for future 9.6 >>> clusters as a start. >> >> Yes, I think it's rather silly not to do so. We have made comparable >> backpatched fixes multiple times in the past. What is worth discussing is >> whether there are *additional* things we ought to do in 9.6 to prevent >> misbehavior in installations initdb'd pre-9.6.3. >> >> If there's a cheap way of testing "AmInParallelWorker", I'd be in favor of >> adding a quick-n-dirty test and ereport(ERROR) to these functions in the >> 9.6 branch, so that at least you get a clean error and not some weird >> misbehavior. Not sure if there's anything more we can do than that. > > That's more-or-less what I was thinking (and suggested to David over IM > a little while ago, actually). I don't know if there's an easy way to > do such a check, but I don't think it would really need to be > particularly cheap, just not overly complex. These code paths are > certainly not ones that need to be high-performance. Way back when, I tried to get backups on 9.6 to fail due to pg_stop_backup() running in a parallel worker and I was not able to make it happen so I gave up and moved on to other things. However, we just got a report from the field that a user ran into this exact situation: ERROR: [057]: raised from remote-0 protocol on 'XX.XX.XXX.XXX': unable to execute query 'select lsn::text as lsn, pg_catalog.pg_xlogfile_name(lsn)::text as wal_segment_name, labelfile::text as backuplabel_file, spcmapfile::text as tablespacemap_file from pg_catalog.pg_stop_backup(false)' ERROR: non-exclusive backup is not in progress HINT: Did you mean to use pg_stop_backup('t')? CONTEXT: parallel worker https://github.com/pgbackrest/pgbackrest/issues/1083 So apparently it is possible. To get them working as soon as possible I recommended that they run: alter role postgres set max_parallel_workers_per_gather = 0; And that solved their problem. 9.6 is getting on in years so I'm not sure how much time/effort we want to spend on this, but I figured it was worth mentioning. I did another round of trying to reproduce the issue but came up short a second time. I'm willing to put together a patch for 9.6 to update the catalog and/or add the error if there is interest. Thoughts? Regards, -- -David david@pgmasters.net
David Steele <david@pgmasters.net> writes: > ... So apparently it is possible. To get them working as soon as possible I > recommended that they run: > alter role postgres set max_parallel_workers_per_gather = 0; > And that solved their problem. 9.6 is getting on in years so I'm not > sure how much time/effort we want to spend on this, but I figured it was > worth mentioning. > I did another round of trying to reproduce the issue but came up short a > second time. I was able to force it like this: regression=# select pg_start_backup('foo', 'f', 'f'); pg_start_backup ----------------- 0/18000028 (1 row) regression=# set force_parallel_mode TO 'on'; SET regression=# set max_parallel_workers_per_gather = 4; SET regression=# explain select * from pg_catalog.pg_stop_backup(false); QUERY PLAN -------------------------------------------------------------------------- Gather (cost=1000.00..1000.11 rows=1 width=72) Workers Planned: 1 Single Copy: true -> Function Scan on pg_stop_backup (cost=0.00..0.01 rows=1 width=72) (4 rows) regression=# select * from pg_catalog.pg_stop_backup(false); ERROR: non-exclusive backup is not in progress HINT: Did you mean to use pg_stop_backup('t')? CONTEXT: parallel worker It doesn't seem terribly likely that anybody would be using force_parallel_mode = on in production, but perhaps there's some reasonable combination of the other parallelization planning GUCs that can make this plan look attractive. I'm kind of inclined not to bother with a code fix at this late date, given that we've had so few trouble reports. The right answer is to tell anyone who's affected to fix their catalogs manually, viz update pg_proc set proparallel = 'r' where proname = 'pg_start_backup'; update pg_proc set proparallel = 'r' where proname = 'pg_stop_backup'; If we had back-patched 9fe3c644a (sans catversion bump of course) at the time, then initdb's done with 9.6.3 or later would have gotten this right. In hindsight, not doing that was clearly wrong. But it seems a bit late to do it now. regards, tom lane
Re: [COMMITTERS] pgsql: Mark pg_start_backup and pg_stop_backup asparallel-restricted.
От
David Steele
Дата:
On 6/24/20 6:27 PM, Tom Lane wrote: > David Steele <david@pgmasters.net> writes: >> ... So apparently it is possible. To get them working as soon as possible I >> recommended that they run: >> alter role postgres set max_parallel_workers_per_gather = 0; >> And that solved their problem. 9.6 is getting on in years so I'm not >> sure how much time/effort we want to spend on this, but I figured it was >> worth mentioning. > >> I did another round of trying to reproduce the issue but came up short a >> second time. > > I was able to force it like this: > > regression=# set force_parallel_mode TO 'on'; > SET Ah, yes, that works. Now at least I can test it. > It doesn't seem terribly likely that anybody would be using > force_parallel_mode = on in production, but perhaps there's some > reasonable combination of the other parallelization planning GUCs > that can make this plan look attractive. I'll also ask the user if they have this GUC enabled. > I'm kind of inclined not to bother with a code fix at this late date, > given that we've had so few trouble reports. The right answer is > to tell anyone who's affected to fix their catalogs manually, viz > > update pg_proc set proparallel = 'r' where proname = 'pg_start_backup'; > update pg_proc set proparallel = 'r' where proname = 'pg_stop_backup'; Only pg_stop_backup() needs to be updated, but that's probably the best solution. > If we had back-patched 9fe3c644a (sans catversion bump of course) > at the time, then initdb's done with 9.6.3 or later would have gotten > this right. In hindsight, not doing that was clearly wrong. But it > seems a bit late to do it now. OK by me. Regards, -- -David david@pgmasters.net
Re: [COMMITTERS] pgsql: Mark pg_start_backup and pg_stop_backup as parallel-restricted.
От
Magnus Hagander
Дата:
On Thu, Jun 25, 2020 at 2:11 PM David Steele <david@pgmasters.net> wrote:
On 6/24/20 6:27 PM, Tom Lane wrote:
> David Steele <david@pgmasters.net> writes:
>> ... So apparently it is possible. To get them working as soon as possible I
>> recommended that they run:
>> alter role postgres set max_parallel_workers_per_gather = 0;
>> And that solved their problem. 9.6 is getting on in years so I'm not
>> sure how much time/effort we want to spend on this, but I figured it was
>> worth mentioning.
>
>> I did another round of trying to reproduce the issue but came up short a
>> second time.
>
> I was able to force it like this:
>
> regression=# set force_parallel_mode TO 'on';
> SET
Ah, yes, that works. Now at least I can test it.
> It doesn't seem terribly likely that anybody would be using
> force_parallel_mode = on in production, but perhaps there's some
> reasonable combination of the other parallelization planning GUCs
> that can make this plan look attractive.
I'll also ask the user if they have this GUC enabled.
Maybe have pgbackrest issue an explicit SET force_parallel_mode=off when it runs against a 9.6?
> I'm kind of inclined not to bother with a code fix at this late date,
> given that we've had so few trouble reports. The right answer is
> to tell anyone who's affected to fix their catalogs manually, viz
>
> update pg_proc set proparallel = 'r' where proname = 'pg_start_backup';
> update pg_proc set proparallel = 'r' where proname = 'pg_stop_backup';
Only pg_stop_backup() needs to be updated, but that's probably the best
solution.
> If we had back-patched 9fe3c644a (sans catversion bump of course)
> at the time, then initdb's done with 9.6.3 or later would have gotten
> this right. In hindsight, not doing that was clearly wrong. But it
> seems a bit late to do it now.
OK by me.
+1. Seems infrequent enough or we would've heard much about it before. And it's not going to be *all* that common with brand new clusters on 9.6 these days (I hope).
Re: [COMMITTERS] pgsql: Mark pg_start_backup and pg_stop_backup asparallel-restricted.
От
David Steele
Дата:
On 6/25/20 8:43 AM, Magnus Hagander wrote: > On Thu, Jun 25, 2020 at 2:11 PM David Steele <david@pgmasters.net > <mailto:david@pgmasters.net>> wrote: > On 6/24/20 6:27 PM, Tom Lane wrote: > > > > I was able to force it like this: > > > > regression=# set force_parallel_mode TO 'on'; > > SET > > Ah, yes, that works. Now at least I can test it. > > > It doesn't seem terribly likely that anybody would be using > > force_parallel_mode = on in production, but perhaps there's some > > reasonable combination of the other parallelization planning GUCs > > that can make this plan look attractive. > > I'll also ask the user if they have this GUC enabled. The user confirmed they are running with force_parallel_mode=on so that probably explains why we have never seen this in the field before. > Maybe have pgbackrest issue an explicit SET force_parallel_mode=off when > it runs against a 9.6? According to the documentation the way to disable parallelism is: set max_parallel_workers_per_gather = 0 So we added that to session initialization in pgBackRest: https://github.com/pgbackrest/pgbackrest/commit/ea04ec7b3f4c6cf25c1b827c25c6a3c5896159a8 I'm worried that (as Tom said) the planner might find another reason to choose a parallel query. I'm looking at this as more than a fix for 9.6. I can't see any reason for the backup control session to run queries in parallel and possibly use more resources, so parallelism is disabled for all versions that support it. Regards, -- -David david@pgmasters.net
Re: [COMMITTERS] pgsql: Mark pg_start_backup and pg_stop_backup as parallel-restricted.
От
Magnus Hagander
Дата:
On Thu, Jun 25, 2020 at 4:56 PM David Steele <david@pgmasters.net> wrote:
On 6/25/20 8:43 AM, Magnus Hagander wrote:
> On Thu, Jun 25, 2020 at 2:11 PM David Steele <david@pgmasters.net
> <mailto:david@pgmasters.net>> wrote:
> On 6/24/20 6:27 PM, Tom Lane wrote:
> >
> > I was able to force it like this:
> >
> > regression=# set force_parallel_mode TO 'on';
> > SET
>
> Ah, yes, that works. Now at least I can test it.
>
> > It doesn't seem terribly likely that anybody would be using
> > force_parallel_mode = on in production, but perhaps there's some
> > reasonable combination of the other parallelization planning GUCs
> > that can make this plan look attractive.
>
> I'll also ask the user if they have this GUC enabled.
The user confirmed they are running with force_parallel_mode=on so that
probably explains why we have never seen this in the field before.
That's good.
> Maybe have pgbackrest issue an explicit SET force_parallel_mode=off when
> it runs against a 9.6?
According to the documentation the way to disable parallelism is:
set max_parallel_workers_per_gather = 0
Interesting. I had somehow gotten into my head that force_parallel_mode would override that. But it doesn't.
So we added that to session initialization in pgBackRest:
https://github.com/pgbackrest/pgbackrest/commit/ea04ec7b3f4c6cf25c1b827c25c6a3c5896159a8
Personally I would've done it *just* for 9.6 and not for 10+, but that's mostly semantic :) But if you do it for 9.6 then *eventually* you will be able to retire it.
I'm worried that (as Tom said) the planner might find another reason to
choose a parallel query.
I'm looking at this as more than a fix for 9.6. I can't see any reason
for the backup control session to run queries in parallel and possibly
use more resources, so parallelism is disabled for all versions that
support it.
Right. But since the parameters are flagged as parallel restricted in 10+...
Or are you saying you're worried about other things, completely unrelated to start/stop backup, that the session might run?
David Steele <david@pgmasters.net> writes: > I'm looking at this as more than a fix for 9.6. I can't see any reason > for the backup control session to run queries in parallel and possibly > use more resources, so parallelism is disabled for all versions that > support it. Post-9.6, these functions are correctly marked as parallel-restricted, so there really shouldn't be an issue. regards, tom lane
Re: [COMMITTERS] pgsql: Mark pg_start_backup and pg_stop_backup asparallel-restricted.
От
David Steele
Дата:
On 6/25/20 11:02 AM, Magnus Hagander wrote: > On Thu, Jun 25, 2020 at 4:56 PM David Steele <david@pgmasters.net > <mailto:david@pgmasters.net>> wrote: > > So we added that to session initialization in pgBackRest: > > https://github.com/pgbackrest/pgbackrest/commit/ea04ec7b3f4c6cf25c1b827c25c6a3c5896159a8 > > Personally I would've done it *just* for 9.6 and not for 10+, but that's > mostly semantic :) But if you do it for 9.6 then *eventually* you will > be able to retire it. We still support 8.3 so the day when we drop 9.6 support seems so far away that I'm just not worried about it. And yes, we still see users with 8.3/8.4 databases. Mostly they just want to update but backups are an essential part of that process and some of them are multi-TB databases. We are planning to drop 8.3 support soon, though, and just tell users to go back and use X version of pgbackrest if they really need it. This has more to do with the availability of packages than anything else. Christoph has built packages all the way back to 8.4 for all recent Debian distros (thanks Christoph!) but the only place we can get 8.3 packages is on EOL distros, e.g. Ubuntu 12.04. > I'm worried that (as Tom said) the planner might find another reason to > choose a parallel query. > > I'm looking at this as more than a fix for 9.6. I can't see any reason > for the backup control session to run queries in parallel and possibly > use more resources, so parallelism is disabled for all versions that > support it. > > Right. But since the parameters are flagged as parallel restricted in 10+... > > Or are you saying you're worried about other things, completely > unrelated to start/stop backup, that the session might run? That's what I'm worried about. We run other queries and functions besides pg_start_backup()/pg_stop_backup(). None of them need to be parallelized so why not just disable it? One less variable to worry about. Regards, -- -David david@pgmasters.net