Обсуждение: Issue with bgworker, SPI and pgstat_report_stat

Поиск
Список
Период
Сортировка

Issue with bgworker, SPI and pgstat_report_stat

От
Julien Rouhaud
Дата:
Hello,

While investigating on a bloat issue with a colleague, we found that if
a bgworker executes some queries with SPI, the statistic changes will
never be reported, since pgstat_report_stat() is only called in regular
backends.

In our case, the bgworker is the only process inserting and deleting a
large amount of data on some tables, so the autovacuum never tried to do
any maintenance on these tables.

Should a bgworker modifing data have to call pgstat_report_stat() to
avoid this problem? I don't find any documentation suggesting it, and it
seems that worker_spi (used as a template for this bgworker and I
suppose a lot of other one) is also affected.

If yes, I think at least worker_spi should be fixed (patched attached).

Regards.

--
Julien Rouhaud
http://dalibo.com - http://dalibo.org

Вложения

Re: Issue with bgworker, SPI and pgstat_report_stat

От
Robert Haas
Дата:
On Thu, Jul 7, 2016 at 1:52 PM, Julien Rouhaud
<julien.rouhaud@dalibo.com> wrote:
> While investigating on a bloat issue with a colleague, we found that if
> a bgworker executes some queries with SPI, the statistic changes will
> never be reported, since pgstat_report_stat() is only called in regular
> backends.
>
> In our case, the bgworker is the only process inserting and deleting a
> large amount of data on some tables, so the autovacuum never tried to do
> any maintenance on these tables.

Ouch.

> Should a bgworker modifing data have to call pgstat_report_stat() to
> avoid this problem? I don't find any documentation suggesting it, and it
> seems that worker_spi (used as a template for this bgworker and I
> suppose a lot of other one) is also affected.

That certainly seems like the simplest fix.  Not sure if there's a better one.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Issue with bgworker, SPI and pgstat_report_stat

От
Andres Freund
Дата:
On 2016-07-07 14:04:36 -0400, Robert Haas wrote:
> On Thu, Jul 7, 2016 at 1:52 PM, Julien Rouhaud
> <julien.rouhaud@dalibo.com> wrote:
> > Should a bgworker modifing data have to call pgstat_report_stat() to
> > avoid this problem? I don't find any documentation suggesting it, and it
> > seems that worker_spi (used as a template for this bgworker and I
> > suppose a lot of other one) is also affected.
> 
> That certainly seems like the simplest fix.  Not sure if there's a better one.

I think a better fix would be to unify the startup & error handling
code. We have way to many slightly diverging copies. But that's a bigger
task, so I'm not protesting against making a more localized fix.

Andres



Re: Issue with bgworker, SPI and pgstat_report_stat

От
Michael Paquier
Дата:
On Fri, Jul 8, 2016 at 3:06 AM, Andres Freund <andres@anarazel.de> wrote:
> On 2016-07-07 14:04:36 -0400, Robert Haas wrote:
>> On Thu, Jul 7, 2016 at 1:52 PM, Julien Rouhaud
>> <julien.rouhaud@dalibo.com> wrote:
>> > Should a bgworker modifing data have to call pgstat_report_stat() to
>> > avoid this problem? I don't find any documentation suggesting it, and it
>> > seems that worker_spi (used as a template for this bgworker and I
>> > suppose a lot of other one) is also affected.
>>
>> That certainly seems like the simplest fix.  Not sure if there's a better one.
>
> I think a better fix would be to unify the startup & error handling
> code. We have way to many slightly diverging copies. But that's a bigger
> task, so I'm not protesting against making a more localized fix.

It seems to me that the only fix is to have the bgworker call
pgstat_report_stat() and not mess up with the in-core backend code.
Personally, I always had the image of a bgworker to be an independent
process, so when it wants to do something, be it mimicking normal
backends, it should do it by itself. Take the example of SIGHUP
processing. If the bgworker does not ProcessConfigFile() no parameters
updates will happen in the context of the bgworker.
-- 
Michael



Re: Issue with bgworker, SPI and pgstat_report_stat

От
Julien Rouhaud
Дата:
On 08/07/2016 01:53, Michael Paquier wrote:
> On Fri, Jul 8, 2016 at 3:06 AM, Andres Freund <andres@anarazel.de> wrote:
>> On 2016-07-07 14:04:36 -0400, Robert Haas wrote:
>>> On Thu, Jul 7, 2016 at 1:52 PM, Julien Rouhaud
>>> <julien.rouhaud@dalibo.com> wrote:
>>>> Should a bgworker modifing data have to call pgstat_report_stat() to
>>>> avoid this problem? I don't find any documentation suggesting it, and it
>>>> seems that worker_spi (used as a template for this bgworker and I
>>>> suppose a lot of other one) is also affected.
>>>
>>> That certainly seems like the simplest fix.  Not sure if there's a better one.
>>
>> I think a better fix would be to unify the startup & error handling
>> code. We have way to many slightly diverging copies. But that's a bigger
>> task, so I'm not protesting against making a more localized fix.
> 
> It seems to me that the only fix is to have the bgworker call
> pgstat_report_stat() and not mess up with the in-core backend code.
> Personally, I always had the image of a bgworker to be an independent
> process, so when it wants to do something, be it mimicking normal
> backends, it should do it by itself. Take the example of SIGHUP
> processing. If the bgworker does not ProcessConfigFile() no parameters
> updates will happen in the context of the bgworker.
> 

I'm not opposed, but in this case we should also provide a proper
documentation of all the required actions to mimick normal backends.

-- 
Julien Rouhaud
http://dalibo.com - http://dalibo.org



Re: Issue with bgworker, SPI and pgstat_report_stat

От
Michael Paquier
Дата:
On Mon, Jul 11, 2016 at 3:36 AM, Julien Rouhaud
<julien.rouhaud@dalibo.com> wrote:
> I'm not opposed, but in this case we should also provide a proper
> documentation of all the required actions to mimick normal backends.

I'd rather just add a note like "Have a look at PostgresMain if you
want to imitate a backend able to run queries!" instead of listing all
the actions doable. If low-level things are added or removed in the
future in PostgresMain, it is very likely that the documentation will
not be updated at the same time, killing its purpose at the end.
-- 
Michael



Re: Issue with bgworker, SPI and pgstat_report_stat

От
Craig Ringer
Дата:
On 11 July 2016 at 11:49, Michael Paquier <michael.paquier@gmail.com> wrote:
On Mon, Jul 11, 2016 at 3:36 AM, Julien Rouhaud
<julien.rouhaud@dalibo.com> wrote:
> I'm not opposed, but in this case we should also provide a proper
> documentation of all the required actions to mimick normal backends.

I'd rather just add a note like "Have a look at PostgresMain if you
want to imitate a backend able to run queries!" instead of listing all
the actions doable. If low-level things are added or removed in the
future in PostgresMain, it is very likely that the documentation will
not be updated at the same time, killing its purpose at the end.

That sounds like a bug breeding ground. Especially with extensions whose bgworkers operate across Pg versions, extensions that get updated without re-checking PostgresMain and don't notice some new housekeeping task, etc.

Rather than encouraging every extension author to copy and paste random chunks of PostgresMain, probably incorrectly, I really think factoring the required logic out into something reusable by bgworkers is going to be the way to go.  
 
--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: Issue with bgworker, SPI and pgstat_report_stat

От
Tomas Vondra
Дата:
On 07/11/2016 06:53 AM, Craig Ringer wrote:
> On 11 July 2016 at 11:49, Michael Paquier <michael.paquier@gmail.com
> <mailto:michael.paquier@gmail.com>> wrote:
>
>     On Mon, Jul 11, 2016 at 3:36 AM, Julien Rouhaud
>     <julien.rouhaud@dalibo.com <mailto:julien.rouhaud@dalibo.com>> wrote:
>     > I'm not opposed, but in this case we should also provide a proper
>     > documentation of all the required actions to mimick normal backends.
>
>     I'd rather just add a note like "Have a look at PostgresMain if you
>     want to imitate a backend able to run queries!" instead of listing all
>     the actions doable. If low-level things are added or removed in the
>     future in PostgresMain, it is very likely that the documentation will
>     not be updated at the same time, killing its purpose at the end.
>
>
> That sounds like a bug breeding ground. Especially with extensions whose
> bgworkers operate across Pg versions, extensions that get updated
> without re-checking PostgresMain and don't notice some new housekeeping
> task, etc.
>
> Rather than encouraging every extension author to copy and paste random
> chunks of PostgresMain, probably incorrectly, I really think factoring
> the required logic out into something reusable by bgworkers is going to
> be the way to go.
>

I'm not sure I agree with this. Clearly, the fact that worker_spi does 
not invoke pgstat_report_stat() is a bug, but as Michael points out, we 
don't have much insight into what is happening in bgworkers.

Following the changes in PostgresMain() - particularly if your bgworker 
needs to support multiple versions - is difficult, sure. But well, if 
you decided to implement a bgworker operating at such low level, you 
voluntarily accepts that responsibility.

That does not mean we can't make that easier. For example, what about 
extending the bgworker API with

(a) a set of flags (similar to bgw_flags) identifying which maintenance 
tasks the bgworker requests, and

(b) a BackgroundWorkerCleanup() function the bgworker might place at a 
suitable place, invoking all the requested maintenance tasks?

Sure, this may only help for bgworkers that do stuff fairly close to 
regular backends, but maybe that's enough.

In any case, I think adding the pgstat_report_stat() into worker_spi 
seems like a reasonable (and backpatchable) fix.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Issue with bgworker, SPI and pgstat_report_stat

От
Michael Paquier
Дата:
On Sat, Sep 3, 2016 at 10:02 AM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
> In any case, I think adding the pgstat_report_stat() into worker_spi seems
> like a reasonable (and backpatchable) fix.

Doing just that sounds reasonable seen from here. I am wondering also
if it would not be worth mentioning in the documentation of the
bgworkers that users trying to emulate somewhat the behavior of a
backend should look at PostgresMain(). The code in itself is full of
hints as well.
-- 
Michael



Re: Issue with bgworker, SPI and pgstat_report_stat

От
Robert Haas
Дата:
On Sat, Sep 3, 2016 at 12:29 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Sat, Sep 3, 2016 at 10:02 AM, Tomas Vondra
> <tomas.vondra@2ndquadrant.com> wrote:
>> In any case, I think adding the pgstat_report_stat() into worker_spi seems
>> like a reasonable (and backpatchable) fix.
>
> Doing just that sounds reasonable seen from here. I am wondering also
> if it would not be worth mentioning in the documentation of the
> bgworkers that users trying to emulate somewhat the behavior of a
> backend should look at PostgresMain(). The code in itself is full of
> hints as well.

Everybody seems happy with this fix for a first step, so I've
committed it and back-patched it to 9.3.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Issue with bgworker, SPI and pgstat_report_stat

От
Julien Rouhaud
Дата:
On 28/09/2016 18:46, Robert Haas wrote:
> 
> Everybody seems happy with this fix for a first step, so I've
> committed it and back-patched it to 9.3.
> 

Thanks!

-- 
Julien Rouhaud
http://dalibo.com - http://dalibo.org