Обсуждение: GIN pending list clean up exposure to SQL

Поиск
Список
Период
Сортировка

GIN pending list clean up exposure to SQL

От
Jeff Janes
Дата:
I've written a function which allows users to clean up the pending list.  It takes the index name and returns the number of pending list pages deleted.

# select * from gin_clean_pending_list('foo_text_array_idx');
 gin_clean_pending_list
------------------------
                    278
(1 row)

Time: 31994.880 ms


This is needed because there needs to be a way to offload this duty from the user backends, and the only other way to intentionaly clean up the list is by vacuum (and the rest of a vacuum can take days to run on a large table).  Autoanalyze will also do it, but it hard to arrange for those to occur at need, and unless you drop default_statistics_target very low they can also take a long time.  And if you do lower the target, it screws up your statistics, of course.

I've currently crammed it into pageinspect, simply because that is where I found the existing code which I used as an exemplar for writing this code.

But where does this belong?  Core?  Its own separate extension?

Cheers,

Jeff
Вложения

Re: GIN pending list clean up exposure to SQL

От
Oleg Bartunov
Дата:


On Thu, Aug 13, 2015 at 2:19 AM, Jeff Janes <jeff.janes@gmail.com> wrote:
I've written a function which allows users to clean up the pending list.  It takes the index name and returns the number of pending list pages deleted.

# select * from gin_clean_pending_list('foo_text_array_idx');
 gin_clean_pending_list
------------------------
                    278
(1 row)

Time: 31994.880 ms


This is needed because there needs to be a way to offload this duty from the user backends, and the only other way to intentionaly clean up the list is by vacuum (and the rest of a vacuum can take days to run on a large table).  Autoanalyze will also do it, but it hard to arrange for those to occur at need, and unless you drop default_statistics_target very low they can also take a long time.  And if you do lower the target, it screws up your statistics, of course.

I've currently crammed it into pageinspect, simply because that is where I found the existing code which I used as an exemplar for writing this code.

But where does this belong?  Core?  Its own separate extension?

For a long time we have gevel extension (http://www.sigaev.ru/git/gitweb.cgi?p=gevel.git;a=summary) , which was originally developed to help us debugging indexes, but it's also contains user's functions.  Your function could be there, but I have no idea about gevel itself. Probably, it should be restructurized and come to contrib. Do you have time to look into ?

 

Cheers,

Jeff


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: GIN pending list clean up exposure to SQL

От
Alvaro Herrera
Дата:
Jeff Janes wrote:
> I've written a function which allows users to clean up the pending list.
> It takes the index name and returns the number of pending list pages
> deleted.

I just noticed that your patch uses AccessShareLock on the index.  Is
that okay?  I would have assumed that you'd need ShareUpdateExclusive
(same as vacuum uses), but I don't really know.  Was that a carefully
thought-out choice?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: GIN pending list clean up exposure to SQL

От
Alvaro Herrera
Дата:
Alvaro Herrera wrote:
> Jeff Janes wrote:
> > I've written a function which allows users to clean up the pending list.
> > It takes the index name and returns the number of pending list pages
> > deleted.
> 
> I just noticed that your patch uses AccessShareLock on the index.  Is
> that okay?  I would have assumed that you'd need ShareUpdateExclusive
> (same as vacuum uses), but I don't really know.  Was that a carefully
> thought-out choice?

After reading gitPendingCleanup it becomes clear that there's no need
for a stronger lock than what you've chosen.  Jaime Casanova just
pointed this out to me.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: GIN pending list clean up exposure to SQL

От
Jaime Casanova
Дата:
On 12 August 2015 at 20:19, Jeff Janes <jeff.janes@gmail.com> wrote:
>
> But where does this belong?  Core?  Its own separate extension?
>

I will say core. Gin indexes are in core and i don't see why this
function shouldn't.
FWIW, brin indexes has a similar function brin_summarize_new_values() in core

--
Jaime Casanova                      www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación



Re: GIN pending list clean up exposure to SQL

От
Jaime Casanova
Дата:
On 19 November 2015 at 14:18, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> Alvaro Herrera wrote:
>> Jeff Janes wrote:
>> > I've written a function which allows users to clean up the pending list.
>> > It takes the index name and returns the number of pending list pages
>> > deleted.
>>
>> I just noticed that your patch uses AccessShareLock on the index.  Is
>> that okay?  I would have assumed that you'd need ShareUpdateExclusive
>> (same as vacuum uses), but I don't really know.  Was that a carefully
>> thought-out choice?
>
> After reading gitPendingCleanup it becomes clear that there's no need
> for a stronger lock than what you've chosen.  Jaime Casanova just
> pointed this out to me.
>

But it should do some checks, no?
- only superusers?
- what i received as parameter is a GIN index?

--
Jaime Casanova                      www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación



Re: GIN pending list clean up exposure to SQL

От
Jaime Casanova
Дата:
On 19 November 2015 at 14:47, Jaime Casanova
<jaime.casanova@2ndquadrant.com> wrote:
> On 19 November 2015 at 14:18, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>> Alvaro Herrera wrote:
>>> Jeff Janes wrote:
>>> > I've written a function which allows users to clean up the pending list.
>>> > It takes the index name and returns the number of pending list pages
>>> > deleted.
>>>
>>> I just noticed that your patch uses AccessShareLock on the index.  Is
>>> that okay?  I would have assumed that you'd need ShareUpdateExclusive
>>> (same as vacuum uses), but I don't really know.  Was that a carefully
>>> thought-out choice?
>>
>> After reading gitPendingCleanup it becomes clear that there's no need
>> for a stronger lock than what you've chosen.  Jaime Casanova just
>> pointed this out to me.
>>
>
> But it should do some checks, no?
> - only superusers?
> - what i received as parameter is a GIN index?
>

I just notice this:

+       ginInsertCleanup(&ginstate, true, &stats);

ginInsertCleanup() now has four parameters, so you should update the call

--
Jaime Casanova                      www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación



Re: GIN pending list clean up exposure to SQL

От
Jaime Casanova
Дата:
On 19 November 2015 at 14:57, Jaime Casanova
<jaime.casanova@2ndquadrant.com> wrote:
> On 19 November 2015 at 14:47, Jaime Casanova
> <jaime.casanova@2ndquadrant.com> wrote:
>> On 19 November 2015 at 14:18, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>>> Alvaro Herrera wrote:
>>>> Jeff Janes wrote:
>>>> > I've written a function which allows users to clean up the pending list.
>>>> > It takes the index name and returns the number of pending list pages
>>>> > deleted.
>>>>
>>>> I just noticed that your patch uses AccessShareLock on the index.  Is
>>>> that okay?  I would have assumed that you'd need ShareUpdateExclusive
>>>> (same as vacuum uses), but I don't really know.  Was that a carefully
>>>> thought-out choice?
>>>
>>> After reading gitPendingCleanup it becomes clear that there's no need
>>> for a stronger lock than what you've chosen.  Jaime Casanova just
>>> pointed this out to me.
>>>
>>
>> But it should do some checks, no?
>> - only superusers?
>> - what i received as parameter is a GIN index?
>>
>
> I just notice this:
>
> +       ginInsertCleanup(&ginstate, true, &stats);
>
> ginInsertCleanup() now has four parameters, so you should update the call
>

Btw, this is not in the commitfest and seems like a useful thing to have

--
Jaime Casanova                      www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación



Re: GIN pending list clean up exposure to SQL

От
Jim Nasby
Дата:
On 11/19/15 10:47 AM, Jaime Casanova wrote:
> - only superusers?

I would think the owner of the table (index?) should also be able to run 
this.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: GIN pending list clean up exposure to SQL

От
Jaime Casanova
Дата:
On 21 November 2015 at 03:54, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
> On 11/19/15 10:47 AM, Jaime Casanova wrote:
>>
>> - only superusers?
>
>
> I would think the owner of the table (index?) should also be able to run
> this.

agreed, that makes sense

--
Jaime Casanova                      www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación



Re: GIN pending list clean up exposure to SQL

От
Fujii Masao
Дата:
On Sun, Nov 22, 2015 at 1:15 PM, Jaime Casanova
<jaime.casanova@2ndquadrant.com> wrote:
> On 21 November 2015 at 03:54, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
>> On 11/19/15 10:47 AM, Jaime Casanova wrote:
>>>
>>> - only superusers?
>>
>>
>> I would think the owner of the table (index?) should also be able to run
>> this.
>
> agreed, that makes sense

Also the function should ensure that the server is running in
normal mode (not recovery mode) and the specified relation is
NOT other session's temporary table.

I added the similar function into pg_bigm extension
(pg_gin_pending_cleanup function in
https://osdn.jp/projects/pgbigm/scm/git/pg_bigm/blobs/master/bigm_gin.c).
It might help us improve the patch.

Regards,

-- 
Fujii Masao



Re: GIN pending list clean up exposure to SQL

От
Jeff Janes
Дата:
On Thu, Nov 19, 2015 at 8:37 AM, Jaime Casanova
<jaime.casanova@2ndquadrant.com> wrote:
> On 12 August 2015 at 20:19, Jeff Janes <jeff.janes@gmail.com> wrote:
>>
>> But where does this belong?  Core?  Its own separate extension?
>>
>
> I will say core. Gin indexes are in core and i don't see why this
> function shouldn't.
> FWIW, brin indexes has a similar function brin_summarize_new_values() in core


I was holding off on doing more on this until after the bug
http://www.postgresql.org/message-id/CAMkU=1xALfLhUUohFP5v33RzedLVb5aknNUjcEuM9KNBKrB6-Q@mail.gmail.com
was resolved, as I think it might change the way things work enough to
make a difference, at least to the documentation if not the
functioning.

I will look at brin_summarize_new_values for guidance.  But now we
have one vote for core and one for 'gevel' extension, so I'm not sure
where to go.  The nice thing about core is that is always there (in
the future) and maintained by a group of people, and as you point out
there is precedence with the BRIN index.  But the nice thing with an
extension is that it easier to adapt for existing installations, so
you don't have to wait for 9.6 to get access to it.

I'll prepare a patch for core for the January commitfest, and see if
it flies.  If not, there is always the extension to fall back to.

Cheers,

Jeff



Re: GIN pending list clean up exposure to SQL

От
Jeff Janes
Дата:
On Wed, Nov 25, 2015 at 9:29 AM, Jeff Janes <jeff.janes@gmail.com> wrote:
>
> I'll prepare a patch for core for the January commitfest, and see if
> it flies.  If not, there is always the extension to fall back to.

Here is an updated patch.  I've added type and permission checks,
docs, and some regression tests.

Cheers,

Jeff

Вложения

Re: GIN pending list clean up exposure to SQL

От
Julien Rouhaud
Дата:
On 29/12/2015 00:30, Jeff Janes wrote:
> On Wed, Nov 25, 2015 at 9:29 AM, Jeff Janes <jeff.janes@gmail.com> wrote:
>>
>> I'll prepare a patch for core for the January commitfest, and see if
>> it flies.  If not, there is always the extension to fall back to.
> 
> Here is an updated patch.  I've added type and permission checks,
> docs, and some regression tests.
> 

I just reviewed it. Patch applies cleanly, everything works as intended,
including regression tests.

I think the function should be declared as strict.

Also, there are some trailing whitespaces in the documentation diff.

Regards.

> Cheers,
> 
> Jeff
> 
> 
> 
> 


-- 
Julien Rouhaud
http://dalibo.com - http://dalibo.org



Re: GIN pending list clean up exposure to SQL

От
Jeff Janes
Дата:
On Sun, Jan 10, 2016 at 4:24 AM, Julien Rouhaud
<julien.rouhaud@dalibo.com> wrote:
> On 29/12/2015 00:30, Jeff Janes wrote:
>> On Wed, Nov 25, 2015 at 9:29 AM, Jeff Janes <jeff.janes@gmail.com> wrote:
>>>
>>> I'll prepare a patch for core for the January commitfest, and see if
>>> it flies.  If not, there is always the extension to fall back to.
>>
>> Here is an updated patch.  I've added type and permission checks,
>> docs, and some regression tests.
>>
>
> I just reviewed it. Patch applies cleanly, everything works as intended,
> including regression tests.
>
> I think the function should be declared as strict.

OK.  I see that brin_summarize_new_values, which I modeled this on,
was recently changed to be strict.  So I've changed this the same way.

>
> Also, there are some trailing whitespaces in the documentation diff.

Fixed.  I also added the DESC to the pg_proc entry, which I somehow
missed before.

Thanks,

Jeff

Вложения

Re: GIN pending list clean up exposure to SQL

От
Julien Rouhaud
Дата:
On 15/01/2016 22:59, Jeff Janes wrote:
> On Sun, Jan 10, 2016 at 4:24 AM, Julien Rouhaud
> <julien.rouhaud@dalibo.com> wrote:
>> On 29/12/2015 00:30, Jeff Janes wrote:
>>> On Wed, Nov 25, 2015 at 9:29 AM, Jeff Janes <jeff.janes@gmail.com> wrote:
>>>>
>>>> I'll prepare a patch for core for the January commitfest, and see if
>>>> it flies.  If not, there is always the extension to fall back to.
>>>
>>> Here is an updated patch.  I've added type and permission checks,
>>> docs, and some regression tests.
>>>
>>
>> I just reviewed it. Patch applies cleanly, everything works as intended,
>> including regression tests.
>>
>> I think the function should be declared as strict.
> 
> OK.  I see that brin_summarize_new_values, which I modeled this on,
> was recently changed to be strict.  So I've changed this the same way.
> 
>>
>> Also, there are some trailing whitespaces in the documentation diff.
> 
> Fixed.

Thanks!

>  I also added the DESC to the pg_proc entry, which I somehow
> missed before.
> 

Good catch, I also missed it.

All looks fine to me, I flag it as ready for committer.


-- 
Julien Rouhaud
http://dalibo.com - http://dalibo.org



Re: GIN pending list clean up exposure to SQL

От
Fujii Masao
Дата:
On Sat, Jan 16, 2016 at 7:42 AM, Julien Rouhaud
<julien.rouhaud@dalibo.com> wrote:
> On 15/01/2016 22:59, Jeff Janes wrote:
>> On Sun, Jan 10, 2016 at 4:24 AM, Julien Rouhaud
>> <julien.rouhaud@dalibo.com> wrote:
>>> On 29/12/2015 00:30, Jeff Janes wrote:
>>>> On Wed, Nov 25, 2015 at 9:29 AM, Jeff Janes <jeff.janes@gmail.com> wrote:
>>>>>
>>>>> I'll prepare a patch for core for the January commitfest, and see if
>>>>> it flies.  If not, there is always the extension to fall back to.
>>>>
>>>> Here is an updated patch.  I've added type and permission checks,
>>>> docs, and some regression tests.
>>>>
>>>
>>> I just reviewed it. Patch applies cleanly, everything works as intended,
>>> including regression tests.
>>>
>>> I think the function should be declared as strict.
>>
>> OK.  I see that brin_summarize_new_values, which I modeled this on,
>> was recently changed to be strict.  So I've changed this the same way.
>>
>>>
>>> Also, there are some trailing whitespaces in the documentation diff.
>>
>> Fixed.
>
> Thanks!
>
>>  I also added the DESC to the pg_proc entry, which I somehow
>> missed before.
>>
>
> Good catch, I also missed it.
>
> All looks fine to me, I flag it as ready for committer.

When I compiled the PostgreSQL with the patch, I got the following error.
ISTM that the inclusion of pg_am.h header file is missing in ginfast.c.

ginfast.c: In function 'gin_clean_pending_list':
ginfast.c:980: error: 'GIN_AM_OID' undeclared (first use in this function)
ginfast.c:980: error: (Each undeclared identifier is reported only once
ginfast.c:980: error: for each function it appears in.)

gin_clean_pending_list() must check whether the server is in recovery or not.
If it's in recovery, the function must exit with an error. That is, IMO,
something like the following check must be added.
        if (RecoveryInProgress())                ereport(ERROR,

(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),                                 errmsg("recovery is in progress"),
                              errhint("GIN pending list cannot be
 
cleaned up during recovery.")));

It's better to make gin_clean_pending_list() check whether the target index
is temporary index of other sessions or not, like pgstatginindex() does.

+    Relation    indexRel = index_open(indexoid, AccessShareLock);

ISTM that AccessShareLock is not safe when updating the pending list and
GIN index main structure. Probaby we should use RowExclusiveLock?

Regards,

-- 
Fujii Masao



Re: GIN pending list clean up exposure to SQL

От
Julien Rouhaud
Дата:
On 20/01/2016 15:17, Fujii Masao wrote:
> 
> When I compiled the PostgreSQL with the patch, I got the following error.
> ISTM that the inclusion of pg_am.h header file is missing in ginfast.c.
> 
> ginfast.c: In function 'gin_clean_pending_list':
> ginfast.c:980: error: 'GIN_AM_OID' undeclared (first use in this function)
> ginfast.c:980: error: (Each undeclared identifier is reported only once
> ginfast.c:980: error: for each function it appears in.)
> 
> gin_clean_pending_list() must check whether the server is in recovery or not.
> If it's in recovery, the function must exit with an error. That is, IMO,
> something like the following check must be added.
> 
>          if (RecoveryInProgress())
>                  ereport(ERROR,
> 
> (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
>                                   errmsg("recovery is in progress"),
>                                   errhint("GIN pending list cannot be
> cleaned up during recovery.")));
> 
> It's better to make gin_clean_pending_list() check whether the target index
> is temporary index of other sessions or not, like pgstatginindex() does.
> 
> +    Relation    indexRel = index_open(indexoid, AccessShareLock);
> 
> ISTM that AccessShareLock is not safe when updating the pending list and
> GIN index main structure. Probaby we should use RowExclusiveLock?
> 

This lock should be safe, as pointed by Alvaro and Jaime earlier in this
thread
(http://www.postgresql.org/message-id/20151119161846.GK614468@alvherre.pgsql),
as ginInsertCleanup() can be called concurrently.

Also, since 38710a374ea the ginInsertCleanup() call must be fixed:

-   ginInsertCleanup(&ginstate, false, true, &stats);
+   ginInsertCleanup(&ginstate, true, &stats);


Regards.

-- 
Julien Rouhaud
http://dalibo.com - http://dalibo.org



Re: GIN pending list clean up exposure to SQL

От
Jeff Janes
Дата:
On Wed, Jan 20, 2016 at 6:17 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Sat, Jan 16, 2016 at 7:42 AM, Julien Rouhaud
> <julien.rouhaud@dalibo.com> wrote:
>> On 15/01/2016 22:59, Jeff Janes wrote:
>>> On Sun, Jan 10, 2016 at 4:24 AM, Julien Rouhaud
>>> <julien.rouhaud@dalibo.com> wrote:
>>
>> All looks fine to me, I flag it as ready for committer.
>
> When I compiled the PostgreSQL with the patch, I got the following error.
> ISTM that the inclusion of pg_am.h header file is missing in ginfast.c.

Thanks.  Fixed.

> gin_clean_pending_list() must check whether the server is in recovery or not.
> If it's in recovery, the function must exit with an error. That is, IMO,
> something like the following check must be added.
>
>          if (RecoveryInProgress())
>                  ereport(ERROR,
>
> (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
>                                   errmsg("recovery is in progress"),
>                                   errhint("GIN pending list cannot be
> cleaned up during recovery.")));
>
> It's better to make gin_clean_pending_list() check whether the target index
> is temporary index of other sessions or not, like pgstatginindex() does.

I've added both of these checks.  Sorry I overlooked your early email
in this thread about those.

>
> +    Relation    indexRel = index_open(indexoid, AccessShareLock);
>
> ISTM that AccessShareLock is not safe when updating the pending list and
> GIN index main structure. Probaby we should use RowExclusiveLock?

Other callers of the ginInsertCleanup function also do so while
holding only the AccessShareLock on the index.  It turns out that
there is a bug around this, as discussed in "Potential GIN vacuum bug"
(http://www.postgresql.org/message-id/flat/CAMkU=1xALfLhUUohFP5v33RzedLVb5aknNUjcEuM9KNBKrB6-Q@mail.gmail.com)

But, that bug has to be solved at a deeper level than this patch.

I've also cleaned up some other conflicts, and chose a more suitable
OID for the new catalog function.

The number of new header includes needed to implement this makes me
wonder if I put this code in the correct file, but I don't see a
better location for it.

New version attached.

Thanks,

Jeff

Вложения

Re: GIN pending list clean up exposure to SQL

От
Fujii Masao
Дата:
On Mon, Jan 25, 2016 at 3:54 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
> On Wed, Jan 20, 2016 at 6:17 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
>> On Sat, Jan 16, 2016 at 7:42 AM, Julien Rouhaud
>> <julien.rouhaud@dalibo.com> wrote:
>>> On 15/01/2016 22:59, Jeff Janes wrote:
>>>> On Sun, Jan 10, 2016 at 4:24 AM, Julien Rouhaud
>>>> <julien.rouhaud@dalibo.com> wrote:
>>>
>>> All looks fine to me, I flag it as ready for committer.
>>
>> When I compiled the PostgreSQL with the patch, I got the following error.
>> ISTM that the inclusion of pg_am.h header file is missing in ginfast.c.
>
> Thanks.  Fixed.
>
>> gin_clean_pending_list() must check whether the server is in recovery or not.
>> If it's in recovery, the function must exit with an error. That is, IMO,
>> something like the following check must be added.
>>
>>          if (RecoveryInProgress())
>>                  ereport(ERROR,
>>
>> (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
>>                                   errmsg("recovery is in progress"),
>>                                   errhint("GIN pending list cannot be
>> cleaned up during recovery.")));
>>
>> It's better to make gin_clean_pending_list() check whether the target index
>> is temporary index of other sessions or not, like pgstatginindex() does.
>
> I've added both of these checks.  Sorry I overlooked your early email
> in this thread about those.
>
>>
>> +    Relation    indexRel = index_open(indexoid, AccessShareLock);
>>
>> ISTM that AccessShareLock is not safe when updating the pending list and
>> GIN index main structure. Probaby we should use RowExclusiveLock?
>
> Other callers of the ginInsertCleanup function also do so while
> holding only the AccessShareLock on the index.  It turns out that
> there is a bug around this, as discussed in "Potential GIN vacuum bug"
> (http://www.postgresql.org/message-id/flat/CAMkU=1xALfLhUUohFP5v33RzedLVb5aknNUjcEuM9KNBKrB6-Q@mail.gmail.com)
>
> But, that bug has to be solved at a deeper level than this patch.
>
> I've also cleaned up some other conflicts, and chose a more suitable
> OID for the new catalog function.
>
> The number of new header includes needed to implement this makes me
> wonder if I put this code in the correct file, but I don't see a
> better location for it.
>
> New version attached.

Thanks for updating the patch! It looks good to me.

Based on your patch, I just improved the doc. For example,
I added the following note into the doc.

+    These functions cannot be executed during recovery.
+    Use of these functions is restricted to superusers and the owner
+    of the given index.

If there is no problem, I'm thinking to commit this version.

Regards,

--
Fujii Masao

Вложения

Re: GIN pending list clean up exposure to SQL

От
Julien Rouhaud
Дата:
On 27/01/2016 10:27, Fujii Masao wrote:
> On Mon, Jan 25, 2016 at 3:54 PM, Jeff Janes <jeff.janes@gmail.com>
> wrote:
>> On Wed, Jan 20, 2016 at 6:17 AM, Fujii Masao
>> <masao.fujii@gmail.com> wrote:
>>> On Sat, Jan 16, 2016 at 7:42 AM, Julien Rouhaud 
>>> <julien.rouhaud@dalibo.com> wrote:
>>>> On 15/01/2016 22:59, Jeff Janes wrote:
>>>>> On Sun, Jan 10, 2016 at 4:24 AM, Julien Rouhaud 
>>>>> <julien.rouhaud@dalibo.com> wrote:
>>>> 
>>>> All looks fine to me, I flag it as ready for committer.
>>> 
>>> When I compiled the PostgreSQL with the patch, I got the
>>> following error. ISTM that the inclusion of pg_am.h header file
>>> is missing in ginfast.c.
>> 
>> Thanks.  Fixed.
>> 
>>> gin_clean_pending_list() must check whether the server is in
>>> recovery or not. If it's in recovery, the function must exit
>>> with an error. That is, IMO, something like the following check
>>> must be added.
>>> 
>>> if (RecoveryInProgress()) ereport(ERROR,
>>> 
>>> (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), 
>>> errmsg("recovery is in progress"), errhint("GIN pending list
>>> cannot be cleaned up during recovery.")));
>>> 
>>> It's better to make gin_clean_pending_list() check whether the
>>> target index is temporary index of other sessions or not, like
>>> pgstatginindex() does.
>> 
>> I've added both of these checks.  Sorry I overlooked your early
>> email in this thread about those.
>> 
>>> 
>>> +    Relation    indexRel = index_open(indexoid,
>>> AccessShareLock);
>>> 
>>> ISTM that AccessShareLock is not safe when updating the pending
>>> list and GIN index main structure. Probaby we should use
>>> RowExclusiveLock?
>> 
>> Other callers of the ginInsertCleanup function also do so while 
>> holding only the AccessShareLock on the index.  It turns out
>> that there is a bug around this, as discussed in "Potential GIN
>> vacuum bug" 
>> (http://www.postgresql.org/message-id/flat/CAMkU=1xALfLhUUohFP5v33RzedLVb5aknNUjcEuM9KNBKrB6-Q@mail.gmail.com)
>>
>>
>> 
But, that bug has to be solved at a deeper level than this patch.
>> 
>> I've also cleaned up some other conflicts, and chose a more
>> suitable OID for the new catalog function.
>> 
>> The number of new header includes needed to implement this makes
>> me wonder if I put this code in the correct file, but I don't see
>> a better location for it.
>> 
>> New version attached.
> 
> Thanks for updating the patch! It looks good to me.
> 
> Based on your patch, I just improved the doc. For example, I added
> the following note into the doc.
> 
> +    These functions cannot be executed during recovery. +    Use
> of these functions is restricted to superusers and the owner +
> of the given index.
> 
> If there is no problem, I'm thinking to commit this version.
> 

Just a detail:

+    Note that the cleanup does not happen and the return value is 0
+    if the argument is the GIN index built with <literal>fastupdate</>
+    option disabled because it doesn't have a pending list.

It should be "if the argument is *a* GIN index"

I find this sentence a little confusing, maybe rephrase like this would
be better:

-    Note that the cleanup does not happen and the return value is 0
-    if the argument is the GIN index built with <literal>fastupdate</>
-    option disabled because it doesn't have a pending list.
+    Note that if the argument is a GIN index built with
<literal>fastupdate</>
+    option disabled, the cleanup does not happen and the return value
is 0
+    because the index doesn't have a pending list.

Otherwise, I don't see any problem on this version.

Regards.

> Regards,
> 
> 
> 
> 


-- 
Julien Rouhaud
http://dalibo.com - http://dalibo.org



Re: GIN pending list clean up exposure to SQL

От
Fujii Masao
Дата:
On Thu, Jan 28, 2016 at 5:54 AM, Julien Rouhaud
<julien.rouhaud@dalibo.com> wrote:
> On 27/01/2016 10:27, Fujii Masao wrote:
>> On Mon, Jan 25, 2016 at 3:54 PM, Jeff Janes <jeff.janes@gmail.com>
>> wrote:
>>> On Wed, Jan 20, 2016 at 6:17 AM, Fujii Masao
>>> <masao.fujii@gmail.com> wrote:
>>>> On Sat, Jan 16, 2016 at 7:42 AM, Julien Rouhaud
>>>> <julien.rouhaud@dalibo.com> wrote:
>>>>> On 15/01/2016 22:59, Jeff Janes wrote:
>>>>>> On Sun, Jan 10, 2016 at 4:24 AM, Julien Rouhaud
>>>>>> <julien.rouhaud@dalibo.com> wrote:
>>>>>
>>>>> All looks fine to me, I flag it as ready for committer.
>>>>
>>>> When I compiled the PostgreSQL with the patch, I got the
>>>> following error. ISTM that the inclusion of pg_am.h header file
>>>> is missing in ginfast.c.
>>>
>>> Thanks.  Fixed.
>>>
>>>> gin_clean_pending_list() must check whether the server is in
>>>> recovery or not. If it's in recovery, the function must exit
>>>> with an error. That is, IMO, something like the following check
>>>> must be added.
>>>>
>>>> if (RecoveryInProgress()) ereport(ERROR,
>>>>
>>>> (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
>>>> errmsg("recovery is in progress"), errhint("GIN pending list
>>>> cannot be cleaned up during recovery.")));
>>>>
>>>> It's better to make gin_clean_pending_list() check whether the
>>>> target index is temporary index of other sessions or not, like
>>>> pgstatginindex() does.
>>>
>>> I've added both of these checks.  Sorry I overlooked your early
>>> email in this thread about those.
>>>
>>>>
>>>> +    Relation    indexRel = index_open(indexoid,
>>>> AccessShareLock);
>>>>
>>>> ISTM that AccessShareLock is not safe when updating the pending
>>>> list and GIN index main structure. Probaby we should use
>>>> RowExclusiveLock?
>>>
>>> Other callers of the ginInsertCleanup function also do so while
>>> holding only the AccessShareLock on the index.  It turns out
>>> that there is a bug around this, as discussed in "Potential GIN
>>> vacuum bug"
>>> (http://www.postgresql.org/message-id/flat/CAMkU=1xALfLhUUohFP5v33RzedLVb5aknNUjcEuM9KNBKrB6-Q@mail.gmail.com)
>>>
>>>
>>>
> But, that bug has to be solved at a deeper level than this patch.
>>>
>>> I've also cleaned up some other conflicts, and chose a more
>>> suitable OID for the new catalog function.
>>>
>>> The number of new header includes needed to implement this makes
>>> me wonder if I put this code in the correct file, but I don't see
>>> a better location for it.
>>>
>>> New version attached.
>>
>> Thanks for updating the patch! It looks good to me.
>>
>> Based on your patch, I just improved the doc. For example, I added
>> the following note into the doc.
>>
>> +    These functions cannot be executed during recovery. +    Use
>> of these functions is restricted to superusers and the owner +
>> of the given index.
>>
>> If there is no problem, I'm thinking to commit this version.
>>
>
> Just a detail:
>
> +    Note that the cleanup does not happen and the return value is 0
> +    if the argument is the GIN index built with <literal>fastupdate</>
> +    option disabled because it doesn't have a pending list.
>
> It should be "if the argument is *a* GIN index"
>
> I find this sentence a little confusing, maybe rephrase like this would
> be better:
>
> -    Note that the cleanup does not happen and the return value is 0
> -    if the argument is the GIN index built with <literal>fastupdate</>
> -    option disabled because it doesn't have a pending list.
> +    Note that if the argument is a GIN index built with
> <literal>fastupdate</>
> +    option disabled, the cleanup does not happen and the return value
> is 0
> +    because the index doesn't have a pending list.
>
> Otherwise, I don't see any problem on this version.

Thanks for the review! I adopted the description you suggested.
Just pushed the patch.

Regards,

-- 
Fujii Masao



Re: GIN pending list clean up exposure to SQL

От
Jeff Janes
Дата:
On Wed, Jan 27, 2016 at 12:54 PM, Julien Rouhaud
<julien.rouhaud@dalibo.com> wrote:
> On 27/01/2016 10:27, Fujii Masao wrote:
>>
>> Thanks for updating the patch! It looks good to me.
>>
>> Based on your patch, I just improved the doc. For example, I added
>> the following note into the doc.
>>
>> +    These functions cannot be executed during recovery. +    Use
>> of these functions is restricted to superusers and the owner +
>> of the given index.
>>
>> If there is no problem, I'm thinking to commit this version.
>>
>
> Just a detail:
>
> +    Note that the cleanup does not happen and the return value is 0
> +    if the argument is the GIN index built with <literal>fastupdate</>
> +    option disabled because it doesn't have a pending list.
>
> It should be "if the argument is *a* GIN index"
>
> I find this sentence a little confusing, maybe rephrase like this would
> be better:
>
> -    Note that the cleanup does not happen and the return value is 0
> -    if the argument is the GIN index built with <literal>fastupdate</>
> -    option disabled because it doesn't have a pending list.
> +    Note that if the argument is a GIN index built with
> <literal>fastupdate</>
> +    option disabled, the cleanup does not happen and the return value
> is 0
> +    because the index doesn't have a pending list.
>
> Otherwise, I don't see any problem on this version.

This is a corner case that probably does not need to be in the docs,
but I wanted to clarify it here in case you disagree: If the index
ever had fastupdate turned on, when it is turned off the index will
keep whatever pending_list it had until something cleans it up one
last time.

Thanks for the review and changes and commit.

Cheers,

Jeff



Re: GIN pending list clean up exposure to SQL

От
Fujii Masao
Дата:
On Mon, Feb 8, 2016 at 9:53 AM, Jeff Janes <jeff.janes@gmail.com> wrote:
> On Wed, Jan 27, 2016 at 12:54 PM, Julien Rouhaud
> <julien.rouhaud@dalibo.com> wrote:
>> On 27/01/2016 10:27, Fujii Masao wrote:
>>>
>>> Thanks for updating the patch! It looks good to me.
>>>
>>> Based on your patch, I just improved the doc. For example, I added
>>> the following note into the doc.
>>>
>>> +    These functions cannot be executed during recovery. +    Use
>>> of these functions is restricted to superusers and the owner +
>>> of the given index.
>>>
>>> If there is no problem, I'm thinking to commit this version.
>>>
>>
>> Just a detail:
>>
>> +    Note that the cleanup does not happen and the return value is 0
>> +    if the argument is the GIN index built with <literal>fastupdate</>
>> +    option disabled because it doesn't have a pending list.
>>
>> It should be "if the argument is *a* GIN index"
>>
>> I find this sentence a little confusing, maybe rephrase like this would
>> be better:
>>
>> -    Note that the cleanup does not happen and the return value is 0
>> -    if the argument is the GIN index built with <literal>fastupdate</>
>> -    option disabled because it doesn't have a pending list.
>> +    Note that if the argument is a GIN index built with
>> <literal>fastupdate</>
>> +    option disabled, the cleanup does not happen and the return value
>> is 0
>> +    because the index doesn't have a pending list.
>>
>> Otherwise, I don't see any problem on this version.
>
> This is a corner case that probably does not need to be in the docs,
> but I wanted to clarify it here in case you disagree: If the index
> ever had fastupdate turned on, when it is turned off the index will
> keep whatever pending_list it had until something cleans it up one
> last time.

I agree to add that note to the doc. Or we should remove the above
description that I added?

Regards,

-- 
Fujii Masao