Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)

Поиск
Список
Период
Сортировка
От Pavan Deolasee
Тема Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)
Дата
Msg-id CABOikdO26jksD0HwHCU3UXejjd6GO-ZF3qe+Aat+3y400v6LEQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Список pgsql-hackers


On Thu, Jan 26, 2017 at 2:38 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
Looking at your 0002 patch now.  It no longer applies, but the conflicts
are trivial to fix.  Please rebase and resubmit.


Thanks.
 

 Maybe I will spend some time trying to
convert it to Perl using PostgresNode.


Agree. I put together a test harness to hammer the WARM code as much as we can. This harness has already discovered some bugs, especially around index creation part. It also discovered one outstanding bug in master, so it's been useful. But I agree to rewrite it using perl.
 

I think having the "recheck" index methods create an ExecutorState looks
out of place.  How difficult is it to pass the estate from the calling
code?

I couldn't find an easy way given the place where recheck is required. Can you suggest something?
 

IMO heap_get_root_tuple_one should be called just heap_get_root_tuple().
That function and its plural sibling heap_get_root_tuples() should
indicate in their own comments what the expectations are regarding the
root_offsets output argument, rather than deferring to the comments in
the "internal" function, since they differ on that point; for the rest
of the invariants I think it makes sense to say "Also see the comment
for heap_get_root_tuples_internal".  I wonder if heap_get_root_tuple
should just return the ctid instead of assigning the value to a
passed-in pointer, i.e.
OffsetNumber
heap_get_root_tuple(Page page, OffsetNumber target_offnum)
{
        OffsetNumber    off;
        heap_get_root_tuples_internal(page, target_offnum, &off);
        return off;
}


Yes, all of that makes sense. Will fix.
 

The simple_heap_update + CatalogUpdateIndexes pattern is getting
obnoxious.  How about creating something like catalog_heap_update which
does both things at once, and stop bothering each callsite with the WARM
stuff?  In fact, given that CatalogUpdateIndexes is used in other
places, maybe we should leave its API alone and create another function,
so that we don't have to change the many places that only do
simple_heap_insert.  (Places like OperatorCreate which do either insert
or update could just move the index update call into each branch.)


What I ended up doing is I added two new APIs.
- CatalogUpdateHeapAndIndex
- CatalogInsertHeapAndIndex

I could replace almost all occurrences of simple_heap_update + CatalogUpdateIndexes with the first API and simple_heap_insert + CatalogUpdateIndexes with the second API. This looks like a good improvement to me anyways since there are about 180 places where these functions are called almost in the same pattern. May be it will also avoid a bug when someone forgets to update the index after inserting/updating heap.
 
.
I wonder if heap_hot_search_buffer() and heap_hot_search() should return
a tri-valued enum instead of boolean; that idea looks reasonable in
theory but callers have to do more work afterwards, so maybe not.


Ok. I'll try to rearrange it a bit. May be we just have one API after all? There are only a very few callers of these APIs.
 
Thanks,
Pavan


--
 Pavan Deolasee                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: [HACKERS] Improvements in psql hooks for variables
Следующее
От: Heikki Linnakangas
Дата:
Сообщение: Re: [HACKERS] Subtle bug in "Simplify tape block format" commit