Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem

Поиск
Список
Период
Сортировка
От Masahiko Sawada
Тема Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem
Дата
Msg-id CAD21AoATWq-xCNFUh_R_Cza=SyjD=9zKWbbc7QCUEKjvu5b-2g@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem  (Claudio Freire <klaussfreire@gmail.com>)
Ответы Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem  (Claudio Freire <klaussfreire@gmail.com>)
Список pgsql-hackers
On Thu, Jan 26, 2017 at 5:11 AM, Claudio Freire <klaussfreire@gmail.com> wrote:
> On Wed, Jan 25, 2017 at 1:54 PM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>> Thank you for updating the patch!
>>
>> +       /*
>> +        * Quickly rule out by lower bound (should happen a lot) Upper bound was
>> +        * already checked by segment search
>> +        */
>> +       if (vac_cmp_itemptr((void *) itemptr, (void *) rseg->dead_tuples) < 0)
>> +               return false;
>>
>> I think that if the above result is 0, we can return true as itemptr
>> matched lower bound item pointer in rseg->dead_tuples.
>
> That's right. Possibly not a great speedup but... why not?
>
>>
>>  +typedef struct DeadTuplesSegment
>> +{
>> +       int                     num_dead_tuples;        /* # of
>> entries in the segment */
>> +       int                     max_dead_tuples;        /* # of
>> entries allocated in the segment */
>> +       ItemPointerData last_dead_tuple;        /* Copy of the last
>> dead tuple (unset
>> +
>>           * until the segment is fully
>> +
>>           * populated) */
>> +       unsigned short padding;
>> +       ItemPointer dead_tuples;        /* Array of dead tuples */
>> +}      DeadTuplesSegment;
>> +
>> +typedef struct DeadTuplesMultiArray
>> +{
>> +       int                     num_entries;    /* current # of entries */
>> +       int                     max_entries;    /* total # of slots
>> that can be allocated in
>> +                                                                * array */
>> +       int                     num_segs;               /* number of
>> dead tuple segments allocated */
>> +       int                     last_seg;               /* last dead
>> tuple segment with data (or 0) */
>> +       DeadTuplesSegment *dead_tuples;         /* array of num_segs segments */
>> +}      DeadTuplesMultiArray;
>>
>> It's a matter of personal preference but some same dead_tuples
>> variables having different meaning confused me.
>> If we want to access first dead tuple location of first segment, we
>> need to do 'vacrelstats->dead_tuples.dead_tuples.dead_tuples'. For
>> example, 'vacrelstats->dead_tuples.dt_segment.dt_array' is better to
>> me.
>
> Yes, I can see how that could be confusing.
>
> I went for vacrelstats->dead_tuples.dt_segments[i].dt_tids[j]

Thank you for updating.
Looks good to me.

>> +                                       nseg->num_dead_tuples = 0;
>> +                                       nseg->max_dead_tuples = 0;
>> +                                       nseg->dead_tuples = NULL;
>> +                                       vacrelstats->dead_tuples.num_segs++;
>> +                               }
>> +                               seg = DeadTuplesCurrentSegment(vacrelstats);
>> +                       }
>> +                       vacrelstats->dead_tuples.last_seg++;
>> +                       seg = DeadTuplesCurrentSegment(vacrelstats);
>>
>> Because seg is always set later I think the first line starting with
>> "seg = ..." is not necessary. Thought?
>
> That's correct.
>
> Attached a v6 with those changes (and rebased).
>
> Make check still passes.

Here is review comment of v6 patch.

----* We are willing to use at most maintenance_work_mem (or perhaps* autovacuum_work_mem) memory space to keep track
ofdead tuples.  We* initially allocate an array of TIDs of that size, with an upper limit that* depends on table size
(thislimit ensures we don't allocate a huge area* uselessly for vacuuming small tables).  If the array threatens to
overflow,

I think that we need to update the above paragraph comment at top of
vacuumlazy.c file.

----
+                               numtuples = Max(numtuples,
MaxHeapTuplesPerPage);
+                               numtuples = Min(numtuples, INT_MAX / 2);
+                               numtuples = Min(numtuples, 2 *
pseg->max_dead_tuples);
+                               numtuples = Min(numtuples,
MaxAllocSize / sizeof(ItemPointerData));
+                               seg->dt_tids = (ItemPointer)
palloc(sizeof(ItemPointerData) * numtuples);

Why numtuples is limited to "INT_MAX / 2" but not INT_MAX?

----
@@ -1376,35 +1411,43 @@ lazy_vacuum_heap(Relation onerel, LVRelStats
*vacrelstats)       pg_rusage_init(&ru0);       npages = 0;

-       tupindex = 0;
-       while (tupindex < vacrelstats->num_dead_tuples)
+       segindex = 0;
+       tottuples = 0;
+       for (segindex = tupindex = 0; segindex <=
vacrelstats->dead_tuples.last_seg; tupindex = 0, segindex++)       {
-               BlockNumber tblk;
-               Buffer          buf;
-               Page            page;
-               Size            freespace;

This is a minute thing but tupindex can be define inside of for loop.

----
@@ -1129,10 +1159,13 @@ lazy_scan_heap(Relation onerel, int options,
LVRelStats *vacrelstats,         * instead of doing a second scan.         */        if (nindexes == 0 &&
-            vacrelstats->num_dead_tuples > 0)
+            vacrelstats->dead_tuples.num_entries > 0)        {            /* Remove tuples from heap */
-            lazy_vacuum_page(onerel, blkno, buf, 0, vacrelstats, &vmbuffer);
+            Assert(vacrelstats->dead_tuples.last_seg == 0);        /*
Should not need more
+                                                                 *
than one segment per
+                                                                 * page */

I'm not sure we need to add Assert() here but it seems to me that the
comment and code is not properly correspond and the comment for
Assert() should be wrote above of Assert() line.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Masahiko Sawada
Дата:
Сообщение: Re: [HACKERS] Transactions involving multiple postgres foreign servers
Следующее
От: Kyotaro HORIGUCHI
Дата:
Сообщение: Re: [HACKERS] [PATCH]: fix bug in SP-GiST box_ops