Re: Freeze avoidance of very large table.

From: Amit Kapila
Subject: Re: Freeze avoidance of very large table.
Date:
Msg-id: CAA4eK1K219GWYHEEMKn0kYK0wmh_9jsDgrmhaQGwCffcts9Scg@mail.gmail.com
In response to: Re: Freeze avoidance of very large table.  (Sawada Masahiko <sawada.mshk@gmail.com>)
Responses: Re: Freeze avoidance of very large table.  (Amit Kapila <amit.kapila16@gmail.com>)
List: pgsql-hackers
On Thu, Jul 2, 2015 at 9:00 PM, Sawada Masahiko <sawada.mshk@gmail.com> wrote:
>
>
> Thank you for bug report, and comments.
>
> Fixed version is attached, and source code comment is also updated.
> Please review it.
>

I am looking into this patch and would like to share my findings with
you:

1.
@@ -2131,8 +2133,9 @@ heap_insert(Relation relation, HeapTuple tup, CommandId cid,

 CheckForSerializableConflictIn(relation, NULL, InvalidBuffer);

  /*
- * Find buffer to insert this tuple into.  If the page is all visible,
- * this will also pin the requisite visibility map page.
+ * Find buffer to insert this tuple into.  If the page is all visible
+ * of all frozen, this will also pin the requisite visibility map and
+ * frozen map page.
  */

Typo in the comment:

/of all frozen/or all frozen

2.
visibilitymap.c
+ * The visibility map is a bitmap with two bits (all-visible and all-frozen
+ * per heap page.

/and all-frozen/and all-frozen)
The closing round bracket is missing.

3.
visibilitymap.c
-/*#define TRACE_VISIBILITYMAP */
+#define TRACE_VISIBILITYMAP

Why has this #define been uncommented?

4.
-visibilitymap_count(Relation rel)
+visibilitymap_count(Relation rel, bool for_visible)

This API needs to count set bits for either the visibility info, the
frozen info, or both (if required), so it seems better to make the second
parameter a uint8 flags value rather than a bool. Also, if most call sites
need both the visibility and frozen bit counts, why not get them
in one call?
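
One possible shape, sketched with illustrative names (this out-parameter
signature is my suggestion, not code from the patch):

/*
 * Count the set bits in the visibility map in a single pass.  Either
 * out parameter may be NULL if the caller doesn't need that count.
 */
extern void visibilitymap_count(Relation rel,
                                BlockNumber *all_visible,
                                BlockNumber *all_frozen);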

5.
Clearing the visibility and frozen bits separately for DML
operations would lead to locking/unlocking the corresponding buffer
twice; can we do it as one operation?  I think this was suggested
by Simon as well.
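
A rough sketch of what I have in mind; the function name, flag values,
and the HEAPBLK_TO_* macros are illustrative, not taken from the patch:

/*
 * Clear any combination of bits for one heap block under a single
 * exclusive lock on the map buffer, instead of locking once per bit.
 */
static void
visibilitymap_clear_flags(Buffer vmBuf, BlockNumber heapBlk, uint8 flags)
{
    uint32      mapByte = HEAPBLK_TO_MAPBYTE(heapBlk);
    int         mapOffset = HEAPBLK_TO_OFFSET(heapBlk);
    char       *map = PageGetContents(BufferGetPage(vmBuf));

    LockBuffer(vmBuf, BUFFER_LOCK_EXCLUSIVE);
    map[mapByte] &= ~(flags << mapOffset);
    MarkBufferDirty(vmBuf);
    LockBuffer(vmBuf, BUFFER_LOCK_UNLOCK);
}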

6.
- * Before locking the buffer, pin the visibility map page if it appears to
- * be necessary.  Since we haven't got the lock yet, someone else might be
+ * Before locking the buffer, pin the visibility map if it appears to be
+ * necessary.  Since we haven't got the lock yet, someone else might be

Why have you deleted 'page' from the above comment?

7.
@@ -3490,21 +3532,23 @@ l2:
  UnlockTupleTuplock(relation, &(oldtup.t_self), *lockmode);

 if (vmbuffer != InvalidBuffer)
  ReleaseBuffer(vmbuffer);
+
  bms_free(hot_attrs);

This seems like an unnecessary change.

8.
@@ -1919,11 +1919,18 @@ index_update_stats(Relation rel,
  {
  BlockNumber relpages = RelationGetNumberOfBlocks(rel);
  BlockNumber relallvisible;
+ BlockNumber relallfrozen;

  if (rd_rel->relkind != RELKIND_INDEX)
- relallvisible = visibilitymap_count(rel);
+ {
+ relallvisible = visibilitymap_count(rel, true);
+ relallfrozen = visibilitymap_count(rel, false);
+ }
  else
  /* don't bother for indexes */
+ {
  relallvisible = 0;
+ relallfrozen = 0;
+ }

I think in this function, you have forgotten to update the
relallfrozen value in pg_class.
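
For reference, index_update_stats updates relallvisible as below;
presumably the new column needs the analogous treatment (a sketch,
assuming the patch's pg_class field is named relallfrozen):

    if (rd_rel->relallvisible != (int32) relallvisible)
    {
        rd_rel->relallvisible = (int32) relallvisible;
        dirty = true;
    }
    /* sketch of the missing analogue for the new column */
    if (rd_rel->relallfrozen != (int32) relallfrozen)
    {
        rd_rel->relallfrozen = (int32) relallfrozen;
        dirty = true;
    }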

9.
vacuumlazy.c

@@ -253,14 +258,16 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params,
  * NB: We need to check this before truncating the relation, because that
  * will change ->rel_pages.
  */
- if (vacrelstats->scanned_pages < vacrelstats->rel_pages)
+ if ((vacrelstats->scanned_pages + vacrelstats->vmskipped_pages)
+ < vacrelstats->rel_pages)
  {
- Assert(!scan_all);

Why have you removed this Assert?  Won't vacrelstats->scanned_pages +
vacrelstats->vmskipped_pages be equal to vacrelstats->rel_pages when
scan_all = true?
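
If that equality does hold, the Assert could simply stay in the new
branch; a sketch of what I would expect:

    if ((vacrelstats->scanned_pages + vacrelstats->vmskipped_pages)
        < vacrelstats->rel_pages)
    {
        /*
         * With scan_all, every page is either scanned or skipped as
         * all-frozen, so this branch should be unreachable.
         */
        Assert(!scan_all);
        /* ... existing "can't advance relfrozenxid" handling ... */
    }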

10.
vacuumlazy.c
lazy_vacuum_rel()
..
+ scanned_all |= scan_all;
+

Why is this new assignment added?  Please add a comment to
explain it.

11.
lazy_scan_heap()
..
+ * Also, skipping even a single page accorind to all-visible bit of
+ * visibility map means that we can't update relfrozenxid, so we only want
+ * to do it if we can skip a goodly number. On the other hand, we count
+ * both how many pages we skipped according to all-frozen bit of visibility
+ * map and how many pages we freeze page, so we can update relfrozenxid if
+ * the sum of their is as many as tuples per page.

a.
Typo:
/accorind/according
b.
Is the second part of the comment (starting from "On the other hand")
right?  You are comparing the sum of pages skipped due to the all-frozen
bit and the number of pages frozen against the number of tuples per page.
I don't understand how they are related.


12.
@@ -918,8 +954,13 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
  else
  {
  num_tuples += 1;
+ ntup_in_blk += 1;
  hastup = true;
 
+ /* If current tuple is already frozen, count it up */
+ if (HeapTupleHeaderXminFrozen(tuple.t_data))
+ already_nfrozen += 1;
+
  /*
  * Each non-removable tuple must be checked to see if it needs
  * freezing.  Note we already have exclusive buffer lock.

Here, if the tuple is already frozen, can't we just continue and
check the next tuple?
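
Roughly like this (a sketch against the quoted hunk; it needs checking
that no later per-tuple bookkeeping in the loop would be skipped):

    num_tuples += 1;
    ntup_in_blk += 1;
    hastup = true;

    /* Already frozen: count it and skip the freeze check below. */
    if (HeapTupleHeaderXminFrozen(tuple.t_data))
    {
        already_nfrozen += 1;
        continue;
    }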

13.
+extern XLogRecPtr log_heap_frozenmap(RelFileNode rnode, Buffer heap_buffer,
+ Buffer fm_buffer);

It seems like this function is not used.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
