Re: [PoC] Improve dead tuple storage for lazy vacuum

From: John Naylor
Subject: Re: [PoC] Improve dead tuple storage for lazy vacuum
Msg-id: CANWCAZbSp1txmTopBmn9Y7spQ0yM-wzh13OFbMTFFMswCpQL0w@mail.gmail.com
In reply to: Re: [PoC] Improve dead tuple storage for lazy vacuum  (John Naylor <johncnaylorls@gmail.com>)
Responses: Re: [PoC] Improve dead tuple storage for lazy vacuum
List: pgsql-hackers
On Wed, Jan 3, 2024 at 9:10 PM John Naylor <johncnaylorls@gmail.com> wrote:
>
> On Tue, Jan 2, 2024 at 8:01 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> > I agree that we expose RT_LOCK_* functions and have tidstore use them,
> > but am not sure about the "if (TidStoreIsShared(ts)) LWLockAcquire(..., ...)"
> > calls part. I think that even if we expose them, we will still need to
> > do something like "if (TidStoreIsShared(ts))
> > shared_rt_lock_share(ts->tree.shared)", no?
>
> I'll come back to this topic separately.

To answer your question, sure, but that "if (TidStoreIsShared(ts))"
part would be pushed down into a function so that only one place has
to care about it.
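
Something like this, say (a sketch only -- the wrapper names here are
invented; shared_rt_lock_share() is from your message above):

static inline void
tidstore_lock_share(TidStore *ts)
{
    if (TidStoreIsShared(ts))
        shared_rt_lock_share(ts->tree.shared);
}

static inline void
tidstore_unlock(TidStore *ts)
{
    if (TidStoreIsShared(ts))
        shared_rt_unlock(ts->tree.shared);  /* unlock name also invented */
}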

However, I'm starting to question whether we even need to lock the
tidstore separately at all. To "lock the tidstore" means to take a
lock, _separate_ from the radix tree's internal lock, to control
access to two fields in a separate "control object":

+typedef struct TidStoreControl
+{
+ /* the number of tids in the store */
+ int64 num_tids;
+
+ /* the maximum bytes a TidStore can use */
+ size_t max_bytes;

I'm pretty sure max_bytes does not need to be in shared memory, and
certainly not under a lock. Thinking of a hypothetical
parallel-prune-phase scenario, one way would be for a leader process
to pass out ranges of blocks to workers, and when the limit is
exceeded, to stop passing out blocks and wait for all the workers to
finish.
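
Roughly like this (purely illustrative -- the chunk size and helper
names are invented, and TidStoreMemoryUsage() is assumed from the
tidstore API; note that max_bytes lives only in the leader's local
memory):

BlockNumber blkno = 0;

while (blkno < rel_nblocks)
{
    BlockNumber chunk_end = Min(blkno + BLOCKS_PER_CHUNK, rel_nblocks);

    /* no lock: only the leader reads the limit */
    if (TidStoreMemoryUsage(ts) > max_bytes)
        break;              /* stop handing out block ranges */

    hand_range_to_worker(blkno, chunk_end);
    blkno = chunk_end;
}
wait_for_all_workers();
/* ... index vacuuming, then resume ... */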

As for num_tids, vacuum previously kept a similar count in

@@ -176,7 +179,8 @@ struct ParallelVacuumState
  PVIndStats *indstats;

  /* Shared dead items space among parallel vacuum workers */
- VacDeadItems *dead_items;
+ TidStore *dead_items;

VacDeadItems contained "num_items". What was the reason to have new
infrastructure for that count? And it doesn't seem like access to it
was controlled by a lock -- can you confirm? If we did get parallel
pruning, maybe the count would belong inside PVShared?
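
For example (hypothetical -- the field name is invented, and this
assumes workers would update the count with atomics rather than under
a lock):

/* in PVShared: */
pg_atomic_uint64    num_dead_tids;

/* in a pruning worker, after recording one page's dead TIDs: */
pg_atomic_add_fetch_u64(&pvs->shared->num_dead_tids, ndead);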

The number of tids is not that tightly bound to the tidstore's job. I
believe tidbitmap.c (a possible future client) doesn't care about the
global number of tids -- not only that, but AND/OR operations can
change the number in a non-obvious way, so it would not be convenient
to keep an accurate number anyway. But the lock would still be
mandatory with this patch.

If we can make vacuum work a bit closer to how it does now, it'd be a
big step up in readability, I think. Namely, getting rid of all the
locking logic inside tidstore.c and letting the radix tree's locking
do the right thing. We'd need to make that work correctly when
receiving pointers to values upon lookup, and I already shared ideas
for that. But I want to see if there is any obstacle in the way of
removing the tidstore control object and its separate lock.
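
For a lookup that returns a pointer into the tree, that might look
like this (a sketch; RT_LOCK_SHARE/RT_UNLOCK/RT_FIND stand in for
whatever the template ends up exposing):

RT_LOCK_SHARE(tree);
value = RT_FIND(tree, key);
if (value != NULL)
    use_value(value);       /* pointer only valid while the lock is held */
RT_UNLOCK(tree);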


