Thread: Lowering the minimum value for maintenance_work_mem

Lowering the minimum value for maintenance_work_mem

From:
Andres Freund
Date:
Hi,

In the subthread at [1] I needed to trigger multiple rounds of index vacuuming
within one vacuum.

It turns out that with the new dead-tuple storage implementation, that has
actually become somewhat expensive. Particularly if all tuples on all pages
get deleted, the representation is just "too dense". Normally that's obviously
very good, but for testing, not so much:

With the minimum setting of maintenance_work_mem=1024kB, for a simple table
with narrow rows where all rows are deleted, the first cleanup happens only
after 3697812 dead TIDs. The table for that has to be > ~128MB.
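(Back-of-the-envelope, assuming a heap with a single int column at roughly
226 tuples per 8kB page: 3697812 / 226 ≈ 16400 pages, i.e. ≈ 128MB, which
lines up.)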

Needing a ~128MB table to be able to test multiple cleanup passes makes it
much more expensive to test and consequently will lead to worse test coverage.

I think we should consider lowering the minimum setting of
maintenance_work_mem to the minimum of work_mem (64kB). For real-life
workloads maintenance_work_mem=1024kB is already going to be quite bad, so we
don't protect users much by forbidding a setting lower than 1MB.
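
For reference, the respective floors can be eyeballed in pg_settings; on a
stock build min_val is 64 (kB) for work_mem and 1024 (kB) for
maintenance_work_mem:

select name, min_val, unit
from pg_settings
where name in ('work_mem', 'maintenance_work_mem');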


Just for comparison, with a limit of 1MB, versions before 17 needed to do the
first cleanup pass after only 174472 dead tuples. That's a 20x improvement.
Really nice.
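(3697812 / 174472 ≈ 21, so the same memory limit now absorbs roughly 20x as
many dead TIDs before the first index cleanup pass is forced.)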

Greetings,

Andres Freund

[1] https://postgr.es/m/20240516193953.zdj545efq6vabymd%40awork3.anarazel.de



Re: Lowering the minimum value for maintenance_work_mem

From:
Masahiko Sawada
Date:
On Fri, May 17, 2024 at 5:55 AM Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> In the subthread at [1] I needed to trigger multiple rounds of index vacuuming
> within one vacuum.
>
> It turns out that with the new dead tuple implementation, that got actually
> somewhat expensive. Particularly if all tuples on all pages get deleted, the
> representation is just "too dense". Normally that's obviously very good, but
> for testing, not so much:
>
> With the minimum setting of maintenance_work_mem=1024kB, a simple table with
> narrow rows, where all rows are deleted, the first cleanup happens after
> 3697812 dead tids. The table for that has to be > ~128MB.
>
> Needing a ~128MB table to be able to test multiple cleanup passes makes it
> much more expensive to test and consequently will lead to worse test coverage.
>
> I think we should consider lowering the minimum setting of
> maintenance_work_mem to the minimum of work_mem.

+1 for lowering the minimum value of maintenance_work_mem. I've faced
the same situation.

Even if a shared tidstore is empty, TidStoreMemoryUsage() returns
256kB because it's the minimum segment size of DSA, i.e.
DSA_MIN_SEGMENT_SIZE. So we can lower the minimum maintenance_work_mem
down to 256kB, from a vacuum perspective.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: Lowering the minimum value for maintenance_work_mem

From:
John Naylor
Date:
On Mon, May 20, 2024 at 11:59 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Fri, May 17, 2024 at 5:55 AM Andres Freund <andres@anarazel.de> wrote:

> > I think we should consider lowering the minimum setting of
> > maintenance_work_mem to the minimum of work_mem.
>
> +1 for lowering the minimum value of maintenance_work_mem. I've faced
> the same situation.
>
> Even if a shared tidstore is empty, TidStoreMemoryUsage() returns
> 256kB because it's the minimum segment size of DSA, i.e.
> DSA_MIN_SEGMENT_SIZE. So we can lower the minimum maintenance_work_mem
> down to 256kB, from a vacuum perspective.

I've verified that 256kB works with both local and shared memory using the
commands below, and 200k records are enough to cause a second round of index
cleanup. I don't think we can go much smaller than that without changing how
we size the blocks in the node slab contexts (or when they're created), which
is currently somewhat arbitrary. That'll need some thought, at least when we
get a use case with work_mem as the limit.

set maintenance_work_mem = '256kB';

drop table if exists test;
create unlogged table test (a int) with (autovacuum_enabled=false);
insert into test (a) select i from generate_series(1,200_000) i;
create index on test (a);
--create index on test (a); -- toggle for parallel vacuum

delete from test;
vacuum (verbose) test;

Side note: I'm confused why shared memory works at all in this case,
since it failed for 1MB init segments until we allowed callers to
specify a smaller init size. The overhead for DSA seems to be
significant for small sizes, as evidenced by the amount of usable
memory:

shared:
INFO:  finished vacuuming "john.public.test": index scans: 56

local:
INFO:  finished vacuuming "john.public.test": index scans: 2
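
(A rough read of those numbers: 56 vs 2 index scans for the same 200k dead
TIDs suggests the shared tidstore ends up with only about 1/28 of the usable
capacity of the local one at this memory setting.)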