Re: Heavily modified big table bloat even in auto vacuum is running

From: Haribabu kommi
Subject: Re: Heavily modified big table bloat even in auto vacuum is running
Date:
Msg-id: 8977CB36860C5843884E0A18D8747B0372BF34D5@szxeml558-mbs.china.huawei.com
In reply to: Re: Heavily modified big table bloat even in auto vacuum is running  (Amit Kapila <amit.kapila16@gmail.com>)
Responses: Re: Heavily modified big table bloat even in auto vacuum is running
           Re: Heavily modified big table bloat even in auto vacuum is running
List: pgsql-hackers
On 06 December 2013 11:57 Amit Kapila wrote:
> On Fri, Nov 29, 2013 at 6:55 PM, Haribabu kommi
> <haribabu.kommi@huawei.com> wrote:
> > On 29 November 2013 12:00 Amit Kapila wrote:
> >> On Tue, Nov 26, 2013 at 7:26 PM, Haribabu kommi
> >> <haribabu.kommi@huawei.com> wrote:
> >> A few questions about your latest patch:
> >> a. Is there any reason why you are doing the estimation of dead tuples
> >>    only for Autovacuum and not for Vacuum?
> >
> > No, changed.
> >
> >> /* clear and get the new stats for calculating proper dead tuples */
> >> pgstat_clear_snapshot();
> >> tabentry = pgstat_fetch_stat_tabentry(RelationGetRelid(onerel));
> >> b. In the above code, to get the latest data you are first clearing the
> >>    snapshot and then calling the pgstat function. It will in turn perform
> >>    I/O (read of the stats file) and send/receive messages to/from the
> >>    stats collector to ensure it can read the latest data. I think it will
> >>    add overhead to Vacuum, especially if 'nkeep' calculated in the
> >>    function lazy_scan_heap() can serve the purpose. In my simple test[1],
> >>    I observed that the value of nkeep can serve the purpose.
> >>
> >> Can you please try the test once with the 'nkeep' approach patch.
> >
> > Using the nkeep and snapshot approaches, I ran the test for 40 mins
> > with a high analyze_threshold and the results are below.
> >
> >                        Auto vacuum count    Bloat size
> > Master                        11              220MB
> > Patched_nkeep                 14              215MB
> > Patched_snapshot              18              198MB
> >
> > Both approaches are showing a good improvement in the test.
> > The updated patches, test script and configuration are attached to the
> > mail.
>
> I think updating the dead tuple count using nkeep is a good idea, as a
> similar thing is done for Analyze as well in acquire_sample_rows().
> One minor point: I think it is better to log dead tuples in the below
> log message:
> ereport(LOG,
>         (errmsg("automatic vacuum of table \"%s.%s.%s\": index scans: %d\n"
>                 "pages: %d removed, %d remain\n"
>                 "tuples: %.0f removed, %.0f remain\n"
>
> i.e. change the tuples line to something like:
>
>                 "tuples: %.0f removed, %.0f remain, %.0f dead\n"
>
> About your test, how do I collect the data by running this script? Are
> you manually stopping it after 40 mins? I ran it for more than an hour
> and the final result didn't come.
> As I mentioned to you last time, please simplify your test; in its
> current form it is difficult for another person to make meaning out of
> it. Write comments on top of it, in steps, to explain what exactly it is
> doing and how to take data using it (for example, do I need to wait
> till the script ends; how long can this test take to complete).

A simplified test and an updated patch taking care of the above comments are attached to this mail.
I was not able to reduce the test duration, but I changed the test so that it exits automatically after a 45 minute run.
Please check the vacuum_test.sh file for more details on running the test.

                 Auto vacuum count   Bloat size
Master                  15             220MB
Patched_nkeep           18             213MB
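
To make the effect of carrying nkeep over more concrete, here is a small standalone illustration (not from the patch or the attached test; all numbers are made up) of how much sooner the next autovacuum fires when the dead tuple count in the stats collector is seeded with the tuples vacuum had to keep, instead of restarting from zero:

/*
 * Standalone illustration, not PostgreSQL source.  The trigger check
 * mirrors autovacuum's formula: vacuum when
 *   n_dead_tuples > autovacuum_vacuum_threshold
 *                   + autovacuum_vacuum_scale_factor * reltuples
 * The table size, churn rate and nkeep value are invented numbers used
 * only to show the effect.
 */
#include <stdio.h>
#include <stdbool.h>

static bool
needs_autovacuum(double n_dead_tuples, double reltuples)
{
    const double vac_base_thresh = 50;      /* autovacuum_vacuum_threshold default */
    const double vac_scale_factor = 0.2;    /* autovacuum_vacuum_scale_factor default */

    return n_dead_tuples > vac_base_thresh + vac_scale_factor * reltuples;
}

/* minutes of steady churn before the next autovacuum is triggered */
static int
minutes_until_vacuum(double start_dead, double reltuples, double churn_per_min)
{
    double  dead = start_dead;
    int     minutes = 0;

    while (!needs_autovacuum(dead, reltuples))
    {
        dead += churn_per_min;
        minutes++;
    }
    return minutes;
}

int
main(void)
{
    double  reltuples = 1000000;    /* live tuples in the table */
    double  nkeep = 150000;         /* dead tuples vacuum saw but could not remove */
    double  churn_per_min = 10000;  /* new dead tuples produced per minute */

    /* current behaviour: the dead tuple count effectively restarts near zero */
    printf("restart from 0:     next autovacuum after %d min\n",
           minutes_until_vacuum(0, reltuples, churn_per_min));

    /* nkeep approach: the count restarts from the tuples vacuum had to keep */
    printf("restart from nkeep: next autovacuum after %d min\n",
           minutes_until_vacuum(nkeep, reltuples, churn_per_min));

    return 0;
}

The exact numbers are arbitrary; the point is only that the next autovacuum on a heavily modified table fires much earlier when the kept dead tuples are not forgotten after a vacuum.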

Please let me know your suggestions.

Regards,
Hari babu.


Attachments
