Re: Show lossy heap block info in EXPLAIN ANALYZE for bitmap heap scan

From: Amit Khandekar
Subject: Re: Show lossy heap block info in EXPLAIN ANALYZE for bitmap heap scan
Date:
Msg-id: CACoZds1nh6YD_X_Kdy7Yx=QVquULTX3nvJbmz9G7uW7Zf9aGMA@mail.gmail.com
In reply to: Re: Show lossy heap block info in EXPLAIN ANALYZE for bitmap heap scan  ("Etsuro Fujita" <fujita.etsuro@lab.ntt.co.jp>)
Responses: Re: Show lossy heap block info in EXPLAIN ANALYZE for bitmap heap scan  ("Etsuro Fujita" <fujita.etsuro@lab.ntt.co.jp>)
List: pgsql-hackers



On 1 November 2013 16:32, Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp> wrote:
> From: Fujii Masao [mailto:masao.fujii@gmail.com]

> This is what I'm looking for! This feature is really useful for tuning
> work_mem when using full text search with pg_trgm.
>
> I'm not sure if it's a good idea to show the number of fetches, because it
> seems difficult to tune work_mem from that number. How can we calculate how
> much to increase work_mem to avoid a lossy bitmap from the number of fetches
> in the EXPLAIN output?

We can calculate that from the following equation in tbm_create():

  nbuckets = maxbytes /
    (MAXALIGN(sizeof(HASHELEMENT)) + MAXALIGN(sizeof(PagetableEntry))
    + sizeof(Pointer) + sizeof(Pointer)),

where maxbytes is the size of memory used for the hashtable in a TIDBitmap,
designated by work_mem, and nbuckets is the estimated number of hashtable
entries we can have within maxbytes.  From this, the size of work_mem within
which we can have every hashtable entry as an exact bitmap is calculated as
follows:

  work_mem = (the number of exact pages + the number of lossy pages) *
    (MAXALIGN(sizeof(HASHELEMENT)) + MAXALIGN(sizeof(PagetableEntry))
    + sizeof(Pointer) + sizeof(Pointer)) /
    (1024 * 1024).

I have yet to give more thought to the above formula (particularly the exact_pages + lossy_pages part), but I was also wondering whether users would really be able to figure out this way of estimating the memory themselves, or whether EXPLAIN itself should show the estimated memory required for the bitmap. For hash joins we already show the memory taken by the hash table in show_hash_info(); we could similarly show the memory requirement in addition to the number of exact/lossy pages.
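To illustrate the estimate numerically, here is a small sketch that plugs page counts into the formula above. The per-entry size (MAXALIGN(sizeof(HASHELEMENT)) + MAXALIGN(sizeof(PagetableEntry)) + 2 * sizeof(Pointer)) depends on the platform; the 90-byte figure below is an assumption for a typical 64-bit build, chosen to match the "about 5MB" result quoted later in this mail, not a value taken from the PostgreSQL source:

```python
# Hedged sketch: estimate the work_mem (in MB) needed so that every
# TIDBitmap hashtable entry can stay exact, given the exact/lossy page
# counts reported by EXPLAIN ANALYZE.
#
# BYTES_PER_ENTRY is an assumed per-entry size for a 64-bit build; the
# real value is the expression from tbm_create():
#   MAXALIGN(sizeof(HASHELEMENT)) + MAXALIGN(sizeof(PagetableEntry))
#   + sizeof(Pointer) + sizeof(Pointer)
BYTES_PER_ENTRY = 90  # assumption, not measured

def work_mem_mb(exact_pages, lossy_pages, bytes_per_entry=BYTES_PER_ENTRY):
    """Estimated work_mem (MB) to hold one exact entry per heap page."""
    return (exact_pages + lossy_pages) * bytes_per_entry / (1024 * 1024)

# Page counts from the work_mem = 1MB example in this mail:
#   Heap Blocks: exact=11975 lossy=46388
print(round(work_mem_mb(11975, 46388), 1))  # prints 5.0
```

With these assumed sizes, the 58363 pages from the first plan come out to roughly 5MB, which is the setting used for the second plan.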


I'll show you an example.  The following is the result for work_mem = 1MB:

postgres=# EXPLAIN ANALYZE SELECT * FROM demo WHERE col2 between 0.01 and 0.02;
                                                      QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on demo  (cost=2716.54..92075.46 rows=105766 width=34) (actual time=24.907..1119.961 rows=100047 loops=1)
   Recheck Cond: ((col2 >= 0.01::double precision) AND (col2 <= 0.02::double precision))
   Rows Removed by Index Recheck: 5484114
   Heap Blocks: exact=11975 lossy=46388
   ->  Bitmap Index Scan on demo_idx  (cost=0.00..2690.09 rows=105766 width=0) (actual time=22.821..22.821 rows=100047 loops=1)
         Index Cond: ((col2 >= 0.01::double precision) AND (col2 <= 0.02::double precision))
 Total runtime: 1129.334 ms
(7 rows)

So, by setting work_mem to

  work_mem = (11975 + 46388) *
    (MAXALIGN(sizeof(HASHELEMENT)) + MAXALIGN(sizeof(PagetableEntry))
    + sizeof(Pointer) + sizeof(Pointer)) /
    (1024 * 1024),

which is about 5MB, we get the following (note that there are no lossy heap pages!):

postgres=# EXPLAIN ANALYZE SELECT * FROM demo WHERE col2 between 0.01 and 0.02;
                                                      QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on demo  (cost=2716.54..92075.46 rows=105766 width=34) (actual time=42.981..120.252 rows=100047 loops=1)
   Recheck Cond: ((col2 >= 0.01::double precision) AND (col2 <= 0.02::double precision))
   Heap Blocks: exact=58363
   ->  Bitmap Index Scan on demo_idx  (cost=0.00..2690.09 rows=105766 width=0) (actual time=26.023..26.023 rows=100047 loops=1)
         Index Cond: ((col2 >= 0.01::double precision) AND (col2 <= 0.02::double precision))
 Total runtime: 129.304 ms
(6 rows)

BTW, showing the number of exact/lossy heap pages in the EXPLAIN ANALYZE output
would be fine with me.

> Anyway, could you add the patch into next CF?

Done.

Thanks,

Best regards,
Etsuro Fujita



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
