Обсуждение: Bitmap Heap scan 8.1/8.2
I have to PG servers, one ver. 8.1.9 and the other 8.2.4.
I was checking a query out and found that with the exact same DB (same
data in it) and the same query I get different plans, and significantly
higher time in 8.2:
On 8.1 I get:
test=> explain analyze SELECT * FROM prestamos WHERE biblioteca = 19 AND
vencimiento < now() AND NOT devuelto ORDER BY vencimiento DESC;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=2793.74..2795.82 rows=832 width=20) (actual
time=25.795..25.832 rows=49 loops=1)
Sort Key: vencimiento
-> Bitmap Heap Scan on prestamos (cost=850.43..2753.39 rows=832
width=20) (actual time=20.747..25.529 rows=49 loops=1)
Filter: ((biblioteca = 19) AND (vencimiento < now()) AND (NOT
devuelto))
-> Bitmap Index Scan on prestamos_objetos_devuelto_idx
(cost=0.00..850.43 rows=2200 width=0) (actual time=20.265..20.265
rows=2301 loops=1)
Index Cond: (devuelto = false)
Total runtime: 25.971 ms
(7 filas)
On 8.2:
test=> explain analyze SELECT * FROM prestamos WHERE biblioteca = 19 AND
vencimiento < now() AND NOT devuelto ORDER BY vencimiento DESC;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
Sort (cost=4100.77..4102.77 rows=800 width=20) (actual
time=95.082..95.103 rows=49 loops=1)
Sort Key: vencimiento
-> Seq Scan on prestamos (cost=0.00..4062.20 rows=800 width=20)
(actual time=7.293..82.778 rows=49 loops=1)
Filter: ((biblioteca = 19) AND (vencimiento < now()) AND (NOT
devuelto))
Total runtime: 95.165 ms
(5 filas)
It's like it's ignoring the HEAP scan. Why? It's obviously (unless I'm
totally wrong) faster in this case.
--
21:50:04 up 2 days, 9:07, 0 users, load average: 0.92, 0.37, 0.18
---------------------------------------------------------
Lic. Martín Marqués | SELECT 'mmarques' ||
Centro de Telemática | '@' || 'unl.edu.ar';
Universidad Nacional | DBA, Programador,
del Litoral | Administrador
---------------------------------------------------------
Hello I am unsure, did you check config values? Pavel 2007/10/22, Martin Marques <martin@bugs.unl.edu.ar>: > I have to PG servers, one ver. 8.1.9 and the other 8.2.4. > > I was checking a query out and found that with the exact same DB (same > data in it) and the same query I get different plans, and significantly > higher time in 8.2: > > On 8.1 I get: > > test=> explain analyze SELECT * FROM prestamos WHERE biblioteca = 19 AND > vencimiento < now() AND NOT devuelto ORDER BY vencimiento DESC; > > QUERY PLAN > ------------------------------------------------------------------------------------------------------------------------------------------------------- > Sort (cost=2793.74..2795.82 rows=832 width=20) (actual > time=25.795..25.832 rows=49 loops=1) > Sort Key: vencimiento > -> Bitmap Heap Scan on prestamos (cost=850.43..2753.39 rows=832 > width=20) (actual time=20.747..25.529 rows=49 loops=1) > Filter: ((biblioteca = 19) AND (vencimiento < now()) AND (NOT > devuelto)) > -> Bitmap Index Scan on prestamos_objetos_devuelto_idx > (cost=0.00..850.43 rows=2200 width=0) (actual time=20.265..20.265 > rows=2301 loops=1) > Index Cond: (devuelto = false) > Total runtime: 25.971 ms > (7 filas) > > On 8.2: > > test=> explain analyze SELECT * FROM prestamos WHERE biblioteca = 19 AND > vencimiento < now() AND NOT devuelto ORDER BY vencimiento DESC; > QUERY PLAN > ----------------------------------------------------------------------------------------------------------------- > Sort (cost=4100.77..4102.77 rows=800 width=20) (actual > time=95.082..95.103 rows=49 loops=1) > Sort Key: vencimiento > -> Seq Scan on prestamos (cost=0.00..4062.20 rows=800 width=20) > (actual time=7.293..82.778 rows=49 loops=1) > Filter: ((biblioteca = 19) AND (vencimiento < now()) AND (NOT > devuelto)) > Total runtime: 95.165 ms > (5 filas) > > It's like it's ignoring the HEAP scan. Why? It's obviously (unless I'm > totally wrong) faster in this case. > > -- > 21:50:04 up 2 days, 9:07, 0 users, load average: 0.92, 0.37, 0.18 > --------------------------------------------------------- > Lic. Martín Marqués | SELECT 'mmarques' || > Centro de Telemática | '@' || 'unl.edu.ar'; > Universidad Nacional | DBA, Programador, > del Litoral | Administrador > --------------------------------------------------------- > > ---------------------------(end of broadcast)--------------------------- > TIP 6: explain analyze is your friend >
Pavel Stehule wrote:
> Hello
>
> I am unsure, did you check config values?
Don't know which ones you are talking about, but all enable_* are set to on.
Anything else?
--
21:50:04 up 2 days, 9:07, 0 users, load average: 0.92, 0.37, 0.18
---------------------------------------------------------
Lic. Martín Marqués | SELECT 'mmarques' ||
Centro de Telemática | '@' || 'unl.edu.ar';
Universidad Nacional | DBA, Programador,
del Litoral | Administrador
---------------------------------------------------------
2007/10/22, Martin Marques <martin@bugs.unl.edu.ar>: > Pavel Stehule wrote: > > Hello > > > > I am unsure, did you check config values? > > Don't know which ones you are talking about, but all enable_* are set to on. > > Anything else? > shared_buffers work_mem effective_cache_size Pavel > -- > 21:50:04 up 2 days, 9:07, 0 users, load average: 0.92, 0.37, 0.18 > --------------------------------------------------------- > Lic. Martín Marqués | SELECT 'mmarques' || > Centro de Telemática | '@' || 'unl.edu.ar'; > Universidad Nacional | DBA, Programador, > del Litoral | Administrador > --------------------------------------------------------- >
> 2007/10/22, Martin Marques <martin@bugs.unl.edu.ar>: >> Pavel Stehule wrote: >>> Hello >>> >>> I am unsure, did you check config values? >> Don't know which ones you are talking about, but all enable_* are set to on. >> >> Anything else? >> > > shared_buffers > work_mem > effective_cache_size > > Pavel Well, the cost_* values might be interesting too. That is seq_page_cost random_page_cost cpu_tuple_cost cpu_index_tuple_cost cpu_operator_cost effective_cache_size (the first one is available in 8.2 only). My guess is that there are different values, and the 8.2 overestimates the index scan - which seems to be incorrect. Try to disable the seqscan in the 8.2 database (set enable_seqscan = off), and run the explain analyze again. This time it should choose different query plan - maybe the index scan as in 8.1. Another thing you might try is setting the cost values to the same values in both databases - it might help. And what does it mean by 'same data' - have you vacuumed / analyzed both of them? What does this return: select relname, relpages, reltuples from pg_class where relname = 'prestamos' or relname='prestamos_objetos_devuelto_idx'; That should return number of tuples / occupied pages in the table and index. Tomas
Pavel Stehule wrote:
> 2007/10/22, Martin Marques <martin@bugs.unl.edu.ar>:
>> Pavel Stehule wrote:
>>> Hello
>>>
>>> I am unsure, did you check config values?
>> Don't know which ones you are talking about, but all enable_* are set to on.
>>
>> Anything else?
>>
>
> shared_buffers
8.1:
16000
8.2:
400MB
> work_mem
8.1:
8192
8.2:
4MB
> effective_cache_size
8.1:
1000
8.2:
128MB
--
21:50:04 up 2 days, 9:07, 0 users, load average: 0.92, 0.37, 0.18
---------------------------------------------------------
Lic. Martín Marqués | SELECT 'mmarques' ||
Centro de Telemática | '@' || 'unl.edu.ar';
Universidad Nacional | DBA, Programador,
del Litoral | Administrador
---------------------------------------------------------
2007/10/22, Martin Marques <martin@bugs.unl.edu.ar>: > Pavel Stehule wrote: > > 2007/10/22, Martin Marques <martin@bugs.unl.edu.ar>: > >> Pavel Stehule wrote: > >>> Hello > >>> > >>> I am unsure, did you check config values? > >> Don't know which ones you are talking about, but all enable_* are set to on. > >> > >> Anything else? > >> > > > > shared_buffers > > 8.1: > > 16000 ~ 128M > > > 8.2: > > 400MB > > > > work_mem > > 8.1: > > 8192 8M !!!! 8>4 > > 8.2: > > 4MB > > > > effective_cache_size > > 8.1: > > 1000 8M > > 8.2: > > 128MB > > try set work_mem to '8MB'; and explain analyze select ..
Tomas Vondra wrote:
>
>> 2007/10/22, Martin Marques <martin@bugs.unl.edu.ar>:
>>> Pavel Stehule wrote:
>>>> Hello
>>>>
>>>> I am unsure, did you check config values?
>>> Don't know which ones you are talking about, but all enable_* are set
>>> to on.
>>>
>>> Anything else?
>>>
>>
>> shared_buffers
>> work_mem
>> effective_cache_size
>>
>> Pavel
>
> Well, the cost_* values might be interesting too. That is
>
> seq_page_cost
> random_page_cost
> cpu_tuple_cost
> cpu_index_tuple_cost
> cpu_operator_cost
> effective_cache_size
>
> (the first one is available in 8.2 only). My guess is that there are
> different values, and the 8.2 overestimates the index scan - which seems
> to be incorrect.
>
> Try to disable the seqscan in the 8.2 database (set enable_seqscan =
> off), and run the explain analyze again. This time it should choose
> different query plan - maybe the index scan as in 8.1.
siprebi-1.4=> SHOW enable_seqscan ;
enable_seqscan
----------------
off
(1 fila)
siprebi-1.4=> explain analyze SELECT usuarios,nticket,objeto,vencimiento
FROM prestamos WHERE biblioteca = 19 AND vencimiento < now() AND NOT
devuelto ORDER BY vencimiento DESC;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=4365.26..4367.26 rows=800 width=20) (actual
time=30.736..30.755 rows=49 loops=1)
Sort Key: vencimiento
-> Bitmap Heap Scan on prestamos (cost=2502.69..4326.68 rows=800
width=20) (actual time=28.983..30.644 rows=49 loops=1)
Filter: ((biblioteca = 19) AND (vencimiento < now()) AND (NOT
devuelto))
-> Bitmap Index Scan on prestamos_usuarios_devuelto_idx
(cost=0.00..2502.49 rows=1976 width=0) (actual time=28.874..28.874
rows=2300 loops=1)
Index Cond: (devuelto = false)
Total runtime: 45.725 ms
Here I see that the time has been halved. So this plan is much better.
> Another thing you might try is setting the cost values to the same
> values in both databases - it might help.
Which ones and how?
> And what does it mean by 'same data' - have you vacuumed / analyzed both
> of them? What does this return:
dumped the 8.1 DB and restored in the 8.2. Both were vacummed analyze to
have accurate stats.
> select relname, relpages, reltuples from pg_class where relname =
> 'prestamos' or relname='prestamos_objetos_devuelto_idx';
>
> That should return number of tuples / occupied pages in the table and
> index.
select relname, relpages, reltuples from pg_class where relname =
'prestamos' or relname='prestamos_objetos_devuelto_idx';
relname | relpages | reltuples
--------------------------------+----------+-----------
prestamos_objetos_devuelto_idx | 373 | 134697
prestamos | 1705 | 134697
--
21:50:04 up 2 days, 9:07, 0 users, load average: 0.92, 0.37, 0.18
---------------------------------------------------------
Lic. Martín Marqués | SELECT 'mmarques' ||
Centro de Telemática | '@' || 'unl.edu.ar';
Universidad Nacional | DBA, Programador,
del Litoral | Administrador
---------------------------------------------------------
Pavel Stehule wrote:
> 2007/10/22, Martin Marques <martin@bugs.unl.edu.ar>:
>> Pavel Stehule wrote:
>>> 2007/10/22, Martin Marques <martin@bugs.unl.edu.ar>:
>>>> Pavel Stehule wrote:
>>>>> Hello
>>>>>
>>>>> I am unsure, did you check config values?
>>>> Don't know which ones you are talking about, but all enable_* are set to on.
>>>>
>>>> Anything else?
>>>>
>>> shared_buffers
>> 8.1:
>>
>> 16000
> ~ 128M
>>
>> 8.2:
>>
>> 400MB
>>
>>
>>> work_mem
>> 8.1:
>>
>> 8192
> 8M !!!! 8>4
>> 8.2:
>>
>> 4MB
>>
>>
>>> effective_cache_size
>> 8.1:
>>
>> 1000
> 8M
>> 8.2:
>>
>> 128MB
>>
>>
>
> try
>
> set work_mem to '8MB';
> and
> explain analyze select ..
These things didn't help. What changed the plan completely was this:
seq_page_cost = 5.0 # measured on an arbitrary scale
cpu_tuple_cost = 0.05 # same scale as above
Specially the first one. Now I get this:
explain analyze SELECT usuarios,nticket,objeto,vencimiento FROM
prestamos WHERE biblioteca = 19 AND vencimiento < now() AND NOT
devuelto ORDER BY vencimiento DESC;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=7058.86..7060.86 rows=800 width=20) (actual
time=22.850..22.888 rows=95 loops=1)
Sort Key: vencimiento
-> Index Scan using prestamos_objetos_devuelto_idx on prestamos
(cost=0.00..7020.28 rows=800 width=20) (actual time=0.346..22.590
rows=95 loops=1)
Index Cond: (devuelto = false)
Filter: ((biblioteca = 19) AND (vencimiento < now()) AND (NOT
devuelto))
Total runtime: 22.973 ms
--
21:50:04 up 2 days, 9:07, 0 users, load average: 0.92, 0.37, 0.18
---------------------------------------------------------
Lic. Martín Marqués | SELECT 'mmarques' ||
Centro de Telemática | '@' || 'unl.edu.ar';
Universidad Nacional | DBA, Programador,
del Litoral | Administrador
---------------------------------------------------------
Martin Marques escribió: > Pavel Stehule wrote: >> >> try >> >> set work_mem to '8MB'; >> and >> explain analyze select .. > > These things didn't help. What changed the plan completely was this: > > seq_page_cost = 5.0 # measured on an arbitrary scale > cpu_tuple_cost = 0.05 # same scale as above Can someone explain how this parameters are measured? What is 5.0 in this case for seq_page_cost?
2007/10/23, Martin Marques <martin@bugs.unl.edu.ar>: > Martin Marques escribió: > > Pavel Stehule wrote: > >> > >> try > >> > >> set work_mem to '8MB'; > >> and > >> explain analyze select .. > > > > These things didn't help. What changed the plan completely was this: > > > > seq_page_cost = 5.0 # measured on an arbitrary scale > > cpu_tuple_cost = 0.05 # same scale as above > > Can someone explain how this parameters are measured? What is 5.0 in > this case for seq_page_cost? > http://www.postgresql.org/docs/8.2/interactive/runtime-config-query.html 5.0 means so seq scan will be expensive for optimaliser, and optimaliser will prefer index scan.
Martin Marques escribió: > Martin Marques escribió: >> Pavel Stehule wrote: >>> >>> try >>> >>> set work_mem to '8MB'; >>> and >>> explain analyze select .. >> These things didn't help. What changed the plan completely was this: >> seq_page_cost = 5.0 # measured on an arbitrary scale >> cpu_tuple_cost = 0.05 # same scale as above > > Can someone explain how this parameters are measured? What is 5.0 in this > case for seq_page_cost? It's an arbitrary number, based on which all the other numbers are measured. What people generally do around here is mess with random_page_cost, and leave seq_page_cost alone. Often, it's the ratio seq_page_cost/random_page_cost what's most important to the cost equations results. (seq_page_cost wasn't tunable at all until recently, say 8.1 or 8.2 AFAIR). -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Martin Marques escribi�:
>>> seq_page_cost = 5.0 # measured on an arbitrary scale
> What people generally do around here is mess with random_page_cost, and
> leave seq_page_cost alone.
It's also worth pointing out that having seq_page_cost >
random_page_cost is simply not sane, and is very likely to result in
performance problems in other queries.
regards, tom lane
Alvaro Herrera wrote:
>
> It's an arbitrary number, based on which all the other numbers are
> measured.
Now that I read more intensively he docs I see that all the cost
parameters are related one with the other.
> What people generally do around here is mess with random_page_cost, and
> leave seq_page_cost alone. Often, it's the ratio
> seq_page_cost/random_page_cost what's most important to the cost
> equations results. (seq_page_cost wasn't tunable at all until
> recently, say 8.1 or 8.2 AFAIR).
Ok, now the 8.1 server has a RAID1 hardware board with SCSI disks, and
the 8.2 is just a PentiumD with SATA disks (it's my desktop PC where I
do tests). Should I have a lower random_page_cost on a machine that is
likely to have a lower disk IO speed?
--
21:50:04 up 2 days, 9:07, 0 users, load average: 0.92, 0.37, 0.18
---------------------------------------------------------
Lic. Martín Marqués | SELECT 'mmarques' ||
Centro de Telemática | '@' || 'unl.edu.ar';
Universidad Nacional | DBA, Programador,
del Litoral | Administrador
---------------------------------------------------------
Martin Marques <martin@bugs.unl.edu.ar> writes:
> Ok, now the 8.1 server has a RAID1 hardware board with SCSI disks, and
> the 8.2 is just a PentiumD with SATA disks (it's my desktop PC where I
> do tests). Should I have a lower random_page_cost on a machine that is
> likely to have a lower disk IO speed?
I'd guess the other way round: cheaper disks are likely to have worse
seek speeds, which translates to random accesses being proportionally
slower.
regards, tom lane