osdl-dbt3 run results - puzzled by the execution plans

От: Jenny Zhang
Тема: osdl-dbt3 run results - puzzled by the execution plans
Дата: ,
Msg-id: 1063924610.31150.75.camel@ibm-a.pdx.osdl.net
(см: обсуждение, исходный текст)
Ответы: Re: osdl-dbt3 run results - puzzled by the execution plans  ("Matt Clark")
Re: osdl-dbt3 run results - puzzled by the execution plans  (Tom Lane)
Re: osdl-dbt3 run results - puzzled by the execution plans  (Manfred Koizar)
Список: pgsql-performance

Скрыть дерево обсуждения

osdl-dbt3 run results - puzzled by the execution plans  (Jenny Zhang, )
 Re: osdl-dbt3 run results - puzzled by the execution plans  ("Matt Clark", )
  Re: osdl-dbt3 run results - puzzled by the execution  (Jenny Zhang, )
   Re: osdl-dbt3 run results - puzzled by the execution  ("Matt Clark", )
 Re: osdl-dbt3 run results - puzzled by the execution plans  (Tom Lane, )
  Re: osdl-dbt3 run results - puzzled by the execution plans  (Greg Stark, )
   Re: osdl-dbt3 run results - puzzled by the execution  (Jenny Zhang, )
  Re: osdl-dbt3 run results - puzzled by the execution  (Jenny Zhang, )
 Re: osdl-dbt3 run results - puzzled by the execution plans  (Manfred Koizar, )
  Re: osdl-dbt3 run results - puzzled by the execution  (Jenny Zhang, )
   Re: osdl-dbt3 run results - puzzled by the execution  (Manfred Koizar, )
    Re: [HACKERS] osdl-dbt3 run results - puzzled by the execution  (Tom Lane, )

Our hardware/software configuration:
kernel: 2.5.74
distro: RH7.2
pgsql:  7.3.3
CPUS:   8
MHz:    700.217
model:  Pentium III (Cascades)
memory: 8298888 kB
shmmax: 3705032704

We did several sets of runs(repeating runs with the same database
parameters) and have the following observation:

1. With everything else the same, we did two run sets with
small effective_cache_size (default=1000) and large (655360 i.e. 5GB
or 60% of the system memory 8GB).  It seems to me that small
effective_cache_size favors the choice of nested loop joins (NLJ)
while the big effective_cache_size is in favor of merge joins (MJ).
We thought the large effective_cache_size should lead us to better
plans. But we found the opposite.

Three plans out of 22 are different.  Two of those plans are worse
in execution time by 2 times and 8 times.   For example, one plan,
that included  NLJ ran in 4 seconds but the other, switching to an
MJ, ran in  32 seconds.  Please refer to the link at the end of
this mail for the query and plans.  Did we miss something, or
improvements are needed for the optimizer?

2. Thanks to all the response we got from this mailing list, we
decided to use SETSEED(0) default_statistics_target=1000 to reduce
the variation.   We get now the exact the same execution plans
and costs with repeated runs and that reduced the variation a lot.
However, within the same run set consist of 6 runs, we see 2-3%
standard deviation for the run metrics associated with the multiple
stream part of the test (as opposed to the single stream part).

We would like to reduce the variation to be less than 1% so that a
2% change between two different kernels would be significant.
Is there anything else we can do?

query: http://developer.osdl.org/~jenny/11.sql
plan with small effective_cache_size:
http://developer.osdl.org/~jenny/small_effective_cache_size_plan
plan with large effective_cache_size:
http://developer.osdl.org/~jenny/large_effective_cache_size_plan

Thanks,
Jenny



В списке pgsql-performance по дате сообщения:

От: Tom Lane
Дата:
Сообщение: Re: osdl-dbt3 run results - puzzled by the execution plans
От: Manfred Koizar
Дата:
Сообщение: Re: rewrite in to exists?