Re: experiments in query optimization

From: Faheem Mitha
Subject: Re: experiments in query optimization
Date:
Msg-id: alpine.DEB.2.00.1003302242210.13883@orwell.homelinux.org
In response to: Re: experiments in query optimization  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
List: pgsql-performance
One thing which I haven't really mentioned in this thread or in my writeup
is that the planner's estimate of the number of rows in geno is way off
base some of the time. The actual count is around 800 million, but the
planner thinks it is 100 million. I don't know whether this is significant,
or what to do about it.

E.g., in the ped_bigjoin EXPLAIN ANALYZE VERBOSE:

   ->  Sort  (cost=56855882.72..57144683.54 rows=115520330 width=42)
(actual time=23027732.092..37113627.380 rows=823086774 loops=1)
         Output: (CASE WHEN (hapmap.geno.snpval_id = (-1)) THEN '0 0'::text
WHEN (hapmap.geno.snpval_id = 0) THEN
(((dedup_patient_anno.allelea_id)::text || ' '::text) ||
(dedup_patient_anno.allelea_id)::text) WHEN (hapmap.geno.snpval_id = 1)
THEN (((dedup_patient_anno.allelea_id)::text || ' '::text) ||
(dedup_patient_anno.alleleb_id)::text) WHEN (hapmap.geno.snpval_id = 2)
THEN (((dedup_patient_anno.alleleb_id)::text || ' '::text) ||
(dedup_patient_anno.alleleb_id)::text) ELSE NULL::text END),
hapmap.geno.idlink_id, hapmap.geno.anno_id, pheno.patientid,
pheno.phenotype, sex.code
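One way to check how far off the estimate is, and to try to improve it, is
to compare the planner's guess against a real count and then raise the
statistics detail before re-analyzing. A sketch (the column choice here is
only illustrative, not something from the thread):

```sql
-- Compare the planner's row estimate (from EXPLAIN) with the ground truth.
-- The count(*) is slow on a table this size, but it is exact.
EXPLAIN SELECT * FROM hapmap.geno;
SELECT count(*) FROM hapmap.geno;

-- A plain ANALYZE refreshes reltuples, which drives the base-table estimate.
ANALYZE hapmap.geno;

-- If join-output estimates stay badly off, more detailed column statistics
-- can help; snpval_id is an illustrative choice of column.
ALTER TABLE hapmap.geno ALTER COLUMN snpval_id SET STATISTICS 1000;
ANALYZE hapmap.geno;
```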

                                                       Faheem.

On Tue, 30 Mar 2010, Kevin Grittner wrote:

> Faheem Mitha <faheem@email.unc.edu> wrote:
>
>>> If you're concerned about memory usage, try reducing work_mem;
>>> you've probably got it set to something huge.
>>
>> work_mem = 1 GB (see diag.{tex/pdf}).
>>
>> The point isn't that I'm using so much memory. Again, my question
>> is, why are these changes affecting memory usage so drastically?
>
> Because the planner looks at a very wide variety of plans, some of
> which may use many allocations of work_mem size, and some of which
> don't.  The costs are compared and the lowest cost one is chosen. If
> you are close to the "tipping point" then even a very small change
> might affect which is chosen.  It pays to keep the work_mem setting
> sane so that unexpected plan changes don't cause problems.
>
> Look at the plans and their costs to get a feel for what's being
> chosen and why.  Although it's a very bad idea to use these in
> production, you can often shift the plan to something you *think*
> would be better using the enable_* settings, to see what the planner
> thinks such a plan will cost and where it thinks the cost would be;
> that can help in tuning the settings.
>
>>> You might need to create some indices, too.
>>
>> Ok. To what purpose? This query picks up everything from the
>> tables and the planner does table scans, so conventional wisdom
>> and indeed my experience, says that indexes are not going to be so
>> useful.
>
> There are situations where scanning the entire table to build up a
> hash table is more expensive than using an index.  Why not test it?
>
> -Kevin
>
>
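The kind of experiment Kevin describes can be run interactively from psql;
something like the following (the values and the query are illustrative
placeholders, not recommendations):

```sql
-- See the current setting and what the planner picks under it.
SHOW work_mem;
EXPLAIN SELECT * FROM hapmap.geno;   -- substitute the real query here

-- Try a more modest work_mem and compare the chosen plan and its cost.
SET work_mem = '64MB';
EXPLAIN SELECT * FROM hapmap.geno;

-- Temporarily forbid a plan type to see what the alternative would cost.
-- (For experimentation only; never leave these off in production.)
SET enable_seqscan = off;
EXPLAIN SELECT * FROM hapmap.geno;

RESET enable_seqscan;
RESET work_mem;
```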

In the pgsql-performance list, by send date:

Previous
From: Greg Smith
Date:
Message: Re: 3ware vs. MegaRAID
Next
From: Tom Lane
Date:
Message: Re: temp table "on commit delete rows": transaction overhead