Re: query performance with hstore vs. non-hstore

Поиск
Список
Период
Сортировка
От David G Johnston
Тема Re: query performance with hstore vs. non-hstore
Дата
Msg-id 1409629101885-5817281.post@n5.nabble.com
обсуждение исходный текст
Ответ на Re: query performance with hstore vs. non-hstore  ("Huang, Suya" <Suya.Huang@au.experian.com>)
Ответы Re: query performance with hstore vs. non-hstore
Список pgsql-performance
Huang, Suya wrote
> See output of explain (analyze,timing off), the total runtime is close to
> the one enable timing.

Calling 43s "close to" 70s doesn't sound right...


> dev=# explain (analyze, timing off) select cha_type, sum(visits) from
> (select (each(visits)).key as cha_type,(each(visits)).value::numeric as
> visits from weekly_hstore a  join seg1 b on a.ref_id=b.ref_id )foo  group
> by cha_type  order by sum(visits) desc;

What version of PostgreSQL are you using?

Two calls to each() and cast to numeric are not free.

Your sequential scan savings is nearly 9 seconds but you lose all of that,
and more, when PostgreSQL evaluates the result of the scan and has to
process the each() and the cast before it performs the join against the
expanded result.  There is no planner node for this activity but it does
cost time - in this case more time than it would take to simply store the
native data types in separate rows.

You really should expand the hstore after the join (i.e., in the top-most
select-list) but in this case since the join removed hardly any rows the
gain from doing so would be minimal.  The idea being you should not expand
the hstore of any row that fails the join condition since it will not end up
in the final result anyway.

Also, in this specific case, the call to each(...).key is pointless - you
never use the data.

If you did need to use both columns, and are using 9.3, you should re-write
this to use LATERAL.

In 9.2- you, possibly using a CTE, could do something like this:

SELECT (each).* FROM (
SELECT each(hs) FROM ( VALUES('k=>1'::hstore) ) h (hs)
) src

This is a single call to each(), in a subquery, which result is then
expanded using (col).* notation in the parent query.  This avoids calling
each twice - and note that (each(...).*) does not work to avoid the
double-call - you have to use a subquery / a CTE one to ensure that it is
not collapsed (offset 0 should work too but I find the CTE one a little
cleaner personally).

David J.





--
View this message in context:
http://postgresql.1045698.n5.nabble.com/query-performance-with-hstore-vs-non-hstore-tp5817109p5817281.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.


В списке pgsql-performance по дате отправления:

Предыдущее
От: "Huang, Suya"
Дата:
Сообщение: Re: query performance with hstore vs. non-hstore
Следующее
От: Xiaoyulei
Дата:
Сообщение: why after increase the hash table partitions, tpmc decrease