Re: A costing analysis tool

Поиск
Список
Период
Сортировка
От Josh Berkus
Тема Re: A costing analysis tool
Дата
Msg-id 200510161422.23958.josh@agliodbs.com
обсуждение исходный текст
Ответ на A costing analysis tool  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
Список pgsql-hackers
Kevin,

> I have to keep a very narrow focus on this, or there is likely that
> nothing will come of it.  The particular area which is my target
> here is the accuracy of the cost values on the subplans
> considered by the optimizer.

Sure.  What the rest of us are focused on is helping you build a generally
useful tool which can be used to solve future problems and unexpected
performance issues as well.  I really see needing to collect all of the
information possible from EXPLAIN ANALYZE ... not collecting just three bits
and throwing the rest of the stuff away.

I'd use a structure more like:

query_instances (
query     text
run_instance     int
estimated_cost float
analyze_time  float
actual_time  float
time_run   timestamp
GUCs text[]
)

query_steps (
query_instance  FK
step_id SERIAL
parent_step FK
node_name  text
node_type text FK
cost_start float
cost_end float
est_rows INT8
time_start float
time_end  float
actual_rows INT8
loops INT8
db_object name
condition_type text FK
condition_detail text
)

so, for example, the query step:
"                    ->  Seq Scan on detail0009  (cost=0.00..20500.11
rows=26 width=1005) (actual time=453.000..5983.000 rows=53588 loops=1)"
"                          Filter: ((txcontenttype ~~* '%html%'::text)
AND ((vchost)::text ~~* '%www.%'::text))"

Could be displayed as:

query_instance      12008
step_id             14701
parent_step         14698
node_name          Seq Scan on detail0009
node_type            Seq Scan
cost_start            0
cost_end            20500.11
est_rows            26
time_start            453.0
time_end             5983.0
actual_rows        53588
loops                1
db_object            detail009
condition_type        Filter
condition_detail        ((txcontenttype ~~* '%html%'::text) AND ((vchost)::text ~~*
'%www.%'::text))

By collecting all of this data, you make it possible to perform other sorts of
analysis on the cost estimates.   For example, statistical analysis might
tell us that 3-or-more-condition filters take significantly longer to execute
than single-condition filters, which would be important to know for the cost
model.  Limiting it to collecting only 3 of the 13 bits of node data produced
by EA would very much limit the usefulness of the tool and the reliability of
its statistics.

--
Josh Berkus
Aglio Database Solutions
San Francisco


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Josh Berkus
Дата:
Сообщение: Re: A costing analysis tool
Следующее
От: Martijn van Oosterhout
Дата:
Сообщение: Re: Question about Ctrl-C and less