Re: TPC-R benchmarks

Поиск
Список
Период
Сортировка
От Oleg Lebedev
Тема Re: TPC-R benchmarks
Дата
Msg-id 993DBE5B4D02194382EC8DF8554A52731D7624@postoffice.waterford.org
обсуждение исходный текст
Ответ на TPC-R benchmarks  (Oleg Lebedev <oleg.lebedev@waterford.org>)
Ответы Re: TPC-R benchmarks  ("scott.marlowe" <scott.marlowe@ihs.com>)
Список pgsql-performance
As Scott recommended, I did the following:
# set enable_nestloop = false;
# vacuum full analyze;

After this I re-ran the query and its execution time went down from 2
hours to 2 minutes. I attached the new query plan to this posting.
Is there any way to optimize it even further?
What should I do to make this query run fast without hurting the
performance of the other queries?
Thanks.

Oleg

-----Original Message-----
From: scott.marlowe [mailto:scott.marlowe@ihs.com]
Sent: Wednesday, October 01, 2003 4:00 PM
To: Oleg Lebedev
Cc: Josh Berkus; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] TPC-R benchmarks


For troubleshooting, can you try it with "set enable_nestloop = false"
and
rerun the query and see how long it takes?

It looks like the estimates of rows returned is WAY off (estimate is too

low compared to what really comes back.)

Also, you might try to alter the table.column to have a higher target on

the rows p_partkey and ps_partkey and any others where the estimate is
so
far off of the reality.

On Wed, 1 Oct 2003, Oleg Lebedev wrote:

> All right, my query just finished running with EXPLAIN ANALYZE. I show

> the plan below and also attached it as a file. Any ideas?
>
>    ->  Sort  (cost=54597.49..54597.50 rows=1 width=121) (actual
> time=6674562.03..6674562.15 rows=175 loops=1)
>          Sort Key: nation.n_name, date_part('year'::text,
> orders.o_orderdate)
>          ->  Aggregate  (cost=54597.45..54597.48 rows=1 width=121)
> (actual time=6668919.41..6674522.48 rows=175 loops=1)
>                ->  Group  (cost=54597.45..54597.47 rows=3 width=121)
> (actual time=6668872.68..6672136.96 rows=348760 loops=1)
>                      ->  Sort  (cost=54597.45..54597.46 rows=3
> width=121) (actual time=6668872.65..6669499.95 rows=348760 loops=1)
>                            Sort Key: nation.n_name,
> date_part('year'::text, orders.o_orderdate)
>                            ->  Hash Join  (cost=54596.00..54597.42
> rows=3
> width=121) (actual time=6632768.89..6650192.67 rows=348760 loops=1)
>                                  Hash Cond: ("outer".n_nationkey =
> "inner".s_nationkey)
>                                  ->  Seq Scan on nation
> (cost=0.00..1.25 rows=25 width=33) (actual time=6.75..7.13 rows=25
> loops=1)
>                                  ->  Hash  (cost=54596.00..54596.00
> rows=3
> width=88) (actual time=6632671.96..6632671.96 rows=0 loops=1)
>                                        ->  Nested Loop
> (cost=0.00..54596.00 rows=3 width=88) (actual time=482.41..6630601.46
> rows=348760 loops=1)
>                                              Join Filter:
> ("inner".s_suppkey = "outer".l_suppkey)
>                                              ->  Nested Loop
> (cost=0.00..54586.18 rows=3 width=80) (actual time=383.87..6594984.40
> rows=348760 loops=1)
>                                                    ->  Nested Loop
> (cost=0.00..54575.47 rows=4 width=68) (actual time=199.95..3580882.07
> rows=348760 loops=1)
>                                                          Join Filter:
> ("outer".p_partkey = "inner".ps_partkey)
>                                                          ->  Nested
> Loop (cost=0.00..22753.33 rows=9343 width=49) (actual
> time=146.85..3541433.10 rows=348760 loops=1)
>                                                                ->  Seq

> Scan on part  (cost=0.00..7868.00 rows=320 width=4) (actual
> time=33.64..15651.90 rows=11637 loops=1)
>
> Filter: (p_name ~~ '%green%'::text)
>                                                                ->
> Index Scan using i_l_partkey on lineitem  (cost=0.00..46.15 rows=29
> width=45) (actual time=10.71..302.67 rows=30 loops=11637)
>
> Index
> Cond: ("outer".p_partkey = lineitem.l_partkey)
>                                                          ->  Index
> Scan using pk_partsupp on partsupp  (cost=0.00..3.39 rows=1 width=19)
> (actual time=0.09..0.09 rows=1 loops=348760)
>                                                                Index
> Cond: ((partsupp.ps_partkey = "outer".l_partkey) AND
> (partsupp.ps_suppkey =
> "outer".l_suppkey))
>                                                    ->  Index Scan
> using pk_orders on orders  (cost=0.00..3.01 rows=1 width=12) (actual
> time=8.62..8.62 rows=1 loops=348760)
>                                                          Index Cond:
> (orders.o_orderkey = "outer".l_orderkey)
>                                              ->  Index Scan using
> pk_supplier on supplier  (cost=0.00..3.01 rows=1 width=8) (actual
> time=0.08..0.08 rows=1 loops=348760)
>                                                    Index Cond:
> ("outer".ps_suppkey = supplier.s_suppkey)  Total runtime: 6674724.23
> msec (28 rows)
>
>
> -----Original Message-----
> From: Oleg Lebedev
> Sent: Wednesday, October 01, 2003 12:00 PM
> To: Josh Berkus; scott.marlowe
> Cc: pgsql-performance@postgresql.org
> Subject: Re: [PERFORM] TPC-R benchmarks
> Importance: Low
>
>
> Sure, below is the query. I attached the plan to this posting.
>
> select
>     nation,
>     o_year,
>     sum(amount) as sum_profit
> from
>     (
>         select
>             n_name as nation,
>             extract(year from o_orderdate) as o_year,
>             l_extendedprice * (1 - l_discount) -
> ps_supplycost * l_quantity as amount
>         from
>             part,
>             supplier,
>             lineitem,
>             partsupp,
>             orders,
>             nation
>         where
>             s_suppkey = l_suppkey
>             and ps_suppkey = l_suppkey
>             and ps_partkey = l_partkey
>             and p_partkey = l_partkey
>             and o_orderkey = l_orderkey
>             and s_nationkey = n_nationkey
>             and p_name like '%green%'
>     ) as profit
> group by
>     nation,
>     o_year
> order by
>     nation,
>     o_year desc;
>
>
> -----Original Message-----
> From: Josh Berkus [mailto:josh@agliodbs.com]
> Sent: Wednesday, October 01, 2003 11:42 AM
> To: Oleg Lebedev; scott.marlowe
> Cc: pgsql-performance@postgresql.org
> Subject: Re: [PERFORM] TPC-R benchmarks
>
>
> Oleg,
>
> > The output of the query should contain about 200 rows. So, I guess
> > the
>
> > planer is off assuming that the query should return 1 row.
>
> Oh, also did you post the query before?   Can you re-post it with the
> planner
> results?
>
>

*************************************

This e-mail may contain privileged or confidential material intended for the named recipient only.
If you are not the named recipient, delete this message and all attachments.
Unauthorized reviewing, copying, printing, disclosing, or otherwise using information in this e-mail is prohibited.
We reserve the right to monitor e-mail sent through our network.

*************************************

Вложения

В списке pgsql-performance по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: runtime of the same query in function differs on 2 degree!
Следующее
От: Josh Berkus
Дата:
Сообщение: Re: TPC-R benchmarks