Обсуждение: SELECT statement returns in 10seconds, but INSERT/CREATE TABLE ASwith same SELECT takes 7 minutes

Поиск
Список
Период
Сортировка

SELECT statement returns in 10seconds, but INSERT/CREATE TABLE ASwith same SELECT takes 7 minutes

От
Arjun Ranade
Дата:
I have a strange performance situation that I cannot resolve with my usual process.

I have a SELECT statement that completes in about 12 seconds for the full result (~1100 rows). 

If I create an empty table first, and then INSERT with the SELECT query, it takes 6.5 minutes.

When I look at the EXPLAIN ANALYZE output, it seems that it's using a drastically different query plan for the INSERT+SELECT than SELECT by itself.

Here's the explain plan for the SELECT() by itself: https://explain.depesz.com/s/8Qmr

Here's the explain plan for INSERT INTO x SELECT(): https://explain.depesz.com/s/qifT

I am running Postgresql 10(PostgreSQL 10.4 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-18), 64-bit). 

Shared Buffers = 4gb
effective_cache_size = 4gb
work_mem = 8gb
wal_buffers = -1
max_wal_sze = 2gb
wal_level = replica
archiving on
Total RAM on machine: 252GB

This machine is VACUUM FULL,ANALYZE once a week.  Autovac is ON with PG10 default settings.

The machine has 12 Intel(R) Xeon(R) CPU E5-2643 v3 @ 3.40GHz, and 15k RPM disks for Postgres.  I have tested write speed to all filesystems and speeds are as expected.  The pg_wal is on a separate disk resource, however, these disks are also 15k in speed and setup the same way as Postgres data disks.

The queries are sensitive so I had to obfuscate them in the explain plans.  I am reluctant to provide full metadata for all the objects involved, but will if it comes to that.  I first want to understand why the query plan would be so different for a SELECT vs INSERT into X SELECT.  I also tried CREATE TABLE x as SELECT() but it also takes 6+ minutes.

Is there any advice as to the general case on why SELECT can finish in 10seconds but CREATE TABLE as SELECT() runs in 7 minutes? 

Any advice would be much appreciated.

Thanks,
Arjun Ranade

Arjun Ranade <ranade@nodalexchange.com> writes:
> I have a strange performance situation that I cannot resolve with my usual
> process.
> I have a SELECT statement that completes in about 12 seconds for the full
> result (~1100 rows).
> If I create an empty table first, and then INSERT with the SELECT query, it
> takes 6.5 minutes.

> When I look at the EXPLAIN ANALYZE output, it seems that it's using a
> drastically different query plan for the INSERT+SELECT than SELECT by
> itself.

The reason for the plan shape difference is probably that the bare SELECT
is allowed to use parallelism while INSERT/SELECT isn't.  I'm not sure
to what extent we could relax that without creating semantic gotchas.

However, your real problem with either query is that the planner's
rowcount estimates are off by several orders of magnitude.  If you could
improve that, you'd likely get better plan choices in both cases.

I also notice that this seems to be a 14-way join, which means you're
probably getting an artificially poor plan as a result of 
from_collapse_limit and/or join_collapse_limit constraining the planner's
search space.  Maybe raising those limits would help, although I'm not
sure how much it'd help if the rowcount estimates aren't improved.

Since you haven't told us much of anything about the actual query or the
data, it's hard to offer concrete advice beyond that.

            regards, tom lane


Re: SELECT statement returns in 10seconds, but INSERT/CREATE TABLEAS with same SELECT takes 7 minutes

От
Justin Pryzby
Дата:
On Thu, Sep 27, 2018 at 01:08:05PM -0400, Arjun Ranade wrote:
> When I look at the EXPLAIN ANALYZE output, it seems that it's using a
> drastically different query plan for the INSERT+SELECT than SELECT by
> itself.

The fast, SELECT plan is using parallel query, which isn't available for INSERT+SELECT:

https://www.postgresql.org/docs/current/static/when-can-parallel-query-be-used.html
|Even when it is in general possible for parallel query plans to be generated, the planner will not generate them for a
givenquery if any of the following are true:
 
|The query writes any data or locks any database rows.

Using parallel query in this case happens to mitigate the effects of the bad
plan.

I see Tom responded, and you got an improvement by changing join threshold.

But I think you could perhaps get an better plan if the rowcount estimates were
fixed.  That's more important than probably anything else - changing settings
is only a workaround for bad estimates.

In the slow/INSERT plan, this join is returning 55000x more rows than expected
(not 55k more: 55k TIMES more).

7.     26,937.132     401,503.136     ↓ 55,483.7     332,902     1     
Nested Loop (cost=1,516.620..42,244.240 rows=6 width=84) (actual time=311.021..401,503.136 rows=332,902 loops=1)
    Join Filter: (((papa_echo.oscar_bravo)::text = (five_hotel.tango_november)::text) AND ((papa_echo.lima_tango)::text
=(five_hotel.lima_mike)::text) AND ((xray_juliet1.juliet)::text = (five_hotel.papa_victor)::text))
 
    Rows Removed by Join Filter: 351664882
    Buffers: shared hit=8570619 read=6

First question is if all those conditions are independent?  Or if one of those
conditions also implies another, which is confusing the planner.

Justin


Re: SELECT statement returns in 10seconds, but INSERT/CREATE TABLE ASwith same SELECT takes 7 minutes

От
Arjun Ranade
Дата:
Hi Tom,

Thank you for your suggestions.  I tried increasing from_collapse_limit and join_collapse_limit to 16 in a specific session and that significantly improved my query performance (it takes < 2s now).  Now, my instinct is to increase this globally but I'm sure there are some drawbacks to this so I will need to read more about it.

Your point about parallelism is interesting, I hadn't considered that.  

Even after working with Postgres for years, there really is a lot to learn about query optimization that is new for me.  I'd never heard of these parameters before your email since almost every performance issue I've had thus far was resolved by creating an index or smarter query re-writing.

I'm reading the documentation regarding these specific parameters, but it's written as a reference page as opposed to an explanation into query planning and optimization.  I wonder if there is a class or book these details better. 

Anyway, thank you so much for pointing me in the right direction.

Best,
Arjun

On Thu, Sep 27, 2018 at 1:21 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Arjun Ranade <ranade@nodalexchange.com> writes:
> I have a strange performance situation that I cannot resolve with my usual
> process.
> I have a SELECT statement that completes in about 12 seconds for the full
> result (~1100 rows).
> If I create an empty table first, and then INSERT with the SELECT query, it
> takes 6.5 minutes.

> When I look at the EXPLAIN ANALYZE output, it seems that it's using a
> drastically different query plan for the INSERT+SELECT than SELECT by
> itself.

The reason for the plan shape difference is probably that the bare SELECT
is allowed to use parallelism while INSERT/SELECT isn't.  I'm not sure
to what extent we could relax that without creating semantic gotchas.

However, your real problem with either query is that the planner's
rowcount estimates are off by several orders of magnitude.  If you could
improve that, you'd likely get better plan choices in both cases.

I also notice that this seems to be a 14-way join, which means you're
probably getting an artificially poor plan as a result of
from_collapse_limit and/or join_collapse_limit constraining the planner's
search space.  Maybe raising those limits would help, although I'm not
sure how much it'd help if the rowcount estimates aren't improved.

Since you haven't told us much of anything about the actual query or the
data, it's hard to offer concrete advice beyond that.

                        regards, tom lane

Re: SELECT statement returns in 10seconds, but INSERT/CREATE TABLEAS with same SELECT takes 7 minutes

От
Justin Pryzby
Дата:
On Thu, Sep 27, 2018 at 03:37:57PM -0400, Arjun Ranade wrote:
> Yes, that join is concerning (red text below).  The conditions all need to
> be checked so they are independent.

You can play with the join conditions to see which test is getting such a bad
estimate, or if it's a combination of tests (as I suspected) giving a bad
estimate.

There's a good chance this one isn't doing very well:

>    vw2.product_group_name ||'.'|| vw2.product_node_name = i.product_node_name

As a workaround/test, you could maybe add an expression index
ON( (vw2.product_group_name ||'.'|| vw2.product_node_name) )

..and then ANALYZE.  Eventually, you'd want to consider splitting
i.product_node_name into separate columns.  

Justin


Re: SELECT statement returns in 10seconds, but INSERT/CREATE TABLE ASwith same SELECT takes 7 minutes

От
Arjun Ranade
Дата:
Yes, that join is concerning (red text below).  The conditions all need to be checked so they are independent.

The query (with consistent obfuscation) is below :

select distinct
a.sale_id
, a.test_date
, a.product_id as original_product_id
,vw2.product_id
, a.volume as volume
,b.pair_rank
from not_sold_locations a
inner join vw_product vw2 using (product_group_name,product_class_code,product_type_code,sale_end_date)
inner join product_mapping b on a.product_group_name = b.left_product_group_name and
a.product_node_name = b.left_product_node and
a.product_type_code = b.left_product and
vw2.product_node_name = b.right_product_node and
vw2.product_group_name = b.right_product_group_name
and
vw2.product_type_code = b.right_product
inner join mapping_ref i on vw2.product_group_name || '.' || vw2.product_node_name = i.product_node_name and
vw2.product_class_code = i.product_class_code and
vw2.product_type_code = i.product_type_code and
vw2.sale_end_date between i.first_product_date and i.last_product_date;

not_sold_locations(a) has 836 rows
vw_product (vw2) has 785k rows and is a view that joins 11 tables together to have a consolidated view of all products, sales locations, etc
product_mapping (b) has 2520 rows
mapping_ref (i) has 178 rows


On Thu, Sep 27, 2018 at 2:52 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
On Thu, Sep 27, 2018 at 01:08:05PM -0400, Arjun Ranade wrote:
> When I look at the EXPLAIN ANALYZE output, it seems that it's using a
> drastically different query plan for the INSERT+SELECT than SELECT by
> itself.

The fast, SELECT plan is using parallel query, which isn't available for INSERT+SELECT:

https://www.postgresql.org/docs/current/static/when-can-parallel-query-be-used.html
|Even when it is in general possible for parallel query plans to be generated, the planner will not generate them for a given query if any of the following are true:
|The query writes any data or locks any database rows.

Using parallel query in this case happens to mitigate the effects of the bad
plan.

I see Tom responded, and you got an improvement by changing join threshold.

But I think you could perhaps get an better plan if the rowcount estimates were
fixed.  That's more important than probably anything else - changing settings
is only a workaround for bad estimates.

In the slow/INSERT plan, this join is returning 55000x more rows than expected
(not 55k more: 55k TIMES more).

7.      26,937.132      401,503.136     ↓ 55,483.7      332,902         1       
Nested Loop (cost=1,516.620..42,244.240 rows=6 width=84) (actual time=311.021..401,503.136 rows=332,902 loops=1)
    Join Filter: (((papa_echo.oscar_bravo)::text = (five_hotel.tango_november)::text) AND ((papa_echo.lima_tango)::text = (five_hotel.lima_mike)::text) AND ((xray_juliet1.juliet)::text = (five_hotel.papa_victor)::text))
    Rows Removed by Join Filter: 351664882
    Buffers: shared hit=8570619 read=6

First question is if all those conditions are independent?  Or if one of those
conditions also implies another, which is confusing the planner.

Justin

Re: SELECT statement returns in 10seconds, but INSERT/CREATE TABLE ASwith same SELECT takes 7 minutes

От
Arjun Ranade
Дата:
"As a workaround/test, you could maybe add an expression index
ON( (vw2.product_group_name ||'.'|| vw2.product_node_name) )"

Unfortunately, vw2 is a view, but I had a similar thought.  I'm looking into splitting i.product-node_name into separate columns though, thanks!


On Thu, Sep 27, 2018 at 3:33 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
On Thu, Sep 27, 2018 at 03:37:57PM -0400, Arjun Ranade wrote:
> Yes, that join is concerning (red text below).  The conditions all need to
> be checked so they are independent.

You can play with the join conditions to see which test is getting such a bad
estimate, or if it's a combination of tests (as I suspected) giving a bad
estimate.

There's a good chance this one isn't doing very well:

>    vw2.product_group_name ||'.'|| vw2.product_node_name = i.product_node_name

As a workaround/test, you could maybe add an expression index
ON( (vw2.product_group_name ||'.'|| vw2.product_node_name) )

..and then ANALYZE.  Eventually, you'd want to consider splitting
i.product_node_name into separate columns. 

Justin

Re: SELECT statement returns in 10seconds, but INSERT/CREATE TABLE ASwith same SELECT takes 7 minutes

От
Vladimir Ryabtsev
Дата:
> The reason for the plan shape difference is probably that the bare SELECT
> is allowed to use parallelism while INSERT/SELECT isn't.
In case parallelism is used, should it report in the plan as something like "workers planned: N"?

Vlad
Vladimir Ryabtsev <greatvovan@gmail.com> writes:
>> The reason for the plan shape difference is probably that the bare SELECT
>> is allowed to use parallelism while INSERT/SELECT isn't.

> In case parallelism is used, should it report in the plan as something like
> "workers planned: N"?

It did --- see the Gather node.

            regards, tom lane


Re: SELECT statement returns in 10seconds, but INSERT/CREATE TABLE ASwith same SELECT takes 7 minutes

От
Vladimir Ryabtsev
Дата:
> It did --- see the Gather node.
But "workers launched: 1"...
To my opinion, such a dramatic difference cannot be explained with avoiding parallelism, the query was just stuck in a very inefficient plan (even though almost all source data is read from cache).

Additionally, I think author can try CREATE STATISTICS on the bunch of columns used in join. Very low rows estimate for this join may come from multiplying selectivities for each column assuming they are independent.

Vlad

Re: SELECT statement returns in 10seconds, but INSERT/CREATE TABLEAS with same SELECT takes 7 minutes

От
Justin Pryzby
Дата:
On Thu, Sep 27, 2018 at 04:50:36PM -0700, Vladimir Ryabtsev wrote:
> Additionally, I think author can try CREATE STATISTICS on the bunch of
> columns used in join. Very low rows estimate for this join may come from
> multiplying selectivities for each column assuming they are independent.

MV statistics don't currently help for joins:

https://www.postgresql.org/message-id/flat/CAKJS1f-6B7KnDFrh6SFhYn-YbHYOXmDDAfd0XC%3DjJKZMCrfQyg%40mail.gmail.com#925e19951fabc9a480b804d661d83be8

Justin