Обсуждение: Which is more efficient?

От:
"Mike G."
Дата:

Hi,

I have data that I am taking from 2 tables, pulling out specific columns and inserting into one table.

Is it more efficient to do:
a)  insert into x
    select z from y;
    insert into x
    select z from a;

b)  insert into x
    select z from y
    union all
    select z from a;

I have run both through explain.
a) 650ms
b) 741.57ms

According to the planner option a, select z from y takes 545.93 ms
Under option b select z from y takes 553.34 ms

Shouldn't the time predicted for the select z from y be the same?

I would believe b would be more efficient as the inserts could be done in a batch rather than individual transactions
butthe planner doesn't recognize that.  When I run option a through the planner I have to highlight each insert
separatelysince the planner stops executing after the first ; it comes across. 

Mike

От:
Christopher Browne
Дата:

A long time ago, in a galaxy far, far away,  ("Mike G.") wrote:
> Hi,
>
> I have data that I am taking from 2 tables, pulling out specific columns and inserting into one table.
>
> Is it more efficient to do:
> a)  insert into x
>     select z from y;
>     insert into x
>     select z from a;
>
> b)  insert into x
>     select z from y
>     union all
>     select z from a;
>
> I have run both through explain.
> a) 650ms
> b) 741.57ms
>
> According to the planner option a, select z from y takes 545.93 ms
> Under option b select z from y takes 553.34 ms
>
> Shouldn't the time predicted for the select z from y be the same?

No, these are approximations.  They can't be expected to be identical,
and as you can see there's no material difference, as 545.93 only
differs from 553.34 by 1.34%.

The point of EXPLAIN is to show the query _plans_ so you can evaluate
how sane they seem.  They're pretty well identical, so EXPLAIN's doing
what might be expected.

> I would believe b would be more efficient as the inserts could be
> done in a batch rather than individual transactions but the planner
> doesn't recognize that.  When I run option a through the planner I
> have to highlight each insert separately since the planner stops
> executing after the first ; it comes across.

The case where there would be a _material_ difference would be where
there were hardly any rows in either of the tables you're adding in,
and in that case, query planning becomes a significant cost, at which
point simpler is probably better.

If you do the queries in separate transactions, there's some addition
of cost of COMMIT involved, but if they can be kept in a single
transaction, the approaches oughtn't be materially different in cost,
and that's what you're finding.
--
select 'cbbrowne' || '@' || 'gmail.com';
http://www3.sympatico.ca/cbbrowne/x.html
MICROS~1:  Where do you  want to  go today?   Linux: Been  there, done
that.