Обсуждение: [BUGS] BUG #14505: explain verbose for postgresql_fdw

Поиск
Список
Период
Сортировка

[BUGS] BUG #14505: explain verbose for postgresql_fdw

От
jeff.janes@gmail.com
Дата:
The following bug has been logged on the website:

Bug reference:      14505
Logged by:          Jeff Janes
Email address:      jeff.janes@gmail.com
PostgreSQL version: 9.6.1
Operating system:   Linux (CentOS)
Description:

The "Output:" line of explain verbose lists columns of foreign table that
are not actually output, or at least ought not be output.  This extends back
to at least 9.4 as well

explain verbose select local12,a3 from local1 join remote4 on (foobar=a7);

                       QUERY PLAN
-------------------------------------------------------------------------
 Hash Join  (cost=121.25..242.85 rows=2325 width=70)
   Output: local1.local12, remote4.a3
   Hash Cond: (remote4.a7 = local1.foobar)
   ->  Foreign Scan on public.remote4  (cost=100.00..137.90 rows=930
width=64)
         Output: remote4.id, remote4.code, remote4.a1, remote4.a2,
remote4.a3, remote4.a4, remote4.a5, remote4.a6, remote4.a7, remote4.a8,
remote4.a9
         Remote SQL: SELECT a3, a7 FROM remote.remote4
   ->  Hash  (cost=15.00..15.00 rows=500 width=70)
         Output: local1.local12, local1.foobar
         ->  Seq Scan on public.local1  (cost=0.00..15.00 rows=500
width=70)
               Output: local1.local12, local1.foobar


If the remote SQL only selects a3 and a7, how can all columns be output?

Full reproduction:


drop database foobar;
create database foobar;
\c foobar

CREATE EXTENSION IF NOT EXISTS postgres_fdw WITH SCHEMA public;

CREATE SERVER remote_server FOREIGN DATA WRAPPER postgres_fdw options
(dbname 'foobar') ;
CREATE USER MAPPING FOR postgres SERVER remote_server OPTIONS (
    "user" 'postgres'
);

CREATE TABLE local1 (
    local12 character varying(40) NOT NULL,
    foobar text,
    local13 text,
    local19 text
);

CREATE schema remote;

CREATE TABLE remote.remote4 (
    id bigint NOT NULL,
    code character varying(255) NOT NULL,
    a1 text,
    a2 text,
    a3 text,
    a4 text,
    a5 text,
    a6 text,
    a7 text,
    a8 text,
    a9 text
);

CREATE FOREIGN TABLE remote4 (
    id bigint NOT NULL,
    code character varying(255) NOT NULL,
    a1 text,
    a2 text,
    a3 text,
    a4 text,
    a5 text,
    a6 text,
    a7 text,
    a8 text,
    a9 text
)
SERVER remote_server
OPTIONS (
    schema_name 'remote',
    table_name 'remote4'
);

explain verbose select local12,a3 from local1 join remote4 on (foobar=a7);

-- doesn't happen on local table

explain verbose select local12,a3 from local1 join remote.remote4 on
(foobar=a7);



--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

Re: [BUGS] BUG #14505: explain verbose for postgresql_fdw

От
Tom Lane
Дата:
jeff.janes@gmail.com writes:
> The "Output:" line of explain verbose lists columns of foreign table that
> are not actually output, or at least ought not be output.

That's an optimization to avoid an unnecessary projection step.  It has
nothing to do with what's retrieved from the remote, and for that matter
it's not specific to foreign tables.

            regards, tom lane


-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

Re: [BUGS] BUG #14505: explain verbose for postgresql_fdw

От
Jeff Janes
Дата:
On Wed, Jan 18, 2017 at 5:27 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
jeff.janes@gmail.com writes:
> The "Output:" line of explain verbose lists columns of foreign table that
> are not actually output, or at least ought not be output.

That's an optimization to avoid an unnecessary projection step.  It has
nothing to do with what's retrieved from the remote, and for that matter
it's not specific to foreign tables.

Is all of that data actually being transferred from the foreign to the local?  If so, I don't think the optimization is a good choice in this case.  Do you know where can I find the relevant source code which decides on this?
 
Cheers,

Jeff

Re: [BUGS] BUG #14505: explain verbose for postgresql_fdw

От
Tom Lane
Дата:
Jeff Janes <jeff.janes@gmail.com> writes:
> On Wed, Jan 18, 2017 at 5:27 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> jeff.janes@gmail.com writes:
>>> The "Output:" line of explain verbose lists columns of foreign table that
>>> are not actually output, or at least ought not be output.

>> That's an optimization to avoid an unnecessary projection step.  It has
>> nothing to do with what's retrieved from the remote, and for that matter
>> it's not specific to foreign tables.

> Is all of that data actually being transferred from the foreign to the
> local?

Well, that would depend on the particular FDW, but postgres_fdw doesn't
transfer unneeded columns, as you can tell by looking at the "Remote SQL"
command.

The key point here is that the output of the ForeignScan plan node has to
match the declared rowtype of the foreign table.  So what postgres_fdw is
going to do, after retrieving the subset of columns that it detected were
used in the query, is construct a dummy tuple that has all the columns but
fills in nulls for the unused ones.  The point of what the planner is
doing with the node output is that we don't need to run ExecProject just
to throw away those nulls again; we can pass the dummy tuple up to the
join node as-is.  (In some cases, like if we had to pass the tuples
through a sort, it'd be worth projecting to remove unused columns ... but
the planner knows that, and generates a minimal output tlist in those
cases.  This isn't one of those cases.)

> If so, I don't think the optimization is a good choice in this case.

I think you're mistaken.

The alternative that we could have used is to require FDWs to emit tuples
containing only used columns --- but that would *require* all FDWs to
contain the logic about identifying used columns, and for at least some
kinds of FDW it would add processing not subtract it.  It'd also
complicate matters elsewhere, because ForeignScan would act quite a bit
different from scans on regular tables.  In those scans, the raw tuples
are necessarily going to contain all columns.

> Do you know where can I find the relevant source code which decides
> on this?

See use_physical_tlist in createplan.c.

            regards, tom lane


-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs