Обсуждение: Querying foreign table with SP-GiST index results in “ERROR: cache lookup failed for type 0”

Поиск
Список
Период
Сортировка

Querying foreign table with SP-GiST index results in “ERROR: cache lookup failed for type 0”

От
Joao Ferreira
Дата:
Greetings,

I'm using PostgreSQL 11.3 and I want to access a table in another Postgres database using postgres_fdw but since the original table has an SP-GiST index on a geometry column, the query on the foreign table fails with "ERROR: cache lookup failed for type 0", even if the table is empty (Step 7). This doesn't happen if the query is done directly to the remote table (Steps 3 and 6) or if the remote table doesn't have the index (Steps 3 and 4).

OS: Windows Server 2016 Standard
RAM: 16GB

Postgres installer: postgresql-11.3-1-windows-x64.exe
Postgis installer: postgis-bundle-pg11x64-setup-2.5.2-1.exe
Postgres version (using 'SELECT version()'): PostgreSQL 11.3, compiled by Visual C++ build 1914, 64-bit 
Postgis version (using 'SELECT postgis_version()')2.5 USE_GEOS=1 USE_PROJ=1 USE_STATS=1



Bug reproduction steps:

1) Create remote database: 
psql -f remote_db.sql postgres postgres

2) Create local database:
psql -f local_db.sql postgres postgres

3) Select directly on remote database:
psql -c "SELECT * FROM remote.locations" remote postgres
Output:
 id | name | lonlat
----+------+--------
(0 rows)

4) Select through foreign table:
psql -c "SELECT * FROM localdb.locations" localdb postgres
Output:
 id | name | lonlat
----+------+--------
(0 rows)

5) Add SP-GiST index to remote database:
psql -c "CREATE INDEX remote_locations_lonlat_idx ON remote.locations USING spgist (lonlat)" remote postgres
Output:
CREATE INDEX

6) Select directly on remote database:
psql -c "SELECT * FROM remote.locations" remote postgres
Output:
 id | name | lonlat
----+------+--------
(0 rows)

7) Select through foreign table (with log error verbosity set to verbose):
psql -c "SET log_error_verbosity TO verbose; SELECT * FROM localdb.locations;" localdb postgres
Output:
ERROR:  cache lookup failed for type 0
CONTEXT:  remote SQL command: SELECT id, name, lonlat FROM remote.locations


From the logs I have:
2019-06-18 12:20:03.195 BST [5688] ERROR:  XX000: cache lookup failed for type 0
2019-06-18 12:20:03.195 BST [5688] CONTEXT:  remote SQL command: SELECT id, name, lonlat FROM remote.locations
2019-06-18 12:20:03.195 BST [5688] LOCATION:  pgfdw_report_error, d:\pginstaller.auto\postgres.windows-x64\contrib\postgres_fdw\connection.c:633
2019-06-18 12:20:03.195 BST [5688] STATEMENT:  SET log_error_verbosity TO verbose; SELECT * FROM localdb.locations;


Hope I have provided all the needed information, if not, feel free to ask for it. Thank you in advance for all the help.

Best regards,

João Ferreira
Вложения
On Tue, Jun 18, 2019 at 8:02 AM Joao Ferreira <jpgferreira@yahoo.com> wrote:
Greetings,

I'm using PostgreSQL 11.3 and I want to access a table in another Postgres database using postgres_fdw but since the original table has an SP-GiST index on a geometry column, the query on the foreign table fails with "ERROR: cache lookup failed for type 0", even if the table is empty (Step 7). This doesn't happen if the query is done directly to the remote table (Steps 3 and 6) or if the remote table doesn't have the index (Steps 3 and 4).

I've verified this on ubuntu 18.04 with 11.3 installed from PGDG apt repo, and attached a single-file reproduction.

If I change from sp-gist to just gist, there is no problem.  And if I change to indexing a built-in sp-gist operator class (point rather than geom), there is also no problem.

Getting postgis to compile is a bit hard and slow, so I haven't completed any tests against the unreleased versions yet.
 
Cheers,

Jeff
Вложения
Thank you Jeff.

I'll create the index on the geometry column as gist instead of spgist. On my first impression it seems to work fine ;)

Cheers,

João Ferreira

On Tuesday, June 18, 2019, 01:22:27 PM GMT+1, Jeff Janes <jeff.janes@gmail.com> wrote:


On Tue, Jun 18, 2019 at 8:02 AM Joao Ferreira <jpgferreira@yahoo.com> wrote:
Greetings,

I'm using PostgreSQL 11.3 and I want to access a table in another Postgres database using postgres_fdw but since the original table has an SP-GiST index on a geometry column, the query on the foreign table fails with "ERROR: cache lookup failed for type 0", even if the table is empty (Step 7). This doesn't happen if the query is done directly to the remote table (Steps 3 and 6) or if the remote table doesn't have the index (Steps 3 and 4).

I've verified this on ubuntu 18.04 with 11.3 installed from PGDG apt repo, and attached a single-file reproduction.

If I change from sp-gist to just gist, there is no problem.  And if I change to indexing a built-in sp-gist operator class (point rather than geom), there is also no problem.

Getting postgis to compile is a bit hard and slow, so I haven't completed any tests against the unreleased versions yet.
 
Cheers,

Jeff
On Tue, Jun 18, 2019 at 9:22 PM Jeff Janes <jeff.janes@gmail.com> wrote:
> On Tue, Jun 18, 2019 at 8:02 AM Joao Ferreira <jpgferreira@yahoo.com> wrote:
>> I'm using PostgreSQL 11.3 and I want to access a table in another Postgres database using postgres_fdw but since the
originaltable has an SP-GiST index on a geometry column, the query on the foreign table fails with "ERROR: cache lookup
failedfor type 0", even if the table is empty (Step 7). This doesn't happen if the query is done directly to the remote
table(Steps 3 and 6) or if the remote table doesn't have the index (Steps 3 and 4). 

I started working on this.

> I've verified this on ubuntu 18.04 with 11.3 installed from PGDG apt repo, and attached a single-file reproduction.

I've also verified this on my environment with PG 11.4.

> If I change from sp-gist to just gist, there is no problem.  And if I change to indexing a built-in sp-gist operator
class(point rather than geom), there is also no problem. 

Verified.

Will continue investigations.

Best regards,
Etsuro Fujita



On Thu, Jun 20, 2019 at 6:16 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
> On Tue, Jun 18, 2019 at 9:22 PM Jeff Janes <jeff.janes@gmail.com> wrote:
> > On Tue, Jun 18, 2019 at 8:02 AM Joao Ferreira <jpgferreira@yahoo.com> wrote:
> >> I'm using PostgreSQL 11.3 and I want to access a table in another Postgres database using postgres_fdw but since
theoriginal table has an SP-GiST index on a geometry column, the query on the foreign table fails with "ERROR: cache
lookupfailed for type 0", even if the table is empty (Step 7). This doesn't happen if the query is done directly to the
remotetable (Steps 3 and 6) or if the remote table doesn't have the index (Steps 3 and 4). 

> > I've verified this on ubuntu 18.04 with 11.3 installed from PGDG apt repo, and attached a single-file reproduction.
>
> I've also verified this on my environment with PG 11.4.
>
> > If I change from sp-gist to just gist, there is no problem.  And if I change to indexing a built-in sp-gist
operatorclass (point rather than geom), there is also no problem. 
>
> Verified.
>
> Will continue investigations.

I found that this error is thrown by index_can_return()
(spgcanreturn()) called from get_relation_info() for the SP-GiST index
on the locations table.  IIUC, the cause of that is: the SP-GiST
config function defined in PostGIS shown below called from
spgGetCache() in spgcanreturn() doesn't work well for the postgres_fdw
case, in which case cfg->prefixType and cfg->leafType are set to
InvalidOid, causing the "ERROR: cache lookup failed for type 0" error
in fillTypeDesc() callled later in spgGetCache() for the
cfg->prefixType, because TypenameGetTypid() returns InvalidOid in that
case since that in postgres_fdw we run the connection with search_path
restricted to pg_catalog while box2df is created in the public schema
in an environment built using a PostGIS default configuration.

PGDLLEXPORT Datum gserialized_spgist_config_2d(PG_FUNCTION_ARGS)
{
    spgConfigOut *cfg = (spgConfigOut *)PG_GETARG_POINTER(1);

--> Oid boxoid = TypenameGetTypid("box2df");
    cfg->prefixType = boxoid;
    cfg->labelType = VOIDOID; /* We don't need node labels. */
    cfg->leafType = boxoid;
    cfg->canReturnData = false;
    cfg->longValuesOK = false;

    PG_RETURN_VOID();
}

I'm a newbie to PostGIS, so maybe I'm missing something, but isn't it
a bit fragile to use TypenameGetTypid() here?  I just thought it would
be better to set boxoid in a more hard-coded way in this function, not
using TypenameGetTypid().

Best regards,
Etsuro Fujita



Etsuro Fujita <etsuro.fujita@gmail.com> writes:
> I found that this error is thrown by index_can_return()
> (spgcanreturn()) called from get_relation_info() for the SP-GiST index
> on the locations table.

> PGDLLEXPORT Datum gserialized_spgist_config_2d(PG_FUNCTION_ARGS)
> {
>     spgConfigOut *cfg = (spgConfigOut *)PG_GETARG_POINTER(1);

> --> Oid boxoid = TypenameGetTypid("box2df");
>     cfg->prefixType = boxoid;
>     cfg->labelType = VOIDOID; /* We don't need node labels. */
>     cfg->leafType = boxoid;
>     cfg->canReturnData = false;
>     cfg->longValuesOK = false;

>     PG_RETURN_VOID();
> }

> I'm a newbie to PostGIS, so maybe I'm missing something, but isn't it
> a bit fragile to use TypenameGetTypid() here?

More than a bit :-(.  That's guaranteed to fail if type box2df isn't
in the search path, which it likely won't be during a remote query,
since postgres_fdw locks down the search path to just pg_catalog.

This is a Postgis bug.  It's probably not quite trivial to solve,
since this code can't know what schema the type is installed in
at runtime.  This seems related to the discussions we had awhile
ago about how planner support functions can find out the OIDs of
extension objects reliably.  I'm not sure where Paul is on that...

            regards, tom lane





On Jun 26, 2019, at 2:06 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Etsuro Fujita <etsuro.fujita@gmail.com> writes:
I found that this error is thrown by index_can_return()
(spgcanreturn()) called from get_relation_info() for the SP-GiST index
on the locations table.

PGDLLEXPORT Datum gserialized_spgist_config_2d(PG_FUNCTION_ARGS)
{
   spgConfigOut *cfg = (spgConfigOut *)PG_GETARG_POINTER(1);

--> Oid boxoid = TypenameGetTypid("box2df");
   cfg->prefixType = boxoid;
   cfg->labelType = VOIDOID; /* We don't need node labels. */
   cfg->leafType = boxoid;
   cfg->canReturnData = false;
   cfg->longValuesOK = false;

   PG_RETURN_VOID();
}

I'm a newbie to PostGIS, so maybe I'm missing something, but isn't it
a bit fragile to use TypenameGetTypid() here?

More than a bit :-(.  That's guaranteed to fail if type box2df isn't
in the search path, which it likely won't be during a remote query,
since postgres_fdw locks down the search path to just pg_catalog.

This is a Postgis bug.  It's probably not quite trivial to solve,
since this code can't know what schema the type is installed in
at runtime.  This seems related to the discussions we had awhile
ago about how planner support functions can find out the OIDs of
extension objects reliably.  I'm not sure where Paul is on that...

Well, I seemed able to skate around the supportfn problem without having to lookup types.
TypenameGetTypid() does show up in a number of places in the PostGIS code base, so this is a little concerning.
The SPGist code definitely has it, probably calls to ST_EstimatedExtent, ST_AsMVT and ST_AsGeoBuf would also fail in an FDW context, if the problem manifests the same everywhere.

There’s some code in other places that attempts to cache the relevant Oids for geometry and geography on first function call, but that wouldn’t really solve the problem.

We had a similar problem, if I recall correctly, doing lookups of spatial reference information, for similar reasons: calling over FDW the search_path was so limited the internal lookup functions couldn’t find the spatial_ref_sys table. I just looked up the work-around, and it looks like we infer that the table we’re looking for (part of PostGIS) resides in the same schema as the calling function (part of PostGIS), and then cache that schema name for future use.


I’m not sure if this same ugly trick would work for the spgist problem, but maybe so?

P
Paul Ramsey <pramsey@cleverelephant.ca> writes:
>> On Jun 26, 2019, at 2:06 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> More than a bit :-(.  That's guaranteed to fail if type box2df isn't
>> in the search path, which it likely won't be during a remote query,
>> since postgres_fdw locks down the search path to just pg_catalog.

> We had a similar problem, if I recall correctly, doing lookups of spatial reference information, for similar reasons:
callingover FDW the search_path was so limited the internal lookup functions couldn’t find the spatial_ref_sys table. I
justlooked up the work-around, and it looks like we infer that the table we’re looking for (part of PostGIS) resides in
thesame schema as the calling function (part of PostGIS), and then cache that schema name for future use. 

Yeah.  The variant I was thinking about was to look up the namespace
of the geometry type (which in this example you could get from attType
of the spgConfigIn parameter struct) and assume that box2df is in that
schema.  But working from the function's own schema might be better
because you could do the same thing in all these places, rather than
needing to find an associated object in some call-site-specific way.

            regards, tom lane



On Wed, Jun 26, 2019 at 4:39 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Paul Ramsey <pramsey@cleverelephant.ca> writes:
> >> On Jun 26, 2019, at 2:06 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> More than a bit :-(.  That's guaranteed to fail if type box2df isn't
> >> in the search path, which it likely won't be during a remote query,
> >> since postgres_fdw locks down the search path to just pg_catalog.
>
> > We had a similar problem, if I recall correctly, doing lookups of spatial reference information, for similar
reasons:calling over FDW the search_path was so limited the internal lookup functions couldn’t find the spatial_ref_sys
table.I just looked up the work-around, and it looks like we infer that the table we’re looking for (part of PostGIS)
residesin the same schema as the calling function (part of PostGIS), and then cache that schema name for future use. 
>
> Yeah.  The variant I was thinking about was to look up the namespace
> of the geometry type (which in this example you could get from attType
> of the spgConfigIn parameter struct) and assume that box2df is in that
> schema.  But working from the function's own schema might be better
> because you could do the same thing in all these places, rather than
> needing to find an associated object in some call-site-specific way.

I've patched trunk, 2.5 and 2.4, so a fix should be in the next patch
release or in 3.0 if you wait that long.

https://trac.osgeo.org/postgis/ticket/4440

Thanks for the good reproduction case.

ATB,
P