Обсуждение: [7.0.2] node type 17 not supported ...

Поиск
Список
Период
Сортировка

[7.0.2] node type 17 not supported ...

От
The Hermit Hacker
Дата:
I have a database where, periodically, I get a query that is producing
pg_noname files that are >1gig in size ... according to syslog, for that
process:

Sep  7 18:36:39 pgsql postgres[47078]: DEBUG:  ExecRestrPos: node type 17 not supported
Sep  7 18:36:39 pgsql postgres[47078]: DEBUG:  ExecRestrPos: node type 17 not supported
Sep  7 18:36:40 pgsql postgres[47078]: DEBUG:  ExecRestrPos: node type 17 not supported
Sep  7 18:36:56 pgsql postgres[47078]: DEBUG:  ExecMarkPos: node type 17 not supported
%

the query that appears to be causing this, in this particular case, is:

SELECT distinct s.gid, s.created ,       geo_distance(pd.location, '(-97.4382912597586,37.7021126098755)')  FROM status
s,personal_data pd, relationship_wanted rw,       personal_ethnicity pe, personal_religion pr, personal_bodytype pb,
 personal_smoking ps WHERE s.active   AND s.status != 0                  AND (s.gid = pd.gid AND pd.gender = 0)  AND
(s.gid= rw.gid AND rw.gender = 1)                  AND geo_distance(pd.location,
'(-97.4382912597586,37.7021126098755)')<= 500
 
ORDER BY geo_distance( pd.location, '(-97.4382912597586,37.7021126098755)'),          s.created desc;

now, its a reasonable oft run query, and from a debugging log that I keep,
it normally takes <1sec to run:

[0.38 secs]: SELECT distinct s.gid, s.created , geo_distance(pd.location, '(-97.4382912597586,37.7021126098755)')
     FROM status s, personal_data pd, relationship_wanted rw            WHERE s.active AND s.status != 0
AND(s.gid = pd.gid AND pd.gender = 0)              AND (s.gid = rw.gid AND rw.gender = 1 )              AND
geo_distance(pd.location, '(-97.4382912597586,37.7021126098755)' ) <= 500         ORDER BY geo_distance( pd.location,
'(-97.4382912597586,37.7021126098755)'), s.created desc;
 

So, I'm curious as to why it periodically just hangs ... how do you debug
something like this? :(  Its been happening ~once per day, so should be
reasonably debugging (unless, of course, now that I mention something it
never comes back *sigh*) ...

Thoughts? 

Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
Systems Administrator @ hub.org 
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org 




Re: [7.0.2] node type 17 not supported ...

От
Tom Lane
Дата:
The Hermit Hacker <scrappy@hub.org> writes:
> I have a database where, periodically, I get a query that is producing
> pg_noname files that are >1gig in size ... according to syslog, for that
> process:

> Sep  7 18:36:39 pgsql postgres[47078]: DEBUG:  ExecRestrPos: node type 17 not supported
> Sep  7 18:36:39 pgsql postgres[47078]: DEBUG:  ExecRestrPos: node type 17 not supported
> Sep  7 18:36:40 pgsql postgres[47078]: DEBUG:  ExecRestrPos: node type 17 not supported
> Sep  7 18:36:56 pgsql postgres[47078]: DEBUG:  ExecMarkPos: node type 17 not supported
> %

This is the planner bug that I was just alluding to in other email ---
the planner is trying to use a nestloop as the inner input to a
mergejoin, and that doesn't work :-(.  But you only see the problem
if the outer side contains multiple matches to a single inside tuple.

I have a fix for current sources; let me see if I can retrofit it for
7.0.*.
        regards, tom lane


Re: [7.0.2] node type 17 not supported ...

От
The Hermit Hacker
Дата:
On Thu, 7 Sep 2000, Tom Lane wrote:

> The Hermit Hacker <scrappy@hub.org> writes:
> > I have a database where, periodically, I get a query that is producing
> > pg_noname files that are >1gig in size ... according to syslog, for that
> > process:
> 
> > Sep  7 18:36:39 pgsql postgres[47078]: DEBUG:  ExecRestrPos: node type 17 not supported
> > Sep  7 18:36:39 pgsql postgres[47078]: DEBUG:  ExecRestrPos: node type 17 not supported
> > Sep  7 18:36:40 pgsql postgres[47078]: DEBUG:  ExecRestrPos: node type 17 not supported
> > Sep  7 18:36:56 pgsql postgres[47078]: DEBUG:  ExecMarkPos: node type 17 not supported
> > %
> 
> This is the planner bug that I was just alluding to in other email ---
> the planner is trying to use a nestloop as the inner input to a
> mergejoin, and that doesn't work :-(.  But you only see the problem if
> the outer side contains multiple matches to a single inside tuple.
> 
> I have a fix for current sources; let me see if I can retrofit it for
> 7.0.*.

that would be perfect ... if we can get that retrofit'd, I'd be quite
tempted to put out a 7.0.3 for this, considering that its obviously not an
isolated incident ;(

Thanks ...



Re: [7.0.2] node type 17 not supported ...

От
Tom Lane
Дата:
The Hermit Hacker <scrappy@hub.org> writes:
> On Thu, 7 Sep 2000, Tom Lane wrote:
>> This is the planner bug that I was just alluding to in other email ---
>> the planner is trying to use a nestloop as the inner input to a
>> mergejoin, and that doesn't work :-(.  But you only see the problem if
>> the outer side contains multiple matches to a single inside tuple.
>> 
>> I have a fix for current sources; let me see if I can retrofit it for
>> 7.0.*.

> that would be perfect ... if we can get that retrofit'd, I'd be quite
> tempted to put out a 7.0.3 for this, considering that its obviously not an
> isolated incident ;(

I have committed a fix into REL7_0 branch.  Although it seems to work,
I don't trust it really far because it depends on heap_markpos() and
heap_restrpos(), which haven't been used in a long time and are full
of alarmed-sounding comments.  (The equivalent fix in current sources
does not use these routines, but that's because nodeMaterial.c has been
completely rewritten, so back-patching that code doesn't seem like a
risk-free choice either.)

I'd suggest running the REL7_0 sources on your machine for awhile before
deciding it's safe to call it 7.0.3.
        regards, tom lane


Re: [7.0.2] node type 17 not supported ...

От
The Hermit Hacker
Дата:
On Thu, 7 Sep 2000, Tom Lane wrote:

> The Hermit Hacker <scrappy@hub.org> writes:
> > On Thu, 7 Sep 2000, Tom Lane wrote:
> >> This is the planner bug that I was just alluding to in other email ---
> >> the planner is trying to use a nestloop as the inner input to a
> >> mergejoin, and that doesn't work :-(.  But you only see the problem if
> >> the outer side contains multiple matches to a single inside tuple.
> >> 
> >> I have a fix for current sources; let me see if I can retrofit it for
> >> 7.0.*.
> 
> > that would be perfect ... if we can get that retrofit'd, I'd be quite
> > tempted to put out a 7.0.3 for this, considering that its obviously not an
> > isolated incident ;(
> 
> I have committed a fix into REL7_0 branch.  Although it seems to work,
> I don't trust it really far because it depends on heap_markpos() and
> heap_restrpos(), which haven't been used in a long time and are full
> of alarmed-sounding comments.  (The equivalent fix in current sources
> does not use these routines, but that's because nodeMaterial.c has been
> completely rewritten, so back-patching that code doesn't seem like a
> risk-free choice either.)
> 
> I'd suggest running the REL7_0 sources on your machine for awhile before
> deciding it's safe to call it 7.0.3.

Okay, I'm going to upgrade to it on Friday night, most likely, and will
let her run for a few days ...

Do you have any thoughts as to what sorts of problems *might*
arise?  Like, are we talking database corruption possibilities, or bad
results, or ... ?  Just want to have an idea of what to try and keep an
eye out for ...

Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
Systems Administrator @ hub.org 
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org 



Re: [7.0.2] node type 17 not supported ...

От
Tom Lane
Дата:
The Hermit Hacker <scrappy@hub.org> writes:
> On Thu, 7 Sep 2000, Tom Lane wrote:
>> I have committed a fix into REL7_0 branch.  Although it seems to work,
>> I don't trust it really far because it depends on heap_markpos() and
>> heap_restrpos(), which haven't been used in a long time and are full
>> of alarmed-sounding comments.

> Do you have any thoughts as to what sorts of problems *might*
> arise?  Like, are we talking database corruption possibilities, or bad
> results, or ... ?  Just want to have an idea of what to try and keep an
> eye out for ...

I may be overstating the cause for worry.  All of the "alarmed-sounding
comments" appear to date back to the original Postgres95 sources, and
are probably obsolete.  The only thing I really have any concern about
is whether buffer pin/unpin bookkeeping is correct.  If it's not,
you'd see an Assert failure from too many unpins (you are running with
--enable-cassert I hope) or "Buffer Leak" notices in the log from too
many pins.
        regards, tom lane


Re: [7.0.2] node type 17 not supported ...

От
The Hermit Hacker
Дата:
On Thu, 7 Sep 2000, Tom Lane wrote:

> The Hermit Hacker <scrappy@hub.org> writes:
> > On Thu, 7 Sep 2000, Tom Lane wrote:
> >> I have committed a fix into REL7_0 branch.  Although it seems to work,
> >> I don't trust it really far because it depends on heap_markpos() and
> >> heap_restrpos(), which haven't been used in a long time and are full
> >> of alarmed-sounding comments.
> 
> > Do you have any thoughts as to what sorts of problems *might*
> > arise?  Like, are we talking database corruption possibilities, or bad
> > results, or ... ?  Just want to have an idea of what to try and keep an
> > eye out for ...
> 
> I may be overstating the cause for worry.  All of the "alarmed-sounding
> comments" appear to date back to the original Postgres95 sources, and
> are probably obsolete.  The only thing I really have any concern about
> is whether buffer pin/unpin bookkeeping is correct.  If it's not,
> you'd see an Assert failure from too many unpins (you are running with
> --enable-cassert I hope) or "Buffer Leak" notices in the log from too
> many pins.

Haven't been running it with cassert, but will enable it *nod*

Thanks for the backpatch ...:)




Re: [7.0.2] node type 17 not supported ...

От
"Ross J. Reedstrom"
Дата:
Marc - 
If you're going to consider a point release, should we try to collect up
any other small patches that would go cleanly into the 7.0 tree? Just for
instance, the revised view rule name truncation patch I posted to PATCHES
that no one has commented on, yeah or neah. I don't think there are very
many of these: it seems to me that most small fixes that could be patched
to both trees, have been, but I haven't been trying to keep an accurate
count.  (Hmm, Bruce's has been quiet this week. Is he on vacation?)

Ross

On Fri, Sep 08, 2000 at 12:06:38AM -0300, The Hermit Hacker wrote:
> On Thu, 7 Sep 2000, Tom Lane wrote:
> 
> > The Hermit Hacker <scrappy@hub.org> writes:
> > > On Thu, 7 Sep 2000, Tom Lane wrote:
> > >> I have committed a fix into REL7_0 branch.  Although it seems to work,
> > >> I don't trust it really far because it depends on heap_markpos() and
> > >> heap_restrpos(), which haven't been used in a long time and are full
> > >> of alarmed-sounding comments.
> > 
> 
> Thanks for the backpatch ...:)
> 
> 

-- 
Ross J. Reedstrom, Ph.D., <reedstrm@rice.edu> 
NSBRI Research Scientist/Programmer
Computer and Information Technology Institute
Rice University, 6100 S. Main St.,  Houston, TX 77005


Re: [7.0.2] node type 17 not supported ...

От
The Hermit Hacker
Дата:
On Fri, 8 Sep 2000, Ross J. Reedstrom wrote:

> Marc - 

> If you're going to consider a point release, should we try to collect
> up any other small patches that would go cleanly into the 7.0 tree?
> Just for instance, the revised view rule name truncation patch I
> posted to PATCHES that no one has commented on, yeah or neah. I don't
> think there are very many of these: it seems to me that most small
> fixes that could be patched to both trees, have been, but I haven't
> been trying to keep an accurate count.  (Hmm, Bruce's has been quiet
> this week. Is he on vacation?)

Unless its considered critical to the stable running of the server, like
the patch Tom just committed is, it won't go in ... I'm planning on
running this patch through the weekend and watching things, if all goes
well by Mon/Tues, I'll put out v7.0.3 ... since the above named patch
isn't even in the -CURRENT tree yet, I'm leary of slapping it into
something we consider to be stable, no? :)


> > Ross
> 
> On Fri, Sep 08, 2000 at 12:06:38AM -0300, The Hermit Hacker wrote:
> > On Thu, 7 Sep 2000, Tom Lane wrote:
> > 
> > > The Hermit Hacker <scrappy@hub.org> writes:
> > > > On Thu, 7 Sep 2000, Tom Lane wrote:
> > > >> I have committed a fix into REL7_0 branch.  Although it seems to work,
> > > >> I don't trust it really far because it depends on heap_markpos() and
> > > >> heap_restrpos(), which haven't been used in a long time and are full
> > > >> of alarmed-sounding comments.
> > > 
> > 
> > Thanks for the backpatch ...:)
> > 
> > 
> 
> -- 
> Ross J. Reedstrom, Ph.D., <reedstrm@rice.edu> 
> NSBRI Research Scientist/Programmer
> Computer and Information Technology Institute
> Rice University, 6100 S. Main St.,  Houston, TX 77005
> 

Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
Systems Administrator @ hub.org 
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org