Обсуждение: Snapshot too old logging

Поиск
Список
Период
Сортировка

Snapshot too old logging

От
Magnus Hagander
Дата:
Is there a reason why we don't log which relation triggered the snapshot too old error when it happens?

Since we do have Relation available in TestForOldSnapshot_impl(), shouldn't we be able to include it?

Having that in the log message would be very helpful, but is there a reason why it wouldn't work?

--

Re: Snapshot too old logging

От
Kevin Grittner
Дата:
On Tue, Nov 15, 2016 at 3:38 AM, Magnus Hagander <magnus@hagander.net> wrote:
> Is there a reason why we don't log which relation triggered the snapshot too
> old error when it happens?
>
> Since we do have Relation available in TestForOldSnapshot_impl(), shouldn't
> we be able to include it?
>
> Having that in the log message would be very helpful, but is there a reason
> why it wouldn't work?

I would probably not want to mess with the text of the error
itself, in case any client-side software bases recovery on that
rather than the SQLSTATE value; but including relation (and maybe
the timestamp on the snapshot?) in detail or context portions seems
like a great idea.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Snapshot too old logging

От
Tom Lane
Дата:
Kevin Grittner <kgrittn@gmail.com> writes:
> On Tue, Nov 15, 2016 at 3:38 AM, Magnus Hagander <magnus@hagander.net> wrote:
>> Is there a reason why we don't log which relation triggered the snapshot too
>> old error when it happens?

> I would probably not want to mess with the text of the error
> itself, in case any client-side software bases recovery on that
> rather than the SQLSTATE value;

Any such code is broken on its face because of localization.
Perhaps including the relname in the main message would make it
unduly long, but if not I'd vote for doing it that way.
        regards, tom lane



Re: Snapshot too old logging

От
Magnus Hagander
Дата:
On Tue, Nov 15, 2016 at 5:29 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Kevin Grittner <kgrittn@gmail.com> writes:
> On Tue, Nov 15, 2016 at 3:38 AM, Magnus Hagander <magnus@hagander.net> wrote:
>> Is there a reason why we don't log which relation triggered the snapshot too
>> old error when it happens?

> I would probably not want to mess with the text of the error
> itself, in case any client-side software bases recovery on that
> rather than the SQLSTATE value;

Any such code is broken on its face because of localization.
Perhaps including the relname in the main message would make it
unduly long, but if not I'd vote for doing it that way.


Agreed.

Is there value in showing which snapshot as well? Something like:
DETAIL: snapshot <xyz> is too old to access relation <relation>

Putting both those into the main message will probably make it too long. 


--

Re: Snapshot too old logging

От
Brad DeJong
Дата:
On Tue, Nov 15, 2016 at 10:30 AM, Tom Lane wrote:
>
> Kevin Grittner <kgrittn@gmail.com> writes:
> > On Tue, Nov 15, 2016 at 3:38 AM, Magnus Hagander <magnus@hagander.net>
> wrote:
> >> Is there a reason why we don't log which relation triggered the
> >> snapshot too old error when it happens?
>
> > I would probably not want to mess with the text of the error itself,
> > in case any client-side software bases recovery on that rather than
> > the SQLSTATE value;
>
> Any such code is broken on its face because of localization.
> Perhaps including the relname in the main message would make it unduly
> long, but if not I'd vote for doing it that way.

+1 for relname in the main message.



Re: Snapshot too old logging

От
Robert Haas
Дата:
On Tue, Nov 15, 2016 at 1:18 PM, Magnus Hagander <magnus@hagander.net> wrote:
> Is there value in showing which snapshot as well? Something like:
> DETAIL: snapshot <xyz> is too old to access relation <relation>

IIUC, the granularity is per-block, not per-relation, so that might be
misleading.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Snapshot too old logging

От
Magnus Hagander
Дата:
On Tue, Nov 15, 2016 at 7:27 PM, Robert Haas <robertmhaas@gmail.com> wrote:
On Tue, Nov 15, 2016 at 1:18 PM, Magnus Hagander <magnus@hagander.net> wrote:
> Is there value in showing which snapshot as well? Something like:
> DETAIL: snapshot <xyz> is too old to access relation <relation>

IIUC, the granularity is per-block, not per-relation, so that might be
misleading.

Does it help to let the user know which number? I'm not really sure what I'd do with that information, whereas knowing the relation would be very useful.

But we could certainly say "snapshot <xyz> is too old to access block <nnn> of relation <rel>". 

--

Re: Snapshot too old logging

От
Brad DeJong
Дата:
<div class="WordSection1"><p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif">On Tue,
Nov15, 2016 at 12:20 PM Magnus Hagander wrote:</span><p class="MsoNormal"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif">>Is there value in showing the snapshot as well?</span><p
class="MsoNormal"><spanstyle="font-size:11.0pt;font-family:"Calibri",sans-serif"> </span><p class="MsoNormal"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif">Idon't think so. Knowing the relname let's you look at your
report/joband figure out if the access to that relation can be moved. Having the exact snapshot version isn't going to
changethat. And if you're doing auto-retry, you probably only care if it is happening on the same relation
consistently.</span></div>

Re: Snapshot too old logging

От
Robert Haas
Дата:
On Tue, Nov 15, 2016 at 1:30 PM, Magnus Hagander <magnus@hagander.net> wrote:
> On Tue, Nov 15, 2016 at 7:27 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>>
>> On Tue, Nov 15, 2016 at 1:18 PM, Magnus Hagander <magnus@hagander.net>
>> wrote:
>> > Is there value in showing which snapshot as well? Something like:
>> > DETAIL: snapshot <xyz> is too old to access relation <relation>
>>
>> IIUC, the granularity is per-block, not per-relation, so that might be
>> misleading.
>
> Does it help to let the user know which number? I'm not really sure what I'd
> do with that information, whereas knowing the relation would be very useful.
>
> But we could certainly say "snapshot <xyz> is too old to access block <nnn>
> of relation <rel>".

I think it would be better not to include either the snapshot or the
block number, and just find some way to reword the error message so
that it mentions which relation was involved without implying that all
access to the relation would necessarily fail.  For example:

ERROR: snapshot too old
DETAIL: One or more rows required by this query have already been
removed from "%s".

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Snapshot too old logging

От
Tom Lane
Дата:
Magnus Hagander <magnus@hagander.net> writes:
> Is there value in showing which snapshot as well? Something like:
> DETAIL: snapshot <xyz> is too old to access relation <relation>

Snapshots don't have names, and I can't think of a useful way of
identifying them to users.
        regards, tom lane



Re: Snapshot too old logging

От
Kevin Grittner
Дата:
On Tue, Nov 15, 2016 at 12:45 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Magnus Hagander <magnus@hagander.net> writes:
>> Is there value in showing which snapshot as well? Something like:
>> DETAIL: snapshot <xyz> is too old to access relation <relation>
>
> Snapshots don't have names, and I can't think of a useful way of
> identifying them to users.

The time that the snapshot was taken might be useful; I can't think
of anything else that would be.  The main thing would be to include
the relation, with an indication that it's not necessarily true
that access to *any* part of the relation would be a problem.
Mentioning the block number would tend to support that idea, and
might be of some use.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Snapshot too old logging

От
Magnus Hagander
Дата:
On Tue, Nov 15, 2016 at 7:43 PM, Robert Haas <robertmhaas@gmail.com> wrote:
On Tue, Nov 15, 2016 at 1:30 PM, Magnus Hagander <magnus@hagander.net> wrote:
> On Tue, Nov 15, 2016 at 7:27 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>>
>> On Tue, Nov 15, 2016 at 1:18 PM, Magnus Hagander <magnus@hagander.net>
>> wrote:
>> > Is there value in showing which snapshot as well? Something like:
>> > DETAIL: snapshot <xyz> is too old to access relation <relation>
>>
>> IIUC, the granularity is per-block, not per-relation, so that might be
>> misleading.
>
> Does it help to let the user know which number? I'm not really sure what I'd
> do with that information, whereas knowing the relation would be very useful.
>
> But we could certainly say "snapshot <xyz> is too old to access block <nnn>
> of relation <rel>".

I think it would be better not to include either the snapshot or the
block number, and just find some way to reword the error message so
that it mentions which relation was involved without implying that all
access to the relation would necessarily fail.  For example:

ERROR: snapshot too old
DETAIL: One or more rows required by this query have already been
removed from "%s".


Sounds good to me. I've only really found use for the relation name so far :) 

How about the attached?

And if ppl are OK with it, thoughts on backpatching to 9.6?

--
Вложения

Re: Snapshot too old logging

От
Kevin Grittner
Дата:
On Tue, Nov 15, 2016 at 12:43 PM, Robert Haas <robertmhaas@gmail.com> wrote:

> I think it would be better not to include either the snapshot or the
> block number, and just find some way to reword the error message so
> that it mentions which relation was involved without implying that all
> access to the relation would necessarily fail.  For example:
>
> ERROR: snapshot too old
> DETAIL: One or more rows required by this query have already been
> removed from "%s".

That particular language would be misleading.  All we know about
the page is that it was modified since the referencing (old)
snapshot was taken.  We don't don't know in what way it was
modified, so we must assume that it *might* have been pruned of
rows that the snapshot should still be able to see.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Snapshot too old logging

От
Robert Haas
Дата:
On Tue, Nov 15, 2016 at 2:21 PM, Kevin Grittner <kgrittn@gmail.com> wrote:
> On Tue, Nov 15, 2016 at 12:43 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>
>> I think it would be better not to include either the snapshot or the
>> block number, and just find some way to reword the error message so
>> that it mentions which relation was involved without implying that all
>> access to the relation would necessarily fail.  For example:
>>
>> ERROR: snapshot too old
>> DETAIL: One or more rows required by this query have already been
>> removed from "%s".
>
> That particular language would be misleading.  All we know about
> the page is that it was modified since the referencing (old)
> snapshot was taken.  We don't don't know in what way it was
> modified, so we must assume that it *might* have been pruned of
> rows that the snapshot should still be able to see.

Oh, yeah.  So maybe "may have already been removed".

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Snapshot too old logging

От
Magnus Hagander
Дата:
On Tue, Nov 15, 2016 at 8:22 PM, Robert Haas <robertmhaas@gmail.com> wrote:
On Tue, Nov 15, 2016 at 2:21 PM, Kevin Grittner <kgrittn@gmail.com> wrote:
> On Tue, Nov 15, 2016 at 12:43 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>
>> I think it would be better not to include either the snapshot or the
>> block number, and just find some way to reword the error message so
>> that it mentions which relation was involved without implying that all
>> access to the relation would necessarily fail.  For example:
>>
>> ERROR: snapshot too old
>> DETAIL: One or more rows required by this query have already been
>> removed from "%s".
>
> That particular language would be misleading.  All we know about
> the page is that it was modified since the referencing (old)
> snapshot was taken.  We don't don't know in what way it was
> modified, so we must assume that it *might* have been pruned of
> rows that the snapshot should still be able to see.

Oh, yeah.  So maybe "may have already been removed".

Just to be clear, you're suggesting 'One or more rows may have already been removed from "%s"?


--

Re: Snapshot too old logging

От
Alvaro Herrera
Дата:
Magnus Hagander wrote:
> On Tue, Nov 15, 2016 at 8:22 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> 
> > On Tue, Nov 15, 2016 at 2:21 PM, Kevin Grittner <kgrittn@gmail.com> wrote:

> > > That particular language would be misleading.  All we know about
> > > the page is that it was modified since the referencing (old)
> > > snapshot was taken.  We don't don't know in what way it was
> > > modified, so we must assume that it *might* have been pruned of
> > > rows that the snapshot should still be able to see.
> >
> > Oh, yeah.  So maybe "may have already been removed".
> 
> Just to be clear, you're suggesting 'One or more rows may have already been
> removed from "%s"?

Focusing on the relation itself for a second, I think the name should be
schema-qualified.  What about using errtable()?

Can this happen for relation types other than tables, say materialized
views?  (Your suggested wording omits relation type so it wouldn't be
affected, but it's worth considering I think.)

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Snapshot too old logging

От
Brad DeJong
Дата:
<div class="WordSection1"><p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif">Magnus
wrote:</span><pclass="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri",sans-serif">> Just to be clear,
you'resuggesting 'One or more rows may have already been removed from "%s"?</span><p class="MsoNormal"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif"> </span><pclass="MsoNormal"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif">Perhapsjust 'This query attempted to access a page in "%s"
thatwas modified after the snapshot was acquired.'</span><p class="MsoNormal"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif"> </span></div>

Re: Snapshot too old logging

От
Magnus Hagander
Дата:
On Tue, Nov 15, 2016 at 9:23 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
Magnus Hagander wrote:
> On Tue, Nov 15, 2016 at 8:22 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> > On Tue, Nov 15, 2016 at 2:21 PM, Kevin Grittner <kgrittn@gmail.com> wrote:

> > > That particular language would be misleading.  All we know about
> > > the page is that it was modified since the referencing (old)
> > > snapshot was taken.  We don't don't know in what way it was
> > > modified, so we must assume that it *might* have been pruned of
> > > rows that the snapshot should still be able to see.
> >
> > Oh, yeah.  So maybe "may have already been removed".
>
> Just to be clear, you're suggesting 'One or more rows may have already been
> removed from "%s"?

Focusing on the relation itself for a second, I think the name should be
schema-qualified.  What about using errtable()?

Can this happen for relation types other than tables, say materialized
views?  (Your suggested wording omits relation type so it wouldn't be
affected, but it's worth considering I think.)

I'm fairly certain it can hit other things, including indexes and definitely matviews, but I won't say I'm 100% sure :) The check is at block level. Does errtable() work for that? (I've never used it, and it seems it's only actually use din a single place in the codebase..)


--

Re: Snapshot too old logging

От
Robert Haas
Дата:
On Tue, Nov 15, 2016 at 2:25 PM, Magnus Hagander <magnus@hagander.net> wrote:
> On Tue, Nov 15, 2016 at 8:22 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>>
>> On Tue, Nov 15, 2016 at 2:21 PM, Kevin Grittner <kgrittn@gmail.com> wrote:
>> > On Tue, Nov 15, 2016 at 12:43 PM, Robert Haas <robertmhaas@gmail.com>
>> > wrote:
>> >
>> >> I think it would be better not to include either the snapshot or the
>> >> block number, and just find some way to reword the error message so
>> >> that it mentions which relation was involved without implying that all
>> >> access to the relation would necessarily fail.  For example:
>> >>
>> >> ERROR: snapshot too old
>> >> DETAIL: One or more rows required by this query have already been
>> >> removed from "%s".
>> >
>> > That particular language would be misleading.  All we know about
>> > the page is that it was modified since the referencing (old)
>> > snapshot was taken.  We don't don't know in what way it was
>> > modified, so we must assume that it *might* have been pruned of
>> > rows that the snapshot should still be able to see.
>>
>> Oh, yeah.  So maybe "may have already been removed".
>
>
> Just to be clear, you're suggesting 'One or more rows may have already been
> removed from "%s"?

I think I was suggesting: One or more rows required by this query may
already have been removed from "%s".

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Snapshot too old logging

От
Kevin Grittner
Дата:
On Wed, Nov 16, 2016 at 6:12 AM, Magnus Hagander <magnus@hagander.net> wrote:
> On Tue, Nov 15, 2016 at 9:23 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

>> Can this happen for relation types other than tables, say materialized
>> views?  (Your suggested wording omits relation type so it wouldn't be
>> affected, but it's worth considering I think.)
>
>
> I'm fairly certain it can hit other things, including indexes and definitely
> matviews, but I won't say I'm 100% sure :) The check is at block level. Does
> errtable() work for that? (I've never used it, and it seems it's only
> actually use din a single place in the codebase..)

Yes, it can happen for indexes and for materialized views.

-- 
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Snapshot too old logging

От
Brad DeJong
Дата:
On Wed, Nov 16, 2016 at 6:43 AM, Robert Haas wrote:
> I think I was suggesting: One or more rows required by this query may
> already have been removed from "%s".

I keep reading that as "you have data corruption because something removed rows that your query needs" rather than
"thisquery took too long to complete but may succeed if you resubmit it".
 



Re: Snapshot too old logging

От
Robert Haas
Дата:
On Fri, Nov 18, 2016 at 11:49 AM, Brad DeJong <Brad.Dejong@infor.com> wrote:
> On Wed, Nov 16, 2016 at 6:43 AM, Robert Haas wrote:
>> I think I was suggesting: One or more rows required by this query may
>> already have been removed from "%s".
>
> I keep reading that as "you have data corruption because something removed rows that your query needs" rather than
"thisquery took too long to complete but may succeed if you resubmit it".
 

I see your point, but remember that this would only the detail line.
The error would still be "snapshot too old".

ERROR:  snapshot too old
DETAIL: One or more rows required by this query may already have been
removed from "%s".

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company