Обсуждение: Do not understand why this happens

Поиск
Список
Период
Сортировка

Do not understand why this happens

От
Aln Kapa
Дата:
HI.

I connect to the database using the PGDAC, and then send a NOTIFY to
myself every minute.  In this case, the logs get that.

2013-03-10 10:34:36 19797 LOG:  process 19797 still waiting for
AccessExclusiveLock on object 0 of class 1262 of database 0 after
3000.100 ms
2013-03-10 10:34:36 19797 STATEMENT:  NOTIFY test105, ''
2013-03-10 10:34:37 19797 LOG:  process 19797 acquired
AccessExclusiveLock on object 0 of class 1262 of database 0 after
3315.206 ms
2013-03-10 10:34:37 19797 STATEMENT:  NOTIFY test105, ''
2013-03-10 10:34:37 19797 LOG:  duration: 3315.322 ms  statement:
NOTIFY test105, ''
2013-03-10 14:27:43 19797 LOG:  could not receive data from client:
Connection reset by peer
2013-03-10 14:27:43 19797 LOG:  unexpected EOF on client connection

Tell me what's the problem?


Re: Do not understand why this happens

От
Bill Moran
Дата:
On Thu, 14 Mar 2013 15:24:45 +0400 Aln Kapa <alnkapa@gmail.com> wrote:
>
> I connect to the database using the PGDAC, and then send a NOTIFY to
> myself every minute.  In this case, the logs get that.
>
> 2013-03-10 10:34:36 19797 LOG:  process 19797 still waiting for
> AccessExclusiveLock on object 0 of class 1262 of database 0 after
> 3000.100 ms
> 2013-03-10 10:34:36 19797 STATEMENT:  NOTIFY test105, ''
> 2013-03-10 10:34:37 19797 LOG:  process 19797 acquired
> AccessExclusiveLock on object 0 of class 1262 of database 0 after
> 3315.206 ms
> 2013-03-10 10:34:37 19797 STATEMENT:  NOTIFY test105, ''
> 2013-03-10 10:34:37 19797 LOG:  duration: 3315.322 ms  statement:
> NOTIFY test105, ''
> 2013-03-10 14:27:43 19797 LOG:  could not receive data from client:
> Connection reset by peer
> 2013-03-10 14:27:43 19797 LOG:  unexpected EOF on client connection
>
> Tell me what's the problem?

Are the last two lines your perceived problem?  Is the NOTIFY working?

There's really not enough information here to actually understand
what you're asking.  Based on the log information, it looks like
a client program issues a notify, then drops the connection.  Could
be because in intervening network control device times out the TCP
state, or could be because the client drops the conneciton, or
because the client crashed, or is poorly implementd in that it looses
its TCP socket.

In any event, the 4 hour lag between the NOTIFY and the conneciton drop
during which nothing happens seems to indicate that the two events
are probably not related.

Is any of that helpful?  I feel like I don't understand your question
and suspect that you didn't receive an answer to your first post
because most people didn't understand it.  If my comments don't
address your question, perhaps try describing it differently.

--
Bill Moran <wmoran@potentialtech.com>


unexpected lock waits (was Re: Do not understand why this happens)

От
Bill Moran
Дата:
I don't know the answer to your question.  Please keep mailing list
conversations on the list, otherwise you won't get the benefit of
someone else who may know the answer.

I do wonder what else is happening in the transaction that you're
calling NOTIFY within; and that some other statement could be causing
the lock wait.

On Fri, 15 Mar 2013 10:53:56 +0400 Aln Kapa <alnkapa@gmail.com> wrote:

> I realized that my explanation of the problem is not accurate. I do
> not understand what is happening  and why process is waiting for
> AccessExclusiveLock.
> More logs:
> --
> 682058674-2013-03-14 16:41:48 MSK web@web 8025 LOG:  process 8025
> still waiting for AccessExclusiveLock on object 0 of class 1262 of
> database 0 after 3000.114 ms
> 682058826:2013-03-14 16:41:48 MSK web@web 8025 STATEMENT:  NOTIFY test71, ''
> 682058893-2013-03-14 16:41:48 MSK web@web 15984 LOG:  duration:
> 4017.622 ms  statement: COMMIT
> --
> 682059265-2013-03-14 16:41:48 MSK web@web 8025 LOG:  process 8025
> acquired AccessExclusiveLock on object 0 of class 1262 of database 0
> after 3228.124 ms
> 682059408:2013-03-14 16:41:48 MSK web@web 8025 STATEMENT:  NOTIFY test71, ''
> 682059475-2013-03-14 16:41:48 MSK web@web 8025 LOG:  duration:
> 3228.257 ms  statement: NOTIFY test71, ''
> --
> 816903276-2013-03-15 09:18:04 MSK web@web 11373 LOG:  process 11373
> still waiting for AccessExclusiveLock on object 0 of class 1262 of
> database 0 after 3000.129 ms
> 816903430:2013-03-15 09:18:04 MSK web@web 11373 STATEMENT:  NOTIFY test102, ''
> 816903499-2013-03-15 09:18:05 MSK web@web 15984 LOG:  duration:
> 7075.972 ms  statement: COMMIT
> --
> 816903867-2013-03-15 09:18:05 MSK web@web 11373 LOG:  process 11373
> acquired AccessExclusiveLock on object 0 of class 1262 of database 0
> after 3747.147 ms
> 816904012:2013-03-15 09:18:05 MSK web@web 11373 STATEMENT:  NOTIFY test102, ''
> 816904081-2013-03-15 09:18:05 MSK web@web 11373 LOG:  duration:
> 3747.254 ms  statement: NOTIFY test102, ''
> --
> 817081702-2013-03-15 09:19:25 MSK web@web 11491 LOG:  process 11491
> still waiting for AccessExclusiveLock on object 0 of class 1262 of
> database 0 after 3000.117 ms
> 817081856:2013-03-15 09:19:25 MSK web@web 11491 STATEMENT:  NOTIFY test87, ''
> 817081924-2013-03-15 09:19:25 MSK web@web 15984 LOG:  duration:
> 4004.117 ms  statement: COMMIT
> 817082009-2013-03-15 09:19:25 MSK web@web 11491 LOG:  process 11491
> acquired AccessExclusiveLock on object 0 of class 1262 of database 0
> after 3141.744 ms
> 817082154:2013-03-15 09:19:25 MSK web@web 11491 STATEMENT:  NOTIFY test87, ''
> 817082222-2013-03-15 09:19:25 MSK web@web 11491 LOG:  duration:
> 3164.606 ms  statement: NOTIFY test87, ''
> --
> 818804853-2013-03-15 09:38:02 MSK OhGha5ya@rzdvo 11759 LOG:  process
> 11759 still waiting for AccessExclusiveLock on object 0 of class 1262
> of database 0 after 3000.111 ms
> 818805014:2013-03-15 09:38:02 MSK OhGha5ya@rzdvo 11759 STATEMENT:
> NOTIFY test, ''
> 818805087-2013-03-15 09:38:02 MSK web@web 15984 WARNING:
> key:union_transport_stat_web75
> --
> 818831490-2013-03-15 09:38:03 MSK OhGha5ya@rzdvo 11759 LOG:  process
> 11759 acquired AccessExclusiveLock on object 0 of class 1262 of
> database 0 after 3960.587 ms
> 818831642:2013-03-15 09:38:03 MSK OhGha5ya@rzdvo 11759 STATEMENT:
> NOTIFY test, ''
> 818831715-2013-03-15 09:38:03 MSK OhGha5ya@rzdvo 11759 LOG:  duration:
> 3990.403 ms  statement: NOTIFY test, ''
> --
> 820776461-2013-03-15 09:50:59 MSK web@web 11630 LOG:  process 11630
> still waiting for AccessExclusiveLock on object 0 of class 1262 of
> database 0 after 3000.120 ms
> 820776615:2013-03-15 09:50:59 MSK web@web 11630 STATEMENT:  NOTIFY test71, ''
> 820776683-2013-03-15 09:50:59 MSK ukraine@ukraine 8168 LOG:  process
> 8168 still waiting for AccessExclusiveLock on object 0 of class 1262
> of database 0 after 3000.118 ms
> --
> 820777290-2013-03-15 09:51:03 MSK web@web 11630 LOG:  process 11630
> acquired AccessExclusiveLock on object 0 of class 1262 of database 0
> after 6374.424 ms
> 820777435:2013-03-15 09:51:03 MSK web@web 11630 STATEMENT:  NOTIFY test71, ''
> 820777503-2013-03-15 09:51:03 MSK web@web 11630 LOG:  duration:
> 6374.557 ms  statement: NOTIFY test71, ''
> --
> 821538779-2013-03-15 09:57:00 MSK OhGha5ya@rzdvo 14501 LOG:  process
> 14501 still waiting for AccessExclusiveLock on object 0 of class 1262
> of database 0 after 3000.096 ms
> 821538940:2013-03-15 09:57:00 MSK OhGha5ya@rzdvo 14501 STATEMENT:
> NOTIFY test, ''
> 821539013-2013-03-15 09:57:00 MSK web@web 7995 LOG:  process 7995
> still waiting for AccessExclusiveLock on object 0 of class 1262 of
> database 0 after 3000.111 ms
> --
> 821539305-2013-03-15 09:57:01 MSK OhGha5ya@rzdvo 14501 LOG:  process
> 14501 acquired AccessExclusiveLock on object 0 of class 1262 of
> database 0 after 4007.728 ms
> 821539457:2013-03-15 09:57:01 MSK OhGha5ya@rzdvo 14501 STATEMENT:
> NOTIFY test, ''
> 821539530-2013-03-15 09:57:01 MSK OhGha5ya@rzdvo 14501 LOG:  duration:
> 4007.905 ms  statement: NOTIFY test, ''
> --
> 822220899-2013-03-15 10:00:54 MSK OhGha5ya@rzdvo 10849 LOG:  process
> 10849 still waiting for AccessExclusiveLock on object 0 of class 1262
> of database 0 after 3000.138 ms
> 822221060:2013-03-15 10:00:54 MSK OhGha5ya@rzdvo 10849 STATEMENT:
> NOTIFY test138, ''
> 822221136-2013-03-15 10:00:54 MSK web@web 7987 LOG:  duration:
> 4011.984 ms  statement: COMMIT
> 822221220-2013-03-15 10:00:54 MSK OhGha5ya@rzdvo 10849 LOG:  process
> 10849 acquired AccessExclusiveLock on object 0 of class 1262 of
> database 0 after 3100.606 ms
> 822221372:2013-03-15 10:00:54 MSK OhGha5ya@rzdvo 10849 STATEMENT:
> NOTIFY test138, ''
> 822221448-2013-03-15 10:00:54 MSK OhGha5ya@rzdvo 10849 LOG:  duration:
> 3100.799 ms  statement: NOTIFY test138, ''
> --
> 825472959-2013-03-15 10:28:16 MSK OhGha5ya@rzdvo 5616 LOG:  process
> 5616 still waiting for AccessExclusiveLock on object 0 of class 1262
> of database 0 after 3000.112 ms
> 825473118:2013-03-15 10:28:16 MSK OhGha5ya@rzdvo 5616 STATEMENT:
> NOTIFY test, ''
> 825473190-2013-03-15 10:28:17 MSK web@web 15984 LOG:  duration:
> 4019.454 ms  statement: COMMIT
> 825473275-2013-03-15 10:28:17 MSK OhGha5ya@rzdvo 5616 LOG:  process
> 5616 acquired AccessExclusiveLock on object 0 of class 1262 of
> database 0 after 3909.825 ms
> 825473425:2013-03-15 10:28:17 MSK OhGha5ya@rzdvo 5616 STATEMENT:
> NOTIFY test, ''
> 825473497-2013-03-15 10:28:17 MSK OhGha5ya@rzdvo 5616 LOG:  duration:
> 3909.937 ms  statement: NOTIFY test, ''
>
>
> 2013/3/14 Bill Moran <wmoran@potentialtech.com>:
> > On Thu, 14 Mar 2013 15:24:45 +0400 Aln Kapa <alnkapa@gmail.com> wrote:
> >>
> >> I connect to the database using the PGDAC, and then send a NOTIFY to
> >> myself every minute.  In this case, the logs get that.
> >>
> >> 2013-03-10 10:34:36 19797 LOG:  process 19797 still waiting for
> >> AccessExclusiveLock on object 0 of class 1262 of database 0 after
> >> 3000.100 ms
> >> 2013-03-10 10:34:36 19797 STATEMENT:  NOTIFY test105, ''
> >> 2013-03-10 10:34:37 19797 LOG:  process 19797 acquired
> >> AccessExclusiveLock on object 0 of class 1262 of database 0 after
> >> 3315.206 ms
> >> 2013-03-10 10:34:37 19797 STATEMENT:  NOTIFY test105, ''
> >> 2013-03-10 10:34:37 19797 LOG:  duration: 3315.322 ms  statement:
> >> NOTIFY test105, ''
> >> 2013-03-10 14:27:43 19797 LOG:  could not receive data from client:
> >> Connection reset by peer
> >> 2013-03-10 14:27:43 19797 LOG:  unexpected EOF on client connection
> >>
> >> Tell me what's the problem?
> >
> > Are the last two lines your perceived problem?  Is the NOTIFY working?
> >
> > There's really not enough information here to actually understand
> > what you're asking.  Based on the log information, it looks like
> > a client program issues a notify, then drops the connection.  Could
> > be because in intervening network control device times out the TCP
> > state, or could be because the client drops the conneciton, or
> > because the client crashed, or is poorly implementd in that it looses
> > its TCP socket.
> >
> > In any event, the 4 hour lag between the NOTIFY and the conneciton drop
> > during which nothing happens seems to indicate that the two events
> > are probably not related.
> >
> > Is any of that helpful?  I feel like I don't understand your question
> > and suspect that you didn't receive an answer to your first post
> > because most people didn't understand it.  If my comments don't
> > address your question, perhaps try describing it differently.
> >
> > --
> > Bill Moran <wmoran@potentialtech.com>


--
Bill Moran <wmoran@potentialtech.com>


Re: unexpected lock waits (was Re: Do not understand why this happens)

От
Tom Lane
Дата:
Bill Moran <wmoran@potentialtech.com> writes:
> I do wonder what else is happening in the transaction that you're
> calling NOTIFY within; and that some other statement could be causing
> the lock wait.

FWIW, the lock seems to be the one taken to serialize insertions into
the shared NOTIFY queue, from this bit in commands/async.c:

        /*
         * Serialize writers by acquiring a special lock that we hold till
         * after commit.  This ensures that queue entries appear in commit
         * order, and in particular that there are never uncommitted queue
         * entries ahead of committed ones, so an uncommitted transaction
         * can't block delivery of deliverable notifications.
         *
         * We use a heavyweight lock so that it'll automatically be released
         * after either commit or abort.  This also allows deadlocks to be
         * detected, though really a deadlock shouldn't be possible here.
         *
         * The lock is on "database 0", which is pretty ugly but it doesn't
         * seem worth inventing a special locktag category just for this.
         * (Historical note: before PG 9.0, a similar lock on "database 0" was
         * used by the flatfiles mechanism.)
         */
        LockSharedObject(DatabaseRelationId, InvalidOid, 0,
                         AccessExclusiveLock);

This lock is held while inserting the transaction's notify message(s),
after which the transaction commits and releases the lock.  There's not
very much code in that window.  So what we can conclude is that some
other transaction also doing NOTIFY hung up within that sequence for
something in excess of 3 seconds.  We have been shown no data whatsoever
that would allow us to speculate about what's causing that other
transaction to take so long to get through its commit sequence.

            regards, tom lane


Re: unexpected lock waits (was Re: [GENERAL] Do not understand whythis happens)

От
Martijn van Oosterhout
Дата:
(sorry, very old thread)

On Fri, Mar 15, 2013 at 09:38:06AM -0400, Tom Lane wrote:
> Bill Moran <wmoran@potentialtech.com> writes:
> > I do wonder what else is happening in the transaction that you're
> > calling NOTIFY within; and that some other statement could be causing
> > the lock wait.
> 
> FWIW, the lock seems to be the one taken to serialize insertions into
> the shared NOTIFY queue, from this bit in commands/async.c:

[SNIP]

> This lock is held while inserting the transaction's notify message(s),
> after which the transaction commits and releases the lock.  There's not
> very much code in that window.  So what we can conclude is that some
> other transaction also doing NOTIFY hung up within that sequence for
> something in excess of 3 seconds.  We have been shown no data whatsoever
> that would allow us to speculate about what's causing that other
> transaction to take so long to get through its commit sequence.

I just want to add that after running into this very same issue (see
[1]) that in our case the above conclusion is incorrect.  It is not the
NOTIFYing transactions that are holding the lock too long, but the
LISTENing backends. In our case it is because we have lots of databases
and all databases share a single global NOTIFY queue.

To verify this I made some small patches that significantly reduce the
time LISTENing backends hold the lock and they reduce the problem
significantly for us, see [2].  A slow commit does have a bit of
impact, but the bulk of the time is elsehwere.

[1]: https://www.postgresql.org/message-id/CADWG95t0j9zF0uwdcMH81KMnDsiTAVHxmBvgYqrRJcD-iLwQhw@mail.gmail.com

[2]: https://www.postgresql.org/message-id/CADWG95uLhar1uq6PQLoY1mTQYeN23c1dvOr2tVjcXUBZ1ge9XA@mail.gmail.com

Hope this helps.
-- 
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> The combine: one man, one day, wheat for half a million loaves of bread.