Обсуждение: select now() problem?

Поиск
Список
Период
Сортировка

select now() problem?

От
Eduardo Piombino
Дата:
Hello list, I'm having a problem with a production system.
I have an application that runs 7x24. It is multithreaded, and every thread has its own connection to the database.
We have two types of devices connected to each thread, human operated PCs for billing purposes, and rabbit microcontrollers controlling some electronic devices.

The problem we are having is that once a month approximately, all the microcontrollers attached to their threads (its a common pool of threads), stop responding. Having a closer look at the logs, we found that these threads are stalled at a SELECT NOW() immediately after a BEGIN TRANSACTION. And then the thread stops responding, since it is waiting for that sql command to finish.

The curious part is that all PCs still work ok, even though they also execute SELECT NOW().

This log represents the session for one PC as an example.

02/09/2010 09:26:32.593 | 192.168.203.104:2255 | Thread 17: (CAJA 4@Parking B->Exterior)Thread 17: [TRANSACTION BEGIN]
02/09/2010 09:26:32.593 | (17) | ROLLBACK
02/09/2010 09:26:32.593 | (17) | BEGIN TRANSACTION
02/09/2010 09:26:32.593 | (17) | SELECT NOW()
02/09/2010 09:26:32.593 | 192.168.203.104:2255 | Thread 17: (CAJA 4@Parking B->Exterior)0 ms (errorCode=[200] bytesRead=[17] bytesWritten=[3])

This log represents the session for one thread attached to a device.

02/09/2010 09:15:01.265 | 192.168.203.52:1127 | Thread  8: (SAL2ABJ TCA@Parking B->Exterior)Thread 8: [EVENT AUTH TR 005001040719 02/09/2010 09:15:01 1]
02/09/2010 09:15:01.265 | (8) | BEGIN TRANSACTION
02/09/2010 09:15:01.265 | (8) | SELECT NOW()

and the thread never continues, it never comes by the select now().

The only difference from the database point of view, is that PCs transactions, since they are much more complex, usually start with a ROLLBACK (arguable, it will be eventually removed) and for devices they usually just start with a COMMIT.

Do you have any idea on what could be causing a SELECT NOW() to never come back?
I'm dazzled.

The problem gets solved restarting the server application (and by doing so, all connections to the database are dropped, and after that it doesnt present the problem anymore for at least a month). It has happened approximately 4 or 5 times already.

Thank you,
Eduardo.



Re: select now() problem?

От
Tom Lane
Дата:
Eduardo Piombino <drakorg@gmail.com> writes:
> Do you have any idea on what could be causing a SELECT NOW() to never come
> back?

That's truly bizarre.  Can you attach to one of these stuck processes
with a debugger and get a stack trace?

            regards, tom lane

Re: select now() problem?

От
Eduardo Piombino
Дата:
>The only difference from the database point of view, is that PCs transactions, since they are much more complex, usually start with a ROLLBACK (arguable, it will be eventually removed) and for devices they usually just start with a COMMIT.

Correction ... and for devices they usually just start with a BEGIN TRANSACTION.

Real thing is that when this happens, I'm several kilometers away, with no remote access (yet at least), and since it is a production system as soon as something like this happens they need to get it back up asap. I find it extremely difficult to be able to gather that data. Besides that I'm not familiar with the procedure of attaching to someone else's process in windows.

It is a 8.4.2 postgres running in windows 2003 server.
Im tempted in upgrading to 8.4.latest, since it should not require a full backup/restore and practically no server downtime.
But i'm not sure about the real gain of that.

On Thu, Sep 2, 2010 at 2:56 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Eduardo Piombino <drakorg@gmail.com> writes:
> Do you have any idea on what could be causing a SELECT NOW() to never come
> back?

That's truly bizarre.  Can you attach to one of these stuck processes
with a debugger and get a stack trace?

                       regards, tom lane

Re: select now() problem?

От
Grzegorz Jaśkiewicz
Дата:
I wonder if there's an equivalent of gcore on windows. If there is, it
might be useful.