Обсуждение: Connection terminated by the server causes deadlock in jdbc client side connection

Поиск
Список
Период
Сортировка

Connection terminated by the server causes deadlock in jdbc client side connection

От
Leonardo Frittelli
Дата:
Hi,

We are experiencing very frequent deadlocks in pgsql jdbc connections. The scenario is a replicated database with hot_standby = on

At times of high volumes of queries in both the primary and the replicated server, we sometimes get the following log indicating that the server has terminated the connection in the replicated database.
FATAL:  terminating connection due to conflict with recovery
DETAIL:  User query might have needed to see row versions that must be removed.
HINT:  In a moment you should be able to reconnect to the database and repeat your command.

This is expected and we see no issue with that. What I did not expect, however, is that in the JDBC client side, the connection is deadlocked.

java.lang.Thread.State: RUNNABLE
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(Unknown Source)
at java.net.SocketOutputStream.write(Unknown Source)
at java.io.BufferedOutputStream.flushBuffer(Unknown Source)
at java.io.BufferedOutputStream.flush(Unknown Source)
- locked <0x0000000700742e30> (a java.io.BufferedOutputStream)
at org.postgresql.core.PGStream.flush(PGStream.java:518)
at org.postgresql.core.v3.ProtocolConnectionImpl.close(ProtocolConnectionImpl.java:136)
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:280)
- locked <0x0000000700742fd0> (a org.postgresql.core.v3.QueryExecutorImpl)
at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:547)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:417)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:302)
...

Looking at the Postgres JDBC code, I notice that ProtocolConnectionImpl.close() (invoked by the exception handler in QueryExecutorImpl.execute) is trying to 'gracefully close' by sending an 'X' to the server before actually closing the socket.

...
if (logger.logDebug()) logger.debug(" FE=> Terminate"); pgStream.SendChar('X'); pgStream.flush(); pgStream.close(); ...

Does this make sense in a scenario of a connection which has already been terminated by the server side? 

At times of high load, all connections in the pool get eventually locked with exactly the same stack trace. 

I'd appreciate any advice on how to handle this. Could this be a bug in the JDBC driver?

Thanks,

Leonardo

Re: Connection terminated by the server causes deadlock in jdbc client side connection

От
Dave Cramer
Дата:
It could be a problem. What version are you using ?

Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca

On 7 October 2015 at 12:33, Leonardo Frittelli <leonardo.frittelli@gmail.com> wrote:
Hi,

We are experiencing very frequent deadlocks in pgsql jdbc connections. The scenario is a replicated database with hot_standby = on

At times of high volumes of queries in both the primary and the replicated server, we sometimes get the following log indicating that the server has terminated the connection in the replicated database.
FATAL:  terminating connection due to conflict with recovery
DETAIL:  User query might have needed to see row versions that must be removed.
HINT:  In a moment you should be able to reconnect to the database and repeat your command.

This is expected and we see no issue with that. What I did not expect, however, is that in the JDBC client side, the connection is deadlocked.

java.lang.Thread.State: RUNNABLE
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(Unknown Source)
at java.net.SocketOutputStream.write(Unknown Source)
at java.io.BufferedOutputStream.flushBuffer(Unknown Source)
at java.io.BufferedOutputStream.flush(Unknown Source)
- locked <0x0000000700742e30> (a java.io.BufferedOutputStream)
at org.postgresql.core.PGStream.flush(PGStream.java:518)
at org.postgresql.core.v3.ProtocolConnectionImpl.close(ProtocolConnectionImpl.java:136)
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:280)
- locked <0x0000000700742fd0> (a org.postgresql.core.v3.QueryExecutorImpl)
at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:547)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:417)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:302)
...

Looking at the Postgres JDBC code, I notice that ProtocolConnectionImpl.close() (invoked by the exception handler in QueryExecutorImpl.execute) is trying to 'gracefully close' by sending an 'X' to the server before actually closing the socket.

...
if (logger.logDebug()) logger.debug(" FE=> Terminate"); pgStream.SendChar('X'); pgStream.flush(); pgStream.close(); ...

Does this make sense in a scenario of a connection which has already been terminated by the server side? 

At times of high load, all connections in the pool get eventually locked with exactly the same stack trace. 

I'd appreciate any advice on how to handle this. Could this be a bug in the JDBC driver?

Thanks,

Leonardo

Re: Connection terminated by the server causes deadlock in jdbc client side connection

От
Dave Cramer
Дата:
Leonardo,

At this point I'm wondering if we get an I/O error if it is feasible to attempt to close it. I am leaning towards no.

What is above this in the stack trace. Do you see any information about the IOException error that caused it

Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca

On 7 October 2015 at 14:43, Leonardo Frittelli <leonardo.frittelli@gmail.com> wrote:
Dave,

Thank you very much for your prompt response.
I am using version postgresql-9.3-1101, but I have checked the latest available 9.4 sources and it appears to have the same code (the code example I pasted was from v2, but I see similar code in v3).

Could it be that in this particular scenario we should use ProtocolConnectionImpl.abort() instead of close()?

Thanks,

Leonardo

On 7 October 2015 at 15:23, Dave Cramer <pg@fastcrypt.com> wrote:
It could be a problem. What version are you using ?

Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca

On 7 October 2015 at 12:33, Leonardo Frittelli <leonardo.frittelli@gmail.com> wrote:
Hi,

We are experiencing very frequent deadlocks in pgsql jdbc connections. The scenario is a replicated database with hot_standby = on

At times of high volumes of queries in both the primary and the replicated server, we sometimes get the following log indicating that the server has terminated the connection in the replicated database.
FATAL:  terminating connection due to conflict with recovery
DETAIL:  User query might have needed to see row versions that must be removed.
HINT:  In a moment you should be able to reconnect to the database and repeat your command.

This is expected and we see no issue with that. What I did not expect, however, is that in the JDBC client side, the connection is deadlocked.

java.lang.Thread.State: RUNNABLE
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(Unknown Source)
at java.net.SocketOutputStream.write(Unknown Source)
at java.io.BufferedOutputStream.flushBuffer(Unknown Source)
at java.io.BufferedOutputStream.flush(Unknown Source)
- locked <0x0000000700742e30> (a java.io.BufferedOutputStream)
at org.postgresql.core.PGStream.flush(PGStream.java:518)
at org.postgresql.core.v3.ProtocolConnectionImpl.close(ProtocolConnectionImpl.java:136)
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:280)
- locked <0x0000000700742fd0> (a org.postgresql.core.v3.QueryExecutorImpl)
at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:547)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:417)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:302)
...

Looking at the Postgres JDBC code, I notice that ProtocolConnectionImpl.close() (invoked by the exception handler in QueryExecutorImpl.execute) is trying to 'gracefully close' by sending an 'X' to the server before actually closing the socket.

...
if (logger.logDebug()) logger.debug(" FE=> Terminate"); pgStream.SendChar('X'); pgStream.flush(); pgStream.close(); ...

Does this make sense in a scenario of a connection which has already been terminated by the server side? 

At times of high load, all connections in the pool get eventually locked with exactly the same stack trace. 

I'd appreciate any advice on how to handle this. Could this be a bug in the JDBC driver?

Thanks,

Leonardo



Re: Connection terminated by the server causes deadlock in jdbc client side connection

От
Leonardo Frittelli
Дата:
Dave,

Sadly ExecutorQueryImpl will only print out the exception _after_ closing the connection :)

        catch (IOException e)
        {
            protoConnection.close();
            handler.handleError(new PSQLException(GT.tr("An I/O error occurred while sending to the backend."), PSQLState.CONNECTION_FAILURE, e));
        }


Here is a longer StackTrace in case there is anything else of value. We are using BoneCP and Hibernate. The next line in the stack trace would be our application code.

   java.lang.Thread.State: RUNNABLE
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(Unknown Source)
at java.net.SocketOutputStream.write(Unknown Source)
at java.io.BufferedOutputStream.flushBuffer(Unknown Source)
at java.io.BufferedOutputStream.flush(Unknown Source)
- locked <0x0000000700742e30> (a java.io.BufferedOutputStream)
at org.postgresql.core.PGStream.flush(PGStream.java:518)
at org.postgresql.core.v3.ProtocolConnectionImpl.close(ProtocolConnectionImpl.java:136)
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:280)
- locked <0x0000000700742fd0> (a org.postgresql.core.v3.QueryExecutorImpl)
at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:547)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:417)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:302)
at com.jolbox.bonecp.PreparedStatementHandle.executeQuery(PreparedStatementHandle.java:174)
at org.hibernate.engine.jdbc.internal.ResultSetReturnImpl.extract(ResultSetReturnImpl.java:56)
at org.hibernate.loader.Loader.getResultSet(Loader.java:2040)
at org.hibernate.loader.Loader.executeQueryStatement(Loader.java:1837)
at org.hibernate.loader.Loader.executeQueryStatement(Loader.java:1816)
at org.hibernate.loader.Loader.doQuery(Loader.java:900)
at org.hibernate.loader.Loader.doQueryAndInitializeNonLazyCollections(Loader.java:342)
at org.hibernate.loader.Loader.doList(Loader.java:2526)
at org.hibernate.loader.Loader.doList(Loader.java:2512)
at org.hibernate.loader.Loader.listIgnoreQueryCache(Loader.java:2342)
at org.hibernate.loader.Loader.list(Loader.java:2337)
at org.hibernate.loader.criteria.CriteriaLoader.list(CriteriaLoader.java:124)
at org.hibernate.internal.SessionImpl.list(SessionImpl.java:1662)
at org.hibernate.internal.CriteriaImpl.list(CriteriaImpl.java:374)

Regards,

Leonardo

On 7 October 2015 at 16:01, Dave Cramer <pg@fastcrypt.com> wrote:
Leonardo,

At this point I'm wondering if we get an I/O error if it is feasible to attempt to close it. I am leaning towards no.

What is above this in the stack trace. Do you see any information about the IOException error that caused it

Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca

On 7 October 2015 at 14:43, Leonardo Frittelli <leonardo.frittelli@gmail.com> wrote:
Dave,

Thank you very much for your prompt response.
I am using version postgresql-9.3-1101, but I have checked the latest available 9.4 sources and it appears to have the same code (the code example I pasted was from v2, but I see similar code in v3).

Could it be that in this particular scenario we should use ProtocolConnectionImpl.abort() instead of close()?

Thanks,

Leonardo

On 7 October 2015 at 15:23, Dave Cramer <pg@fastcrypt.com> wrote:
It could be a problem. What version are you using ?

Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca

On 7 October 2015 at 12:33, Leonardo Frittelli <leonardo.frittelli@gmail.com> wrote:
Hi,

We are experiencing very frequent deadlocks in pgsql jdbc connections. The scenario is a replicated database with hot_standby = on

At times of high volumes of queries in both the primary and the replicated server, we sometimes get the following log indicating that the server has terminated the connection in the replicated database.
FATAL:  terminating connection due to conflict with recovery
DETAIL:  User query might have needed to see row versions that must be removed.
HINT:  In a moment you should be able to reconnect to the database and repeat your command.

This is expected and we see no issue with that. What I did not expect, however, is that in the JDBC client side, the connection is deadlocked.

java.lang.Thread.State: RUNNABLE
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(Unknown Source)
at java.net.SocketOutputStream.write(Unknown Source)
at java.io.BufferedOutputStream.flushBuffer(Unknown Source)
at java.io.BufferedOutputStream.flush(Unknown Source)
- locked <0x0000000700742e30> (a java.io.BufferedOutputStream)
at org.postgresql.core.PGStream.flush(PGStream.java:518)
at org.postgresql.core.v3.ProtocolConnectionImpl.close(ProtocolConnectionImpl.java:136)
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:280)
- locked <0x0000000700742fd0> (a org.postgresql.core.v3.QueryExecutorImpl)
at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:547)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:417)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:302)
...

Looking at the Postgres JDBC code, I notice that ProtocolConnectionImpl.close() (invoked by the exception handler in QueryExecutorImpl.execute) is trying to 'gracefully close' by sending an 'X' to the server before actually closing the socket.

...
if (logger.logDebug()) logger.debug(" FE=> Terminate"); pgStream.SendChar('X'); pgStream.flush(); pgStream.close(); ...

Does this make sense in a scenario of a connection which has already been terminated by the server side? 

At times of high load, all connections in the pool get eventually locked with exactly the same stack trace. 

I'd appreciate any advice on how to handle this. Could this be a bug in the JDBC driver?

Thanks,

Leonardo




Re: Connection terminated by the server causes deadlock in jdbc client side connection

От
Dave Cramer
Дата:
OK, let me see if I can change the code and give you a driver that will print out and not call close.

Dave

Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca

On 7 October 2015 at 15:08, Leonardo Frittelli <leonardo.frittelli@gmail.com> wrote:
Dave,

Sadly ExecutorQueryImpl will only print out the exception _after_ closing the connection :)

        catch (IOException e)
        {
            protoConnection.close();
            handler.handleError(new PSQLException(GT.tr("An I/O error occurred while sending to the backend."), PSQLState.CONNECTION_FAILURE, e));
        }


Here is a longer StackTrace in case there is anything else of value. We are using BoneCP and Hibernate. The next line in the stack trace would be our application code.

   java.lang.Thread.State: RUNNABLE
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(Unknown Source)
at java.net.SocketOutputStream.write(Unknown Source)
at java.io.BufferedOutputStream.flushBuffer(Unknown Source)
at java.io.BufferedOutputStream.flush(Unknown Source)
- locked <0x0000000700742e30> (a java.io.BufferedOutputStream)
at org.postgresql.core.PGStream.flush(PGStream.java:518)
at org.postgresql.core.v3.ProtocolConnectionImpl.close(ProtocolConnectionImpl.java:136)
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:280)
- locked <0x0000000700742fd0> (a org.postgresql.core.v3.QueryExecutorImpl)
at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:547)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:417)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:302)
at com.jolbox.bonecp.PreparedStatementHandle.executeQuery(PreparedStatementHandle.java:174)
at org.hibernate.engine.jdbc.internal.ResultSetReturnImpl.extract(ResultSetReturnImpl.java:56)
at org.hibernate.loader.Loader.getResultSet(Loader.java:2040)
at org.hibernate.loader.Loader.executeQueryStatement(Loader.java:1837)
at org.hibernate.loader.Loader.executeQueryStatement(Loader.java:1816)
at org.hibernate.loader.Loader.doQuery(Loader.java:900)
at org.hibernate.loader.Loader.doQueryAndInitializeNonLazyCollections(Loader.java:342)
at org.hibernate.loader.Loader.doList(Loader.java:2526)
at org.hibernate.loader.Loader.doList(Loader.java:2512)
at org.hibernate.loader.Loader.listIgnoreQueryCache(Loader.java:2342)
at org.hibernate.loader.Loader.list(Loader.java:2337)
at org.hibernate.loader.criteria.CriteriaLoader.list(CriteriaLoader.java:124)
at org.hibernate.internal.SessionImpl.list(SessionImpl.java:1662)
at org.hibernate.internal.CriteriaImpl.list(CriteriaImpl.java:374)

Regards,

Leonardo

On 7 October 2015 at 16:01, Dave Cramer <pg@fastcrypt.com> wrote:
Leonardo,

At this point I'm wondering if we get an I/O error if it is feasible to attempt to close it. I am leaning towards no.

What is above this in the stack trace. Do you see any information about the IOException error that caused it

Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca

On 7 October 2015 at 14:43, Leonardo Frittelli <leonardo.frittelli@gmail.com> wrote:
Dave,

Thank you very much for your prompt response.
I am using version postgresql-9.3-1101, but I have checked the latest available 9.4 sources and it appears to have the same code (the code example I pasted was from v2, but I see similar code in v3).

Could it be that in this particular scenario we should use ProtocolConnectionImpl.abort() instead of close()?

Thanks,

Leonardo

On 7 October 2015 at 15:23, Dave Cramer <pg@fastcrypt.com> wrote:
It could be a problem. What version are you using ?

Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca

On 7 October 2015 at 12:33, Leonardo Frittelli <leonardo.frittelli@gmail.com> wrote:
Hi,

We are experiencing very frequent deadlocks in pgsql jdbc connections. The scenario is a replicated database with hot_standby = on

At times of high volumes of queries in both the primary and the replicated server, we sometimes get the following log indicating that the server has terminated the connection in the replicated database.
FATAL:  terminating connection due to conflict with recovery
DETAIL:  User query might have needed to see row versions that must be removed.
HINT:  In a moment you should be able to reconnect to the database and repeat your command.

This is expected and we see no issue with that. What I did not expect, however, is that in the JDBC client side, the connection is deadlocked.

java.lang.Thread.State: RUNNABLE
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(Unknown Source)
at java.net.SocketOutputStream.write(Unknown Source)
at java.io.BufferedOutputStream.flushBuffer(Unknown Source)
at java.io.BufferedOutputStream.flush(Unknown Source)
- locked <0x0000000700742e30> (a java.io.BufferedOutputStream)
at org.postgresql.core.PGStream.flush(PGStream.java:518)
at org.postgresql.core.v3.ProtocolConnectionImpl.close(ProtocolConnectionImpl.java:136)
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:280)
- locked <0x0000000700742fd0> (a org.postgresql.core.v3.QueryExecutorImpl)
at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:547)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:417)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:302)
...

Looking at the Postgres JDBC code, I notice that ProtocolConnectionImpl.close() (invoked by the exception handler in QueryExecutorImpl.execute) is trying to 'gracefully close' by sending an 'X' to the server before actually closing the socket.

...
if (logger.logDebug()) logger.debug(" FE=> Terminate"); pgStream.SendChar('X'); pgStream.flush(); pgStream.close(); ...

Does this make sense in a scenario of a connection which has already been terminated by the server side? 

At times of high load, all connections in the pool get eventually locked with exactly the same stack trace. 

I'd appreciate any advice on how to handle this. Could this be a bug in the JDBC driver?

Thanks,

Leonardo





Re: Connection terminated by the server causes deadlock in jdbc client side connection

От
Leonardo Frittelli
Дата:
I'm actually giving it a go now with abort(). If this works I'll let you know.

On 7 October 2015 at 16:15, Dave Cramer <pg@fastcrypt.com> wrote:
OK, let me see if I can change the code and give you a driver that will print out and not call close.

Dave

Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca

On 7 October 2015 at 15:08, Leonardo Frittelli <leonardo.frittelli@gmail.com> wrote:
Dave,

Sadly ExecutorQueryImpl will only print out the exception _after_ closing the connection :)

        catch (IOException e)
        {
            protoConnection.close();
            handler.handleError(new PSQLException(GT.tr("An I/O error occurred while sending to the backend."), PSQLState.CONNECTION_FAILURE, e));
        }


Here is a longer StackTrace in case there is anything else of value. We are using BoneCP and Hibernate. The next line in the stack trace would be our application code.

   java.lang.Thread.State: RUNNABLE
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(Unknown Source)
at java.net.SocketOutputStream.write(Unknown Source)
at java.io.BufferedOutputStream.flushBuffer(Unknown Source)
at java.io.BufferedOutputStream.flush(Unknown Source)
- locked <0x0000000700742e30> (a java.io.BufferedOutputStream)
at org.postgresql.core.PGStream.flush(PGStream.java:518)
at org.postgresql.core.v3.ProtocolConnectionImpl.close(ProtocolConnectionImpl.java:136)
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:280)
- locked <0x0000000700742fd0> (a org.postgresql.core.v3.QueryExecutorImpl)
at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:547)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:417)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:302)
at com.jolbox.bonecp.PreparedStatementHandle.executeQuery(PreparedStatementHandle.java:174)
at org.hibernate.engine.jdbc.internal.ResultSetReturnImpl.extract(ResultSetReturnImpl.java:56)
at org.hibernate.loader.Loader.getResultSet(Loader.java:2040)
at org.hibernate.loader.Loader.executeQueryStatement(Loader.java:1837)
at org.hibernate.loader.Loader.executeQueryStatement(Loader.java:1816)
at org.hibernate.loader.Loader.doQuery(Loader.java:900)
at org.hibernate.loader.Loader.doQueryAndInitializeNonLazyCollections(Loader.java:342)
at org.hibernate.loader.Loader.doList(Loader.java:2526)
at org.hibernate.loader.Loader.doList(Loader.java:2512)
at org.hibernate.loader.Loader.listIgnoreQueryCache(Loader.java:2342)
at org.hibernate.loader.Loader.list(Loader.java:2337)
at org.hibernate.loader.criteria.CriteriaLoader.list(CriteriaLoader.java:124)
at org.hibernate.internal.SessionImpl.list(SessionImpl.java:1662)
at org.hibernate.internal.CriteriaImpl.list(CriteriaImpl.java:374)

Regards,

Leonardo

On 7 October 2015 at 16:01, Dave Cramer <pg@fastcrypt.com> wrote:
Leonardo,

At this point I'm wondering if we get an I/O error if it is feasible to attempt to close it. I am leaning towards no.

What is above this in the stack trace. Do you see any information about the IOException error that caused it

Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca

On 7 October 2015 at 14:43, Leonardo Frittelli <leonardo.frittelli@gmail.com> wrote:
Dave,

Thank you very much for your prompt response.
I am using version postgresql-9.3-1101, but I have checked the latest available 9.4 sources and it appears to have the same code (the code example I pasted was from v2, but I see similar code in v3).

Could it be that in this particular scenario we should use ProtocolConnectionImpl.abort() instead of close()?

Thanks,

Leonardo

On 7 October 2015 at 15:23, Dave Cramer <pg@fastcrypt.com> wrote:
It could be a problem. What version are you using ?

Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca

On 7 October 2015 at 12:33, Leonardo Frittelli <leonardo.frittelli@gmail.com> wrote:
Hi,

We are experiencing very frequent deadlocks in pgsql jdbc connections. The scenario is a replicated database with hot_standby = on

At times of high volumes of queries in both the primary and the replicated server, we sometimes get the following log indicating that the server has terminated the connection in the replicated database.
FATAL:  terminating connection due to conflict with recovery
DETAIL:  User query might have needed to see row versions that must be removed.
HINT:  In a moment you should be able to reconnect to the database and repeat your command.

This is expected and we see no issue with that. What I did not expect, however, is that in the JDBC client side, the connection is deadlocked.

java.lang.Thread.State: RUNNABLE
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(Unknown Source)
at java.net.SocketOutputStream.write(Unknown Source)
at java.io.BufferedOutputStream.flushBuffer(Unknown Source)
at java.io.BufferedOutputStream.flush(Unknown Source)
- locked <0x0000000700742e30> (a java.io.BufferedOutputStream)
at org.postgresql.core.PGStream.flush(PGStream.java:518)
at org.postgresql.core.v3.ProtocolConnectionImpl.close(ProtocolConnectionImpl.java:136)
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:280)
- locked <0x0000000700742fd0> (a org.postgresql.core.v3.QueryExecutorImpl)
at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:547)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:417)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:302)
...

Looking at the Postgres JDBC code, I notice that ProtocolConnectionImpl.close() (invoked by the exception handler in QueryExecutorImpl.execute) is trying to 'gracefully close' by sending an 'X' to the server before actually closing the socket.

...
if (logger.logDebug()) logger.debug(" FE=> Terminate"); pgStream.SendChar('X'); pgStream.flush(); pgStream.close(); ...

Does this make sense in a scenario of a connection which has already been terminated by the server side? 

At times of high load, all connections in the pool get eventually locked with exactly the same stack trace. 

I'd appreciate any advice on how to handle this. Could this be a bug in the JDBC driver?

Thanks,

Leonardo






Re: Connection terminated by the server causes deadlock in jdbc client side connection

От
Leonardo Frittelli
Дата:
Dave,

I've now replaced all protoConnection.close() by protoConnection.abort() in all IO Exception handlers for QueryExecutorImpl (v2 and v3 just in case) and that appears to have done the trick. 

With exactly the same load test, I would before lock out all 45 connections in my pool, but now they get released nicely. Once the test finishes, they are all released, when before they would have remained deadlocked.

I really think this is a bug then. Would you kindly indicate further steps?

Thanks,

Leonardo



On 7 October 2015 at 16:27, Leonardo Frittelli <leonardo.frittelli@gmail.com> wrote:
I'm actually giving it a go now with abort(). If this works I'll let you know.

On 7 October 2015 at 16:15, Dave Cramer <pg@fastcrypt.com> wrote:
OK, let me see if I can change the code and give you a driver that will print out and not call close.

Dave

Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca

On 7 October 2015 at 15:08, Leonardo Frittelli <leonardo.frittelli@gmail.com> wrote:
Dave,

Sadly ExecutorQueryImpl will only print out the exception _after_ closing the connection :)

        catch (IOException e)
        {
            protoConnection.close();
            handler.handleError(new PSQLException(GT.tr("An I/O error occurred while sending to the backend."), PSQLState.CONNECTION_FAILURE, e));
        }


Here is a longer StackTrace in case there is anything else of value. We are using BoneCP and Hibernate. The next line in the stack trace would be our application code.

   java.lang.Thread.State: RUNNABLE
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(Unknown Source)
at java.net.SocketOutputStream.write(Unknown Source)
at java.io.BufferedOutputStream.flushBuffer(Unknown Source)
at java.io.BufferedOutputStream.flush(Unknown Source)
- locked <0x0000000700742e30> (a java.io.BufferedOutputStream)
at org.postgresql.core.PGStream.flush(PGStream.java:518)
at org.postgresql.core.v3.ProtocolConnectionImpl.close(ProtocolConnectionImpl.java:136)
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:280)
- locked <0x0000000700742fd0> (a org.postgresql.core.v3.QueryExecutorImpl)
at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:547)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:417)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:302)
at com.jolbox.bonecp.PreparedStatementHandle.executeQuery(PreparedStatementHandle.java:174)
at org.hibernate.engine.jdbc.internal.ResultSetReturnImpl.extract(ResultSetReturnImpl.java:56)
at org.hibernate.loader.Loader.getResultSet(Loader.java:2040)
at org.hibernate.loader.Loader.executeQueryStatement(Loader.java:1837)
at org.hibernate.loader.Loader.executeQueryStatement(Loader.java:1816)
at org.hibernate.loader.Loader.doQuery(Loader.java:900)
at org.hibernate.loader.Loader.doQueryAndInitializeNonLazyCollections(Loader.java:342)
at org.hibernate.loader.Loader.doList(Loader.java:2526)
at org.hibernate.loader.Loader.doList(Loader.java:2512)
at org.hibernate.loader.Loader.listIgnoreQueryCache(Loader.java:2342)
at org.hibernate.loader.Loader.list(Loader.java:2337)
at org.hibernate.loader.criteria.CriteriaLoader.list(CriteriaLoader.java:124)
at org.hibernate.internal.SessionImpl.list(SessionImpl.java:1662)
at org.hibernate.internal.CriteriaImpl.list(CriteriaImpl.java:374)

Regards,

Leonardo

On 7 October 2015 at 16:01, Dave Cramer <pg@fastcrypt.com> wrote:
Leonardo,

At this point I'm wondering if we get an I/O error if it is feasible to attempt to close it. I am leaning towards no.

What is above this in the stack trace. Do you see any information about the IOException error that caused it

Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca

On 7 October 2015 at 14:43, Leonardo Frittelli <leonardo.frittelli@gmail.com> wrote:
Dave,

Thank you very much for your prompt response.
I am using version postgresql-9.3-1101, but I have checked the latest available 9.4 sources and it appears to have the same code (the code example I pasted was from v2, but I see similar code in v3).

Could it be that in this particular scenario we should use ProtocolConnectionImpl.abort() instead of close()?

Thanks,

Leonardo

On 7 October 2015 at 15:23, Dave Cramer <pg@fastcrypt.com> wrote:
It could be a problem. What version are you using ?

Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca

On 7 October 2015 at 12:33, Leonardo Frittelli <leonardo.frittelli@gmail.com> wrote:
Hi,

We are experiencing very frequent deadlocks in pgsql jdbc connections. The scenario is a replicated database with hot_standby = on

At times of high volumes of queries in both the primary and the replicated server, we sometimes get the following log indicating that the server has terminated the connection in the replicated database.
FATAL:  terminating connection due to conflict with recovery
DETAIL:  User query might have needed to see row versions that must be removed.
HINT:  In a moment you should be able to reconnect to the database and repeat your command.

This is expected and we see no issue with that. What I did not expect, however, is that in the JDBC client side, the connection is deadlocked.

java.lang.Thread.State: RUNNABLE
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(Unknown Source)
at java.net.SocketOutputStream.write(Unknown Source)
at java.io.BufferedOutputStream.flushBuffer(Unknown Source)
at java.io.BufferedOutputStream.flush(Unknown Source)
- locked <0x0000000700742e30> (a java.io.BufferedOutputStream)
at org.postgresql.core.PGStream.flush(PGStream.java:518)
at org.postgresql.core.v3.ProtocolConnectionImpl.close(ProtocolConnectionImpl.java:136)
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:280)
- locked <0x0000000700742fd0> (a org.postgresql.core.v3.QueryExecutorImpl)
at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:547)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:417)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:302)
...

Looking at the Postgres JDBC code, I notice that ProtocolConnectionImpl.close() (invoked by the exception handler in QueryExecutorImpl.execute) is trying to 'gracefully close' by sending an 'X' to the server before actually closing the socket.

...
if (logger.logDebug()) logger.debug(" FE=> Terminate"); pgStream.SendChar('X'); pgStream.flush(); pgStream.close(); ...

Does this make sense in a scenario of a connection which has already been terminated by the server side? 

At times of high load, all connections in the pool get eventually locked with exactly the same stack trace. 

I'd appreciate any advice on how to handle this. Could this be a bug in the JDBC driver?

Thanks,

Leonardo







Re: Connection terminated by the server causes deadlock in jdbc client side connection

От
Dave Cramer
Дата:
Can you send me a patch or better yet a Pull Request on github ?

Dave

Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca

On 7 October 2015 at 16:06, Leonardo Frittelli <leonardo.frittelli@gmail.com> wrote:
Dave,

I've now replaced all protoConnection.close() by protoConnection.abort() in all IO Exception handlers for QueryExecutorImpl (v2 and v3 just in case) and that appears to have done the trick. 

With exactly the same load test, I would before lock out all 45 connections in my pool, but now they get released nicely. Once the test finishes, they are all released, when before they would have remained deadlocked.

I really think this is a bug then. Would you kindly indicate further steps?

Thanks,

Leonardo



On 7 October 2015 at 16:27, Leonardo Frittelli <leonardo.frittelli@gmail.com> wrote:
I'm actually giving it a go now with abort(). If this works I'll let you know.

On 7 October 2015 at 16:15, Dave Cramer <pg@fastcrypt.com> wrote:
OK, let me see if I can change the code and give you a driver that will print out and not call close.

Dave

Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca

On 7 October 2015 at 15:08, Leonardo Frittelli <leonardo.frittelli@gmail.com> wrote:
Dave,

Sadly ExecutorQueryImpl will only print out the exception _after_ closing the connection :)

        catch (IOException e)
        {
            protoConnection.close();
            handler.handleError(new PSQLException(GT.tr("An I/O error occurred while sending to the backend."), PSQLState.CONNECTION_FAILURE, e));
        }


Here is a longer StackTrace in case there is anything else of value. We are using BoneCP and Hibernate. The next line in the stack trace would be our application code.

   java.lang.Thread.State: RUNNABLE
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(Unknown Source)
at java.net.SocketOutputStream.write(Unknown Source)
at java.io.BufferedOutputStream.flushBuffer(Unknown Source)
at java.io.BufferedOutputStream.flush(Unknown Source)
- locked <0x0000000700742e30> (a java.io.BufferedOutputStream)
at org.postgresql.core.PGStream.flush(PGStream.java:518)
at org.postgresql.core.v3.ProtocolConnectionImpl.close(ProtocolConnectionImpl.java:136)
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:280)
- locked <0x0000000700742fd0> (a org.postgresql.core.v3.QueryExecutorImpl)
at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:547)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:417)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:302)
at com.jolbox.bonecp.PreparedStatementHandle.executeQuery(PreparedStatementHandle.java:174)
at org.hibernate.engine.jdbc.internal.ResultSetReturnImpl.extract(ResultSetReturnImpl.java:56)
at org.hibernate.loader.Loader.getResultSet(Loader.java:2040)
at org.hibernate.loader.Loader.executeQueryStatement(Loader.java:1837)
at org.hibernate.loader.Loader.executeQueryStatement(Loader.java:1816)
at org.hibernate.loader.Loader.doQuery(Loader.java:900)
at org.hibernate.loader.Loader.doQueryAndInitializeNonLazyCollections(Loader.java:342)
at org.hibernate.loader.Loader.doList(Loader.java:2526)
at org.hibernate.loader.Loader.doList(Loader.java:2512)
at org.hibernate.loader.Loader.listIgnoreQueryCache(Loader.java:2342)
at org.hibernate.loader.Loader.list(Loader.java:2337)
at org.hibernate.loader.criteria.CriteriaLoader.list(CriteriaLoader.java:124)
at org.hibernate.internal.SessionImpl.list(SessionImpl.java:1662)
at org.hibernate.internal.CriteriaImpl.list(CriteriaImpl.java:374)

Regards,

Leonardo

On 7 October 2015 at 16:01, Dave Cramer <pg@fastcrypt.com> wrote:
Leonardo,

At this point I'm wondering if we get an I/O error if it is feasible to attempt to close it. I am leaning towards no.

What is above this in the stack trace. Do you see any information about the IOException error that caused it

Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca

On 7 October 2015 at 14:43, Leonardo Frittelli <leonardo.frittelli@gmail.com> wrote:
Dave,

Thank you very much for your prompt response.
I am using version postgresql-9.3-1101, but I have checked the latest available 9.4 sources and it appears to have the same code (the code example I pasted was from v2, but I see similar code in v3).

Could it be that in this particular scenario we should use ProtocolConnectionImpl.abort() instead of close()?

Thanks,

Leonardo

On 7 October 2015 at 15:23, Dave Cramer <pg@fastcrypt.com> wrote:
It could be a problem. What version are you using ?

Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca

On 7 October 2015 at 12:33, Leonardo Frittelli <leonardo.frittelli@gmail.com> wrote:
Hi,

We are experiencing very frequent deadlocks in pgsql jdbc connections. The scenario is a replicated database with hot_standby = on

At times of high volumes of queries in both the primary and the replicated server, we sometimes get the following log indicating that the server has terminated the connection in the replicated database.
FATAL:  terminating connection due to conflict with recovery
DETAIL:  User query might have needed to see row versions that must be removed.
HINT:  In a moment you should be able to reconnect to the database and repeat your command.

This is expected and we see no issue with that. What I did not expect, however, is that in the JDBC client side, the connection is deadlocked.

java.lang.Thread.State: RUNNABLE
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(Unknown Source)
at java.net.SocketOutputStream.write(Unknown Source)
at java.io.BufferedOutputStream.flushBuffer(Unknown Source)
at java.io.BufferedOutputStream.flush(Unknown Source)
- locked <0x0000000700742e30> (a java.io.BufferedOutputStream)
at org.postgresql.core.PGStream.flush(PGStream.java:518)
at org.postgresql.core.v3.ProtocolConnectionImpl.close(ProtocolConnectionImpl.java:136)
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:280)
- locked <0x0000000700742fd0> (a org.postgresql.core.v3.QueryExecutorImpl)
at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:547)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:417)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:302)
...

Looking at the Postgres JDBC code, I notice that ProtocolConnectionImpl.close() (invoked by the exception handler in QueryExecutorImpl.execute) is trying to 'gracefully close' by sending an 'X' to the server before actually closing the socket.

...
if (logger.logDebug()) logger.debug(" FE=> Terminate"); pgStream.SendChar('X'); pgStream.flush(); pgStream.close(); ...

Does this make sense in a scenario of a connection which has already been terminated by the server side? 

At times of high load, all connections in the pool get eventually locked with exactly the same stack trace. 

I'd appreciate any advice on how to handle this. Could this be a bug in the JDBC driver?

Thanks,

Leonardo








Re: Connection terminated by the server causes deadlock in jdbc client side connection

От
Dave Cramer
Дата:
Hi Leonardo,

So this works fine ? I coded around it as well, but I'd like to see your changes

Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca

On 7 October 2015 at 17:17, Dave Cramer <pg@fastcrypt.com> wrote:
Can you send me a patch or better yet a Pull Request on github ?

Dave

Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca

On 7 October 2015 at 16:06, Leonardo Frittelli <leonardo.frittelli@gmail.com> wrote:
Dave,

I've now replaced all protoConnection.close() by protoConnection.abort() in all IO Exception handlers for QueryExecutorImpl (v2 and v3 just in case) and that appears to have done the trick. 

With exactly the same load test, I would before lock out all 45 connections in my pool, but now they get released nicely. Once the test finishes, they are all released, when before they would have remained deadlocked.

I really think this is a bug then. Would you kindly indicate further steps?

Thanks,

Leonardo



On 7 October 2015 at 16:27, Leonardo Frittelli <leonardo.frittelli@gmail.com> wrote:
I'm actually giving it a go now with abort(). If this works I'll let you know.

On 7 October 2015 at 16:15, Dave Cramer <pg@fastcrypt.com> wrote:
OK, let me see if I can change the code and give you a driver that will print out and not call close.

Dave

Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca

On 7 October 2015 at 15:08, Leonardo Frittelli <leonardo.frittelli@gmail.com> wrote:
Dave,

Sadly ExecutorQueryImpl will only print out the exception _after_ closing the connection :)

        catch (IOException e)
        {
            protoConnection.close();
            handler.handleError(new PSQLException(GT.tr("An I/O error occurred while sending to the backend."), PSQLState.CONNECTION_FAILURE, e));
        }


Here is a longer StackTrace in case there is anything else of value. We are using BoneCP and Hibernate. The next line in the stack trace would be our application code.

   java.lang.Thread.State: RUNNABLE
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(Unknown Source)
at java.net.SocketOutputStream.write(Unknown Source)
at java.io.BufferedOutputStream.flushBuffer(Unknown Source)
at java.io.BufferedOutputStream.flush(Unknown Source)
- locked <0x0000000700742e30> (a java.io.BufferedOutputStream)
at org.postgresql.core.PGStream.flush(PGStream.java:518)
at org.postgresql.core.v3.ProtocolConnectionImpl.close(ProtocolConnectionImpl.java:136)
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:280)
- locked <0x0000000700742fd0> (a org.postgresql.core.v3.QueryExecutorImpl)
at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:547)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:417)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:302)
at com.jolbox.bonecp.PreparedStatementHandle.executeQuery(PreparedStatementHandle.java:174)
at org.hibernate.engine.jdbc.internal.ResultSetReturnImpl.extract(ResultSetReturnImpl.java:56)
at org.hibernate.loader.Loader.getResultSet(Loader.java:2040)
at org.hibernate.loader.Loader.executeQueryStatement(Loader.java:1837)
at org.hibernate.loader.Loader.executeQueryStatement(Loader.java:1816)
at org.hibernate.loader.Loader.doQuery(Loader.java:900)
at org.hibernate.loader.Loader.doQueryAndInitializeNonLazyCollections(Loader.java:342)
at org.hibernate.loader.Loader.doList(Loader.java:2526)
at org.hibernate.loader.Loader.doList(Loader.java:2512)
at org.hibernate.loader.Loader.listIgnoreQueryCache(Loader.java:2342)
at org.hibernate.loader.Loader.list(Loader.java:2337)
at org.hibernate.loader.criteria.CriteriaLoader.list(CriteriaLoader.java:124)
at org.hibernate.internal.SessionImpl.list(SessionImpl.java:1662)
at org.hibernate.internal.CriteriaImpl.list(CriteriaImpl.java:374)

Regards,

Leonardo

On 7 October 2015 at 16:01, Dave Cramer <pg@fastcrypt.com> wrote:
Leonardo,

At this point I'm wondering if we get an I/O error if it is feasible to attempt to close it. I am leaning towards no.

What is above this in the stack trace. Do you see any information about the IOException error that caused it

Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca

On 7 October 2015 at 14:43, Leonardo Frittelli <leonardo.frittelli@gmail.com> wrote:
Dave,

Thank you very much for your prompt response.
I am using version postgresql-9.3-1101, but I have checked the latest available 9.4 sources and it appears to have the same code (the code example I pasted was from v2, but I see similar code in v3).

Could it be that in this particular scenario we should use ProtocolConnectionImpl.abort() instead of close()?

Thanks,

Leonardo

On 7 October 2015 at 15:23, Dave Cramer <pg@fastcrypt.com> wrote:
It could be a problem. What version are you using ?

Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca

On 7 October 2015 at 12:33, Leonardo Frittelli <leonardo.frittelli@gmail.com> wrote:
Hi,

We are experiencing very frequent deadlocks in pgsql jdbc connections. The scenario is a replicated database with hot_standby = on

At times of high volumes of queries in both the primary and the replicated server, we sometimes get the following log indicating that the server has terminated the connection in the replicated database.
FATAL:  terminating connection due to conflict with recovery
DETAIL:  User query might have needed to see row versions that must be removed.
HINT:  In a moment you should be able to reconnect to the database and repeat your command.

This is expected and we see no issue with that. What I did not expect, however, is that in the JDBC client side, the connection is deadlocked.

java.lang.Thread.State: RUNNABLE
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(Unknown Source)
at java.net.SocketOutputStream.write(Unknown Source)
at java.io.BufferedOutputStream.flushBuffer(Unknown Source)
at java.io.BufferedOutputStream.flush(Unknown Source)
- locked <0x0000000700742e30> (a java.io.BufferedOutputStream)
at org.postgresql.core.PGStream.flush(PGStream.java:518)
at org.postgresql.core.v3.ProtocolConnectionImpl.close(ProtocolConnectionImpl.java:136)
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:280)
- locked <0x0000000700742fd0> (a org.postgresql.core.v3.QueryExecutorImpl)
at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:547)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:417)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:302)
...

Looking at the Postgres JDBC code, I notice that ProtocolConnectionImpl.close() (invoked by the exception handler in QueryExecutorImpl.execute) is trying to 'gracefully close' by sending an 'X' to the server before actually closing the socket.

...
if (logger.logDebug()) logger.debug(" FE=> Terminate"); pgStream.SendChar('X'); pgStream.flush(); pgStream.close(); ...

Does this make sense in a scenario of a connection which has already been terminated by the server side? 

At times of high load, all connections in the pool get eventually locked with exactly the same stack trace. 

I'd appreciate any advice on how to handle this. Could this be a bug in the JDBC driver?

Thanks,

Leonardo









Re: Connection terminated by the server causes deadlock in jdbc client side connection

От
"Steffen Heil (Mailinglisten)"
Дата:
Hi


Do have a deadlock, you need at least two threads.
Also they both need to be BLOCKED, not RUNNABLE.
socketWrite should throw an exception, if the connection is closed for sending.

So I assume the server closed the connection for transfer from the server to the client, but the connection remained
half-openfrom the client to the server. 
Then on the other hand, you write should have gone through...


What locks here and why?


Regards,
  Steffen



-----Ursprüngliche Nachricht-----
Von: pgsql-jdbc-owner@postgresql.org [mailto:pgsql-jdbc-owner@postgresql.org] Im Auftrag von Leonardo Frittelli
Gesendet: Mittwoch, 7. Oktober 2015 18:33
An: pgsql-jdbc@postgresql.org
Betreff: [JDBC] Connection terminated by the server causes deadlock in jdbc client side connection

Hi,

We are experiencing very frequent deadlocks in pgsql jdbc connections. The scenario is a replicated database with
hot_standby= on 

At times of high volumes of queries in both the primary and the replicated server, we sometimes get the following log
indicatingthat the server has terminated the connection in the replicated database. 
FATAL:  terminating connection due to conflict with recovery
DETAIL:  User query might have needed to see row versions that must be removed.
HINT:  In a moment you should be able to reconnect to the database and repeat your command.

This is expected and we see no issue with that. What I did not expect, however, is that in the JDBC client side, the
connectionis deadlocked. 

java.lang.Thread.State: RUNNABLE
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(Unknown Source)
at java.net.SocketOutputStream.write(Unknown Source)
at java.io.BufferedOutputStream.flushBuffer(Unknown Source)
at java.io.BufferedOutputStream.flush(Unknown Source)
- locked <0x0000000700742e30> (a java.io.BufferedOutputStream)
at org.postgresql.core.PGStream.flush(PGStream.java:518)
at org.postgresql.core.v3.ProtocolConnectionImpl.close(ProtocolConnectionImpl.java:136)
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:280)
- locked <0x0000000700742fd0> (a org.postgresql.core.v3.QueryExecutorImpl)
at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:547)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:417)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:302)
...


Looking at the Postgres JDBC code, I notice that ProtocolConnectionImpl.close() (invoked by the exception handler in
QueryExecutorImpl.execute)is trying to 'gracefully close' by sending an 'X' to the server before actually closing the
socket. 


...
if (logger.logDebug()) logger.debug(" FE=> Terminate"); pgStream.SendChar('X'); pgStream.flush(); pgStream.close(); ...

Does this make sense in a scenario of a connection which has already been terminated by the server side?

At times of high load, all connections in the pool get eventually locked with exactly the same stack trace.

I'd appreciate any advice on how to handle this. Could this be a bug in the JDBC driver?

Thanks,

Leonardo

Вложения

Re: Connection terminated by the server causes deadlock in jdbc client side connection

От
Dave Cramer
Дата:

On 8 October 2015 at 08:47, Steffen Heil (Mailinglisten) <lists@steffen-heil.de> wrote:
Hi


Do have a deadlock, you need at least two threads.
Also they both need to be BLOCKED, not RUNNABLE.
socketWrite should throw an exception, if the connection is closed for sending.

So I assume the server closed the connection for transfer from the server to the client, but the connection remained half-open from the client to the server.
Then on the other hand, you write should have gone through...

What happens if the server just discards the connection. In other words it doesn't close it. It just terminates without closing any open connections ?


Dave
 

What locks here and why?


Regards,
  Steffen



-----Ursprüngliche Nachricht-----
Von: pgsql-jdbc-owner@postgresql.org [mailto:pgsql-jdbc-owner@postgresql.org] Im Auftrag von Leonardo Frittelli
Gesendet: Mittwoch, 7. Oktober 2015 18:33
An: pgsql-jdbc@postgresql.org
Betreff: [JDBC] Connection terminated by the server causes deadlock in jdbc client side connection

Hi,

We are experiencing very frequent deadlocks in pgsql jdbc connections. The scenario is a replicated database with hot_standby = on

At times of high volumes of queries in both the primary and the replicated server, we sometimes get the following log indicating that the server has terminated the connection in the replicated database.
FATAL:  terminating connection due to conflict with recovery
DETAIL:  User query might have needed to see row versions that must be removed.
HINT:  In a moment you should be able to reconnect to the database and repeat your command.

This is expected and we see no issue with that. What I did not expect, however, is that in the JDBC client side, the connection is deadlocked.

java.lang.Thread.State: RUNNABLE
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(Unknown Source)
at java.net.SocketOutputStream.write(Unknown Source)
at java.io.BufferedOutputStream.flushBuffer(Unknown Source)
at java.io.BufferedOutputStream.flush(Unknown Source)
- locked <0x0000000700742e30> (a java.io.BufferedOutputStream)
at org.postgresql.core.PGStream.flush(PGStream.java:518)
at org.postgresql.core.v3.ProtocolConnectionImpl.close(ProtocolConnectionImpl.java:136)
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:280)
- locked <0x0000000700742fd0> (a org.postgresql.core.v3.QueryExecutorImpl)
at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:547)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:417)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:302)
...


Looking at the Postgres JDBC code, I notice that ProtocolConnectionImpl.close() (invoked by the exception handler in QueryExecutorImpl.execute) is trying to 'gracefully close' by sending an 'X' to the server before actually closing the socket.


...
if (logger.logDebug()) logger.debug(" FE=> Terminate"); pgStream.SendChar('X'); pgStream.flush(); pgStream.close(); ...

Does this make sense in a scenario of a connection which has already been terminated by the server side?

At times of high load, all connections in the pool get eventually locked with exactly the same stack trace.

I'd appreciate any advice on how to handle this. Could this be a bug in the JDBC driver?

Thanks,

Leonardo

Re: Connection terminated by the server causes deadlock in jdbc client side connection

От
Leonardo Frittelli
Дата:
Stephen,

You are correct that this is not a concurrency deadlock. 
I was referring to the fact that the TCP connection is indeed left 'half open' with the jdbc client side still assuming it's 'properly' open and thus trying to send a graceful close command over the socket ("X" in v2 or "X4" in v3) . It is this action that indefinitely blocks the client thread, which is then no longer usable by the application.

Dave's point is a good one too. If the server is not properly closing the connection, this might be causing the lock in the client side. 
By the way, our server is version 9.3. I don't know if this would reproduce as is in 9.4.

All of this said, I still think it is conceptually incorrect to attempt a graceful closure after an IO error in the client.

Regards,

Leonardo


On 8 October 2015 at 09:57, Dave Cramer <pg@fastcrypt.com> wrote:

On 8 October 2015 at 08:47, Steffen Heil (Mailinglisten) <lists@steffen-heil.de> wrote:
Hi


Do have a deadlock, you need at least two threads.
Also they both need to be BLOCKED, not RUNNABLE.
socketWrite should throw an exception, if the connection is closed for sending.

So I assume the server closed the connection for transfer from the server to the client, but the connection remained half-open from the client to the server.
Then on the other hand, you write should have gone through...

What happens if the server just discards the connection. In other words it doesn't close it. It just terminates without closing any open connections ?


Dave
 

What locks here and why?


Regards,
  Steffen



-----Ursprüngliche Nachricht-----
Von: pgsql-jdbc-owner@postgresql.org [mailto:pgsql-jdbc-owner@postgresql.org] Im Auftrag von Leonardo Frittelli
Gesendet: Mittwoch, 7. Oktober 2015 18:33
An: pgsql-jdbc@postgresql.org
Betreff: [JDBC] Connection terminated by the server causes deadlock in jdbc client side connection

Hi,

We are experiencing very frequent deadlocks in pgsql jdbc connections. The scenario is a replicated database with hot_standby = on

At times of high volumes of queries in both the primary and the replicated server, we sometimes get the following log indicating that the server has terminated the connection in the replicated database.
FATAL:  terminating connection due to conflict with recovery
DETAIL:  User query might have needed to see row versions that must be removed.
HINT:  In a moment you should be able to reconnect to the database and repeat your command.

This is expected and we see no issue with that. What I did not expect, however, is that in the JDBC client side, the connection is deadlocked.

java.lang.Thread.State: RUNNABLE
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(Unknown Source)
at java.net.SocketOutputStream.write(Unknown Source)
at java.io.BufferedOutputStream.flushBuffer(Unknown Source)
at java.io.BufferedOutputStream.flush(Unknown Source)
- locked <0x0000000700742e30> (a java.io.BufferedOutputStream)
at org.postgresql.core.PGStream.flush(PGStream.java:518)
at org.postgresql.core.v3.ProtocolConnectionImpl.close(ProtocolConnectionImpl.java:136)
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:280)
- locked <0x0000000700742fd0> (a org.postgresql.core.v3.QueryExecutorImpl)
at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:547)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:417)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:302)
...


Looking at the Postgres JDBC code, I notice that ProtocolConnectionImpl.close() (invoked by the exception handler in QueryExecutorImpl.execute) is trying to 'gracefully close' by sending an 'X' to the server before actually closing the socket.


...
if (logger.logDebug()) logger.debug(" FE=> Terminate"); pgStream.SendChar('X'); pgStream.flush(); pgStream.close(); ...

Does this make sense in a scenario of a connection which has already been terminated by the server side?

At times of high load, all connections in the pool get eventually locked with exactly the same stack trace.

I'd appreciate any advice on how to handle this. Could this be a bug in the JDBC driver?

Thanks,

Leonardo


Re: Connection terminated by the server causes deadlock in jdbc client side connection

От
"Steffen Heil (Mailinglisten)"
Дата:
Hi


> What happens if the server just discards the connection. In other words it
> doesn't close it. It just terminates without closing any open connections ?

That should only happen, if the servers OS crashes. For a process there is no
way to "discard" a connection. (The server could keep the connection open
without sending or reading, but I would consider that a serious bug.)

And if the server OS crashes and the client tries to send something, a timeout
waiting for the sent packets will occur or if the server rebooted in time, a
RST packet will be sent. Either way, the client will detect that and close the
connection.


Regards,
  Steffen



Вложения

Re: Connection terminated by the server causes deadlock in jdbc client side connection

От
"Steffen Heil (Mailinglisten)"
Дата:
Hi


> I was referring to the fact that the TCP connection is indeed left 'half
> open'
> with the jdbc client side still assuming it's 'properly' open and thus
> trying to
> send a graceful close command over the socket ("X" in v2 or "X4" in v3) . It
> is
> this action that indefinitely blocks the client thread, which is then no
> longer
> usable by the application.

But it should not block indefinitely. Why would it?
If the server closed the connection, it should reply with a RST and the client
should detect that and throw an IOException( "Socket closed" ).


> All of this said, I still think it is conceptually incorrect to attempt a
> graceful
> closure after an IO error in the client.

Agreed. IOExceptions on blocking sockets usually indicate that you cannot do
anything good with them anymore.


Regards,
  Steffen



Вложения

Re: Connection terminated by the server causes deadlock in jdbc client side connection

От
Leonardo Frittelli
Дата:
Steffen,

I can't really comment on what is the behavior of the PG server, but at least in the scenario that I am having (replication taking priority over a local query in hot_standby) either the client does not appear to receive an RST or Java 1.7 is still trying to write to that socket even if the reading end is gone, or even worse: the server is not actually closing it but simply no longer reading (Perhaps this is indeed hinting at a bug in the server side too?)

The stack trace does not give much detail either. This is the new trace I get after patching the driver to use abort() instead of close():

...
        at java.lang.Thread.run(Unknown Source)
Caused by: org.postgresql.util.PSQLException: FATAL: terminating connection due to conflict with recovery
  Detail: User query might have needed to see row versions that must be removed.
  Hint: In a moment you should be able to reconnect to the database and repeat your command.
        at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2198)
        at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1927)
        at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:255)
        at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:562)
        at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:420)
        at org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:305)
        at com.jolbox.bonecp.PreparedStatementHandle.executeQuery(PreparedStatementHandle.java:174)
        at org.hibernate.engine.jdbc.internal.ResultSetReturnImpl.extract(ResultSetReturnImpl.java:56)
        ... 42 more

As you can see, the IO Exception which caused the PSQL Exception is not coming up in the stack trace. 

Just for your awareness, please note that we are now using a snapshot that Dave created for us with the change above mentioned (to use abort() instead of close()) and we no longer see the issue now.

Regards,

Leonardo



On 8 October 2015 at 13:46, Steffen Heil (Mailinglisten) <lists@steffen-heil.de> wrote:
Hi


> I was referring to the fact that the TCP connection is indeed left 'half
> open'
> with the jdbc client side still assuming it's 'properly' open and thus
> trying to
> send a graceful close command over the socket ("X" in v2 or "X4" in v3) . It
> is
> this action that indefinitely blocks the client thread, which is then no
> longer
> usable by the application.

But it should not block indefinitely. Why would it?
If the server closed the connection, it should reply with a RST and the client
should detect that and throw an IOException( "Socket closed" ).


> All of this said, I still think it is conceptually incorrect to attempt a
> graceful
> closure after an IO error in the client.

Agreed. IOExceptions on blocking sockets usually indicate that you cannot do
anything good with them anymore.


Regards,
  Steffen



Re: Connection terminated by the server causes deadlock in jdbc client side connection

От
Donald
Дата:
I think there maybe multiple bugs going on here and possibly not fixed...
First I'd like to say that even though we are dealing with TCP one must treat it almost like UDP. TCP packets can be consumed by firewalls and proxy servers with no response given to the sender. So with that in mind to create a robust fault tolerant driver, timeout on all TCP communications is a must.

1) So in the first case where an IOException is thrown, the original code that is closing the connection via "close" cannot have a timeout feature enabled and that will still exist in the code with the potential to fail...

2) When an IOException is thrown as the origin of an exception this should not be lost and should always be attached to the final exception. I don't see this cascaded in the updated driver using the "abort" code...

Regards
Donald



On 08/10/2015 18:45, Leonardo Frittelli wrote:
Steffen,

I can't really comment on what is the behavior of the PG server, but at least in the scenario that I am having (replication taking priority over a local query in hot_standby) either the client does not appear to receive an RST or Java 1.7 is still trying to write to that socket even if the reading end is gone, or even worse: the server is not actually closing it but simply no longer reading (Perhaps this is indeed hinting at a bug in the server side too?)

The stack trace does not give much detail either. This is the new trace I get after patching the driver to use abort() instead of close():

...
        at java.lang.Thread.run(Unknown Source)
Caused by: org.postgresql.util.PSQLException: FATAL: terminating connection due to conflict with recovery
  Detail: User query might have needed to see row versions that must be removed.
  Hint: In a moment you should be able to reconnect to the database and repeat your command.
        at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2198)
        at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1927)
        at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:255)
        at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:562)
        at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:420)
        at org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:305)
        at com.jolbox.bonecp.PreparedStatementHandle.executeQuery(PreparedStatementHandle.java:174)
        at org.hibernate.engine.jdbc.internal.ResultSetReturnImpl.extract(ResultSetReturnImpl.java:56)
        ... 42 more

As you can see, the IO Exception which caused the PSQL Exception is not coming up in the stack trace. 

Just for your awareness, please note that we are now using a snapshot that Dave created for us with the change above mentioned (to use abort() instead of close()) and we no longer see the issue now.

Regards,

Leonardo



On 8 October 2015 at 13:46, Steffen Heil (Mailinglisten) <lists@steffen-heil.de> wrote:
Hi


> I was referring to the fact that the TCP connection is indeed left 'half
> open'
> with the jdbc client side still assuming it's 'properly' open and thus
> trying to
> send a graceful close command over the socket ("X" in v2 or "X4" in v3) . It
> is
> this action that indefinitely blocks the client thread, which is then no
> longer
> usable by the application.

But it should not block indefinitely. Why would it?
If the server closed the connection, it should reply with a RST and the client
should detect that and throw an IOException( "Socket closed" ).


> All of this said, I still think it is conceptually incorrect to attempt a
> graceful
> closure after an IO error in the client.

Agreed. IOExceptions on blocking sockets usually indicate that you cannot do
anything good with them anymore.


Regards,
  Steffen




Re: Connection terminated by the server causes deadlock in jdbc client side connection

От
lfrittelli
Дата:
Donald,

These are good points all. Let me provide more context for my scenario on
each item
> First I'd like to say that even though we are dealing with TCP one must
> treat it almost like UDP. TCP
> packets can be consumed by firewalls and proxy servers with no response
> given to the sender. So with
> that in mind to create a robust fault tolerant driver, timeout on all TCP
> communications is a must.

True in the general case, but I don't think it would have affected the
specific scenario under issue. For context on the scenario, note that there
was no firewall or proxy. We were using local connections
(jdbc:postgresql://localhost...)

> 1) So in the first case where an IOException is thrown, the original code
> that is closing the connection
> via "close" cannot have a timeout feature enabled and that will still
> exist in the code with the potential
> to fail...

As the 'graceful' close() method will write on the socket, would it be
possible to put a timeout on write? I am no expert but have not been able to
find out how.


> 2) When an IOException is thrown as the origin of an exception this should
> not be lost and should
> always be attached to the final exception. I don't see this cascaded in
> the updated driver using the
> "abort" code...

At least in some of the cases it appears that the original IO error is
consumed and converted to an error SQL state, and not re-thrown.
In other cases there is actually a message from the server advising of the
termination and this is somehow interpreted by the client, thus no IO
exception in those cases.

Here is a more detailed log capture (including some Hibernate entries) -
again, this is after the fix:

(SqlExceptionHelper.java:145) - SQL Error: 0, SQLState: 40001
(SqlExceptionHelper.java:147) - FATAL: terminating connection due to
conflict with recovery
  Detail: User query might have needed to see row versions that must be
removed.
  Hint: In a moment you should be able to reconnect to the database and
repeat your command.
org.hibernate.exception.LockAcquisitionException: could not extract
ResultSet
        at
org.hibernate.exception.internal.SQLStateConversionDelegate.convert(SQLStateConversionDelegate.java:140)
        at
org.hibernate.exception.internal.StandardSQLExceptionConverter.convert(StandardSQLExceptionConverter.java:49)
        at
org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:125)
        at
org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:110)
        at
org.hibernate.engine.jdbc.internal.ResultSetReturnImpl.extract(ResultSetReturnImpl.java:61)
        at org.hibernate.loader.Loader.getResultSet(Loader.java:2040)
        at
org.hibernate.loader.Loader.executeQueryStatement(Loader.java:1837)
        at
org.hibernate.loader.Loader.executeQueryStatement(Loader.java:1816)
        at org.hibernate.loader.Loader.doQuery(Loader.java:900)
        at
org.hibernate.loader.Loader.doQueryAndInitializeNonLazyCollections(Loader.java:342)
        at org.hibernate.loader.Loader.doList(Loader.java:2526)
        at org.hibernate.loader.Loader.doList(Loader.java:2512)
        at
org.hibernate.loader.Loader.listIgnoreQueryCache(Loader.java:2342)
        at org.hibernate.loader.Loader.list(Loader.java:2337)
        at
org.hibernate.loader.criteria.CriteriaLoader.list(CriteriaLoader.java:124)
        at org.hibernate.internal.SessionImpl.list(SessionImpl.java:1662)
        at org.hibernate.internal.CriteriaImpl.list(CriteriaImpl.java:374)
...
Caused by: org.postgresql.util.PSQLException: FATAL: terminating connection
due to conflict with recovery
  Detail: User query might have needed to see row versions that must be
removed.
  Hint: In a moment you should be able to reconnect to the database and
repeat your command.
        at
org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2198)
        at
org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1927)
        at
org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:255)
        at
org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:562)
        at
org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:420)
        at
org.postgresql.jdbc2.AbstractJdbc2Statement.executeQuery(AbstractJdbc2Statement.java:305)
        ... 46 more

Let me know if there is any additional detail that I could provide.

Regards,

Leonardo




--
View this message in context:
http://postgresql.nabble.com/Connection-terminated-by-the-server-causes-deadlock-in-jdbc-client-side-connection-tp5869131p5870008.html
Sent from the PostgreSQL - jdbc mailing list archive at Nabble.com.


Re: Connection terminated by the server causes deadlock in jdbc client side connection

От
Donald
Дата:
Hi Leonardo

On 14/10/2015 22:26, lfrittelli wrote:
> These are good points all. Let me provide more context for my scenario on
> each item
>> >First I'd like to say that even though we are dealing with TCP one must
>> >treat it almost like UDP. TCP
>> >packets can be consumed by firewalls and proxy servers with no response
>> >given to the sender. So with
>> >that in mind to create a robust fault tolerant driver, timeout on all TCP
>> >communications is a must.
> True in the general case, but I don't think it would have affected the
> specific scenario under issue. For context on the scenario, note that there
> was no firewall or proxy. We were using local connections
> (jdbc:postgresql://localhost...)
>
Well in your case the server closed the connection and one can only
assume that it did so by tearing down the socket. That is the shut-down
was not a graceful close so no FIN packet sent to the client and
therefore the client socket has no idea that the server connection has
been terminated. Maybe the server did send a FIN and there is a bug in
the Java socket code, but I doubt it. It really highlights my point that
a timeout should be used for sending data. The socket API supports this
(SO_SNDTIMEO) however there is no such feature within the Oracle Java
implementation, one has to use a third party implementation to find this
feature for example: com/savarese/rocksaw/net/RawSocket. Using Oracle
the only option is to set the SO_KEEPALIVE setting for the socket, for
which there is a PostgreSQL driver parameter setting, and it means the
socket would "eventually" get closed if the server silently closed the
connection. "Eventually" is determined by the default OS settings for
SO_KEEPALIVE.

>> 1) So in the first case where an IOException is thrown, the original code
>> >that is closing the connection
>> >via "close" cannot have a timeout feature enabled and that will still
>> >exist in the code with the potential
>> >to fail...
> As the 'graceful' close() method will write on the socket, would it be
> possible to put a timeout on write? I am no expert but have not been able to
> find out how.
Not possible with the implementation of the current driver and Oracle's
implementation of sockets.

>> 2) When an IOException is thrown as the origin of an exception this should
>> >not be lost and should
>> >always be attached to the final exception. I don't see this cascaded in
>> >the updated driver using the
>> >"abort" code...
> At least in some of the cases it appears that the original IO error is
> consumed and converted to an error SQL state, and not re-thrown.
> In other cases there is actually a message from the server advising of the
> termination and this is somehow interpreted by the client, thus no IO
> exception in those cases.
I'm not sure what is happening, but probably some times you are actually
seeing an IOException generated from the "close" because the real
exception is shown via you latest listing;

> (SqlExceptionHelper.java:145) - SQL Error: 0, SQLState: 40001
> (SqlExceptionHelper.java:147) - FATAL: terminating connection due to
> conflict with recovery
>    Detail: User query might have needed to see row versions that must be
> removed.
>    Hint: In a moment you should be able to reconnect to the database and
> repeat your command.
This shows the real cause of the exception, which is an SQLException,
and we know at this point that the server socket will be closed so
definitely no point in trying to send anything to the server after
receiving this.

So in summary the changes from "close" to "abort" are the right fix,
however until Oracle implement the SO_SNDTIMEO feature we will always
have the possibility of a "hung" socket unless we use the SO_KEEPALIVE
feature, which is not ideal because we have no control over the
intervals that these are sent without altering OS settings somewhere.

Regards
Donald Fraser