Re: Postgres restart during CopyManager.copyIn does not free connection, thread stuck on QueryExecutorImpl.waitOnLock

From: Alexis Meneses
Subject: Re: Postgres restart during CopyManager.copyIn does not free connection, thread stuck on QueryExecutorImpl.waitOnLock
Msg-id: CANPkoZS7jNmPYyrPguw-RHJu1KzXAFtKh7teGNeWQZ_TQGro-A@mail.gmail.com
In reply to: Postgres restart during CopyManager.copyIn does not free connection, thread stuck on QueryExecutorImpl.waitOnLock  (Brendan Reekie <breekie@sandvine.com>)
List: pgsql-jdbc

Hi

I think that a similar issue has been seen already (see thread http://www.postgresql.org/message-id/flat/CADGbXSQ--8pJcSPkC7+tR6rsGrk7p=141Bp16VJiOR5mg_SQpQ@mail.gmail.com) but it has not yet been fixed.

Would you have time to work on a patch and submit a pull request to the GitHub project?

Thanks.

Alexis


2015-02-09 19:38 GMT+01:00 Brendan Reekie <breekie@sandvine.com>:

Hi,

 

I’m currently using the 9.3.1100-jdbc3.jar driver with a 9.3.5 server.

The behaviour I’m seeing is this: if the connection to the database is lost due to a restart of Postgres while a CopyManager.copyIn() call is executing, the connection to the database is never freed, and the stack trace shows that the thread is still waiting for the lock:

 

                java.lang.Object.$$YJP$$wait(Native Method)
                java.lang.Object.wait(Object.java)
                java.lang.Object.wait(Object.java:503)
                org.postgresql.core.v3.QueryExecutorImpl.waitOnLock(QueryExecutorImpl.java:91)
                org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:228)
                org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:560)
                org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:403)
                org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:395)

 

Debugging through the code, it looks like the issue might be in the QueryExecutorImpl.cancelCopy() operation.  When that method attempts to flush pgStream, an IOException is thrown, so the block of code that removes the lock (processCopyResults) is never called; the connection remains open and the lock is never freed.

 

 

    /**
     * Finishes a copy operation and unlocks connection discarding any exchanged data.
     * @param op the copy operation presumably currently holding lock on this connection
     * @throws SQLException on any additional failure
     */
    public void cancelCopy(CopyOperationImpl op) throws SQLException {
        if(!hasLock(op))
            throw new PSQLException(GT.tr("Tried to cancel an inactive copy operation"), PSQLState.OBJECT_NOT_IN_STATE);

        SQLException error = null;
        int errors = 0;

        try {
            if(op instanceof CopyInImpl) {
                synchronized (this) {
                    if (logger.logDebug()) {
                        logger.debug("FE => CopyFail");
                    }
                    final byte[] msg = Utils.encodeUTF8("Copy cancel requested");
                    pgStream.SendChar('f'); // CopyFail
                    pgStream.SendInteger4(5 + msg.length);
                    pgStream.Send(msg);
                    pgStream.SendChar(0);
                    pgStream.flush();
                    do {
                        try {
                            processCopyResults(op, true); // discard rest of input
                        } catch(SQLException se) { // expected error response to failing copy
                            errors++;
                            if( error != null ) {
                                SQLException e = se, next;
                                while( (next = e.getNextException()) != null )
                                    e = next;
                                e.setNextException(error);
                            }
                            error = se;
                        }
                    } while(hasLock(op));
                }
            } else if (op instanceof CopyOutImpl) {
                protoConnection.sendQueryCancel();
            }

        } catch(IOException ioe) {
            throw new PSQLException(GT.tr("Database connection failed when canceling copy operation"), PSQLState.CONNECTION_FAILURE, ioe);
        }

        if (op instanceof CopyInImpl) {
            if(errors < 1) {
                throw new PSQLException(GT.tr("Missing expected error response to copy cancel request"), PSQLState.COMMUNICATION_ERROR);
            } else if(errors > 1) {
                throw new PSQLException(GT.tr("Got {0} error responses to single copy cancel request", String.valueOf(errors)), PSQLState.COMMUNICATION_ERROR, error);
            }
        }
    }
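
The suspected failure mode (an IOException from flush() skipping the code that would release the lock) can be illustrated in isolation. The following is only a minimal sketch using a toy lock holder: the class CopyLockSketch and all of its methods are hypothetical and are not part of the PostgreSQL JDBC driver. It contrasts a cancel path that unlocks only on success with one that unlocks in a finally block. Whether unconditionally releasing the lock is the right fix for the real driver is an assumption that would need to be checked against the driver source.

```java
import java.io.IOException;

// Hypothetical, simplified model of a connection-level copy lock.
// None of these names come from the PostgreSQL JDBC driver.
class CopyLockSketch {
    private Object lockedFor; // current lock owner, or null when unlocked

    synchronized void lock(Object holder) {
        lockedFor = holder;
    }

    synchronized void unlock(Object holder) {
        if (lockedFor == holder) {
            lockedFor = null;
            notifyAll(); // wake any thread waiting for the lock
        }
    }

    synchronized boolean hasLock(Object holder) {
        return lockedFor == holder;
    }

    // Stands in for flushing a stream whose socket is already dead.
    private void flush() throws IOException {
        throw new IOException("connection reset");
    }

    // Mirrors the reported pattern: the exception skips the unlock step.
    void cancelWithoutFinally(Object op) throws IOException {
        flush();
        unlock(op); // not reached when flush() throws
    }

    // Possible alternative: release the lock even if the stream fails.
    void cancelWithFinally(Object op) throws IOException {
        try {
            flush();
        } finally {
            unlock(op);
        }
    }

    // Returns true if the lock is still held after the cancel attempt.
    static boolean lockHeldAfterCancel(boolean useFinally) {
        CopyLockSketch conn = new CopyLockSketch();
        Object op = new Object();
        conn.lock(op);
        try {
            if (useFinally) {
                conn.cancelWithFinally(op);
            } else {
                conn.cancelWithoutFinally(op);
            }
        } catch (IOException expected) {
            // The stream failure still reaches the caller in both variants.
        }
        return conn.hasLock(op);
    }

    public static void main(String[] args) {
        System.out.println("lock held without finally: " + lockHeldAfterCancel(false));
        System.out.println("lock held with finally: " + lockHeldAfterCancel(true));
    }
}
```

In this toy model the first variant leaves the lock held after the I/O failure, which matches a thread blocking indefinitely in a wait-for-lock call, while the second variant releases it.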

 

I’ve tried the latest driver, 9.4-1200, and observed the same behaviour.  To reproduce this, I’m using a test program that streams data through copyIn; I set a breakpoint and restart the Postgres server while the copyIn is in progress.

 

Has anyone seen this issue previously?  Is there a workaround for this scenario?

 

Thanks in advance,

Brendan

