Thread: [BUGS] BUG #14516: misleading error from libpq on out-of-memory

[BUGS] BUG #14516: misleading error from libpq on out-of-memory

From
andrew@tao11.riddles.org.uk
Date:
The following bug has been logged on the website:

Bug reference:      14516
Logged by:          Andrew Gierth
Email address:      andrew@tao11.riddles.org.uk
PostgreSQL version: 9.4.8
Operating system:   any
Description:

This came up on IRC:

Running psql with a restricted memory ulimit to reproduce, one gets this
error message:

postgres=# \copy (select repeat('a',120000000)) to '/dev/null'
lost synchronization with server: got message type "d", length 120000001

Googling this finds some old bugs, but in the reported case, none of those
applied and the only problem was an actual lack of memory; the database
contained a large (60MB) bytea value, and pg_dump would fail with the "lost
synchronization" error.

So the failure is expected, but the fact that the error message doesn't even
hint at "out of memory" being the cause is a problem.


--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

Re: [BUGS] BUG #14516: misleading error from libpq on out-of-memory

From
Tom Lane
Date:
andrew@tao11.riddles.org.uk writes:
> Running psql with a restricted memory ulimit to reproduce, one gets this
> error message:
> postgres=# \copy (select repeat('a',120000000)) to '/dev/null'
> lost synchronization with server: got message type "d", length 120000001

Yeah.  Per the code comments:

            /*
             * Before returning, enlarge the input buffer if needed to hold
             * the whole message.  ...
             */
            if (pqCheckInBufferSpace(conn->inCursor + (size_t) msgLength,
                                     conn))
            {
                /*
                 * XXX add some better recovery code... plan is to skip over
                 * the message using its length, then report an error. For the
                 * moment, just treat this like loss of sync (which indeed it
                 * might be!)
                 */
                handleSyncLoss(conn, id, msgLength);
            }

The point being that if we lost sync with the message stream, the symptom
from this code's point of view would be a garbage header for the next
message, which would include a possibly-bogus message type code and a
definitely-insane-looking length.  There's a sanity check involving
VALID_LONG_MESSAGE_TYPE ahead of this, but with seven such message types,
the odds of getting past that are not negligible.
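
[Editorial illustration, not part of the original message.] To put a rough number on those odds, here is a minimal sketch of a header check of this kind. The seven type codes, the 30000-byte threshold, and all function names below are assumptions modeled loosely on libpq's fe-protocol3.c, not a copy of it; only the general shape (a one-byte type code followed by a four-byte big-endian length that counts itself) comes from the wire protocol.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed threshold above which only certain message types are accepted. */
#define LONG_MESSAGE_THRESHOLD 30000

/* Assumed set of the seven type codes that may legitimately be long. */
static bool valid_long_message_type(unsigned char id)
{
    return id == 'T' || id == 'D' || id == 'd' || id == 'V' ||
           id == 'E' || id == 'N' || id == 'A';
}

/* Decode the 4-byte big-endian length word that follows the type byte. */
static uint32_t decode_length(const unsigned char *hdr)
{
    return ((uint32_t) hdr[1] << 24) | ((uint32_t) hdr[2] << 16) |
           ((uint32_t) hdr[3] << 8) | (uint32_t) hdr[4];
}

/*
 * Return true if a 5-byte header looks plausible. A genuine 120 MB
 * CopyData header and a garbage header that happens to start with one
 * of the seven type codes are indistinguishable to this check.
 */
static bool header_passes_check(const unsigned char *hdr)
{
    uint32_t len = decode_length(hdr);

    if (len < 4)                 /* the length word counts itself */
        return false;
    if (len > LONG_MESSAGE_THRESHOLD && !valid_long_message_type(hdr[0]))
        return false;
    return true;
}

/* Count how many of the 256 possible type bytes may carry a long message. */
static int count_long_capable_types(void)
{
    int n = 0;

    for (int b = 0; b < 256; b++)
        if (valid_long_message_type((unsigned char) b))
            n++;
    return n;
}
```

Under these assumptions, 7 of the 256 possible type bytes pass, so a garbage header with a huge length gets through the check about 2.7% of the time (if the type byte is taken to be uniformly random, which real garbage need not be).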

In short, it's not that easy to do better.  Given the infrequency of
complaints, I'm not sure it's worth spending time on.  (The code in
question has been like that since we started using message length words,
in 2003; cf 5ed27e35f35f6c354b1a7120ec3a3ce57f93e73e.  I don't think
it's come up more than a couple of times since then.)
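
[Editorial illustration, not part of the original message.] The recovery plan mentioned in the XXX comment ("skip over the message using its length, then report an error") could look roughly like the sketch below. It is not libpq code: ByteSource, source_read, read_or_skip, and the alloc_limit parameter (which stands in for a memory ulimit) are all hypothetical names invented for this example.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical in-memory byte source standing in for the server socket. */
typedef struct {
    const unsigned char *data;
    size_t len;
    size_t pos;
} ByteSource;

/* Copy up to 'want' bytes into dst; return how many were available. */
static size_t source_read(ByteSource *src, unsigned char *dst, size_t want)
{
    size_t avail = src->len - src->pos;
    size_t n = want < avail ? want : avail;

    memcpy(dst, src->data + src->pos, n);
    src->pos += n;
    return n;
}

typedef enum { MSG_OK, MSG_OUT_OF_MEMORY, MSG_TRUNCATED } MsgStatus;

/*
 * Try to buffer a payload of payload_len bytes. If the buffer cannot be
 * allocated (simulated here by alloc_limit), discard the payload in small
 * chunks so the stream stays positioned at the next message header, and
 * report out-of-memory instead of loss of sync.
 */
static MsgStatus read_or_skip(ByteSource *src, size_t payload_len,
                              size_t alloc_limit, unsigned char **out)
{
    unsigned char scratch[4096];
    size_t remaining = payload_len;

    *out = NULL;
    if (payload_len <= alloc_limit)
        *out = malloc(payload_len ? payload_len : 1);

    if (*out != NULL) {
        /* Normal path: the whole message fits in memory. */
        if (source_read(src, *out, payload_len) != payload_len) {
            free(*out);
            *out = NULL;
            return MSG_TRUNCATED;
        }
        return MSG_OK;
    }

    /* Allocation failed: consume and drop the payload chunk by chunk. */
    while (remaining > 0) {
        size_t chunk = remaining < sizeof scratch ? remaining : sizeof scratch;
        size_t got = source_read(src, scratch, chunk);

        if (got == 0)
            return MSG_TRUNCATED;
        remaining -= got;
    }
    return MSG_OUT_OF_MEMORY;
}
```

This only helps when the header is genuine. If the length word is garbage from an actual loss of sync, the skip loop would swallow that many bytes of unrelated data (or wait for bytes that never arrive), which is the ambiguity described above.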

            regards, tom lane

