Обсуждение: Re: Using PQexecQuery in pipeline mode produces unexpected Close messages

Поиск
Список
Период
Сортировка

Re: Using PQexecQuery in pipeline mode produces unexpected Close messages

От
Kyotaro Horiguchi
Дата:
At Fri, 17 Jun 2022 20:31:50 +0200, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote in 
> On 2022-Jun-16, Kyotaro Horiguchi wrote:
> 
> > The attached is a crude patch to separate the state for PIPELINE-IDLE
> > from PGASYNC_IDLE.  I haven't found a better way..
> 
> Ah, yeah, this might be a way to fix this.
> 
> Something very similar to a PIPELINE_IDLE mode was present in Craig's
> initial patch for pipeline mode.  However, I fought very hard to remove
> it, because it seemed to me that failing to handle it correctly
> everywhere would lead to more bugs than not having it.  (Indeed, there
> were some.)
> 
> However, I see now that your patch would not only fix this bug, but also
> let us remove the ugly "notionally not-idle" bit in fe-protocol3.c,
> which makes me ecstatic.  So let's push forward with this.  However,

Yey.

> this means we'll have to go over all places that use asyncStatus to
> ensure that they all handle the new value correctly.

Sure.

> I did found one bug in your patch: in the switch for asyncStatus in
> PQsendQueryStart, you introduce a new error message.  With the current
> tests, that never fires, which is telling us that our coverage is not
> complete.  But with the right sequence of libpq calls, which the
> attached adds (note that it's for REL_14_STABLE), that can be hit, and

# (ah, I wondered why it failed to apply..)

> it's easy to see that throwing an error there is a mistake.  The right
> action to take there is to let the action through.

Yeah.. Actulallly I really did it carelessly.. Thanks!

> Others to think about:
> 
> PQisBusy (I think no changes are needed),

Agreed.

> PQfn (I think it should accept a call in PGASYNC_PIPELINE_IDLE mode;
> fully untested in pipeline mode),

Does a PQ_PIPELINE_OFF path need that? Rather I thought that we need
Assert(!conn->asyncStatus != PGASYNC_PIPELINE_IDLE) there.  But anyway
we might need a test for this path.

> PQexitPipelineMode (I think it needs to return error; needs test case),

Agreed about test case. Currently the function doesn't handle
PGASYNC_IDLE so I thought that PGASYNC_PIPELINE_IDLE also don't need a
care.  If the function treats PGASYNC_PIPELINE_IDLE state as error,
the regression test fails (but I haven't examine it furtuer.)

> PQsendFlushRequest (I think it should let through; ditto).

Does that mean exit without pushing 'H' message?

> I also attach a patch to make the test suite use Test::Differences, if
> available.  It makes the diffs of the traces much easier to read, when
> they fail.  (I wish for a simple way to set the context size, but that
> would need a shim routine that I'm currently too lazy to write.)

Yeah, it was annoying that the script prints expected and result trace
separately. It looks pretty good with the patch.  I don't think
there's much use of context size here.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Using PQexecQuery in pipeline mode produces unexpected Close messages

От
Kyotaro Horiguchi
Дата:
At Tue, 21 Jun 2022 11:42:59 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in 
> At Fri, 17 Jun 2022 20:31:50 +0200, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote in 
> > Others to think about:
> > 
> > PQisBusy (I think no changes are needed),
> 
> Agreed.
> 
> > PQfn (I think it should accept a call in PGASYNC_PIPELINE_IDLE mode;
> > fully untested in pipeline mode),
> 
> Does a PQ_PIPELINE_OFF path need that? Rather I thought that we need
> Assert(!conn->asyncStatus != PGASYNC_PIPELINE_IDLE) there.  But anyway
> we might need a test for this path.

In the attached, PQfn() is used while in pipeline mode and
PGASYNC_PIPELINE_IDLE. Both error out and effectivelly no-op.

> > PQexitPipelineMode (I think it needs to return error; needs test case),
> 
> Agreed about test case. Currently the function doesn't handle
> PGASYNC_IDLE so I thought that PGASYNC_PIPELINE_IDLE also don't need a
> care.  If the function treats PGASYNC_PIPELINE_IDLE state as error,
> the regression test fails (but I haven't examine it furtuer.)

PQexitPipelineMode() is called while PGASYNC_PIPELINE_IDLE.

> > PQsendFlushRequest (I think it should let through; ditto).
> 
> Does that mean exit without pushing 'H' message?

I didn't do anything on this in the sttached.

By the way, I noticed that "libpq_pipeline uniqviol" intermittently
fails for uncertain reasons.

> result 574/575: pipeline aborted
> ...........................................................
> done writing
> 
> libpq_pipeline:1531: got unexpected NULL

The "...........done writing" is printed too late in the error cases.

This causes the TAP test fail, but I haven't find what's happnening at
the time.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Вложения

Re: Using PQexecQuery in pipeline mode produces unexpected Close messages

От
Kyotaro Horiguchi
Дата:
At Tue, 21 Jun 2022 14:56:40 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in 
> By the way, I noticed that "libpq_pipeline uniqviol" intermittently
> fails for uncertain reasons.
> 
> > result 574/575: pipeline aborted
> > ...........................................................
> > done writing
> > 
> > libpq_pipeline:1531: got unexpected NULL
> 
> The "...........done writing" is printed too late in the error cases.
> 
> This causes the TAP test fail, but I haven't find what's happnening at
> the time.

Just to make sure, I see this with the master branch

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Using PQexecQuery in pipeline mode produces unexpected Close messages

От
Kyotaro Horiguchi
Дата:
At Tue, 21 Jun 2022 14:59:07 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in 
> At Tue, 21 Jun 2022 14:56:40 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in 
> > By the way, I noticed that "libpq_pipeline uniqviol" intermittently
> > fails for uncertain reasons.
> > 
> > > result 574/575: pipeline aborted
> > > ...........................................................
> > > done writing
> > > 
> > > libpq_pipeline:1531: got unexpected NULL
> > 
> > The "...........done writing" is printed too late in the error cases.
> > 
> > This causes the TAP test fail, but I haven't find what's happnening at
> > the time.
> 
> Just to make sure, I see this with the master branch

No. It *is* caused by the fix. Sorry for the mistake.  The test module
linked to the wrong binary..

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Using PQexecQuery in pipeline mode produces unexpected Close messages

От
Kyotaro Horiguchi
Дата:
At Tue, 21 Jun 2022 14:56:40 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in 
> By the way, I noticed that "libpq_pipeline uniqviol" intermittently
> fails for uncertain reasons.
> 
> > result 574/575: pipeline aborted
> > ...........................................................
> > done writing
> > 
> > libpq_pipeline:1531: got unexpected NULL

PQsendQueryPrepared() is called after the conection's state has moved
to PGASYNC_IDLE so PQgetResult returns NULL. But actually there are
results.  So, if pqPipelineProcessorQueue() doesn't move the async
state to PGASYNC_IDLE when queue is emtpy, uniqviol can run till the
end. But that change breaks almost all of other test items.

Finally, I found that the change in pqPipelineProcessorQueue() as
attached fixes the uniqviol failure and doesn't break other tests.
However, I don't understand what I did by the change for now... X(
It seems to me something's wrong in the PQ_PIPELINE_ABORTED mode..

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Вложения

Re: Using PQexecQuery in pipeline mode produces unexpected Close messages

От
Alvaro Herrera
Дата:
So I wrote some more test scenarios for this, and as I wrote in some
other thread, I realized that there are more problems than just some
NOTICE trouble.  For instance, if you send a query, then read the result
but not the terminating NULL then send another query, everything gets
confused; the next thing you'll read is the result for the second query,
without having read the NULL that terminates the results of the first
query.  Any application that expects the usual flow of results will be
confused.  Kyotaro's patch to add PIPELINE_IDLE fixes this bit too, as
far as I can tell.

However, another problem case, not fixed by PIPELINE_IDLE, occurs if you
exit pipeline mode after PQsendQuery() and then immediately use
PQexec().  The CloseComplete will be received at the wrong time, and a
notice is emitted nevertheless.

I spent a lot of time trying to understand how to fix this last bit, and
the solution I came up with is that PQsendQuery() must add a second
entry to the command queue after the PGQUERY_EXTENDED one, to match the
CloseComplete message being expected; with that entry in the queue,
PQgetResult will really only get to IDLE mode after the Close has been
seen, which is what we want.  I named it PGQUERY_CLOSE.

Sadly, some hacks are needed to make this work fully:

1. the client is never expecting that PQgetResult() would return
   anything for the CloseComplete, so something needs to consume the
   CloseComplete silently (including the queue entry for it) when it is
   received; I chose to do this directly in pqParseInput3.  I tried to
   make PQgetResult itself do it, but it became a pile of hacks until I
   was no longer sure what was going on.  Putting it in fe-protocol3.c
   ends up a lot cleaner.  However, we still need PQgetResult to invoke
   parseInput() at the point where Close is expected.

2. if an error occurs while executing the query, the CloseComplete will
   of course never arrive, so we need to erase it from the queue
   silently if we're returning an error.

I toyed with the idea of having parseInput() produce a PGresult that
encodes the Close message, and have PQgetResult consume and discard
that, then read some further message to have something to return.  But
it seemed inefficient and equally ugly and I'm not sure that flow
control is any simpler.

I think another possibility would be to make PQexitPipelineMode
responsible for /something/, but I'm not sure what that would be.
Checking the queue and seeing if the next message is CloseComplete, then
eating that message before exiting pipeline mode?  That seems way too
magical.  I didn't attempt this.

I attach a patch series that implements the proposed fix (again for
REL_14_STABLE) in steps, to make it easy to read.  I intend to squash
the whole lot into a single commit before pushing.  But while writing
this email it occurred to me that I need to add at least one more test,
to receive a WARNING while waiting for CloseComplete.  AFAICT it should
work, but better make sure.

I produced pipeline_idle.trace file by running the test in the fully
fixed tree, then used it to verify that all tests fail in different ways
when run without the fixes.  The first fix with PIPELINE_IDLE fixes some
of these failures, and the PGQUERY_CLOSE one fixes the remaining one.
Reading the trace file, it looks correct to me.

-- 
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
"Doing what he did amounts to sticking his fingers under the hood of the
implementation; if he gets his fingers burnt, it's his problem."  (Tom Lane)

Вложения