Re: assessing parallel-safety

From: Noah Misch
Subject: Re: assessing parallel-safety
Date:
Msg-id: 20150315063934.GA420346@tornado.leadboat.com
In reply to: Re: assessing parallel-safety  (Robert Haas <robertmhaas@gmail.com>)
Responses: Re: assessing parallel-safety  (Robert Haas <robertmhaas@gmail.com>)
List: pgsql-hackers
On Thu, Mar 12, 2015 at 11:21:37AM -0400, Robert Haas wrote:
> On Thu, Feb 19, 2015 at 1:19 AM, Noah Misch <noah@leadboat.com> wrote:
> > Rereading my previous message, I failed to make the bottom line clear: I
> > recommend marking eqsel etc. PROPARALLEL_UNSAFE but *not* checking an
> > estimator's proparallel before calling it in the planner.
> 
> But what do these functions do that is actually unsafe?

They call the oprcode function of any operator naming them as an estimator.
Users can add operators that use eqsel() as an estimator, and we have no bound
on what those operators' oprcode can do.  (In practice, parallel-unsafe
operators using eqsel() as an estimator will be rare.)

> >> Hmm.  Altogether prohibiting XLogInsert() in parallel mode seems
> >> unwise, because it would foreclose heap_page_prune_opt() in workers.
> >> I realize there's separate conversation about whether pruning during
> >> SELECT queries is good policy, but in the interested of separating
> >> mechanism from policy, and in the sure knowledge that allowing at
> >> least some writes in parallel mode is certainly going to be something
> >> people will want, it seems better to look into propagating
> >> XactLastRecEnd.
> >
> > Good points; that works for me.
> 
> The key design decision here seems to be this: How eagerly do we need
> to synchronize XactLastRecEnd?  What exactly goes wrong if it isn't
> synchronized?  For example, if the value were absolutely critical in
> all circumstances, one could imagine storing a shared XactLastRecEnd
> in shared memory.  This doesn't appear to be the case: the main
> requirement is that we definitely need an up-to-date value at commit
> time.  Also, at abort time, we don't really need the value for anything
> critical, but it's worth kicking the WAL writer so that any
> accumulated WAL gets flushed.

Exactly.  At entry to RecordTransactionCommit(), XactLastRecEnd must point
past the end of all records whose replay is required to satisfy the user's
expectation of transaction durability.  At other times, harm from its value
being wrong is negligible.  I do suggest adding a comment to its definition
explaining when one can rely on it.

> Here's an incremental patch - which I will incorporate into the
> parallel mode patch if it seems about right to you - that tries to
> tackle all this.

I reviewed this.  Minor things:

> @@ -73,10 +74,15 @@ typedef struct FixedParallelState
>      /* Entrypoint for parallel workers. */
>      parallel_worker_main_type    entrypoint;
>  
> -    /* Track whether workers have attached. */
> +    /* Mutex protects remaining fiedlds. */

Typo.

> @@ -2564,10 +2578,19 @@ AbortTransaction(void)
>       * worker, skip this; the user backend must be the one to write the abort
>       * record.
>       */
> -    if (parallel)
> -        latestXid = InvalidTransactionId;
> -    else
> +    if (!parallel)
>          latestXid = RecordTransactionAbort(false);
> +    else
> +    {
> +        latestXid = InvalidTransactionId;
> +
> +        /*
> +         * Since the parallel master won't get our value of XactLastRecEnd in this
> +         * case, we nudge WAL-writer ourselves in this case.  See related comments in
> +         * RecordTransactionAbort for why this matters.
> +         */
> +        XLogSetAsyncXactLSN(XactLastRecEnd);

RecordTransactionAbort() skips this for subtransaction aborts.  I would omit
it here, because a parallel worker abort is, in this respect, more like a
subtransaction abort than like a top-level transaction abort.


While studying README.parallel from parallel-mode-v7.patch to understand the
concept of worker commit/abort, this part gave me the wrong idea:

> +Transaction Integration
> +=======================
...
> +Transaction commit or abort requires careful coordination between backends.
> +Each backend has its own resource owners: buffer pins, catcache or relcache
> +reference counts, tuple descriptors, and so on are managed separately by each
> +backend, and each backend is separately responsible for releasing such
> +resources.  Generally, the commit or abort of a parallel worker is much like
> +a top-transaction commit or abort, but there are a few exceptions.  Most
> +importantly:
> +
> +  - No commit or abort record is written; the initiating backend is
> +    responsible for this.
> +
> +  - Cleanup of pg_temp namespaces is not done.  The initiating backend is
> +    responsible for this, too.
> +
> +The master kills off all remaining workers as part of commit or abort
> +processing.  It must not only kill the workers but wait for them to actually

I took this to mean that workers normally persist until the master commits a
transaction.  Rather, their lifecycle is like the lifecycle of a buffer pin.
When an error interrupts a parallel operation, transaction abort destroys
workers.  Otherwise, the code driving the specific parallel task destroys them
as early as is practical.  (Strictly to catch bugs, transaction commit does
detect and destroy leaked workers.)

> +exit; otherwise, chaos can ensue.  For example, if the master is
> +rolling back the transaction that created the relation being scanned by
> +a worker, the relation could disappear while the worker is still busy
> +scanning it.  That's not safe.


