Re: Proposal: "Causal reads" mode for load balancing reads without stale data

From Thom Brown
Subject Re: Proposal: "Causal reads" mode for load balancing reads without stale data
Date
Msg-id CAA-aLv764BM0mp5rk0B-Ooq0tB1LiXfB_ckCAw=DEjqVAuJDew@mail.gmail.com
In response to Re: Proposal: "Causal reads" mode for load balancing reads without stale data  (Thomas Munro <thomas.munro@enterprisedb.com>)
Responses Re: Proposal: "Causal reads" mode for load balancing reads without stale data  (Michael Paquier <michael.paquier@gmail.com>)
List pgsql-hackers
On 21 February 2016 at 23:18, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> On Mon, Feb 22, 2016 at 2:10 AM, Thom Brown <thom@linux.com> wrote:
>> On 3 February 2016 at 10:46, Thomas Munro <thomas.munro@enterprisedb.com> wrote:
>>> On Wed, Feb 3, 2016 at 10:59 PM, Amit Langote
>>> <Langote_Amit_f8@lab.ntt.co.jp> wrote:
>>>> There seems to be a copy-pasto there - shouldn't that be:
>>>>
>>>> + if (walsndctl->lsn[SYNC_REP_WAIT_FLUSH] < MyWalSnd->flush)
>>>
>>> Indeed, thanks!  New patch attached.
>>
>> I've given this a test drive, and it works exactly as described.
>
> Thanks for trying it out!
>
>> But one thing which confuses me is when a standby, with causal_reads
>> enabled, has just finished starting up.  I can't connect to it
>> because:
>>
>> FATAL:  standby is not available for causal reads
>>
>> However, this is the same message when I'm successfully connected, but
>> it's lagging, and the primary is still waiting for the standby to
>> catch up:
>>
>> ERROR:  standby is not available for causal reads
>>
>> What is the difference here?  The problem being reported appears to be
>> identical, but in the first case I can't connect, but in the second
>> case I can (although I still can't issue queries).
>
> Right, you get the error at login before it has managed to connect to
> the primary, and for a short time after while it's in 'joining' state,
> or potentially longer if there is a backlog of WAL to apply.
>
> The reason is that when causal_reads = on in postgresql.conf (as
> opposed to being set for an individual session or role), causal reads
> logic is used for snapshots taken during authentication (in fact the
> error is generated when trying to take a snapshot slightly before
> authentication proper begins, in InitPostgres).  I think that's a
> desirable feature: if you have causal reads on and you create/alter a
> database/role (for example setting a new password) and commit, and
> then you immediately try to connect to that database/role on a standby
> where you have causal reads enabled system-wide, then you get the
> causal reads guarantee during authentication: you either see the
> effects of your earlier transaction or you see the error.  As you have
> discovered, there is a small window after a standby comes up where it
> will show the error because it hasn't got a lease yet so it can't let
> you log in yet because it could be seeing a stale catalog (your user
> may not exist on the standby yet, or have been altered in some way, or
> your database may not exist yet, etc).
>
> Does that make sense?

Ah, all clear.  Yes, that makes sense now.  I've been trying to break
it for the past few days, and this was the only thing I wasn't clear
on.  The parameters all work as described.
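
For anyone following along, the behaviour explained above can be
modelled roughly as below.  This is only a toy sketch in Python, not
code from the patch: the class and method names are hypothetical, and
the 'unavailable' and 'available' state names are my assumption (only
'joining' is mentioned in this thread).  It shows why the same message
appears as FATAL at login and as ERROR for a query in an established
session: both come from the same snapshot check.

```python
from dataclasses import dataclass
from enum import Enum


class LeaseState(Enum):
    UNAVAILABLE = "unavailable"  # assumed name: not yet connected to the primary
    JOINING = "joining"          # connected, still catching up on WAL
    AVAILABLE = "available"      # assumed name: standby holds a lease


class CausalReadsError(Exception):
    def __init__(self, severity):
        super().__init__(f"{severity}:  standby is not available for causal reads")
        self.severity = severity


@dataclass
class Standby:
    state: LeaseState = LeaseState.UNAVAILABLE
    causal_reads: bool = True  # models causal_reads = on in postgresql.conf

    def take_snapshot(self, causal_reads, during_login):
        # Same check in both cases; only the severity differs.
        if causal_reads and self.state is not LeaseState.AVAILABLE:
            raise CausalReadsError("FATAL" if during_login else "ERROR")
        return "snapshot"

    def connect(self):
        # A snapshot is taken just before authentication, so the
        # system-wide setting already applies at login.
        self.take_snapshot(self.causal_reads, during_login=True)
        return Session(self)


class Session:
    def __init__(self, standby):
        self.standby = standby
        self.causal_reads = standby.causal_reads  # may be changed per session

    def query(self):
        return self.standby.take_snapshot(self.causal_reads, during_login=False)
```

In this model, a freshly started standby rejects connect() with FATAL
until its state becomes AVAILABLE, while a session opened earlier gets
ERROR from query() whenever the standby drops back to JOINING.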

The replay_lag is particularly cool.  Didn't think it was possible to
glean this information on the primary, but the timings are correct in
my tests.

+1 for this patch.  Looks like this solves the problem that
semi-synchronous replication tries to solve, although arguably in a
more sensible way.

Thom


