Thread: Synchronized snapshots versus multiple databases

Synchronized snapshots versus multiple databases

From: Tom Lane

I've thought of another nasty problem for the sync-snapshots patch.
Consider the following sequence of events:

1. Transaction A, which is about to export a snapshot, is running in
   database X.
2. Transaction B is making some changes in database Y.
3. A takes and exports a snapshot showing B's xid as running.
4. Transaction B ends.
5. Autovacuum launches in database Y.  It sees nothing running in Y,
   so it decides it can vacuum dead rows right up to nextXid, including
   anything B deleted.
6. Transaction C starts in database Y, and imports the snapshot from A.
   Now it thinks it can see rows deleted by B ... but vacuum is busy
   removing them, or maybe already finished doing so.

The problem here is that A's xmin is ignored by GetOldestXmin when
calculating cutoff XIDs for non-shared tables in database Y, so it
doesn't protect would-be adoptees of the exported snapshot.
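
The effect described above can be illustrated with a toy model. This is
only a sketch under stated assumptions: `Backend`, `oldest_xmin`, and the
XID values 100 and 105 are hypothetical stand-ins, not the real
GetOldestXmin code, and the model ignores shared catalogs and everything
else the real function has to handle.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Backend:
    """Hypothetical stand-in for one backend's advertised state."""
    database: str
    xmin: Optional[int]  # oldest XID this backend's snapshot still needs, if any


def oldest_xmin(backends: List[Backend], next_xid: int,
                database: Optional[str] = None) -> int:
    """Toy cutoff computation.

    With database=None every backend counts; otherwise only backends
    connected to that database are considered (the optimization at issue).
    """
    cutoff = next_xid
    for b in backends:
        if b.xmin is None:
            continue
        if database is not None and b.database != database:
            continue  # xmin from another database is ignored
        cutoff = min(cutoff, b.xmin)
    return cutoff


# Steps 1-5 of the scenario: A (in X) holds a snapshot in which B's xid
# (assumed to be 100) is still running; B has since ended and nextXid is
# assumed to have advanced to 105.
backends = [Backend(database="X", xmin=100)]
next_xid = 105

local_cutoff = oldest_xmin(backends, next_xid, database="Y")
global_cutoff = oldest_xmin(backends, next_xid)
print(local_cutoff, global_cutoff)
```

With these assumed values the database-local cutoff for Y is 105 while
the cluster-wide cutoff is 100, so rows deleted by xid 100 are fair game
for vacuum in Y even though A's snapshot still treats that xid as running.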

I can see a few alternatives, none of them very pleasant:

1. Restrict exported snapshots to be loaded only by transactions running
in the same database as the exporter.  This would fix the problem, but
it cuts out one of the main use-cases for sync snapshots, namely getting
cluster-wide-consistent dumps in pg_dumpall.

2. Allow a snapshot exported from another database to be loaded so long
as this doesn't cause the DB-local value of GetOldestXmin to go
backwards.  However, in scenarios such as the above, C is certain to
fail such a test.  To make it work, pg_dumpall would have to start
"advance guard" transactions in each database before it takes the
intended-to-be-shared snapshot, and probably even wait for these to be
oldest.  Ick.

3. Remove the optimization that lets GetOldestXmin ignore XIDs outside
the current database.  This sounds bad, but OTOH I don't think there's
ever been any proof that this optimization is worth much in real-world
usage.  We've already had to lobotomize that optimization for walsender
processes, anyway.

4. Somehow mark the xmin of a process that has exported a snapshot so
that it will be honored in all DBs not just the current one.  The
difficulty here is that we'd need to know *at the time the snap is
taken* that it's going to be exported.  (Consider the scenario above,
except that A doesn't get around to exporting the snapshot it took in
step 3 until between steps 5 and 6.  If the xmin wasn't already marked
as globally applicable when vacuum looked at it in step 5, we lose.)
This is do-able but it will contort the user-visible API of the sync
snapshots feature.  One way we could do it is to require that
transactions that want to export snapshots set a transaction mode
before they take their first snapshot.
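
Alternatives 2 and 4 can be sketched in the same toy style. Everything
here is hypothetical and for illustration only: the `xmin_is_global`
flag, the helper names, and the XID values are assumptions, not part of
the patch or of any existing backend API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Backend:
    """Hypothetical stand-in for one backend's advertised state."""
    database: str
    xmin: Optional[int]           # None means no snapshot is held
    xmin_is_global: bool = False  # hypothetical "honor in all DBs" mark (option 4)


def db_local_oldest_xmin(backends: List[Backend], next_xid: int,
                         database: str) -> int:
    """Toy vacuum cutoff for one database: local xmins plus any marked global."""
    cutoff = next_xid
    for b in backends:
        if b.xmin is None:
            continue
        if b.database == database or b.xmin_is_global:
            cutoff = min(cutoff, b.xmin)
    return cutoff


def import_allowed_option2(backends: List[Backend], next_xid: int,
                           importer_db: str, snapshot_xmin: int) -> bool:
    """Option 2: refuse an import that would move the DB-local cutoff backwards."""
    return snapshot_xmin >= db_local_oldest_xmin(backends, next_xid, importer_db)


next_xid = 105
exporter = Backend(database="X", xmin=100)

# Option 2 without an advance guard: C in database Y fails the test.
ok_without_guard = import_allowed_option2([exporter], next_xid, "Y", 100)

# Option 2 with an "advance guard" transaction already holding xmin 99 in Y.
guard = Backend(database="Y", xmin=99)
ok_with_guard = import_allowed_option2([exporter, guard], next_xid, "Y", 100)

# Option 4: the exporter's xmin was marked global before vacuum looked at it.
marked = Backend(database="X", xmin=100, xmin_is_global=True)
cutoff_with_mark = db_local_oldest_xmin([marked], next_xid, "Y")

print(ok_without_guard, ok_with_guard, cutoff_with_mark)
```

The sketch also shows why the timing matters for option 4: the mark only
helps if it is already set when the cutoff for Y is computed, which is
why the snapshot would have to be flagged at the time it is taken rather
than at export time.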

Thoughts, better ideas?
        regards, tom lane


Re: Synchronized snapshots versus multiple databases

From: Florian Pflug

On Oct21, 2011, at 17:36 , Tom Lane wrote:
> 1. Restrict exported snapshots to be loaded only by transactions running
> in the same database as the exporter.  This would fix the problem, but
> it cuts out one of the main use-cases for sync snapshots, namely getting
> cluster-wide-consistent dumps in pg_dumpall.

Isn't the use-case getting consistent *parallel* dumps of a single database
rather than consistent dump of multiple databases? Since we don't have atomic
cross-database commits, what does using the same snapshot to dump multiple
databases buy us?

On those grounds, +1 for option 1 here.

> 3. Remove the optimization that lets GetOldestXmin ignore XIDs outside
> the current database.  This sounds bad, but OTOH I don't think there's
> ever been any proof that this optimization is worth much in real-world
> usage.  We've already had to lobotomize that optimization for walsender
> processes, anyway.

Hm, we've told people who wanted cross-database access to tables in the
past to either
 * use dblink or
 * not split their tables over multiple databases in the first place,
   and to use schemas instead

If we remove the GetOldestXmin optimization, we're essentially reversing
course on this. Do we really wanna go there?

best regards,
Florian Pflug



Re: Synchronized snapshots versus multiple databases

From: Andrew Dunstan

On 10/21/2011 12:05 PM, Florian Pflug wrote:
> On Oct21, 2011, at 17:36 , Tom Lane wrote:
>> 1. Restrict exported snapshots to be loaded only by transactions running
>> in the same database as the exporter.  This would fix the problem, but
>> it cuts out one of the main use-cases for sync snapshots, namely getting
>> cluster-wide-consistent dumps in pg_dumpall.
> Isn't the use-case getting consistent *parallel* dumps of a single database
> rather than consistent dump of multiple databases? Since we don't have atomic
> cross-database commits, what does using the same snapshot to dump multiple
> databases buy us?

That was my understanding of the use case.

cheers

andrew


Re: Synchronized snapshots versus multiple databases

From: Tom Lane

Andrew Dunstan <andrew@dunslane.net> writes:
> On 10/21/2011 12:05 PM, Florian Pflug wrote:
>> On Oct21, 2011, at 17:36 , Tom Lane wrote:
>>> 1. Restrict exported snapshots to be loaded only by transactions running
>>> in the same database as the exporter.  This would fix the problem, but
>>> it cuts out one of the main use-cases for sync snapshots, namely getting
>>> cluster-wide-consistent dumps in pg_dumpall.

>> Isn't the use-case getting consistent *parallel* dumps of a single database
>> rather than consistent dump of multiple databases? Since we don't have atomic
>> cross-database commits, what does using the same snapshot to dump multiple
>> databases buy us?

> That was my understanding of the use case.

Um, which one are you supporting?

Anyway, the value of using the same snapshot across all of a pg_dumpall
run would be that you could be sure that what you'd dumped concerning
role and tablespace objects was consistent with what you then dump about
database-local objects.  (In principle, anyway --- I'm not sure how
much of that happens under SnapshotNow rules because of use of backend
functions.  But you'll most certainly never be able to guarantee it if
pg_dumpall can't export its snapshot to each subsidiary pg_dump run.)
        regards, tom lane


Re: Synchronized snapshots versus multiple databases

From: Tom Lane

Florian Pflug <fgp@phlo.org> writes:
> On Oct21, 2011, at 17:36 , Tom Lane wrote:
>> 3. Remove the optimization that lets GetOldestXmin ignore XIDs outside
>> the current database.  This sounds bad, but OTOH I don't think there's
>> ever been any proof that this optimization is worth much in real-world
>> usage.  We've already had to lobotomize that optimization for walsender
>> processes, anyway.

> Hm, we've told people who wanted cross-database access to tables in the
> past to either

>   * use dblink or

>   * not split their tables over multiple databases in the first place,
>     and to use schemas instead

> If we remove the GetOldestXmin optimization, we're essentially reversing
> course on this. Do we really wanna go there?

Huh?  The behavior of GetOldestXmin is purely a backend-internal matter.
I don't see how it's related to cross-database access --- or at least,
changing this would not represent a significant move towards supporting
that.
        regards, tom lane


Re: Synchronized snapshots versus multiple databases

From: Robert Haas

On Fri, Oct 21, 2011 at 11:36 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I've thought of another nasty problem for the sync-snapshots patch.
>
> 1. Restrict exported snapshots to be loaded only by transactions running
> in the same database as the exporter.  This would fix the problem, but
> it cuts out one of the main use-cases for sync snapshots, namely getting
> cluster-wide-consistent dumps in pg_dumpall.
>
> 2. Allow a snapshot exported from another database to be loaded so long
> as this doesn't cause the DB-local value of GetOldestXmin to go
> backwards.  However, in scenarios such as the above, C is certain to
> fail such a test.  To make it work, pg_dumpall would have to start
> "advance guard" transactions in each database before it takes the
> intended-to-be-shared snapshot, and probably even wait for these to be
> oldest.  Ick.
>
> 3. Remove the optimization that lets GetOldestXmin ignore XIDs outside
> the current database.  This sounds bad, but OTOH I don't think there's
> ever been any proof that this optimization is worth much in real-world
> usage.  We've already had to lobotomize that optimization for walsender
> processes, anyway.
>
> 4. Somehow mark the xmin of a process that has exported a snapshot so
> that it will be honored in all DBs not just the current one.  The
> difficulty here is that we'd need to know *at the time the snap is
> taken* that it's going to be exported.  (Consider the scenario above,
> except that A doesn't get around to exporting the snapshot it took in
> step 3 until between steps 5 and 6.  If the xmin wasn't already marked
> as globally applicable when vacuum looked at it in step 5, we lose.)
> This is do-able but it will contort the user-visible API of the sync
> snapshots feature.  One way we could do it is to require that
> transactions that want to export snapshots set a transaction mode
> before they take their first snapshot.

I am unexcited by #2 on usability grounds.  I agree with you that #3
might end up being a fairly small pessimization in practice, but I'd
be inclined to just do #1 for now and revisit the issue when and if
someone shows an interest in revamping pg_dumpall to do what you're
proposing (and hopefully a bunch of other cleanup too).

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Synchronized snapshots versus multiple databases

From: Florian Pflug

On Oct21, 2011, at 19:09 , Tom Lane wrote:
> Florian Pflug <fgp@phlo.org> writes:
>> On Oct21, 2011, at 17:36 , Tom Lane wrote:
>>> 3. Remove the optimization that lets GetOldestXmin ignore XIDs outside
>>> the current database.  This sounds bad, but OTOH I don't think there's
>>> ever been any proof that this optimization is worth much in real-world
>>> usage.  We've already had to lobotomize that optimization for walsender
>>> processes, anyway.
> 
>> Hm, we've told people who wanted cross-database access to tables in the
>> past to either
> 
>>  * use dblink or
> 
>>  * not split their tables over multiple databases in the first place,
>>    and to use schemas instead
> 
>> If we remove the GetOldestXmin optimization, we're essentially reversing
>> course on this. Do we really wanna go there?
> 
> Huh?  The behavior of GetOldestXmin is purely a backend-internal matter.
> I don't see how it's related to cross-database access --- or at least,
> changing this would not represent a significant move towards supporting
> that.

AFAIR, the performance hit we'd take by making the vacuum cutoff point
(i.e. GetOldestXmin()) global instead of database-local has been repeatedly
used in the past as an argument against cross-database queries. I have to
admit that I currently cannot seem to find an entry in the archives to
back that up, though.

best regards,
Florian Pflug



Re: Synchronized snapshots versus multiple databases

From: Robert Haas

On Fri, Oct 21, 2011 at 1:40 PM, Florian Pflug <fgp@phlo.org> wrote:
> AFAIR, the performance hit we'd take by making the vacuum cutoff point
> (i.e. GetOldestXmin()) global instead of database-local has been repeatedly
> used in the past as an argument against cross-database queries. I have to
> admit that I currently cannot seem to find an entry in the archives to
> back that up, though.

I think the main argument against cross-database queries is that every
place in the backend that, for example, uses an OID to identify a
table would need to be modified to use a database OID and a table OID.
Even if the distributed performance penalty of such a change doesn't
bother you, the amount of code churn that it would take to make such a
change is mind-boggling.

I haven't seen anyone explain why they really need this feature
anyway, and I think it's going in the wrong direction.  IMHO, anyone
who wants to be doing cross-database queries should be using schemas
instead, and if that's not workable for some reason, then we should
improve the schema implementation until it becomes workable.  I think
that the target use case for separate databases ought to be
multi-tenancy, but what is needed there is actually more isolation
(e.g. wrt/role names, cluster-wide visibility of pg_database contents,
etc.), not less.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Synchronized snapshots versus multiple databases

From: Andrew Dunstan

On 10/21/2011 01:06 PM, Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> On 10/21/2011 12:05 PM, Florian Pflug wrote:
>>> On Oct21, 2011, at 17:36 , Tom Lane wrote:
>>>> 1. Restrict exported snapshots to be loaded only by transactions running
>>>> in the same database as the exporter.  This would fix the problem, but
>>>> it cuts out one of the main use-cases for sync snapshots, namely getting
>>>> cluster-wide-consistent dumps in pg_dumpall.
>>> Isn't the use-case getting consistent *parallel* dumps of a single database
>>> rather than consistent dump of multiple databases? Since we don't have atomic
>>> cross-database commits, what does using the same snapshot to dump multiple
>>> databases buy us?
>> That was my understanding of the use case.
> Um, which one are you supporting?


#1 seemed OK from this POV. Everything else looks ickier and/or more 
fragile, at first glance anyway.

> Anyway, the value of using the same snapshot across all of a pg_dumpall
> run would be that you could be sure that what you'd dumped concerning
> role and tablespace objects was consistent with what you then dump about
> database-local objects.  (In principle, anyway --- I'm not sure how
> much of that happens under SnapshotNow rules because of use of backend
> functions.  But you'll most certainly never be able to guarantee it if
> pg_dumpall can't export its snapshot to each subsidiary pg_dump run.)

For someone who is concerned with that, maybe pg_dumpall could have an 
option to take an EXCLUSIVE lock on the shared catalogs?

cheers

andrew


Re: Synchronized snapshots versus multiple databases

From: Florian Pflug

On Oct21, 2011, at 19:47 , Robert Haas wrote:
> On Fri, Oct 21, 2011 at 1:40 PM, Florian Pflug <fgp@phlo.org> wrote:
>> AFAIR, the performance hit we'd take by making the vacuum cutoff point
>> (i.e. GetOldestXmin()) global instead of database-local has been repeatedly
>> used in the past as an argument against cross-database queries. I have to
>> admit that I currently cannot seem to find an entry in the archives to
>> back that up, though.

> I haven't seen anyone explain why they really need this feature
> anyway, and I think it's going in the wrong direction.  IMHO, anyone
> who wants to be doing cross-database queries should be using schemas
> instead, and if that's not workable for some reason, then we should
> improve the schema implementation until it becomes workable.  I think
> that the target use case for separate databases ought to be
> multi-tenancy, but what is needed there is actually more isolation
> (e.g. wrt/role names, cluster-wide visibility of pg_database contents,
> etc.), not less.

Agreed. I wasn't trying to argue for cross-database queries - quite the opposite,
actually. My point was more that since we've used database isolation as an
argument against cross-database queries in the past, we shouldn't sacrifice
it now for synchronized snapshots.

best regards,
Florian Pflug



Re: Synchronized snapshots versus multiple databases

From: Tom Lane

Florian Pflug <fgp@phlo.org> writes:
> AFAIR, the performance hit we'd take by making the vacuum cutoff point
> (i.e. GetOldestXmin()) global instead of database-local has been repeatedly
> used in the past as an argument against cross-database queries. I have to
> admit that I currently cannot seem to find an entry in the archives to
> back that up, though.

To my mind, the main problem with cross-database queries is that none of
the backend is set up to deal with more than one set of system catalogs.
        regards, tom lane


Re: Synchronized snapshots versus multiple databases

From: Robert Haas

On Fri, Oct 21, 2011 at 2:06 PM, Florian Pflug <fgp@phlo.org> wrote:
> On Oct21, 2011, at 19:47 , Robert Haas wrote:
>> On Fri, Oct 21, 2011 at 1:40 PM, Florian Pflug <fgp@phlo.org> wrote:
>>> AFAIR, the performance hit we'd take by making the vacuum cutoff point
>>> (i.e. GetOldestXmin()) global instead of database-local has been repeatedly
>>> used in the past as an argument against cross-database queries. I have to
>>> admit that I currently cannot seem to find an entry in the archives to
>>> back that up, though.
>
>> I haven't seen anyone explain why they really need this feature
>> anyway, and I think it's going in the wrong direction.  IMHO, anyone
>> who wants to be doing cross-database queries should be using schemas
>> instead, and if that's not workable for some reason, then we should
>> improve the schema implementation until it becomes workable.  I think
>> that the target use case for separate databases ought to be
>> multi-tenancy, but what is needed there is actually more isolation
>> (e.g. wrt/role names, cluster-wide visibility of pg_database contents,
>> etc.), not less.
>
> Agreed. I wasn't trying to argue for cross-database queries - quite the opposite,
> actually. My point was more that since we've used database isolation as an
> argument against cross-database queries in the past, we shouldn't sacrifice
> it now for synchronized snapshots.

Right, I agree.  It might be nice to take a cluster-wide dump that is
guaranteed to be transactionally consistent, but I bet a lot of people
would actually be happier to see us go the opposite direction - e.g.
give each database its own XID space, so that activity in one database
doesn't accelerate the need for anti-wraparound vacuums in another
database.  Not sure that could ever actually happen, but the point is
that people probably should not be relying on serializability across
databases too much, because the whole point of the multiple databases
feature is to have multiple, independent databases in one cluster that
are thoroughly isolated from each other, and any future changes we
make should probably lean in that direction.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Synchronized snapshots versus multiple databases

From: Tom Lane

Robert Haas <robertmhaas@gmail.com> writes:
> On Fri, Oct 21, 2011 at 11:36 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> 1. Restrict exported snapshots to be loaded only by transactions running
>> in the same database as the exporter.  This would fix the problem, but
>> it cuts out one of the main use-cases for sync snapshots, namely getting
>> cluster-wide-consistent dumps in pg_dumpall.

> I am unexcited by #2 on usability grounds.  I agree with you that #3
> might end up being a fairly small pessimization in practice, but I'd
> be inclined to just do #1 for now and revisit the issue when and if
> someone shows an interest in revamping pg_dumpall to do what you're
> proposing (and hopefully a bunch of other cleanup too).

Seems like that is the consensus view, so that's what I'll do.
        regards, tom lane


Re: Synchronized snapshots versus multiple databases

From: Simon Riggs

On Fri, Oct 21, 2011 at 4:36 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> I can see a few alternatives, none of them very pleasant:
>
> 1. Restrict exported snapshots to be loaded only by transactions running
> in the same database as the exporter.  This would fix the problem, but
> it cuts out one of the main use-cases for sync snapshots, namely getting
> cluster-wide-consistent dumps in pg_dumpall.

> 4. Somehow mark the xmin of a process that has exported a snapshot so
> that it will be honored in all DBs not just the current one.  The
> difficulty here is that we'd need to know *at the time the snap is
> taken* that it's going to be exported.  (Consider the scenario above,
> except that A doesn't get around to exporting the snapshot it took in
> step 3 until between steps 5 and 6.  If the xmin wasn't already marked
> as globally applicable when vacuum looked at it in step 5, we lose.)
> This is do-able but it will contort the user-visible API of the sync
> snapshots feature.  One way we could do it is to require that
> transactions that want to export snapshots set a transaction mode
> before they take their first snapshot.

1 *and* 4 please.

So, unless explicitly requested, an exported snapshot is limited to
just one database. If explicitly requested to be transportable, we can
use the snapshot in other databases.
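
As a rough illustration of that rule, here is a minimal sketch. The
`ExportedSnapshot` type, the `transportable` flag, and `check_import` are
hypothetical names invented for this example; they are not the interface
of the sync-snapshots patch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExportedSnapshot:
    """Hypothetical description of an exported snapshot."""
    database: str                # database the exporting transaction runs in
    xmin: int                    # oldest XID the snapshot still needs
    transportable: bool = False  # hypothetical explicit opt-in (option 4)


def check_import(snapshot: ExportedSnapshot, importer_database: str) -> None:
    """Raise unless the import is same-database or explicitly transportable."""
    if snapshot.database != importer_database and not snapshot.transportable:
        raise ValueError(
            f"snapshot exported from {snapshot.database!r} cannot be "
            f"imported in {importer_database!r}"
        )


default_snap = ExportedSnapshot(database="X", xmin=100)
check_import(default_snap, "X")  # same database: accepted (option 1)

try:
    check_import(default_snap, "Y")  # other database, not requested: refused
    cross_db_refused = False
except ValueError:
    cross_db_refused = True

# Explicitly requested as transportable: accepted in another database.
check_import(ExportedSnapshot(database="X", xmin=100, transportable=True), "Y")
```

The flag alone does not make a cross-database import safe; the exporter's
xmin would still have to be honored by vacuum in every database from the
moment the snapshot is taken, as discussed under option 4.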

This allows us to do parallel pg_dump in one or more databases, as well as
allowing pg_dumpall to be fully consistent across all dbs.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: Synchronized snapshots versus multiple databases

From: Tom Lane

Simon Riggs <simon@2ndQuadrant.com> writes:
> 1 *and* 4 please.

Given the lack of enthusiasm I'm not going to do anything about #4 now.
Somebody else can add it later.

> So, unless explicitly requested, an exported snapshot is limited to
> just one database. If explicitly requested to be transportable, we can
> use the snapshot in other databases.

Yeah, we could make it work like that when it gets added.
        regards, tom lane