Re: pg_dump --snapshot
From: Stephen Frost
Subject: Re: pg_dump --snapshot
Msg-id: 20130506214515.GL4361@tamriel.snowman.net
In reply to: Re: pg_dump --snapshot (Simon Riggs <simon@2ndQuadrant.com>)
List: pgsql-hackers
Simon,

* Simon Riggs (simon@2ndQuadrant.com) wrote:
> On 6 May 2013 19:35, Stephen Frost <sfrost@snowman.net> wrote:
> > It certainly sounds interesting and I like the idea of it, but perhaps
> > we need a different mechanism than just passing in a raw snapshot, to
> > address the concerns that Tom raised.
>
> It does *not* pass in a raw snapshot.

It wasn't my intent to impart anything more specific than what
pg_export_snapshot() returns when I said 'raw snapshot'. What would you
call it? Snapshot identifier? All I was trying to say is that I agree
with Tom that pg_dump really needs more to happen than simply having the
results of pg_export_snapshot() passed to it- pg_dump wants all the
necessary locks taken immediately after the transaction opens, and
pg_export_snapshot() simply doesn't do that.

> All it does is to allow pg_dump
> to use an API that is already exposed by the backend for this very
> purpose, one that has been in Postgres since 9.2.

In doing so it opens a much larger hole through which this approach can
break, namely that objects could disappear between the snapshot being
taken and appropriate locks being set up. That issue already exists in
pg_dump, but at least it's a small window through one code path- and it
all happens before any parallelization or snapshot-sharing happens, as
best as I can tell.

If I understand correctly, right now we have:

  connect to DB
  start a transaction
  run around and grab locks
  get our snapshot ID
  fork, connect in, glob onto the same snapshot

Assuming I'm following along correctly, this change would be:

  someone in a far off land creates the snapshot
  time passes..
  then:
  connect to DB
  set the who-knows-how-old snapshot ID
  run around and try to grab locks
  fork, connect in, glob onto the same snapshot

One thing that, I believe, addresses this a bit is that we should at
least bomb out with an error while we're trying to acquire the locks,
should an object be dropped between transaction start and when we go to
lock it, right?
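For reference, the snapshot-sharing API mentioned above (available since 9.2) works roughly as sketched below; the snapshot identifier shown is illustrative, not a real value:

```sql
-- Session 1 (e.g. the pg_dump leader): export a snapshot.
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SELECT pg_export_snapshot();
-- returns a snapshot identifier, e.g. '000004F6-1' (illustrative)

-- Session 2 (e.g. a worker): adopt the exported snapshot.
-- Must be done before the transaction's first query.
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SET TRANSACTION SNAPSHOT '000004F6-1';
-- This session now sees exactly the data visible to session 1,
-- but holds none of session 1's locks.
```

Note that adopting a snapshot transfers only visibility, not locks, which is the gap being discussed.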
We'll still see the old version of pg_class from the start of the
transaction and therefore we'll try to lock everything from the older
viewpoint...?

For my part, I'm a bit less worried about error cases around this,
provided that they're made very clear and that they're quite obvious to
the end user, and very worried about us possibly missing some object
that we were expected to capture.

In any case, would a function along the lines of
"pg_export_and_lock_snapshot()", which basically starts a transaction,
acquires locks on every object in the DB, and then returns the snapshot
ID, address this? Perhaps even pg_dump could use that, on the belief
that it'd be faster for a backend function to acquire those locks than
for pg_dump to do it anyway. I'm not sure that we want to support every
different combination of filtering options that pg_dump supports for
this, but we could start with the "all-database" option since that's, by
definition, the largest set which could be requested by pg_dump. Or
perhaps even a new lock type for this... I'd love to see the
"locking-all-objects" portion of time disappear from my nightly
backups..

Thanks,

Stephen
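The proposed helper might look something like the sketch below. To be clear, pg_export_and_lock_snapshot() is hypothetical (it exists only in this proposal), and the lock mode shown is an assumption based on what pg_dump takes today:

```sql
-- Hypothetical usage of the function proposed above; it does not
-- exist in PostgreSQL.
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
-- Would lock every object in the database (presumably in ACCESS SHARE
-- mode, as pg_dump does) and then return the snapshot identifier.
SELECT pg_export_and_lock_snapshot();

-- By contrast, pg_dump today issues one lock statement per table
-- from the client, e.g.:
LOCK TABLE public.some_table IN ACCESS SHARE MODE;  -- table name illustrative
```

Doing the locking in a single backend call would close the window between snapshot creation and lock acquisition, which is the race described earlier in the thread.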