Re: Commits 8de72b and 5457a1 (COPY FREEZE)

From: Pavan Deolasee
Subject: Re: Commits 8de72b and 5457a1 (COPY FREEZE)
Msg-id CABOikdPouU5mCTAZttnCnasxFGfpcrgGpEx5KWwyrZ_riYoENg@mail.gmail.com
In reply to: Re: Commits 8de72b and 5457a1 (COPY FREEZE)  (Stephen Frost <sfrost@snowman.net>)
List: pgsql-hackers
On Mon, Dec 10, 2012 at 7:02 PM, Stephen Frost <sfrost@snowman.net> wrote:

>
> I continue to hold that this could end up being a slippery slope for us
> to go down wrt 'correctness' vs. 'do whatever the user wants'.  If we
> keep this to only COPY and where the table has to be truncated/created
> in the same transaction (which requires the user to have sufficient
> privileges to do non-MVCC-safe things on the table already), perhaps
> it's alright.  It'll definitely reduce the interest in finding a real
> solution though, which is unfortunate.
>

I wonder if something more complete can be done by forcing COPY FREEZE
(or whatever we call it) to take an exclusive lock on the table and run
the load as an append-only operation. By taking a strong lock, we will
block out any concurrent reads and writes to the table. If an error occurs
while loading the data, the table will be truncated back to the previously
recorded size. We may need some additional bookkeeping and WAL
logging to handle crash recovery.
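
To make the append-only idea concrete, here is a rough sketch in Python
rather than backend C. Everything in it is hypothetical: ToyTable,
copy_freeze_append and rows_per_block are made-up names, a threading.Lock
stands in for the exclusive table lock, and a list of lists stands in for
heap blocks. WAL logging and crash recovery are not modelled at all.

```python
import threading


class ToyTable:
    """Hypothetical in-memory stand-in for a heap relation: a list of blocks."""

    def __init__(self):
        self.blocks = []
        # Stands in for the exclusive table lock; not a real PostgreSQL lock.
        self.lock = threading.Lock()


def copy_freeze_append(table, rows, rows_per_block=2):
    """Append rows as new blocks; on any error, truncate back to the old size.

    Returns the table size (in blocks) recorded before the load started.
    """
    with table.lock:  # block out concurrent readers and writers
        old_size = len(table.blocks)  # size recorded before loading starts
        try:
            block = []
            for row in rows:
                if row is None:  # toy stand-in for a malformed input row
                    raise ValueError("bad input row")
                block.append(row)
                if len(block) == rows_per_block:
                    table.blocks.append(block)
                    block = []
            if block:  # flush the last, partially filled block
                table.blocks.append(block)
        except Exception:
            # Discard every block appended by this load, then re-raise.
            del table.blocks[old_size:]
            raise
        return old_size
```

Because the load only ever adds blocks past old_size, undoing a failed load
is just a truncation; no existing block needs to be touched.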

To solve the visibility issue for old snapshots that should not see
the new data, we can store some additional visibility information in
pg_class itself. For example, we can store the size of the table
before the COPY FREEZE started and the XID of the COPY FREEZE
operation. An old snapshot that cannot see the XID cannot see the
tuples inserted in the new blocks either. Once the COPY FREEZE
finishes and the lock on the relation is released, new transactions
can start writing to the table and write past the old size of the
table. But that should be OK. If an old snapshot can't see the tuples
inserted by COPY FREEZE, AFAIK it can't see any of those other tuples
either.
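
A rough sketch of that visibility rule, again in Python with made-up
names (Snapshot, RelFreezeInfo, block_hidden_by_freeze). The snapshot
test is a simplified version of the usual xmin/xmax/in-progress-list
check; it assumes the COPY FREEZE transaction eventually commits and
ignores XID wraparound, so treat it as an illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """Simplified MVCC snapshot (hypothetical; ignores XID wraparound)."""

    xmin: int  # every XID below this had finished when the snapshot was taken
    xmax: int  # every XID at or above this had not yet started
    xip: frozenset = frozenset()  # XIDs in progress between xmin and xmax

    def can_see(self, xid):
        """True if xid counts as finished from this snapshot's point of view."""
        if xid < self.xmin:
            return True
        if xid >= self.xmax:
            return False
        return xid not in self.xip


@dataclass(frozen=True)
class RelFreezeInfo:
    """Extra per-relation data imagined to live in pg_class (hypothetical)."""

    old_size_blocks: int  # table size before the COPY FREEZE started
    freeze_xid: int  # XID of the COPY FREEZE operation


def block_hidden_by_freeze(info, snapshot, block_no):
    """True if the whole block must be treated as invisible to the snapshot.

    Blocks below the old size are unaffected and fall through to the
    normal per-tuple visibility checks (not modelled here).
    """
    return block_no >= info.old_size_blocks and not snapshot.can_see(info.freeze_xid)
```

For example, with the old size recorded as 10 blocks and a COPY FREEZE
XID of 100, a snapshot taken while XID 100 was still running treats
block 12 as hidden, while block 5 is checked tuple by tuple as usual.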

I'm sure there will still be challenges with this approach. But I
wonder if we can guarantee correctness by proper use of
synchronization and still avoid multiple writes for the most common
data loading scenarios.

Thanks,
Pavan

-- 
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee


