Re: directory archive format for pg_dump

From: Joachim Wieland
Subject: Re: directory archive format for pg_dump
Date:
Msg-id AANLkTim8vp_iEvzmWwnOG9--3m3sw3gr_zuMWyK31PkP@mail.gmail.com
In reply to: Re: directory archive format for pg_dump  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
Responses: Re: directory archive format for pg_dump  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
Re: directory archive format for pg_dump  (Andres Freund <andres@anarazel.de>)
List: pgsql-hackers
On Thu, Dec 16, 2010 at 12:48 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> As soon as we have parallel pg_dump, the next big thing is going to be
> parallel dump of the same table using multiple processes. Perhaps we should
> prepare for that in the directory archive format, by allowing the data of a
> single table to be split into multiple files. That way parallel pg_dump is
> simple, you just split the table in chunks of roughly the same size, say
> 10GB each, and launch a process for each chunk, writing to a separate file.

How exactly would you "just split the table in chunks of roughly the
same size"? Which queries should pg_dump send to the backend? If it
just sends a bunch of queries with different WHERE clauses, the server
would still scan the same data several times, since each pg_dump client
would trigger a seqscan over the full table.
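
To make the concern concrete, here is a minimal sketch of the kind of range queries such a split might produce. The function name, the table `big_table`, and the integer key `id` are all hypothetical; nothing here is taken from pg_dump itself.

```python
def chunk_queries(table, key, lo, hi, n_chunks):
    """Build one SELECT per half-open key range [start, end).

    Hypothetical helper: assumes `key` is an integer column whose
    values lie between lo and hi inclusive.
    """
    # Ceiling division so that n_chunks ranges cover the whole span.
    step = -(-(hi - lo + 1) // n_chunks)
    queries = []
    start = lo
    while start <= hi:
        end = min(start + step, hi + 1)
        queries.append(
            f"SELECT * FROM {table} WHERE {key} >= {start} AND {key} < {end}"
        )
        start = end
    return queries


if __name__ == "__main__":
    for q in chunk_queries("big_table", "id", 1, 100, 4):
        print(q)
```

If `id` has no usable index, each of these queries would be answered by its own sequential scan, so four workers would read the whole table four times.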

Ideally, pg_dump should be able to query for all the data in a single
relation segment, so that each segment is scanned by only one backend
process. However, this requires backend support, and we would be sending
queries that we'd not want clients other than pg_dump to send...
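
As a rough illustration of per-segment partitioning, the sketch below maps a table's block count to one block range per segment. It assumes the default build settings (8 kB blocks, 1 GB segments); the function name is hypothetical, and how such a range would actually be handed to the backend is exactly the open question.

```python
# Assumed defaults; both are compile-time settings and may differ per build.
BLOCK_SIZE = 8192               # bytes per block
SEGMENT_SIZE = 1024 ** 3        # bytes per relation segment file
BLOCKS_PER_SEGMENT = SEGMENT_SIZE // BLOCK_SIZE


def segment_block_ranges(total_blocks):
    """Return one half-open (first_block, end_block) range per segment."""
    ranges = []
    for first in range(0, total_blocks, BLOCKS_PER_SEGMENT):
        end = min(first + BLOCKS_PER_SEGMENT, total_blocks)
        ranges.append((first, end))
    return ranges


if __name__ == "__main__":
    # A table of 300000 blocks spans three segments, the last one partial.
    for first, end in segment_block_ranges(300000):
        print(first, end)
```

Each range could then be assigned to exactly one worker, so that no block is read twice.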

If you were thinking about WHERE queries to get equally sized
partitions, how would we deal with unindexed and/or non-numerical data
in a large table?


Joachim

