synchronize_seqscans' description is a bit misleading

From: Gurjeet Singh
Subject: synchronize_seqscans' description is a bit misleading
Msg-id: CABwTF4VwxS+jjT2RZSzHny5LArW+jFjFn5uiGH8cTRCXETGNag@mail.gmail.com
Responses: Re: [DOCS] synchronize_seqscans' description is a bit misleading (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers

If I'm reading the code right [1], this GUC does not actually *synchronize* the scans, but instead just makes sure that a new scan starts from a block that was reported by some other backend performing a scan on the same relation.
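
To make the "start from a reported block" behaviour concrete, here is a minimal toy model in Python. It is not the PostgreSQL code: the class name ScanHintTable, the function scan_order and the block numbers are made up for illustration, and I am assuming that a scan which starts mid-table wraps around to cover the blocks it skipped.

```python
class ScanHintTable:
    """Toy stand-in for a shared 'last reported block' table (hypothetical)."""

    def __init__(self):
        self._last_block = {}

    def report(self, relation, block):
        # A scanning backend reports the block it is currently reading.
        self._last_block[relation] = block

    def start_block(self, relation):
        # With no hint recorded, start at block 0 like a plain sequential scan.
        return self._last_block.get(relation, 0)


def scan_order(start, nblocks):
    """Blocks visited by a scan that starts at `start` and wraps around."""
    return [(start + i) % nblocks for i in range(nblocks)]


hints = ScanHintTable()
hints.report("original_table", 7)             # backend A is at block 7
start = hints.start_block("original_table")   # backend B takes the hint
order = scan_order(start, 10)
print(order)
```

Note that the hint is consulted only once, when the scan starts. Nothing in this model makes backend B wait for backend A afterwards, which is the point of this mail.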

Because the backends may process the relation at different speeds, they can drift out of sync with each other even though each one took the hint when starting its scan. Even within a single query, different scan nodes may be scanning different parts of the same relation, and they do not synchronize with each other either (and for good reason).

Assuming that all scans on a table are always synchronized may lead some to wrongly believe that adding more backends scanning the same table incurs no extra I/O; that is, that only one stream of blocks is read from disk no matter how many backends you add to the mix. I noticed this while creating partition tables, each with a CREATE TABLE AS SELECT FROM original_table (to avoid WAL generation). Running more than 3 such transactions caused the disk read throughput to behave unpredictably, sometimes even dipping below 1 MB/s for a few seconds at a stretch.
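
The effect of drifting scans on I/O can be sketched with a toy simulation. Everything here is an assumption made for illustration: a small LRU cache stands in for the buffer cache, scans advance a fixed number of blocks per tick, and the function name physical_reads is hypothetical. It ignores OS readahead and everything else a real system does, so only the direction of the result matters, not the numbers.

```python
from collections import OrderedDict


def physical_reads(speeds, nblocks, cache_size):
    """Count cache misses for scans that all start at block 0 together.

    speeds[i] is the number of blocks scan i processes per tick.
    """
    cache = OrderedDict()            # tiny LRU stand-in for the buffer cache
    positions = [0] * len(speeds)    # blocks already processed by each scan
    misses = 0
    while any(p < nblocks for p in positions):
        for i, speed in enumerate(speeds):
            for _ in range(speed):
                if positions[i] >= nblocks:
                    break
                block = positions[i]
                if block in cache:
                    cache.move_to_end(block)       # hit: mark as recently used
                else:
                    misses += 1                    # miss: a "disk read"
                    cache[block] = True
                    if len(cache) > cache_size:
                        cache.popitem(last=False)  # evict least recently used
                positions[i] += 1
    return misses


# Two scans at the same speed stay in step: one read per block.
in_step = physical_reads([1, 1], nblocks=1000, cache_size=16)
# One scan three times faster: the slower scan falls behind the cache.
drifting = physical_reads([3, 1], nblocks=1000, cache_size=16)
print(in_step, drifting)
```

In this model the in-step pair costs exactly one read per block, while the drifting pair costs more, because the slower scan finds its blocks already evicted once the gap exceeds the cache size.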

Please note that I am not complaining about the implementation, which I think is the best we can do without making backends wait for each other. It's just that the documentation [2] implies that the scans stay synchronized throughout the entire run, which is clearly not the case. So I'd like the docs to be improved to reflect that.

How about something like:

<doc>
synchronize_seqscans (boolean)
    This allows sequential scans of large tables to start from a point in the table that is already being read by another backend. This increases the probability that concurrent scans read the same block at about the same time and hence share the I/O workload. Note that, because backends may process the table at different speeds, they may eventually get out of sync and hence stop sharing the I/O workload.

    When this is enabled, ... The default is on.
</doc>

Best regards,

[1] src/backend/access/heap/heapam.c
[2] http://www.postgresql.org/docs/9.2/static/runtime-config-compatible.html#GUC-SYNCHRONIZE-SEQSCANS

--
Gurjeet Singh

http://gurjeet.singh.im/

EnterpriseDB Inc.
