Re: Sync Scan update

From Jeff Davis
Subject Re: Sync Scan update
Date
Msg-id 1166553441.24294.30.camel@dogma.v10.wvs
In response to Re: Sync Scan update (Gregory Stark <stark@enterprisedb.com>)
Responses Re: Sync Scan update ("Jim C. Nasby" <jim@nasby.net>)
List pgsql-hackers
On Tue, 2006-12-19 at 18:05 +0000, Gregory Stark wrote:
> "Simon Riggs" <simon@2ndquadrant.com> writes:
> 
> > Like to see some tests with 2 parallel threads, since that is the most
> > common case. I'd also like to see some tests with varying queries,
> > rather than all use select count(*). My worry is that these tests all
> > progress along their scans at exactly the same rate, so are likely to
> > stay in touch. What happens when we have significantly more CPU work to
> > do on one scan - does it fall behind??
> 
> If it's just CPU then I would expect the cache to help the followers keep up
> pretty easily. What concerns me is queries that involve more I/O. For example
> if the leader is doing a straight sequential scan and the follower is doing a
> nested loop join driven by the sequential scan. Or worse, what happens if the

That would be one painful query: scanning two tables in a nested loop,
neither of which fits into physical memory! ;)

If one table does fit into memory, it's likely to stay there since a
nested loop will keep the pages so hot.

I can't think of a way to test two big tables in a nested loop because
it would take so long. However, it would be worth trying it with an
index, because that would cause random I/O during the scan.

> leader is doing a nested loop and the follower which is just doing a straight
> sequential scan is being held back?
> 

The follower will never be held back in my current implementation.

My current implementation relies on the scans to stay close together
once they start close together. If one falls seriously behind, it will
fall outside of the main "cache trail" and cause the performance to
degrade due to disk seeking and lower cache efficiency.

I think Simon is concerned about CPU because that will be a common case:
if one scan is CPU bound and another is I/O bound, they will progress at
different rates. That's bound to cause seeking and poor cache
efficiency.
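
To make the "cache trail" idea concrete, here is a toy model, not the
patch itself. It assumes a plain LRU cache (which does not reflect
PostgreSQL's actual buffer replacement), scans that all start at page 0,
and a fixed number of pages read per tick for each scan. The names
`simulate`, `rates`, and `cache_pages` are made up for illustration.

```python
from collections import OrderedDict


def simulate(table_pages, cache_pages, rates):
    """Count cache misses (disk reads) for each scan.

    Hypothetical model: every scan starts at page 0 and reads rates[i]
    pages per tick; the cache is a simple LRU of cache_pages entries.
    """
    cache = OrderedDict()  # keys are page numbers, oldest use first
    pos = [0] * len(rates)
    misses = [0] * len(rates)
    while any(p < table_pages for p in pos):
        for i, rate in enumerate(rates):
            for _ in range(rate):
                if pos[i] >= table_pages:
                    break
                page = pos[i]
                if page in cache:
                    # Hit: the page is still in the trail left by another scan.
                    cache.move_to_end(page)
                else:
                    # Miss: read from disk and evict the least recently used page.
                    misses[i] += 1
                    cache[page] = None
                    if len(cache) > cache_pages:
                        cache.popitem(last=False)
                pos[i] += 1
    return misses


if __name__ == "__main__":
    # Two scans at the same rate: the follower rides the leader's trail.
    print(simulate(1000, 100, [1, 1]))
    # Leader twice as fast: the follower drifts out of the trail.
    print(simulate(1000, 100, [2, 1]))
```

With equal rates the second scan never misses in this model. When the
leader runs twice as fast, the gap grows until the follower's next page
has already been evicted, and from then on it pays for its own disk
reads. The model ignores seek cost, so it only shows the loss of cache
efficiency, not the seeking.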

Although I don't think either of these cases will be worse than current
behavior, it warrants more testing.

Regards,
	Jeff Davis


