Re: old synchronized scan patch

From: Jim C. Nasby
Subject: Re: old synchronized scan patch
Msg-id: 20061208065938.GF44124@nasby.net
In reply to: Re: old synchronized scan patch ("Heikki Linnakangas" <heikki@enterprisedb.com>)
List: pgsql-hackers
On Thu, Dec 07, 2006 at 04:14:54PM +0000, Heikki Linnakangas wrote:
> >BTW, it seems to me that this is all based on the assumption that 
> >followers will have no problem keeping up with the pack leader.  Suppose 
> >my process does a lot of other processing and can't keep up with the 
> >pack despite the fact that it's getting all it's data from the buffer. 
> >Now we have effectively have two different seq scans going on.  Does my 
> >process need to recognize that it's not keeping up and not report it's 
> >blocks?
> 
> That's what I was wondering about all these schemes as well. What we 
> could do, is that instead of doing a sequential scan, each backend keeps 
> a bitmap of pages it has processed during the scan, and read the pages 
> in the order they're available in cache. If a backend misses a page in 
> the synchronized scan, for example, it could issue a random I/O after 
> reading all the other pages to fetch it separately, instead of starting 
> another seq scan at a different location and "derailing the train". I 
> don't know what the exact algorithm used to make decisions on when and 
> how to fetch each page would be, but the bitmaps would be in backend 
> private memory. And presumably it could be used with bitmap heap scans 
> as well.
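As a rough sketch of the backend-private bitmap idea above (all names are illustrative; this is not PostgreSQL code): each backend remembers which pages it has already processed, consumes pages in whatever order the shared scan makes them available in cache, and mops up the pages it missed with separate random reads at the end.

```python
class BitmapSeqScan:
    """Toy model of one backend's view of a synchronized seqscan."""

    def __init__(self, nblocks):
        self.nblocks = nblocks
        self.done = [False] * nblocks   # backend-private "already processed" bitmap
        self.remaining = nblocks

    def try_process(self, blockno, in_cache):
        """Process blockno now if it's cached and we haven't seen it yet."""
        if in_cache and not self.done[blockno]:
            self.done[blockno] = True
            self.remaining -= 1
            return True
        return False

    def missed_blocks(self):
        """Blocks left over for random I/O after the synchronized pass."""
        return [b for b in range(self.nblocks) if not self.done[b]]
```

A backend that joined the pack at block 2 of an 8-block relation would process 2..7 from cache and then fetch blocks 0 and 1 separately rather than restarting a second seqscan.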

First, I think that we're getting ahead of ourselves by worrying about
how to deal with diverging scans right now. Having said that...

There are three ranges of seqscan speed we need to be concerned with [1],
with two break-points between them:

1) Seqscan is I/O-bound; it can completely keep up with incoming blocks
-- Break-point: maximum single-scan rate, where the I/O system is saturated
2) Seqscan is CPU-bound if there's nothing else competing for I/O
-- Break-point: maximum double-scan rate, the best rate that two scans in
different places can achieve
3) Seqscan is slower than two competing category-1 scans are today

[1] I'm assuming only 1 slow scan and any number of fast scans. If there
are actually 2 slow scans, things will differ.
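The classification above could be written down directly; the two break-points are machine-specific measurements, and the figures below are just placeholders (the ~35MB/s and ~22MB/s numbers from the measurements later in this mail):

```python
# Hypothetical break-points, in MB/s; these must be measured per machine.
MAX_SINGLE_SCAN = 35.0   # rate at which one scan saturates the I/O system
DOUBLE_SCAN_BREAKEVEN = 22.0  # below this, diverging beats throttling

def scan_category(rate_mb_s):
    """Classify a scan's unthrottled processing rate into the 3 ranges."""
    if rate_mb_s >= MAX_SINGLE_SCAN:
        return 1   # I/O-bound: keeps up with incoming blocks
    if rate_mb_s >= DOUBLE_SCAN_BREAKEVEN:
        return 2   # CPU-bound, but still worth keeping in the pack
    return 3       # slower than two competing category-1 scans
```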

If every scan is running in the first category, then there is no issue.
Since all scans are completely I/O bound, they'll all process blocks as
soon as they're in from the drive.

If there is a scan in the 3rd category, we don't want to try to hold the
faster scans down to the rate of the slow scan, because that's actually
slower than letting a synchronized set of fast scans compete with the
slow scan, even though there's a lot of seeking involved.

The problem is if we have a scan in the 2nd category. As soon as it
falls far enough behind that the system is forced to issue physical
reads to disk for it, the performance of all the scans will plummet. And
as the slow scan falls further and further behind, seek times will get
longer and longer.

To put real numbers to this, on my machine, having two dd's running on one
file far enough apart that caching isn't helping cuts the rate from
~35MB/s to ~31MB/s (incidentally, starting a second dd about 10 seconds
after the first gives the second scan a rate of 40MB/s). So on my
machine, if a slow scan is doing more than 22MB/s, it's better to
restrict all scans to its speed rather than have two divergent scans.
OTOH, if the slow scan is doing less than 22MB/s, it's better not to
hold the fast scans back.

Of course, having people run that check themselves might not be terribly
practical, and if there's more than one slow scan involved those
measurements are likely to be meaningless. So I'm wondering if there's
some technique we can use that will make this selection process more
'automatic'. The only thought I've come up with is to apply a delay to
every read from the fast scans if there is a slow scan that's in danger
of falling past the cache size. My theory is that if the delay is set
right, then a category 2 scan will eventually catch back up, at which
point we could either just let it drive the scan, or we could
un-throttle the fast scans until the slow scan is in danger of falling
behind again.

If instead the slow scan is category 3, then even with the delay it will
still fall behind. If the database can detect that situation, it can
then un-throttle the fast scans and just let things diverge. Ideally, if
another category 3 scan came along, we would sync it to the existing cat
3 scan.

Unfortunately, that still means needing to come up with what that delay
setting should be. Perhaps if we're lucky, there's a pretty fixed
relationship between the maximum speed of a scan and the speed of two
competing scans. If that's the case, I think we could set the delay
to be a fraction of the average length of a scan.
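If such a relationship holds, the per-block delay falls out of simple arithmetic: to hold a scan that runs at fast_rate down to target_rate, sleep for the difference in per-block times. A sketch, using the figures measured above (8KB blocks assumed; this is illustration, not PostgreSQL code):

```python
BLOCK_MB = 8192 / (1024 * 1024)   # one 8KB block, in MB

def per_block_delay(fast_rate, target_rate):
    """Seconds to sleep after each block so a scan capable of
    fast_rate MB/s proceeds at target_rate MB/s overall."""
    return BLOCK_MB / target_rate - BLOCK_MB / fast_rate
```

For example, holding a 35MB/s scan down to 22MB/s needs a delay of BLOCK_MB/22 - BLOCK_MB/35 seconds per block; the effective rate then comes out at exactly the target.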
-- 
Jim Nasby                                            jim@nasby.net
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)

